# Pearce Wiki

notes:bayesian_classification

# Differences

This shows you the differences between two versions of the page.

- notes:bayesian_classification [2013/03/15 13:37] andy [Combining words]
+ notes:bayesian_classification [2013/03/15 14:06] andy [Combining words]

Line 94:

  \begin{equation*} P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\frac{1}{N}N_{C_i}\prod\limits_{j=a}^z{\frac{N_{C_i}(W_j)}{N_{C_i}}}}{\frac{1}{N}\sum\limits_{k=1}^n{N_{C_k}\prod\limits_{j=a}^z{\frac{N_{C_k}(W_j)}{N_{C_k}}}}} \end{equation*}
- $$P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\prod\limits_{j=a}^z{N_{C_i}(W_j)}}{N_{C_i}^{x-1}\sum\limits_{k=1}^n{\frac{1}{N_{C_k}^{x-1}}\prod\limits_{j=a}^z{N_{C_k}(W_j)}}}$$
+ \Rightarrow P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\prod\limits_{j=a}^z{N_{C_i}(W_j)}}{N_{C_i}^{x-1}\sum\limits_{k=1}^n{\frac{1}{N_{C_k}^{x-1}}\prod\limits_{j=a}^z{N_{C_k}(W_j)}}}
- Where $x$ is the total number of words.
+ Where $x$ is the total number of words. This version keeps the intermediate values relatively large, so it should reduce problems with floating-point underflow (although it may be susceptible to overflow if the number of tokens becomes excessive).
  ==== Two-category case ====
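The simplified multi-word formula in the changed line above (the general form, before any two-category specialisation) can be sketched in Python roughly as follows. The function names, the list-of-lists layout for the counts, and the example numbers are all assumptions made for illustration; none of them come from the page. The second function is a log-space variant, a swapped-in technique that the page does not describe, included only as one possible way to sidestep both the underflow and the overflow concerns mentioned in the revision.

```python
import math


def posterior_direct(word_counts, class_totals):
    """Evaluate the simplified formula directly.

    word_counts[k][j] is N_{C_k}(W_j) (hypothetical layout);
    class_totals[k] is N_{C_k}.
    Returns a list of P(C_k | W_a, ..., W_z), one entry per class.
    """
    x = len(word_counts[0])  # number of words being combined
    # score_k = prod_j N_{C_k}(W_j) / N_{C_k}^(x-1)
    scores = [
        math.prod(counts) / total ** (x - 1)
        for counts, total in zip(word_counts, class_totals)
    ]
    denom = sum(scores)
    return [s / denom for s in scores]


def posterior_log(word_counts, class_totals):
    """Same quantity computed in log space (not from the page).

    Assumes at least one class has non-zero counts for every word;
    otherwise the result is undefined, as in the direct version.
    """
    x = len(word_counts[0])
    log_scores = []
    for counts, total in zip(word_counts, class_totals):
        if min(counts) == 0:
            # A zero count makes the whole product zero for this class.
            log_scores.append(float("-inf"))
        else:
            log_scores.append(
                sum(math.log(c) for c in counts) - (x - 1) * math.log(total)
            )
    # Subtract the maximum before exponentiating so the largest term is 1.
    m = max(log_scores)
    weights = [math.exp(s - m) for s in log_scores]
    denom = sum(weights)
    return [w / denom for w in weights]


# Made-up example: two classes, two words.
example_counts = [[3, 4], [2, 10]]
example_totals = [10, 20]
p_direct = posterior_direct(example_counts, example_totals)
p_log = posterior_log(example_counts, example_totals)
```

With these made-up counts, both functions give roughly 0.545 for the first class and 0.455 for the second, which matches evaluating the unsimplified equation by hand: (10/30)(3/10)(4/10) against (20/30)(2/20)(10/20).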