Please forgive the slightly loose use of notation; there are a few too many dimensions over which to iterate for clarity.
  
One slight simplification is worth noting: $P(C_i)$ is presumably determined by dividing the number of messages trained in a category by the total number of messages trained. Let $N_{C_i}$ denote the number of messages trained in category $C_i$, $N$ the number of messages trained overall, and $N_{C_i}(W_a)$ the number of messages containing token $W_a$ that were trained in category $C_i$. The equation above then becomes:

\begin{equation*} P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\frac{1}{N}N_{C_i}\prod\limits_{j=a}^z{\frac{N_{C_i}(W_j)}{N_{C_i}}}}{\frac{1}{N}\sum\limits_{k=1}^n{N_{C_k}\prod\limits_{j=a}^z{\frac{N_{C_k}(W_j)}{N_{C_k}}}}} \end{equation*}
\begin{equation} \Rightarrow P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\prod\limits_{j=a}^z{N_{C_i}(W_j)}}{N_{C_i}^{x-1}\sum\limits_{k=1}^n{\frac{1}{N_{C_k}^{x-1}}\prod\limits_{j=a}^z{N_{C_k}(W_j)}}} \end{equation}

Here $x$ is the total number of words. This version may help avoid underflow, but it may instead be susceptible to overflow because of the exponentiation involved.
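
As a concrete illustration of the underflow/overflow point, the following is a minimal Python sketch that evaluates the equation above in log space and only exponentiates at the end (the log-sum-exp trick), avoiding both problems at once. The `posterior` function name, the `counts`/`totals` structures, and the assumption that every token has a nonzero count in every category (i.e. no smoothing) are all illustrative choices, not part of the notes above.

<code python>
import math

def posterior(counts, totals, tokens):
    """Posterior P(C_i | W_a ... W_z) for every category, computed in log space.

    counts[c][w] -- N_C(W): messages in category c that contain token w
    totals[c]    -- N_C: messages trained in category c
    tokens       -- the observed tokens W_a ... W_z

    Assumes counts[c][w] > 0 for every category and token; real code
    would smooth the counts to handle unseen tokens.
    """
    x = len(tokens)
    # Log of each category's score: sum_j log N_C(W_j) - (x - 1) log N_C
    log_scores = {
        c: sum(math.log(counts[c][w]) for w in tokens) - (x - 1) * math.log(n_c)
        for c, n_c in totals.items()
    }
    # Normalise via log-sum-exp so the exponentials stay in range.
    m = max(log_scores.values())
    z = sum(math.exp(s - m) for s in log_scores.values())
    return {c: math.exp(s - m) / z for c, s in log_scores.items()}

# Tiny worked example with made-up counts:
totals = {"spam": 40, "ham": 60}
counts = {"spam": {"offer": 30, "meeting": 2},
          "ham":  {"offer": 3,  "meeting": 45}}
print(posterior(counts, totals, ["offer", "meeting"]))
# -> roughly {'spam': 0.4, 'ham': 0.6}
</code>

Working in log space turns the products into sums, so the intermediate values grow only linearly with the number of tokens rather than exponentially.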
==== Two-category case ====
  