notes:bayesian_classification

notes:bayesian_classification [2013/03/15 12:18]
andy
notes:bayesian_classification [2013/03/15 14:13] (current)
andy
Line 90: Line 90:
  
 Please forgive the slightly loose use of notation; there are a few too many dimensions over which to iterate for clarity.
 +
 +One slight simplification follows from the fact that $P(C_i)$ is presumably determined by dividing the number of messages trained in category $C_i$ by the total number of messages trained. Let $N_{C_i}$ denote the number of messages trained in category $C_i$, $N$ the number of messages trained overall, and $N_{C_i}(W_a)$ the number of messages containing token $W_a$ that were trained in category $C_i$, so that $P(C_i) = N_{C_i}/N$ and $P(W_a|C_i) = N_{C_i}(W_a)/N_{C_i}$. The equation above then becomes:
 +
 +\begin{equation*} P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\frac{1}{N}N_{C_i}\prod\limits_{j=a}^z{\frac{N_{C_i}(W_j)}{N_{C_i}}}}{\frac{1}{N}\sum\limits_{k=1}^n{N_{C_k}\prod\limits_{j=a}^z{\frac{N_{C_k}(W_j)}{N_{C_k}}}}} \end{equation*}
 +\begin{equation} \Rightarrow P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\prod\limits_{j=a}^z{N_{C_i}(W_j)}}{N_{C_i}^{x-1}\sum\limits_{k=1}^n{\frac{1}{N_{C_k}^{x-1}}\prod\limits_{j=a}^z{N_{C_k}(W_j)}}} \end{equation}
 +
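 +Here $x$ is the number of tokens $W_a, \ldots, W_z$ appearing in the product. This version may help avoid underflow, because the products are taken over whole-number counts rather than probabilities, but it may instead be susceptible to overflow due to the exponentiation involved. As a result, it may be preferable to move the divisions back inside the iterations.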
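
As a rough illustration of the two arrangements discussed above, the following Python sketch computes the same posterior both ways. Everything in it is an assumption made for the example: the function names, the dictionary layout (`counts[c][w]` standing for $N_{C_c}(W_w)$ and `totals[c]` for $N_{C_c}$) and the toy training counts. It also assumes every category has at least one trained message and that at least one category has a non-zero count for every token. The first function follows the simplified equation using exact rational arithmetic, so it cannot overflow in Python; the second keeps each division inside the loop and uses ordinary floats.

```python
from fractions import Fraction
from math import prod


def posterior_simplified(counts, totals, tokens, target):
    """Simplified form: product of raw counts divided by N_C^(x-1)."""
    x = len(tokens)  # number of tokens in the product

    def term(c):
        # prod_j N_c(W_j) / N_c^(x-1), kept exact with Fraction
        numerator = prod(counts[c].get(w, 0) for w in tokens)
        return Fraction(numerator) / Fraction(totals[c]) ** (x - 1)

    return term(target) / sum(term(c) for c in totals)


def posterior_divide_inside(counts, totals, tokens, target):
    """Same quantity, with each division performed inside the loop."""

    def term(c):
        value = float(totals[c])  # N_c; the common 1/N factor cancels
        for w in tokens:
            value *= counts[c].get(w, 0) / totals[c]
        return value

    return term(target) / sum(term(c) for c in totals)


# Hypothetical training data: 4 messages trained as spam, 6 as ham.
counts = {
    "spam": {"offer": 3, "meeting": 1},
    "ham": {"offer": 1, "meeting": 4},
}
totals = {"spam": 4, "ham": 6}
tokens = ["offer", "meeting"]

exact = posterior_simplified(counts, totals, tokens, "spam")
approx = posterior_divide_inside(counts, totals, tokens, "spam")
print(exact, approx)
```

With these toy counts the spam term is $3 \cdot 1 / 4 = 3/4$ and the ham term is $1 \cdot 4 / 6 = 2/3$, so the posterior for spam is $9/17$. In a language with fixed-width integers or floats the first arrangement is the one at risk of overflow, which is why the second may be preferable there.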
  
 ==== Two-category case ====
notes/bayesian_classification.1363349910.txt.gz · Last modified: 2013/03/15 12:18 by andy