\begin{equation} \Rightarrow P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\prod\limits_{j=a}^z{N_{C_i}(W_j)}}{N_{C_i}^{x-1}\sum\limits_{k=1}^n{\frac{1}{N_{C_k}^{x-1}}\prod\limits_{j=a}^z{N_{C_k}(W_j)}}} \end{equation}
  
Where $x$ is the total number of words. This version may help avoid underflow, but may instead be susceptible to overflow due to the exponentiation involved. As a result, it may be preferable to move the divisions back inside the iterations.
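As an illustrative sketch (not part of the original notes), the snippet below computes the posteriors with the division by $N_{C_i}$ applied per word inside the loop, rather than as a single $N_{C_i}^{x-1}$ factor at the end. The names ''word_counts'' (per-category counts $N_{C_i}(W_j)$) and ''category_totals'' (totals $N_{C_i}$) are assumptions made for the example.

<code python>
def classify(words, word_counts, category_totals):
    """Posterior P(C_i | W_a, ..., W_z) with the division done inside the loop.

    Assumed layout (illustrative only):
      word_counts[cat][w]  -> N_C(w)  count of word w in category cat
      category_totals[cat] -> N_C     total word count for category cat
    """
    scores = {}
    for cat, total in category_totals.items():
        # prod_j N_C(W_j) / N_C^(x-1)  ==  N_C * prod_j (N_C(W_j) / N_C),
        # with the per-word division keeping intermediates near probability
        # scale instead of building one very large product.
        score = float(total)
        for w in words:
            score *= word_counts[cat].get(w, 0) / float(total)
        scores[cat] = score

    norm = sum(scores.values())
    if norm == 0.0:
        return scores  # no category has any evidence for these words
    return {cat: s / norm for cat, s in scores.items()}


# Example with made-up counts:
word_counts = {"spam": {"offer": 30, "free": 45}, "ham": {"offer": 5, "free": 10}}
category_totals = {"spam": 500, "ham": 2000}
print(classify(["offer", "free"], word_counts, category_totals))
</code>

With the division inside the loop each factor stays close to $P(W_j|C_i)$ scale, so overflow is avoided and underflow only becomes a concern for very long word lists; working in log-probabilities is another common way to sidestep both problems entirely.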
==== Two-category case ====
  