notes:bayesian_classification

This shows you the differences between two versions of the page.


notes:bayesian_classification [2013/03/15 12:18] andy |
notes:bayesian_classification [2013/03/15 14:13] (current) andy |
||
---|---|---|---|

Line 90: | Line 90: | ||

Please forgive the slightly loose use of notation; there are a few too many dimensions to iterate over for it to stay clear. | Please forgive the slightly loose use of notation; there are a few too many dimensions to iterate over for it to stay clear. | ||

+ | |||

+ | One slight simplification follows from the fact that $P(C_i)$ is presumably estimated by dividing the number of messages trained in category $C_i$ by the total number of messages trained. Let $N_{C_i}$ indicate the number of messages trained in category $C_i$, $N$ indicate the number of messages trained overall and $N_{C_i}(W_a)$ indicate the number of messages containing token $W_a$ that were trained in category $C_i$, so that $P(C_i) = \frac{N_{C_i}}{N}$ and $P(W_a|C_i) = \frac{N_{C_i}(W_a)}{N_{C_i}}$. Substituting these, the equation above becomes: | ||

+ | |||

+ | \begin{equation*} P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\frac{1}{N}N_{C_i}\prod\limits_{j=a}^z{\frac{N_{C_i}(W_j)}{N_{C_i}}}}{\frac{1}{N}\sum\limits_{k=1}^n{N_{C_k}\prod\limits_{j=a}^z{\frac{N_{C_k}(W_j)}{N_{C_k}}}}} \end{equation*} | ||

+ | \begin{equation} \Rightarrow P(C_i|W_a \cap W_b \cap ... \cap W_z) = \frac{\prod\limits_{j=a}^z{N_{C_i}(W_j)}}{N_{C_i}^{x-1}\sum\limits_{k=1}^n{\frac{1}{N_{C_k}^{x-1}}\prod\limits_{j=a}^z{N_{C_k}(W_j)}}} \end{equation} | ||

+ | |||

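+ | Where $x$ is the number of tokens $W_a, \ldots, W_z$ in the product. The factor $\frac{1}{N}$ cancels between numerator and denominator, and each product contributes $x$ divisions by $N_{C_k}$, one of which cancels against the leading $N_{C_k}$. This version may help avoid underflow, but it may instead be susceptible to overflow because of the exponentiation $N_{C_k}^{x-1}$. As a result, it may be preferable to move the divisions back inside the products. | ||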
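
As a rough check on the simplification, the sketch below evaluates both forms of the posterior from raw counts and confirms that they agree. Everything in it is illustrative rather than taken from these notes: the function names, the `spam`/`ham` categories and all the counts are made up, and it assumes at least one token. It uses exact rationals (`fractions.Fraction`) so that the comparison is not disturbed by rounding, which also means it sidesteps the underflow/overflow trade-off that a floating-point implementation would face.

```python
from fractions import Fraction
from math import prod


def posterior_divisions_inside(counts, totals, tokens, category):
    """Form with each division by N_C kept inside the product.

    counts[c][w] plays the role of N_c(w); totals[c] plays the role of N_c.
    The common 1/N factor is omitted because it cancels.
    """
    def score(c):
        return totals[c] * prod(Fraction(counts[c][w], totals[c]) for w in tokens)

    return score(category) / sum(score(c) for c in totals)


def posterior_simplified(counts, totals, tokens, category):
    """Simplified form with the divisions collected into N_C^(x-1).

    Assumes len(tokens) >= 1 so that the exponent x - 1 is non-negative.
    """
    x = len(tokens)

    def product(c):
        return prod(counts[c][w] for w in tokens)

    denominator = totals[category] ** (x - 1) * sum(
        Fraction(product(c), totals[c] ** (x - 1)) for c in totals
    )
    return Fraction(product(category)) / denominator


# Hypothetical training data: messages per category, and messages per
# category that contain each token.
totals = {"spam": 50, "ham": 100}
counts = {
    "spam": {"offer": 30, "meeting": 5},
    "ham": {"offer": 10, "meeting": 40},
}
tokens = ["offer", "meeting"]

p_inside = posterior_divisions_inside(counts, totals, tokens, "spam")
p_simple = posterior_simplified(counts, totals, tokens, "spam")
print(p_inside, p_simple)  # both print 3/7
```

With floats in place of `Fraction`, the first function multiplies many values below 1 (the underflow risk), while the second raises a count to the power $x-1$ (the overflow risk), which is the trade-off described above.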

==== Two-category case ==== | ==== Two-category case ==== |

notes/bayesian_classification.1363349910.txt.gz · Last modified: 2013/03/15 12:18 by andy