February 17, 2017 by Pieter Arntz


Explained: Bayesian spam filtering


Bayesian spam filtering is based on Bayes’ rule, a statistical theorem that gives you the probability of an event given evidence related to that event. In Bayesian filtering, it is used to give you the probability that a certain email is spam.
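As a rough illustration, the rule can be written out for a single word. The symbols S (the message is spam), H (the message is ham, i.e. not spam) and W (the message contains a given word) are notation chosen here, and the numbers are made up purely for the arithmetic:

```latex
P(S \mid W) = \frac{P(W \mid S)\,P(S)}{P(W \mid S)\,P(S) + P(W \mid H)\,P(H)}

% Illustrative values only: P(S) = P(H) = 0.5, P(W \mid S) = 0.2, P(W \mid H) = 0.01
P(S \mid W) = \frac{0.2 \times 0.5}{0.2 \times 0.5 + 0.01 \times 0.5}
            = \frac{0.1}{0.105} \approx 0.95
```

Under these assumed values, a word that shows up twenty times more often in spam than in ham pushes the spam probability of a message containing it to roughly 95%.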


The name


Bayes’ rule is named after the statistician Rev. Thomas Bayes, who provided an equation that allows new information to update the outcome of a probability calculation. The rule is also called the Bayes-Price rule after the mathematician Richard Price, who recognized the importance of the theorem, made some corrections to Bayes’ work, and put the rule to use.




When dealing with spam, the theorem is used to calculate the probability that a certain message is spam, based on the words in its title and body. The filter learns from messages that were identified as spam and from messages that were identified as not being spam (sometimes called ham).
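The learning-and-scoring loop described above might be sketched as follows. This is a minimal naive Bayes example, not the implementation of any particular mail filter: the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, and the add-one smoothing are all assumptions made for illustration, and it treats words as independent of each other, which real text is not.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Toy word-based filter; names and smoothing choices are illustrative."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Learn from a message already identified as "spam" or "ham".
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Prior: share of training messages with this label (add-one smoothed).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in vocabulary:
                    continue  # words never seen in training carry no evidence
                # Likelihood of the word under this label, add-one smoothed
                # so that an unseen word does not zero out the whole product.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocabulary)))
            log_scores[label] = score
        # Bayes' rule: normalize the two scores into a probability of spam.
        highest = max(log_scores.values())
        weights = {label: math.exp(s - highest) for label, s in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize claim now", "spam")
spam_filter.train("meeting at noon tomorrow", "ham")
spam_filter.train("lunch with the project team", "ham")

print(spam_filter.spam_probability("free money"))
print(spam_filter.spam_probability("project meeting tomorrow"))
```

With this tiny training set, the first message scores above 0.5 and the second below it. Working with logarithms avoids numerical underflow when many small word probabilities are multiplied together.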


