SpamAssassin: A practical guide to integration and configuration

Overview of this book

As a busy administrator, you know spam is a major distraction in today's network. The effects range from inappropriate content arriving in mailboxes to contact email addresses published on a website being deluged with unsolicited mail, causing valid enquiries and sales leads to be lost and wasting employee time. The perception of the problem of spam is as big as the reality. In response to the growing problem of spam, a number of free and commercial applications and services have been developed to help network administrators and email users combat it. It's up to you to choose an antispam solution and then get the most out of it. Free to use, flexible, and effective, SpamAssassin has become the most popular open source antispam application. Its unique combination of power and flexibility makes it the right choice. This book will help you set up and optimize SpamAssassin for your network.
Table of Contents (24 chapters)
SpamAssassin
Credits
About the Author
About the Reviewers
Introduction
Glossary

Statistical Tests


Various statistical techniques can be used to identify spam. These generally involve a training phase, in which a collection of known spam and ham emails is passed through the filter so that it can learn the typical characteristics of each. Future emails are then classified based on what was learned from past emails. The techniques differ in their choice of tokens and in the algorithms they use to predict whether an email is spam or ham. The tokens are normally words, but can also include email headers, HTML markup within emails, and other characters such as punctuation marks.
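The training phase described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not SpamAssassin's own tokenizer or database format: the names tokenize and train and the token pattern are hypothetical, and tokens are limited to words plus runs of exclamation or question marks.

```python
import re
from collections import Counter

# Hypothetical token pattern: words (letters, digits, $, ', -) or runs of ! and ?
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+|[!?]+")

def tokenize(text):
    """Split text into lowercase tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]

def train(messages):
    """Count, per token, how many spam and how many ham messages contain it.

    messages is an iterable of (text, is_spam) pairs.
    Returns (spam_counts, ham_counts, n_spam, n_ham).
    """
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = n_ham = 0
    for text, is_spam in messages:
        tokens = set(tokenize(text))  # count each token once per message
        if is_spam:
            spam_counts.update(tokens)
            n_spam += 1
        else:
            ham_counts.update(tokens)
            n_ham += 1
    return spam_counts, ham_counts, n_spam, n_ham
```

A real filter would also extract tokens from headers and HTML markup, as noted above; this sketch looks only at the body text it is given.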

Statistical filters rely on regular training. They use the knowledge gained in training to estimate the probability that a new email is spam. As spam changes, the filter must be retrained so that it continues to detect it.
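To make the probability estimate concrete, here is a minimal naive Bayes sketch in Python. It is an assumption-laden illustration of the general idea, not the algorithm SpamAssassin actually uses: the function name spam_probability, the add-one smoothing, and the sample counts are all hypothetical.

```python
import math
from collections import Counter

def spam_probability(tokens, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate P(spam | tokens) with naive Bayes and add-one smoothing.

    spam_counts and ham_counts map a token to the number of training
    messages of that class containing it (Counters, so missing tokens
    count as 0). n_spam and n_ham must both be greater than zero.
    """
    # Start from the prior odds: the share of spam versus ham seen in training
    log_odds = math.log(n_spam / n_ham)
    for tok in set(tokens):
        # Smoothed share of spam and ham messages containing this token
        p_tok_spam = (spam_counts[tok] + 1) / (n_spam + 2)
        p_tok_ham = (ham_counts[tok] + 1) / (n_ham + 2)
        log_odds += math.log(p_tok_spam / p_tok_ham)
    # Clamp to avoid overflow in exp, then convert log odds to a probability
    log_odds = max(-700.0, min(700.0, log_odds))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Made-up training counts from 10 spam and 10 ham messages
spam_counts = Counter({"cheap": 8, "pills": 6, "meeting": 1})
ham_counts = Counter({"meeting": 7, "notes": 5, "cheap": 1})

p_spammy = spam_probability(["cheap", "pills"], spam_counts, ham_counts, 10, 10)
p_hammy = spam_probability(["meeting", "notes"], spam_counts, ham_counts, 10, 10)
```

With these made-up counts, p_spammy comes out close to 1 and p_hammy close to 0. Retraining simply means updating the counts with newer messages, which is how a filter of this kind adapts as spam changes.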

SpamAssassin contains a statistical filter based on Bayesian analysis. This is enabled by default and, if trained properly, aids in the correct recognition of...