SpamAssassin: A practical guide to integration and configuration

Overview of this book

As a busy administrator, you know spam is a major distraction in today's networks. The effects range from inappropriate content arriving in mailboxes to contact email addresses placed on a website being deluged with unsolicited mail, causing valid enquiries and sales leads to be lost and wasting employee time. The perception of the problem of spam is as big as the reality. In response to the growing problem of spam, a number of free and commercial applications and services have been developed to help network administrators and email users combat spam. It's up to you to choose and then get the most out of an antispam solution. Free to use, flexible, and effective, SpamAssassin has become the most popular open source antispam application. Its unique combination of power and flexibility makes it the right choice. This book will help you set up and optimize SpamAssassin for your network.

Chapter 5. Detecting Spam

Although humans can easily distinguish between spam and ham, detecting spam with computer programs is not simple. Over the years, several methods have been developed to filter spam from ham. Some anti-spam tools use only a subset of these methods, but SpamAssassin uses almost all of them.

Content Tests

Content tests analyze the message body of the email, and sometimes the headers. These tests typically look for key words or phrases within emails. Content tests usually rely on a scoring system: because words normally associated with spam also appear in legitimate emails, a single match is not treated as proof, and a score or count of suspicious words is accumulated for each email instead. Each word associated with spam increases the overall score of an email. The final score is compared with a predefined threshold to decide whether the email is spam or ham.
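The scoring idea described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not SpamAssassin's implementation: the word list, the weights, the threshold value, and the function names content_score and classify are all assumptions made up for this example.

```python
import re

# Hypothetical word weights, invented for illustration only.
# These are NOT SpamAssassin's rules or scores.
SUSPICIOUS_WORDS = {
    "viagra": 2.5,
    "winner": 1.5,
    "guarantee": 1.0,
    "free": 0.5,
}

# Assumed cut-off: a message scoring at or above this is treated as spam.
THRESHOLD = 3.0


def content_score(body, weights=SUSPICIOUS_WORDS):
    """Add up the weight of every suspicious word occurrence in the body."""
    words = re.findall(r"[a-z']+", body.lower())
    return sum(weights.get(word, 0.0) for word in words)


def classify(body, threshold=THRESHOLD):
    """Compare the accumulated score with the threshold."""
    return "spam" if content_score(body) >= threshold else "ham"


if __name__ == "__main__":
    for text in (
        "You are a WINNER! Claim your free prize, satisfaction guarantee.",
        "Lunch is free on Friday for the whole team.",
    ):
        print(classify(text), content_score(text), text)
```

With these assumed weights, a message that mentions "free" once stays well under the threshold, while a message combining several suspicious words crosses it. This is why a cumulative score is more forgiving of legitimate mail than rejecting on any single word.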

Content tests need not focus on single words; phrases and sequences of punctuation are also used. The words, phrases...