Spam does not need to be saved on the server, except as a corpus for training the Bayesian database and for score regeneration. In practice, spam emails are usually stored so that users can reclaim any false positives. If auto-learning is used, you can also use the stored spam to verify that false positives have not been learned as spam. This involves checking the spam folder on a daily or weekly basis.
One way to reduce the number of spam emails that must be examined is to divide them into two folders: one for high-scoring spam and another for comparatively low-scoring spam. False positives are unlikely to land in the high-scoring category, so the user need not examine the emails in that folder.
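The split itself is a simple threshold test on the score. As a minimal sketch of that decision only (not of the delivery mechanism), the following Python function assumes the message has already been flagged as spam and carries SpamAssassin's X-Spam-Level header, which holds one asterisk per whole point of score. The cutoff of 10 and the folder names spam-high and spam-low are illustrative assumptions, not values from this article.

```python
from email import message_from_string

# Assumed cutoff: tune it to your own tolerance for false positives.
HIGH_SCORE_STARS = 10


def spam_folder(raw_message: str, threshold: int = HIGH_SCORE_STARS) -> str:
    """Pick a folder for a message that was already flagged as spam.

    The folder names are hypothetical placeholders.
    """
    msg = message_from_string(raw_message)
    # X-Spam-Level holds one asterisk per whole point of score;
    # a missing header counts as zero asterisks.
    stars = (msg.get("X-Spam-Level") or "").count("*")
    return "spam-high" if stars >= threshold else "spam-low"


example = "X-Spam-Level: " + "*" * 12 + "\nSubject: cheap pills\n\nbody\n"
print(spam_folder(example))  # prints "spam-high"
```

Only the spam-low folder then needs regular review. A lower cutoff shrinks that folder but raises the chance that a false positive ends up unexamined in spam-high.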
This filtering can be effected using a Procmail recipe. The X-Spam-Level header contains a number of asterisks to indicate the score of the email. Emails that score between one and two get one asterisk, while emails that score between 12...