Mozilla's Spam filter
Summary of Hard Tecs 4U's review

Note: Netscape 7.1 contains the same Spam filter as Mozilla (1.4), so this summary and the instructions apply to Netscape 7.1 as well.

The German site "Hard Tecs 4U" has reviewed Mozilla's Spam filter and compared it with Mail Shield. They give Mozilla's Spam filter the thumbs up and find its filtering noticeably superior to Mail Shield's*. This is especially interesting since Mail Shield's desktop version costs $60 per license vs. $0 for Mozilla.
The reviewers trained Mozilla's mail filter with 300 Spam messages and 2700 "good" messages. Then they let Mozilla automatically analyze 280 new email messages. Here is a summary of their findings:
  • The analysis of the 2700 "good" training messages took about 1 minute.
  • The subsequent automatic classification (Spam/not Spam) of the 280 new email messages took only about 2-3 seconds.
  • 107 messages were correctly identified as Spam and moved to the "Spam" folder. Most importantly, not a single "good" message was classified as Spam!
  • The remaining 173 messages were classified as not being Spam and moved to the Inbox. Of those, 30 messages (17 per cent) were Spam that the filter failed to identify.
These results are already very encouraging, especially considering that Mail Shield's initial error rate was 50 per cent and, even after manual tweaking, was still 25 per cent.
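
The counts in the summary can be turned into rates with a few lines of arithmetic. The following is only a sketch: the function name `summarize` and its parameter names are made up for illustration, and it assumes, as the review states, that every message moved to the "Spam" folder really was Spam.

```python
def summarize(flagged_spam, inbox_total, missed_spam):
    """Derive rates from the review's counts.

    Assumes every flagged message really was Spam (zero false positives),
    as reported in the review.
    """
    actual_spam = flagged_spam + missed_spam   # all Spam in the test set
    total = flagged_spam + inbox_total         # all new messages
    return {
        "total": total,
        "actual_spam": actual_spam,
        "actual_good": total - actual_spam,
        "detection_rate": flagged_spam / actual_spam,
        "inbox_spam_share": missed_spam / inbox_total,
    }


stats = summarize(flagged_spam=107, inbox_total=173, missed_spam=30)
print(f"Spam in the test set: {stats['actual_spam']} of {stats['total']}")
print(f"Spam caught: {stats['detection_rate']:.0%}")
print(f"Spam share of the Inbox: {stats['inbox_spam_share']:.0%}")
```

In other words, 137 of the 280 new messages were Spam, and the filter caught about 78 per cent of them without a single false positive.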

So what makes Mozilla's Spam filter perform better than Mail Shield's?
According to the review, Mail Shield uses a points-based filter in which each word is assigned a point value. If a message exceeds a certain number of points, it is automatically classified as Spam. Mozilla's Spam filter, on the other hand, weighs words, i.e. it analyzes how often certain words or combinations of words appear in Spam and in "good" messages. This is why it is so important to train Mozilla with Spam as well as with messages that are not Spam. To do that, select messages that are already in your inbox (Windows users: click the first message, then hold down the "Shift" key and click the last message to select all messages in between) and choose Message -> Mark -> As not Junk from Mozilla Mail's menu bar. Alternatively, you can right-click on a message and select Mark -> As not Junk.
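
To make the idea of "weighing words" more concrete, here is a minimal sketch of a filter that compares how often each word appeared in Spam and in "good" training messages. It is not Mozilla's actual code: the class name `WordFrequencyFilter`, the tokenizer, the add-one smoothing and the decision to ignore how common Spam is overall are all simplifying assumptions made for this example.

```python
import math
import re
from collections import Counter


class WordFrequencyFilter:
    """Toy naive-Bayes-style filter (illustrative, not Mozilla's implementation)."""

    def __init__(self):
        self.spam_words = Counter()
        self.good_words = Counter()
        self.spam_total = 0
        self.good_total = 0

    @staticmethod
    def tokenize(text):
        # Lower-case the text and split it into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, is_spam):
        words = self.tokenize(text)
        if is_spam:
            self.spam_words.update(words)
            self.spam_total += len(words)
        else:
            self.good_words.update(words)
            self.good_total += len(words)

    def spam_score(self, text):
        """Sum of per-word log-odds; above 0 leans Spam, below 0 leans 'good'."""
        # Vocabulary size plus one slot for unseen words (add-one smoothing).
        vocab = len(set(self.spam_words) | set(self.good_words)) + 1
        score = 0.0
        for word in self.tokenize(text):
            p_spam = (self.spam_words[word] + 1) / (self.spam_total + vocab)
            p_good = (self.good_words[word] + 1) / (self.good_total + vocab)
            score += math.log(p_spam / p_good)
        return score


f = WordFrequencyFilter()
f.train("buy cheap pills now", is_spam=True)
f.train("cheap offer click now", is_spam=True)
f.train("meeting agenda for tomorrow", is_spam=False)
f.train("lunch tomorrow with the team", is_spam=False)

print(f.spam_score("cheap pills") > 0)        # leans Spam
print(f.spam_score("meeting tomorrow") < 0)   # leans "good"
```

Because each word's weight is a ratio of its frequency in the two kinds of mail, the "good" training messages matter just as much as the Spam ones.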
In my opinion, having a "Junk" icon but no "Not Junk" icon in the toolbar is a major usability shortcoming of Mozilla. As a result, many users train the Spam filter only with Spam and not with "good" messages, which leads to worse results than a properly trained Spam filter would deliver.

Getting back to the summary: where Mail Shield just counts individual words, Mozilla's Spam filter analyzes words in context. The disadvantage of Mail Shield's approach is that if a Spammer writes, e.g., "e.n.l.a.r.g.e" instead of "enlarge", Mail Shield's filter will not detect it, i.e. it will not assign it any points. Mozilla, on the other hand, is not thrown off by unknown words since the rest of the message will most likely follow a Spam pattern. Hard Tecs 4U also observed that after a while Mail Shield's results actually got worse - it started identifying all messages as Spam.
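
The weakness of a static points list can be illustrated with a small sketch. The word list, the point values and the threshold below are invented for this example and are not taken from Mail Shield.

```python
# Hypothetical point values and threshold, chosen only for illustration.
POINTS = {"enlarge": 5, "free": 2, "offer": 2}
THRESHOLD = 6


def points_score(text):
    """Add up the points of every known word in the message."""
    words = (w.strip(".,:!?") for w in text.lower().split())
    return sum(POINTS.get(w, 0) for w in words)


def is_spam(text):
    return points_score(text) >= THRESHOLD


plain = "Free offer: enlarge it now"
obfuscated = "Free offer: e.n.l.a.r.g.e it now"

print(points_score(plain), is_spam(plain))            # 9 True
print(points_score(obfuscated), is_spam(obfuscated))  # 4 False
```

The obfuscated spelling is simply not in the list, so it earns no points and the message slips under the threshold.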
According to Hard Tecs 4U, the second big advantage of Mozilla's Spam filter over static filters is that it continuously learns: if Spammers change individual words, Mozilla will learn to recognize them from the context in which they are used.
In short, they think Mozilla's Spam filter works great, and they highly recommend Mozilla both as a browser and as a mail client.

My personal results with Mozilla's Spam filter after training it properly are as follows:
- False positives: Zero
- Spam automatically recognized: on average 70-80%

The important thing here is that no "good" message was falsely recognized as Spam. Mozilla does not, however, automatically recognize all of my Spam messages as such, for the following reasons:

- There is a trade-off between recognizing Spam and producing false positives. A very aggressive filter will probably recognize close to 100% of Spam messages, but it will also give you many false positives, i.e. you may accidentally lose important messages. I personally prefer Mozilla's slightly less aggressive settings.
- Sometimes there are new variants that Mozilla's Spam filter does not (yet) recognize. Once it has learned their composition, they, too, will be detected.
- Some "Spoof" messages, like "update your account information" (never enter your password or follow links in those types of messages!), are virtually indistinguishable from official messages, since the criminals who send them try their best to make them look like they originate from eBay, PayPal, your bank, etc.
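
The trade-off in the first point can be shown with a handful of made-up numbers. In the sketch below, each message has a hypothetical Spam score between 0 and 1, and everything at or above a chosen threshold is treated as Spam. The scores and labels are invented for illustration and do not come from Mozilla or from the review.

```python
# Hypothetical (score, really_spam) pairs; a higher score means "more Spam-like".
SCORED = [
    (0.99, True), (0.95, True), (0.80, True), (0.60, True), (0.40, True),
    (0.70, False), (0.30, False), (0.10, False), (0.05, False), (0.02, False),
]


def evaluate(threshold):
    """Return (share of Spam caught, number of false positives) for a threshold."""
    total_spam = sum(1 for _, spam in SCORED if spam)
    caught = sum(1 for score, spam in SCORED if spam and score >= threshold)
    false_positives = sum(
        1 for score, spam in SCORED if not spam and score >= threshold
    )
    return caught / total_spam, false_positives


for threshold in (0.90, 0.75, 0.35):
    share, false_positives = evaluate(threshold)
    print(f"threshold {threshold:.2f}: {share:.0%} of Spam caught, "
          f"{false_positives} false positive(s)")
```

With these numbers, lowering the threshold from 0.75 to 0.35 raises the share of Spam caught from 60% to 100%, but one "good" message is now lost to the Spam folder.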

In conclusion, I am very impressed by the capabilities of Mozilla's Spam filter, but it is important to note that it is not 100% bulletproof. Until the Spam problem has been fixed on a regulatory level, you still need to be careful about who you give your email address to, avoid posting it on the Internet (e.g. on message boards), and be careful with attachments (viruses), with "Click here to opt-out" links in Spam and with entering personal information (Spoof messages).
But if you use Mozilla's Spam filter and follow the advice in my Privacy and Security tutorial, the web and email will be safer and more pleasant to use.

Further reading:
- You can find the entire article (in German) here: Der Spamfilter von Mozilla 1.3.
- Using the Junk Mail Control (official tutorial at Mozilla.org)
- If you have ever wondered about the economics of Spam, i.e. why Spammers go on Spamming even though (almost) no one seems to buy anything because of Spam, I recommend the following article by The Register: The Economics of Spam.



*Disclaimer: All statements regarding Mail Shield and how well it works (or not) are quoted from Hard Tecs 4U's review of Mozilla's Spam filter. As I have not used this program, I cannot verify the validity of these statements. The price quoted for a one-seat license of Mail Shield was checked on May 22nd, 2003.



