Now it is time to look at the various technologies that allow you to
manage a Spam problem effectively. Many of them work if you use
an e-mail client on your own computer. Users of a web-based system have
somewhat fewer choices.
I would like to point out that there are now MANY different commercial
anti-spam products available. Just go to Google and search for "anti spam"
or "spam blocking", and you will see the sponsored (commercial) links on the
side, which lead to those products. I have not tested any of them, and I
therefore cannot comment on their value. I would imagine that many of them
use much the same algorithms as the freely available solutions that you can
download, so I would think they should work well.
But then, if a solution is available for free, why not use it? If you are
into karting or any form of motor sport, you will BY DEFINITION be short of
funds, so something free seems to have exactly the right price, as far as
I am concerned. :-)
Bayesian filtering
This is one of the most promising and effective ways to fight Spam. I
recently downloaded a Bayesian filtering plugin for Outlook, and it now
routinely moves about 90% of my Spam into a special Spam folder the
moment I receive it, without my ever seeing it, while all my
legitimate mail remains in my Inbox. And this detection rate is going
to improve as the system is trained...
... that's right, the system is trained. The advantage of this is
quite simple: Instead of relying on someone else's definition of what
is Spam and what is not, you can train the system to recognize what
YOU consider to be Spam. The system works so amazingly well, and is
so easy to use, that it is surprising that more people are not using it.
As a first step, you need to collect a good-sized sample of your
good messages and of the Spam you have received. You should have AT
LEAST a few dozen messages in each category, and ideally a few hundred.
But any number is a start: the more examples you have, the more accurate
the system is.
After you install the system, you typically tell it which mail folders
contain Spam and which contain good messages. It then performs a statistical
analysis of those messages, and trains itself to recognize which is which.
With every new message you receive, it learns more. If a message is
classified incorrectly, you can easily tell the system about it, so that
after a while it won't make the mistake again. A nice side effect of this
is that the system will always adapt to new trends in Spam, gently aided
by some occasional human training.
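For readers who like to see the idea in code, here is a minimal sketch of
the kind of statistical training described above. It is a toy word-frequency
(naive Bayes) filter and NOT the code of any of the products mentioned on
this page, which use more refined token handling and scoring. The class name
TinyBayesFilter, the labels "spam" and "ham" (ham = good mail) and the sample
messages are all made up for illustration.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy word-frequency Spam filter (illustrative sketch only)."""

    def __init__(self):
        # How many training messages of each kind contained each word.
        self.words = {"spam": Counter(), "ham": Counter()}
        # How many training messages of each kind we have seen.
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Lower-case words of three or more letters, counted once per message.
        return set(re.findall(r"[a-z']{3,}", text.lower()))

    def train(self, text, label):
        # label must be "spam" or "ham".
        self.messages[label] += 1
        self.words[label].update(self.tokens(text))

    def spam_probability(self, text):
        # Start from the ratio of Spam to good mail seen so far, then add
        # the evidence of every word. The +1 / +2 terms are add-one
        # smoothing, so words never seen before cannot divide by zero.
        log_odds = math.log((self.messages["spam"] + 1) /
                            (self.messages["ham"] + 1))
        for word in self.tokens(text):
            p_spam = (self.words["spam"][word] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.words["ham"][word] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so that exp() cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


# Train from two "folders" (hypothetical sample messages).
spam_filter = TinyBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("win money now click here", "spam")
spam_filter.train("kart race schedule for sunday", "ham")
spam_filter.train("engine rebuild parts arrived", "ham")

score = spam_filter.spam_probability("buy cheap pills")
```

Correcting a mistake is just one more call to train() with the right label,
which is why a single 'teaching' can already change how similar messages
are scored.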
Right now I am still checking my Spam folder once a day to see if something
was misclassified. That usually happens only to certain newsletters
that I have subscribed to. After a single 'teaching' with that particular
message, the following newsletters are recognized correctly. I am still
receiving a few Spam messages each day in my Inbox, which the system was not
100% sure about. When that happens, I simply move them into the Spam folder,
and the system automatically learns that this kind of message is Spam, and
will likely not make this mistake again.
You can download several free implementations of Bayesian Spam filters. Many
of those systems perform additional analysis to increase accuracy. There is
also a host of commercial products available. Some of the free ones can be
found here:
- SpamBayes.org
This is a completely free and very effective solution, which works on
Windows, Linux/Unix and MacOS. If you are a Windows user, and use Outlook,
it even provides a very convenient plugin for you here, which is fully
integrated into the Outlook application, and allows you to perform all
the training without ever having to leave Outlook. I am using this
to handle my e-mail messages with great effect.
- SpamAssassin
Similar to SpamBayes, but using even more ways to detect Spam. SpamAssassin
for Unix is free, and is something of an industry standard. You can also get
a free 'proxy' version here (a commercial version with support is also
offered), which you can install on your Windows machine, but which is not
integrated into your e-mail client program. On the other hand, it allows you
to work with pretty much any e-mail client program you wish. You can get a
plugin for the Eudora mail client here, but that is a commercial product,
with a 30 day free trial.
- Spammunition
This is another completely free anti-Spam filter plugin for Outlook. I have
used it as well, and it works very nicely.
- Mozilla
Mozilla is of course the successor of the Netscape web-browser, and is not only
a web-browser, but also a news reader, mail client and more. Without your having
to install anything extra, Mozilla's mail client has a Bayesian Spam filter
already fully integrated, which you can simply train from within Mozilla. Very
convenient. Mozilla is a good browser, too.
These are just some examples of the Bayesian filtering solutions that are available.
Many more can be found on the Internet, but these should give you a start. If
you are interested in learning more about the background of Bayesian filtering, and where
it was introduced as a Spam fighting concept, I recommend the article
A Plan for Spam by Paul Graham. It explains what Bayesian Spam filtering is all about.
Using people to recognize Spam
This is an interesting concept, which also relies on plugins in mail clients. Anyone
who installs the plugin can report a Spam message to a central server with a single
click of the mouse. Once a sufficient number of people have reported
a particular message as Spam, the server updates all the client installations with an
additional 'fingerprint' for this message, so that all the clients can recognize it when
it arrives and filter it out. I have not used such a system myself yet, but I
assume that it is intelligent enough not to be confused by the random characters which
are often added to Spam.
An example of a system like this is Cloudmark.
This should probably work quite well. The advantage is that it is useful
even for those occasional e-mail users who do not have a sufficiently large
body of 'good' e-mail to train a Bayesian system with (although my Bayesian filtering
worked well even though I did not have many messages to train with). The problem with many
of those 'user driven' systems is that they are not free. Even though the monthly
fee is quite low, it is still a cost. You can download a free trial version, though,
if you are an Outlook user.
The same idea is also used by iHateSpam, another commercial product, to
supplement the filtering capabilities it has built in. It costs about $19.95 and
is available as a plugin for Outlook and Outlook Express.
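To make the report-and-fingerprint idea from this section a little more
concrete, here is a rough sketch of how a central server could count reports
and still recognize a message that has a few random characters tacked on.
This is purely my assumption of one workable approach (comparing overlapping
word triples); I do not know how Cloudmark or iHateSpam actually compute
their fingerprints. The names ReportServer, min_reports and min_similarity,
and the threshold values, are invented for this example.

```python
import re


def shingles(body, size=3):
    """Return the set of overlapping word triples found in a message."""
    words = re.findall(r"[a-z]+", body.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b):
    """Jaccard similarity: shared triples divided by all triples."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


class ReportServer:
    """Hypothetical central server that collects Spam reports."""

    def __init__(self, min_reports=3, min_similarity=0.6):
        self.min_reports = min_reports        # reports needed to flag Spam
        self.min_similarity = min_similarity  # how close counts as "same"
        self.entries = []                     # [fingerprint, reporting users]

    def _find(self, prints):
        # Look for an already known message that is similar enough.
        for entry in self.entries:
            if similarity(entry[0], prints) >= self.min_similarity:
                return entry
        return None

    def report(self, user, body):
        prints = shingles(body)
        entry = self._find(prints)
        if entry is None:
            entry = [prints, set()]
            self.entries.append(entry)
        entry[1].add(user)  # a set, so one user cannot vote twice

    def is_spam(self, body):
        entry = self._find(shingles(body))
        return entry is not None and len(entry[1]) >= self.min_reports
```

Because only a small share of the word triples changes when a spammer
appends a random string, the altered copies still match the stored
fingerprint, while an unrelated message shares no triples at all.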
Challenge - Response
This is a system which, at first glance, should be an absolutely accurate way to
eliminate Spam: Instead of making a message visible to you in your Inbox, the
system holds the message for a while and sends an e-mail back to the
sender. The reply reads something like this:
"Hi! You have just sent an e-mail to racing_at_domain-name-of-this-site.com. To confirm that you are
really human, would you mind filling out this form, by specifying your name? Once
you have done this, your original message will be sent on to the intended recipient.
Thank you!"
An example of a major provider who offers this system is Earthlink.
The idea is that the sender of legitimate e-mail will complete the challenge that is
described in the e-mail, while a spammer will either not bother, or will never receive
the challenge at all, since spammers normally use fake 'from' addresses.
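The hold-and-release logic described above can be sketched in a few lines.
This is only an illustration under my own assumptions: instead of the form
asking for a name, the challenge here carries a random token that the sender
has to send back, and senders who have answered once are remembered. The
class name ChallengeResponseGate and its methods are hypothetical, and the
actual sending of the challenge e-mail is left out.

```python
import secrets


class ChallengeResponseGate:
    """Toy challenge-response gate (illustrative sketch only)."""

    def __init__(self):
        self.approved = set()  # senders who have already answered once
        self.held = {}         # token -> (sender, message) awaiting an answer

    def receive(self, sender, message):
        """Return ("deliver", message) or ("challenge", token)."""
        if sender in self.approved:
            return ("deliver", message)
        # Unknown sender: hold the message and issue a hard-to-guess token,
        # which would be e-mailed back to the sender as the challenge.
        token = secrets.token_urlsafe(16)
        self.held[token] = (sender, message)
        return ("challenge", token)

    def answer(self, token):
        """Release the held message for a valid token, otherwise None."""
        held = self.held.pop(token, None)
        if held is None:
            return None
        sender, message = held
        self.approved.add(sender)
        return message
```

A spammer using a fake 'from' address never sees the token, so the held
message is simply never released.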
A challenge-response system is very effective against Spam. However,
the big problem is that it puts a burden on the legitimate sender of
e-mail. If you are an online business, you want to make initial customer contact as
easy and painless as possible. Sending a challenge back may just frustrate some
potential customers, so that you will never hear from them again.
Also, what if the person who tried to contact you does not speak English?
However, if you use your e-mail for personal communication only, and you know that
people who contact you will be able to respond properly to a challenge, then a
challenge-response system may be right for you.
Web-based e-mail
Web-based e-mail systems have become popular, since you can read your mail from
anywhere, and the mail account is completely free. Many of those systems, for
example Yahoo or Hotmail, now also offer Spam filtering. While these Spam
filters can be effective, the problem remains that the filter is centrally
administered. It therefore cannot be trained by YOU, and it
will always remain more or less inaccurate. Nevertheless, you should consider
switching to those web-mail providers which do offer Spam filters, since something
is better than nothing. You may have to read the messages classified as 'Spam'
a bit more carefully, though, to make sure you catch false positives.
Many of these web-based mail services allow you to access your mail via POP (the
Post Office Protocol, which is the standard protocol on the Internet to receive
e-mail). In that case, you can use client software, such as Outlook,
Eudora or Mozilla, and all the filtering capabilities that come with them. In order
to receive your e-mail via POP, the web-based services usually require you to
sign up to their 'premium' services, which cost a certain amount per month. That
makes sense, since you will not be viewing their advertisements anymore, which you
would normally see while checking your messages.
However, if you have to deal with a lot of Spam, and you need the flexibility of a system
which can operate as web-based e-mail when you need it (when you travel, for
example), the fee may be worth it.
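If you are curious what receiving mail via POP looks like underneath a
client program, here is a small sketch using Python's standard poplib and
email modules. The host name, user name and password you would pass in are
placeholders that your own provider has to supply, and I am assuming the
provider accepts POP over SSL on the usual port 995. The function names are
my own.

```python
import poplib
from email import message_from_bytes


def fetch_subjects(conn):
    """Return (sender, subject) for every message waiting on the server."""
    count, _mailbox_size = conn.stat()
    result = []
    for number in range(1, count + 1):  # POP numbers messages from 1
        _response, lines, _octets = conn.retr(number)
        msg = message_from_bytes(b"\r\n".join(lines))
        result.append((msg.get("From", ""), msg.get("Subject", "")))
    return result


def open_mailbox(host, user, password):
    # host, user and password are placeholders for your provider's details.
    conn = poplib.POP3_SSL(host, 995)
    conn.user(user)
    conn.pass_(password)
    return conn
```

Once the messages are on your own machine like this, any local filter can
look at them before they reach your Inbox.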