It seems that spammers have introduced a new twist on an old tactic. When Paul Graham's article "A Plan for Spam" introduced Bayesian filtering principles to the antispam world, spammers were quick to react to this new threat. Since their spam was now being scored on its full content (and not just by naive keyword matching), they started including snippets of legitimate text along with their spam messages. This legitimate text, since it wasn't part of their marketing campaign, was typically displayed in an impossibly small font or in invisible (i.e., white-on-white) colors.
Anyway, I recall seeing text pulled from such works as Moby Dick, Ulysses, and various works of Shakespeare. It didn't matter what the text was, as long as it didn't look very much like spam. As far as I can tell, there are at least two goals involved here:
Recently, several people who keep journals (cetan, leroy_brown242, Amy) have received messages from other Internet users wondering why some of their journal text was included in a spam message. Obviously, the journal authors have nothing to do with the sending of the spam. It seems that the spammers are now scraping text off the Internet instead of using text from the classics.
Perhaps this approach is intended to more closely match the kind of text that people normally receive in email. Because the text is written by today's Internet users and not 19th-century authors, its vocabulary is better suited to confusing spam filters.
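To make that intuition concrete, here is a rough sketch of the scoring rule described in "A Plan for Spam": take the most "interesting" tokens (those whose spam probability is farthest from 0.5), give never-seen tokens a neutral 0.4, and combine them. The token probabilities, word lists, and the name `spam_score` are all made up for illustration, and this is a simplification rather than what any particular filter actually does.

```python
from math import prod

UNSEEN = 0.4      # probability given to never-seen tokens in Graham's essay
TOP_N = 15        # number of "most interesting" tokens that get combined
THRESHOLD = 0.9   # treat the message as spam above this score

def spam_score(tokens, token_probs):
    """Combine per-token spam probabilities into one message score."""
    unique = list(dict.fromkeys(tokens))  # de-duplicate, keep order
    probs = [token_probs.get(t, UNSEEN) for t in unique]
    # Most interesting first: farthest from the neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:TOP_N]
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    return spam / (spam + ham)

# Hypothetical per-token probabilities learned from one user's mail.
token_probs = {
    "viagra": 0.99, "pills": 0.97, "cheap": 0.95, "order": 0.90, "now": 0.80,
    "meeting": 0.02, "lunch": 0.03, "project": 0.03, "weekend": 0.04, "thanks": 0.05,
}

pitch = ["viagra", "pills", "cheap", "order", "now"]
# Words from the classics: this filter has never seen them, so each gets 0.4.
classic_padding = ["leviathan", "harpoon", "quoth", "thee", "whale"]
# Words scraped from a journal: common in this user's legitimate mail.
journal_padding = ["meeting", "lunch", "project", "weekend", "thanks"]

score_plain = spam_score(pitch, token_probs)
score_classic = spam_score(pitch + classic_padding, token_probs)
score_journal = spam_score(pitch + journal_padding, token_probs)

for label, score in [("pitch only", score_plain),
                     ("classic padding", score_classic),
                     ("journal padding", score_journal)]:
    verdict = "spam" if score > THRESHOLD else "not spam"
    print(f"{label:16s} {score:.4f} {verdict}")
```

With these invented numbers, the padding from the classics barely moves the score, because words the filter has never seen sit close to neutral. The journal-style padding drags the score well under the threshold, because those words already count strongly as legitimate for this recipient.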
This new technique is surprising and annoying to those users whose text is used in spam. Most recipients of the spam will either not see the message at all, not see the small or obscured text, or just ignore it. The few who do read the whole message and google key words or phrases to find the original author's journal seem savvy enough by that point not to accuse the user whose journal text was used.
Fortunately, Bayesian filtering techniques are just one weapon in the fight against spam. With blacklists, SPF, virus scanners, and the battery of tests provided by SpamAssassin, I now get, on average, about 5 spam messages in my inbox per day. Since my mail server receives about 1000 spam messages per day, that's less than a 1% miss rate on my spam filters.
2005-04-26T05:58:44Z