Date: 2006-01-24 21:02:00
Tags: web
email address obfuscation

At the bottom of each of my web pages, I have a footer that looks something like this:

Greg Hewgill greg@hewgill.com

Having my email address so easily accessible for so many years (hewgill.com celebrates its 10th anniversary next year) has undoubtedly contributed to the incredible volume of spam I get. However, I don't want to reduce the usability and accessibility of my site, and if people would like to click on my email address to email me, they should be able to. Therefore I have so far refused to obfuscate my email address as it is published on my own web site.

It occurred to me that the risk of an automated email harvester robot picking up my email address could be reduced significantly by using a bit of JavaScript. If I avoid mentioning the actual email address as literal text in the HTML source, then email harvester robots that only scan the source will probably miss it. Here's the code I used:

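// Write the address from its parts: a[0] + '@' + the remaining parts joined by '.'
function email_address(a) {
    for (var i = 0; i < a.length; i++) {  // 'var' keeps i from leaking into the global scope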
        document.write(a[i]);
        if (i == 0) {
            document.write('@');
        } else if (i+1 < a.length) {
            document.write('.');
        }
    }
}
// Write a clickable mailto: link whose href and visible text are both the assembled address
function email_link(a) {
    document.write("<a href=\"mailto:");
    email_address(a);
    document.write("\">");
    email_address(a);
    document.write("</a>");
}

Then, in the HTML source where you want to use a clickable email address, do something like the following:

Greg Hewgill <script type="text/javascript">email_link(['greg', 'hewgill', 'com'])</script>
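
As a rough sketch of the same joining logic, here is a variant that returns strings instead of calling document.write, which makes it easy to check outside a browser. The names buildEmailAddress and buildEmailLink are hypothetical and not part of the code above; this is not what is deployed anywhere.

```javascript
// Hypothetical pure-function variant of email_address/email_link.
// It returns strings rather than writing to the document.
function buildEmailAddress(parts) {
    var result = '';
    for (var i = 0; i < parts.length; i++) {
        result += parts[i];
        if (i == 0) {
            result += '@';              // separator after the local part
        } else if (i + 1 < parts.length) {
            result += '.';              // dots between the domain labels
        }
    }
    return result;
}

function buildEmailLink(parts) {
    var address = buildEmailAddress(parts);
    return '<a href="mailto:' + address + '">' + address + '</a>';
}
```

In a page, this could be used as document.write(buildEmailLink(['greg', 'hewgill', 'com'])), which should produce the same markup as email_link.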

In this way, users whose browsers understand JavaScript will see a clickable link just like the example at the top of this post. However, automated email harvester robots that only read the HTML source won't see anything resembling an email address, so they won't pick it up. It seems unlikely to me that an email harvester would go to the trouble of actually executing embedded JavaScript code (if they do, I can think of a lot more fun things to do to them).
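
To illustrate the point, here is a small sketch of what a naive harvester might do. It assumes the harvester simply scans the raw HTML for address-shaped text. The pattern and the harvestAddresses function are made up for illustration and are not taken from any real harvester.

```javascript
// Assumption: a naive harvester only pattern-matches the raw HTML source.
// EMAIL_PATTERN is an illustrative, deliberately simple address pattern.
var EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function harvestAddresses(htmlSource) {
    // match() with a global regex returns every match, or null if there are none
    return htmlSource.match(EMAIL_PATTERN) || [];
}

// Footer with the address written out literally in the HTML.
var plainSource =
    'Greg Hewgill <a href="mailto:greg@hewgill.com">greg@hewgill.com</a>';

// Footer using the script call; "<\/script>" is just "</script>" written so
// that this snippet is safe to paste into an inline script block.
var scriptedSource =
    'Greg Hewgill <script type="text/javascript">' +
    "email_link(['greg', 'hewgill', 'com'])<\/script>";

var plainHits = harvestAddresses(plainSource);       // finds the address twice
var scriptedHits = harvestAddresses(scriptedSource); // finds nothing
```

A harvester that actually executes scripts would of course still see the address; the sketch only covers the source-scanning case.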

One risk with this method is that it makes the email address unusable for anybody whose browser does not understand JavaScript (or has it disabled). However, with so many sites using "Ajax" and other JavaScript-based technologies to offer routine functionality, I think this is a small risk today.

Having said all that, I still haven't deployed this method on my own web site. But it's live on Amy's web site if you want to see it in action.

[info]cowbert
2006-01-25T06:37:18Z
Yeah, I like how the LJ maintenance people have noticed that "some users of Netscape 4.7 and 7 cannot stay logged in". Ok, you don't deserve to be on the web these days if you are using Netscape 4.7.
[info]lithiana
2006-01-25T08:37:04Z
hmm. i recently turned off javascript after deciding that the benefits of having it enabled don't make up for the XSS problems in so many websites (not just recent LJ problems - no-one seems able to do it properly).

however, you could probably use <noscript> to emit an obfuscated (but still present) textual address?
[info]ghewgill
2006-01-26T05:45:37Z
I thought about the <noscript> section to offer a slightly-obfuscated version for non-javascript browsers. My initial thought was that sector would be safe to ignore, but I think I'm going to reconsider.
(anonymous)
2006-02-04T10:06:37Z
Another option is to provide a link to a special page in the <noscript> block. The destination page would have an HTML form that lets people send an email through the web page interface.

It doesn't let people use their favorite email program, but it's probably sufficient for people who only want to send a quick note and don't have javascript enabled.
[info]goulo
2006-01-25T12:08:05Z
I think some people intentionally disable javascript when browsing for whatever security/privacy/paranoia reasons, right?

Going to this sort of javascript solution seems contrary to your oft-stated goal to make your email address as easily accessible as possible, spammers be damned. Javascript obfuscation seems like the thin edge of the wedge. :)
[info]ghewgill
2006-01-26T05:46:43Z
I think adding a <noscript> version (obfuscated) for non-javascript browsers is a reasonable middle ground. Spam is getting rather annoying again.
[info]mskala
2006-01-25T13:42:22Z
I don't use Javascript when I can possibly avoid it, and I think email address obfuscation is very rude.
[info]cetan
2006-01-25T14:43:49Z
Are bots really (still) sucking up email addresses from web pages? I was under the impression that most spam is blasted to a domain with many combinations of letters in hopes that it reaches an inbox.

gref@ greg@ greh@ grei@ etc...

[info]ghewgill
2006-01-26T05:47:52Z
They absolutely are. I started getting spam to the email subscription addresses posted on http://hewgill.com/nwr/ (the Subscribe links). I don't see a lot of dictionary-style spam attempts on my domain.
[info]cetan
2006-01-26T14:43:23Z
"dictionary-style"

That's the phrase I couldn't think of! That brain of mine really needs an overhaul.

Thanks for the info.
[info]ehintz
2006-01-27T22:42:36Z
I'm in the same boat, IIRC mine first went online 10 yrs ago come April. Though I haven't really bothered, I just let the filters do their thing. I'm on so damn many lists now that it seems somewhat pointless to attack it from that end. Fortunately thus far the spammers aren't generally smart enough to outsmart the hackers. Yet.
Greg Hewgill <greg@hewgill.com>