Date: 2009-11-07 21:38:00
lnk.nu and Google App Engine
Last week, I was quietly working on my web server when all of a sudden the whole thing ground to a near-halt. It wasn't completely dead, because it would still ping and every few minutes I would receive another packet of characters (I was literally in the middle of refreshing a screen I was looking at). Not knowing what was happening and not having any way to find out, I went and did something else for a while.

45 minutes later, my server returned to normal operation as if nothing had happened. This did not appear to be just a network congestion problem; it was definitely something my server was busy doing. A bit of investigation showed that the culprit was in fact lnk.nu. Hundreds of machines all across Canada had accessed the same short link at the same time, completely pegging the PostgreSQL database processes and running my server out of memory. It's quite a testament to both FreeBSD and PostgreSQL that they survived at all.

What I believe happened was that somebody had sent an email containing a lnk.nu link to a mailing list to which lots of people from Canada were subscribed. (The link in question happened to be a job opening at who.int.) Looking up the reverse PTR records for the machines that loaded the URL, there are names like "mail", "barracuda", "filtre", "antispam", "mx1", "incoming-smtp", "guardian", etc. It seems that they all accessed the link for purposes of virus checking, all at pretty much exactly the same time. This was not good for my poor server.
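For illustration, here is a minimal Python sketch of that kind of check, assuming the client IP addresses have already been pulled out of the access log. The function names and the list of hints are hypothetical (the hints are just fragments of the hostnames quoted above), and this is not the script that was actually used.

```python
import socket

# Fragments of the hostnames seen in the logs; an illustrative list only.
MAIL_FILTER_HINTS = ("mail", "barracuda", "filtre", "antispam", "mx", "smtp", "guardian")


def reverse_lookup(ip):
    """Return the PTR name for an IP address, or None if it has no reverse record."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def looks_like_mail_filter(hostname):
    """Return True if the first label of the hostname contains a mail-filter hint."""
    first_label = hostname.lower().split(".", 1)[0]
    return any(hint in first_label for hint in MAIL_FILTER_HINTS)
```

Running reverse_lookup over each address and counting how many names pass looks_like_mail_filter would give a rough idea of how much of the burst came from mail scanners rather than people.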

I decided that it might be time to move lnk.nu to a different server. It's written in Python, so it's an ideal candidate for Google App Engine, and I've been looking for an excuse to play with GAE. So I downloaded the SDK, converted the code over to GAE (using Google's datastore instead of SQL), and made it work locally. This part was refreshingly easy and worked well.

The next step is to set up the Google site so it responds to http://lnk.nu and handles the requests appropriately. Given that I've already got the code working locally, that should be straightforward. However, there is one gigantic caveat when using Google web site services (that I've actually already run into for another project): You cannot have Google's servers respond to a "naked" domain name that doesn't have a hostname. This means that having Google respond to http://lnk.nu is not possible.

(There is in fact a good technical reason for the above restriction. When you set up a site with Google hosting, you add a CNAME record to the DNS for your hostname, i.e. "www.example.com. CNAME ghs.google.com.". This lets Google completely manage the association between "ghs.google.com" and any particular IP address(es), which is critical for their load balancing setup. The caveat is that a name with a CNAME record must not have any other DNS records associated with it, including an SOA record. The SOA record is required on a "naked" domain name like lnk.nu, so you can't add a CNAME there.)
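The rule can be expressed as a small check over a zone's records. This is a simplified sketch, assuming the zone is given as (owner name, record type) pairs; the function name is made up for illustration, and it ignores the exceptions real DNS makes for DNSSEC record types.

```python
def cname_conflicts(records):
    """Return the owner names that have a CNAME alongside any other record type.

    records is an iterable of (owner_name, record_type) tuples.
    """
    types_by_name = {}
    for name, rtype in records:
        types_by_name.setdefault(name.lower(), set()).add(rtype.upper())
    return sorted(
        name
        for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


# The naked domain must carry SOA and NS records, so a CNAME there conflicts.
zone = [
    ("lnk.nu.", "SOA"),
    ("lnk.nu.", "NS"),
    ("lnk.nu.", "CNAME"),
    ("a.lnk.nu.", "CNAME"),
]
conflicts = cname_conflicts(zone)
```

Here conflicts contains only "lnk.nu.", because a.lnk.nu. has nothing but its CNAME.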

To work around this, I'll have to set up a hostname that Google can respond to, something like http://a.lnk.nu. Of course, that's a pretty lame name for a link shortener to use, so I'll still want the published link to be http://lnk.nu/blahblah. This means that I'll have to have some other, non-Google server respond to a http://lnk.nu/blahblah request with a redirect to http://a.lnk.nu/blahblah. This adds another level of indirection to the resolution process for a shortened link, which adds another browser round-trip, which might slow the whole experience down no matter how fast GAE hosting ends up being.
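What the intermediate server has to compute is small: keep the path, swap the hostname. Here is a minimal sketch using only the Python standard library; the a.lnk.nu hostname is the placeholder from the paragraph above, and redirect_target is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit

TARGET_HOST = "a.lnk.nu"  # placeholder hostname that Google would answer for


def redirect_target(url, target_host=TARGET_HOST):
    """Return the same URL with its host replaced, keeping path and query."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, target_host, parts.path, parts.query, parts.fragment)
    )
```

For example, redirect_target("http://lnk.nu/blahblah") gives "http://a.lnk.nu/blahblah", which the server would send back in the Location header of a redirect response.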

It turns out that Namecheap (my registrar for lnk.nu) offers "URL redirection" where their server will respond to a particular hostname and redirect the browser somewhere else. It can also be configured to retain URL path information, so http://lnk.nu/blahblah would redirect to http://a.lnk.nu/blahblah. This would completely take my own server out of the loop, hopefully avoiding any more problems like those last week.
[info]taral
2009-11-07T18:14:22Z
Well, you *could* have a DNS server proxy out rewritten versions of the x.google.com records...
[info]ghewgill
2009-11-07T18:26:52Z
I don't think I can be sure that Google won't return custom responses depending on who's asking. Also, this post says something about Google occasionally forcing TCP resets when an app is running on a naked domain, which I don't quite understand but nevertheless I'd like to try to avoid doing unsupported things.

Aha, here's the support request related to this: http://code.google.com/p/googleappengine/issues/detail?id=777 I've added my support by "starring" it.
[info]taral
2009-11-07T20:08:55Z
The TCP resets are to force people to reconnect when they have a persistent connection to the GAE. It's really nothing to do with naked domains, and everything to do with directing connections to a single IP in the old way that is no longer supported.
[info]ghewgill
2009-11-08T07:22:41Z
Well that makes sense. But I can't think of a use case where you'd want a persistent connection to a GAE server without also building it to handle a disconnect at any time.

Even if I did mirror a resolved A record in the naked domain, I'm not sure how I would get GAE to respond correctly when the naked domain appears in the Host: header from the browser. The help page here shows a form at the bottom that you use to tell GAE which hostnames to respond on, and it doesn't accept an empty field.
[info]taral
2009-11-08T18:33:41Z
Wow, that's annoying. You might open a ticket with them and say "I can make it work, just let me create the damn things." :)
[info]decibel45
2009-11-12T16:41:06Z
See, I would have just set up an appropriately sized connection pool so that it wouldn't run the box out of memory... :P
Greg Hewgill <greg@hewgill.com>