Date: 2008-12-04 23:00:00
data points

Probability distributions can be funny things. I read today that just three pieces of information (a ZIP code, a date of birth, and a gender) are enough to uniquely identify 87 percent of the U.S. population [see Cryptography: How to Keep Your Secrets Safe, page 5].
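A quick back-of-the-envelope calculation suggests why so few fields go so far. The figures below are my own rough assumptions, not numbers from the book: about 300 million people, about 40,000 ZIP codes, and birth dates spread over an 80-year span. The sketch also pretends people are spread evenly across every combination, which is certainly not true (and is presumably part of why the real figure is 87 percent rather than 100).

```python
# Rough, assumed figures; none of these come from the cited book.
POPULATION = 300_000_000   # approximate U.S. population
ZIP_CODES = 40_000         # approximate number of ZIP codes
BIRTH_DATES = 365 * 80     # distinct birth dates over an assumed 80-year span
GENDERS = 2


def people_per_bucket(population, *attribute_sizes):
    """Average number of people sharing one combination of attribute values,
    assuming (unrealistically) a uniform spread across all combinations."""
    buckets = 1
    for size in attribute_sizes:
        buckets *= size
    return population / buckets


avg = people_per_bucket(POPULATION, ZIP_CODES, BIRTH_DATES, GENDERS)
print(f"{avg:.3f} people per (ZIP, birth date, gender) combination")
```

Under these assumptions there are far more possible combinations than people, so on average a combination holds well under one person. That makes it plausible that most people end up alone in theirs.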

Consider that the New York Times requires almost this much information in its registration:

I'm not saying that the NYT is necessarily trying to identify its subscribers; I'm just using this as an example of what's statistically possible with a few innocuous-looking questions.

There's quite a difference between "date of birth" and "year of birth". :)

That's why I said "almost", of course. With the amount of information out there, it's probably not hard to correlate this with some other data source that does have a full birthdate, e.g. Facebook or Livejournal or even Amazon.

How's it going there, anyway? :)
Pretty well, actually. :) I mean, it's work. There are issues. I probably would LJ-gripe about more of them if I didn't spend 2.5 hours on the train without connectivity every day. :p

But for the most part I'm enjoying it. Good working atmosphere. Neat problems to tackle (mostly dealing with scaling; as you might imagine, we get some traffic). :)

For what it's worth, I haven't seen any evidence that we attempt to uniquely identify users, apart from those whom we bill for certain non-free services.
I'm pretty sure there are quite a few 33-year-old males living in my neighborhood...
Hey! Wrong house photo!
I concur with the above comments.

Also, I usually use the "main" ZIP code for my city and have an "alternate" DOB for such queries.
Yeah, somewhere I registered recently learned that I was born on 1/1. Which I wasn't, but that's not important.
Greg Hewgill