FAILCamp Continued: Word List Fail

2008-12-07 19:00:00 -0500

Over the summer, Stephen and I attended the excellent RubyFringe conference in Toronto, hosted by Unspace. One of the best events was FAILCamp, hosted by Joey deVilla. In FAILCamp we all shared our stories of failure, big and small. It was an interesting way to get to know everyone and to share strategies for working through screw-ups and taking something decent away from them.

So, in keeping with that experience, we’re going to share a bit of FAIL; today’s story is brought to us by the PingMe Address generator.

When we need to provide a user with a remote e-mail address on one of our services, we usually want it to be somewhat obscure so that it would be difficult to guess. The user can change it, but we set an initial one for her as a convenience and include it in the welcome e-mail. The design goal was to ensure the address was somewhat random but still fairly easy to remember, so we’ve been using a big word dictionary to smash two words together with a number in between. We do this as a security measure to prevent attacks that would try to deliver spam using PingMe’s messaging transports.
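For the curious, here is a minimal sketch of the word-number-word idea. It is not our production code: the tiny `WORDS` list, the two-digit number range, the `example.com` domain, and the function name are all stand-ins I made up for illustration.

```python
import secrets

# Hypothetical stand-in for the real dictionary, which holds thousands of words.
WORDS = ["maple", "harbor", "lantern", "copper", "meadow", "signal"]


def generate_address(words=WORDS, domain="example.com"):
    """Return an address shaped like <word><two-digit number><word>@<domain>."""
    # secrets (rather than random) because the address doubles as a weak secret.
    first = secrets.choice(words)
    second = secrets.choice(words)
    number = secrets.randbelow(90) + 10  # assumed range: 10..99
    return f"{first}{number}{second}@{domain}"


print(generate_address())
```

With a dictionary of N words and 90 possible numbers there are roughly N × N × 90 combinations, which is why a big word list makes the result hard to guess while staying readable.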

Now, as you might expect, we did go through the dictionary to take out some words that would pose obvious problems. But we missed a really obvious one and weren’t as imaginative as our address generator can be, and as a result we got a pretty angry letter yesterday:

I won’t be using your sevice as it lacks professionalism. Scroll to the bottom of this e-mail and look at the generated pingme e-mail address. – Frank [not real name]

Ouch! Fail! Apparently Frank had been assigned an address that we really ought to have caught.

Taking a real close look at our word list I started seeing lots of possibly problematic combinations with regular words in the dictionary: “closet55bugger”, “douche44monger”, “goat11fakir”. Now some folks are probably fine setting their own address, or clicking the “suggest another” button, but some folks are bound to be a tad more sensitive.

So I started going through the word list looking for any classic cusses that we might have missed, but also things that might be sensitive to some folks and are just best avoided. While I was doing this, two important things occurred to me:

  1. I need an intern.
  2. I can’t guess at every potentially disastrous or offensive word combination in the dictionary. No way.

Everyone’s got different quirks and different cultural backgrounds, and I honestly make a poor censor. The chances of me personally (or even a team of three of us) knocking out every word that might lead to disastrous results in a list of thousands are very poor indeed. I’m willing to bet that this is an NP problem, but I’m not about to draw up a proof.
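The cleanup pass itself is the easy part. A rough sketch, assuming a hand-maintained blocklist (the function name and the sample lists are hypothetical, not our actual code or data):

```python
def clean_word_list(words, blocklist):
    """Drop every word that equals or contains a blocklisted term, ignoring case."""
    blocked = [term.lower() for term in blocklist]
    return [
        word
        for word in words
        if not any(term in word.lower() for term in blocked)
    ]


words = ["closet", "Bugger", "goat", "douche", "monger", "fakir"]
print(clean_word_list(words, ["bugger", "douche"]))
# ['closet', 'goat', 'monger', 'fakir']
```

Note what the filter does not do: every surviving word is innocent on its own, yet a pairing like “goat11fakir” can still come out the other end. Substring matching can also throw away harmless words that happen to contain a blocked term, so the blocklist needs a human looking over it either way.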

I don’t think the problem here is that we are using a large dictionary, nor do I think we really should have to cleanse the list. The trick is to get the obvious offending words out of the way, and also to give the user a bit more context. In Frank’s case, he doesn’t know that these are two random words; he didn’t have any exposure to the process by which the address was selected. Adding something like this to our welcome message will probably cut down on this lack of understanding in the future (I hope):

If you’d like to create Pings on the road, add your PingMe address to your mobile’s contacts and your e-mail address book. This address was generated randomly to protect you from SPAM. If you find it undesirable you can change it at any time on your profile page in PingMe.

You can’t make everyone happy (would that be NP-Complete?), but a little clarification can go a long way.

So that’s our story of fail. Perhaps we can make a meme of this. Have you any stories you want to write up and share? Feel free to post links in the comments.

Update: Question to our users: Would anyone like to see purely random addresses like we generate for Tempo? There we use a pretty long series of random numbers converted to Base26.
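If “converted to Base26” sounds mysterious, here is one way it could look. This is a sketch under my own assumptions, not Tempo’s actual code: I’m assuming the 26 digits are the lowercase letters a to z and that the random number is 64 bits, and both function names are made up.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase  # assumed digit set: a=0, b=1, ..., z=25


def to_base26(n):
    """Encode a non-negative integer using the letters a-z as base-26 digits."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 26)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first


def random_token(bits=64):
    """Return a random letters-only token built from `bits` random bits."""
    return to_base26(secrets.randbits(bits))


print(to_base26(701))  # baz
print(random_token())
```

The token length varies a little with the size of the number drawn. The upside is that you’re very unlikely to spell anything by accident; the downside is that nobody will ever remember one.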

Later Update: I almost forgot: the FAIL address assigned to the user was
