A small victory in the war on form spam

2006-02-19 19:00:00 -0500

Yesterday we discovered 100+ notifications for spam messages submitted through an online form by a malicious bot. We usually get a few of these per day at identicentric, because of this blog and various other unauthenticated forms, but the volume has never been enough to warrant decisive action. Friday night’s activity, however, “stepped over a line” and, much to our chagrin, spam continued to pour in over the course of Saturday morning at a rate of 15-20 per hour.

There are several established approaches to battling form spam. Some techniques require a user to enter random characters displayed in an embedded image on the page. Others rely on logging IP addresses on form load, so that the processing script can reject bulk form submissions. Some attempt to use mod_rewrite to block form spam based on missing or specific referrers or known blacklisted IP ranges, with mixed results.

We wanted a dead-simple, general-purpose solution that could be used to block spam on any form submission, without dependencies on the back-end processor. Conceptually, mod_rewrite seemed like a nice fit because it could be implemented on Unix or Windows (using ISAPIRewrite), and it was completely external to the form-backing application. Yet the referrer and IP filtering techniques were unsuitable, as they could result in long rewrite configurations, frequent ongoing maintenance, or incompatibility with many personal-firewall packages.

Our solution wound up being very simple: set a cookie using JavaScript, then detect it using mod_rewrite. It relies on the fact that spam bots are dumb: they are not cookie-aware, and they certainly aren’t JavaScript-aware.
Here’s how it works.

Start off by creating a small .js file and including it in the page with the form. Expose a single function called setFormAllowCookie(), or something similar. This function, when called, will set a browser cookie named “formallowed” to a value of “true”.

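// Sets the "formallowed" cookie that the rewrite rule checks for.
function setFormAllowCookie() {
  var cookieName = "formallowed";
  var cookieValue = "true";
  // path=/ makes the cookie visible to every URL on the site.
  document.cookie = cookieName + "=" + encodeURIComponent(cookieValue) + "; path=/";
  return true;
}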

Include the .js file in the page with the form; it can be embedded in practically any HTML page. Next, add an onload attribute to the body tag, or an onsubmit attribute to the form tag, that calls setFormAllowCookie().
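
If you would rather not touch the markup of each page, the same wiring can be done from the script itself. The following is only a sketch under a few assumptions: the helper names buildFormAllowCookie and protectForms are ours for illustration, and it assumes a browser that supports addEventListener.

```javascript
// Builds the cookie string; the name and value match the example above.
function buildFormAllowCookie() {
  return "formallowed=" + encodeURIComponent("true") + "; path=/";
}

// Hypothetical helper (not part of the original script): attaches a submit
// listener to every form in `doc` and returns how many forms were wired up.
function protectForms(doc) {
  var forms = doc.getElementsByTagName("form");
  for (var i = 0; i < forms.length; i++) {
    forms[i].addEventListener("submit", function () {
      // Runs just before the browser sends the form, so the cookie
      // travels with the submission request.
      doc.cookie = buildFormAllowCookie();
    });
  }
  return forms.length;
}

// Only run in a browser; wait for the DOM if it is still loading.
if (typeof document !== "undefined") {
  if (document.readyState === "loading") {
    document.addEventListener("DOMContentLoaded", function () {
      protectForms(document);
    });
  } else {
    protectForms(document);
  }
}
```

Setting the cookie on submit rather than on page load means a client that merely fetches the page never receives it, though either hook should defeat a bot that does not run JavaScript at all.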

The final step is to configure a rewrite rule that redirects form submissions to an error page if the cookie is not present in the request, like this (shown protecting WordPress comments):

<IfModule mod_rewrite.c>
RewriteEngine On
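# Only act on the WordPress comment-posting script.
RewriteCond %{REQUEST_URI} /wp-comments-post.php
# ...and only when the JavaScript-set cookie is absent.
RewriteCond %{HTTP_COOKIE} !formallowed=true
# Redirect the request to a static error page.
RewriteRule (.*) http://blog.xyz.com/error.html [R,L]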
</IfModule>
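
To make the rule's logic concrete, here is a rough JavaScript rendering of its two conditions. The function name wouldBeBlocked is ours and purely illustrative; Apache evaluates the real rule, and this sketch only mirrors how the two unanchored patterns are matched.

```javascript
// Illustrative only: mirrors the two RewriteCond checks from the rule above.
// Returns true when the request would be sent to the error page.
function wouldBeBlocked(requestUri, cookieHeader) {
  // First condition: the request targets the comment-posting script.
  // The unescaped dot matches any character, as in the rule's pattern.
  var targetsForm = /\/wp-comments-post.php/.test(requestUri);
  // Second condition: the Cookie header lacks "formallowed=true".
  // Like the RewriteCond pattern, this is an unanchored match.
  var hasCookie = /formallowed=true/.test(cookieHeader || "");
  return targetsForm && !hasCookie;
}

console.log(wouldBeBlocked("/wp-comments-post.php", ""));                 // true
console.log(wouldBeBlocked("/wp-comments-post.php", "formallowed=true")); // false
console.log(wouldBeBlocked("/index.php", ""));                            // false
```

Because the match is unanchored, the cookie is found even when the browser sends it alongside others, as in "a=b; formallowed=true; c=d".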

Pros:

  • This solution should continue to work until form-scrapers become cookie and JavaScript aware.
  • This approach does not introduce dependencies on the form processing application.

Cons:

  • Requires JavaScript and cookies, potentially interfering with a subset of form submitters.
  • Might not work against bots that are manually configured to attack a site, as a human could easily figure out the appropriate cookie to set.

It’s a judgement call as to whether the pros outweigh the cons with this approach, with the answer depending largely on the form’s target user base. In our minds the results speak for themselves: it took about 15 minutes to implement this approach and stop the initial barrage of spam and, according to our logs, it has operated with a 100% success rate against subsequent attempts.
