As your WordPress blog gets noticed and generates traffic, it becomes a natural target for spammers. If you notice posts on your site that you don’t expect, or see users in the Dashboard that you didn’t create, you have other security problems. More likely, your blog posts will simply accrue a variety of spam comments as a side effect of being popular.
You can recognize spam by a list of links within the comment, or by content-free comments saying that the poster enjoyed your writing, with an attached URL or source address that invites you to a less-than-reputable destination. In either case, the goal of comment spam is to generate more web content that points back to the spammer’s site, taking advantage of the page popularity ranking algorithms used by Google and others that give weight to incoming links. The best way to deal with spam is to simply get rid of it, denying spammers the opportunity to use your site to boost their own visibility.
There are three basic approaches to dealing with the problem: make it impossible for anyone to leave comments, increase the difficulty of a spammer sneaking a comment onto your site, and enable auto-detection of common spam patterns. Obviously, disabling comments (through the Dashboard) is a bit harsh, and defeats the goal of establishing a conversation with your readers. If you do decide to take this drastic step, remember that changing the settings for posts on the control panel only affects future posts; anything already on your blog will still have comments enabled unless you go through the Dashboard and turn them off individually. If you don’t mind an even greater bit of brute-force effort, you can remove the wp-comments-post.php file from the WordPress core, which somewhat unceremoniously puts an end to the ability to comment on your posts.
Comment Moderation & CAPTCHAs
One approach to comment spam is to slow down the spammers; however, the simple approach slows down valid commenters as well. You can require commenters to register as site users before being allowed to post comments, as we discuss later in this chapter, but that has the downside of preventing casual visitors from adding their thoughts. It also requires that you stay on top of user registration, as you may see seemingly valid users that are created purely for the purpose of posting spam to your blog.
Moderation is another tool in the slow-but-don’t-stop vein; you can hold all comments for moderation or require all commenters to have a previously approved comment. In effect, you’re putting the burden of spam detection on yourself, looking at each comment as it appears and deciding whether to post it to your blog or flush it. Again, an innocuous-looking comment may be the approval stepping stone for an avalanche of spam later on from the same user. As with many security mechanisms, the bad guys are continually getting smarter and more automated, and testing the edge protection and response of the systems they want to infiltrate.
A variation of the brute-force and moderation method is to blacklist IP addresses that seem to be the primary sources of spam; the access controls can be put in your .htaccess file. Again, this is perhaps a bit of hunting bugs with an elephant gun, as you’re likely to block valid IP sources from common carriers who are unfortunately home to some low-limit spammers.
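To make the IP-blocking approach concrete, here is a minimal Python sketch that turns a list of offending addresses into lines you could paste into an .htaccess file. It assumes Apache 2.4 syntax (Require not ip); older Apache 2.2 servers use Deny from directives instead. The function name and the sample addresses, which come from ranges reserved for documentation, are invented for the example.

```python
import ipaddress

def build_htaccess_block(addresses):
    """Return Apache 2.4 access-control lines that deny the given IPs or CIDR ranges."""
    lines = ["<RequireAll>", "    Require all granted"]
    for addr in addresses:
        # ip_network() validates each entry and raises ValueError on garbage;
        # a bare address becomes a single-host network such as 203.0.113.7/32
        network = ipaddress.ip_network(addr, strict=False)
        lines.append(f"    Require not ip {network}")
    lines.append("</RequireAll>")
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical spam sources, for illustration only
    print(build_htaccess_block(["203.0.113.7", "198.51.100.0/24"]))
```

Blocking a whole range such as a /24 is exactly where the elephant-gun problem shows up: every legitimate reader behind that carrier is locked out along with the spammer.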
Enter CAPTCHA methods. CAPTCHA stands for “Completely Automated Public Turing test for telling Computers and Humans Apart.” Its goal is to impede spammers’ ability to post unwelcome comments by requiring them to enter some additional, dynamic piece of information. There are quite a few CAPTCHA-generating plugins for WordPress, all of which add a displayed word or math problem to the end of the comment posting form, requiring the user to enter the correct information before the form is submitted. The simplest of these, the Math Test plugin, displays a two-term addition problem that must be solved by the user. The basic idea is that an automated spamming process won’t be able to recognize the distorted words or solve the problems, stopping the spam at the point of insertion. There’s some debate as to the effectiveness of CAPTCHAs, with their failure rates suggested to be as high as 20 percent. You’re also adding a step for commenters, albeit a trivial one. If your site attracts a large, non-English-speaking audience, CAPTCHAs depending upon wavy English words will be effective, but only in preventing valid comments from frustrated users.
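The two-term addition test described above is simple enough to sketch in a few lines of Python. This is not the Math Test plugin’s code, only an illustration of the idea; the function names are hypothetical, and a real form would keep the expected answer on the server (for example, in the visitor’s session) rather than sending it to the browser.

```python
import random

def make_challenge(rng=random):
    """Return (question_text, expected_answer) for a two-term addition problem."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a} + {b}?", a + b

def check_answer(expected, submitted):
    """Accept the comment only if the submitted text parses to the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Non-numeric input (or an empty field) fails the test
        return False

if __name__ == "__main__":
    question, answer = make_challenge()
    print(question)
    print(check_answer(answer, str(answer)))
```

A test this simple only stops spam scripts that never read the form; anything written to parse the question can solve it, which is part of why the effectiveness of CAPTCHAs is debated.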
Automatic Spam Detection & Elimination
The first step in automating spam detection is blacklisting certain types of posts or particular words. In the Dashboard’s Options -> Discussion -> Comment Moderation box, you’ll find an option to block any comment that contains more than a particular number of links. Don’t set this to zero, or anyone who includes their own blog URL in a comment is going to be filtered. Set sensibly, however, it cuts down on the obvious spam messages. Similarly, adding words like “Vicodin” to the blacklist will eliminate the pharmacy spam, but if you’re perturbed by offers of fake Rolexes, don’t add “watches” to the blacklist or you’ll drop any comment that uses “watches” as a verb as well as a fake product noun. Word blacklists are universally effective in blocking comments with those words, irrespective of context.
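The trade-offs of link limits and word blacklists are easier to see in code. The following Python sketch is an illustration only, not WordPress’s implementation: the function name, the default threshold, and the substring matching are assumptions made for the example, and WordPress’s own comparison against the link limit may differ.

```python
import re

# Rough link detector: counts occurrences of http:// or https://
LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)

def should_hold(comment, max_links=2, blacklist=("vicodin",)):
    """Return True if the comment exceeds the link limit or contains a blacklisted word."""
    if len(LINK_PATTERN.findall(comment)) > max_links:
        return True
    text = comment.lower()
    # Plain substring match: no awareness of context or part of speech
    return any(word in text for word in blacklist)

if __name__ == "__main__":
    print(should_hold("Cheap Vicodin here"))
    print(should_hold("Nice post, see http://myblog.example"))
    # The verb is caught just like the product noun
    print(should_hold("He watches the game", blacklist=("watches",)))
    # A limit of zero filters anyone who mentions a single URL
    print(should_hold("see http://myblog.example", max_links=0))
```

The last two calls show the failure modes described above: a context-blind blacklist flags an innocent sentence, and a zero link limit flags a commenter who only mentions their own blog.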
Fortunately, WordPress has the Akismet plugin built in for dealing with comment spam. Akismet relies on a crowdsourced blacklist and is transparent to users. Go to http://akismet.com/personal to register for an API key for the service; when you open up the Dashboard and configure the Akismet plugin, you’ll need this key to make sure your instance of WordPress can connect to the Akismet service. Effectively, Akismet takes each comment as posted, runs it through a database of spam comments hosted by Automattic, and decides whether or not to mark the comment as spam. Statistics on the akismet.com site claim that upwards of 80 percent of all comments are spam, and that the service has caught and marked more than 14 billion spam comments.
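The plugin handles the conversation with the Akismet service for you, but a rough picture of that exchange may help. The Python sketch below builds, without sending, the kind of request a client might make. The endpoint URL, the field names, and the plain-text true/false reply reflect our understanding of Akismet’s public comment-check API and should be checked against the current Akismet documentation; the function names and sample values are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; confirm against the Akismet API documentation
AKISMET_URL = "https://rest.akismet.com/1.1/comment-check"

def build_comment_check(api_key, blog_url, user_ip, user_agent, content):
    """Build (but do not send) a POST request asking Akismet to classify a comment."""
    fields = {
        "api_key": api_key,          # the key you registered for
        "blog": blog_url,            # front page of the site being protected
        "user_ip": user_ip,          # address the comment was submitted from
        "user_agent": user_agent,    # browser string of the commenter
        "comment_type": "comment",
        "comment_content": content,
    }
    body = urlencode(fields).encode("utf-8")
    return Request(AKISMET_URL, data=body, method="POST")

def is_spam(response_body):
    """Interpret the reply, assumed to be the literal text 'true' for spam."""
    return response_body.strip() == "true"

if __name__ == "__main__":
    req = build_comment_check("YOUR_KEY", "https://blog.example",
                              "203.0.113.7", "ExampleBrowser/1.0", "Buy now")
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually contact the service; that step is left out here so the sketch runs without a network connection or a valid key.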