Getting rid of Spam on Websites

Protecting websites from receiving all kinds of spam.

Contact forms, feedback forms, guest books, blog comments: in short, any web form that lets visitors enter information will be abused by spammers. E-mail addresses published on websites are scanned and added to spam databases. Even a website's log files are abused as a place to dump links.

Website owners can protect most of these channels against spammers, at least to some extent.

E-mail harvesting

E-mail harvesting is the practice of using bots to scan websites for e-mail addresses. If someone leaves their address on an internet forum, a social network, or in a blog comment, it will be found by spammers and added to their databases.

E-mail harvesting bots are often not sophisticated enough to execute JavaScript. E-mail addresses can therefore be obfuscated by using JavaScript code to display them, instead of having them in plain sight in the HTML code.
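As an illustration of this kind of obfuscation, here is a minimal sketch in Python that generates such a JavaScript snippet on the server side. The function name `obfuscated_mailto` is made up for this example, and the sketch assumes a plain ASCII address. The address is stored as character codes, so it does not appear as readable text in the page source; a bot that does run JavaScript will still see it.

```python
def obfuscated_mailto(address: str) -> str:
    """Return an HTML <script> snippet that builds a mailto link at runtime.

    Hypothetical helper: the address is written as a list of character
    codes, so the plain address never appears in the HTML source.
    Assumes an ASCII-only address.
    """
    codes = ",".join(str(ord(ch)) for ch in address)
    return (
        "<script>\n"
        "(function () {\n"
        # The browser turns the character codes back into the address.
        f"  var a = String.fromCharCode({codes});\n"
        "  var link = document.createElement('a');\n"
        "  link.href = 'mailto:' + a;\n"
        "  link.textContent = a;\n"
        # Place the link right before the script element itself.
        "  var s = document.currentScript;\n"
        "  s.parentNode.insertBefore(link, s);\n"
        "})();\n"
        "</script>"
    )
```

The returned string can be placed in the page wherever the address should appear, for example `obfuscated_mailto("info@example.com")`. Visitors without JavaScript will not see the address at all, which is one more reason to offer a contact form as well.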

Read more information about protecting e-mail addresses from harvesting.

It is better still not to publish e-mail addresses on a website at all, and to use a contact form instead. Contact forms are spammed too, but less than published e-mail addresses, and they can be protected more effectively.

Contact form spam

Contact forms receive a lot less spam than published e-mail addresses, but they are still targeted. Web developers can protect them with several measures.

The most rigorous one is simply not allowing website addresses (URLs) to be posted, as posting URLs is the whole point for spammers. One can also add fields that are hidden with CSS: automated spam bots will try to fill them in, so any submission in which such a field is not empty can be rejected.

The omnipresent CAPTCHAs, consisting of weird-looking letters and numbers that you have to retype, are no longer safe against modern spam bots. Bots can decipher them using the same OCR techniques that scanner software uses to convert scanned pages to actual text.
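Both checks can be done on the server when the form is submitted. The following is a minimal sketch, not a complete filter: the hidden field name `website` and the function name `looks_like_spam` are assumptions made for this example, and the URL pattern is deliberately rough.

```python
import re

# Hypothetical name of the field hidden with CSS; humans never see it,
# so in a genuine submission it stays empty.
HONEYPOT_FIELD = "website"

# Rough pattern for text that looks like a web address.
URL_PATTERN = re.compile(r"(https?://|www\.)", re.IGNORECASE)


def looks_like_spam(form: dict[str, str]) -> bool:
    """Return True if a submitted form should be rejected as spam."""
    # A filled-in hidden field means a bot completed the form.
    if form.get(HONEYPOT_FIELD, "").strip():
        return True
    # Reject the submission if any visible field contains a URL.
    return any(
        URL_PATTERN.search(value)
        for name, value in form.items()
        if name != HONEYPOT_FIELD
    )
```

For example, `looks_like_spam({"message": "Visit http://spam.example"})` returns `True`, while a message without links and with an empty hidden field passes. Rejecting every URL will also block the occasional legitimate visitor who wants to send a link, so it is worth mentioning this restriction next to the form.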

Read more information about protecting contact forms from spamming.

A note next to a contact form that commercial messages are not appreciated should keep away marketers who do have some integrity.

Log spam

Spammers even manipulate the log files of websites. They use the "Referrer" that a browser sends for that (in HTTP itself the header is spelled "Referer"). The "Referrer" is the webpage that linked to the webpage you're currently visiting. Spammers make web robots pose as regular visitors who supposedly followed a link from a page on the spammer's website, although that page contains no such link. The website's logs accumulate these fake referrers.

People checking the logs for websites that link to their website will thus see the spammers' addresses, which act as a kind of ad.

In addition, log files are sometimes publicly available on the web, where the referrers show up as links to websites. Spammers apparently believe that search engines count such links when determining the popularity of a website, and so rank it higher. In reality, the engineers at search engines are well aware of this, and filter such links out.

If you find referrer spam to be particularly annoying, you can protect your site from it using Referrer Karma (works only on PHP sites). This software checks whether a link to your site is actually on the referring page; with referrer spam, it isn't. You may want to find a workaround for referrers from online e-mail applications, though, as it is not possible to check for links in other people's e-mail.
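The idea behind such a check can be sketched in a few lines. This is only an illustration of the principle, not Referrer Karma's own code: the names `referrer_links_to` and `_LinkCollector` are made up for this example, and the sketch assumes the HTML of the referring page has already been downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def referrer_links_to(referrer_html: str, my_host: str) -> bool:
    """Return True if the referring page contains a link to my_host."""
    collector = _LinkCollector()
    collector.feed(referrer_html)
    # urlparse lowercases the hostname; relative links have none.
    return any(
        urlparse(href).hostname == my_host.lower()
        for href in collector.hrefs
    )
```

A referrer whose page contains no link to the site, which is what referrer spam looks like, makes the function return `False`. Pages that cannot be fetched, such as a webmail inbox, need a separate decision.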