Scott Hanselman

Exposed: A Blog Comment Spammer's Source Template

April 22, '13 Comments [53] Posted in Musings

I've been getting a LOT of Blog Comment Spam lately, just in the last two weeks. I run all my comments through the Akismet Service, and I pay for it. However, this particular flavor of spam has been making it through consistently. It has a pattern, though, and I'd been trying to figure it out when this LARGE comment showed up.

Apparently while they were messing about trying to spam me, they posted their entire source template.

I'm embedding it below as a Gist, rather than copy/pasting it into my blog engine. It's so spammy, I'd hate to get delisted from Google looking rather like a splog.

Note the comments for the Gist as well.

One fellow says

"I used to do comment spam and this is not the most advanced one."

Really? Does one put Comment Spammer on their resume?

Another comment says that we're hating on spammers. We should embrace them because:

"Sure for the 1% of super popular blogs out there this might be unnecessary, but in a world filled with bloggers blogging blogs most people never read, the fake recognition and pleasantry might be just what these writers need."

I'm pretty sure that fake comment spam isn't as emotionally uplifting as you think.

Start scrolling down! If you are viewing this in an RSS reader, you MAY need to visit this post directly to see it.

Your comments, Dear Reader? Cue spam comment-related jokes...now.



About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

Monday, April 22, 2013 11:21:30 PM UTC
Whatever happened to just "Spam Spam Spam Humbug"?
Monday, April 22, 2013 11:27:43 PM UTC
Oh well, from the looks of the template, they're still quite easily recognizable.

The other day I had a spammer who had actually read the article and did comment on the actual content (although it was a quite superficial comment). Anyhow, looking at the link and the user it was obviously spam.

Lately you have to be quite vigilant to detect spam if they are really going to start commenting on content.
Monday, April 22, 2013 11:35:39 PM UTC
Kenneth - Yes, there are actual humans now commenting on videos with positive comments, etc. The URLs are spammy though. Sigh. It's so tedious!
Monday, April 22, 2013 11:37:20 PM UTC
I'm really surprised you get as much comment spam, considering your blog has no follow tags in it. So even if it gets thru Akismet, and doesn't get caught by you it still does nothing for their rankings. I would think they'd be smart enough to spend their time on blogs that may at least give them some link juice.

Oh well, nobody says spammers are smart.
Monday, April 22, 2013 11:57:18 PM UTC
I think bloggers need to define some sort of tag or meta that is not visible on page but sent to you on your first comment to answer a question or something to check for your humanability and interest in blog. On reply it can be marked as authorised, subject to authorise or require another test etc. It should then do this after every 5 or 6 comments if they appear in pattern.
There are many ways to reduce the dimensions of this problem, but the fact is that it should define a connection between author, article, comment and reader. If the connection is not strong, then flag it.
Monday, April 22, 2013 11:58:41 PM UTC
Farrukh - There's lots of techniques like Captcha, OpenID, etc, but the spammers ALWAYS find a way around it. Always. I've been blogging for over 10 years and it ebbs and flows like the ocean. We're at high tide right now.
Tuesday, April 23, 2013 12:24:39 AM UTC
Ha! I was waiting for this post Scott! ;-)
As i was wondering there was no security/moderation to Comments section of your blog. As you said...
There's lots of techniques like Captcha, OpenID, etc, but the spammers ALWAYS find a way around it. Always. I've been blogging for over 10 years and it ebbs and flows like the ocean. We're at high tide right now.
Yeah i agree, but what about using some real good commenting systems like Disqus, Intense Debate or the new way of embedding Facebook Commenting System?
Hope this will make the visitors transparent and also will reduce our stress to a open platform and their authentication ;-)
Tuesday, April 23, 2013 12:26:35 AM UTC
I use Akismet, as I said. It gets 99% of spam and has for years. Until this week. Moving to Disqus or something else is possible, but then my comments are located somewhere else, which means I'm not in full control.
Tuesday, April 23, 2013 12:28:22 AM UTC
Even if the odd spam does occasionally make it through, Akismet is really helpful for filtering, and they react quickly as the spammers' templates evolve, thanks to the number of people who use it and feed their data sets.

>re:Captcha, OpenId, etc..

Blogger recently added support for Google+ comments in their posts - maybe requiring an account with a verified "real name" could be sufficient to deter bad behavior.

I always think of when the mechanical turk came out in the discussion of spam / spammers. No matter what you do, a spammer can pay people to spam if it's sufficiently inexpensive - or, in cases like this, author templates that get plugged into a bot.
Tuesday, April 23, 2013 12:32:56 AM UTC
Well, clearly the answer to Mechanical Turk comment spam is Mechanical Turk comment curation.
hurfdurf
Tuesday, April 23, 2013 3:20:46 AM UTC
Why do we still have a "home page" field in our comments form? Nowadays, it is like pingbacks, only ever used for spam and interesting to no-one in its legit form. I think I might propose we remove it from the standard comment form in Orchard. Whadyathink?
Tuesday, April 23, 2013 4:00:49 AM UTC
Does one put Comment Spammer on their resume?


Apparently, yes. I did have a resume sent to me from a guy whose crowning achievement was writing and selling software to spam forums. No attempt to hide it either, he seemed very proud of the ingenuity involved in hijacking other people's communities for profit.

Needless (I hope) to say... no hire.
Andy
Tuesday, April 23, 2013 4:15:50 AM UTC
Way cool! Some {very|extremely} valid points! I
appreciate you {writing this|penning this} {article|post|write-up}. I {saw|have seen} {many|several|lots|a lot} of these on my own blog.
Tuesday, April 23, 2013 4:41:22 AM UTC
Instead of Comment Spammer on the resume, perhaps they could use Entry Level Natural Language Processing Engineer. ;-)

What I really need is a way to interest you in doing a blog post about my personal open source project at {project link self deleted} that makes it {benefits self deleted} for .NET developers.

Hmmm... Is that too spammy?
Tuesday, April 23, 2013 6:26:42 AM UTC
Although I have only a small blog (especially compared to yours) and it is mostly in Hungarian, in the last few weeks I have received similar comments that don't get auto-filtered by wordpress.com. The text seems human but the URL is suspicious. Maybe removing the Home page field from the comment form or the comment listing could mitigate this. Why do we need that field at all?

Thanks for blogging.

György
Tuesday, April 23, 2013 6:57:38 AM UTC
The best way I have found to get rid of automated spam like above is to change the id of one of your fields every so often and ignore the ones that post to the old id.

But still send them a 200 so that the software thinks they have succeeded. I need to change it about once a year. You can even see when they are in the process of changing their scripts.
chrissie1
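
A minimal sketch of the field-rotation trick described above, in Python. The field names and the handle_comment_post helper are hypothetical, and the form is assumed to arrive as a plain dict; a real blog engine would wire this into its own request handling.

```python
# Hypothetical field names: rotate CURRENT_FIELD now and then, retire the old one.
CURRENT_FIELD = "comment_body_v7"
RETIRED_FIELDS = {"comment", "comment_body_v6"}


def handle_comment_post(form):
    """Return (http_status, comment_text_or_None) for a submitted form dict."""
    if any(form.get(name) for name in RETIRED_FIELDS):
        # A stale script is posting to an old id: report success, store nothing.
        return 200, None
    text = form.get(CURRENT_FIELD, "").strip()
    if not text:
        # Nothing usable in the current field.
        return 400, None
    return 200, text


print(handle_comment_post({"comment": "Way cool! Some very valid points!"}))
print(handle_comment_post({"comment_body_v7": "Thanks for blogging."}))
```

The important design choice is the 200 on the retired field: the bot sees success and has no signal that its script needs updating.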
Tuesday, April 23, 2013 7:21:18 AM UTC
The template could be easily turned into a massive regex? Problem solved?

// Ryan
Ryan Heath
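
As a rough illustration of the regex idea, here is a Python sketch that compiles one flat {a|b|c} template line into a pattern matching any of its variations. The spintax_to_regex helper is hypothetical, it ignores nested groups, and a spammer only has to reword the template to slip past it.

```python
import re


def spintax_to_regex(template):
    """Compile a flat (non-nested) spintax template into a regex."""
    # Splitting on a capturing group alternates literal text and group bodies.
    pieces = re.split(r"\{([^{}]*)\}", template)
    pattern = ""
    for index, piece in enumerate(pieces):
        if index % 2:  # odd indices are the bodies of {a|b|c} groups
            options = "|".join(re.escape(option) for option in piece.split("|"))
            pattern += "(?:" + options + ")"
        else:
            pattern += re.escape(piece)
    return re.compile(pattern, re.IGNORECASE)


spam = spintax_to_regex(
    "I appreciate you {writing this|penning this} {article|post|write-up}."
)
print(bool(spam.fullmatch("I appreciate you penning this write-up.")))
print(bool(spam.fullmatch("I appreciate you taking the time to reply.")))
```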
Tuesday, April 23, 2013 7:28:36 AM UTC
They are just full of compliments, who would have thought spam bots would become so charming.
Tjaart Blignaut
Tuesday, April 23, 2013 8:00:59 AM UTC
I can recognize many spam comments from that template. The only thing is, for me, it's pretty easy to distinguish spam, even sophisticated, from real comments, as I blog in French.... :-)
Tuesday, April 23, 2013 9:13:53 AM UTC
FWIW, i have a weblog that has only one article (one day it will have more) ... 100% moderation is enabled ... the only comments that i've gotten to date are spam.

i really wonder why the spammers bother ... there must be better ways to get SEO such as writing their own articles (AFAIK some to many do just that).

my point: the prolific Scottha's of the world with significant and frequent content have garnered 99% control via their use of prophylactic plugins like Akismet ... for the less significant authors, simply by turning on full moderation, we can likely create an environment where spamming articles becomes not worthwhile because there will be virtually nowhere left for such spam in the blogosphere. or, maybe not?
Tuesday, April 23, 2013 11:07:33 AM UTC
Did... did that just happen? Was the last comment I read on this post the very same spam comment that the post was talking about? HA!

I like @chrissie1's suggestion, changing the ID every now and again sounds simple enough. It won't help against anything that's using a scraper (or something more advanced like phantomjs to load the page) but it's a start if you are running an off the shelf system.
Dan F
Tuesday, April 23, 2013 12:16:38 PM UTC
Richard
Tuesday, April 23, 2013 12:47:48 PM UTC
Hi Scott,

I also had a bad time dealing with spam. I was forced to disable comments on my blog in order not to have to deal with it.

I have just re-enabled comments (Disqus), and so far so good: I am not seeing spam the way I was months ago. Maybe spammers have not yet discovered that comments are enabled again.

There is no good type of spam. Spam is just that Spam/Trash
Tuesday, April 23, 2013 1:07:25 PM UTC
Thank you for the {auspicious|good} writeup. It in fact was a amusement account it.
Sid DeLuca
Tuesday, April 23, 2013 1:52:43 PM UTC
I feel for you Scott,

My previous web host seemed to think that setting the SPF record information on a domain constituted spam protection and any spam after that was clearly email I had asked for.

After various lengthy debates of this nature I conceded that these people are beyond help and decided to get my domain off their infrastructure.

Recently a friend of mine has been having this same problem and upon mentioning this issue to said host their suggestion was "ok we will setup the spf record for domain" ... i'd love to know what spf records have to do with blog posts?

Spam has been a major problem since practically the birth of the internet and I don't think it's likely to be possible to remove spam without making the "originating account" at the ISP of the sender accountable for every bit of traffic they send to everyone else online.

The issue there of course is that this requires some level of "rating" an account at an ISP. I'm surprised there isn't a service that does this type of thing out there somewhere, but the problem is that anyone can basically "anonymously" set up a website / server on a cloud somewhere and spam from that.

Internet standards need to fundamentally change before the issue will go away.
In the meantime it's up to the programmers to try and filter the good from the bad which screams "fix the symptom not the problem" to me.

Oh well :(
Tuesday, April 23, 2013 2:18:37 PM UTC
Bloody vikings.
Tuesday, April 23, 2013 3:11:33 PM UTC
Waitress: "Baked Beans are off!"

Customer: "Well, can I have spam instead of baked beans then?"

But seriously folks. I just love the justifications of the spammers: "the fake recognition and pleasantry might be just what these writers need."

Wow. That's like stabbing someone in the back and then saying you were helping them get over their iron deficiency.
Tuesday, April 23, 2013 5:19:41 PM UTC
I'm surprised no one has yet mentioned: http://xkcd.com/810/
latin_programmer
Tuesday, April 23, 2013 7:37:51 PM UTC
Go team Scott. I would complain about spam, but since I rotated my captcha keys, nothing seems to get through that level of verification.

Keep up the good work (when is the next Developers Life? ;) )
Tuesday, April 23, 2013 9:18:21 PM UTC
I just...this baffles me. I don't blog a ton, but I don't understand why you would want to create a bot that spammed blog comments and looked legit. I mean, unless you were phishing links or something.
Joe Morgan
Wednesday, April 24, 2013 3:49:49 AM UTC
Really? Does one put Comment Spammer on their resume?


Probably yes. People sitting in places like Nigeria may have 'Spammer' experience on their resume, because this is what they might be paid for. Spammers don't do this for fun. Somebody is paying them.
Wednesday, April 24, 2013 8:12:03 AM UTC
I think the same is happening with weblogs.asp.net. It gets thousands of spam comments every day. Are you guys planning to do anything about it?

I stopped blogging there because of the spamming.

Regards,
Jalpesh
Wednesday, April 24, 2013 11:03:34 AM UTC
I always figured as a semi retirement income (.03$ a hit) that 'spam' was a viable income source. Don't know for sure. When I was listing on code forums(vb,vs.net mostly) I had the most spam- most of the blog spam is easily avoided by not blogging.
So have you tried turkey bacon?
Tastes like real bacon for the first day or so- then it tastes like turkey...
kWAZAI
Wednesday, April 24, 2013 2:36:21 PM UTC
I've seen some spam lately that takes this a step further: it actually sounds like it's a real comment on the post. That is, it mentions something said in the post and responds to it in a reasonably grammatical sentence. Akismet has caught some, but not all, of these, and when I've checked both the comments and the filter, I've been unsure. Except that most of these are coming from something that purports to be a moving company, sometimes Houston movers, sometimes Dallas. I'd really like to know how they're doing this.

This is not the same as an earlier round in which the spammers just took other comments on the post and reposted them with their spammy links.
Wednesday, April 24, 2013 7:28:39 PM UTC
I'm with Guruprasad on this one. The closer you get to requiring a real name (or a proxy for it like Disqus or Facebook - something with a reputation that you care about) the better the quality of discussion and the lower the spam (not always but usually).

The internet has changed from the wild west days of usenet flamewars. I get that there are times where anonymity is required (e.g., whistleblowing) but think that anonymity should be the exception - real attribution should be the norm.

ps-somewhat surreal to read through that template. It's like looking into the dark underbelly of social media.
Thursday, April 25, 2013 3:10:45 PM UTC
I've been seeing this style of spam for a few years now. Always has a spammy link in the home page field. That was back when I had a good page rank and stuff. I need to start blogging again but life has taken over.
Thursday, April 25, 2013 6:50:48 PM UTC
what if you regex out the html link tags and have just the unclickable link text? If they can't measure the effectiveness of the targeted site and making it just a little more inconvenient to follow a link they might stop?
Sean Lin
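
One way to read that suggestion, sketched in Python: drop the anchor tags and keep only their inner text. The strip_links helper is hypothetical, and a regex is a naive tool for HTML; a proper HTML sanitizer would be more robust against malformed markup.

```python
import re

# Matches <a ...>text</a> and captures the inner text.
ANCHOR = re.compile(r"<a\b[^>]*>(.*?)</a\s*>", re.IGNORECASE | re.DOTALL)


def strip_links(comment_html):
    """Replace every anchor element with its (unclickable) inner text."""
    return ANCHOR.sub(r"\1", comment_html)


print(strip_links('Nice post! Visit <a href="http://example.com/">my site</a> now.'))
```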
Thursday, April 25, 2013 8:31:43 PM UTC
When I'm bored I enjoy removing the links from the more readable comment spam messages and replacing them with {link removed}, posting the comment anyway and responding in a sarcastic way.
Wednesday, May 01, 2013 9:41:50 PM UTC
@Jeremy Morgan:

I'm really surprised you get as much comment spam, considering your blog has no follow tags in it.... it still does nothing for their rankings.

While Google officially doesn't consider "nofollow" links, there's lots of discussion on SEO forums that Google may use them to determine which sites are spamvertised. If a site has a high percentage of inbound links marked as "nofollow" it could be argued these are likely to be a result of comment spam or similar, and thus the target site could actually receive lower ranking as a result.

If this theory is correct, then perversely you're helping to defeat spammers by publishing their spammy comments!

Which also makes me wonder if I shouldn't have put my home page link on this post?
Monday, May 06, 2013 12:55:00 AM UTC
Thus the power of Scrapebox and spinning technology.

Personally, the blackhat community is actually one of the most interesting ones in regards to development of technologies and strategies to counter the ever evolving landscape of anti-spam mechanisms. Seeing what these people do is quite impressive if you remove the moral aspects of what they are actually doing and focus merely on the business case scenario.

The example template you have is rather low-level spinning: it swaps out synonyms and uses no nested spinning techniques for sentences and paragraphs. As a template itself, there isn't much work put into it that TBS (The Best Spinner) or SpinnerChiefII couldn't do instantaneously. The interesting part is what they are achieving in scraping and n-gram analysis.

Software like WordAI (which is web based, google the youtube vid of it) can automatically spin comments and with API use of other software which gathers comments from scraped sources (other blogs or articles) that are mathematically similar to yours in regards to language, technology, verbiage and even tonality you can create unique looking comments purely on autopilot with little costs to the person running the software.

What fascinates me is the mechanisms that evolve naturally within this blackhat economy to counter Akismet, Mollom, Honeypot, CAPTCHA and other mechanisms involved. CAPTCHA services like DeathByCaptcha streamline the CAPTCHA problem by sending it to a microworker overseas who manually inputs the CAPTCHA for fractions of a cent (which adds up over long campaigns), combining the efforts of manual processing with automated software.

It provides interesting challenges for people fighting spam: learning how to get rid of footprints (having the words "Leave a comment" on a blog post is open season for scrapers and list builders), creating sophisticated CAPTCHAs which do not detract from the User Experience (knowing that most CAPTCHA-solving software converts the CAPTCHA to greyscale to help with OCR and manual input, you can simply tell the user not to input the letter in red, for example), and embedding footprint-type text and form labels into Data URIs so that crawlers which read like robots cannot decipher them while they are still presentable to the user (if your email appears at all on a website, you should put it into a Data URI format so you don't receive email spam).

Once you understand and respect the methodology of the webspam industry, you can devise methods to combat it and reduce spam by several orders of magnitude by targeting specific areas (removing footprints for scrapers, adding an invisible form for the honeypot method to avoid comment posting software) until you are only left with manual VAs who research and craft their comments, which is a fraction of a fraction of the web spam available.
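
The honeypot method mentioned above can be sketched roughly as follows, in Python. The field name and the is_probably_bot helper are hypothetical: the assumption is that the comment form carries one extra input hidden from humans with CSS, so only automated posting software fills it in.

```python
# Hypothetical name for an input that humans never see (hidden with CSS).
HONEYPOT_FIELD = "website_url_confirm"


def is_probably_bot(form):
    """Return True if the hidden honeypot field came back non-empty."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


print(is_probably_bot({"comment": "Great post!", "website_url_confirm": ""}))
print(is_probably_bot({"comment": "Way cool!", "website_url_confirm": "http://example.com/"}))
```

This only filters software that fills in every field it finds; it does nothing against a human typing the comment by hand.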

Tuesday, May 28, 2013 5:09:57 PM UTC
too, I stopped blogging there because of the spamming.

Thanks
Wednesday, June 05, 2013 11:45:00 AM UTC
Blogging is a very useful for marketing and for personal use ,....Just monitor blog regularly to prevent spam.
Monday, June 10, 2013 10:13:38 PM UTC
I have almost 800 spam comments each day on my site. Fortunately we have akismet.:0)
Sunday, June 16, 2013 8:13:06 PM UTC
Ha! I just got the following on my blog:

Hi there, i read your blog occasionally and i own a similar one and i was just wondering if you get a lot of spam comments? If so how do you prevent it, any plugin or anything you can advise? I get so much lately it's driving me insane so any help is very much appreciated.


I thought maybe it was legit (and intelligent considering the correct spelling of "advise"), but I followed the link back to a blog with little content and nothing but spam comments with links. Akismet didn't catch this one.

So I did a Google search (which I do whenever I'm suspicious) of the first two lines and your blog came up first.

I especially love when spammers say how much they love reading my articles when my site is audio-driven.

Idiots!
Tuesday, July 09, 2013 8:24:15 AM UTC
Well researched I love it and the writing was done perfectly.
Friday, July 12, 2013 7:32:01 AM UTC
This blog was written perfectly thank you for your opinions.
Friday, July 12, 2013 8:33:46 AM UTC
The "source" you posted is really just Spintax; it's used to spin comments and articles.
Used in pretty much any SEO Tool.

{hey|hi|hello} there {human|alien}

Would produce various comments like:


hey there human
hey there alien
hi there human
hi there alien
hello there human
hello there alien

etc

Easier for spammers to get "unique" comments rather than create 500+ variations.
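
For the curious, the expansion above can be reproduced with a few lines of Python. This is only a sketch under the assumption of flat, non-nested {a|b|c} groups; the expand_spintax helper is hypothetical and is not taken from any actual SEO tool.

```python
import itertools
import re

# Matches one flat {a|b|c} group; nested groups are not handled in this sketch.
GROUP = re.compile(r"\{([^{}]*)\}")


def expand_spintax(template):
    """Return every variation of a flat (non-nested) spintax template."""
    # Splitting on a capturing group alternates literal text and group bodies.
    parts = GROUP.split(template)
    choices = [
        part.split("|") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


for line in expand_spintax("{hey|hi|hello} there {human|alien}"):
    print(line)
```

A real spinner would pick one variation at random per comment rather than enumerating them all, but enumerating makes the combinatorics easy to see: 3 greetings times 2 nouns gives 6 comments.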
Sunday, August 04, 2013 4:27:19 AM UTC
.Kate Houston...

oh my god, when I was reading some testimonies on the web, I thought they were just some sort of lies been fabricated by some stupid people, not until I decided to give one of the great prophet called prophet Ozanga the chance to help me after lot of testimony to his credit and to my heart shocking breath, he surprised me by actually bringing back my lover within just seven days as he promised.
now I believed, some of these testimonies are real when you actually meet the right priest to help. for help you can contact him by yourself with this mail :prophetozanga@yahoo.com
Kate Houston
Saturday, September 21, 2013 7:48:46 PM UTC
It would be nice to get some comments on websites that actually make sense. But it seems most of the spammers about these days just put a list into scrapebox and away they go.

Some of the comments I read are ridiculous, surely the idea is to get the link from the comments to stick?

I mean, I know it's mostly software like scrapebox spinning nonsense but it is so frustrating waking up, logging into your blog and seeing 20k comments awaiting moderation.
Sunday, September 29, 2013 1:18:34 PM UTC
With my new website about lanyards , I got 100 + spam in 24 hours ! How do they know I am live ???

Its incredible not one human comment ..

I am amazed web owners haven't figured out just to pay some one in India $ 5 an hour to write comments. they probably could post 20 per hour to generate 500 -800 a week of good back links . one year 40,000 links !

My site is B to B for real physical merchandise so I need to find blogs on marketing, trade shows and work places so that my links are relevant. Hope google cares !
Sunday, September 29, 2013 1:20:14 PM UTC
To the blog owners out there , I use wp plugin but how can you go thru 100's of comments daily Akismet stopped all of them .
Wednesday, October 16, 2013 5:34:14 PM UTC
Thanks for posting this! I'm relatively new to blogging, and got a comment that followed that exact formula! I knew it was spam, but I figured I'd look it up anyway.

Thanks!
Saturday, October 19, 2013 8:28:23 AM UTC
A black or white petticoat is a nice touch for this costume. They are mentally ahead of the pack. People who give memorial wind chimes as presents are at the same time expressing their desire to be of help to those who are suffering and their gifts are for the departed as well. However, there is confusion regarding its origins due to conflicting reports reporting on its origins. The fact that companies such as Louis Vuitton and Target are turning to India for answers is not a good thing for IBM. 'Very few and famous companies provide the sewage system which acts in contrast to gravity to move out the dirt very easily.The agency can literally groom a model, but getting under the skin of the event and making it a success needs a lot of hard work as well. The entire shampoo market is entirely covered in all directions by their hair shampoo brand. Even so, the software program being utilised to retrieve deleted text messages from a SIM Card. They each have their own American collections as well.Beyond the few items of hers found throughout the wooded region surrounding her home, nothing else of hers has been found.
Comments are closed.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.