Those who are possessed of a definite doctrine and of deeply rooted convictions upon it will be in a much better position to deal with the shifts and surprises of daily affairs than those who are merely taking short views, and indulging their natural impulses as they are evoked by what they read from day to day. —Winston Churchill

The Strange World of Blogspot Spam Blogs

Posted by Justin under blogging

I’ve been cruising the Blogspot world lately looking for cool stuff that the bigger geeklogs might have missed (and I found some cool knitting sites [1, 2] as a result last time I did this). What I’ve found, though, is that a large percentage (maybe up to a third) of all Blogspot blogs are spam-logs - sites created to increase the Google ranking of some other site (which is itself usually a Google-spamming site). The ultimate purpose of these spamlogs is usually to drive traffic to a commission-paying pharmacy, pr0n, or casino site.

Some of the spamlogs hosted at Blogspot, which apparently does not have a policy against them, are obvious in their intent (for example). It requires a human to start a new Blogspot-hosted site, but after the initial setup (which can be partially aided by scripts, I’m sure), bots can post like crazy. Usually the posts are strings of highly searched terms (like the names of celebrities, TV shows, or something Google Adsense pays a lot for, like asbestos litigation), with a link to the external site that the spammer is trying to bump up in the Google rankings.

None of this is new; Matt Mullenweg posted on it in March, as did SpamBlogging. But a few things have changed since their posts on the topic.

First, Blogspot now requires CAPTCHA authentication to start a new account, which many people said would fix the problem. It hasn’t. Entering a CAPTCHA sequence takes a human about five seconds, and you only have to do it once.

Second, spammers are becoming less obvious by creating posts that link to actual news articles (complete with excerpts); by all appearances, these blogs are just like scores of real blogs. But if you look at the code of the page, there are tons of external spam links, cleverly hidden by CSS. Here is an example: Mario’s News Archive Posts (to which I’m avoiding giving Google-props by using the rel="nofollow" attribute). With this additional layer of subterfuge, it’s remotely possible that someone will even link to “Mario’s” blog from their highly-ranked site. Check it out - it’s quite slick. It even auto-reloads every few seconds (though I’m not sure why).

A peek under the hood of Mario’s spamlog reveals something like this at the end of every post:

<style>.lin {visibility:hidden;}</style><div class="lin" style="position:absolute;top:-50;left:-3000;"><font size=1>Links of Interest:<BR><a href="http://www.treadmills100.info/York-Treadmill/Where-To-Get-A-Cheap-Treadmill.cfm">Where To Get A Cheap Treadmill</a>

I realized this when I tried to leave a comment on a news item that I found interesting, and clicked the Blogger “show original post” link, which uses some JavaScript to show the text of the post. However, this breaks the embedded CSS, letting the secret out.
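The hiding trick above is simple enough that a script could flag it. Here is a rough sketch in Python, using only the standard library, that lists links sitting inside an element hidden by CSS or pushed off-screen. The names (HiddenLinkFinder, find_hidden_links) and the set of "suspicious" style patterns are my own assumptions for illustration, not anything Blogger or Google is known to use, and the single pass only catches class rules that appear in a style block before the element they hide.

```python
import re
from html.parser import HTMLParser

# Inline styles that suggest an element is hidden or pushed off-screen
# (an assumed, non-exhaustive list).
HIDDEN_STYLE = re.compile(
    r"visibility\s*:\s*hidden|display\s*:\s*none|(?:left|top)\s*:\s*-\d+",
    re.IGNORECASE,
)
# Matches rules such as ".lin {visibility:hidden;}" inside a <style> block.
HIDDEN_CLASS_RULE = re.compile(
    r"\.([\w-]+)\s*\{[^}]*(?:visibility\s*:\s*hidden|display\s*:\s*none)[^}]*\}",
    re.IGNORECASE,
)
# Tags that never get a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HiddenLinkFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hidden_classes = set()  # class names declared hidden so far
        self.stack = []              # (tag, is_hidden) for each open element
        self.hidden_links = []       # hrefs found inside hidden elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr_map = dict(attrs)
        classes = set((attr_map.get("class") or "").split())
        hidden = (
            (bool(self.stack) and self.stack[-1][1])  # inherited from parent
            or bool(classes & self.hidden_classes)
            or bool(HIDDEN_STYLE.search(attr_map.get("style") or ""))
        )
        self.stack.append((tag, hidden))
        if tag == "a" and hidden and attr_map.get("href"):
            self.hidden_links.append(attr_map["href"])

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Collect hidden class names from the text of a <style> block.
        if self.stack and self.stack[-1][0] == "style":
            self.hidden_classes.update(HIDDEN_CLASS_RULE.findall(data))


def find_hidden_links(html):
    finder = HiddenLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_links


# One visible link (example.com is a placeholder) followed by the spamlog snippet.
sample = (
    '<p><a href="http://example.com/news">A visible link</a></p>'
    '<style>.lin {visibility:hidden;}</style>'
    '<div class="lin" style="position:absolute;top:-50;left:-3000;">'
    '<font size=1>Links of Interest:<BR>'
    '<a href="http://www.treadmills100.info/York-Treadmill/'
    'Where-To-Get-A-Cheap-Treadmill.cfm">Where To Get A Cheap Treadmill</a>'
)
print(find_hidden_links(sample))
```

On the sample, only the treadmill link is reported; the visible link is left alone. A real crawler would also have to deal with external stylesheets and JavaScript, which this sketch ignores.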

The only thing that will stop Blogspot from becoming 99% spamlogs is to require a CAPTCHA for every post. This would be a pain, and it shouldn’t be necessary for people who host their own sites, but Blogspot users should have to prove they’re human each time. Other thoughts on how to stop this, or whether it should be stopped?

No Responses to “The Strange World of Blogspot Spam Blogs”


I posted about Blogger’s use of captcha and basically found it frustrating to the point that I almost gave up two blogs. I post TN’s lottery results at http://www.tnlotteryresults.com but out of curiosity thought I’d try blogging them also. I had always been curious about whether or not posting directly to blogger, as in http://tnlotteryresults.blogspot.com/, or having it post the blog to your own server, as in http://www.tnlotteryresults.com/blog/, would impact search engine ranking or visitors. I never had the ranking question answered but viewing the log files I have found that each of the 3 methods attracts a hugely different audience to the point that I cannot end my experiment.

I continue posting by hand but the day Blogger turned on Captcha I considered giving up the two blogs. I have considered using the Blogger API to automate the posting process which I think would be legitimate. I think Captcha would stop the spam blogs (and I would like to see them stopped) but I think legitimate blogs like mine would be viewed as spam blogs and also stopped and that would be unfortunate.

7

After reading all the comments and trackbacks, I’m realizing that it’s much more difficult than saying that Google should do something.

Google can’t monitor content, but they could have a button on the floating “next blog” bar that lets you report suspected spam blogs. It would require manpower from Google to process these submissions, though, so I doubt they’ll do it. And false positives are a near-certainty.

But perhaps Google itself has stumbled upon the solution: the rel="nofollow" attribute for external links. How about this: Blogger stays as-is, but if you use Blogspot, all your links get the rel="nofollow" attribute, so it’s impossible for you to contribute to Google rankings.
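To make the idea concrete, here is a minimal Python sketch of what "all your links get rel=nofollow" could look like as a filter applied to a post's HTML before it is published. The function name add_nofollow is made up for this example, and I have no idea how Blogger actually renders posts. It uses regular expressions, so it assumes reasonably tidy markup (no ">" characters inside attribute values), and it replaces any rel value the author already set.

```python
import re

# An opening <a ...> tag; the \b keeps it from matching tags like <abbr>.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# An existing rel attribute (quoted or bare), including its leading whitespace.
REL_ATTR = re.compile(
    r"""\srel\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE
)


def add_nofollow(html):
    """Return html with rel="nofollow" forced onto every <a> tag."""
    def fix(match):
        # Drop any rel the author supplied, then append our own.
        attrs = REL_ATTR.sub("", match.group(1)).rstrip()
        return f'<a{attrs} rel="nofollow">'
    return A_TAG.sub(fix, html)


post = 'Read <a href="http://example.com/story">this story</a> today.'
print(add_nofollow(post))
```

Forcing the attribute at render time, rather than trusting whatever the poster typed, is the point: a spam script cannot opt out of it.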

It may seem harsh to block out a huge number of bloggers from contributing to the PageRank system, but consider where we are now. Legitimate Blogspot users are already at a disadvantage in terms of influence on PageRank. Doing away with all PageRank influence from Blogspot sites would level the playing field.

This problem would be replicated anywhere that free hosting can be taken advantage of. I assume that the same problem exists on message boards running popular software such as PHPBB2; surely scripts can be created to mass-post to these sites. All you have to do is Google “Powered by PHPBB2” and start posting like mad.

That’s why Blogger has a throttle on posting speed, just as djuggler said above. Most forum software has this feature too. But if you’re a script, you can just wait 30 seconds or whatever, and post to another site in another window.
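A toy simulation shows why a per-site throttle barely slows a script. The sketch below is Python with a simulated clock; the 30-second interval is just the "30 seconds or whatever" figure from above, and PostThrottle and count_accepted are hypothetical names, not how Blogger or any forum package implements its limit.

```python
class PostThrottle:
    """Reject a post if the same blog posted less than min_interval seconds ago."""

    def __init__(self, min_interval=30.0):
        self.min_interval = min_interval
        self.last_post = {}  # blog id -> time of its last accepted post

    def allow(self, blog_id, now):
        last = self.last_post.get(blog_id)
        if last is not None and now - last < self.min_interval:
            return False
        self.last_post[blog_id] = now
        return True


def count_accepted(num_blogs, seconds=60, min_interval=30.0):
    """Try one post per simulated second, rotating across num_blogs blogs."""
    throttle = PostThrottle(min_interval)
    accepted = 0
    for t in range(seconds):
        if throttle.allow(blog_id=t % num_blogs, now=float(t)):
            accepted += 1
    return accepted


print(count_accepted(num_blogs=1))   # a single blog, hammered once a second
print(count_accepted(num_blogs=30))  # the same script spread over 30 blogs
```

With these assumed numbers, the single blog accepts 2 posts in a minute, while the script rotating across 30 blogs gets all 60 through. The throttle limits each site, not the spammer.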

18

Sorry .. I’m quite of the opposite viewpoint here. If the sole purpose of these types of blogs is to increase traffic to their other websites, good for them! There have to be better and cheaper ways to advertise our websites besides paying for it. As far as I’m concerned, I should only have to pay for access to the internet. I don’t want to pay for access to find people or to have people find me.

I am a bloggie newbie, but that’s my understanding of what a blog can do for my business. I am trying to develop an information and community website, but I still want referrals to my website. Is that wrong?

The only thing these blogging sites did, that caught your attention in the first place, was not disable the Comment option, in my opinion.

At least it’s not coming to your email box or stealing your email address. How can it be spam? Because you don’t approve?

I am more concerned with ‘professional blogging sites’ like this one, who believe they are the majority of the blogging world and should dictate what is right and wrong before the rest of humanity gets a chance to experience it.

But - I do enjoy reading your blogs and value your opinions and views - just this time, I’m on the other side of the fence.

Take care

HART

20



      Creative Commons License
      We aren't very into all that copyright stuff. Creative Commons licenses are better, so RC is licensed under this one.
      Quote Radical Congruency at will. Inbound links are appreciated, and required for direct quotations.