
Want to Help Google Clean Up Splogs?

In response to Matt Cutts’ request on how Google should work on web spam, a friend of mine gives him a very good summary of how Google can put an end to one of the biggest blights on the web: splogs. In A big free clue for Google, he points out:

Like many bloggers I can spot a splog in less than 10 seconds. The common features:
* Every entry has “wrote an interesting post” “read the rest of the post here” “..talked today about”
* Most entries are uncategorised
* There is an absence of comments…

Now if I can join those dots why can’t Google? Why can’t the other search engines?

He’s so right. If we can quickly spot a splog when we see one, why can’t Google, the omnipotent profiling algorithm, figure this out and put a stop to these? They have plenty to work with, overrun as they are with tons of Blogspot splogs in desperate need of some serious housekeeping. Why not use these splog spotting techniques to clean up their own house first?
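
To make the idea concrete, here is a minimal Python sketch of how those three telltale signs could be combined into a rough score. Everything in it is an assumption for illustration: the `Entry` record, the `splog_score` function, and the equal weighting of the three signals are hypothetical, and nothing here reflects how Google or any other search engine actually detects spam.

```python
from dataclasses import dataclass

# Stock phrases from the list above (hypothetical constant name).
BOILERPLATE_PHRASES = (
    "wrote an interesting post",
    "read the rest of the post here",
    "talked today about",
)

# Category labels treated as "no category chosen" (an assumption).
UNCATEGORIZED_LABELS = {"", "uncategorized", "uncategorised"}


@dataclass
class Entry:
    """One blog post, reduced to the three signals we look at."""
    text: str
    category: str = "Uncategorized"
    comment_count: int = 0


def splog_score(entries):
    """Return a rough 0.0-1.0 score; higher means more splog-like.

    Each signal is the fraction of entries showing it, and the three
    fractions are averaged with equal weight.
    """
    if not entries:
        return 0.0
    total = len(entries)
    boilerplate = sum(
        any(phrase in entry.text.lower() for phrase in BOILERPLATE_PHRASES)
        for entry in entries
    ) / total
    uncategorized = sum(
        entry.category.strip().lower() in UNCATEGORIZED_LABELS
        for entry in entries
    ) / total
    no_comments = sum(entry.comment_count == 0 for entry in entries) / total
    return (boilerplate + uncategorized + no_comments) / 3
```

Note that a perfectly legitimate blog with comments turned off still picks up a third of the score on that signal alone, so a number like this could at best flag a site for a human to review, not ban it outright.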

Until then, we can report spam blogs (splogs) when we find them. If you want to do more, why not tell the world (and Google) how to clean up splogs on your blog? Let your voice and ideas be heard. We’re a creative lot when we put our blogs to it. Why not tell Google what you recommend to clean up its act?

Copyright Lorelle VanFossen, the author of Blogging Tips, What Bloggers Won't Tell You About Blogging.

23 Comments

  1. Posted July 16, 2008 at 5:25 am | Permalink

    Very true. Is Google doing something in this regard?

  2. hara
    Posted July 16, 2008 at 6:55 am | Permalink

    …seems like they still haven´t found the right technique against splogs…no automatic detection of spam in Google is working properly…

  3. Posted July 16, 2008 at 7:47 am | Permalink

    You are so right about these splogs but I think it will be a little tricky to set up rules that wouldn’t rule out perfectly legit blogs. E.g. not allowing comments doesn’t mean that a blog is a splog. Some people are using e.g. WP as a free CMS and don’t allow comments.

  4. Posted July 16, 2008 at 7:49 am | Permalink

    Lorelle, I love your blog. I’m a long time reader but this is my first post on your blog.

    I think that when it comes to spam and splogs there is a very important element that a lot of people are forgetting about. I used to work for a private religious university and one of the major issues on campus was content filtering. While a majority of the people were of the moral opinion that the university should provide a content filtered web and e-mail system, almost all of the Faculty were dead set against it. The reasons were varied and valid, but the most compelling in my opinion was that you can only support filtering so much before you have to be willing to allow someone you don’t know to have access to your intellectual property.

    Do I think that Google is in the perfect place to be able to start filtering a lot of the smut and other undesirable content from the internet? If anyone can do it, they can. However, the rub about being an open blogging community is that you have to be open. And as such, I don’t like the idea of Google deciding what posts are and are not valid, based on an algorithm that some all knowing computer is applying to the situation. Sounds too much like something from 1984.

    All in all, even though I know it’s a pain in the butt, I’d rather use something that I am responsible for. Spam Assassin, Akismet, and other programs exist so that I am the one who controls what goes on in my domain of influence, not some great unknown. Yeah, it sucks to have to do it myself, but at least I know that the buck stops with me.

    This is definitely a hot topic for internet users, and I’m really interested to see how things develop in the next few years.

  5. Posted July 16, 2008 at 8:41 am | Permalink

    @ Trinity777:

    Smut? Was I talking about smut? No. The discussion here is about splogs, not comment spam. Splogs are advertising filled, no original content, waste of time blogs that serve no other purpose than to market whatever they are offering through link bait, content theft, phishing, and…oh, stop me now. The least they are doing is offering porn. The most they are doing is loading up search results with time-wasting nothing.

    Splogs make money by having hundreds and thousands of websites with the same or similar content. They use all types of auto bots and Plugins to grab other people’s content, twist and yank it to look a bit different, or leave it the same, and insert links and ads to promote whatever they are trying to make money with. It’s all automatic. One person can set it up and the rest goes click, clack, smack in the wallet. No human interaction. No human writing content. No personal or direct involvement. It works and they click off the numbers in the bank, which is pennies, but spread across a thousand websites, those pennies add up.

    This is not about censorship. I agree with you that it’s a painful process, but if you and I can spot a splog, then Google should be able to come up with a method to block them from their search results. It doesn’t mean they have to shut them down, though they should on their own blogging service as WordPress.com does.

    Personally, I think the issue is two fold. First, develop an algorithm or human test that verifies if a Blogspot/Blogger blog is a splog. Ban all splogs on Google’s services. That will decrease the weight on the web by tons. It will also clean up the useless search results that clog the user experience. It doesn’t stop sploggers from going elsewhere, but at least Google won’t be directly hosting such ad filled, bandwidth clogging, server pigs.

    Second, put pressure on web hosts to vet the sites they host as splogs or not. This is more problematic and can be seen as a form of censorship – among other legal issues. This will take longer as it means setting up a system and dealing with laws, false positives, and so on.

    It’s a complex issue, but one step at a time. Let’s start with Google cleaning house. 😀

  6. Posted July 16, 2008 at 9:42 am | Permalink

    The easiest way to identify a splog vs. a blog is this:

    Blog – Content geared towards Human Readers.

    Splog – Content geared towards Search Engine Crawlers.

    It’s important to understand the complexity of the situation. The hardest thing for Google at the moment, I would imagine, is not weeding out the splogs but avoiding false positives on legitimate blogs. It would be a hassle if for every 10 or 20 splogs blocked one real blog was blocked because of lack of comments, poor formatting of comments, or because one of its posts was scraped and splogged.

    Also, let’s compare the consequences of being “banned” or “marked” by Google:

    A good Splogger system can deploy hundreds of websites at a time. If they get 10, 100, or 1000 of their “splogs” filtered, they just get new domains, click some buttons, and rapid mass deployment replaces those banned ones. So the consequence isn’t very harsh for sploggers.

    Now, let’s say my personal website gets marked as a splog. All of a sudden I am cut off from 60-70% of all searchers on the internet. Other networks like Digg, Technorati, etc. may block me based on Google’s decision, depending on whether Google publishes its lists of splogs or allows other sites to check if Google has flagged a site as a splog.

    This is devastating for a blogger. All of your work now cut off from the rest of the internet.

    The Solution?

    I’m afraid we might have to move towards a much more sophisticated methodology of publishing content on the internet, much like what email host providers are moving towards with a trust system. Basically, the more legit content you publish, the higher your validity and trust go. Who knows exactly, I’m no expert, but I can appreciate the difficulty of the challenge Google is facing.

  7. Posted July 16, 2008 at 12:52 pm | Permalink

    I feel sheepish…thanks for setting me straight on this one. I misunderstood what a Splog is.

    I agree with Justin Carmony. The software company I work for had a major problem this year when spyware programmers started using a file extension that is used by the Delphi database our software uses for an optional element of our database. Suddenly our support desk was flooded with calls from users who could no longer access their databases, because their spyware detection software had either quarantined our program (thanks Norton) or simply deleted files that were suspect. What if a blog gets blacklisted by mistake? There would need to be some sort of process to plead your case, or verify that you are not a Splog.

  8. Posted July 16, 2008 at 2:37 pm | Permalink

    @ trinity777:

    Which is why Matt Cutts brought the issue before the blogging community so Google could get suggestions straight from the mouths of the experts whose blogs may be impacted by whatever system that is developed.

  9. autworld
    Posted July 16, 2008 at 8:34 pm | Permalink

    I think they are working on this problem. I read yesterday that 77% of all blogs on Blogger.com are spam blogs…

  10. Posted July 16, 2008 at 9:15 pm | Permalink

    I must have sent more than 500 splogs to spam-dump hell simply by hitting the “report as spam” button while going through my wordpress tag surfer. I usually get a little thank-you note from the person looking at it, maybe an added comment that via the IP address they were able to whack dozens more from the same spammer.

    What I’d really, really like though is a T-shirt for all that effort. 🙂

  11. Posted July 16, 2008 at 9:57 pm | Permalink

    It’s interesting that I came across your blog Lorelle, not only is it true, but it happened to me yesterday morning. Now I admit, I’m brand new to the blogging community, but was hit with the doppelblogger on my first post! The best part was that it linked to my blog in two places, and one of them was a hyperlink with someone else’s name as the author.

    I understand that there will always be someone looking for a short cut to make money for doing nothing, but for the amount of time genuine bloggers put into their posts and sites; honestly, it’s a crime.

    Anyone want to go splog hunting with me?

  12. Posted July 17, 2008 at 12:52 am | Permalink

    Thank goodness for Akismet, although it’s a crying shame that the majority of comments reside in there instead of the real comments section.

  13. Posted July 17, 2008 at 4:26 am | Permalink

    Didn’t I just read that Yahoogle controls 90% of web advertising? All those splogs are running AdSense. Seems to me that there’s a connection.

  14. Posted July 17, 2008 at 7:01 am | Permalink

    All the issue here goes back into the bigger problem of the internet – IT IS SO EASY TO GET AN ACCOUNT… FOR FREE!!

    The main purpose of splogs is to earn income, with as little work as possible, from advertisements like AdSense. How hard do you think it is to get an AdSense account?

    The same thing goes for blogs. It is too easy to set up a blog on wordpress.com, blogger.com, myspace…. They are inviting trouble.

    Even if the splogs do not make money from Adsense, they can use it for SEO purpose, which I think is black hat.

    I would strongly suggest that all registration be very detailed, including submitting a user photo, entering a PIN number sent by actual mail, or taking a call from a representative to verify your identity.

    No splogs can survive that.

    Not to mention it works with email spammers as well.

    And we’ll have a better and cleaner Internet world. Isn’t that nice?

    – Rufas

  15. Posted July 17, 2008 at 10:07 am | Permalink

    I have been noticing splogs more and more lately. My Google alert for Albuquerque real estate can include nothing but splogs. So I have made it my mission to search using Google Blog Search and flag all those splogs. The results are getting better. Now instead of 9 out of 10 results being splogs, the first page is 5 out of 10. Much better results. The flagging does work. Some of these sites had a PageRank of 3! Those are not new sites but ones that have been around a bit. Just keep flagging away until Google sets up something that works to keep them out.

  16. Posted July 17, 2008 at 2:41 pm | Permalink

    @ Ken Nickless:

    Again, the issue is not about comment spam. Akismet doesn’t help with splogs. Has nothing to do with splogs. Splogs are spam blogs. They have nothing to do with your blogs unless they use an auto-scraping tool that grabs your content via feeds and abuses it. That’s a copyright issue, a very important but different issue.

    As many have said here, when you search for something, as I reported on in Google, Clean Up Blogger!, depending upon the keywords, it can be pages and pages you scroll through before you get past the splogs to real and legitimately helpful content. If I remember right, it wasn’t until page 16 or something before I got a real site in the search. Splogs clutter up search results, waste bandwidth, server space, and have no value.

    EXCEPT to those who make money from our ignorance and lack of concern. Should they have the right to set up their spam blogs? It’s a legitimate way of making money, though I’d put it in the ranks of whatever you think is not a nice way of making money in your culture, but this is not the issue. The issue is whether or not they should be treated equally in the eyes of search engines as they have no content, thus, no legitimate purpose for search inclusion.

    I wish it was a simple answer, but that’s why I’m turning it over to the bestest and the brightest to help brainstorm the ideas: my readers. 😀

  17. Posted July 20, 2008 at 9:00 pm | Permalink

    The fact is Google doesn’t want to do anything about it. Why should they? These blogs can get huge amounts of traffic, most of them use AdSense, and those two things turn into profit for Google. Never forget they are in the business of making money.

    I did a test on this about three years ago. I bought a domain and put up a WordPress photoblog which simply reposted from a couple of popular Google and Yahoo group feeds. It was completely automatic, including the insertion of tags, keywords, and other SEO. I won’t go into the details, as I don’t want people doing the same thing, but on a non-Google PPC network (I wasn’t going to risk my account) I earned $1350 in my third month and it went up from there. The blog hit PR 6 after 7 months. I flipped it when my host threatened to shut me down. Had this been an AdSense based site I probably would have made 2 or 3 times what I was making. If I made that much, think how much Google would have made. They have no interest in getting rid of splogs.

    On the other hand if you study what these splogs are doing in terms of SEO you can beat them at their own game, even if it takes a while to do.

  18. Posted July 31, 2008 at 2:54 pm | Permalink

    From necessity, I’ve recently had to look through a number of these blogspot splogs which have been scraping my wife’s blog. Virtually all of them seem to peel back in the top right hand corner to reveal dodgier sites beneath. This is one feature that without argument contravenes Blogger regulations – and is 100% indicative of a splog. One would have thought this would be easy for Google to detect automatically and remove, surely?

  19. Posted August 1, 2008 at 4:19 am | Permalink

    I wish there was a better solution to this issue.

    My personal blog is self-hosted and powered by WordPress (yay!) but I contribute to blogs on other platforms. Today the group blog of the Twelve by Twelve Collaborative Art Quilt Project was locked by Blogger’s spam prevention robots as it was [incorrectly] identified as a potential spam blog. As we were due to “reveal” our latest challenge quilts on 1 August, I suspect that the robot was triggered by the batch of posts scheduled for today. Now we cannot publish any new posts until Blogger conducts their review to verify that it is a non-spam blog and unlocks it. This process could take some time…

  20. Posted August 1, 2008 at 8:55 am | Permalink

    @ Brenda:

    And it might take only a day or two. I do hope you are blogging the issue on your non-blogspot blogs, and that you have a backup of your blogspot blog. Why not import it via the Blogger/XML import to a WordPress or WordPress.com blog and get it off of Blogspot.

    If you are doing huge projects like this that are dependent upon free blog service requirements, you can get in trouble, so I recommend independent hosting for such projects so you have more control. Good luck!

  21. Tshiananga
    Posted August 4, 2008 at 5:34 am | Permalink

    hey I just wanted to say welcome to the free world!
    There’s no way of stopping them whatsoever. They will always find a new backdoor, another crack in the seams that will flood your ship with splogs, spam, or whatever they can squeeze in there.
    So for those willing to fight against it, I’m always down for a good cause, but to do the impossible is no resolution to me.
    The reason people advertise, or the reason people promote their products, is to enhance your life, or to make a ruthless profit! But this you will never know unless you bite the cookie, and perhaps it might bite you back. Or spam/splog you. The real reason they get in is because you left the door wide open, with an invitation that says, frack me! If you want to clean up the mess they made, you should seriously consider unplugging your PC and start blogging pen and paper, 17th century style, where you can seal your private discussion with a nice document seal! Maybe add a royal ring stamp as well. All I can say is that it’s only gonna get worse. Bless you for trying, though! The Q-meister

  22. greenman023
    Posted August 23, 2014 at 6:58 am | Permalink

    Hi, I’m curious as to why no one seems to have considered action against the advertisers, i.e. Google, Amazon, Alibaba, and the main affiliate programs that sploggers exploit. For it is not only the splogger earning from copyright infringement but the big advertisers. It is they, rather than the splogger, who are the true leeches.

    Would it not therefore be possible to take a class action against some of these advertisers to recover ALL the monies they have earned from copyright infringement? If advertisers thought they could lose their income then perhaps they would make an effort to ensure sploggers don’t exploit them to exploit bloggers.

    regards

    malcolm

    • Posted August 26, 2014 at 7:04 pm | Permalink

      I think that’s a brilliant idea. “Advertisers” say that they penalize splogs, but how would they know? If you find a splog, you may report the site for abuse to the advertisers through the links required to be on the ads, though most people don’t. That places the onus on us, the web visitors, which is a terrible decision.

      As for the copyright infringement, I don’t believe it qualifies for class action, but it is a great idea. Many attempts to penalize splogs have been done over the years, as described in this post written many years ago. Google, itself, is home to the largest number of spam blogs and splogs through Blogspot/Blogger.com, and they are making money off them while penalizing them through their algorithm.


3 Trackbacks/Pingbacks

  1. […] saw this post in my feed reader today that was titled “Want to Help Google Clean Up Splogs,” and was drawn into it right away. For those who don’t know what a “splog” […]

  2. […] there’s a good reason for this. It’s not anything to do with splogs or the overall sameness look of most Blogger blogs. It has to do with the Blogger comment […]

  3. […] Finally, my sanity solution is to know that splogs can be reported. You know clicking on that flag in blogger or reporting to google. Perhaps more but something can be done. I did not know when I first started blogging and became upset when I saw it from a google alert on my articles but now I do. Read this post from lorelle.wordpress on helping to clean up splogs. […]
