
AntiLeech Splog Stopper: Fighting Back Against Content Thieves

I often have these kinds of thoughts: “What if smokers had to ask for a smoking section in a restaurant, assuming all restaurants catered first to non-smokers?” “What if everyone thought first of asking permission before borrowing and taking what wasn’t theirs?” “What if the people were able to vote on whether or not they really wanted their country invaded, or just their leaders replaced?”

These thoughts all boil down to responsibility. I think the weight of the responsibility should be on the abuser. If you are thinking of doing wrong, I think bells and sirens should go off inside your head for a long time before you can act.

Unfortunately, the responsibility to protect ourselves against evil and idiots lies with us, not with the abuser. Among the new tools available to us bloggers, the responsible ones, is Owen Winkler’s AntiLeech WordPress Plugin.

The AntiLeech WordPress Plugin doesn’t stop blog feed scrapers and splogger bots from grabbing your blog’s content. According to Owen Winkler:

No, it does better than that. It produces a fake set of content especially for them that includes links back to your site (and mine, too, ok?) and sends it only to them. When they steal this content, it appears online just like normal, except now you’ve turned the tables on them. You’re actually using the sploggers to promote your own site.

AntiLeech can detect a splogger bot using its User-Agent string (an identifier that some bots send when they are collecting data), or by IP address. You can enter a User-Agent or an IP address into the Options panel of your WordPress blog. When a visitor with a qualifying (any checked option on the options page) User-Agent or IP address visits your site, they will see only the generated content. They will see it in your page layout and in your feeds. Anywhere you’re normally outputting content, that’s where the fake content will appear to them.

Regular users whose browsers do not match these strings will see your normal content. RSS aggregators should be able to display your content normally, too.

Help Defeat the Sploggers with AntiLeech by Owen Winkler

When a splog (spam blog) grabs the feed from your blog and uses it as its own content, it is called scraping. Different blogs have different copyright policies. Use of full content feeds on sites with advertising or sites considered “commercial”, even with links back to the original site, is often a violation of the most common blog copyright policies. Putting a stop to these content thieves can be difficult, as seen recently with the Bitacle Battle.

Then Owen Winkler, WordPress hero, straps on his splog fighting coding tools and steps forward to help us fight back against the splog content thieves. He understood that everything and anything visiting your site leaves behind a footprint. The key is finding their footprint and identifying it as a splog and then stopping them from getting their foot in the door in the future.

The AntiLeech WordPress Plugin inserts a small “AntiLeech” graphic into your feed’s output. The graphic helps AntiLeech collect information on User-Agents that you might want to block. The Plugin’s Administration Panel lists the page on which it first saw each User-Agent via the graphic, and provides information to help you decide whether or not to block that User-Agent. From the Admin Panel, you can choose whether or not to block that site’s access.
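To make the detection graphic a little more concrete, here is a minimal Python sketch of the idea as Owen Winkler describes it in the comments below: the image URL placed in the feed carries the User-Agent of whoever fetched the feed, and when that image is later requested from a page on someone else's site, the User-Agent is noted along with the first page where it was seen. This is not the Plugin's actual code (AntiLeech is a WordPress Plugin). The host `example.com`, the file name `antileech-beacon.gif`, the `ua` query parameter, and both function names are assumptions made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlparse

SITE_HOST = "example.com"  # assumption: the blog's own host name


def beacon_url(feed_user_agent: str) -> str:
    """Build an image URL that remembers which User-Agent fetched the feed."""
    query = urlencode({"ua": feed_user_agent})
    return f"https://{SITE_HOST}/antileech-beacon.gif?{query}"


def record_sighting(sightings: dict, image_url: str, referer: str) -> None:
    """Note the first off-site page on which a feed fetcher's image showed up."""
    referer_host = urlparse(referer).hostname or ""
    # Simplification: ignore requests with no referer or from our own pages.
    if not referer_host or referer_host == SITE_HOST:
        return
    params = parse_qs(urlparse(image_url).query)
    user_agent = params.get("ua", [""])[0]
    if user_agent:
        # setdefault keeps only the first page seen for each User-Agent.
        sightings.setdefault(user_agent, referer)
```

The resulting dictionary plays the role of the list of observed User-Agents: it only collects suspects, and deciding whom to block is still left to you.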

AntiLeech will add information to your robots.txt file, a file in the root directory of your site that contains instructions for web crawlers and web bots, the computer programs that visit your site and collect information and data. Instructions denying access to these splog abusers and scrapers are added to the robots.txt file, putting a stop to their visit before they get more than a toe in the door.
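For anyone who has not looked inside a robots.txt file, the following Python sketch shows the kind of entries involved. It uses the standard `User-agent` and `Disallow` directives; the exact lines AntiLeech writes are not shown in this article, so the function name `robots_rules` and the output layout are assumptions for illustration only.

```python
def robots_rules(user_agents: list[str]) -> str:
    """Return robots.txt text asking each named bot to stay out of the whole site."""
    blocks = [f"User-agent: {name}\nDisallow: /" for name in user_agents]
    # Blank line between records; trailing newline only when there is content.
    return "\n\n".join(blocks) + ("\n" if blocks else "")


if __name__ == "__main__":
    print(robots_rules(["Bitacle"]), end="")
```

Keep in mind that robots.txt is a request, not a lock: only bots that choose to honor the file will stay away, which is why the fake-content side of the Plugin matters.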

Once you have activated the AntiLeech Plugin, you will find its panel under Options > AntiLeech. You have several options you can control from there.

The Observed User Agents area helps you detect who may be stealing your content through your blog’s feed. Once you have determined which splogs are stealing your feed content, you can enter an identifying name in this section. For example, if bitacle.org is stealing your feed content, add “bitacle” in the form to block it. Any access with an identifying footprint containing “bitacle” is considered evil, and AntiLeech will kick into action, delivering fake, truncated, and other “unhelpful” information to the scraping site.

Under IP Addresses, you can enter part or all of an IP address to identify abusers. Many serious sploggers will play games with hiding and changing their IP address, but not all. Using IP address and User Agent name together gives stronger protection. If you add the IP address to the list, they will also get the “faux” or fake information when they access your feed.
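The matching described in these two sections can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a case-insensitive substring test for User-Agent names and a simple "starts with" test for partial IP addresses; the Plugin's real matching rules are not spelled out here, and the class name `LeechList`, its fields, and the sample addresses (taken from the reserved documentation ranges) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LeechList:
    """Hypothetical stand-in for the entries saved on the options page."""

    user_agent_fragments: list[str] = field(default_factory=list)
    ip_prefixes: list[str] = field(default_factory=list)

    def is_leech(self, user_agent: str, ip: str) -> bool:
        """True if the visitor matches any listed User-Agent name or IP prefix."""
        ua = user_agent.lower()
        # Empty entries are skipped so a blank form field never matches everyone.
        if any(frag and frag.lower() in ua for frag in self.user_agent_fragments):
            return True
        return any(prefix and ip.startswith(prefix) for prefix in self.ip_prefixes)


def choose_content(leeches: LeechList, user_agent: str, ip: str,
                   real: str, fake: str) -> str:
    """Serve the real post to normal visitors and the fake one to listed leeches."""
    return fake if leeches.is_leech(user_agent, ip) else real
```

With `LeechList(["bitacle"], ["203.0.113."])`, a visitor announcing itself as "Bitacle bot" gets the fake text wherever it connects from, and so does any visitor whose address begins with 203.0.113. regardless of User-Agent. In this sketch a partial IP should end with a dot, otherwise "203.0.11" would also catch 203.0.110.x through 203.0.119.x.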

The Output Control section is probably the most fascinating as it gives you an insight into what AntiLeech really does. You can control the various options on how AntiLeech will respond to the User Agents and IP addresses you’ve targeted. They are:

  • Do not insert the AntiLeech image for detecting leechers into feed output.
  • Do not link to my blog inside the generated posts.
  • Do not publish the correct link (in the tag) in my blog’s RSS.
  • Do not attempt to remove AdSense iframes with javascript on remote pages that display feed output.

Those are some serious options. I especially like the last one. Stopping income generated from your stolen content is brilliant.

The last option in the Output Control section is setting what you want displayed in the feed information sent to the scraper. It can be the generated content, truncated content, or custom text that you write. You can say anything you want, but a good start would be: “You may be reading stolen content. Please visit the author’s site to read the original, copyrighted material, and find even more great related content.”
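As a rough picture of those three choices, here is a small Python sketch that returns generated, truncated, or custom text for a scraper. How AntiLeech actually builds its generated content is not described in this article, so the filler sentences, the 40-word cut-off, the mode names, and the function `leech_output` are all assumptions for illustration.

```python
# Assumption: placeholder sentences standing in for the Plugin's generated text.
FILLER = (
    "This article is not available here.",
    "The full post lives on the author's own site.",
)


def leech_output(post_text: str, mode: str, blog_url: str,
                 custom_text: str = "", max_words: int = 40) -> str:
    """Return what a listed scraper receives in place of the real post."""
    if mode == "custom":
        return custom_text
    if mode == "truncated":
        words = post_text.split()
        excerpt = " ".join(words[:max_words])
        if len(words) > max_words:
            excerpt += " ..."
        return f"{excerpt} Read the original at {blog_url}"
    if mode == "generated":
        return f"{' '.join(FILLER)} Read the original at {blog_url}"
    raise ValueError(f"unknown output mode: {mode!r}")
```

Whichever mode is chosen, the point is the same: the scraper republishes text that sends readers back to you instead of text that competes with you.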

The last option to set in AntiLeech is the option to control your FeedBurner Redirects. While many think that giving control of your blog’s feeds to FeedBurner will protect you from scrapers and copyright theft, it doesn’t. They help, but they don’t always stop splogs from using your feeds.

This is brilliant stuff and a fantastic complement to the Digital Fingerprint Detecting Content Theft WordPress Plugin, which I highlighted recently. That Plugin puts a unique searchable phrase into your feed content and displays search results on your WordPress Administration Panel, helping you track whether your content shows up in search engines via splogs, scrapers, and other content thieves.

For those using blogs, they’ve been working overtime to put a stop to some big scrapers such as Bitacle.

Only when we, the responsible ones, are well armed and fight back will we see an end, or at least a decline, of the evildoers. I just wish they would bear more of the responsibility for other people’s interests and not just their own. Don’t you?

Copyright Lorelle VanFossen, member of the 9Rules Network

32 Comments

  1. Posted October 5, 2006 at 10:19 am | Permalink

    I’m a big fan of Owen’s plugin AntiLeech. I’ve been thinking of various ways to use it other than the simple ‘Warning, you are reading stolen content’. I like your bells and sirens type message in the post above, and when my content gets stolen bells and sirens do go off, I get angry, and I want to get even with the splog. Why not serve up fake content that gets the splog in trouble with their host and advertisers?

    You could write up some text to sell guns, drugs, and ask people to click on ads. Then, when this dirty content shows up on the splog, report them to Google / Yahoo / their host. Game over for the splog. I’ve explored this idea more fully in “Fight Dirty by Entrapping Splogs Using AntiLeech.”

    This is a dirty fight. It’s time we nice rule and law abiding bloggers got a little mean, and learned to throw a little sand, hit low, and especially, roll with the punches. We need to stop being such easy marks.

  2. Posted October 5, 2006 at 11:00 am | Permalink

    i like this plugin a lot. i just updated to the most recent version. i just hope it works. i dont need someone else claiming my life is theirs.

  3. Posted October 5, 2006 at 11:51 am | Permalink

    Any official word on how well this works against Bitacle?

  4. Posted October 5, 2006 at 12:25 pm | Permalink

    Nitpick from the grammar Nazi: It’s scraping, not scrapping. Sorry.

    I’ve been watching discussion around this plugin, but I’ve been hesitant to use it since I don’t really understand the way it works. It sounds like it has the potential to prevent my regular readers from receiving my true feed, so I don’t want to use anything that will prove problematic on that front. How is Anti-leech about false positives?

  5. Posted October 5, 2006 at 1:11 pm | Permalink

    In my case it works based on the IP address, since bitacle seems to use a common user agent string (the plugin lists all user agents that have accessed the feed).
    Here’s my latest post (and the first to have been successfully antileeched), or at least the generated faux content:
    http://de.bitacle.org/v/249zkosalfyc0/herbst.html?usrmode=1

    I use the IP addresses I found in this comment: http://www.basicthinking.de/blog/2006/09/21/bitacle/#comment-70589

  6. Posted October 5, 2006 at 2:01 pm | Permalink

    Jim: Thanks. No wonder my spell check didn’t catch that. Scrapping is for scrapbook lovers, and scraping is what gets the gunk out from fingernails. ;-) Thanks for the catch.

    Heliologue: It works as long as you input the information bitacle is using to scrape your blog’s feed. They may change their technique, since they seem to be determined, so you might have to stay on top of this. The Plugin reports on suspected scrapers (and is that scrappers or scrapers? ;-) ) and there is information around the web now on the IP addresses they are using.

    In other words, like all responsible efforts to stop evil, you have to do the work to keep up with the user agents and IP addresses used by bitacle and other evil sploggers. So the plugin works only as much as you put work into it. But it makes the job so much easier.

  7. Posted October 5, 2006 at 7:26 pm | Permalink

    Although this plugin is indeed brilliant, it has one fatal flaw (at least, for some users): it doesn’t help if you use services like FeedBurner. I’ve been battling for the last couple of weeks over whether or not to force my visitors to switch feeds so that I can take advantage of plugins like this. I’d hate to start back at 0 RSS subscribers.

    But I might do it anyways :(

  8. Posted October 5, 2006 at 8:11 pm | Permalink

    Actually, it does. If you look in the article and try the plugin you will find that there is a special option for controlling Feedburner fed feeds.

  9. Posted October 5, 2006 at 8:41 pm | Permalink

    lol, Lorelle.

    scrape, scrapes, scraping, scraped, scraper

    I think that just about covers it. ;)

  10. Posted October 6, 2006 at 10:40 am | Permalink

    Lorelle, I would ask this question to Owen, but I can’t find his contact info, and the plugin pages don’t explain this:

    Do you know how AntiLeech determines that a user agent or IP is suspicious?… Exactly what’s the test? (e.g. ignoring of robots.txt? “browsing” behavior? referrers?).

    The plugin’s functionality sounds promising, but without more information, it worries me that it could report false positives. Do you have more information?

  11. Posted October 6, 2006 at 11:15 am | Permalink

    That’s odd. Owen Winkler usually has comments open and such.

    Anyway, how can this plugin determine false positives when it is up to you to determine if the splogger is splogging your site (stealing your content). It returns a list of potential user agents that it has detected are potentially scraping content from your site. You check to see if they are, and if they are, you put a check next to their user agent name and/or IP address. YOU CHOOSE, you decide, and you pick who is playing nice.

    If you are allowing your feeds to be picked up and used by other sites, as syndication or otherwise, then you DO NOT want to include them in your leech list. You have total control. No chance of false positives when you are in charge of deciding who stays, who plays, and who goes.

    Does that help?

  12. Posted October 6, 2006 at 11:23 am | Permalink

    Thanks Lorelle. I knew that the plugin doesn’t automatically block things for you. What I’m wondering about is how the plugin determines that something may be scraping your content. How it comes up with the list of “suspects” for you to research.

  13. Posted October 6, 2006 at 11:25 am | Permalink

    I’ll try to answer a few questions.

    The first thing to know about AntiLeech is that you have full control over what it blocks, and it won’t block anything without your telling it to. AntiLeech uses the following process to create a list of potential user-agents to block:

    A small image is embedded in your outgoing feed. When a browser encounters that image on a splogger’s site, it requests the image from your site’s server. Because of this, AntiLeech knows that your image (and your content) is appearing on some page that you don’t control. The URL for this image is specifically created for the user-agent that requests it. AntiLeech uses this URL information to build its collection of potential user-agents and IP addresses to block.

    Will it report false positives? Well, no, because it reports any suspicious user-agents that meet the criteria I’ve described. You control which user-agents to actually block, so AntiLeech is never going to block someone unless you’ve told it to. Unlike email spam blockers, legitimate browsers are extremely unlikely to announce themselves using a string that you’re blocking, like “Bitacle”.

    So unless your visitors browsers identify themselves as “Bitacle”, they’re going to get the right content. Since most popular browsers don’t even allow you to change the user-agent without installing extra parts or setting odd settings, this simply isn’t going to happen.

    IP addresses are a little different in that you want to be sure that the IP you’re going to block is actually always the person that you know to be scraping your site. For some people, their ISP provides a different IP address every time they connect. If a splogger uses their home IP to scrape, then it’s possible it’ll change every time they log on. Can you block these IPs? Sure, but it’s likely that someone else who uses the splogger’s IP will later be unable to access your site. Bitacle, for example, seems to scrape from a system connected via an ISP in Spain.

    To be clear about FeedBurner, the options in the plugin only help you redirect your feeds to them instead of using the Ordered List plugin, which has a couple of issues. Currently, I’m trying to gauge the advantage of using FeedBurner (yes, I’m a paid subscriber), since they don’t seem to provide any method of managing the splog problem with the feeds that they re-publish. The only thing I see that is really useful is the feed statistics, which I’m about to get from somewhere else. As soon as I figure out how to cleanly bring my feeds back in-house away from FeedBurner, I’m going to.

    The AntiLeech plugin really just simplifies the process that you’d otherwise accomplish using complex .htaccess rules. Really, if you’re comfortable with the .htaccess rules, and you don’t want the additional features like providing unique content to the sploggers, I recommend using the .htaccess rules instead of this plugin because they’re much more efficient (as would be any function that takes place at the server level, not that the plugin is not efficient).

  14. Posted October 6, 2006 at 11:37 am | Permalink

    This helps a lot Owen. Thank you!

    I currently block things via .htaccess, but from your response I see that the added-value of your plugin (besides revenge) is that it yells back at me reporting that the image is somewhere. So even if Bitacle or somebody previously blocked changed their identification, AntiLeech will find them again.

    What a bummer that this can’t coexist with FeedBurner. Please keep us updated on your findings.

    So… if you’re still checking these comments, I have one last question: Sounds like your plugin won’t mess up with the .htaccess file. Right? It doesn’t require to open .htaccess for the server to edit, or does it?

  15. Posted October 6, 2006 at 11:44 am | Permalink

    Hmm. Here’s a thought… Sorry:
    So, if Feedburner removes the image, hurting AntiLeech’s ability to find sploggers… The Digital Fingerprint (the other new plugin) will still go, and help you identify sploggers.

    Does AntiLeech allow users to manually enter offending IPs or user agents? I’m sorry.. I probably should just download the thing and try it myself. But if you could answer this question…

  16. Posted October 6, 2006 at 12:20 pm | Permalink

    Thanks, Owen. Hopefully that will help answer a lot of people’s questions.

    Maria: YES. You manually enter in offending IPs or user agents if you want. Otherwise, if they pop up on the list, you just check them in the check box to add them to the leech list.

    It’s really simple to use and you have total control over what happens. Give it a try and most of your questions will be answered.

  17. Posted October 6, 2006 at 2:47 pm | Permalink

    Maria: I think your other questions were answered, but specifically, no, AntiLeech does not modify your .htaccess file at all. It takes control via WordPress’ existing rewrite rules. The downside is that it will only protect WordPress. So if you have some other way to manage content on your site, you’ll need to protect it some other way.

  18. Posted October 6, 2006 at 4:01 pm | Permalink

    Sounds great, Owen.
    I’ll definitely give AntiLeech a try. Regardless of whether it’ll work for me or not (and I hope and think it will), I wholeheartedly thank you for joining the team of copyright superheroes. We need more!!!

  19. Posted October 8, 2006 at 3:13 am | Permalink

    I’m just wondering, how many of us actually get our content stolen? And how many bloggers actually have something worth stealing? :) [no offense to anyone]

  20. Posted October 8, 2006 at 8:39 am | Permalink

    It’s not about having something “worth” stealing. A huge number of WordPress.com blogs were grabbed by bitacle. A lot of them had little original or “worth” stealing, in my very humble opinion comparatively. Stuff is stolen all the time that you or I may think has little value, but it all has value to the owner.

    Those who care about what they write and who may be abusing it, tend to find more abuse than the casual blogger who doesn’t investigate the abuse of your site. So coming up with numbers means taking care and concern, self-interest, willingness to investigate, and professionalism into account. The more you do this as your business, the more likely you are to pay attention to these things and not call copyright infringement “flattery”.

    Some sploggers are very particular, grabbing only specific keyword related content. I’ve found my main site, Taking Your Camera on the Road, listed with many sploggers because I had the keywords in my article, not because my article had anything to do with their advertising product. Over four years ago, one was selling aquariums, fish ponds, and related equipment and grabbed my article on photographing fish and sea life through the glass of an aquarium. A little related, but not at all.

    Anyone’s stuff can be stolen. It can just take work to track them down. Through WordPress Plugins like AntiLeech and Digital Fingerprint, it’s getting easier to track the thieves.

  21. Posted October 8, 2006 at 8:40 am | Permalink

    Oh, and the answer is likely to be in the millions.

  22. Posted October 9, 2006 at 5:27 am | Permalink

    Hmmm, I have installed it and now I’m wondering how it works? Is there a man-page or example?

  23. Posted October 9, 2006 at 6:54 am | Permalink

    Hmm, when all else fails, read the instructions? ;-)

    Once you have activated the AntiLeech Plugin, you will find its panel under Options > AntiLeech.

  24. Posted October 11, 2006 at 6:46 am | Permalink

    This has been an eye opener for me. I never thought that my content could be stolen! But after reading your post, I realized that “Hell! When people can steal almost your life in today’s world, they can steal your blog’s content!”

    Thanks Lorelle! You have been an eye opener for me. I’ll download the Plugin right now and get this over with.

  25. jwideman
    Posted November 27, 2007 at 4:29 am | Permalink

    Don’t indexing services punish you for being linked by known splogs?

  26. Posted January 6, 2008 at 5:31 am | Permalink

    The download link doesn’t work. Anyone know where i can download it?
    Thanks

  27. Posted January 6, 2008 at 12:46 pm | Permalink

    @ andly:

    I’ll let Owen know that he’s having problems with his site. I don’t know if this is the latest version, but you can download the Plugin script, and copy and paste it into a text file and save and upload it to your Plugins directory from antileech.php.

  28. Posted February 19, 2009 at 10:09 pm | Permalink

    Does this plugin still work with 2.7? I have tried it and my rss feeds are not being affected when I’m testing with it.

    • Posted February 19, 2009 at 11:26 pm | Permalink

      If you are using WordPress.com, you cannot use WordPress Plugins. If you have a self-hosted version of WordPress, it should work. Contact the Plugin author for information on the status of the Plugin for the latest version of WordPress.

  29. Posted April 3, 2010 at 6:05 am | Permalink

    Hi Lorelle, thanks for sharing this great stuff, I hope this plugin will work fine for me to overcome the splog problem

  30. iLuLu
    Posted October 20, 2010 at 12:38 pm | Permalink

    Hi there. My blog is being scraped and I don’t know what to do! I installed the plugin you suggested – Antileech and it worked for all of 2 days! so now I have no idea what to do. I have tried reporting these splog sites to google. I’m a fairly new blogger and trying to build an audience and online reputation and I spend hours and money to buy photos and to write my own content and to see it being reproduced word for word is so discouraging. I tried contacting the plugins owner but he does not provide support
    thanks!


27 Trackbacks/Pingbacks

  1. [...] You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site. Technorati: fighting spam « b5 Media VCFunding [...]

  2. [...] AntiLeech Splog Stopper Fighting Back Against Content Thieves (tags: wordpress plugin copyright) [...]

  3. [...] AntiLeech Splog Stopper: Fighting Back Against Content Thieves (Tags: wordpress plugin) Social Bookmarking:These icons link to social bookmarking sites where readers can share and discover new web pages. [...]

  4. [...] The issue of sploggers getting bloggers‘ content has been quite a big thing these days. Lorelle blogged about this nifty plugin from Owen Winkler. It is aptly called AntiLeech WordPress plugin. It not only gets rid of the sploggers. It also generates fake content to show up on the splogger’s site (or sites, as the case might be). [...]

  5. [...] AntiLeech Splog Stopper: Fighting Back Against Content Thieves [...]

  6. [...] WordPress users recently got help from some creative WordPress Plugin authors. Check out AntiLeech Splog Stopper: Fighting Back Against Content Thieves and Digital Fingerprints Help Track Blog Content Theft, WordPress Plugins that will not only help you put identifying unique elements inside of your feed content, but also report back on who is ripping off your blog’s content. [...]

  7. [...] A relatively new plugin, AntiLeech has already garnered a good amount of press. The plugin, which is by Owen Winkler, works by misdirecting scrapers. It identifies scrapers through a variety of methods and directs suspected bots to dummy content, content that is determined by the user. [...]

  8. [...] One of the reasons I love WP is the third party stuff like the AntiLeech Splog Stopper: It produces a fake set of content especially for them that includes links back to your site (and mine, too, ok?) and sends it only to them. When they steal this content, it appears online just like normal, except now you’ve turned the tables on them. You’re actually using the sploggers to promote your own site. [...]

  9. [...] AntiLeech Splog Stopper: Fighting Back Against Content Thieves [...]

  10. [...] more aggressive approach is taken by AntiLeech (Review by Lorelle). Make sure you know exactly what you are doing with this [...]

  11. [...] 1. AntiLeech, a plugin that “helps prevent content theft by sploggers” and  a detailed article explaining the benefits of AntiLeech Splog Stopper: Fighting Back Against Content Thieves; [...]

  12. [...] posts and see if the plugin works. For those of you not familiar with content-leechers, check out Lorelle’s post on the topic. I prefer not to link any content-leeching sites, as this would just pad their rankings and [...]

  13. [...] Looks like Owen Winkler (Antileech) has already written this! Kudos! Lorelle gives an overview and also recommends Digital Fingerprint Detecting Content Theft WordPress [...]

  14. [...] was going through my logs banning spammers (using the Antileech plug-in, so let me know if you see feed weirdness) and I was pleased to find that according to [...]

  15. [...] AntiLeech Splog Stopper – Fighting Back Against Content Thieves [...]

  16. [...] AntiLeech Splog Stopper: Fighting Back Against Content Thieves « Lorelle on WordPress – The AntiLeech WordPress Plugin [...]

  17. [...] This plugin may be a bit confusing for some users. There is an excellent article about using this plugin at AntiLeech Splog Stopper. [...]

  18. [...] What is it?  From the plugin page (the plugin page is currently MIA, but you can find more info at this website):  “What does AntiLeech do? AntiLeech does not prevent the splogger bots from accessing your [...]

  19. […] AntiLeech Splog Stopper: Fighting Back Against Content Thieves […]

  20. […] AntiLeech Splog Stopper: Fighting Back Against Content Thieves […]

  21. […] AntiLeech Splog Stopper: Fighting Back Against Content Thieves […]

  22. […] AntiLeech Splog Stopper: Fighting Back Against Content Thieves […]

  23. […] AntiLeech Splog Stopper: Fighting Back Against Content Thieves […]

  24. […] AntiLeech Splog Stopper and Digital Fingerprints WordPress Plugins (my reviews of these options) can be used to track content thieves by inserting digital “fingerprints” into your content’s feed which then can be used to search search engines to find the unique content or “fingerprints”. […]

  25. […] AntiLeech Splog Stopper: Fighting Back Against Content Thieves […]

  26. […] can read more about how these splog-stopping WordPress Plugins work in my reviews on AntiLeech Splog Stopper: Fighting Back Against Content Thieves and Digital Fingerprints Help Track Blog Content […]

  27. […] AntiLeech Splog Stopper: Fighting Back Against Content Thieves […]
