Last week, an article I wrote for the Blog Herald, Blogging Outside of Your Community By Not Blogging in Your Native Tongue, caught the attention of more than just the readers. It caught the attention of the staff of the Blog Herald by attracting a very unusual trackback from a new kind of copyright-violating splogger.
Editor Tony Hung reported on the new type of splogger to warn bloggers, even before we had all the information about how they are working. It clearly appears that they are attacking WordPress blogs, often using WordPress Plugins that specialize in feed scraping and other evil doings.
Master Copyright Expert, Jonathan Bailey of Plagiarism Today, an adviser and contributor to the Blog Herald, investigated this and wrote in Protecting Your Content From the Spinning Spammers about this new trend in site scraping.
Jonathan calls them “Spinning Spammers”, and they are using Plugins and utilities for synonymized scraping of your blog. The process scrapes your blog feed and then “translates” your content using synonyms to replace recognizable words.
Here is an example comparison of my article, followed by their “translation”.
Yesterday, I wrote an analogy of comparing blogging to dancing, and how it helps to know the steps, but I also addressed the issue of blogging in your native language compared to blogging in English.
Words carry a responsibility. They convey meaning. They reek with intent. Change a word and you change the meaning.
Their synonymized version:
Yesterday, I wrote an faith of scrutiny blogging to dancing, and how it helps to undergo the steps, but I also addressed the supply of blogging in your autochthonous module compared to blogging in English.
Words circularize a responsibility. They intercommunicate meaning. They exudate with intent. Change a word and you modify the meaning.
Grabbing content through your blog’s feed and inserting or replacing synonyms in the content, typically keywords the splogger needs to get the page ranking and search terms to attract attention, has been around for a long time. However, this form of conversion is dramatically different. The words don’t translate into recognizable search terms.
The problem with this technique is that it also makes it harder for the blogger to identify the copyright violation in the trackback. The first sentence of the “translated” version appeared in the trackback on the Blog Herald. If it weren’t for the use of the word “autochthonous”, I would have thought it was a nice article about my blog post, but I couldn’t figure out what the heck the word meant. So I checked out the trackback link and found my blog post regurgitated into a strange version.
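For bloggers comfortable with a little scripting, here is a minimal sketch of one way to check a suspicious trackback against your own post. It is not a tool mentioned by Jonathan or the Blog Herald, just an illustration using Python's standard `difflib` module. The function names `words` and `shared_word_ratio` are made up for this example. The idea is that synonym swapping leaves most words, and their order, untouched, so a word-by-word comparison still scores high even when an exact-phrase search finds nothing.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into a list of plain words."""
    return re.findall(r"[a-z']+", text.lower())


def shared_word_ratio(original, suspect):
    """Return a 0.0 to 1.0 score for how much of the word sequence
    two texts have in common (1.0 means identical word sequences)."""
    matcher = SequenceMatcher(None, words(original), words(suspect), autojunk=False)
    return matcher.ratio()


original = ("Words carry a responsibility. They convey meaning. "
            "They reek with intent. Change a word and you change the meaning.")
spun = ("Words circularize a responsibility. They intercommunicate meaning. "
        "They exudate with intent. Change a word and you modify the meaning.")

# A high score suggests the suspect text was built from the original.
print(round(shared_word_ratio(original, spun), 2))
```

A score close to 1.0 is only a hint that the page deserves a closer look, not proof of infringement. Any cutoff you choose for "too similar" is a judgment call, and heavily spun or back-translated text may score much lower.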
However, the broken English belies the full extent of the problem. Spammers create these works by taking posts from legitimate bloggers and then running them through an algorithm. This can involve using a thesaurus to find synonyms for the words in question or an automatic translation program to convert the work into another language, possibly then converting it back to English.
This process of modifying the content before reposting it is often called “spinning”. Spinning a work before republication has several advantages, the largest of which is that Google is less likely to detect the work as a duplicate and is thus likely to rank it higher. However, almost equally important is that it is much harder for victims of plagiarism to detect and follow up on the misuse, making this kind of abuse much harder to stop.
The good news in all of this is that, since so little of the content remains the same, the odds of the search engines penalizing the victim are much more slim than with traditional spamming. However, this isn’t saying that these modified scrapers aren’t targeting similar keywords to your site, which they often intentionally leave intact when spinning a work, and might usurp the original work through a combination of scraping and spam linking.
So the technique outwits search engine algorithms, since the intent of the content remains even though the words are changed, and the words aren’t typical spam words, so the splogger thinks they can win in the battle for page ranking and content theft.
It’s the latter that really ticks me off.
Copyright law protects against unauthorized derivative works, unless the work is changed enough from the original that it no longer resembles it. I’m sure you can see the loopholes in that statement. What defines “changed enough”?
My question to Jonathan and Tony immediately after finding the trackback from the splogger was, “How much of my content has to be changed in order for this not to be a copyright violation?”
According to my research, and Jonathan’s expertise, this example is still a valid case of copyright infringement. Simply changing a few words leaves the intent, the message, and literally enough of my own words unchanged, so it’s still my content and, therefore, protected under my copyright policy.
Fortunately, the law is very clear on this subject. Copyright is not merely the right to copy one’s own work, but a set of rights that includes the right to create derivative works…This right to create derivative works covers the right to create translations and any other work based on copyrightable portions of the original. Spinning, since it starts with a copyright-protected work and creates a new work based upon it, violates that right.
Fair use arguments fall equally flat in the eyes of the law. Spinning is not transformative as it is designed to replace the original, it offers no commentary or criticism, it is for commercial use, it can greatly harm the market for the original work and usually is unattributed. There is almost no fair use argument left for the spammers who modify the posts they scrape, leaving the door wide open for rightsholders to take action.
Now, I can go after the copyright-violating splogger.
Help Warn All Bloggers
More importantly, we need to call attention to this issue to warn every blogger, and potential splogger, that we’re onto this system. We need to be more diligent when reviewing trackbacks to our blog. We need to bring it to the attention of search engines so they can also tackle this new monster on the web that is getting away with stealing our hard work and changing a few words and making it their own, covered with advertising.
Jonathan Bailey’s article offers some tips and techniques for handling and fighting against these spinning spammers, and my article, What Do You Do When Someone Steals Your Content, contains the steps to take to stop copyright violators.
Help us spread the word and put an end to any support, encouragement, and permission to use such underhanded techniques to abuse our content. It is not a compliment, nor a blessing, to have any splogger or copyright violator link to your blog. Google’s new PageRank now penalizes blogs for having such links, so don’t risk it. Report sploggers and help put them out of business.
- Splogging or Clogging: The Worst of the Worst of Blogging
- Splogs on the Rise on Blogspot
- Blogs That Look Like Blogs But Ain’t – Splogs
- Spam: Stupid Pointless Annoying Messages in Emails, Comments, and Everywhere
- Comment Spammers Now Using Hebrew to Fool You
- What Do You Do When Someone Steals Your Content
- Finding Stolen Content and Copyright Infringements
- The Growing Trends in Content Theft: Image Theft, Feed Scraping, and Website Hijacking
- Biggest Copyright Infringement in the World But Nobody Cares Enough
- Content Theft from Feeds – It’s Time To Take Action
- Abuse: Keyword Spamming versus Tag Spamming
- Reporting Spam Blogs – Splogs
- Calling All Stupid Comment Spammers
- Content Specific Comment Spam on the Loose
- Comment Spammers Resorting to Jokes
- Digital Fingerprints Help Track Blog Content Theft
- Copyrights and the Blogger: Protect What is Yours
- Applaud Those Who Warn You: Your Blog’s Content Is Being Stolen
- Stop Content Theft Buttons and Badges
- Modern Crusader: Plagiarism Today with Jonathan Bailey
- Battling Comment Spam: Human Versus Human
- Stupid Spammers: To Remove Your Site From Our Comment Spamming Database Instructions
- Brag On: Jonathan Bailey Now Offers Plagiarism Advice on the Blog Herald
- Copyright Law Tips from Daily Blog Tips
- Understanding GPL and Copyright in WordPress Community Podcast
- More Information and Resources on Copyright Than You Can Imagine
- Breaking the Brick Wall on Your Content Theft Search
Copyright Lorelle VanFossen, member of the 9Rules Network, and author of Blogging Tips, What Bloggers Won't Tell You About Blogging.