By Jonathan Bailey of Plagiarism Today
Users of the self-hosted version of WordPress have always had plenty of tools for protecting their sites and their feeds, including a variety of plugins for anti-scraping, content theft detection and much more.
Even the ability to edit themes and the core WordPress files has proved useful in some cases, allowing users to replicate the effects of some of the more popular plugins without having to install them.
Users of WordPress.com, however, are much more limited in what they can do. With no plugins and limited ability to edit the site’s appearance or core files, they can’t use many of the tools that self-hosted WordPress users take for granted.
However, this is not to say that WordPress.com users are helpless. With a little bit of creativity, they can mimic the effects of many of the most valuable content protection plugins.
Detecting Scraping
The most powerful tool currently available for detecting RSS scraping is a digital fingerprint. A fingerprint is a string of characters that you add to your RSS feed to make it easy to search for scraped content later. Plugins such as the Digital Fingerprint Plugin and Copyfeed are normally used to add the fingerprint to the feed and search for matches.
However, a digital fingerprint, in truth, is little more than a signature added to each post, similar to the one Lorelle adds to all of her entries. Adding it doesn’t require a plugin or an edit of the core files; you merely place the information into each post.
To aid with that, you can use a variety of tools, such as Shortkeys, to automate and standardize the signature insertion. Also, like Lorelle, you can add images or other components as you see fit to ensure that, even if your feed is copied, it carries with it information that clearly identifies the source.
Even if you forget to insert the signature into a few posts, merely putting it in the majority will ensure that scrapers will pick up the signature the same as they would had you used a plugin. Best of all, using Google Alerts, you can fully automate the detection of the fingerprint and be routinely emailed about any new matches that appear.
All in all, though it adds an extra step before posting entries to your blog, it is fairly trivial for WordPress.com users to mirror the effect of the various digital fingerprint plugins and gain all of the benefits they provide.
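As a rough illustration of the idea, here is a minimal Python sketch that builds a fingerprint string and appends a signature to a post before you paste it into the editor. Everything in it is an assumption made for the example: the function names, the signature wording and the choice of a truncated SHA-256 hash are hypothetical, and this is not how the Digital Fingerprint Plugin or Copyfeed work internally. Any string unusual enough that it never appears elsewhere on the Web would do the same job.

```python
import hashlib


def make_fingerprint(blog_url: str, secret: str, length: int = 16) -> str:
    """Derive a stable, hard-to-guess string from the blog URL and a private phrase."""
    digest = hashlib.sha256(f"{blog_url}|{secret}".encode("utf-8")).hexdigest()
    return digest[:length]


def sign_post(body: str, author: str, blog_url: str, fingerprint: str) -> str:
    """Append a plain-text signature naming the source and carrying the fingerprint."""
    if fingerprint in body:
        # Already signed; leave the post untouched.
        return body
    signature = f"Originally published by {author} at {blog_url} [{fingerprint}]"
    return f"{body.rstrip()}\n\n{signature}"


def alert_query(fingerprint: str) -> str:
    """Wrap the fingerprint in quotes so a search alert matches it exactly."""
    return f'"{fingerprint}"'


if __name__ == "__main__":
    # Hypothetical blog address and private phrase, for illustration only.
    fp = make_fingerprint("https://example.wordpress.com", "my private phrase")
    print(sign_post("Today's post text.", "Jane Blogger", "https://example.wordpress.com", fp))
    print(alert_query(fp))
```

The quoted string returned by alert_query is what you would paste into Google Alerts as the search term, so that any page carrying the fingerprint triggers an email.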
Blocking Scrapers
With plugins such as Antileech and Wp-Ban, many WordPress users actively ban and prohibit scrapers from accessing their feed. However, without direct access to the server, WordPress.com users are more limited in what they can do along those lines.
Still, for extreme cases, you may be able to contact WordPress Support and see if they can assist in blocking the scraper. They have done so in the past, such as with Bitacle, when multiple WordPress blogs were involved.
However, such action should be reserved for extreme cases and only after other steps, such as contacting the host, have failed. It simply is not practical for Automattic to ban every scraper, and there will be many scrapers that cannot be banned no matter what.
Other Tricks
With all of that in mind, here are a few other tricks that WordPress.com users can take advantage of to help them protect their writing on the Web.
- Google Alerts: In addition to checking for your Digital Fingerprint, you can also insert unique phrases from your static content, such as your pages, to receive alerts when they are copied.
- Anti-Plagiarism Tools: Although they are somewhat limited in their detection ability, sites such as Copyscape and Bitscan can be great for quick plagiarism checks and can alert you to content reuse that you might not have otherwise known about.
- Non-Repudiation: Even though most will never require it, non-repudiation services can provide valuable support in the event of a dispute over ownership of a work. MyFreeCopyright provides a free non-repudiation service that does not require a WordPress plugin to function. Instead, it accesses the feed every day automatically.
- Truncated Feeds: WordPress.com users do have the ability to provide a partial feed, rather than their full content, by editing the settings in their dashboard under Options/Reading. However, I generally do not recommend this course of action as it upsets regular readers as well as thwarting scrapers.
- Checking Stats: The statistics provided by default with your WordPress account are very good for detecting potential infringements. Since many scrapers and even human plagiarists leave links intact, you can check for suspicious referrers and inbound links so long as you routinely link to your own articles.
- Akismet: Many scrapers, when reposting your content, will send trackbacks or pingbacks to your site. Akismet does a good job filtering those out, but it is important to look through the spam and moderation queues in order to locate any spam that might indicate scraping.
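The stats check from the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes you have copied the referrer URLs from your stats panel into a list by hand, and the function name and the list of known hosts are hypothetical. It flags one referrer per unfamiliar site so you can visit each and see whether your content is there.

```python
from urllib.parse import urlparse


def suspicious_referrers(referrers, known_hosts):
    """Return the first referrer URL seen from each host that is not in known_hosts."""
    seen = set()
    flagged = []
    for url in referrers:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.com and example.com as the same site
        if not host or host in known_hosts or host in seen:
            continue
        seen.add(host)
        flagged.append(url)
    return flagged


if __name__ == "__main__":
    # Hypothetical referrers copied by hand from the stats panel.
    referrers = [
        "https://www.google.com/search?q=content+theft",
        "http://splog.example/copied-post",
        "http://splog.example/another-copied-post",
        "https://example.wordpress.com/about",
    ]
    known = {"google.com", "example.wordpress.com"}
    for url in suspicious_referrers(referrers, known):
        print(url)
```

A flagged referrer is not proof of scraping; it is simply a site you do not recognize that links to you, which is worth a quick look.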
Though many of these tips and tricks are available to bloggers no matter where they post their content, the fact that they are so widely available is what makes them so useful. Anyone, regardless of where they are hosted, can take advantage of them.
However, WordPress does provide a good set of default tools for its free users, including Akismet and its stats panel. This gives WordPress.com users an extra advantage when dealing with people misusing their content, an advantage they should certainly use.
Conclusions
While there is little doubt that WordPress.com users have a more limited tool set than those who host their own blogs, that is not to say that they are helpless or otherwise unable to protect their work.
The tools other bloggers use to track and stop copyright infringement are still there. It just may take a little more effort and creativity to get them to work.
Fortunately, users of WordPress.com still have the ability to fight back, and WordPress itself provides many of the tools that you need by default. Much of the process is just a matter of taking the information you are given and applying it in a whole new way.

4 Comments
Another plugin that scans for content theft: ©Feed
After reading this, I added Angsuman’s Feed Copyrighter, Bad Behavior, and Digital Fingerprint to my blogs. I’ve seen my content appear on a few spam blogs (splogs?) in the past, but didn’t really know what to do about it. Hopefully this will help ^_^ Thanks!
Frank: Copyfeed is a good plugin but I’ve been getting reports of incompatibility with WP 2.5.1 so I’m dicey on recommending it since I think it is at least as important to remain current on your WP install.
SpiritGod: If you have any issues with content theft, send me a letter and let me know, I may be able to help get the content removed.
Let me know what I can do to assist!
@Jonathan Bailey: ©Feed works correctly with WP 2.5.1 and on my bleeding-edge 2.6 platform. If you have a problem or error, please send me the error.
With best regards
Frank
3 Trackbacks/Pingbacks
[...] I just finished posting a guest blog entry on Lorelle on WordPress entitled “Protecting Your Content on WordPress.com“. [...]
[...] Protecting Your Content on WordPress.com – I get so frustrated with finding content I wrote used (usually without credit) on other websites. It’s not that what I write would interest a pimply faced Internet scavenger, but there are plenty of lazy, unscrupulous site owners who simply “scrape” blog content and use it as their own. Join me in fighting this battle against lazy S.O.B.s. Read Jonathan Bailey’s post at Lorelle on WordPress about how to find and battle Internet plagiarism. [...]
[...] Protecting Your Content On WordPress.com from Lorelle On WordPress [...]