A commenter on my recent post about Bitacle’s insults wrote in to defend Bitacle, saying, in part:
“Google, yahoo, and technorati scrape other people’s content every day (which I believe, there was a US court case that google won). They also display advertisements. Just like bitacle….”
“If you want your content off of bitacle, it also needs to be taken off of every other search engine that is caching it.”
While the commenter, who used the name Ricardo Sanborn, is correct that Google and other search engines do many of the same things as Bitacle, he, and others like him, are wrong to equate the two.
There’s a world of difference both legally and ethically between Bitacle (as well as other scraper sites) and the legitimate search engines. All one has to do is stop making excuses and start looking to see the difference.
Six Points of Distinction
As I said in my reply, there are at least six points of distinction between Bitacle and the legitimate search engines:
- Lack of Opt Out – Bitacle completely ignores robots.txt files, disregards meta tags and offers no means to opt out of the site. Though Bitacle claims there isn’t any “norm that forces (them) to obey the robots.txt”, the presence of an opt out mechanism was critical to Google, and other search engines, in having their cache judged to be fair use (PDF, see pages 20 and 21).*
- Displays Full Content – Though major search engines cache Web pages, they do not display the full content of the sites they index on their own result pages. They display, at the most, small snippets of content. Also, search engine caches display the content in the original context, capturing all images, licenses and attribution; Bitacle merely scrapes the content and formats it for its own site.
- Destination, Not Direction – Major search engines exist to direct users to the sites they want to see, not to be end destinations. Bitacle’s “aggregates” feature not only displays the full content of every post, but also offers a Digg feature and a comment form. Users have almost no motivation to leave Bitacle’s version and visit the original site. These are clear signs that Bitacle’s goal is not to direct users to the sites it scrapes, but to keep the traffic (and money) for itself. This means that Bitacle’s use is not transformative (where the use of the copy is different from the original intent) and thus almost certainly not fair use (see above PDF, pages 14-16).
- All About the Benjamins – Until Bitacle’s Adsense account was forcibly shut down, Bitacle was displaying ads next to the scraped content and, at last check, was still attempting to do so (leaving the Adsense block intact). Commercial use is heavily frowned upon in fair use arguments, and profiting directly from someone else’s material without permission is generally not considered fair use or fair dealing (see above PDF, pages 16 and 17).
- Bitacle’s Past – When Bitacle first started gaining attention, it provided no attribution to the original author of the posts and relicensed every post it scraped under a new Creative Commons license, regardless of how it was licensed on the original site. Though that behavior has stopped, it shows the lack of consideration that Bitacle has for bloggers at large.
- The Spam Factor – Finally, where most search engines do not allow other sites to index their cached copies, Bitacle encourages others to do so by automatically adding search-engine friendly metadata to every post they scrape. They have hundreds of thousands of pages indexed in Google, most of them from the aggregates section. Bitacle may not be the largest search engine spam operation, but it is definitely one of the most dangerous to copyright holders.
All in all, Bitacle is light years apart from Google and the other search engines in terms of both law and ethics. Any comparison between the two is flawed.
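The opt-out check from the first point can be sketched in a few lines. This is only an illustration of how a well-behaved crawler might consult robots.txt before fetching a page, using Python’s standard `urllib.robotparser` module. The function name `may_fetch`, the crawler name `ExampleScraper` and the `blog.example` address are all hypothetical; nothing here reflects how Bitacle or any search engine is actually implemented.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt a blogger might publish: one named crawler is
# shut out entirely, everyone else is only kept out of /private/.
EXAMPLE_ROBOTS = """\
User-agent: ExampleScraper
Disallow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for agent in ("ExampleScraper", "OtherBot"):
        allowed = may_fetch(EXAMPLE_ROBOTS, agent, "http://blog.example/post")
        print(agent, "may fetch the post:", allowed)
```

A crawler that honors the file would skip any URL for which `may_fetch` returns `False`. Ignoring robots.txt, as described above, amounts to never making this check.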
A Change of Venue
Astute readers will quickly point out that all of the laws I have cited are American and that Bitacle is located in Spain. However, it is unlikely that Bitacle would find a much friendlier audience in its home country.
Spain is part of the European Union (E.U.) and the E.U. is where Google News was successfully sued for copyright infringement by Belgian newspapers. Though the case leaves many unanswered questions, it is clear that European courts are no friends to search engines, even ones that do actually offer an opt out.
While it remains to be seen how a Spanish court would react to Bitacle, E.U. copyright law is notoriously strict and would be unlikely to favor Bitacle.
In short, if Bitacle cannot meet the standards of U.S. law, it will almost certainly fail the standards of E.U. law.
The law has made it clear that caching, for certain purposes, is very much legal and acceptable. However, Bitacle caches both the wrong way and for the wrong reasons. It is a violation of copyright law and ethically divorced from the search services it tries to emulate.
Though Bitacle apologists may try to make excuses and attempt to label those who are upset with Bitacle as hypocrites, it is Bitacle itself that is in the moral dilemma.
There is little question as to Bitacle’s legal and ethical standing, even if some people don’t want that to be the case.
I intended posting on my blog about this issue yesterday anyway, but something rather serendipitous occurred to make my post even sweeter. I love a touch of irony in the mornings…
On my previous blog post I received a little comment from a lovely visitor called Anonymous – in Spanish:
Pero que puta eres, y tu hijo es un hijo de puta bastardo de mala madre.
The lovely Lioness happened to be online when I received this in my inbox this morning and reluctantly translated it for me:
“What a whore you are and your son is a son of a whore, son of a bad mother.”
It took about 10 seconds for the penny to drop. Silly me for sticking my head up at Bitacle the previous day.
Really guys, if you’re going to centre your entire website around stealing other people’s content, allow people to leave comments and then spam me with hateful abuse the day after I dare to point out ON MY STOLEN BLOG that I don’t really appreciate the fact that YOU HAVE STOLEN MY ENTIRE BLOG, then have the sense to cover your tracks when you visit my actual blog. If you have the technology to block my IP address surely you must realise I have the technology to trace who was online at the time the comment was left. Bitacle, you scum-sucking bastards, you may have blocked my IP address so I can no longer access your website from home, but you can’t block every IP in the world.
It seems they may have also disabled the comments feature now so people can’t leave comments on their website thinking it is the actual blog. Since I can’t access the website I had someone else do it for me. No comments feature anymore…gee wonder why???
The most disturbing thing about this whole Bitacle debacle is not the mere displaying of copyrighted material. It is that we cannot now protect our privacy. Feedburner, Technorati, etc, they can’t provide you with a page that no longer exists. Bitacle keeps your information. Keeps it. If you delete your blog and your photos having decided that your privacy has been compromised…too bad. Your blog may no longer exist but every word you ever wrote and every image you ever uploaded is right there on their server for anyone to search.
Next time they want to abuse me for asserting my legal rights I hope they get their facts right: I never charge for sex.
Last week I wrote an email to Google, complaining about bitacle.org’s abusive behaviour and I asked them to reject bitacle.org as their customer. Today I received an answer in response to that by email. Read below what they wrote to me. The only thing I changed in the text was my own name to ‘X’.
Thank you for your note. It is our policy to respond to notices of alleged infringement that comply with the Digital Millennium Copyright Act (the text of which can be found at the U.S. Copyright Office website: http://www.copyright.gov/) and other applicable intellectual property laws. In this case, this means that if we receive proper notice of infringement, we will forward that notice to the responsible web site publisher.
To file a notice of infringement with us, you must provide a written communication (by fax or regular mail, not by email) that sets forth the items specified below. Please note that pursuant to that Act, you may be liable to the alleged infringer for damages (including costs and attorneys’ fees) if you materially misrepresent that you own an item when you in fact do not. Accordingly, if you are not sure whether you have the right to request removal from our service, we suggest that you first contact an attorney.
To expedite our ability to process your request, please use the following format (including section numbers):
1. Identify in sufficient detail the copyrighted work that you believe has been infringed upon. For example, “The copyrighted work at issue is the text that appears on http://www.legal.com/legal_page.html.”
2. Identify the material that you claim is infringing upon the copyrighted work listed in item #1 above. You must identify each page that allegedly contains infringing material by providing its URL.
3. Provide information reasonably sufficient to permit Google to contact you (email address is preferred).
4. Include the following statement: “I have a good faith belief that use of the copyrighted materials described above on the allegedly infringing web pages is not authorized by the copyright owner, its agent, or the law.”
5. Include the following statement: “I swear, under penalty of perjury, that the information in the notification is accurate and that I am the copyright owner or am authorized to act on behalf of the owner of an exclusive right that is allegedly infringed.”
6. Sign the paper.
7. Send the written communication to the following address:
Attn: AdSense Support, DMCA complaints
1600 Amphitheatre Parkway
Mountain View CA 94043
OR Fax to:
(650) 618-8507, Attn: AdSense Support, DMCA complaints
The Google AdSense Team
In case the content of your weblog appears on bitacle.org’s website, you might consider placing a post on your weblog containing the following text.
Don’t forget to change the web address, or URL ‘https://stopbitacleorg.wordpress.com‘ into your own!
Please look carefully at the web address in the URL field of your browser. It should read ‘https://stopbitacleorg.wordpress.com‘. In case you see a web address containing the word ‘bitacle’ or ‘bitacle.org’, you’re not looking at the original page on which this text was posted. If this is the case, the text you are reading right now might be incorrect or out of date. After I place a post on my weblog, I always try to keep published information up to date, or incorporate additional information, which I receive from readers. You will never find this information on bitacle.org.
P.S. In this particular case, the word ‘bitacle’ is part of the web address ‘stopbitacleorg’ and this might be the only exception to the rule! 😉
In the post Are Bitacle blog thieves too? on lutrov.com, David Martin, claiming to be seo of bitacle, writes that he will answer your question. I don’t believe this, because I’m still waiting for an answer from David Martin, in reply to the message I sent to him.
Report apparent abuse of Google Ads to Google, because bitacle.org uses the content on your weblog to earn money with Google Ads. As many people as possible whose weblog content was stolen should inform Google. More information at How do I report a policy violation?
We are not the only ones with this problem.
- Spoken For: Those Bastards!! (bitacle.org)
- Bitacle: thieves now open for business in the 8th circle of hell
Banning their IP is a bit of a problem because they are changing it. At the moment it’s 22.214.171.124.
The people behind bitacle.org steal content, such as text and images, from others’ weblogs and place it on their own website. Their practices are criminal and/or abusive, because they violate the copyrights that the original authors hold on that content. Not only are copyrights violated; licenses such as those of the Creative Commons are not respected either.
Stolen content from weblogs is placed on bitacle.org’s website, between commercial messages for which the people behind bitacle.org are paid by advertisers. At this moment Google places commercial messages on bitacle.org, but Google is being asked to reject bitacle.org as its client, because of bitacle.org’s criminal/abusive behaviour.
Stop bitacle.org from stealing the content of weblogs. Take action!