Stop Spam - Read Books - reCaptcha

Saturday, June 4, 2011


This is a familiar image to every web user. Most of the time, we are unable to make out why the CAPTCHA reloads in spite of a correct entry!

Have you ever wondered why you have to fill in two words to help a website protect itself from spam machines? Is single-word protection so weak? Have you ever noticed the difference in clarity between the two words, one being so obvious and the other very weird?
The answers to these questions are in the bottom-right corner of the CAPTCHA block: "Stop spam. Read books." Unfortunately, many of us are not familiar with the last two words.

Anyone with an engineering background who has put effort into preparing technical reports will appreciate this technology. It's OCR (optical character recognition), a technology that converts scanned images and PDF files into editable text, such as Word documents. ABBYY FineReader is a good piece of software of this kind.

reCAPTCHA, which began at Carnegie Mellon University and has been owned by Google since 2009, digitizes old scanned documents that OCR cannot read and that therefore need human intervention. It does this for the NY Times and Google Books. You will be startled to know that 200 million reCAPTCHAs are loaded daily. Even if only one of the two words serves this purpose, a rough estimate (assuming each book has 500 pages with 250 words a page) comes to 1,600 books per day.
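
To make the rough estimate easy to check, here is a minimal sketch of the arithmetic in Python. The constant and function names are my own, the 200 million figure is simply the one quoted above, and the sketch assumes every useful word counts exactly once, which is a simplification.

```python
# Back-of-the-envelope estimate using the figures quoted in this post.
RECAPTCHAS_PER_DAY = 200_000_000   # daily loads, as quoted above
USEFUL_WORDS_PER_RECAPTCHA = 1     # assumption: one of the two words helps digitization
PAGES_PER_BOOK = 500               # assumed book length
WORDS_PER_PAGE = 250               # assumed page density


def books_per_day(recaptchas=RECAPTCHAS_PER_DAY,
                  useful_words=USEFUL_WORDS_PER_RECAPTCHA,
                  pages=PAGES_PER_BOOK,
                  words_per_page=WORDS_PER_PAGE):
    """Return the number of whole books' worth of words typed per day."""
    words_per_book = pages * words_per_page   # 500 * 250 = 125,000
    words_per_day = recaptchas * useful_words
    return words_per_day // words_per_book


print(books_per_day())  # 1600
```

Changing any of the assumptions, for example a shorter book or fewer daily loads, scales the result proportionally.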

What irks me? It is marketed excessively as a spam detector, yet websites keep reloading it even though the words entered are correct. That may be due to poor coding (I don't think so!) or a deliberate act to increase the number of instances.

It's a Google product, hence it's always fishy!   


