May 25, 2007

reCAPTCHA


Recently, Slashdot featured an article about a new version of CAPTCHA called reCAPTCHA.  A CAPTCHA is the string of letters and numbers, usually crossed out or distorted in some fashion, that appears on many web pages.  The user is asked to retype these characters to prove that he or she is a person.  CAPTCHA keeps bots and form-completing software from accessing the site.  Since the letters and numbers are distorted images, as opposed to actual characters, bots have a hard time recognizing them.  Sites using CAPTCHA can now give back to the community by using reCAPTCHA.  Instead of using random numbers and letters purely for validation, reCAPTCHA uses text from scanned books that optical character recognition (OCR) software could not read.  This means that when a user completes the check, he or she is actually helping to publish one of these scanned texts to the web.  This is a fantastic idea!  Web sites continue to deny bots access while at the same time helping release new text to the community.  Adding reCAPTCHA to a site is not a very complicated task.  Users are required to enter a little more text than with CAPTCHA, but in my opinion this is a trivial downside considering the benefits.  It would be great to see webmasters, developers, and admins contribute the small amount of time it would take to convert their sites to reCAPTCHA.
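To make the idea a bit more concrete, here is a rough sketch in Python of how a site could check the user and collect a reading of a scanned word in the same step.  It assumes each challenge pairs a control word whose text is already known with a word the OCR software could not read, which is my understanding of why reCAPTCHA asks for a little more text.  The names (check_challenge, accepted_reading, votes) and the three-vote threshold are made up for illustration and are not the actual reCAPTCHA API.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory store: unknown word id -> tally of what users typed.
votes = defaultdict(Counter)


def check_challenge(control_answer, typed_control, unknown_id, typed_unknown):
    """Return True if the user passes; record a vote for the unknown word."""
    # Only the control word, whose text is already known, decides access.
    passed = typed_control.strip().lower() == control_answer.strip().lower()
    if passed:
        # A user who read the control word correctly is probably human,
        # so their reading of the unknown word counts as one vote.
        votes[unknown_id][typed_unknown.strip().lower()] += 1
    return passed


def accepted_reading(unknown_id, min_votes=3):
    """Return the most common reading once enough users agree, else None."""
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    # min_votes is an arbitrary threshold chosen for this sketch.
    return word if count >= min_votes else None
```

In this sketch the unknown word never affects whether the user gets in.  It only picks up votes, and its text is treated as settled once enough users who passed the control word agree on the same reading.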

1 comment:

Unknown said...

That is an awesome idea; however, I have a nagging thought... If the OCR software couldn't validate the text, then the string value of the image would be at worst unknown, at best estimated. How are you supposed to:

1) validate the user access by comparing what is typed in with the image string value

2) validate the image string value with what the user typed in


at the same time?!?