Friday, 18 September 2009

The Machine is using us with reCAPTCHA

Like a lot of people, I am a big fan of Mike Wesch and his videos on the impact the Internet has had on society and learning, and I particularly enjoyed his take in The Machine is Us/ing Us.

A really good example of how we have become personally enmeshed in the workings of the Internet (and as a result are inadvertently building content or teaching computers to think) struck home when I saw that Google had announced it had purchased reCAPTCHA. For details see "Teaching computers to read: Google acquires reCAPTCHA".

Over the last 12 months I have set up a digitisation project at work. As with a lot of other digitisation projects, we faced hurdles when we had to digitise poor-quality and/or old documents. As humans we could read the old text, but the scanners could not make sense of faint, fuzzy, and/or distorted text. What makes reCAPTCHA so interesting is that you can pass on to reCAPTCHA the images of the words your optical character recognition (OCR) software has trouble reading. You then draw on the idea of crowds teaching computers: people solving the CAPTCHAs type what they see, so that the "Internet" learns to interpret, and therefore process, text that has otherwise stumped your OCR software.
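To make the idea a bit more concrete, here is a rough sketch in Python of how crowd answers might be turned into a reading for a word the OCR software could not handle. It is only an illustration of the general approach and not reCAPTCHA's actual code: the function names, the vote threshold and the agreement ratio are all assumptions made up for the example. Each challenge pairs a control word whose answer is already known with an unknown word; a guess at the unknown word is only recorded when the control word is typed correctly, and the unknown word is accepted once enough people agree.

```python
from collections import Counter


def normalise(answer: str) -> str:
    """Ignore case and surrounding whitespace when comparing answers."""
    return answer.strip().lower()


def check_challenge(control_answer, control_truth, unknown_answer, unknown_answers):
    """Hypothetical challenge check.

    If the control word was typed correctly, treat the user as human and
    record their guess for the unknown word. Returns True if they passed.
    """
    passed = normalise(control_answer) == normalise(control_truth)
    if passed:
        unknown_answers.append(unknown_answer)
    return passed


def resolve_word(answers, min_votes=3, min_agreement=0.75):
    """Hypothetical majority vote over the recorded guesses.

    Returns the agreed reading, or None if there are too few answers or
    too little agreement. Both thresholds are illustrative assumptions.
    """
    votes = Counter(normalise(a) for a in answers if a.strip())
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    if count / total >= min_agreement:
        return word
    return None
```

In use, you would keep a list of guesses for each unreadable word, call check_challenge every time someone solves a CAPTCHA, and call resolve_word on the list until it stops returning None.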

The win-win is that you are supplying words for "Completely Automated Public Turing test to tell Computers and Humans Apart" (CAPTCHA) security checks at the same time as "teaching" your OCR software to overcome the difficulties of processing poor-quality and distorted text. So it looks like the Machine really is Us/ing Us after all.
