Sunday, December 7, 2008

New Audio reCAPTCHA

One of our main goals when we launched reCAPTCHA was to provide a system that is accessible to visually impaired individuals (who surf the Web using screen-reading software). Most other CAPTCHAs do not provide an audio alternative, and therefore block blind people from freely navigating the Web. We're proud of the fact that reCAPTCHA has always had an audio alternative.

Today we are announcing a significantly improved audio CAPTCHA that is both easier for humans than our previous one and, most importantly, by far the most secure audio CAPTCHA we know of.

Like many of the other audio CAPTCHAs, our previous version consisted of distorted spoken digits. We collected thousands of voices saying the digits zero through nine, and formed audio CAPTCHAs by concatenating digits from different speakers and adding noise distortions in the background. To maintain the security of the audio CAPTCHA, our distortions were quite heavy. We now believe that even such heavy distortions are not enough when the audio CAPTCHAs are restricted to only spoken digits or letters.
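To make the old digit-based scheme concrete, here is a minimal sketch of the idea: pick digits at random, pick a random speaker for each, concatenate the recordings, and add background noise. This is an illustration only, not our production code. The function name `make_digit_captcha`, the `clips` layout, the uniform noise model, and the default parameters are all assumptions made for the example.

```python
import random


def make_digit_captcha(clips, length=6, noise_level=0.3, rng=None):
    """Build a toy digit CAPTCHA.

    clips: dict mapping a digit string ("0".."9") to a list of recordings,
           one per speaker, where each recording is a list of float samples
           in the range [-1.0, 1.0]. (Hypothetical layout.)
    Returns (answer, samples).
    """
    rng = rng or random.Random()
    answer = []
    samples = []
    for _ in range(length):
        digit = rng.choice(sorted(clips))       # which digit to say
        recording = rng.choice(clips[digit])    # which speaker says it
        answer.append(digit)
        samples.extend(recording)               # concatenate the clips

    # Additive background noise (an assumed, very simple distortion),
    # clipped so the samples stay in the valid range.
    noisy = [
        max(-1.0, min(1.0, s + rng.uniform(-noise_level, noise_level)))
        for s in samples
    ]
    return "".join(answer), noisy
```

Because the vocabulary is limited to ten words, an attacker only needs to tell ten sounds apart, however heavy the noise is. That is the weakness discussed next.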

This week, Jennifer Tam, a PhD student at Carnegie Mellon University who has been working with us, will present her results on the security of audio CAPTCHAs at the Annual Conference on Neural Information Processing Systems. In her paper, she shows that audio CAPTCHAs based solely on distorted digits (or even letters) can be broken using machine learning techniques. This includes all commonly used audio CAPTCHAs.

Although we have not seen anybody abuse our previous audio CAPTCHA in the wild, we have taken preventive measures against this potential attack. So today we announce the release of a new audio CAPTCHA that is significantly more secure and in particular not susceptible to Jenn's attack. In fact, breaking this new audio CAPTCHA would require major advancements in speech recognition technology.

Instead of using spoken digits or letters, our new audio CAPTCHA presents entire spoken sentences or phrases that the best speech recognition algorithms failed to recognize. In other words, this new audio CAPTCHA uses the same idea as the standard visual reCAPTCHA: we play audio from old-time radio shows that speech recognition software could not decipher correctly, and then use the results of humans solving these CAPTCHAs to transcribe those shows. Not only is this audio CAPTCHA more secure, but it also has a positive side effect. Much like the visual reCAPTCHA has helped to digitize billions of printed words so far, we expect that the audio version will help transcribe large amounts of historical audio content.
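As a rough illustration of how human answers could turn into a transcript, the sketch below collects what different people typed for the same clip and accepts a transcription only once enough of them agree. This post does not describe how answers are actually combined, so the voting rule, the function names `normalize` and `consensus_transcript`, and the thresholds are assumptions for the example.

```python
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not split votes."""
    return " ".join(text.lower().split())


def consensus_transcript(answers, min_votes=3, min_share=0.6):
    """Return the agreed transcript for one clip, or None if there is no consensus.

    min_votes and min_share are assumed thresholds, not real reCAPTCHA settings.
    """
    counts = Counter(normalize(a) for a in answers if a.strip())
    if not counts:
        return None
    best, votes = counts.most_common(1)[0]
    total = sum(counts.values())
    if votes >= min_votes and votes / total >= min_share:
        return best
    return None
```

For example, if three of four people type "good evening ladies and gentlemen" (ignoring case and spacing) and one types something else, this rule accepts the majority answer. With only two conflicting answers it returns `None` and waits for more.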

You can hear the new audio CAPTCHA by going here and clicking on the audio button. You'll hear a short clip with people speaking and will have to type what they are saying. To account for spelling mistakes and homophones, the verification algorithm uses a phoneme-based encoding and allows a small number of mistakes.
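The sketch below shows one way tolerant matching of this kind could work. It is not our actual verification algorithm: it stands in a simplified, letter-based Soundex-style key for a real phoneme encoding, compares the two answers word by word with an edit distance, and accepts when the number of differing words is within a small budget. The names `phonetic_key`, `edit_distance`, and `matches`, and the default `max_mistakes=1`, are assumptions for the example.

```python
import re

# Consonant classes for a simplified Soundex-style key (a stand-in for a
# real phoneme encoding). Vowels and h, w, y get no code.
_CODES = {}
for _letters, _code in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                        ("l", "4"), ("mn", "5"), ("r", "6")]:
    for _ch in _letters:
        _CODES[_ch] = _code


def phonetic_key(word):
    """First letter plus consonant-class digits, skipping adjacent repeats."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return ""
    key = word[0]
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code
    return key


def edit_distance(a, b):
    """Levenshtein distance between two sequences (here, lists of word keys)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # drop a word from a
                           cur[j - 1] + 1,            # insert a word from b
                           prev[j - 1] + (x != y)))   # substitute
        prev = cur
    return prev[-1]


def matches(expected, typed, max_mistakes=1):
    """Accept the typed answer if it is phonetically close to the expected one."""
    exp = [phonetic_key(w) for w in expected.split()]
    got = [phonetic_key(w) for w in typed.split()]
    return edit_distance(exp, got) <= max_mistakes
```

With this key, "their" and "there" encode identically, as do "to", "too", and "two", so homophones and many small misspellings do not count as mistakes. A missing or wrong word counts as one mistake.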

We'll be rolling this update out to all of our users over the next few weeks. For now, if you are using our custom theme option, we ask that you update the instructions for the audio CAPTCHA to say something along the lines of "type what you hear".