A CAPTCHA, as defined by The CAPTCHA Project (http://www.captcha.net/), "is a program that can generate and grade tests that most humans can pass, [and] current computer programs can't pass." It stands for "Completely Automated Public Turing test to tell Computers and Humans Apart" and is pronounced like "capture" without the final /r/ sound. Because computers have several times more patience than a human* when performing automated tasks, web services often use human authentication tests such as those of The CAPTCHA Project to prevent automated processes from creating an account, submitting information such as URLs to a database, sending messages to users, or performing any other action that uses publicly available scarce resources.
* In this article, "human" refers to any entity with the intelligence and patience of a typical member of Homo sapiens, and "computer" refers to any entity whose reaction to the environment is governed primarily by a well-understood mechanical process. The author intends no prejudice against members of non-human biological races from fantasy and science fiction.
There are at least three types of information that humans are known to perceive better than computers: speech, pictures, and semantics of text. These are the differences that CAPTCHAs attempt to exploit. However, the first two raise accessibility issues for people with disabilities, and all three may raise legal problems. For instance, Hewlett-Packard Company (NYSE:HPQ) holds United States Patent 6,195,698 on several forms of CAPTCHAs. Thus, I don't see CAPTCHA systems coming into much wider use within the next twenty years.
Speech
It's possible to synthesize spoken text with various distortions such as wow and flutter, wave shaping (e.g. guitar distortion), echo, reverb, and superposition (two samples overlapping), so that humans have a much easier time separating the sources and correcting for the distortions than a typical computer does. However, hearing-impaired users can't pass such a test, and neither can users of machines without audio output capability.
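A few of the distortions named above can be sketched in a handful of lines. This is only an illustration, not how any deployed audio test works: the function names are made up for this example, signals are plain Python lists of float samples, and a real test would apply such steps to synthesized speech rather than to the short test signals used here.

```python
import math

def add_echo(samples, delay, decay=0.5):
    """Mix a delayed, attenuated copy of the signal into itself."""
    out = list(samples) + [0.0] * delay
    for i, s in enumerate(samples):
        out[i + delay] += decay * s
    return out

def wave_shape(samples, drive=4.0):
    """Soft-clip every sample with tanh, in the spirit of guitar distortion."""
    return [math.tanh(drive * s) for s in samples]

def superpose(a, b):
    """Overlap two signals, padding the shorter one with silence."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    # Average the two sources so the mix stays within [-1, 1].
    return [0.5 * (x + y) for x, y in zip(a, b)]

# Hypothetical usage: two short "voices" mixed, echoed, then clipped.
voice_a = [1.0, 0.0, -1.0, 0.0]
voice_b = [0.5, 0.5]
distorted = wave_shape(add_echo(superpose(voice_a, voice_b), delay=2))
```

Undoing these steps means separating the overlapped sources and inverting the clipping, which is the part the test relies on a human ear doing more easily than a program.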
Pictures
The "Gimpy", "Bongo", "Pix", and "HumanAut" tests are based on recognition of natural images or images of distorted text. Yahoo!, AltaVista, Freeservers, Tripod, and several other web sites use picture-based human authentication.
There are two problems with this technique. First, most of these web sites serve the images as GIF, a deprecated image format that uses LZW compression technology patented by Unisys Corporation, but it's easy to work around this (see PNG). The other, perhaps more obvious, limitation is that blind users and other humans behind non-visual user agents cannot see images. Thus, you lose accessibility and make Bobby cry.
Speech and pictures
The Americans with Disabilities Act requires the United States government, those who do business with the United States government, and those who engage in interstate commerce to make appropriate efforts to satisfy the reasonable special needs of disabled people. A law commonly called Section 508 (part of the Rehabilitation Act) requires federal government web sites and some commercial web sites to make all information accessible to the disabled. To comply with the ADA, PayPal uses a test that presents the same information as a picture and as a sound; responding to either one will allow a user to sign up for an account. But a user on a braille display without a sound card still cannot perceive the test material.
Semantics
Finally, there exist tests based on a human's ability to interpret the meaning of written natural language. I know of no other test that a braille terminal can reliably present. Examples:
- Type the ninth letter of Blockstackers, followed by an exclamation point.
  Answer: C!
- In the sentence 'I regret that I have but food life to lose for my country', which word does not make sense?
  Answer: food
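The two example questions above can be generated and graded mechanically. The sketch below is a minimal illustration under my own assumptions: the function names, the question templates, and the case-insensitive grading rule are all invented for this example and do not come from The CAPTCHA Project or any deployed system.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 9 -> '9th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def nth_letter_challenge(word, n, punct="!", punct_name="an exclamation point"):
    """Ask for the n-th letter of a word (1-based) followed by a punctuation mark."""
    question = f"Type the {ordinal(n)} letter of {word}, followed by {punct_name}."
    answer = word[n - 1] + punct
    return question, answer

def odd_word_challenge(sentence, index, intruder):
    """Swap the word at position index (0-based) for an intruder that makes no sense."""
    words = sentence.split()
    words[index] = intruder
    question = (f"In the sentence '{' '.join(words)}', "
                "which word does not make sense?")
    return question, intruder

def grade(response, answer):
    """Accept the response if it matches, ignoring case and surrounding spaces."""
    return response.strip().lower() == answer.lower()

# Hypothetical usage, reproducing the two examples from the text.
q1, a1 = nth_letter_challenge("Blockstackers", 9)
q2, a2 = odd_word_challenge(
    "I regret that I have but one life to lose for my country", 6, "food")
```

A fixed template like this is itself easy for a program to parse, so a real test would need much more varied phrasing; the sketch shows only the generate-and-grade loop.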
Patents
However, Hewlett-Packard's patent (http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1=6,195,698.WKU.&OS=PN/6,195,698&RS=PN/6,195,698) seems to cover all methods based on the meaning of text, as well as methods involving the visual or acoustic distortion of text. HP has every right to prevent third parties from making the patented invention during the patent's term. There exist several legitimate ways to make commercial use of an invention covered by a broad patent, none of which a company would find acceptable: license the patent (which HP has every right to deny), leave the country (which may cease to apply as countries sign treaties; you'd also lose all your customers in the USA), buy the company (prohibitive, as HP's market capitalization as of December 2002 was close to $45 billion), or wait twenty years for the patent to expire (provided that Cher does not follow in the late Sonny Bono's footsteps as a spokeswoman for the drug industry and seek a Cher Patent Term Harmonization Act). HP may have a government-granted monopoly on reliably distinguishing humans from bots for at least the next decade.
If Section 508 really does prohibit use of speech- or picture-based CAPTCHAs, web sites will have to allow for the only 100 percent reliable method: talking with another human. In fact, Yahoo! already does this (http://add.yahoo.com/fast/help/us/edit/cgi_access).