
Multivalue CAPTCHA or How to Separate Apples from Oranges

Regular CAPTCHAs mostly work with a single value (a single word). Recovering a single value by computer is often not a big problem, so many CAPTCHAs have been cracked and no longer serve their purpose well. But what if one were to raise the level of complexity by introducing multiple, partially overlapping values?

Consider the following CAPTCHA image:

[Image: multivalued CAPTCHA example]

A human can easily see what the words are (in the example, “Orange” and “Apple”). But for a computer doing OCR, separating apples from oranges is not so easy, especially if the words are the same color and overlap into odd, non-character shapes.

In principle, the multivalued CAPTCHA picture should look to the OCR program as if it were written in an unknown script. Think of an English OCR program attempting to recognize Chinese, a Russian OCR program attempting to recognize Arabic, and so on.

Note that if the words were blended together with partial transparency, e.g. 50% grey over 50% grey, the places where the words intersect would be darker. I imagine that would make the words easier to separate: the upper part of an intersection belongs to the word placed lower in the picture, and the lower part of the intersection belongs to the word placed higher.
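The darker-intersection effect is easy to see on a toy example. The sketch below is only an illustration: the two tiny glyph masks, the grey value 128 and the multiply-style blend are all assumptions, not part of any real CAPTCHA generator. It compares drawing both words in one solid color against blending two semi-transparent layers.

```python
# Toy glyph masks: '#' marks ink, '.' marks background.
# The masks and the grey level are illustrative assumptions.
WORD_A = [
    "####....",
    "####....",
    "####....",
    "........",
]
WORD_B = [
    "........",
    "..####..",
    "..####..",
    "..####..",
]

WHITE = 255
GREY = 128  # stands in for "50% grey" ink


def opaque_composite(a, b):
    """Draw both words in the same solid color; overlaps look like any other ink."""
    return [
        [GREY if (ca == "#" or cb == "#") else WHITE for ca, cb in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def blended_composite(a, b):
    """Multiply-blend two grey layers; pixels covered by both words get darker."""
    image = []
    for ra, rb in zip(a, b):
        row = []
        for ca, cb in zip(ra, rb):
            value = WHITE
            if ca == "#":
                value = value * GREY // WHITE
            if cb == "#":
                value = value * GREY // WHITE
            row.append(value)
        image.append(row)
    return image


def distinct_levels(image):
    """Return the sorted set of grey levels present in the image."""
    return sorted({value for row in image for value in row})


print(distinct_levels(opaque_composite(WORD_A, WORD_B)))
print(distinct_levels(blended_composite(WORD_A, WORD_B)))
```

With solid ink the image has only two levels (ink and background), so the overlap carries no extra information. With blending a third, darker level appears exactly where the words intersect, which is the hint an attacker could exploit.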

So, given a CAPTCHA image, there would be up to four input boxes in which to type all the words that appear in the picture. The picture itself could contain anywhere from two to four overlapping words. This would add a further level of difficulty for the OCR algorithm: it would not even know how many words there are in the picture.
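The server side of such a form could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the function names `pick_words` and `check_answer` are hypothetical, and matching the answers case-insensitively and in any order is my assumption, since the idea above does not say how the boxes would be compared.

```python
import random
from collections import Counter

MIN_WORDS = 2  # fewest overlapping words in a picture
MAX_WORDS = 4  # most overlapping words, and the number of input boxes


def pick_words(vocabulary, rng=random):
    """Choose between two and four distinct words to draw into the picture."""
    count = rng.randint(MIN_WORDS, MAX_WORDS)
    return rng.sample(vocabulary, count)


def normalize(word):
    """Assumption: answers are compared ignoring case and surrounding spaces."""
    return word.strip().lower()


def check_answer(expected_words, submitted_boxes):
    """Accept if the non-empty boxes hold exactly the expected words, in any order."""
    if len(submitted_boxes) > MAX_WORDS:
        return False
    answers = [normalize(box) for box in submitted_boxes if box.strip()]
    return Counter(answers) == Counter(normalize(w) for w in expected_words)


# Example: the picture shows two words, so two of the four boxes stay empty.
print(check_answer(["Orange", "Apple"], ["apple", "orange", "", ""]))
```

Because unused boxes are simply left empty, the form itself does not reveal how many words the picture contains.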

There is a known CAPTCHA improvement that groups the symbols (the characters of a word) more closely together, but as far as I know there is nothing like the multivalued CAPTCHA presented here. If anyone knows of or has seen such a thing, please post a comment.
