Video CAPTCHAs: Usability vs. Security
Venue
Proceedings of the IEEE Western New York Image Processing Workshop (WNYIP '08), IEEE Press (2008)
Publication Year
2008
Authors
Kurt Alfred Kluever, Richard Zanibbi
Abstract
A CAPTCHA is a variation of the Turing test, in which a challenge is used to
distinguish humans from computers ("bots") on the internet. They are commonly used
to prevent the abuse of online services. CAPTCHAs discriminate using hard
artificial intelligence problems: the most common type requires a user to
transcribe distorted characters displayed within a noisy image. Unfortunately, many
users find them frustrating and break rates as high as 60% have been reported (for
Microsoft’s Hotmail). We present a new CAPTCHA in which users provide three words
("tags") that describe a video. A challenge is passed if a user’s tag belongs to a
set of automatically generated ground-truth tags. In an experiment, we were able to
increase human pass rates for our video CAPTCHAs from 69.7% to 90.2% (184
participants over 20 videos). Under the same conditions, the pass rate for an
attack submitting the three most frequent tags (estimated over 86,368 videos)
remained nearly constant (5% over the 20 videos, roughly 12.9% over a separate
sample of 5,146 videos). Challenge videos were taken from YouTube.com. For each
video, 90 tags were added from related videos to the ground-truth set; security was
maintained by pruning all tags with a frequency ≥ 0.6%. Tag stemming and
approximate matching were also used to increase human pass rates. Only 20.1% of
participants preferred text-based CAPTCHAs, while 58.2% preferred our video-based
alternative. Finally, we demonstrate how our technique for extending the ground
truth tags allows for different usability/security trade-offs, and discuss how it
can be applied to other types of CAPTCHAs.
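
The grading procedure described above (extend the ground-truth set with related-video tags, prune frequent tags, then accept a response if a submitted tag matches) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the lowercase normalization standing in for stemming, the use of `difflib` similarity for approximate matching, and the 0.8 similarity threshold are all assumptions made for illustration. Only the 90 related tags, the 0.6% pruning threshold, and the three-tag response come from the abstract.

```python
from difflib import SequenceMatcher

PRUNE_THRESHOLD = 0.006  # prune tags with frequency >= 0.6% (from the abstract)
MAX_RELATED_TAGS = 90    # related-video tags added per challenge (from the abstract)


def normalize(tag):
    """Lowercase and trim a tag; a simple stand-in for real stemming (assumption)."""
    return tag.strip().lower()


def build_ground_truth(video_tags, related_tags, tag_frequency):
    """Extend a video's own tags with related-video tags, then prune frequent ones.

    tag_frequency maps a normalized tag to its estimated frequency (0.0 to 1.0);
    tags missing from the map are treated as rare and kept.
    """
    candidates = list(video_tags) + list(related_tags)[:MAX_RELATED_TAGS]
    return {
        normalize(t)
        for t in candidates
        if tag_frequency.get(normalize(t), 0.0) < PRUNE_THRESHOLD
    }


def tags_match(a, b, min_ratio=0.8):
    """Approximate match using difflib similarity (hypothetical threshold)."""
    return SequenceMatcher(None, a, b).ratio() >= min_ratio


def passes_challenge(user_tags, ground_truth, min_ratio=0.8):
    """Pass if any of the (up to three) submitted tags matches a ground-truth tag."""
    return any(
        tags_match(normalize(u), g, min_ratio)
        for u in user_tags[:3]
        for g in ground_truth
    )
```

In this sketch, raising `MAX_RELATED_TAGS` or `min_ratio` tolerance makes challenges easier for humans, while lowering `PRUNE_THRESHOLD` removes more of the common tags that a frequency-based attack would guess; these are the knobs behind the usability/security trade-off the abstract mentions.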
