Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: CAPTCHA is almost a standard security technology and has found widespread application in
commercial websites. There are two types: labeling based and image based CAPTCHAs. To date,
almost all CAPTCHA designs are labeling based. Labeling based CAPTCHAs refer to those
that make judgment based on whether the question “what is it?” has been correctly answered.
Essentially in Artificial Intelligence (AI), this means judgment depends on whether the new label
provided by the user side matches the label already known to the server. Labeling based
CAPTCHA designs have some common weaknesses that attackers can exploit.
First, the label set, i.e., the number of classes, is small and fixed. Due to deformation and noise
in CAPTCHAs, the set of classes has to be reduced further to avoid confusion. Second, clean
segmentation is feasible in current designs, in particular in character labeling based CAPTCHAs.
The state of the art of CAPTCHA design suggests that the robustness of character labeling
schemes should rely on the difficulty of finding where the character is (segmentation), rather
than which character it is (recognition). However, the shapes of letters and digits
have very few geometric characteristics that humans can use to tell them apart, and those
characteristics easily become indistinct. Image recognition CAPTCHAs face many potential problems that have
not been fully studied. It is difficult for a small site to acquire a large dictionary of images
that an attacker cannot access. Without a means of automatically acquiring new
labeled images, an image based challenge does not usually meet the definition of a CAPTCHA.
They are either unusable or prone to attacks. In this paper, we present the different types of
CAPTCHAs designed to defeat advanced computer programs or bots, and discuss the limitations
and drawbacks of each.
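
As a rough illustration of the labeling based check described in the abstract, the sketch below compares a user-supplied label with the label the server already knows, and estimates how often blind guessing succeeds when the label set is small. The function names `verify_label` and `blind_guess_probability`, the case-insensitive normalization, and the class counts in the usage lines are assumptions made for this example; they are not taken from the paper.

```python
import hmac


def verify_label(submitted: str, expected: str) -> bool:
    """Return True if the user's label matches the label stored by the server.

    Hypothetical helper: assumes labels are compared case-insensitively
    after trimming surrounding whitespace.
    """
    a = submitted.strip().lower().encode("utf-8")
    b = expected.strip().lower().encode("utf-8")
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(a, b)


def blind_guess_probability(num_classes: int, length: int) -> float:
    """Chance that uniform random guessing gets every character right.

    Assumes each of the `length` characters is drawn independently and
    uniformly from `num_classes` possible labels.
    """
    if num_classes < 1 or length < 0:
        raise ValueError("num_classes must be >= 1 and length must be >= 0")
    return (1.0 / num_classes) ** length


if __name__ == "__main__":
    print(verify_label(" AbC7 ", "abc7"))
    # Illustrative class counts only: a full alphanumeric set versus a
    # reduced set with easily confused characters removed.
    print(blind_guess_probability(36, 6))
    print(blind_guess_probability(20, 6))
```

Shrinking the label set raises the blind-guess success rate, which is one way to read the abstract's point that a small, fixed, and further reduced class set weakens labeling based designs. A real attacker would use recognition rather than uniform guessing, so this figure is only a lower bound on the attack success rate.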