The anti-human captcha should have been killed off long ago


"I'm not a robot" should be self-evident.

But before the computer recognizes you're human, you might be asked to click on images containing traffic lights or sidewalks.

When you squint close to the screen, wondering if a tiny corner counts, you know it's not that simple.

Anyone who scrambled for Spring Festival train tickets on 12306 in 2015 will know this feeling of being unable to prove who you are.

Years later, the ever-changing captcha still forces you to think about the age-old philosophical question – who am I?

Smiling dogs, horses made of clouds: proving you are human is getting harder

"Please click on every picture that contains a smiling dog."

Jared Bauman, founder of a creative marketing agency, was recently stumped by a captcha. What he wondered was: can dogs really smile? Most of the dogs looked neither happy nor sad; some were grimacing, others just had their mouths open.

On August 2, he was asked to find a "horse made of clouds". Among the 9 pictures were 2 elephants made of clouds. Unfortunately, his first click was wrong.

Bauman realized something serious: finding traffic lights, buses, or chimneys was already out of date, and captcha systems had moved on to the next level of challenge.

These captchas come from hCaptcha, which its developers say is more privacy-conscious than Google's captcha system, reCAPTCHA, because it collects only the minimum necessary personal data.

To understand why captchas keep getting harder, we have to start with what a captcha is and what Google's captcha system, reCAPTCHA, is.

CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart".

A captcha is also considered a reverse Turing test, because a computer administers the test to humans, rather than a human judging a machine as in the standard Turing test.

Captchas are designed to protect websites from harmful bots, which spread malware, create fake accounts, carry out DDoS attacks, send mass spam, steal user information, and more. These bots are essentially lines of computer code that run automatically.

Captcha was created in the early 2000s by a few computer scientists at Carnegie Mellon University.

The original CAPTCHA took the form of distorted text, designed to defeat automatic recognition by programs such as optical character recognition: it was more than computers of the time could decipher, yet still readable by most humans.

Researchers soon realized that the technology's potential went beyond telling humans and robots apart. They developed reCAPTCHA, which has users help digitize paper documents as they fill in captchas: the twisted letters come from old texts, which humans can decipher better than computers.

At this stage, the user had to enter two words: a control word with a known answer, which was the real test, and a new word that had not yet been transcribed. By showing the same new word to many users around the world, reCAPTCHA could automatically verify that it had been transcribed correctly.
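The two-word scheme can be sketched roughly as follows. This is only an illustration of the idea described above, not reCAPTCHA's actual implementation: the function names, the vote threshold `min_votes`, and the agreement ratio `min_share` are all assumptions made for the example.

```python
from collections import Counter


def check_submission(control_answer, typed_control, typed_unknown, votes):
    """Grade the user on the control word only.

    If the control word is right, the user's guess for the unknown word
    is recorded as one vote. Returns True when the user passes.
    """
    passed = typed_control.strip().lower() == control_answer.strip().lower()
    if passed:
        votes[typed_unknown.strip().lower()] += 1
    return passed


def accepted_transcription(votes, min_votes=3, min_share=0.6):
    """Return the agreed transcription of the unknown word, or None.

    min_votes and min_share are hypothetical thresholds: a word is accepted
    once enough users have voted and a clear majority agrees.
    """
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None
```

In this sketch, a user who gets the control word wrong contributes no vote at all, so bots that fail the real test cannot pollute the transcription of the unknown word.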

It is like crowdfunding on the Internet, except that it asks for your time instead of your money. This is the magic of the Internet: with the help of technology and a little fun, everyone's small effort naturally piles up, grain by grain, into a tower.

In 2009, Google acquired reCAPTCHA and used it to digitize Google Books and the New York Times archives. By 2011, reCAPTCHA had digitized the entire Google Books archive and 13 million New York Times articles. In 2012, it was transcribing about 150 million words a day.

Why do captchas keep getting harder?

Humans are immersed in the ocean of knowledge, and robots have not stopped learning.

In 2014, Google released an algorithm dedicated to deciphering distorted-text captchas. The artificial intelligence solved the most difficult distorted text with 99.8% accuracy, while the human success rate was 33%.

The twisted letters have lost their original purpose, and it's time for the next generation of captcha.

Google had already launched an image-based version of reCAPTCHA in 2012 that included photos from Google Street View, asking users to transcribe house numbers and other signs.

As with the earlier digitization of old books, Google served more than one purpose in the process: it defended against malicious scripts and, at the same time, improved its own artificial intelligence.

In 2014, Google said: "Street View and reCAPTCHA teams work closely together, and both will continue to improve to make maps more accurate and useful, and reCAPTCHA safer and more effective." Making maps more accurate and useful means Google needs to train its artificial intelligence to recognize objects in images better.

So how do you train artificial intelligence? With reCAPTCHA. In proving that they are human, hundreds of millions of users have built machine learning datasets for tech companies.

It's not just Google that's making progress. In 2017, developer Francis Kim ran an experiment: he built a system in 40 lines of JavaScript that attempted to pass reCAPTCHA's image captcha using the image recognition API of Clarifai, a Google competitor. The script successfully picked out the stores in the pictures.

In theory, this could also be achieved using Google's own image recognition technology.

Google's captcha system actually has two purposes: to suppress malicious scripts, and to train artificial intelligence on text, images, and so on. But while Google's artificial intelligence keeps getting better, malicious scripts are also improving in this battle of wits, and it is becoming harder and harder for users to prove that they are human.

In 2014, Google's "No CAPTCHA reCAPTCHA" took the stage: a captcha without the captcha. The interface is simple and friendly, and all you have to do is tick "I'm not a robot".

Google says it has launched a new API that observes user behavior, collecting data such as pointer movement speed, current IP address, whether plugins are used, how long the user has spent on the page, and how many clicks have been made, radically simplifying the reCAPTCHA experience. In most cases, a single click confirms that the user is not a bot.

However, the captcha did not disappear. Arguably, it became more annoying than ever.

When the risk analysis engine cannot tell whether the user is human, Google brings the captcha back, with new variations. One is based on the classic computer vision problem of image tagging: you are asked to select all the photos that contain cats or turkeys.

In addition, there are game-like captchas that require the user to rotate an object to a specific angle, or move a puzzle piece into place.

Humans can understand the logic of a puzzle, but robots that lack explicit instructions can be stumped. Whether robots will master these puzzles too in the future is hard to say.

The more machines learn, the fewer advantages humans have.

Can captchas be replaced?

Jason Polakis, a professor of computer science at the University of Illinois at Chicago, pointed out that machine learning is now on par with humans at basic text, image and speech recognition tasks, and "we need some alternatives."

What's more, captcha systems greatly reduce user experience and accessibility. Captchas are not easy for many people, especially the elderly and people with learning disabilities.

Eileen Ridge, who provides technical advice to elderly clients, said she often gets calls from clients who have trouble distinguishing between painted sidewalks and normal crosswalks, and who are very worried about being locked out of their accounts for wrong answers. Many senior citizens in China have the same attitude toward the Internet.

Smiling dogs and horses made of clouds may be even harder for them.

Alternatives to the captcha are also under continuous development.

Some sites use a form of captcha that human users never see: they insert form fields that only bots can see, tricking the bots into filling them in and thereby proving that they are not human.
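A minimal sketch of such a hidden-field check on the server side might look like this. It assumes the form contains an extra input that is hidden from human visitors with CSS; the field name `website` and the function name are hypothetical, chosen only for illustration.

```python
# Hypothetical name of the decoy input. On the page it would be hidden
# from humans (for example with CSS), so only automated form-fillers
# are expected to put anything in it.
HONEYPOT_FIELD = "website"


def looks_like_bot(form_data):
    """Return True if the hidden decoy field was filled in."""
    value = form_data.get(HONEYPOT_FIELD) or ""
    return bool(str(value).strip())
```

A real person never sees the decoy field and leaves it empty, so any submission with a value in it can be treated as automated. This does nothing against bots that render the page and skip hidden fields, which is one reason sites usually combine it with other checks.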

In the past two years, Google has launched a new captcha system, reCAPTCHA v3, which takes the opposite approach: it automatically records how users behave while browsing a website and scores them according to these records. If a user's score is too low, the user is judged to be a robot. Otherwise, users are not disturbed, and the online experience is very smooth. But this may raise privacy concerns.
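How a site might act on such a score can be sketched as below. The sketch assumes the score is a number between 0.0 and 1.0, where higher means more likely human; the two thresholds and the intermediate "challenge" outcome are hypothetical site policy chosen for illustration, not values prescribed by Google.

```python
def decide(score, block_below=0.3, challenge_below=0.7):
    """Map a behavior score to an action.

    score: assumed to lie in [0.0, 1.0], higher meaning more human-like.
    block_below, challenge_below: hypothetical thresholds set by the site.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score < block_below:
        return "block"        # score too low: treat the visitor as a bot
    if score < challenge_below:
        return "challenge"    # uncertain: fall back to an extra check
    return "allow"            # looks human: do not disturb the user
```

The point of the design is that most visitors land in the "allow" branch and never see a test at all; only low or uncertain scores lead to friction.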

Fast Company reports that whether users have Google cookies is an important factor in determining the score. Users get higher scores if they let Google remember their login information, while those who are not logged in to a Google account, or who use a VPN or the Tor browser, are often flagged as high-risk.

Shuman Ghosemajumder, CTO of bot-detection company Shape Security, believes that captcha tests such as game captchas and video captchas will eventually be cracked. Rather than tests, he prefers "continuous authentication", which essentially observes user behavior and looks for signs of automation:

"A real human doesn't have good control over their motor functions, so even if they try very hard, they can't move the mouse the same way multiple times over multiple interactions."
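The observation in this quote suggests one simple signal of automation: pointer paths that repeat almost exactly across interactions. The sketch below is only an illustration of that idea, assuming each path has been resampled to the same number of (x, y) points; the function names and the `tolerance` value are hypothetical, and real systems are certainly more elaborate.

```python
from itertools import combinations
from math import hypot


def path_distance(a, b):
    """Mean point-to-point distance between two equally sampled paths."""
    if len(a) != len(b) or not a:
        raise ValueError("paths must be non-empty and the same length")
    total = sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b))
    return total / len(a)


def looks_automated(paths, tolerance=1.0):
    """Flag a session if any two recorded paths are nearly identical.

    tolerance is a hypothetical threshold in pixels: humans rarely
    reproduce a path this closely, while a replayed script does.
    """
    return any(path_distance(a, b) <= tolerance
               for a, b in combinations(paths, 2))
```

For example, two recordings of `[(0, 0), (10, 12), (20, 18)]` would be flagged, while a second path of `[(0, 0), (8, 15), (20, 20)]` would not, since the two differ by more than a pixel on average.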

In June, Apple announced at its Worldwide Developers Conference that it would replace captchas with Private Access Tokens.

Unlocking the phone with a passcode or biometrics, opening the browser, typing in the website address correctly… this series of actions is already enough to "verify your identity". When Apple's system confirms that the device and the Apple ID account are in good standing, a private access token can be provided to the app or website that would otherwise require a captcha.

Website security providers such as Cloudflare and Fastly already support private access tokens, so users of iOS 16 devices no longer need to solve a captcha to log in to apps or websites protected by them. The technology is still being rolled out, and it needs more supporters to become more useful.

"This will save a lot of time for a lot of people, and users like to feel trusted," said Apple engineer Tommy Pauly.

But as long as there are fake accounts, spam, harassing messages, etc., we still need technology that separates human users from bots, and some form of captcha technology will always exist, developing in parallel with artificial intelligence.

In the future, captcha systems will likely recognize humans not by our ability to surpass robots, but by our ability to make mistakes. That is, they will set harder tests that we tend to fail while robots give the correct answer. Perhaps, as we scratch our heads trying to find every traffic light in the picture, we are fighting a battle that humans are bound to lose.

References:
1. https://auth0.com/blog/captcha-can-ruin-your-ux-here-s-how-to-use-it-right/
2. https://www.wired.com/story/smiling-dogs-horses-made-of-clouds-captcha-has-gone-too-far/
3. https://www.techradar.com/news/captcha-if-you-can-how-youve-been-training-ai-for-years-without-realising-it
4. https://www.theverge.com/2019/2/1/18205610/google-captcha-ai-robot-human-difficult-artificial-intelligence

Li Ruoqiuhuang, to exorcise evil.
