Sooner or later a new generation of spam protection methods will emerge to block all unwanted site visitors. The recently launched Google “No CAPTCHA reCaptcha” or ReCaptcha v2.0 could just be such a method.
This new behaviour analysis tool is getting more and more attention, both from site owners and from scraping engines that are trying to break it. Since Google does not reveal any secrets of its operation, we want to share with you the techniques used in this new smart analysis CAPTCHA that distinguishes between bot and human. Let’s look inside.
How does No CAPTCHA reCaptcha work
The supplied JavaScript captcha api code accumulates the cues of human activity (or its absence) on a web page even before a user (client) approaches the reCaptcha itself.
When a user moves to and ticks the “I’m not a robot” checkbox, that behaviour drives even more browser events. These are caught by the same script, and a request with an encoded payload is sent to the Google server; the user’s fingerprints are recorded and their cookies stored.
The behaviour analysis system on the Google server analyses the data provided and returns an encoded value to the client page. This value is user and time dependent.
In case of doubt (or bot-like behavior) Google’s server will ask the client to complete an additional image-check CAPTCHA (see picture below) to further verify whether the user is a bot or not.
The encoded value carries the hidden information on whether the user is verified or not. But you still need to find out whether Google has verified that user on that page. To check, your server sends a POST request with the following parameters: the returned encoded value, the secret key and the end user’s IP (the last one is optional). Read the details on how to fetch and verify the user’s response.
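The verification step just described can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions: the function names are ours, the endpoint and the parameter names (secret, response, remoteip) are taken from Google's public reCaptcha documentation as we understand it, and the reply is assumed to be JSON with a boolean "success" field.

```python
import json
import urllib.parse
import urllib.request

# Verification endpoint as given in Google's reCaptcha documentation.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_payload(token, secret, remote_ip=None):
    """URL-encode the verification parameters; remote_ip is optional."""
    params = {"secret": secret, "response": token}
    if remote_ip:
        params["remoteip"] = remote_ip
    return urllib.parse.urlencode(params).encode("ascii")


def is_verified(raw_json):
    """Return True only if the JSON reply explicitly reports success."""
    return json.loads(raw_json).get("success") is True


def verify_token(token, secret, remote_ip=None, opener=urllib.request.urlopen):
    """POST the encoded value to the endpoint and interpret the reply.

    `opener` is a hypothetical hook of ours so the network call can be
    replaced in tests; by default the real urlopen is used.
    """
    request = urllib.request.Request(
        VERIFY_URL, data=build_payload(token, secret, remote_ip)
    )
    with opener(request, timeout=10) as reply:
        return is_verified(reply.read())
```

The secret key appears only in this server-side request, so it should never be placed in the page itself.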
Cases in which a second image-check is required
Suspicious, bot-like behavior in the initial test. “In cases when the risk analysis engine can’t confidently predict whether a user is a human or an abusive agent, it will prompt a CAPTCHA to elicit more cues, increasing the number of security checkpoints to confirm the user is valid” (from the Google reCaptcha page).
Timeouts are also handled by the new reCaptcha. If there is no response from the client for a while, the reCaptcha pops up an additional image-check puzzle.
ReCaptcha application on mobile devices. The website will show you images for comparison/selection and you will be verified upon single or multiple tap(s).
Criteria of engine verification analysis
For this new type of CAPTCHA the main evidence is browser behaviour rather than the checkbox value:
- mouse movement, its slightness and straightness
- page scrolls
- time intervals between browser events
- keystrokes
- click location history tied to user fingerprint
All these cues are stored in the browser’s cookie and processed by Google’s server to discern bots from humans, since it is pretty hard for bots to mimic the browser behavior of humans. This technique is far more advanced than the old CAPTCHA spam protection methods, most of which can be solved using today’s technology.
Today’s artificial intelligence technology can solve even the most difficult variant of distorted text with 99.8% accuracy. Thus distorted text, on its own, is no longer a dependable test (according to Google research).
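To make two of the behavioural cues listed above more concrete (straightness of mouse movement and time intervals between events), here is a purely illustrative sketch. Google does not publish how its engine scores these cues, so the two measures below, and their names, are our own assumptions and not the actual reCaptcha algorithm.

```python
import math
from statistics import pstdev


def path_straightness(points):
    """Ratio of straight-line distance to travelled distance.

    1.0 means a perfectly straight mouse path; values near 0 mean the
    pointer wandered a lot or came back to where it started.
    """
    if len(points) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


def interval_jitter(timestamps):
    """Population standard deviation of the gaps between event timestamps.

    Perfectly regular events (a typical script) give 0.0.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return pstdev(gaps) if gaps else 0.0
```

A script that moves the pointer along a ruler-straight line at fixed intervals would score 1.0 and 0.0 here, while a human hand usually produces a slightly curved path and irregular timing.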
Some more on the behavior captcha
Some readers are perplexed: If the software is capable of differentiating between bots and humans before presenting CAPTCHAs, then what is the point of the CAPTCHA?
ReCaptcha 2.0 is smart. Really smart. How much CAPTCHA work users are asked to do depends on how human-like their behaviour is. If the risk assessment machine does not have enough evidence that a user is a human, it adds an extra challenge (an image CAPTCHA) for final verification. This method should remove the usual frustrations we humans feel when confronted with the traditional super distorted text CAPTCHAs.
Want it? Register with Google to integrate it
At this point, I believe, many readers are eager to get this new generation CAPTCHA on their sites. Prior to using it, you need to register your site (proving your site ownership) with the Google reCaptcha service. Upon success you’ll be issued the reCaptcha credentials (a site key and a secret key). The site key is later integrated into the form with reCaptcha (follow the reCaptcha management steps after signup), while the secret key is needed for the final verification by your server. This PHP library is available for integrating reCaptcha into a website.
In the following post we’ve described how to integrate it on site and make it work.
The simplest form with reCaptcha code
<script src="https://www.google.com/recaptcha/api.js" async defer></script>
<form method="post">
<div class="g-recaptcha" data-sitekey="[site key issued by google]"></div>
<input value="submit" type="submit" />
</form>
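When such a form is submitted, the widget adds the encoded value to the POST data in a field that Google's documentation calls g-recaptcha-response. As a rough sketch, assuming a plain URL-encoded form body and a helper name of our own choosing, the server could pick that value out like this before verifying it:

```python
from urllib.parse import parse_qs


def extract_recaptcha_token(raw_body):
    """Return the encoded reCaptcha value from a URL-encoded POST body.

    Returns None when the field is missing or empty, which is what
    happens if the visitor never ticked the checkbox.
    """
    fields = parse_qs(raw_body.decode("utf-8"))
    values = fields.get("g-recaptcha-response", [])
    return values[0] if values else None
```

A request that arrives without this value can be rejected straight away, without contacting Google at all.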
Need to break it?
At the same time, I am sure some web scraping developers and businesses would like to find a way of breaking through this type of CAPTCHA.
We’ve put together an iMacros script that breaks reCaptcha through a brute force approach. Selenium has also been put to use here.
In the following posts, we’ll explore some software and services that might be able to break this new CAPTCHA. So, stay tuned! If you want to help us test drive these methods, please let me know in the comments.
Conclusion
The reCaptcha v2.0 is no doubt a nice and powerful tool for spam and web scraping protection. Google has finally created a good user experience for sites which rely on CAPTCHA. Yet, I believe, both human labour CAPTCHA solving services and automated CAPTCHA solving systems will continue to fight and break this new invention in the endless human-bot competition.
12 replies on “No CAPTCHA reCaptcha challenge”
My site has a login form whose button is inserted by Javascript and whose result is submitted by Javascript to an API, not directly to an http request.
Seems to me that this makes the form robot proof unless the robot can interpret Javascript.
Comments?
The form will not be robot-proof, since reCaptcha takes the cues, evaluates them on the Google server and inserts the result into the form as an encoded value field. A robot can’t forge this encoded value.
It’s a shame, since I block javascript at most places. I especially block traffic to google. I don’t trust them, and I specifically do not trust them with data indicating my quirks involving how human I am and what I do in a browser. Forget that.
Nice idea, but can’t a spammer record a single successful human session and reuse the cookie and the payload sent to Google many times as a bot?
Yopi, your suggestion is not bad. If you can, you may try to code it and share with us. Yet, as far as I know, Google creates a time dependent cookie, so the interaction might be “valid” only for some time period.
I used Google’s No CAPTCHA reCaptcha plugin in Contact Form 7 for WordPress. Only once did it work fine. After that, it asks for image verification each time. But I don’t want that image verification challenge. Please help me modify the code of the plugin. I want the answer as soon as possible. Please help.
Google might suspect your reCaptcha solution behaviour to be bot-like. What’s the code of the plugin? Any link to it? For such a task you might probably have to pay someone to do it.
CAn u haCk cAtpcha in eaCh sites then emaiL me 🙂
It really Sucks.
im automation tester how can i bypass the reception technique using selenium code
What do you mean “reception technique” ?
Alternative Google recaptcha2 solver service provider, please see http://www.solverecaptcha.com/