
ReCaptcha to be solved with iMacros

Recently I’ve been getting requests for a tutorial showing how to solve Google’s No CAPTCHA reCaptcha. I introduced it in an earlier post and promised to work out a script that automates solving it. Here’s what I’ve come up with.

Disclaimer. Since this post was published, Google has made the form considerably harder to attack.

  1. Google engineers have removed the iframe’s name attribute that we relied on.
  2. They’ve changed the HTML markup of the image puzzle. The table layout now sits inside a block layout, and the table ranges from 3×3 to 6×6 tiles, so a random click is less likely to hit a correct tile.
    As a result, the scripted solution now takes far longer. Counting the same way as the original 3×3 estimate, the probability of solving a 4×4 puzzle in a single attempt is 2/16 * 1/15 ≅ 0.8% with 2 correct tiles and 3/16 * 2/15 * 1/14 ≅ 0.2% with 3 correct tiles. That is several times to an order of magnitude lower than the original 2.8%.
  3. Google has also set a session timeout, so the session expires after a certain time.
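
To see how the grid size affects a blind guess, here is a minimal sketch that tabulates the single-attempt probability for the grid sizes mentioned above. It assumes the clicks are distinct and that the order of clicking does not matter, which is the same counting as the 2/9 * 1/8 estimate for the 3×3 puzzle. The function names binomial and hitProbability are illustrative and not part of iMacros.

```javascript
// Number of ways to choose k tiles out of n (binomial coefficient).
function binomial(n, k) {
  var result = 1;
  for (var i = 1; i <= k; i++) {
    result = (result * (n - k + i)) / i; // stays an integer at every step
  }
  return Math.round(result);
}

// Probability that `correct` distinct random clicks hit exactly the
// `correct` right tiles on a grid with `tiles` cells.
function hitProbability(tiles, correct) {
  return 1 / binomial(tiles, correct);
}

// Tabulate the grids from 3x3 to 6x6 for puzzles with 2 or 3 right tiles.
[3, 4, 5, 6].forEach(function (side) {
  var tiles = side * side;
  var p2 = (hitProbability(tiles, 2) * 100).toFixed(2);
  var p3 = (hitProbability(tiles, 3) * 100).toFixed(2);
  console.log(side + "x" + side + ": 2 tiles " + p2 + "%, 3 tiles " + p3 + "%");
});
```

The sketch only models a blind guess. It says nothing about how Google scores the rest of the session.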

We are still working to improve the code against the new reCaptcha. Watch for new posts where we’ll publish the updated code!

Brute force cracking

After some unsuccessful attempts to imitate user behaviour with Selenium, I got help from my friend Egor Homakov and his post on reCaptcha, specifically the paragraph where he describes reCaptcha’s vulnerability:

“Client (website with CAPTCHA) knows how many wrong attempt you made (because verification is server side) but doesn’t know how many challenges you actually received (because User gets challenge with JavaScript, Client isn’t involved). Getting a challenge and verifying a challenge are loosely coupled events”.

So brute force might be the only way forward: the Client does not know how many picture challenges a macro goes through in a single attempt, and Google serves as many challenges as needed without banning. I’ve calculated the probability that a single trial picks the right 2 pictures out of 9 at random: 2/9 * 1/8 = 1/36 ≅ 2.8%. That is not a bad probability, provided Google asks for only 2 pictures out of 9 in each challenge.
[box style=’grey’]If you are eager for the CAPTCHA solution code, you can jump directly to it. Otherwise, read on to see how to find the reCaptcha HTML frame numbers you need to insert.[/box]

iMacros

The reCaptcha is essentially a puzzle that evaluates user behaviour. iMacros cannot reproduce real user behaviour, so we need to resort to brute force to break it.

Algorithm

  1. The macro clicks the reCaptcha checkbox.
  2. Google’s server-side assessment decides whether the right clues are present and returns the result to the web page.
  3. The macro checks whether the reCaptcha checkbox is marked. If it is (the captcha is solved) => finish the macro.
  4. [If not] Google’s server provides a picture puzzle to be solved.
  5. The macro selects random pictures* and submits the solution to Google’s server.
  6. Go to step 2.
*If the macro finds that some pictures are already selected at this point in the loop, it selects only one more tile (picture). This improves the odds for puzzles that need a third or fourth picture.

The algorithm can safely be repeated any number of times, so it eventually hits the bull’s-eye.
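
How many repetitions does "eventually" mean? A rough sketch, assuming every attempt succeeds independently with the same probability p (a simplification, since the real loop sometimes adds a single tile to an existing selection). The helper names expectedAttempts and successWithin are made up for illustration.

```javascript
// Expected number of attempts until the first success (geometric distribution).
function expectedAttempts(p) {
  return 1 / p;
}

// Probability of at least one success within `attempts` independent tries.
function successWithin(p, attempts) {
  return 1 - Math.pow(1 - p, attempts);
}

var p = 1 / 36; // the 2-out-of-9 estimate from this post
console.log("expected attempts: " + expectedAttempts(p).toFixed(1));
[10, 36, 100, 149].forEach(function (n) {
  console.log(n + " attempts: " + (successWithin(p, n) * 100).toFixed(1) + "%");
});
```

The last value, 149, matches the iteration cap of the loop in the full script further down.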

JavaScript wrapping

Since only part of the macro code needs to be repeated, I’ve wrapped the iMacros code in JavaScript (scripting). Read more here.
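
As a small illustration of the wrapping, the macros in this post are plain strings that start with "CODE:" and hold one iMacros command per line. A hypothetical helper (buildMacro is not an iMacros function, just a convenience sketch) could assemble such a string from an array of commands:

```javascript
// Join iMacros commands into one "CODE:" string, one command per line,
// in the same shape as the hand-built macro strings in this post.
function buildMacro(commands) {
  return "CODE:" + commands.map(function (command) {
    return command + "\n";
  }).join("");
}

var example = buildMacro([
  "TAB T=1",
  "SET !TIMEOUT_STEP 3",
  "WAIT SECONDS=2"
]);
console.log(JSON.stringify(example));
```

The resulting string is what gets passed to iimPlay() in the wrapper.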

Identifying frames

At the recording stage, iMacros identifies the reCaptcha HTML frames by name, which is a random number set by Google’s server, so the recorded name cannot be reused to address them. Instead, you’ll need to find out which frame numbers correspond to the main CAPTCHA box and the loaded picture puzzle. Through trial and error I found the frame numbers for the main reCaptcha box and the picture puzzle box on the WordPress register page (https://wordpress.org/support/register.php): 5 and 6 respectively.

Here is simple code to get the frame number of the main captcha HTML frame. Note the number shown in the prompt once the checkbox gets checked. The picture puzzle frame number should be the next number after the main captcha frame number:

VERSION BUILD=8920312 
TAB T=1
SET !LOOP 1
URL GOTO=https://wordpress.org/support/register.php
SET !ERRORIGNORE YES
FRAME F={{!LOOP}} 
SET !TIMEOUT_STEP 3
TAG POS=1 TYPE=DIV ATTR=CLASS:recaptcha-checkbox-checkmark
SET !VAR1 Frame<SP>number:<SP>{{!LOOP}}
PROMPT {{!VAR1}}

Play this code in loop mode. Once you have the main captcha frame number, substitute it and the picture puzzle frame number into the final macro code, 2 times each:

FRAME F=<your number>  // captcha main frame number

FRAME F=<your number>+1  // picture puzzle frame number

Submacros

There are 4 macros, each with its own function:

1. Initial click on check box

var init_macro = "CODE:";
init_macro += "VERSION BUILD=9052613" + "\n";
init_macro += "TAB T=1" + "\n";
init_macro += "TAB CLOSEALLOTHERS" + "\n";
init_macro += "SET !EXTRACT_TEST_POPUP NO" + "\n"; 
init_macro += "URL GOTO=https://wordpress.org/support/register.php" + "\n";
init_macro += "FRAME F=5" + "\n";
init_macro += "SET !TIMEOUT_STEP 3" + "\n";
init_macro += "TAG POS=1 TYPE=SPAN ATTR=ID:recaptcha-anchor" + "\n";
init_macro += "WAIT SECONDS=2" + "\n";

2. Check if captcha is solved (checkbox is checked)

We check whether the checkbox is checked by hovering over it (CONTENT=EVENT:MOUSEOVER) and reading the return code. See the iMacros Error and Return Codes page.

var solved = null;
var check_macro ="CODE:"; 
check_macro += "FRAME F=5" + "\n"; // we turn back to the main frame 
check_macro += "SET !TIMEOUT_STEP 1" + "\n";
check_macro += "TAG POS=1 TYPE=SPAN ATTR=ID:recaptcha-anchor&&aria-checked:true CONTENT=EVENT:MOUSEOVER"  + "\n";

3. Check if some pictures are already selected

If so, we click only one more picture per step (instead of 2 pictures as on the initial puzzle load).

var selected_tile = null, error_selected_tile;  
var selected ="CODE:"; 
selected += "FRAME F=6" + "\n"; 
selected += "SET !TIMEOUT_STEP 1" + "\n"; 
selected += "TAG POS=1 TYPE=DIV ATTR=CLASS:rc-imageselect-tileselected CONTENT=EVENT:MOUSEOVER"  + "\n";

4. Main macro

It clicks one or two pictures (tiles) in a puzzle. This improves the odds for puzzles that need a third or fourth picture.

var macro = "CODE:"; 
macro += "FRAME F=6" + "\n";
macro += "SET !TIMEOUT_STEP 1" + "\n";
macro += "TAG POS={{pos1}} TYPE=DIV ATTR=CLASS:rc-image-tile-target " + "\n";
macro += "SET !TIMEOUT_STEP 0" + "\n";
macro += "SET !ERRORIGNORE YES" + "\n"; // since the next command sometimes fails
macro += "TAG POS={{pos2}} TYPE=DIV ATTR=CLASS:rc-image-tile-target" + "\n";
macro += "SET !TIMEOUT_STEP 3" + "\n";
macro += "SET !ERRORIGNORE NO" + "\n";
macro += "EVENT TYPE=CLICK SELECTOR=\"#recaptcha-verify-button\" BUTTON=0" + "\n";

The whole iMacro code

Now we put these macros together in a JavaScript wrapper and play them inside a JavaScript loop through the iimPlay() function. I’ve also added debugging (iimDisplay()) and benchmarking (start, end vars) code here. Feel free to remove it.

/* random between 1 and 9 */
function rand(){ return Math.floor(Math.random()*9) + 1;  }
// initial click macro
var init_macro = "CODE:";
init_macro += "VERSION BUILD=9052613" + "\n";
init_macro += "TAB T=1" + "\n";
init_macro += "TAB CLOSEALLOTHERS" + "\n";
init_macro += "SET !EXTRACT_TEST_POPUP NO" + "\n"; 
init_macro += "URL GOTO=https://wordpress.org/support/register.php" + "\n"; 
init_macro += "FRAME F=5" + "\n"; // captcha main frame number 
init_macro += "SET !TIMEOUT_STEP 3" + "\n";
init_macro += "TAG POS=1 TYPE=SPAN ATTR=ID:recaptcha-anchor" + "\n";
init_macro += "WAIT SECONDS=2" + "\n";

// macro for checking captcha checkbox
var solved = null; 
var check_macro ="CODE:";
check_macro +="FRAME F=5" + "\n"; // captcha main frame number
check_macro +="SET !TIMEOUT_STEP 1" + "\n";
check_macro +="TAG POS=1 TYPE=SPAN ATTR=ID:recaptcha-anchor&&aria-checked:true CONTENT=EVENT:MOUSEOVER"  + "\n";  

// main macro 
var macro = "CODE:"; 
macro += "FRAME F=6" + "\n"; // picture puzzle frame number
macro += "SET !TIMEOUT_STEP 1" + "\n";
macro += "TAG POS={{pos1}} TYPE=DIV ATTR=CLASS:rc-image-tile-target " + "\n";
macro += "SET !TIMEOUT_STEP 0" + "\n";
macro += "SET !ERRORIGNORE YES" + "\n"; // since the next command sometimes fails
macro += "TAG POS={{pos2}} TYPE=DIV ATTR=CLASS:rc-image-tile-target" + "\n";
macro += "SET !TIMEOUT_STEP 3" + "\n";
macro += "SET !ERRORIGNORE NO" + "\n";
macro += "EVENT TYPE=CLICK SELECTOR=\"#recaptcha-verify-button\" BUTTON=0" + "\n"; 

// macro for checking if any selected tile exists
var selected_tile = null, error_selected_tile;
var selected = "CODE:"; 
selected += "FRAME F=6" + "\n";  // picture puzzle frame number
selected += "SET !TIMEOUT_STEP 1" + "\n"; 
selected += "TAG POS=1 TYPE=DIV ATTR=CLASS:rc-imageselect-tileselected CONTENT=EVENT:MOUSEOVER"  + "\n";  
  
// start execution
var start = new Date(), end; // for benchmarking
iimPlay(init_macro); 

for(var i=1;i<150 ;i++)    
{
	// ***************** check if captcha is solved or not *****************
	solved=iimPlay(check_macro);  
	if(solved > 0)
	{ 	   
		end = new Date();
		alert('Captcha is solved at loop '+i+'\n\rTime spent: '+Math.floor((end-start)/1000)+' s'); 
		break; 
	} 
	// ***************** check if there are already selected tiles *****************
	selected_tile =iimPlay(selected);
	error_selected_tile = iimGetErrorText();
	iimDisplay('Found at least one selected tile: ' + selected_tile + '\n\r '+ error_selected_tile);
	
	// ***************** we set random positions for image checkboxes *****************
	iimSet("pos1", rand());
	if (selected_tile < 0)
	{ iimSet("pos2", rand()); }
	else
	{ iimSet("pos2", ''); } // omit the second position if some tiles are already selected
	
	// ***************** play the images checker *******************
	iimPlay(macro);   	
}
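
One detail of the script above: rand() is called independently for pos1 and pos2, so in about 1 of 9 iterations both positions point at the same tile, and that attempt is presumably wasted. A possible refinement, offered only as a sketch (distinctPositions is a hypothetical helper, not part of the original script), draws two different positions:

```javascript
// Pick `count` distinct integers from 1..max with a partial Fisher-Yates shuffle.
function distinctPositions(max, count) {
  var pool = [];
  for (var n = 1; n <= max; n++) { pool.push(n); }
  for (var i = 0; i < count; i++) {
    // Swap a random remaining element into slot i.
    var j = i + Math.floor(Math.random() * (pool.length - i));
    var tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
  }
  return pool.slice(0, count);
}

var picks = distinctPositions(9, 2);
console.log("pos1=" + picks[0] + ", pos2=" + picks[1]);
```

In the wrapper, picks[0] and picks[1] would take the place of the two rand() results passed to iimSet().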

Execution statistics

In my trials the macro execution time looked roughly normally distributed. With over 50 samples (trials) I got these results:

  • mean value – 47 loops  (~3 min)
  • standard deviation – 21 loops
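
If you want to collect the same statistics from your own runs, a sketch like the following computes the mean and the sample standard deviation of the loop counts. The numbers in the loops array are hypothetical placeholders for illustration only, not the measurements behind the figures above.

```javascript
// Arithmetic mean of an array of numbers.
function mean(values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) { sum += values[i]; }
  return sum / values.length;
}

// Sample standard deviation (divides by n - 1).
function sampleStdDev(values) {
  var m = mean(values);
  var squares = 0;
  for (var i = 0; i < values.length; i++) {
    squares += (values[i] - m) * (values[i] - m);
  }
  return Math.sqrt(squares / (values.length - 1));
}

// Hypothetical loop counts from five runs, for illustration only.
var loops = [30, 52, 41, 75, 38];
console.log("mean: " + mean(loops).toFixed(1) + " loops");
console.log("standard deviation: " + sampleStdDev(loops).toFixed(1) + " loops");
```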

Obviously, solving reCaptcha with iMacros code is not optimal in terms of time, but it shows that reCaptcha is not a break-proof anti-scraping solution.

Integration

Some readers may ask whether this solution can be integrated into a real app. Sure. You can use the iMacros Scripting Interface from many languages (Windows-based).

[box style=”grey”]The iMacros Browser, Internet Explorer (with iMacros Add-on), Google Chrome (with iMacros Add-on) and Firefox (with iMacros Add-On) can be controlled with any Windows programming or scripting language.[/box]

As for Linux integration, iMacros “no longer support or update the Linux version of the Scripting Interface” (2013), but you can try to get going with this existing approach.

Conclusion

So far, brute force is the only approach I’ve been able to use successfully to break such remotely managed CAPTCHAs. The Client does not know how many challenges the remote CAPTCHA provider (in this case, Google) has issued until the user solves the CAPTCHA. And the provider does not care, since it generates millions of challenges for a host of remote CAPTCHA clients (holders). I feel that browser automation is the best way to solve these new sophisticated bot-protection puzzles!

I welcome your comments and questions!

32 replies on “ReCaptcha to be solved with iMacros”

Hi, I don’t understand why you use frame number five, and how can I see this frame? I recorded with iMacros and I don’t see this frame. Thanks a lot.

Since Google provides an unlimited number of attempts at its reCaptcha, banning is unlikely (see the post on the theory side: “No captcha recaptcha challenge”).
As for the uncertain iframe, we’ll need to figure it out.

Hi. Google has added new pictures that this script can’t solve, for example when they ask for lakes and the pictures change. Do you know of a possible solution?

Hi, I have a question: does this macro SOLVE the captcha itself, or does it send it to a website that sells captcha-solving services (human services)?

How can I use this code for another website?
I want to use it on this site: faucet.raiblockscommunity.net/form.php

Muhammad, the site you’ve mentioned is running the new invisible reCaptcha. It operates on the same principles as a regular reCaptcha 2.0. Yet I doubt that iMacros can get hold of it (solve it).

Can you make an iMacros script that integrates CapMonster? It’s an automated captcha solving software. I’m just having trouble sending the captcha image to CapMonster and using the results to click on the reCaptcha puzzle. CapMonster uses this site to emulate: antigate.com/imacros.html

SCRIPT FIXED. Now you only need to change the URL where the captcha is found, but it’s not helpful for reCaptcha.
You need to change only the URL: “https://www.google.com/recaptcha/api2/demo”

//Script start: copy this text, paste it into a text file, and change the extension to “.js”. This script only works in the Firefox iMacros extension.

/* random between 1 and 9 */
function rand(){ return Math.floor(Math.random()*9) + 1; }
// initial click macro
var init_macro = "CODE:";
init_macro += "VERSION BUILD=9052613" + "\n";
init_macro += "TAB T=1" + "\n";
init_macro += "TAB CLOSEALLOTHERS" + "\n";
init_macro += "SET !EXTRACT_TEST_POPUP NO" + "\n";
init_macro += "URL GOTO=https://www.google.com/recaptcha/api2/demo" + "\n";
init_macro += "FRAME F=1" + "\n"; // captcha main frame number
init_macro += "SET !TIMEOUT_STEP 3" + "\n";
init_macro += "TAG POS=1 TYPE=SPAN ATTR=ID:recaptcha-anchor" + "\n";
init_macro += "WAIT SECONDS=2" + "\n";

// macro for checking captcha checkbox
var solved = null;
var check_macro ="CODE:";
check_macro +="FRAME F=2" + "\n"; // captcha main frame number
check_macro +="SET !TIMEOUT_STEP 1" + "\n";
check_macro +="TAG POS=1 TYPE=SPAN ATTR=ID:recaptcha-anchor&&aria-checked:true CONTENT=EVENT:MOUSEOVER" + "\n";

// main macro
var macro = "CODE:";
macro += "FRAME F=2" + "\n"; // picture puzzle frame number
macro += "SET !TIMEOUT_STEP 1" + "\n";
macro += "TAG POS={{pos1}} TYPE=DIV ATTR=CLASS:rc-image-tile-target " + "\n";
macro += "SET !TIMEOUT_STEP 0" + "\n";
macro += "SET !ERRORIGNORE YES" + "\n"; // since the next command sometimes fails
macro += "TAG POS={{pos2}} TYPE=DIV ATTR=CLASS:rc-image-tile-target" + "\n";
macro += "SET !TIMEOUT_STEP 3" + "\n";
macro += "SET !ERRORIGNORE NO" + "\n";
macro += "EVENT TYPE=CLICK SELECTOR=\"#recaptcha-verify-button\" BUTTON=0" + "\n";

// macro for checking if any selected tile exists
var selected_tile = null, error_selected_tile;
var selected = "CODE:";
selected += "FRAME F=2" + "\n"; // picture puzzle frame number
selected += "SET !TIMEOUT_STEP 1" + "\n";
selected += "TAG POS=1 TYPE=DIV ATTR=CLASS:rc-imageselect-tileselected CONTENT=EVENT:MOUSEOVER" + "\n";

// start execution
var start = new Date(), end; // for benchmarking
iimPlay(init_macro);

for(var i=1;i<150 ;i++)
{
// ***************** check if captcha is solved or not *****************
solved=iimPlay(check_macro);
if(solved > 0)
{
end =+ new Date();
alert('Captcha is solved at loop '+i+'\n\rTime spent: '+Math.floor((end-start)/1000));
break;
}
// ***************** check if there are already selected tiles *****************
selected_tile =iimPlay(selected);
error_selected_tile = iimGetErrorText();
iimDisplay('Found at least one selected tile: ' + selected_tile + '\n\r '+ error_selected_tile);

// ***************** we set random positions for image checkboxes *****************
iimSet("pos1", rand());
if (selected_tile<0)
{ iimSet("pos2", rand()); }
else
{ iimSet("pos2", ''); } // omit second position if present selected tiles

// ***************** play the images checker *******************
iimPlay(macro);
}

//script end

You need to implement image recognition instead of the rand formula; then it will solve reCaptcha.
For example:
Step 1: download all reCaptcha images.
Step 2: when a reCaptcha is found, match the stored reCaptcha images against those on the web page and click on the ones that match.

wiki.imacros.net/IMAGESEARCH
wiki.imacros.net/Image_Validation
