An Example of Captcha Solver in Java

Recently I published an article on how to solve CAPTCHAs in C# using the DeathByCaptcha service, and I promised to offer examples in other languages as well. In this post I offer a Java project that does the same thing.

A Shortcut

You can download the project right away.

How it works

In short, this program uses Selenium WebDriver to grab a CAPTCHA picture, sends it to the DeathByCaptcha service, receives the recognized text, types it in and gets to the protected page. As an example of a CAPTCHA-protected webpage, I use my Web Scraper Testing Ground.

Let’s have a tour of the code now.

1. Opening the Webpage

First we need to initialize the WebDriver and open the target webpage. Let’s use the Firefox driver for this:

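// Needs org.openqa.selenium.firefox.FirefoxDriver and java.util.concurrent.TimeUnit
FirefoxDriver driver = new FirefoxDriver();
// Wait up to 1 second for an element to appear before a lookup fails
driver.manage().timeouts().implicitlyWait(1, TimeUnit.SECONDS);
driver.navigate().to("http://testing-ground.scraping.pro/captcha");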

2. Getting the Captcha Image

To get the image we take a screenshot of the page and then cut the CAPTCHA out of it according to the element's dimensions and location. After that the cropped image is encoded as PNG into an in-memory buffer, ready to be sent to the DeathByCaptcha service:

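// Take a screenshot and decode it into an image
byte[] arrScreen = driver.getScreenshotAs(OutputType.BYTES);
BufferedImage imageScreen = ImageIO.read(new ByteArrayInputStream(arrScreen));
// Locate the CAPTCHA element and read its size and position
// (Dimension and Point are the org.openqa.selenium classes)
WebElement cap = driver.findElementById("captcha");
Dimension capDimension = cap.getSize();
Point capLocation = cap.getLocation();
// Cut the CAPTCHA region out of the screenshot
BufferedImage imgCap = imageScreen.getSubimage(capLocation.x, capLocation.y, capDimension.width, capDimension.height);
// Encode the cropped image as PNG into an in-memory buffer
ByteArrayOutputStream os = new ByteArrayOutputStream();
ImageIO.write(imgCap, "png", os);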

You may ask why I use such a complicated solution, taking a screenshot and extracting the image from it. Why not simply download the image by its URL? The problem is that every time you request the image, the server returns a new, randomly generated CAPTCHA. To enter a valid code you need the very image that was generated for the page on which you enter the code.
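
One caveat about the cropping step: BufferedImage.getSubimage throws a RasterFormatException when the requested region reaches past the edge of the screenshot. That can happen, for example, when the element lies outside the captured area, or when the screenshot is in device pixels while the element coordinates are not (high-DPI displays). Below is a minimal sketch of a more defensive crop. The class name CaptchaCropper, its methods and the scale parameter are my own illustrative names and are not part of the downloadable project; whether you need a scale other than 1.0 depends on your setup.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of the original project.
final class CaptchaCropper {

    // Crops the element region out of the screenshot, clipping it to the
    // screenshot bounds. "scale" converts element coordinates to screenshot
    // pixels (assumed to be 1.0 on a regular display).
    static BufferedImage cropClamped(BufferedImage screen, int x, int y,
                                     int width, int height, double scale) {
        Rectangle bounds = new Rectangle(0, 0, screen.getWidth(), screen.getHeight());
        Rectangle wanted = new Rectangle(
                (int) Math.round(x * scale), (int) Math.round(y * scale),
                (int) Math.round(width * scale), (int) Math.round(height * scale));
        Rectangle clipped = bounds.intersection(wanted);
        if (clipped.isEmpty()) {
            // Nothing of the element is visible in the screenshot at all
            throw new IllegalArgumentException(
                    "Region " + wanted + " lies outside the screenshot " + bounds);
        }
        return screen.getSubimage(clipped.x, clipped.y, clipped.width, clipped.height);
    }

    // Encodes an image as PNG and returns the bytes.
    static byte[] toPng(BufferedImage image) {
        try {
            ByteArrayOutputStream os = new ByteArrayOutputStream();
            ImageIO.write(image, "png", os);
            return os.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this helper the crop would read CaptchaCropper.cropClamped(imageScreen, capLocation.x, capLocation.y, capDimension.width, capDimension.height, 1.0). Note that a clipped image may contain only part of the CAPTCHA, so scrolling the element into view first is still the better cure.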

3. Requesting the DeathByCaptcha Service

Now that we have the captcha image extracted, we can send it to DeathByCaptcha for recognition. It takes just a couple of lines of code:

SocketClient client = new SocketClient("user", "password");
Captcha res = client.decode(new ByteArrayInputStream(os.toByteArray()));

Note that you need to replace “user” and “password” with your real DeathByCaptcha account details.
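
If you'd rather not hard-code the account details, one option is to read them from environment variables. This is only a sketch under my own assumptions: the class DbcCredentials and the variable names DBC_USERNAME and DBC_PASSWORD are made up for illustration and are not defined by DeathByCaptcha or by the downloadable project.

```java
import java.util.Map;

// Hypothetical holder for the account details, not part of the original project.
final class DbcCredentials {
    final String username;
    final String password;

    private DbcCredentials(String username, String password) {
        this.username = username;
        this.password = password;
    }

    // Reads the credentials from the given environment map
    // (pass System.getenv() in real use). The variable names are assumptions.
    static DbcCredentials fromEnvironment(Map<String, String> env) {
        String user = env.get("DBC_USERNAME");
        String pass = env.get("DBC_PASSWORD");
        if (user == null || user.isEmpty() || pass == null || pass.isEmpty()) {
            throw new IllegalStateException(
                    "Set DBC_USERNAME and DBC_PASSWORD before running the solver");
        }
        return new DbcCredentials(user, pass);
    }
}
```

The client would then be created with DbcCredentials c = DbcCredentials.fromEnvironment(System.getenv()); followed by new SocketClient(c.username, c.password).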

4. Typing the Recognized Captcha In

As soon as we get the response from DeathByCaptcha, we can type the recognized text into the form and submit it to reach the protected part of the site:

driver.findElementByXPath("//input[@name='captcha_code']").sendKeys(res.text);
driver.findElementByXPath("//input[@name='submit']").click();

That’s it! Note, though, that in these snippets I have omitted several minor details that are present in the whole project, but are not so important here.

Libraries

Here I’d like to briefly mention some crucial libraries, packages and classes used in the project:

9 replies on “An Example of Captcha Solver in Java”

A web scraper's IP can get blocked by the webmaster.
Could you please write a tutorial on how to change the IP with Selenium WebDriver?

We are developing software in VB.NET/C# to extract data from a website. The website has added CAPTCHA characters that must be entered to receive detailed data, and in our software we have to enter the CAPTCHA characters one by one. Can you help us read the CAPTCHA image?
thanks

Me too. When I run the code, I am getting an exception:

java.awt.image.RasterFormatException: (y + height) is outside of Raster
