Undetected ChromeDriver in Python Selenium

Selenium’s default WebDriver is often flagged by the anti-bot systems that websites use against scrapers. You can complement it with Undetected ChromeDriver, a third-party WebDriver tool that does a better job of avoiding detection.

In this tutorial, you’ll learn how to use Undetected ChromeDriver with Selenium in Python and solve the most common errors.

What Is Undetected ChromeDriver?

Undetected ChromeDriver is a Selenium WebDriver optimized to avoid triggering anti-bot systems such as Cloudflare and Akamai. It works with Google Chrome, Brave and many other Chromium-based browsers.

How to Use Undetected ChromeDriver in Python with Selenium?

Let’s see what you’ll need to get started!

Prerequisites

To use Undetected ChromeDriver and run the example code, you’ll need the following:

  • Selenium because it’s the base.
  • Python 3 since the driver works only with Python 3.6 or higher.
  • Chrome because it’s the browser you’ll control from the script.

Install undetected_chromedriver to use it with Selenium and Python.

pip install undetected-chromedriver

Remark: If you don’t have Selenium installed, it’ll be added automatically as a dependency of Undetected ChromeDriver.
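To confirm that the install worked before writing any scraping code, a quick check along these lines may help. It’s a minimal sketch that relies on the standard library’s importlib.metadata module, which needs Python 3.8 or higher (an assumption beyond the 3.6 minimum listed above). The helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Report on the two distributions this tutorial relies on.
for name in ("undetected-chromedriver", "selenium"):
    print(name, "->", installed_version(name) or "not installed")
```

If either line reports "not installed", rerun the pip command above in the same environment you use to run your scripts.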

Once we have the tools we’ll need, let’s write our first lines of code.

import undetected_chromedriver as uc

# Start Chrome through the patched driver
driver = uc.Chrome()

# Load the target page
driver.get("https://www.nowsecure.nl")

# Confirm where we landed
print(driver.current_url) # https://www.nowsecure.nl/
print(driver.title) # nowSecure

The script launches Chrome and loads our target URL, https://www.nowsecure.nl/, along with all of the page’s resources. To check that it works, we print the address and title of the homepage.

Note that, as with base Selenium’s WebDriver, headless mode isn’t enabled by default. To turn it on, we’ll pass options to undetected_chromedriver.

# ... 
options = uc.ChromeOptions() 
 
options.headless = True 
 
driver = uc.Chrome(options=options) 
# ...

Using Undetected ChromeDriver with Selenium

With the browser running under Undetected ChromeDriver, we’ll use Selenium to look for the information we want.

We’re interested in the information available on the OpenSea page. So, we’ll use a CSS selector and Selenium’s find_element method to match the first element that carries the given class. Keep in mind that class names like the ones below look auto-generated, so the selector may stop matching when the site is updated.

from selenium.webdriver.common.by import By 
 
# ... 
driver.get("https://opensea.io") 
node = driver.find_element(By.CSS_SELECTOR, "h5[class='sc-29427738-0 sc-bdnxRM kgxFZp hBeyeI']") 
print(node.accessible_name) # "I'm Spottie" Vinyl Record Collection verified-icon
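When a page has several matching nodes, Selenium’s find_elements method returns all of them as a list, and an empty list (rather than an exception) when nothing matches. The following is a minimal sketch of that pattern. The helper names clean_texts and collect_texts are hypothetical, and the selector is whatever matches your page. Selenium is imported inside the function so the text-cleaning helper stays usable on its own.

```python
def clean_texts(raw_texts):
    """Strip whitespace and drop empty strings, keeping the original order."""
    stripped = (text.strip() for text in raw_texts)
    return [text for text in stripped if text]


def collect_texts(driver, css_selector):
    """Return the visible text of every element matching css_selector."""
    # Imported here so clean_texts can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    elements = driver.find_elements(By.CSS_SELECTOR, css_selector)
    return clean_texts(element.text for element in elements)
```

With a driver that has already loaded a page, collect_texts(driver, "h5") would return the text of every h5 element found on it.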

Undetected ChromeDriver Proxy

You can also use a proxy with Undetected ChromeDriver to avoid getting blocked while web scraping. However, opting for a free solution won’t do you much good because free proxies are often unreliable. The reason is they’re run by providers with limited resources and usually outdated infrastructure. And as they’re public, many people use them, which can easily result in IP bans.

On the other hand, residential proxies are well-maintained and sourced from reputable ISPs, so they’re a better option for web scraping than datacenter IPs.
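One way to route the browser through a proxy is Chrome’s --proxy-server command-line switch, passed through the driver options. The sketch below assumes an unauthenticated proxy, since this switch doesn’t carry a username or password. The helper names build_proxy_argument and create_driver_with_proxy are hypothetical, and the address is a placeholder from a documentation-only IP range, not a working proxy.

```python
def build_proxy_argument(host, port, scheme="http"):
    """Build the value for Chrome's --proxy-server command-line switch."""
    return f"--proxy-server={scheme}://{host}:{port}"


def create_driver_with_proxy(host, port, scheme="http"):
    """Start Undetected ChromeDriver with traffic routed through a proxy."""
    # Imported here so the helper above can be used without the library installed.
    import undetected_chromedriver as uc

    options = uc.ChromeOptions()
    options.add_argument(build_proxy_argument(host, port, scheme))
    return uc.Chrome(options=options)


# 203.0.113.10 is a placeholder address, not a real proxy.
print(build_proxy_argument("203.0.113.10", 8080))
```

Calling create_driver_with_proxy with your provider’s host and port returns a driver you can use exactly like the one from uc.Chrome().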

Common Errors from Undetected ChromeDriver and Selenium

The most common errors we can get from undetected_chromedriver and Selenium are:

  • Denied Access.
  • Headless Evasion.

We’ll see each in detail to understand how to solve them.

Denied Access

According to the current release on PyPI, Undetected ChromeDriver is better optimized for bypassing anti-bot systems than Selenium’s WebDriver. However, the results aren’t guaranteed.

For instance, let’s try using https://www.zoominfo.com as a target URL.

# ... 
driver.get("https://www.zoominfo.com") 
# Access denied | www.zoominfo.com used Cloudflare to restrict access

Here, we fail to bypass the bot detection system implemented by ZoomInfo.
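Since a block page still loads without raising an error, it can help to check the page title before trying to extract anything. The sketch below is a rough heuristic, not a reliable detector. The helper name looks_blocked is hypothetical, "access denied" comes from the ZoomInfo result above, and the other two markers are assumptions based on titles commonly seen on Cloudflare challenge pages.

```python
# Title fragments that suggest a block page. "access denied" comes from the
# ZoomInfo example; the other two are assumptions and may need adjusting.
BLOCK_MARKERS = ("access denied", "attention required", "just a moment")


def looks_blocked(title):
    """Return True if the page title resembles a known block page."""
    lowered = title.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)


print(looks_blocked("Access denied | www.zoominfo.com used Cloudflare to restrict access"))  # True
print(looks_blocked("nowSecure"))  # False
```

After driver.get(...), calling looks_blocked(driver.title) lets the script skip or retry a page instead of parsing a block screen.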

Headless Evasion

As of release 3.1.0 on GitHub, headless mode was still described as a work in progress, carried over from a prior release, and later releases haven’t announced any change.

That means headless mode gives you fewer guarantees of staying undetected, so it’s safer to avoid it. Keep in mind, though, that running the driver featured in this article in regular mode might be expensive in compute resources, depending on your project size.
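One signal that detection scripts commonly read is the navigator.webdriver property, which automated browsers typically report as true. The sketch below checks what the loaded page can see. It covers only this single signal, so a clean result doesn’t mean the session is undetectable. The helper name webdriver_flag_exposed is hypothetical.

```python
def webdriver_flag_exposed(driver):
    """Return True if the page sees navigator.webdriver as a truthy value."""
    # execute_script runs JavaScript in the current page and returns the result.
    value = driver.execute_script("return navigator.webdriver")
    return bool(value)
```

Running webdriver_flag_exposed(driver) once in regular mode and once in headless mode gives a rough comparison between the two.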

Conclusion

In this Undetected ChromeDriver tutorial with Selenium in Python, we learned why and how the library could help us.

Compared with the stock Selenium WebDriver, undetected_chromedriver is better at bypassing bot detection systems, although the results aren’t guaranteed.

Some Questions

Is Undetected ChromeDriver Safe?

Undetected ChromeDriver is generally safe to use even though it’s an unofficial, third-party project, as it’s regularly maintained and updated. You shouldn’t expect compatibility issues, security vulnerabilities, or other trouble, but always check PyPI and GitHub for new developments.

What Is the Use of Undetected ChromeDriver?

Undetected ChromeDriver is used to avoid triggering anti-bot measures. It’s a WebDriver for Selenium that you can use in Python to complement the official one and reduce the chance of bot detection.

Why Is Undetected ChromeDriver Not Working?

If Undetected ChromeDriver isn’t working, it’s likely because you’ve run into one of the following errors:

  • Denied Access: The driver sometimes fails to bypass security measures.
  • Headless Evasion: The headless mode isn’t optimized yet, so it may fail to avoid detection.

How to Use Undetected ChromeDriver in Python Selenium?

Here’s what you need to do to use Undetected ChromeDriver with Selenium in Python:

  1. Install Python 3, Selenium, and Chrome.
  2. Use pip install undetected-chromedriver to install the Undetected ChromeDriver.
  3. Import the library, load the Chrome browser, and get your target site.
  4. Make sure you’ve connected properly by printing the address and title of the homepage.
  5. Use CSS selectors and Selenium’s find_element or find_elements method to look for the information you want.
