Categories
Uncategorized

Pros and Cons of using Selenium WebDriver for Website Scraping

Since Selenium WebDriver was created for browser automation, it can easily be used for scraping data from the web. In this post we will consider some advantages and drawbacks of using WebDriver for web scraping.

Categories
Development

How to change WebDriver’s IP address

I have already written several articles on how to use WebDriver for web scraping, but I have never touched on the topic of changing WebDriver’s IP address. Nevertheless, this topic is crucial when it comes to web scraping, and here I’d like to show you an example of using proxies with WebDriver in Python (and […]
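
As a rough illustration of the proxy idea in this excerpt, here is a minimal sketch, assuming Chrome driven through the Selenium Python bindings. The helper names `proxy_argument` and `build_chrome_with_proxy`, the proxy address, and the test URL are illustrative assumptions, not taken from the article itself.

```python
def proxy_argument(host, port, scheme="http"):
    """Build the Chrome command-line switch that routes traffic through a proxy."""
    return f"--proxy-server={scheme}://{host}:{port}"


def build_chrome_with_proxy(host, port, scheme="http"):
    """Start Chrome through Selenium with the proxy switch applied."""
    # Imported here so proxy_argument() can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(proxy_argument(host, port, scheme))
    return webdriver.Chrome(options=options)


if __name__ == "__main__":
    # 203.0.113.10:8080 is a placeholder address (documentation range),
    # not a working proxy; replace it with a proxy you control.
    driver = build_chrome_with_proxy("203.0.113.10", 8080)
    try:
        # Assumed check: a page that echoes the caller's IP address.
        driver.get("https://httpbin.org/ip")
        print(driver.page_source)
    finally:
        driver.quit()
```

Changing the IP address then amounts to starting a new driver with a different host and port, since the switch is read when the browser launches.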

Categories
Development

6 Tips for Using WebDriver with Java

In this article I’ll share with you 6 useful tips that may help you when you work with Selenium WebDriver in Java.

Categories
Development

How to scrape Amazon with WebDriver in Java

Here is a real-world example of using Selenium WebDriver for scraping. This short program is written in Java and scrapes the book title and author from the Amazon webstore.

Categories
Development

How to use Selenium WebDriver with Java

Since we have already shown you an example of using WebDriver with C#, in this post we will see how to extract web data using Selenium WebDriver with Java, the native language of Selenium WebDriver.

Categories
Development

Example of Scraping with Selenium WebDriver in C#

In this article I will show you how easy it is to scrape a website using Selenium WebDriver. I will guide you through a sample project which is written in C# and uses WebDriver in conjunction with the Chrome browser to log in to the testing page and scrape the text from the private area of the website.

Categories
Uncategorized

What is Selenium WebDriver?

If you are interested in browser automation or web application testing, you may have already heard of Selenium. Since there is a lot of terminology related to this framework, it is easy to get lost, especially if you come to Selenium for the first time. In this article I want to save your day by providing a short […]

Categories
Challenge Development

Undetected ChromeDriver in Python Selenium

Selenium comes with a default WebDriver that often fails to get past anti-bot protection. Yet you can complement it with Undetected ChromeDriver, a third-party WebDriver tool that will do a better job. In this tutorial, you’ll learn how to use Undetected ChromeDriver with Selenium in Python and solve the most common errors.

Categories
Development

How to find out whether a website is Distil protected?

Given: a webpage to scrape. If you inspect the DOM tree of that page, you will find that quite a few tags contain the keyword dist. As an example: <link rel="shortcut icon" type="image/x-icon" href="/wcsstore/ColesResponsiveStorefrontAssetStore/dist/30e70cfc76bf73d384beffa80ba6cbee/img/favicon.ico"> <link rel="stylesheet" href="/wcsstore/ColesResponsiveStorefrontAssetStore/dist/30e70cfc76bf73d384beffa80ba6cbee/css/google/fonts-Source-Sans-Pro.css" type="text/css" media="screen">
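
The check described in this excerpt can be sketched with Python's standard `html.parser` module. This is only a minimal illustration of the keyword heuristic; the class and function names (`DistAttributeFinder`, `find_dist_tags`) and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class DistAttributeFinder(HTMLParser):
    """Collects (tag, attribute, value) triples whose value contains a keyword."""

    def __init__(self, keyword="dist"):
        super().__init__()
        self.keyword = keyword
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if value and self.keyword in value.lower():
                self.matches.append((tag, name, value))


def find_dist_tags(html, keyword="dist"):
    """Return every tag attribute in the HTML whose value mentions the keyword."""
    finder = DistAttributeFinder(keyword)
    finder.feed(html)
    finder.close()
    return finder.matches


if __name__ == "__main__":
    # Hypothetical sample markup, shaped like the tags quoted in the excerpt.
    sample = (
        '<link rel="stylesheet" href="/store/dist/abc123/css/fonts.css" type="text/css">'
        '<p class="intro">Hello</p>'
    )
    for tag, attribute, value in find_dist_tags(sample):
        print(tag, attribute, value)
```

A `dist` path segment can also be an ordinary build output directory, so a match is best treated as a hint to investigate further rather than as proof of protection.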

Categories
Development

Java, Selenium, headless Chrome, JSoup to scrape data off the web

In this post we share with you how to scrape a JS-rendered website. The tools, as seen in the title, are Java with the Selenium library driving headless Chrome instances (download driver) and JSoup as the parser that extracts data from the acquired HTML.