Categories
Development

Node.js to automate a browser XHR (Ajax)

Lately I needed to scrape some data that are dynamically loaded by “Load more” button. A website JavaScript invokes XHR (or Ajax request) to fetch a next data portion. So, the need was to re-run those XHR with some POST parameters as variables.

So, how to make it in Node.js?

Categories
Development

Puppeteer async scraper with browsers number to be tuned based on CPU capacity

Recently we’ve got a tricky website of dynamic content to scrape. The data are loaded thru XHRs into each part of the DOM (HTML markup). So, the task was to develop an effective scraper that does async while using reasonable CPU recourses.

Categories
Challenge Development

Bypass GoDaddy Firewall thru VPN & browser automation

Recently we encountered a website that worked as usual, yet when composing and running scraping script/agent it has put up blocking measures.

In this post we’ll take a look at how the scraping process went and the measures we performed to overcome that.

Categories
Challenge Development

Human-operated and automated Browser Fingerprints testing and needed parameters

In a previous post we’ve considered the ways to disguise an automated Chrome browser by spoofing some of its parameters – Headless Chrome detection and anti-detection. Here we’ll share the practical results of Fingerprints testing against a benchmark for both human-operated and automated Chrome browsers.

Categories
Guest posting Web Scraping Software

UiPath – Robotic Process Automation Software

UiPath is an Enterprise Robotic Process Automation (RPA) Software designed to empower companies to automate repetitive, manual, rules-based business processes. Any repetitive task a user performs on his computer, including data entry, legacy application integration, data or content migration, screen scraping and testing can be automated with UiPath.