Categories
Development

BrowserForge [Python] library to generate scraper headers & fingerprints

Recently I’ve found a Python library that generates fake headers and consistent fingerprints into custom scrapers. Such a generated headers and fingerprints contribute to bypass anti-bots
solutions.

Intelligent browser header & fingerprint generator

Categories
Development

JAVA library to scrape Linkedin & its data affiliates

In this post we want to share with you a new useful JAVA library that helps to crawl and scrape Linkedin companies. Get business directories scraped!

If you are considering the Linkedin data scrape legal issues, please refer to the following post: Linkedin lost in court to data analytic company that scrapes Linkedin’s public profiles info
Categories
Development Web Scraping Software

JavaScript rendering library for scraping javascript sites

Can you imagine how many scraping instruments are at our service? Though it has a long history, scraping has at last become a multi-lingual and simple approach. Unfortunately, there is a list of non-trivial tasks which can’t be resolved in a snap.

One of these tasks is scraping javascript sites, those that output data using JavaScript. Facing this task, classic scrapers (not all of them though) ignore JS-data and continue their own life-cycle. However, when this little defect becomes a big trouble, developers all over the world take measures. And they did it! Today we consider one of the most awesome tools which scrapes JS-generated data – Splash.