Node.js, Python & Ruby Bots Zoo repo

Today I came across the Node.js [and Python] bots garden/zoo, a collection of modern bots that drive different kinds of browsers (Firefox, Chrome, headless or not) through different automation frameworks (Puppeteer, Selenium, Playwright) in several programming languages.

Antoine Vastel's bots repo currently contains the following bots:

  • Playwright (NodeJS): Chromium, Webkit (Safari), Firefox
  • Playwright extra stealth (NodeJS): Chromium (will be updated when it becomes stable)
  • Puppeteer (NodeJS): Chromium, Firefox, Android (emulation), iPhone (emulation)
  • Puppeteer extra stealth (NodeJS): Chromium
  • Pyppeteer stealth (Python): Chromium
  • Selenium (NodeJS): Chromium, Firefox
  • Selenium stealth (Python): Chrome
  • Undetected Chromedriver (Python): Chrome
  • Ferrum (Ruby): Chrome
  • Watir (Ruby): Chrome, Safari (macOS)
  • Simple HTTP module/library (NodeJS + Cheerio): Sequential, Parallel, Sequential using NordVPN, HTTP proxies
  • Simple HTTP module/library (Python requests/aiohttp + BeautifulSoup): Sequential, Parallel (x2 implementations)
  • Simple HTTP module/library (Golang standard library + goquery): Sequential, Parallel
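
To make the "Sequential" and "Parallel" variants of the simple HTTP bots more concrete, here is a minimal sketch using only the Python standard library. It is not code from the repo: the function names (`fetch`, `crawl_sequential`, `crawl_parallel`), the `bots-zoo-sketch/0.1` user-agent string and the thread-pool approach are all assumptions made for illustration, and the repo's own Python bots use requests/aiohttp instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable
from urllib.request import Request, urlopen


def fetch(url: str, timeout: float = 10.0) -> str:
    """Download one page over plain HTTP(S); no browser is involved."""
    # Hypothetical user-agent string, chosen only for this sketch.
    request = Request(url, headers={"User-Agent": "bots-zoo-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_sequential(
    urls: Iterable[str], fetcher: Callable[[str], str] = fetch
) -> Dict[str, str]:
    """Fetch the URLs one after another, in the given order."""
    return {url: fetcher(url) for url in urls}


def crawl_parallel(
    urls: Iterable[str],
    fetcher: Callable[[str], str] = fetch,
    workers: int = 4,
) -> Dict[str, str]:
    """Fetch the URLs concurrently with a small pool of threads."""
    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as url_list.
        bodies = list(pool.map(fetcher, url_list))
    return dict(zip(url_list, bodies))
```

Both functions accept a `fetcher` argument so the crawl logic can be tried with a stub before pointing it at a real site. From a detection point of view, the parallel variant mostly differs in its request timing, since many requests arrive from the same client almost at once.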

To be added

  • Playwright Firefox/WebKit
  • Selenium Firefox, in NodeJS as well as in other programming languages such as Python.
  • Examples for bot frameworks that provide mechanisms against bot detection solutions.

More resources for browser masking

The headers directory contains data related to HTTP headers. For the moment, it contains:

  • A list of ~16K user-agents;
  • Accept headers for the main browsers;
  • Accept-Encoding headers for the main browsers;
  • Header names for the main browsers;
  • Fetch metadata request headers.
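
As a rough idea of how such header data could be combined, the sketch below builds one browser-like header set in Python. Everything in it is an assumption for illustration: the `SAMPLE_PROFILES` dictionary, the `build_headers` function and the sample user-agent and Accept values are hand-written stand-ins, not the actual contents or file layout of the headers directory.

```python
import random
from typing import Dict, Optional

# Illustrative sample values only; the real data lives in the repo's
# headers directory, whose file names and formats are not shown here.
SAMPLE_PROFILES = {
    "chrome": {
        "user_agents": [
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        ],
        "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,"
                  "image/avif,image/webp,image/apng,*/*;q=0.8",
    },
    "firefox": {
        "user_agents": [
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) "
            "Gecko/20100101 Firefox/121.0",
        ],
        "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,"
                  "image/avif,image/webp,*/*;q=0.8",
    },
}


def build_headers(
    browser: str, rng: Optional[random.Random] = None
) -> Dict[str, str]:
    """Return headers for a top-level navigation that match one browser."""
    rng = rng or random.Random()
    profile = SAMPLE_PROFILES[browser]
    return {
        # Pick the user-agent from the same browser family as the
        # Accept header so the two do not contradict each other.
        "User-Agent": rng.choice(profile["user_agents"]),
        "Accept": profile["accept"],
        "Accept-Encoding": "gzip, deflate, br",
        # Fetch metadata request headers for a user-initiated navigation.
        "Sec-Fetch-Site": "none",
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-User": "?1",
        "Sec-Fetch-Dest": "document",
    }
```

The point of keeping the values grouped per browser is consistency: a Firefox user-agent sent together with Chrome-style Accept headers is an easy signal for a detection script to pick up.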

Bonus: posts on Bot detection.
