Tag: crawling

Node.js, Puppeteer, Apify for Web Scraping (Xing scrape) – part 2

Post author By admin
Post date October 8, 2019

In this post we share the practical implementation (code) of the Xing companies scrape project using Node.js, Puppeteer and the Apify library. The first post, describing the project objectives, algorithm and results, is available here.

You can look at the scrape algorithm here.

Tags business directory, crawling, headless, Node.js

Development

Make crawling easy with Real Time Crawler of Oxylabs.io

Post author By admin
Post date November 26, 2018

Nowadays, it’s hard to imagine life without search engines. “If you don’t know something, google it!” is one of the most popular maxims of our time. But how many people use Google in an optimal way? A lot of developers use Google search commands to get the answers they need as fast as possible.

Even this is not enough today! Large and small companies need terabytes of data to make their business profitable. The search process has to be automated and made reliable so that users get fresh news, updates or posts. In today’s article we will look at a very helpful tool for collecting fresh data: Real-Time Crawler (RTC). Let’s start!

Tags crawling, service, web scraping

Data Science

Testing the Filter by TheWebMiner for advanced web content filtering

Post author By admin
Post date February 9, 2016

Recently I came across an interesting new tool from TheWebMiner called Filter. The Filter is an attempt by TheWebMiner to sort (categorize) indexed websites and deliver them to users as a content filtering service.

Tags crawling, service

Web Scraping Software

A tool to extract phone numbers from a list of URLs

Post author By admin
Post date October 14, 2015

Today I got a question from one of my readers asking if there is a good out-of-the-box solution for crawling multiple websites for contact information.

Tags crawling

Review

Inspyder Power Search Review

Post author By admin
Post date February 28, 2013

Inspyder Power Search is a crawling and scraping application geared toward straightforward scraping, using both XPath and Regex. The program has a simple, clean interface that makes it easy to learn and use.

Inspyder is designed for multiple purposes:

Tags crawling, scraper

SaaS

80legs Review – Crawler for rent in the sky

Post author By admin
Post date December 1, 2012

80legs offers a crawling service that allows users to (1) easily compose crawl jobs and (2) run those crawl jobs in the cloud over a distributed computer network.

The modern web requires you to spend a huge amount of processing power to mine it for information. How could a start-up or a small business do comprehensive data crawling without having to build the giant server farms used by major search engines?

Tags crawling, service