Categories
Web Scraping Software

Import•io: the First Impression

There are two extreme approaches for building a web scraper: to make it highly flexible and customizable but understandable for IT gurus only or to make it nice, simple and handy but limited in usage. All scraping software developers usually try to find a golden mean between these two approaches. In this article I want to introduce you to a relatively new startup, import•io, which says that anyone can scrape any data regardless of his or her IT skills.

Categories
Web Scraping Software

Knowledge Walls: manipulation with JSON, XML, CSV and more

Personally, I prefer using online tools for performing quick manipulation on different data formats like JSON, XML, CSV and so on. They’re platform independent and always within reach of my hand (since I mainly work in a browser). After we published an article about 7 best JSON viewers, I was told about Knowledge Walls, a similar service containing many tools for text data manipulation.

Categories
Web Scraping Software

A simple way to turn a website into JSON

Recently, while surfing the web I stumbled upon an simple web scraping service named Web Scrape Master. It is a kind of RESTful web service that extracts data from a specified web site and returns it to you in JSON format.

Categories
Web Scraping Software

Quick Scraping with Yahoo Pipes

Yahoo PipesAs we are talking about web scraping, it would be a pity not to mention Yahoo Pipes, an exciting service provided by Yahoo!. This tool provides users with an intuitive graphical interface to assist them in organizing their favorite feeds and webpages into a single stream of content.

Categories
Uncategorized

Distil: Scrape Bot Protection Test

The anti scrape bot service test has been my focus for some time now. How well can the Distil service protect the real website from scrape? The only answer comes from an actual active scrape. Here I will share the log results and conclusion of the test. In the previous post we briefly reviewed the service’s features, and now I will do the live test-drive analysis.

Categories
Review

Distil Review: Anti-Scrape-Bot Service

Are you thinking of protecting your website content from theft and nonlegal scraping? Are you suspecting that some ‘innocent bots’ are continually visiting your web pages for data retrieval? Now we come to the anti scraping bot software and services. In this post we want to briefly review the new anti scrape bot service called Distil

Categories
Web Scraping Software

TheWebMiner, a cloud scraping tool

If you need to quick extract some data from an website and you lack of tech skills of the TheWebMiner’s Get By Sample web tool is a solution for you. Get By Sample works as a cloud web scraper and therefore it may work everywhere, on many devices even tablets and smartphones.

Categories
SaaS

80legs Review – Crawler for rent in the sky

80legs offers a crawling service that allows users to (1) easily compose crawl jobs and (2) cloud run their crawl jobs over the distributed computer network.

The modern web requires you to spend huge amount of processing power to mine it for information. How could a start-up or a small business do comprehensive data crawling without having to build the giant server farms used by major search engines?