Categories
Guest posting Web Scraping Software

Turn any interactive website into an API with ParseHub

Anyone should be able to pull data from the web and access it in the format they want. If a website does not have an API available, scraping is one of the only options to get the data you need. But figuring out how to scrape data out of complicated HTML is a pain.

ParseHub is a new web browser extension that you can use to turn any dynamic and poorly structured website into an API, without writing code. ParseHub is a scraping tool that is designed to work on websites with JavaScript and Ajax; it is similar to web scraping tools such as Import.io and Kimono Labs.

Categories
Challenge

Q&A with ScrapeHero

In this post we’d like to share an interview with a young service called ScrapeHero. We’ve interviewed Tony Paul (marketing head) and this is what he had to say.

Categories
Web Scraping Software

Scraping software and services landscape

After almost 3 years of running this scraping blog and reviewing dozens of products, I’d like to use this short post to categorise the web scraping tools and methods available to end users. Here are typical examples of scrapers in each category.

Categories
Development

Web Scraping with Python + Scrapy (blog series)

This is part 1 of a series dedicated to getting novices started with a simple web scraping framework in Python.

Categories
Uncategorized

import.io’s New Scraping Process and Features


The web scraping data platform import.io announced last week that it has secured $3M in funding from investors that include the founders of Yahoo! and MySQL.

They also released a new beta that is essentially an improved version of their extraction tool, with some new features and a much cleaner and faster user experience.

Categories
Uncategorized

Scraping software, services and plugins sum up

Since we have already reviewed classic web harvesting software, we want to sum up some other scraping services and crawlers, scraping plugins and other scraping-related tools.

Web scraping can be applied to a wide variety of fields, and in turn it often relies on other technologies. SEO relies on scraping. Proxying is one way to stay masked while doing a lot of web data extraction. Crawling is another sub-technology, indispensable when scraping unordered information sources. Data refining follows the scrape, to deal with the unavoidable inconsistency of harvested data.
In addition, we will consider fast scraping tools that make life easier, and some services and handy scrapers that let us obtain freshly extracted data or images.

Categories
Web Scraping Software

TheWebMiner, a cloud scraping tool

If you need to quickly extract some data from a website and you lack the tech skills, TheWebMiner’s Get By Sample web tool is a solution for you. Get By Sample works as a cloud web scraper, so it can run anywhere, on many devices, even tablets and smartphones.

Categories
Data Mining

Data Journalism Handbook Poster

The poster, composed by Liliana Bounegru and Lulu Pinney, briefly summarizes what is in the Data Journalism Handbook. This reference book shows how journalists can produce interesting news out of data gathered from the web.

Categories
Review

Scraping for Journalists book Review

Scraping for Journalists by Paul Bradshaw is a handy book for non-programmers who want to master some basic scraping techniques with online scraping tools. Admittedly, this book does not and cannot cover all the techniques and problems that arise in practical, scheduled business web extraction; instead, it guides ordinary readers through getting and refining some open data.