Categories
Development

5 Best XPath Cheat Sheets and Quick References

I always love having a good cheat sheet pinned to my corkboard when I’m working, and XPath is one of the areas where I refer to one often. If you’re looking for a good XPath cheat sheet, you will probably find something useful in this post.

Categories
Web Scraping Software

Knowledge Walls: manipulation with JSON, XML, CSV and more

Personally, I prefer using online tools for quick manipulation of different data formats like JSON, XML, CSV and so on. They’re platform-independent and always within reach (since I mainly work in a browser). After we published an article about the 7 best JSON viewers, I was told about Knowledge Walls, a similar service containing many tools for text data manipulation.

Categories
Web Scraping Software

A simple way to turn a website into JSON

Recently, while surfing the web, I stumbled upon a simple web scraping service named Web Scrape Master. It is a kind of RESTful web service that extracts data from a specified website and returns it to you in JSON format.

Categories
Web Scraping Software

Quick Scraping with Yahoo Pipes

Since we are talking about web scraping, it would be a pity not to mention Yahoo Pipes, an exciting service provided by Yahoo!. This tool provides users with an intuitive graphical interface to assist them in organizing their favorite feeds and webpages into a single stream of content.

Categories
Web Scraping Software

Using External Input Data in Off-the-shelf Web Scrapers

There is a question I’ve wanted to shed some light on for a long time: “What if I need to scrape several URLs based on data in some external database?”

Categories
Web Scraping Software

Visual Web Ripper: Using External Input Data Sources

Sometimes it is necessary to use external data sources to provide parameters for the scraping process. For example, you have a database with a bunch of ASINs and you need to scrape all product information for each one of them. As far as Visual Web Ripper is concerned, an input data source can be used to provide a list of input values to a data extraction project.
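
To make the idea of an input data source concrete, here is a minimal Python sketch of the general pattern: read a list of ASINs from external data and turn each one into a URL to scrape. It does not show how Visual Web Ripper does it internally. The CSV layout, the `asin` column name, the `urls_from_csv` helper and the `example.com` URL pattern are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical URL pattern; a real project would use whatever the target site expects.
URL_TEMPLATE = "https://www.example.com/product/{asin}"


def urls_from_csv(csv_text, column="asin"):
    """Build one URL per non-empty value found in the given CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:  # skip rows where the input value is missing
            urls.append(URL_TEMPLATE.format(asin=value))
    return urls


if __name__ == "__main__":
    # Made-up input standing in for an external database export.
    sample = "asin\nB000000001\nB000000002\n"
    for url in urls_from_csv(sample):
        print(url)
```

The same loop would work with rows coming from a database query instead of a CSV export; only the reading step changes.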

Categories
Development

XPath in Examples

Here we’ll show how XPath works. Let’s take the following XML as a lab rat.
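
As a minimal sketch of how XPath selects nodes, assuming a small made-up XML document (not the one used in the full post), Python's standard `xml.etree.ElementTree` module can evaluate a limited subset of XPath. The `catalog`/`book` element names and the `titles_in_category` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration only.
SAMPLE_XML = """
<catalog>
  <book id="b1" category="web"><title>Scraping Basics</title></book>
  <book id="b2" category="data"><title>XML in Practice</title></book>
  <book id="b3" category="web"><title>Selectors at Work</title></book>
</catalog>
"""


def titles_in_category(xml_text, category):
    """Return the titles of all <book> elements with the given category attribute."""
    root = ET.fromstring(xml_text.strip())
    # ".//book[@category='...']/title" means: any <book> descendant whose
    # category attribute matches, then its <title> child.
    path = f".//book[@category='{category}']/title"
    return [title.text for title in root.findall(path)]


if __name__ == "__main__":
    print(titles_in_category(SAMPLE_XML, "web"))
```

ElementTree covers only part of XPath 1.0 (no functions such as `contains()`, for example), so a fuller XPath engine would be needed for more complex expressions.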

Categories
Web Scraping Software

8+ Best CAPTCHA Solvers

In this post we want to share some decaptcha software and services that we have encountered in our web scraping experience.

Categories
Web Scraping Software

Scraping Amazon.com with Screen Scraper

Let’s look at how to use Screen Scraper to scrape Amazon products, given a list of ASINs in an external database.

Categories
Miscellaneous

Free Website Backup

For simple web scraping jobs I often prefer a PHP + MySQL bundle, putting the project right on the web and working online. But when you work online, a problem appears: how do you back up your work results?