Quick Scraping with Yahoo Pipes

Since we are talking about web scraping, it would be a pity not to mention Yahoo Pipes, an exciting service from Yahoo! that gives users an intuitive graphical interface for organizing their favorite feeds and webpages into a single stream of content. By pulling information from across the internet, Yahoo Pipes lets users receive all of the information they care about without the hassle of navigating between sites.

Launched in 2007, the tool has received years of feature-driven development that have made it a leading content organization service on the web. It takes only a few clicks to set up a new feed that aggregates content from your favorite sources and displays it in a single stream on your website or homepage. For dedicated users there is a library of modules that extend the functionality of Yahoo Pipes and allow extensive customization.

The library is organized into categories that make it easy to navigate, even for new users:

Sources

Sources are the Yahoo Pipes modules that get data from one or several sources on the web. One example is the XPath Fetch Page module, which applies an XPath expression to any web page.
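
To give a feel for what a source module like XPath Fetch Page does, here is a minimal Python sketch. It is not how Yahoo Pipes is implemented; the PAGE fragment and the fetch_items function are made up for illustration, and the standard library's ElementTree supports only a subset of XPath and needs well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment standing in for a fetched web page.
PAGE = """
<html>
  <body>
    <div class="post"><a href="/one">First post</a></div>
    <div class="post"><a href="/two">Second post</a></div>
    <div class="ad"><a href="/buy">Sponsored</a></div>
  </body>
</html>
"""

def fetch_items(markup, xpath):
    """Apply an XPath expression to the markup and return one item per matched link."""
    root = ET.fromstring(markup)
    return [{"title": a.text, "link": a.get("href")} for a in root.findall(xpath)]

# Select only the links inside <div class="post"> elements.
items = fetch_items(PAGE, ".//div[@class='post']/a")
print(items)
```

The XPath expression decides which parts of the page become items in the stream; here the sponsored link is left out because its container has a different class.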

User Inputs

These modules let you add user input to a pipe. The input can be text, a location, a URL, a number, or a date.

Operators

There are plenty of tools for manipulating the data flow in your pipe. The operators include filter, location extractor, regex, reverse, split, tail, union, web service, count, loop, rename, sort, sub-element, truncate, and unique.
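
A few of these operators are easy to picture as plain functions over a list of items. The sketch below imitates union, unique, rename, and count in Python; the function signatures and the sample feeds are assumptions made for illustration, not the actual Yahoo Pipes behavior in every detail.

```python
def union(*feeds):
    """Merge several feeds into one list of items."""
    return [item for feed in feeds for item in feed]

def unique(items, field):
    """Keep the first item seen for each distinct value of the given field."""
    seen = set()
    result = []
    for item in items:
        if item[field] not in seen:
            seen.add(item[field])
            result.append(item)
    return result

def rename(items, old, new):
    """Rename a field in every item."""
    return [{(new if key == old else key): value for key, value in item.items()}
            for item in items]

def count(items):
    """Return the number of items in the feed."""
    return len(items)

# Hypothetical sample feeds with one overlapping item.
feed_a = [{"title": "Scraping tips", "url": "/tips"},
          {"title": "XPath basics", "url": "/xpath"}]
feed_b = [{"title": "Scraping tips", "url": "/tips"},
          {"title": "Regex recipes", "url": "/regex"}]

merged = rename(unique(union(feed_a, feed_b), "url"), "url", "link")
total = count(merged)
print(total, merged)
```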

URL Builder

The URL Builder module is one of the most important Yahoo Pipes modules. It lets you construct a URL from parts. Some parts you type in; others you wire in from Text User Input modules.
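
The same idea, assembling a URL from a host, path parts, and query parameters, can be sketched with Python's standard urllib.parse module. The build_url helper and the example.com values are hypothetical; the search term stands in for a value wired in from a Text User Input module.

```python
from urllib.parse import quote, urlencode, urlunsplit

def build_url(host, path_parts, params, scheme="https"):
    """Assemble a URL from a host, a list of path segments, and query parameters."""
    path = "/" + "/".join(quote(part.strip("/")) for part in path_parts)
    return urlunsplit((scheme, host, path, urlencode(params), ""))

# In a pipe this value would be wired in from a Text User Input module.
search_term = "web scraping"

url = build_url("example.com", ["search", "feeds"], {"q": search_term, "page": 2})
print(url)  # https://example.com/search/feeds?q=web+scraping&page=2
```

Building the URL from parts rather than by string concatenation means the query values are escaped for you.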

String

The modules in this category combine or modify strings. They are string builder, string replace, term extractor, string regex, sub string, and translate.
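
Four of these modules map closely onto ordinary string handling, as the Python sketch below suggests. The function names mirror the module names but their signatures are assumptions; term extractor and translate are left out because they depend on external services.

```python
import re

def string_builder(*parts):
    """Concatenate several strings into one."""
    return "".join(parts)

def string_replace(text, old, new):
    """Replace every occurrence of a literal substring."""
    return text.replace(old, new)

def string_regex(text, pattern, replacement):
    """Replace every match of a regular expression."""
    return re.sub(pattern, replacement, text)

def sub_string(text, start, length):
    """Return `length` characters of the text beginning at index `start`."""
    return text[start:start + length]

title = string_builder("Yahoo", " ", "Pipes", "   2007")
title = string_regex(title, r"\s+", " ")          # collapse runs of whitespace
title = string_replace(title, "2007", "(2007)")   # "Yahoo Pipes (2007)"
short = sub_string(title, 0, 11)                  # "Yahoo Pipes"
print(title, "|", short)
```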

Date

There are only two modules in this category: date formatter and date builder. The date builder converts text to dates, while the date formatter takes dates and renders them in the desired format.
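
The division of labor between the two modules can be sketched with Python's datetime module. This is a simplified illustration: the function signatures are assumptions, and this date_builder only understands one fixed input format.

```python
from datetime import datetime

def date_builder(text, input_format="%Y-%m-%d"):
    """Convert text into a date object (only one fixed format in this sketch)."""
    return datetime.strptime(text, input_format)

def date_formatter(value, output_format):
    """Render a date object as text in the requested format."""
    return value.strftime(output_format)

built = date_builder("2007-02-07")
formatted = date_formatter(built, "%d/%m/%Y")
print(formatted)  # 07/02/2007
```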

Location

The module in this category converts a description of a place into geographical data. It can recognize addresses, zip codes, airport codes, city/country names, and U.S. city/state names.

Number

The module in this category performs simple mathematical operations on the numbers passed into it and outputs the result. The operations include addition, subtraction, multiplication, division, modulo, and powers.
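
In code, such a module amounts to picking an operation by name and applying it to two numbers. The sketch below uses Python's operator module; the simple_math name and the operation labels are chosen here for illustration.

```python
import operator

# Map each supported operation name to the function that implements it.
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
    "modulo": operator.mod,
    "power": operator.pow,
}

def simple_math(value, operation, operand):
    """Apply the named operation to value and operand and return the result."""
    return OPERATIONS[operation](value, operand)

print(simple_math(9, "divide", 2))   # 4.5
print(simple_math(7, "modulo", 3))   # 1
print(simple_math(2, "power", 10))   # 1024
```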

Putting all things together

All you need to do is choose the proper modules and connect them into a pipe.
You can also browse pipes made by other users. You will probably find that many tasks have already been done for you.
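
Conceptually, wiring modules into a pipe is function composition: each module receives the output of the one before it. The Python sketch below illustrates the idea under that assumption; make_pipe, the three module factories, and the sample feed are all hypothetical.

```python
from functools import reduce

def make_pipe(*modules):
    """Wire modules together so each one receives the previous module's output."""
    def run(items):
        return reduce(lambda data, module: module(data), modules, items)
    return run

def keyword_filter(keyword):
    """Keep only items whose title contains the keyword (case-insensitive)."""
    return lambda items: [i for i in items if keyword.lower() in i["title"].lower()]

def sort_by(field):
    """Sort items in ascending order of the given field."""
    return lambda items: sorted(items, key=lambda i: i[field])

def truncate(count):
    """Keep only the first `count` items."""
    return lambda items: items[:count]

# Hypothetical feed items; ISO dates sort correctly as plain strings.
feed = [
    {"title": "Scraping with XPath", "published": "2014-03-02"},
    {"title": "Gardening basics", "published": "2014-01-15"},
    {"title": "Scraping feeds", "published": "2014-02-20"},
]

user_keyword = "scraping"  # stands in for a Text User Input module
my_pipe = make_pipe(keyword_filter(user_keyword), sort_by("published"), truncate(1))
output = my_pipe(feed)
print(output)
```

Swapping, adding, or removing a module changes only the argument list of make_pipe, which is roughly what rewiring boxes in the graphical editor does.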
