Visual Web Ripper: Using External Input Data Sources

Sometimes it is necessary to use external data sources to provide parameters for the scraping process. For example, you might have a database of ASINs and need to scrape the product information for each one. In Visual Web Ripper, an input data source provides a list of input values to a data extraction project, and the project runs once for each row of input values.
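The "once per row" behaviour can be pictured with a short Python sketch. This is only an illustration of the idea, not Visual Web Ripper's own API: the `run_for_each_row` helper, the `run_project` callback, and the `asin` and `marketplace` column names are all hypothetical, and the ASIN values are placeholders.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List


def run_for_each_row(
    rows: Iterable[Dict[str, str]],
    run_project: Callable[[Dict[str, str]], object],
) -> List[object]:
    """Call the (hypothetical) project callback once per row of input values."""
    return [run_project(row) for row in rows]


# Placeholder input data; a real input source could be a database or a CSV file.
sample_csv = "asin,marketplace\nB000000001,US\nB000000002,UK\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Stand-in for a real extraction run: it only reports which row it received.
results = run_for_each_row(
    rows,
    lambda row: f"scraped {row['asin']} ({row['marketplace']})",
)
print(results)
```

Each row is passed as a dictionary, so a project that needs several parameters per run (for example, an ASIN plus a marketplace) receives them together.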

An input data source is normally used in one of these scenarios:

  • To provide a list of input values for a web form
  • To provide a list of start URLs
  • To provide input values for Fixed Value elements
  • To provide input values for scripts

Visual Web Ripper supports the following input data sources:

  • SQL Server Database
  • MySQL Database
  • OleDB Database
  • CSV File
  • Script (A script can be used to provide data from almost any data source)

To see it in action, you can download a sample project that uses an input CSV file of Amazon ASINs to generate Amazon start URLs and extract some product data. Place both the project file and the input CSV file in the default Visual Web Ripper project folder (My Documents\Visual Web Ripper\Projects).
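As a rough picture of what turning a CSV of ASINs into start URLs involves, here is a minimal Python sketch. It does not reproduce the sample project: the `ASIN` column name, the `start_urls_from_csv` helper, and the `https://www.amazon.com/dp/{asin}` URL pattern are assumptions made for illustration, and the ASIN values are placeholders.

```python
import csv
import io
from typing import List

# Assumed product URL pattern; the sample project may build its URLs differently.
URL_TEMPLATE = "https://www.amazon.com/dp/{asin}"


def start_urls_from_csv(csv_text: str, column: str = "ASIN") -> List[str]:
    """Build one start URL per non-empty value in the given CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = []
    for row in reader:
        # Missing or blank values are skipped rather than turned into bad URLs.
        asin = (row.get(column) or "").strip()
        if asin:
            urls.append(URL_TEMPLATE.format(asin=asin))
    return urls


# Placeholder ASINs, including a blank line and stray whitespace.
urls = start_urls_from_csv("ASIN\nB000000001\n\n B000000002 \n")
for url in urls:
    print(url)
```

Skipping blank values keeps a stray empty line in the input file from producing a start URL that points nowhere.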

For further information, see the manual topic that explains how to use an input data source to generate start URLs.

Cheers,
Mike
