While reviewing web scraping software and services, we came across an interesting cloud scraping service called Grepsr. Its own specialists extract the data that customers request, while the user can still control scrape scheduling and some other data extraction steps.
Author: admin
Scraping for Journalists Book Review
Scraping for Journalists by Paul Bradshaw is a handy book that helps non-programmers master some basic scraping techniques using online scraping tools. The book does not, and cannot, cover all the techniques and problems that arise in practical, scheduled business web extraction; instead, it guides ordinary users through getting and refining open data.
When we work with different types of data, it’s always good to have a tool-belt to make our labor easier. In the age of the “cloud,” it would be good to have all these tools online. Here are some online tools that may help you in your work:
Often, we need to test regexes with an online tool or try them on different regex engines. Here is an overview of online regex testers to help you choose one. A regex tester comparison table is also available.
When working with different scrapers in Python, we often need to run them detached from the main process and monitor their output in real time. Here’s how we do this:
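As a rough illustration of the idea (not necessarily the approach the full post takes), the sketch below starts a scraper as a child process with the standard subprocess module and reads its output line by line from a background thread, so the main process is never blocked. The names run_scraper, stream_output and demo_scraper are hypothetical, and the "scraper" is a one-line stand-in script rather than a real one.

```python
import subprocess
import sys
import threading


def stream_output(proc, sink):
    """Copy each line the child prints into `sink` as soon as it arrives."""
    for line in proc.stdout:
        sink.append(line.rstrip("\n"))
        print(line, end="")  # echo to our own console for live monitoring


def run_scraper(args):
    """Start a scraper as a child process and watch it from a background thread."""
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge errors into the same stream
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    lines = []
    watcher = threading.Thread(target=stream_output, args=(proc, lines), daemon=True)
    watcher.start()
    return proc, watcher, lines


# Hypothetical stand-in for a real scraper script; -u makes the child unbuffered
# so its lines reach the pipe immediately.
demo_scraper = [sys.executable, "-u", "-c", "for i in range(3): print('page', i)"]

proc, watcher, lines = run_scraper(demo_scraper)
# The main process is free to do other work here.
exit_code = proc.wait()
watcher.join()
```

The child must flush its output for the monitoring to be truly live; for a Python scraper, the -u flag (or flushing explicitly after each print) is one way to get that. Whether a thread, polling, or an async loop fits best depends on how many scrapers run at once.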
Here are some basic principles I follow when I do web scraping. All these principles come from my personal experience, and I hope they help others avoid many mistakes and difficulties.
Python, Eclipse, Windows
If you want to start programming in Python but don’t know where to begin, you may find this step-by-step tutorial useful. It leads you through installing Eclipse for Windows and then adding a Python development environment to it.
XPath in Examples
Here we’ll show how XPath works.
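To give a flavor of what such examples look like, here is a minimal sketch using Python's built-in xml.etree.ElementTree, which supports only a limited subset of XPath. The SAMPLE document and all names in it are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, used only for this illustration.
SAMPLE = """
<catalog>
  <book id="b1" lang="en"><title>Scraping Basics</title><price>10</price></book>
  <book id="b2" lang="de"><title>XPath Kurz</title><price>15</price></book>
  <book id="b3" lang="en"><title>Regex Notes</title><price>7</price></book>
</catalog>
"""

root = ET.fromstring(SAMPLE)

# ./book/title : every <title> that is a child of a <book> directly under the root
all_titles = [t.text for t in root.findall("./book/title")]

# .//book[@lang='en'] : books anywhere below the root whose lang attribute is "en"
english_ids = [b.get("id") for b in root.findall(".//book[@lang='en']")]

# ./book[2]/title : position predicates are 1-based, so this is the second book
second_title = root.find("./book[2]/title").text

# .//book[price='7'] : select a book by the text of one of its child elements
cheap_id = root.find(".//book[price='7']").get("id")

print(all_titles, english_ids, second_title, cheap_id)
```

Full XPath 1.0 features such as axes and functions are not available in ElementTree, so a third-party library would be needed for those.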
Regex in PHP
If you want to use regular expressions in your PHP program, the best way is to use the so-called preg functions (they wrap the Perl-Compatible Regular Expressions library, so they are sometimes called PCRE functions). There are other function sets, such as ereg and mb_ereg, but they are quite outdated, so in this article we’ll focus on the preg functions only.
Regex in Perl
In this post we summarize some basic features of regex in Perl. We present the basic regex operators and the special regex pattern modifiers. More details are in the following articles…