Here I’d like you to get familiar with an online scraping protection service called BotDefender. It’s worth knowing both how to use it (in case you want to protect your data) and how it works (in case you ever come across it while collecting data).
OutWit Hub is a software tool that provides simple data extraction without requiring any programming skills or advanced technical knowledge. What impressed me about OutWit Hub is its general approach to data gathering: harvest everything (links, text, images, etc.) and then let the user choose what is needed (sifting with scrapers). The program can follow links from page to page, so it works well when scraping has to follow chains of linked pages. UPDATE: OutWit Hub 4.0 is released!
Are you thinking of protecting your website content from theft and illegal scraping? Do you suspect that some ‘innocent bots’ are continually visiting your web pages to retrieve data? This is where anti-scraping bot software and services come in. In this post we briefly review a new anti-scraping bot service called Distil.
Scraping for Journalists by Paul Bradshaw is a handy book that helps non-programmers master some basic scraping techniques with online scraping tools. Of course, this book does not and cannot cover all the techniques and problems that arise in practical, scheduled web extraction for business; instead, it guides ordinary readers through getting and refining some open data.
Screen Scraper is a classic scraping tool for all kinds of data scraping, extracting and packing. However, it takes time to master properly.