Categories
Development

Selenium IDE and Web Scraping

Selenium is a web application testing framework that supports a wide variety of browsers and platforms, with bindings for languages including Java, .NET, Ruby, Python and others. In this post we touch on the basic structure of the framework and how it can be applied to web scraping.

Categories
Data Mining

The stat visualization that makes sense

As I was searching for data mining and data visualization tools, I came across the data visualization website Gapminder by Hans Rosling, a professor of Global Health at the Karolinska Institute, Sweden. The website presents over a century of statistical data as graphs, with the data sourced from the UN and other world organizations.

The professor has done extensive work with many data sources for this data visualizer, and his efforts are notable.

Categories
Web Scraping Software

TheWebMiner, a cloud scraping tool

If you need to quickly extract some data from a website and you lack the technical skills, TheWebMiner’s Get By Sample web tool is a solution for you. Get By Sample works as a cloud web scraper, so it can be used anywhere, on many devices, even tablets and smartphones.

Categories
Data Mining

Data Journalism Handbook Poster

The poster, composed by Liliana Bounegru and Lulu Pinney, briefly summarizes what is in the Data Journalism Handbook. This reference book shows how journalists can produce interesting news out of data gathered from the web.

Categories
SEO and Growth Hacking

OutWit: Scrape search results for SEO Audit

In this video, Dale Stokdyk explains how to scrape search engine results using OutWit Hub with a custom scraper.

Categories
SaaS

80legs Review – Crawler for rent in the sky

80legs offers a crawling service that allows users to (1) easily compose crawl jobs and (2) run those jobs in the cloud over a distributed computer network.

Mining the modern web for information requires a huge amount of processing power. How could a start-up or a small business do comprehensive data crawling without having to build the giant server farms used by major search engines?