Categories
Development SEO and Growth Hacking

A simple LinkedIn Group Submitter

The LinkedIn API doesn’t allow you to publish to groups unless you are their administrator. This restriction is meant to prevent spam, but if you are a member of several groups on a similar topic and want to share something interesting with all of them, you have to post manually, group by group, which quickly becomes tedious. In this post I’ll show you a simple way to automate this process in C# using Selenium WebDriver.

Categories
Miscellaneous

An Independent Test of 7 Hosting Providers

Choosing a hosting provider is not an easy task: you always want to find something «cheap and cheerful». However, it is often hard to find a golden mean, and you have to choose between computing power, speed, and cost, not to mention additional features such as DNS servers, a control panel, etc. In this article I will present test results for several providers of various sizes, and I hope they will guide you in choosing a hosting provider.

Categories
Data Mining Web Scraping Software

Easy Data Visualisation with Silk.co

This post is outdated. The silk.co service is no longer available.

This is a guest post by Daniel Cave.

With the rise of social media sharing, collaboration and an increasingly interested market for data, more and more people want to ‘play with data’ and learn using some basic free tools. So recently I’ve been trying to find a technically advanced and interesting combination of free tools to collect and visualise web data that will allow enthusiasts and students to get those all-important initial quick and easy wins.

Categories
Development

Tutorial: How to use Headless Firefox for Scraping in Linux

I have already written several articles on how to use Selenium WebDriver for web scraping, and all those examples were for Windows. But what if you want to run your WebDriver-based scraper on a headless Linux server, for example on a Virtual Private Server with SSH-only access? Here I will show you how to do it in several simple steps.

Categories
Miscellaneous

What is import•io from the user’s point of view?

Import•io is a big data cloud platform that has the ambitious goal of turning the web into a database. It was founded in March 2012, and a year later it received $1.3M in seed funding from Wellington Partners, Louis Monier and Emmanuel Javal.

Categories
Development

An Example of Captcha Solver in Java

Recently I published an article on how to solve CAPTCHAs in C# using the DeathByCaptcha service, and I promised to offer you an example in other languages as well. In this post I’ll offer a Java project that does the same thing.

Categories
Development

How to improve your scraper with “Bypass CAPTCHA”

If you develop a web scraping application, it would be really nice to upgrade it with automatic CAPTCHA recognition. The “Bypass CAPTCHA” service allows you to do this very easily, since it is designed for use in third-party software. In this post I’ll show you how easy it is to extend your scraper using this service.

Categories
Development

How to Write a Captcha Solver that uses DeathByCaptcha service

Let’s look at a practical example of how to solve CAPTCHAs using the DeathByCaptcha service. This example is written in C#, but you can get it in Java as well.

Categories
Web Scraping Software

Captcha Breaker Review

GSA Captcha Breaker is CAPTCHA-solving software. It uses Optical Character Recognition algorithms for CAPTCHA decoding. Being a standalone program, it works independently of any online CAPTCHA recognition services (such as DeathByCaptcha, BypassCaptcha, etc.). This means that once you have paid for the program you no longer need to pay for each recognition, which saves money when you need to recognize a huge number of CAPTCHAs.

Categories
Web Scraping Software

How to extract emails and phones with GSA Email Spider

The task of email extraction is quite popular in the sphere of web scraping. Here I want to present a review of GSA Email Spider, a useful program designed for collecting emails, phone numbers and fax numbers from the web.