Categories
Development

Selenium IDE and Web Scraping

Selenium is a web application testing framework that supports a wide variety of browsers and can be driven from Java, .NET, Ruby, Python and other languages. In this post we touch on the basic structure of the framework and how it can be applied to web scraping.

Categories
Development

Scraping in PHP with cURL

In this post, I’ll explain how to do a simple web page extraction in PHP using cURL, the ‘Client URL library’.

PHP's cURL extension is built on libcurl, a library that lets you connect to servers over many different protocols, including HTTP and HTTPS. Fetching data this way is more robust than a plain file_get_contents() call, because it gives you control over headers, cookies and error handling. If the cURL extension is not installed, you can read here for Windows or here for Linux.

Categories
Development

How to scrape an online dictionary using Python and the lxml library

When I needed to extract word definitions from a dictionary, I chose Python and the lxml library. In this tutorial, I’ll review the steps of scraping the Webster online dictionary using lxml in Python.

Categories
Development

Free online tools to work with BASE64, HASH, CRC, HEX, BIN, etc.

When we work with different types of data, it’s always good to have a tool belt to make our labor easier. In the age of the “cloud,” it would be good to have all these tools online. Here are some online tools that may help you in your work:

Categories
Development

Running a Python script detached and getting real-time output

When working with different scrapers in Python, we often need to run them detached from the main process and monitor their output in real time. Here’s how we do this:
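As a rough illustration of the idea, here is a minimal sketch that starts a scraper-like script as a separate child process and reads its output line by line as it is produced. It assumes that "detached" simply means "running in its own process"; the helper name `stream_child_output` and the tiny `child_code` script are made up for this example and stand in for a real scraper.

```python
import subprocess
import sys


def stream_child_output(code):
    """Run `code` in a separate Python process and yield its output lines as they arrive."""
    proc = subprocess.Popen(
        # -u turns off output buffering in the child, so lines show up immediately
        [sys.executable, "-u", "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge error output into the same stream
        text=True,
        bufsize=1,  # line-buffered reads on the parent side
    )
    for line in proc.stdout:
        yield line.rstrip("\n")
    proc.stdout.close()
    proc.wait()


# Hypothetical stand-in for a scraper: prints a line, then sleeps briefly.
child_code = (
    "import time\n"
    "for i in range(3):\n"
    "    print('tick', i)\n"
    "    time.sleep(0.1)\n"
)

lines = []
for line in stream_child_output(child_code):
    print("child says:", line)
    lines.append(line)
```

The `-u` flag matters here: without it the child may buffer its output when writing to a pipe, and the parent would see everything only at the end.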

Categories
Development Web Scraping Software

4 Best Practices of Web Scraping

Here are some basic principles I follow when I do web scraping. They all come from my personal experience, and I hope they help others avoid many mistakes and difficulties.

Categories
Development

Python, Eclipse, Windows

If you want to start programming in Python but don’t know where to start, you may find this step-by-step tutorial useful. It leads you through installing Eclipse for Windows and then adding a Python development environment to it.

Categories
Development

XPath in Examples

Here we’ll show how XPath works.
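To give a first taste, the sketch below runs a few XPath expressions with Python's standard `xml.etree.ElementTree` module. Note that ElementTree supports only a limited subset of XPath, and the sample `<library>` document is invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, used only for this illustration.
doc = ET.fromstring("""
<library>
  <book lang="en"><title>Dune</title><year>1965</year></book>
  <book lang="pl"><title>Solaris</title><year>1961</year></book>
  <book lang="en"><title>Neuromancer</title><year>1984</year></book>
</library>
""")

# Path steps: every <title> that is a child of a <book> under the root.
all_titles = [t.text for t in doc.findall("./book/title")]

# Attribute predicate: only books whose lang attribute equals "en".
english_titles = [t.text for t in doc.findall("./book[@lang='en']/title")]

# Positional predicate: XPath positions start at 1, so [2] is the second book.
second_title = doc.find("./book[2]/title").text

print(all_titles)
print(english_titles)
print(second_title)
```

For full XPath 1.0 support (axes, functions such as `contains()`, and so on), a library like lxml is the usual choice.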

Categories
Development

Regex in PHP

If you want to use regular expressions in your PHP program, the best way is to use the so-called preg functions (they wrap the Perl-Compatible Regular Expressions library, so they are sometimes called PCRE functions). Of course, there are other function sets such as ereg and mb_ereg, but they are quite outdated, and in this article we’ll focus on the preg functions only.