Categories
Development

Traverse and count youtube videos total play length

Recently I received a request on how to sum the total hours of the YouTube videos in a search result. I made a simple JS iterator that fetches the hours, minutes and seconds from the page’s HTML in the browser and sums them up.
See the code below:
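
The post's own iterator is written in JavaScript and runs in the browser; it is not reproduced here. As a rough illustration of the summing step only, here is a minimal Python sketch. It assumes the duration labels have already been collected as strings such as "4:15" or "1:02:03"; the function names and sample values are hypothetical.

```python
def duration_to_seconds(label: str) -> int:
    """Convert a label like 'h:mm:ss', 'm:ss' or 'ss' to seconds."""
    seconds = 0
    for part in label.strip().split(":"):
        # Each further field is 60 times finer than the previous one.
        seconds = seconds * 60 + int(part)
    return seconds


def total_play_length(labels) -> str:
    """Sum all duration labels and format the total as h:mm:ss."""
    total = sum(duration_to_seconds(label) for label in labels)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Hypothetical labels, as they might appear next to video thumbnails.
sample_labels = ["4:15", "1:02:03", "59"]
print(total_play_length(sample_labels))  # 1:07:17
```

Parsing from left to right and multiplying by 60 at each step handles labels with or without an hours field in the same loop.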

Categories
Development Web Scraping Software

The worthy alternative to dissolving scraping Kimono API

Recently I was notified that the Kimono service is shutting down because the Kimono team is joining another project. Many data hunters who used this prominent free API service are now searching for a good alternative.

Categories
Development

Make web page to auto scroll down

Today I want to share how to make a web page scroll down automatically. This is useful when dealing with social network pages, business directories (e.g. yellow pages) and other auto-loading resources.

Categories
Development

Auth in bot-proof login form with PHP Curl and JavaScript

Recently I was challenged to make a script that would authenticate through a bot-proof login form and redirect to a logged-in page.

Categories
Data Science Development Guest posting

Audio Captcha Solving Algorithm for XBox

I want to share how I built an audio captcha recognizer. It was designed to solve captchas at xbox.com back in 2012.

Categories
Development

Web scraping with JavaScript

Is it possible to scrape an HTML page with JavaScript from inside of a web browser?

To be perfectly honest, I wasn’t sure, so I decided to try it out.

Full disclaimer here, I didn’t actually succeed. However, it was a great learning experience for me and I think you guys could benefit from seeing what I did and where I went wrong. Who knows, maybe you can take what I’ve done and figure it out for yourself!

Categories
Development

How to parse messy encoded HTML

Let’s suppose you want to extract a price with a currency sign from a web page (e.g. £220.00), but its HTML code is this:

<div>cost: &#163;220.00</div>

which is obviously encoded HTML.
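
One way to handle this, sketched here in Python as an assumption rather than as the post's own method, is to decode the character references with the standard library's `html.unescape` and then pull out the price with a regular expression. The pattern and the set of currency signs are illustrative only.

```python
import html
import re

raw = "<div>cost: &#163;220.00</div>"

# Turn character references such as &#163; into real characters.
decoded = html.unescape(raw)

# Illustrative pattern: a currency sign followed by digits and optional pence.
match = re.search(r"[£$€]\d+(?:\.\d{2})?", decoded)
price = match.group(0) if match else None
print(price)  # £220.00
```

Decoding before matching keeps the pattern simple, since it only has to deal with the real currency characters and not their encoded forms.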

Categories
Development Web Scraping Software

Dexi.io REST API in php (example)

In this post, I’d like to demonstrate how to leverage the Dexi.io (CloudScrape) API along with its PHP client library (also available in Ruby and C#).

Categories
Development

Extract browser’s Local Storage with Python

Some of you may be wondering whether it’s possible to extract a web browser’s local storage by web scraping.

Categories
Development

Solve ReCaptcha with Selenium (python)

I’ve already written about how the new No CAPTCHA ReCaptcha works, and even had some success breaking it with iMacros browser automation. But the latest scraping tools are, for the most part, driven by Python, so now I want to try the same experiment with Selenium + Python.