Question:
The Python requests library is a useful library with many advantages over similar libraries. However, when I tried to retrieve the Wikipedia page, requests.get() retrieved it only partially:
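The question's original snippet is not reproduced in this excerpt; a minimal sketch of the kind of call involved, assuming an arbitrary Wikipedia article URL, would be:

```python
import requests

# Hypothetical target; any Wikipedia article URL illustrates the issue.
url = "https://en.wikipedia.org/wiki/Web_scraping"

# Setting an explicit User-Agent is a common first step, since some sites
# serve reduced or truncated responses to default library agents.
response = requests.get(url, headers={"User-Agent": "Mozilla/5.0 (demo script)"})

print(response.status_code)  # 200 on success
print(len(response.text))    # size of the HTML actually received
```

Comparing len(response.text) against the page size reported by the browser is a quick way to confirm that the response really is truncated.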
Recently I decided to work with pythonanywhere.com for running Python scripts against JS-stuffed websites.
Originally I tried to leverage the dryscrape library, but I failed, and a helpful support engineer explained: “…unfortunately dryscrape depends on WebKit, and WebKit doesn’t work with our virtualisation system.”
Often, for the purpose of scraping, one needs to find certain elements’ XPath on a webpage. How can one do that with the browser’s Web developer tools, aka Web inspector? A picture is worth a thousand words.
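Once an element’s XPath is copied from the inspector (right-click the node, then Copy → Copy XPath), it can be plugged straight into a parser. A minimal sketch with Python’s lxml, using a hypothetical URL and XPath expression:

```python
import requests
from lxml import html

# Hypothetical page; paste the XPath copied from the Web inspector below.
page = requests.get("https://example.com")
tree = html.fromstring(page.content)

# e.g. an expression copied for a heading element
elements = tree.xpath("/html/body/div[1]/h1")
for el in elements:
    print(el.text_content())
```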
We want to share with our readers a new testing ground for reCaptcha v2.0. Since we do R&D on how to solve reCaptcha with web scripts and captcha-breaking services, it’s vital to have a reCaptcha testing ground.
This testing ground is designed according to the How to insert and configure reCaptcha post.
In this post we want to show you the code for an automatic connection to the 2captcha service for solving Google reCaptcha v2.0. Not long ago, Google drastically complicated its user-behavior reCaptcha (v2.0), and this online service provides a method for solving it.
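The full code is in the post itself; as a rough sketch of the typical 2captcha flow (submit the sitekey, then poll for the solved token), with the API key, sitekey, and page URL as placeholders:

```python
import time
import requests

API_KEY = "YOUR_2CAPTCHA_KEY"           # placeholder: your 2captcha account key
SITE_KEY = "TARGET_GOOGLE_SITEKEY"      # placeholder: sitekey found in the page source
PAGE_URL = "https://example.com/form"   # placeholder: page displaying the reCaptcha

# 1. Submit the reCaptcha parameters to 2captcha
resp = requests.post("http://2captcha.com/in.php", data={
    "key": API_KEY,
    "method": "userrecaptcha",
    "googlekey": SITE_KEY,
    "pageurl": PAGE_URL,
})
captcha_id = resp.text.split("|")[1]    # the reply looks like "OK|123456789"

# 2. Poll until a worker has solved it
while True:
    time.sleep(10)
    answer = requests.get("http://2captcha.com/res.php", params={
        "key": API_KEY, "action": "get", "id": captcha_id,
    }).text
    if answer != "CAPCHA_NOT_READY":
        break

token = answer.split("|")[1]            # the g-recaptcha-response token
print(token)                            # submit it along with the target form
```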
Recently I’ve received a request on how to sum the total duration of the YouTube videos in a search result. I’ve made a simple JS iterator that fetches the hours/minutes/seconds from the browser’s HTML and sums them up.
See the code below:
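The JS snippet itself is not reproduced in this excerpt; as a sketch of the same summing logic in Python, assuming duration strings like "3:27" or "1:02:45" have already been scraped from the results page:

```python
# Hypothetical durations scraped from a search results page
durations = ["3:27", "12:05", "1:02:45"]

total_seconds = 0
for d in durations:
    seconds = 0
    for part in d.split(":"):            # works for m:ss and h:mm:ss alike
        seconds = seconds * 60 + int(part)
    total_seconds += seconds

hours, rem = divmod(total_seconds, 3600)
minutes, seconds = divmod(rem, 60)
print(f"Total: {hours}h {minutes}m {seconds}s")  # Total: 1h 18m 17s
```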
Recently I was notified that the Kimono service is shutting down, as the Kimono team is joining another project. The many data hunters who were using this prominent free API service are now searching for a good alternative.
Today I want to share with you how to make a web page scroll down automatically. This is useful when dealing with social network pages, business directories (e.g. yellow pages), and other resources that keep loading content as you scroll.
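One common approach is to drive a real browser with Selenium and keep executing a scroll script until the page height stops growing. A minimal sketch, assuming a hypothetical auto-loading page:

```python
import time
from selenium import webdriver

driver = webdriver.Firefox()             # or webdriver.Chrome()
driver.get("https://example.com/feed")   # hypothetical auto-loading page

last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll to the bottom, then give new content time to load
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:        # nothing new appeared; we are done
        break
    last_height = new_height

driver.quit()
```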
Recently I was challenged to make a script that would authenticate through a bot-proof login form and redirect to a logged-in page.
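As a rough sketch of one way to approach this, using a requests.Session so that cookies set during the login POST carry over to the protected page (the URLs and form field names here are hypothetical; inspect the real form to find them):

```python
import requests

session = requests.Session()
login_url = "https://example.com/login"        # hypothetical endpoint
account_url = "https://example.com/account"    # hypothetical logged-in page

# Fetch the form page first: many anti-bot forms set cookies or hidden
# tokens there that must accompany the login POST.
session.get(login_url)

payload = {
    "username": "user",     # hypothetical field names
    "password": "secret",
}
resp = session.post(login_url, data=payload)

# The session now carries the auth cookies, so the logged-in page
# that the site redirects to can be fetched directly.
page = session.get(account_url)
print(page.status_code)
```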