Search: “dynamic content”

We found 41 results for your search.

Development

Scrapy to get dynamic business directory data thru API

In this post I want to share how one may scrape business directory data (real estate) using the Scrapy framework.

Post author By Igor Savinkin
Post date March 25, 2022

Challenge Development

How do I get pass dynamic “load more” btn?

Recently I got a question: How do I get past the dynamic “load more” button using a Python web scraper?

Post author By Igor Savinkin
Post date January 6, 2019

Challenge Development

CloudFlare – a limited feature anti-content-duplicate tool

Here we come to the next anti-scrape tool, CloudFlare, formerly ScrapeShield. The CloudFlare app has been developed by CloudFlare to guard a site’s content. Its features are limited in number, but it’s still an interesting tool to look at for anyone interested in web scraping.

Post author By Igor Savinkin
Post date February 16, 2015

Review Web Scraping Software

Web Content Extractor Review

Web Content Extractor is a visual, user-oriented tool that scrapes typical pages. Its simplicity makes for a quick start in data ripping.

Post author By Igor Savinkin
Post date April 18, 2012

Challenge Development

Python, Selenium for custom browser automation scraper

Recently we got a tricky website whose data is dynamic in nature. Still, we applied modern scraping tools to fetch the data. We developed an effective Python scraper using the Selenium library for browser automation. About the project: We were asked to have a look at a retailer website, and our task was to gather […]

Post author By Denis Soloviev
Post date April 10, 2023

Development

Puppeteer async scraper with browsers number to be tuned based on CPU capacity

Recently we got a tricky website with dynamic content to scrape. The data is loaded through XHRs into each part of the DOM (HTML markup). So the task was to develop an effective scraper that runs asynchronously while using reasonable CPU resources.

Post author By Igor Savinkin
Post date February 9, 2023

Uncategorized

Pros and Cons of using Selenium WebDriver for Website Scraping

Since Selenium WebDriver was created for browser automation, it can easily be used for scraping data from the web. In this post we will consider some advantages and drawbacks of using WebDriver for web scraping.

Post author By Igor Savinkin
Post date January 2, 2020

Development SaaS

Dexi Pipes: multi-threaded web scraping of site aggregators

Today I want to share my experience with Dexi Pipes. Pipes is a new kind of robot introduced by Dexi.io to integrate web data extraction and web data processing into a single seamless workflow. The main focus of the testing is to show how Dexi might leverage multi-threaded jobs for extraction of data from a […]

Post author By Igor Savinkin
Post date December 23, 2019

Web Scraping Software

Dexi.io – how to improve performance

Tags Dexi, HTTP, scraping tool, web scraping

Intro: Some may argue that extracting 3 records per minute is not fast enough for an automated scraper (see my last post on Dexi multi-threaded jobs). However, you should realize that Dexi extractor robots behave like a full-blown modern browser and fetch all the resources that crawled pages load (CSS, JS, fonts, etc.). In terms […]

Post author By Igor Savinkin
Post date June 8, 2017

Development

HTTP vs HTTPS

Tags HTTP

In this post we will cover the most vital facts and the pros and cons of HTTPS versus HTTP. Besides the security advantage, we will consider the main things that make a difference: caching, performance, virtual hosting, and others.

Post author By Igor Savinkin
Post date March 13, 2013