Tag: anti-scrape

How to find out that website is Distil protected?

Post author By admin
Post date November 6, 2020

Given: a webpage to scrape.
If you inspect the DOM tree of that page, you will find that quite a few tags contain the keyword dist. For example:

<link rel="shortcut icon" type="image/x-icon" href="/wcsstore/ColesResponsiveStorefrontAssetStore/dist/30e70cfc76bf73d384beffa80ba6cbee/img/favicon.ico">
<link rel="stylesheet" href="/wcsstore/ColesResponsiveStorefrontAssetStore/dist/30e70cfc76bf73d384beffa80ba6cbee/css/google/fonts-Source-Sans-Pro.css" type="text/css" media="screen">
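The check described above can be automated. Below is a minimal sketch, assuming the page's HTML is already available as a string; the names `DistAttributeFinder` and `find_dist_tags` are hypothetical and introduced only for this illustration. It lists every tag that has an attribute value with `dist` as a path segment. Treat a match as a hint rather than proof, since a path segment named `dist` can also appear for unrelated reasons.

```python
from html.parser import HTMLParser


class DistAttributeFinder(HTMLParser):
    """Collects (tag, attribute, value) triples whose value has the keyword as a path segment."""

    def __init__(self, keyword="dist"):
        super().__init__()
        self.keyword = keyword
        self.matches = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Attributes without a value (e.g. "disabled") arrive as None.
            if value and self.keyword in value.split("/"):
                self.matches.append((tag, name, value))


def find_dist_tags(html, keyword="dist"):
    """Return every tag/attribute pair in `html` whose value contains the keyword segment."""
    finder = DistAttributeFinder(keyword)
    finder.feed(html)
    finder.close()
    return finder.matches


# Sample markup: the first tag is taken from the example above, the second is made up.
sample = (
    '<link rel="shortcut icon" type="image/x-icon" '
    'href="/wcsstore/ColesResponsiveStorefrontAssetStore/dist/'
    '30e70cfc76bf73d384beffa80ba6cbee/img/favicon.ico">'
    '<a href="/about">About</a>'
)
matches = find_dist_tags(sample)
for tag, attr, value in matches:
    print(tag, attr, value)
```

Matching on whole path segments rather than on the bare substring avoids false hits on words such as "distance" or "district".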

Tags anti-scrape, scrape protection

Challenge

How Imperva protects against scraping bots

Post author By admin
Post date November 6, 2020

Imperva, which now includes the former Distil anti-bot management, is a service that provides many kinds of website protection. Its current services include the following:

Cloud Web Application Firewall (WAF)
Bot Protection service (formerly Distil Networks)
IP Reputation Intelligence
Content Delivery Network (CDN)
Attack Analytics solution (e.g. DDoS)

As for protection against scraping bots, we note the following.

Tags anti-scrape, scrape protection

Development

Bypass Distil

Distil scrape protection is a prominent one among modern anti-scrape techniques, so we want to share some tips on how to bypass it. If you are interested, please send an inquiry to the following email: igor[dot]savinkin[at]gmail[dot]com

Tags anti-scrape, scrape protection

Development

Scraping JavaScript protected content

Post author By admin
Post date December 27, 2019

Here we come to a new milestone: scraping JavaScript-driven (JS-rendered) websites.

Recently a friend of mine got stumped while trying to get the content of a website using the PHP simplehtmldom library. He kept failing and finally found out that the site was saturated with JavaScript code. The anti-scrape JavaScript insertions perform a tricky check to see whether the page is requested and processed by a real browser, and only if that is true do they render the rest of the page’s HTML code.
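To tell whether a failed scrape is caused by this kind of protection, one can look at what a plain HTTP fetch actually returned. The following is a rough sketch of such a heuristic, assuming the raw HTML is already in a string: if the document contains scripts but almost no visible text, the content is probably filled in by JavaScript. The names `ScriptWeightParser` and `looks_js_rendered` and the 200-character threshold are assumptions made for this illustration, not part of any library or of the original post.

```python
from html.parser import HTMLParser


class ScriptWeightParser(HTMLParser):
    """Counts script tags and measures visible text outside script/style blocks."""

    def __init__(self):
        super().__init__()
        self.script_count = 0
        self.text_chars = 0
        self._skip = None  # name of the script/style tag we are inside, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = tag
            if tag == "script":
                self.script_count += 1

    def handle_endtag(self, tag):
        if tag == self._skip:
            self._skip = None

    def handle_data(self, data):
        # Only text outside script/style counts as visible content.
        if self._skip is None:
            self.text_chars += len(data.strip())


def looks_js_rendered(html, min_text_chars=200):
    """Heuristic: scripts are present but the static HTML carries little visible text."""
    parser = ScriptWeightParser()
    parser.feed(html)
    parser.close()
    return parser.script_count > 0 and parser.text_chars < min_text_chars


# Made-up sample documents for illustration.
stub_page = (
    "<html><head><script>var ok = check();</script></head>"
    '<body><div id="app"></div></body></html>'
)
static_page = "<html><body><p>" + "word " * 100 + "</p></body></html>"

print(looks_js_rendered(stub_page))
print(looks_js_rendered(static_page))
```

A positive result suggests that a static parser will not be enough and that the page needs to be processed by a real or headless browser before its HTML can be extracted.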

Tags anti-scrape

SEO and Growth Hacking

Strategies on how to protect your data from cyber theft

Post author By admin
Post date May 23, 2019


Cyber-attacks are becoming a real threat to businesses both small and large. The damage they do to people’s lives is more severe than most presume. In 2019, hundreds of billions of dollars were lost this way, and the crime has yet to stop. As the threat landscape evolves, attacks are becoming more and more sophisticated. It has also become clear that even big companies cannot be 100% secure from such breaches. The real question is: if hackers manage to attack the big companies, how long would it take them to steal your data? The only way to handle this menace is to understand these basic security strategies and implement them.

Tags anti-scrape

Miscellaneous

Bypass distil network, the anti-scraper protection

Post author By admin
Post date March 27, 2019


For details on how to bypass distil-network, the anti-scraper protection, please contact us by email: igor [dot] savinkin [at] gmail [dot] com.

Tags anti-scrape, security

Development Guest posting

Web scraping: How to bypass anti-scrape techniques

Post author By admin
Post date March 22, 2019

Web scraping is a technique that enables quick, in-depth data retrieval. It can help people in all fields capture massive amounts of data and information from the internet.

Tags anti-scrape, Javascript, Octoparse

Uncategorized

My site is being scraped, how can I prevent being scraped?

Post author By admin
Post date June 2, 2015

As anyone who has spent any time in the scraping field will know, there are plenty of anti-scraping techniques on the market. And since I regularly get asked what the best way is to prevent someone from scraping a site, I thought I’d do a post rounding up some of the most popular methods. If you think I’ve missed any, please let me know in the comments below!

If you are interested in how to find out whether your site is being scraped, see this post: How to detect your site is being scraped?

Tags anti-scrape

Challenge Development

CloudFlare – a limited feature anti-content-duplicate tool

Post author By admin
Post date February 16, 2015

Here we come to the next anti-scrape tool, called CloudFlare, formerly ScrapeShield.


The CloudFlare app was developed by CloudFlare to guard a site’s content. Its features are limited in number, but it’s still an interesting tool to look at for anyone interested in web scraping.

Tags anti-scrape, CloudFlare

Challenge Review

BotDefender Analysis

Here I’d like you to get familiar with an online scraping protection service called BotDefender. It’s interesting both to know how to use it (in case you want to protect your data) and to understand how it works in case you ever come across it while collecting data.

Tags anti-scrape