In this article I’d like to review a few well-known methods of protecting website content from automatic scraping. Each has its advantages and disadvantages, so you need to make your choice based on your particular situation. None of these methods is a silver bullet, and each has its own workarounds, which I will mention further on.
Free Online JSON Visual Editor
JSONMate is yet another free online JSON visual editor with a modern appearance that allows users to edit, query, and visualize data in JSON format (in case you don’t know, JSON is a human-readable data format commonly used to exchange data between a web application and a server). Let’s see what is unique about JSONMate.
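If you have never looked at JSON itself, here is a minimal sketch, not part of JSONMate, showing what such a document looks like when produced from C# with System.Text.Json; the `Article` class and its values are purely illustrative.

```csharp
using System;
using System.Text.Json;

// A small illustrative type; the class name and fields are made up for this example.
public class Article
{
    public string Title { get; set; }
    public string[] Tags { get; set; }
    public int Views { get; set; }
}

public static class JsonDemo
{
    public static void Main()
    {
        var article = new Article
        {
            Title = "Free Online JSON Visual Editor",
            Tags = new[] { "json", "tools" },
            Views = 42
        };

        // Serializes the object into a JSON string such as:
        // {"Title":"Free Online JSON Visual Editor","Tags":["json","tools"],"Views":42}
        string json = JsonSerializer.Serialize(article);
        Console.WriteLine(json);
    }
}
```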
Competitor Analysis
As strange as it may sound, competitor analysis is a significant part of the marketing process. Any company that hopes to become successful one day needs to look at the way competing businesses handle themselves. A competitive analysis is a document designed to showcase the strengths and weaknesses of other companies in an industry. A technical writer with marketing intelligence can create a strong analysis that allows a business to make good decisions.
It’s very common to use proxy servers for web data extraction. If you want to stay undetected while scraping a website, you have to change your IP address periodically; otherwise, unusual activity is easy to detect by observing a large number of requests coming from a single IP address. Visual Web Ripper has built-in support for proxy servers called the Private Proxy Switch.
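Visual Web Ripper handles the switching for you, but the underlying idea is easy to illustrate. Below is a minimal C# sketch, not tied to Visual Web Ripper, that rotates requests through a list of proxies with HttpClient; the proxy addresses and the target URL are placeholders.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class ProxyRotationDemo
{
    // Placeholder proxy endpoints -- substitute your own private proxies here.
    private static readonly string[] Proxies =
    {
        "http://proxy1.example.com:8080",
        "http://proxy2.example.com:8080",
        "http://proxy3.example.com:8080"
    };

    public static async Task Main()
    {
        for (int i = 0; i < 9; i++)
        {
            // Pick the next proxy in round-robin fashion so consecutive
            // requests come from different IP addresses.
            string proxyAddress = Proxies[i % Proxies.Length];

            var handler = new HttpClientHandler
            {
                Proxy = new WebProxy(proxyAddress),
                UseProxy = true
            };

            using (var client = new HttpClient(handler))
            {
                string html = await client.GetStringAsync("http://example.com/page");
                Console.WriteLine($"{proxyAddress}: fetched {html.Length} bytes");
            }
        }
    }
}
```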
Being the biggest scraper itself, Google doesn’t like it when somebody scrapes it, which makes the life of Google scrapers difficult.
In this post I offer several hints on how to scrape Google safely (if you have still decided to do it).
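The hints themselves are in the post; as a flavour of the general idea, here is a minimal C# sketch of two of the usual precautions: randomized delays between requests and a rotating User-Agent header. The query URL and the user-agent strings are generic placeholders, not anything Google-specific from the post.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class PoliteScraperDemo
{
    // A few example desktop user agents to rotate through (placeholders).
    private static readonly string[] UserAgents =
    {
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
        "Mozilla/5.0 (X11; Linux x86_64)"
    };

    public static async Task Main()
    {
        var random = new Random();
        string[] queries = { "web scraping", "data extraction", "selenium webdriver" };

        using (var client = new HttpClient())
        {
            foreach (string query in queries)
            {
                var request = new HttpRequestMessage(
                    HttpMethod.Get,
                    "https://www.google.com/search?q=" + Uri.EscapeDataString(query));

                // Send each request with a different User-Agent string.
                request.Headers.UserAgent.ParseAdd(UserAgents[random.Next(UserAgents.Length)]);

                var response = await client.SendAsync(request);
                Console.WriteLine($"{query}: HTTP {(int)response.StatusCode}");

                // Wait a random 5-15 seconds between requests to avoid a
                // machine-like, perfectly regular request pattern.
                await Task.Delay(TimeSpan.FromSeconds(random.Next(5, 16)));
            }
        }
    }
}
```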
The LinkedIn API doesn’t allow you to publish into groups unless you are their administrator. That was done to prevent spamming, but if you are a member of several groups on a similar topic and want to share some interesting information with all of them, you have to do it manually, group by group, which quickly becomes tedious. In this post I’ll show you a simple way to automate this process in C# using Selenium WebDriver.
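The post walks through the full solution; the outline below is only a rough sketch of the approach, assuming Selenium’s C# bindings and Firefox. The group URLs, the credentials, and especially the element locators (`username`, `password`, `.share-text`, `.share-button`) are hypothetical placeholders; LinkedIn’s real page structure is not reproduced here and changes over time.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

public static class LinkedInGroupPoster
{
    public static void Main()
    {
        // Placeholder URLs for the groups you belong to.
        string[] groupUrls =
        {
            "https://www.linkedin.com/groups/1111111",
            "https://www.linkedin.com/groups/2222222"
        };

        using (IWebDriver driver = new FirefoxDriver())
        {
            // Log in once; the element ids here are hypothetical.
            driver.Navigate().GoToUrl("https://www.linkedin.com/login");
            driver.FindElement(By.Id("username")).SendKeys("you@example.com");
            driver.FindElement(By.Id("password")).SendKeys("secret" + Keys.Enter);

            foreach (string url in groupUrls)
            {
                // Open each group page and post the same message into it.
                driver.Navigate().GoToUrl(url);
                driver.FindElement(By.CssSelector(".share-text"))
                      .SendKeys("Interesting article on web scraping: http://example.com");
                driver.FindElement(By.CssSelector(".share-button")).Click();
            }
        }
    }
}
```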
Choosing a provider is not an easy task; you always want to find something «cheap and cheerful». However, quite often it is hard to find the golden mean, and you have to choose between computing power, speed, and cost, not to mention additional features such as DNS servers, a control panel, etc. In this article I will present test results for several providers of various sizes, and I hope they will guide you through the decision-making process of choosing a hosting provider.
This is a guest post by Daniel Cave.
With the rise of social media sharing, collaboration, and an increasingly interested market for data, there are more and more people wanting to ‘play with data’ and learn using some basic free tools. So recently I’ve been trying to find a technically advanced and interesting combination of free tools for collecting and visualising web data that will allow enthusiasts and students to get those all-important initial quick and easy wins.
I have already written several articles on how to use Selenium WebDriver for web scraping, and all those examples were for Windows. But what if you want to run your WebDriver-based scraper on a headless Linux server, for example a Virtual Private Server with SSH-only access? Here I will show you how to do it in a few simple steps.
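The post itself describes the full setup; as one possible flavour of the result, here is a minimal C# sketch assuming a headless-capable Chrome and chromedriver are installed on the server. The post’s actual steps may well take a different route (for instance a virtual display such as Xvfb with Firefox), so treat this only as an illustration of what a browser driven without a screen looks like.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

public static class HeadlessScraperDemo
{
    public static void Main()
    {
        // Run Chrome without a visible window -- no X server is needed,
        // which is what you want on an SSH-only Linux box.
        var options = new ChromeOptions();
        options.AddArgument("--headless");
        options.AddArgument("--no-sandbox"); // often needed when running as root on a VPS or in a container

        using (IWebDriver driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl("http://example.com");
            Console.WriteLine(driver.Title); // prints "Example Domain" if everything is wired up
        }
    }
}
```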
Import•io is a big data cloud platform that has the ambitious goal of turning the web into a database. It was founded in March 2012, and a year later it received $1.3M in seed funding from Wellington Partners, Louis Monier, and Emmanuel Javal.