General Data Protection Regulation (GDPR), enforceable from 25 May 2018. The GDPR covers online user data privacy rules for electronic communication and data protection. The regulation includes modern messengers and communication services, e.g. Skype, Viber, Gmail, etc., that were not mentioned in the former EU e-communication directives.
“Privacy is guaranteed for content of communication as well as metadata (e.g. time of a call and location) which have a high privacy component and need to be anonymised or deleted if users did not give their consent, unless the data is needed for billing.”
See the main elements of the GDPR in the EU (wiki).
2 main updates
We’ll mention the 2 essential regulation updates (those most closely related to web scraping/data processing):
- Websites are no longer forced to request a user’s consent to store cookies in his/her browser – “no consent is needed for non-privacy intrusive cookies improving internet experience”.
- Identifying marketing calls – “People will have to agree before marketing messages are addressed to them by automated calling machines, SMS or e-mail.”
International (outside EU) data-exchange
The European Commission will engage in discussions on reaching “adequacy decisions” (allowing for the free flow of personal data to countries with “essentially equivalent” data protection rules to those in the EU)…
How will this influence web data scraping
Essentially, the new consolidated EU regulations will not change the rules of the game for web data scraping for the worse. So, if you gather openly accessible data (in compliance with websites’ ToS), you may still do it with any automatic tools/scripts.
In the future
Quote from GDPR: “Marketing callers will need to display their phone number or use a special pre-fix that indicates a marketing call”.
So, the identification requirement for marketing calls might be expanded in the future to include calls (requests) to websites made for mass information gathering, and thus force web scrapers and web crawlers to make only authorized, self-identifying queries to websites. E.g. a bot would have to identify itself as a bot and state its origin and legal basis.
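As a rough illustration of what a self-identifying query could look like today, here is a minimal Python sketch. It is only an assumption about how a bot might announce itself, not something any regulation prescribes: the bot name, the info URL and the contact address are hypothetical placeholders. It relies on the standard HTTP `User-Agent` and `From` request headers. There is no standard header for stating a legal basis, so that part is left out.

```python
from urllib.request import Request

# Hypothetical identity details; replace with your own bot's data.
BOT_NAME = "example-scraper"
BOT_VERSION = "1.0"
BOT_INFO_URL = "https://example.com/bot-info"   # page describing the bot and its purpose
BOT_CONTACT = "bot-admin@example.com"           # address the site owner can write to


def build_identified_request(url: str) -> Request:
    """Build a request that openly declares it comes from a bot."""
    # Common convention: name/version followed by a link to more information.
    user_agent = f"{BOT_NAME}/{BOT_VERSION} (+{BOT_INFO_URL})"
    headers = {
        "User-Agent": user_agent,  # says which bot is calling
        "From": BOT_CONTACT,       # standard HTTP header with a contact e-mail
    }
    return Request(url, headers=headers)


if __name__ == "__main__":
    # Only builds the request and prints its headers; nothing is sent.
    req = build_identified_request("https://example.com/products")
    for name, value in req.header_items():
        print(f"{name}: {value}")
```

The request object can then be passed to `urllib.request.urlopen` to actually fetch the page. Keeping the identity in one helper means every query the scraper makes carries the same declaration.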