We recently came across a powerful new scraping service, Data Collector by Bright Data. A live test and a thorough drill-down are coming soon. For now, we want to highlight the main features that impressed us most:
- Pre-made collectors
- Request a new collector or develop it yourself
- Unblocking tough sites
- Data delivery & integration
- Data retention
Hundreds of free pre-made agents for gathering data from top scrape targets
A Data Collector is, by nature, a scraping agent already developed for a specific task, so people with no coding experience can use it too. Take a look at the screenshot of the pre-made, free-to-use data collectors, e.g.:
Social media category:
Request a new collector or develop it yourself
Take a look at the IDE, where the page interaction code is separated from the page parsing code:
Note: the cost of ordering a collector (along with its maintenance) is USD $150.
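To illustrate the separation of page interaction from page parsing mentioned above, here is a minimal Python sketch. It is not the Bright Data IDE or its API: the function names `interact`, `parse`, `collect` and the stand-in `fake_fetch` are all hypothetical, and only the standard library is used. It simply shows why keeping the two stages apart is convenient.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Helper for the parsing stage: collects the text of the first <title> tag."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only start capturing if no title has been captured yet.
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self.title = (self.title or "") + data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def interact(fetch, url):
    """Interaction stage (hypothetical): load the page and return its raw HTML."""
    return fetch(url)


def parse(html):
    """Parsing stage (hypothetical): turn raw HTML into a structured record."""
    parser = TitleParser()
    parser.feed(html)
    return {"title": parser.title.strip() if parser.title else None}


def collect(fetch, url):
    """Run both stages in order; either one can change without touching the other."""
    return parse(interact(fetch, url))


def fake_fetch(url):
    # Hypothetical stand-in for a real browser or HTTP client; no network access.
    return "<html><head><title> Example profile </title></head><body></body></html>"


record = collect(fake_fetch, "https://example.com/profile")
print(record)
```

With this split, a change in how the page is reached (login, scrolling, pagination) touches only the interaction stage, while a change in the page layout touches only the parsing stage.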
Unblocking tough sites
How do you unblock tough sites such as business directories and Cloudflare-protected ones? Data Collector uses the huge residential and data center proxy network provided by Bright Data (formerly Luminati), so there is no need to pay for any extra proxy services.
Data delivery & integration
Bright Data offers various ways to deliver and integrate the extracted data:
- Amazon S3
- Google Cloud Storage
- Microsoft Azure Storage
- API download
Besides, data can be received not only (1) on job completion but also (2) in real time with a single request.
Scraped data is retained for 1 week only.
The service is priced quite affordably: the maximum cost is USD $5 per 1000 successfully collected pages.
The Data Collector service by Bright Data seems to meet today's web scraping challenges (scraping business directories, data integration) while keeping pricing moderate and providing a large set of pre-made scraping agents.
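As a rough budgeting aid, the quoted ceiling can be turned into an upper-bound estimate. The sketch below assumes the cost scales linearly with the number of successful pages and that there are no minimum charges or volume tiers; the post states neither, so treat the result as an approximation. The function name `max_cost_usd` is our own.

```python
# Ceiling quoted in this post: USD 5 per 1000 successful pages.
MAX_USD_PER_1000_PAGES = 5.0


def max_cost_usd(successful_pages: int) -> float:
    """Upper-bound cost estimate, assuming simple linear pricing (an assumption)."""
    if successful_pages < 0:
        raise ValueError("successful_pages must be non-negative")
    return successful_pages / 1000 * MAX_USD_PER_1000_PAGES


print(max_cost_usd(25_000))  # 125.0
```

Under these assumptions, collecting 25,000 successful pages would cost at most USD 125.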