In this post we share a new Java library for crawling and scraping LinkedIn company pages. Get business directories scraped!
If you are concerned about the legal side of scraping LinkedIn data, please refer to the following post: Linkedin lost in court to data analytic company that scrapes Linkedin’s public profiles info
The library offers the following two classes for scraping LinkedIn:
It works with either a cookie or an email/password login. The credentials are stored in the txtFilexExample/data.properties file.
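To illustrate the cookie-or-password choice, here is a minimal sketch of how such a properties file could be read with the standard `java.util.Properties` class. The key names `cookie`, `email` and `password`, as well as the `LinkedInCredentials` class itself, are our assumptions for the example; the actual keys used by the library may differ.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper (not part of the library): holds either a cookie or an email/password pair. */
final class LinkedInCredentials {
    final String cookie;   // null when email/password login is used
    final String email;    // null when cookie login is used
    final String password; // null when cookie login is used

    private LinkedInCredentials(String cookie, String email, String password) {
        this.cookie = cookie;
        this.email = email;
        this.password = password;
    }

    boolean usesCookie() {
        return cookie != null;
    }

    /** Reads the assumed keys; a non-empty cookie takes precedence over email/password. */
    static LinkedInCredentials load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String cookie = trimToNull(props.getProperty("cookie"));
        if (cookie != null) {
            return new LinkedInCredentials(cookie, null, null);
        }
        String email = trimToNull(props.getProperty("email"));
        String password = trimToNull(props.getProperty("password"));
        if (email == null || password == null) {
            throw new IllegalStateException("Provide either 'cookie' or both 'email' and 'password'");
        }
        return new LinkedInCredentials(null, email, password);
    }

    private static String trimToNull(String value) {
        if (value == null) {
            return null;
        }
        String trimmed = value.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    public static void main(String[] args) {
        // An in-memory stand-in for the properties file, so the sketch runs anywhere.
        String fileContent = "email=user@example.com\npassword=secret\n";
        LinkedInCredentials creds = load(new StringReader(fileContent));
        System.out.println(creds.usesCookie() ? "cookie login" : "email/password login");
    }
}
```

In real use you would pass a reader opened on the properties file instead of the `StringReader`. Giving the cookie precedence is a design choice of this sketch only.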
It gets data from several sources:
* linkedin.com
* crunchbase.com (company email and phone number)
* bing.com (longitude and latitude)
- The Java code drives a Selenium ChromeDriver instance.
- crunchbase.com is Selenium-proof, so for that site we use a simple scrape process instead of the browser. The jsoup library is also used to parse the fetched HTML.
- The classes read URLs/links from text files and write the output data to the output stream.
- To do: add support for proxies/proxy services.
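The "URLs in from a text file, results out to a stream" flow can be sketched in plain Java as follows. This is only an illustration under our own assumptions: `UrlBatchRunner`, its method names and the tab-separated output format are hypothetical, and the scraper is passed in as a function so that the sketch runs without Selenium or jsoup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical driver (not part of the library): one URL per input line, one result per output line. */
final class UrlBatchRunner {

    /** Reads one URL per line, trimming whitespace and skipping blank lines. */
    static List<String> readUrls(Reader source) {
        List<String> urls = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String url = line.trim();
                if (!url.isEmpty()) {
                    urls.add(url);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return urls;
    }

    /** Applies the scraper to each URL and prints "url<TAB>result"; a failure on one URL does not stop the batch. */
    static void run(List<String> urls, Function<String, String> scraper, PrintStream out) {
        for (String url : urls) {
            try {
                out.println(url + "\t" + scraper.apply(url));
            } catch (RuntimeException e) {
                out.println(url + "\tERROR: " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        // An in-memory stand-in for the input text file.
        String fileContent = "https://www.linkedin.com/company/example-one\n"
                + "\n"
                + "https://www.linkedin.com/company/example-two\n";
        List<String> urls = readUrls(new StringReader(fileContent));
        // Placeholder scraper: returns the last path segment instead of fetching anything.
        run(urls, url -> url.substring(url.lastIndexOf('/') + 1), System.out);
    }
}
```

To process a real file, pass a reader opened on it (for example via `Files.newBufferedReader`) and replace the placeholder lambda with a call into the scraping class.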
Download the library code from here.