In this post we share a new Java library that helps crawl and scrape LinkedIn company pages. Get business directories scraped!
The library offers two useful classes for scraping LinkedIn.
It works either through a cookie or through an email/password pair. The credentials are stored in the txtFilexExample/data.properties file.
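As an illustration of the cookie-or-credentials choice, here is a minimal sketch of reading such a properties file with the standard `java.util.Properties` class. The key names `cookie`, `email` and `password`, and the class `CredentialsSketch` itself, are assumptions made for this example; the library's actual keys and class names may differ.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not part of the library: picks a login mode from a properties file.
class CredentialsSketch {

    /** Holds whichever login data the properties file provides. */
    static final class Login {
        final String cookie;
        final String email;
        final String password;

        Login(String cookie, String email, String password) {
            this.cookie = cookie;
            this.email = email;
            this.password = password;
        }

        /** True when a non-empty cookie was supplied, so email/password is not needed. */
        boolean usesCookie() {
            return cookie != null && !cookie.isEmpty();
        }
    }

    /** Builds a Login from already loaded properties; the key names are assumed. */
    static Login fromProperties(Properties props) {
        String cookie = props.getProperty("cookie", "").trim();
        String email = props.getProperty("email", "").trim();
        String password = props.getProperty("password", "");
        if (cookie.isEmpty() && (email.isEmpty() || password.isEmpty())) {
            throw new IllegalArgumentException(
                    "Provide either a cookie or an email/password pair");
        }
        return new Login(cookie, email, password);
    }

    /** Loads the properties from any Reader, e.g. a FileReader on data.properties. */
    static Login load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return fromProperties(props);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the contents of a data.properties file.
        Login login = load(new StringReader("email=user@example.com\npassword=secret\n"));
        System.out.println(login.usesCookie() ? "cookie login" : "email/password login");
    }
}
```

In this sketch, preferring the cookie when both are present avoids sending the password at all. That is a design choice of the example, not necessarily what the library does.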
It gets data from several sources:
- linkedin.com
- crunchbase.com (company email and phone number)
- bing.com (longitude and latitude)
- The Java code uses a Selenium ChromeDriver instance.
- crunchbase.com is Selenium-proof, so we use a simple scrape process for it instead. The JSoup library is also used to parse the fetched HTML.
- The classes read URLs from text files and write the output data to the output stream.
- To do: add support for proxies and proxy services.
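The "URLs in from a text file, data out to a stream" flow from the list above can be sketched with the standard library alone. Everything here is hypothetical and only illustrates the shape of the flow: the class `UrlListSketch`, the one-URL-per-line format, the `#` comment convention, and the `scrape` placeholder are assumptions, and no real scraping is performed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the library's classes: read URLs line by line, write results out.
class UrlListSketch {

    /** Reads one URL per line, skipping blank lines and lines starting with '#'. */
    static List<String> readUrls(BufferedReader reader) {
        List<String> urls = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                    continue;
                }
                urls.add(trimmed);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return urls;
    }

    /** Placeholder for the real scraping step; it only tags the URL. */
    static String scrape(String url) {
        return url + "\t(not scraped in this sketch)";
    }

    /** Writes one output line per input URL to the given stream. */
    static void process(BufferedReader in, PrintStream out) {
        for (String url : readUrls(in)) {
            out.println(scrape(url));
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for a text file of company URLs.
        String sample = "https://www.linkedin.com/company/example-one\n"
                + "\n"
                + "# a comment line\n"
                + "https://www.linkedin.com/company/example-two\n";
        process(new BufferedReader(new StringReader(sample)), System.out);
    }
}
```

For a real file, the `StringReader` could be swapped for `java.nio.file.Files.newBufferedReader(path)`. Passing a `PrintStream` rather than hard-coding `System.out` lets the same code write to the console or to a file.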
Download the library code from here.