I often get questions about email crawling. The topic clearly interests people who want to scrape contact information from the web (direct marketers, for example), and we have previously mentioned GSA Email Spider as an off-the-shelf solution. In this article I want to demonstrate how easy it is to build a simple email crawler in Python. The crawler is simple, but you can learn a lot from the example, especially if you’re new to scraping in Python.
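To give a feel for what such a crawler involves, here is a minimal sketch that uses only the Python standard library. It is an illustration under stated assumptions, not the crawler from any particular project: the names `crawl_emails`, `fetch_html`, `LinkParser`, `EMAIL_RE` and the `max_pages` limit are all hypothetical, and the email pattern is deliberately loose. The idea is to start from one URL, follow only links that stay on the same host, and collect every string that looks like an email address.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen

# Assumption: a deliberately loose pattern. It will miss some valid
# addresses and accept some invalid ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher: download a page and decode it leniently."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_emails(start_url, fetch=fetch_html, max_pages=50):
    """Breadth-first crawl of a single host; returns the set of emails found."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    emails = set()
    pages = 0

    while queue and pages < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        pages += 1
        emails.update(EMAIL_RE.findall(html))

        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link, _ = urldefrag(urljoin(url, href))
            # Stay on the starting host and never queue a URL twice.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return emails
```

Calling `crawl_emails("https://example.com/")` would crawl up to 50 pages of that host and return a set of addresses. The `fetch` parameter is there so the page loader can be swapped out, for instance to add rate limiting or to test the crawl logic against canned HTML without touching the network. A real crawler should also respect `robots.txt` and the site's terms of use.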
In this post we share a simple Java email crawler that crawls an input host (website), searches it for email addresses, and stores them.
The script uses the jsoup library, and you can find the full project here.