How to extract emails and phones with GSA Email Spider

Email extraction is a popular task in web scraping. Here I review GSA Email Spider, a useful program designed for collecting emails, phone numbers and fax numbers from the web.

Some useful features of Email Spider

  1. Extracts emails starting from a URL as well as from search results for a given keyword
  2. Phone and fax numbers are collected too
  3. Automated email sender
  4. Harvests emails with the help of search engines (300+ included)
  5. Supports HTTPS websites
  6. Supports SSL-only email providers (like Google Mail)
  7. Allows using a proxy in the crawling process
  8. Can send emails directly using an internal SMTP server
  9. Analyzes JavaScript code to find hidden email addresses
  10. Can cheat anti-spider protection (e.g. by using a random user agent string)
  11. Collects emails with related extra information (e.g. addresses)
  12. Has many filters for conditional extraction (like specifying keywords or excluding some domain names)
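To make the core idea concrete, here is a minimal sketch of pattern-based contact extraction from page text. It is not how GSA Email Spider works internally (the program is closed source); the regular expressions, the `deobfuscate` helper and the sample HTML are all assumptions made for illustration, and the patterns are deliberately simple.

```python
import re

# Deliberately simple patterns (assumptions for this sketch);
# real-world addresses and phone formats vary much more widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def deobfuscate(text):
    """Undo the common 'name [at] host [dot] com' style of hiding addresses."""
    text = re.sub(r"\s*[\[(]\s*at\s*[\])]\s*", "@", text, flags=re.IGNORECASE)
    text = re.sub(r"\s*[\[(]\s*dot\s*[\])]\s*", ".", text, flags=re.IGNORECASE)
    return text


def extract_contacts(page_text):
    """Return (emails, phones) found in the text, de-duplicated and sorted."""
    cleaned = deobfuscate(page_text)
    emails = sorted({m.lower() for m in EMAIL_RE.findall(cleaned)})
    phones = sorted({m.strip() for m in PHONE_RE.findall(cleaned)})
    return emails, phones


# Hypothetical page fragment used only to exercise the functions.
sample = """
<p>Sales: sales@example.com, support: help [at] example [dot] org</p>
<p>Phone: +1 (555) 010-9999</p>
"""
emails, phones = extract_contacts(sample)
print(emails)
print(phones)
```

A real crawler would also have to fetch pages, follow links and handle addresses assembled by JavaScript, none of which this sketch attempts.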

How it works

The program has a simple dialog-based interface. First, as I mentioned earlier, you choose between starting with a keyword or with a URL. Then you can fine-tune the extraction process with dozens of settings in the Options tab:

email_setup

For example, to narrow your email search you can set up an additional filter that specifies which emails to scrape:

email_filter
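The kind of conditional filtering described above (required keywords, excluded domain names) can be sketched in a few lines. The function name `keep_email`, its parameters and the sample addresses are hypothetical; they only illustrate the logic and do not mirror the program's actual filter options.

```python
def keep_email(email, required_keywords=(), excluded_domains=()):
    """Decide whether an address passes a keyword/domain filter."""
    email = email.lower()
    local, _, domain = email.rpartition("@")
    if not local or not domain:
        return False  # not a usable address

    # Reject an excluded domain itself and any of its subdomains.
    for blocked in excluded_domains:
        blocked = blocked.lower()
        if domain == blocked or domain.endswith("." + blocked):
            return False

    # With no keywords configured, everything that is left passes.
    if not required_keywords:
        return True
    return any(k.lower() in email for k in required_keywords)


# Hypothetical extraction results.
found = ["info@example.com", "sales@shop.example.org", "bob@mail.example.net"]
kept = [
    e for e in found
    if keep_email(e, required_keywords=["sales", "info"],
                  excluded_domains=["example.org"])
]
print(kept)
```

Here the second address is dropped because its domain is a subdomain of an excluded one, and the third because it matches none of the keywords.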

After everything is set up, press the Start button to begin the extraction. When I ran the demo version I used the keywords “php”, “scrape” and “cookie”, and the results were as follows:

  • extraction time with 1000 results per search engine was approx. 28 hours
  • 227,555 URLs were searched
  • 49,071 emails & phones were gathered

email_result
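To put these figures in perspective, the short calculation below derives the crawl rate and the yield per URL. It uses only the three numbers reported above; since the run time is approximate, the derived rates are rough estimates as well.

```python
# Figures from the demo run described above (run time is approximate).
hours = 28
urls_searched = 227_555
contacts_found = 49_071  # emails and phones combined

urls_per_hour = urls_searched / hours
urls_per_second = urls_searched / (hours * 3600)
contacts_per_100_urls = 100 * contacts_found / urls_searched

print(f"URLs per hour:         {urls_per_hour:.0f}")
print(f"URLs per second:       {urls_per_second:.2f}")
print(f"Contacts per 100 URLs: {contacts_per_100_urls:.1f}")
```

In other words, the spider processed a little over two URLs per second and gathered roughly one contact for every five URLs it visited.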

Though the demo version is limited to only 1000 search results per search engine, I was still impressed with the total number of emails that the spider could extract.

Auto mailer

Email Spider not only extracts emails from the web but can also automatically send messages to the extracted addresses (this feature is available in the full version only). The settings for this feature are shown in the picture below:

email_send
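The idea of producing several messages from a single template can be illustrated with Python's standard library. This is only a sketch of the concept, not the program's mailer: the template text, the `build_messages` function and the recipient data are assumptions, and the code builds the messages without sending them.

```python
from email.message import EmailMessage
from string import Template

# Hypothetical template; $name and $topic are filled in per recipient.
BODY = Template("Hello $name,\n\nThis is a short note about $topic.\n")


def build_messages(sender, recipients, subject, topic):
    """Create one EmailMessage per recipient from the shared template."""
    messages = []
    for recipient in recipients:
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient["email"]
        msg["Subject"] = subject
        msg.set_content(BODY.substitute(name=recipient["name"], topic=topic))
        messages.append(msg)
    return messages


# Hypothetical recipients.
recipients = [
    {"name": "Alice", "email": "alice@example.com"},
    {"name": "Bob", "email": "bob@example.com"},
]
messages = build_messages("me@example.com", recipients,
                          "Newsletter", "web scraping")
print(messages[0])
```

Actually delivering the messages would require an SMTP connection (for example via `smtplib`), which is left out here.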

Conclusion

GSA Email Spider is a really good helper for email and phone extraction. Although simple to use, it is smart enough (thanks to its large number of options) to sift out only the relevant information. As an additional feature, the built-in automailer lets you easily send several emails based on a single template.
