Recently I had to do bulk submissions through an authenticated form on a website that requires a login. While there are plenty of examples of how to use POST and GET in Python, I want to share how I handled a session together with its cookie and an authenticity token (CSRF-like protection).
In this post we are going to cover the crucial techniques needed when scripting a web scraper:
- persistent session usage
- finding cookies and storing them in the session
- finding the “auth token”, retrieving it and submitting it with the form
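
To make the three points above concrete before going into detail, here is a minimal sketch that uses only the Python standard library. It is not necessarily the code used later in this post: the field name `authenticity_token`, the helper names `extract_token`, `make_session` and `submit_form`, and the form URL are all assumptions for illustration, so adjust them to the site you are working with.

```python
import urllib.parse
import urllib.request
from html.parser import HTMLParser
from http.cookiejar import CookieJar


class TokenParser(HTMLParser):
    """Remembers the value of the first <input> whose name matches field_name."""

    def __init__(self, field_name):
        super().__init__()
        self.field_name = field_name
        self.token = None

    def handle_starttag(self, tag, attrs):
        if tag == "input" and self.token is None:
            attr = dict(attrs)
            if attr.get("name") == self.field_name:
                self.token = attr.get("value")


def extract_token(html, field_name="authenticity_token"):
    """Return the hidden token value found in the page, or None if absent."""
    parser = TokenParser(field_name)
    parser.feed(html)
    return parser.token


def make_session():
    """Build an opener that keeps cookies between requests (a persistent session)."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def submit_form(opener, form_url, fields, token_field="authenticity_token"):
    """GET the form page, pull out the token, then POST the fields plus the token."""
    with opener.open(form_url) as resp:
        page = resp.read().decode("utf-8", errors="replace")
    token = extract_token(page, token_field)
    if token is None:
        raise ValueError("no %r field found on %s" % (token_field, form_url))
    data = urllib.parse.urlencode({**fields, token_field: token}).encode("utf-8")
    # Passing data makes this a POST; the cookie jar sends the session cookie along.
    with opener.open(form_url, data=data) as resp:
        return resp.status


if __name__ == "__main__":
    opener, jar = make_session()
    # Hypothetical URL and field names; replace them with the real ones.
    status = submit_form(opener, "https://example.com/items/new", {"title": "demo"})
    print(status, [cookie.name for cookie in jar])
```

The same flow should carry over to the third-party `requests` library, where a `requests.Session` object plays the role of the opener and cookie jar; the sketch stays with the standard library only so that it runs without extra installs.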