Download a file from a link in Python

I recently received this question: how do you download a file from a link in Python?

“I need to go to every link which will open a website and that would have the download file “Export offers to XML”. This link is javascript enabled.”

Let us look at how to get a file from a JS-driven link using Python:

 

1. Look at the markup of the file download link:

 

<a href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$lnkExportToExcel')">Export offers to XML</a>

The link's href calls the __doPostBack() JavaScript function.
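To replay this call from Python, the event target first has to be pulled out of the href. Below is a minimal sketch using only the standard library. The helper name parse_postback_href is hypothetical, and treating a missing second argument as an empty string is an assumption about what the server accepts.

```python
import re

# Matches __doPostBack('target') as well as __doPostBack('target','argument').
POSTBACK_RE = re.compile(
    r"__doPostBack\(\s*'([^']*)'\s*(?:,\s*'([^']*)'\s*)?\)"
)

def parse_postback_href(href):
    """Return (event_target, event_argument) from a __doPostBack href, or None."""
    match = POSTBACK_RE.search(href)
    if match is None:
        return None
    target, argument = match.group(1), match.group(2)
    # Assumption: an omitted argument is sent as an empty string.
    return target, argument or ""

href = "javascript:__doPostBack('ctl00$ContentPlaceHolder1$lnkExportToExcel')"
target, argument = parse_postback_href(href)
print(target)
```

The extracted target is the value that later goes into the __EVENTTARGET field of the form.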

2. I found this function in the page source:

<script type="text/javascript">
var theForm = document.forms['aspnetForm'];
if (!theForm) {
    theForm = document.aspnetForm;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}
</script>

The function writes its arguments into the hidden fields __EVENTTARGET and __EVENTARGUMENT and then submits a form. The form's id is aspnetForm.
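What the JavaScript does can be mirrored in Python as a plain dictionary update. This is only a sketch: the name do_postback is hypothetical, and the __VIEWSTATE value shown is a placeholder, not the real one from the page.

```python
def do_postback(form_fields, event_target, event_argument=""):
    """Return the fields a browser would submit after __doPostBack runs."""
    payload = dict(form_fields)  # copy, so the caller's dict stays untouched
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = event_argument
    return payload

# Placeholder field values standing in for those read from the real form.
fields = {
    "__EVENTTARGET": "",
    "__EVENTARGUMENT": "",
    "__VIEWSTATE": "placeholder-viewstate",
}
payload = do_postback(fields, "ctl00$ContentPlaceHolder1$lnkExportToExcel")
print(payload["__EVENTTARGET"])
```

Every other field of the form is passed through unchanged, which is what theForm.submit() does in the browser.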

3. The form in the page source:

<form method="post" action="ApplesToApplesComparision.aspx?Category=Electric&amp;TerritoryId=2&amp;RateCode=1" id="aspnetForm">
<div class="aspNetHidden">
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="qqEqj+MJZCXWWypTsbeT2OudaHSwkSmxn4MMtBuWopgD50psDlTzoVSH0gMVRNktX7EW7I2uWKnF9IzD8/BkloDdz+4OSdWS7MbiJaQ2KVBHoZCFqMN0IgLe82fkuPJxk/wf1h/ZWYjOwi5XRTLZEy4JKRc...

When dealing with ASP.NET, remember that it keeps the page state across HTTP requests in a special hidden form field: __VIEWSTATE. The server sends a value for this field with each response and expects it back in the next POST, so the value changes from request to request. It is therefore vital to submit requests through this same form with the current __VIEWSTATE.
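Reading the current hidden fields from a fetched page might look like the sketch below, which uses the standard library's html.parser. The class name HiddenFieldParser is hypothetical, and the sample markup is a shortened stand-in for the real form with a placeholder view state.

```python
from html.parser import HTMLParser

class HiddenFieldParser(HTMLParser):
    """Collect the aspnetForm action and all hidden input name/value pairs."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == "aspnetForm":
            self.action = attrs.get("action")
        elif tag == "input" and attrs.get("type") == "hidden" and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""

# Shortened sample page; the __VIEWSTATE value is a placeholder.
sample = """
<form method="post" action="ApplesToApplesComparision.aspx?Category=Electric&amp;TerritoryId=2&amp;RateCode=1" id="aspnetForm">
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="placeholder-viewstate" />
</form>
"""
parser = HiddenFieldParser()
parser.feed(sample)
print(parser.action)
print(sorted(parser.fields))
```

Note that the parser decodes &amp;amp; in the action attribute to a plain &amp;, so the action can be used directly when building the request URL. The sketch collects hidden inputs from the whole page, which assumes the page has a single form.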

We recommend simulating the form submission in Python: load the page, copy the fields of the real form, set the parameters taken from the download link, and above all send the __VIEWSTATE value from the real form on the scraped page.
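Put together, the simulated submission could be sketched as follows with urllib from the standard library. The function names are hypothetical, example.com stands in for the real site, and the view state is a placeholder. A real site may also require the session cookies from the first page load, which this sketch does not handle.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request, urlopen

def build_postback_request(page_url, action, hidden_fields, event_target):
    """Build the POST that clicking the export link would trigger."""
    payload = dict(hidden_fields)
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = ""
    body = urlencode(payload).encode("ascii")
    # The form action is relative, so resolve it against the page URL.
    return Request(urljoin(page_url, action), data=body, method="POST")

def download(request, path):
    """Send the request and save the response body to path."""
    with urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())

# Hypothetical values: the host and the view state are stand-ins.
request = build_postback_request(
    "https://example.com/offers/ApplesToApplesComparision.aspx",
    "ApplesToApplesComparision.aspx?Category=Electric&TerritoryId=2&RateCode=1",
    {"__VIEWSTATE": "placeholder-viewstate"},
    "ctl00$ContentPlaceHolder1$lnkExportToExcel",
)
print(request.full_url)
# download(request, "offers.xml")  # run this only against a real page
```

The hidden fields passed in should come from a fresh GET of the same page, so that the __VIEWSTATE sent back matches what the server last issued.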

See the screenshot below with the important info:

Shape modeling
