When working with different scrapers in Python, we often need to run them detached from the main process and monitor their output in real time. Here’s how we do this:
import subprocess

with open('output.txt', 'w') as f:
    subprocess.Popen(["python", "-u", "script.py"], stdout=f)
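- subprocess.Popen() launches the script as a separate process and returns immediately, without waiting for the script to finish.
- stdout=f redirects the script’s stdout into the file.
- Python’s “-u” option turns output buffering off, so data reaches the file as soon as the script writes it.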
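To illustrate the monitoring side, here is a minimal sketch of following the output file while the script is still running. The helper name run_and_follow and the poll_interval parameter are illustrative assumptions, not part of the original snippet, and polling the file is only one possible approach.

```python
import subprocess
import sys
import time


def run_and_follow(cmd, out_path, poll_interval=0.2):
    """Start cmd with stdout redirected to out_path and yield its lines as they appear.

    Hypothetical helper: the name and the polling strategy are assumptions.
    """
    with open(out_path, "w") as sink, open(out_path, "r") as reader:
        proc = subprocess.Popen(cmd, stdout=sink)
        pending = ""  # holds a partially written line until its newline arrives
        while True:
            # Check the exit status BEFORE reading, so output written just
            # before the child exited is still picked up by this read.
            finished = proc.poll() is not None
            line = reader.readline()
            if line:
                pending += line
                if pending.endswith("\n"):
                    yield pending.rstrip("\n")
                    pending = ""
            elif finished:
                break
            else:
                time.sleep(poll_interval)
        if pending:
            # The last line had no trailing newline.
            yield pending
```

A call such as `for line in run_and_follow([sys.executable, "-u", "script.py"], "output.txt"): print(line)` would then print each line shortly after the scraper writes it. Without “-u”, the lines would typically arrive in large delayed batches.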
We run an OS command that starts the scraper script in parallel with the main process. subprocess.Popen() suits this well because it does not block: the main process keeps running while the script works. To read freshly scraped data from the file, the script’s output buffering must be off (the second list element, “-u”, forces Python’s stdout and stderr streams to be unbuffered). Alternatively, set the PYTHONUNBUFFERED environment variable to a non-empty string, which has the same effect as “-u”. For more options for unbuffered output in Python, check here.
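The environment-variable alternative can be sketched as follows. The function name launch_unbuffered is a hypothetical helper introduced for illustration, and sys.executable is used here in place of the literal "python" so the child runs under the same interpreter as the parent.

```python
import os
import subprocess
import sys


def launch_unbuffered(script_args, out_path):
    """Start a Python child with PYTHONUNBUFFERED set instead of passing -u.

    Hypothetical helper: the name and signature are assumptions.
    """
    # Copy the parent's environment and add the variable; any non-empty value works.
    env = dict(os.environ, PYTHONUNBUFFERED="1")
    with open(out_path, "w") as f:
        # The child keeps its own handle to the file, so the parent can
        # close its copy as soon as Popen returns.
        return subprocess.Popen([sys.executable, *script_args], stdout=f, env=env)
```

For example, `launch_unbuffered(["script.py"], "output.txt")` starts the scraper with buffering off without changing its command line.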