Combining Python with bash scripting
Bash scripting is a handy tool for batch data analysis: it speeds up repetitive work (e.g., post-analysis of MD simulation results) and avoids tedious manual labor. Python is also convenient and easy to use for data analysis. In some cases it is beneficial to combine the two, since Python handles loops, math, arrays (NumPy), file reading and writing, etc. flexibly.
There are two common ways to run bash commands from Python:
- os
- subprocess
For the first case, using the os module:
import os
# run the command through the shell; its output goes to the terminal
os.system("gmx hbond -f md.gro -s md.tpr -n index.ndx")
Here the whole command is passed as a single string. For the second case, using the subprocess module:
import subprocess
# run the command and capture its standard output
out = subprocess.run(['grep', 'HP1', 'md.gro'], stdout=subprocess.PIPE)
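Here the command (grep HP1 md.gro) is split into a list of its individual tokens. subprocess.run returns a CompletedProcess object, and the captured output is available as out.stdout (as bytes, unless text=True is passed). Or, more conveniently,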
import subprocess
cmd = "grep HP1 md.gro"
out = subprocess.check_output(cmd, shell=True, text=True)
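Because shell=True is set, the command can again be written as a single string, and with text=True the returned value is the command's output as a string.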
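Once the output is a Python string, it can be processed with ordinary Python code. The sketch below is illustrative only: it writes a small stand-in file (demo.gro, a hypothetical name with made-up contents) so that it runs on its own, loops over a few residue names, and counts the matching lines. It assumes grep is available on the system. Note that grep exits with status 1 when nothing matches, which makes check_output raise CalledProcessError, so the hypothetical helper count_matches catches that case.

```python
import os
import subprocess
import tempfile


def count_matches(pattern, path):
    """Return the number of lines in `path` that contain `pattern`, using grep."""
    try:
        out = subprocess.check_output(["grep", pattern, path], text=True)
    except subprocess.CalledProcessError as err:
        if err.returncode == 1:  # grep found no matching lines
            return 0
        raise  # any other status is a real error (e.g., a missing file)
    return len(out.splitlines())


# Made-up stand-in for md.gro so the example is self-contained.
demo_lines = [
    "    1HP1     CA    1   0.100   0.200   0.300",
    "    1HP1     CB    2   0.110   0.210   0.310",
    "    2SOL     OW    3   1.000   1.100   1.200",
]

with tempfile.TemporaryDirectory() as tmpdir:
    gro_path = os.path.join(tmpdir, "demo.gro")
    with open(gro_path, "w") as fh:
        fh.write("\n".join(demo_lines) + "\n")

    # Loop over several residue names, calling grep once for each.
    counts = {name: count_matches(name, gro_path) for name in ["HP1", "SOL", "ION"]}

print(counts)
```

Passing the command as a list of tokens here, rather than as one string with shell=True, avoids quoting problems when a pattern or file name contains spaces.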
The main difference between the two is that os.system only runs the command and returns its exit status, with any output going straight to the terminal, whereas subprocess can capture the command's output and hand it back to Python.
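This difference can be seen in a small side-by-side sketch. It uses echo as a stand-in command (an assumption for illustration; any shell command would do) and assumes a Unix-like system.

```python
import os
import subprocess

# os.system: the command prints straight to the terminal.
# The return value is an exit status (0 on success), not the output.
status = os.system("echo hello")

# subprocess.run: capture_output=True collects stdout and stderr,
# and text=True decodes them to str instead of bytes.
result = subprocess.run(["echo", "hello"], capture_output=True, text=True)

print(status)               # exit status from os.system
print(result.returncode)    # exit status from subprocess.run
print(repr(result.stdout))  # the text the command printed
```

If only the side effects of a command matter, os.system may be enough; if the output is needed for further analysis in Python, subprocess is the better fit.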
References
For more detailed information and complete usage of the two approaches, see: