Combining Python with bash scripting

Bash scripting is an extremely handy tool for batch data analysis: it speeds up the work (e.g., post-analysis of MD simulation results) and avoids tedious manual labor. Python is also a convenient, easy-to-use tool for data analysis. In some cases it is beneficial to combine the two, since Python handles loops, math, arrays (numpy), file reading/writing, etc. very flexibly.

There are two ways to run bash commands from Python:

  • os
  • subprocess

For the first case, using the os module:

  
import os

# Run a GROMACS command; its output goes straight to the terminal
os.system("gmx hbond -f md.gro -s md.tpr -n index.ndx")
  

Here the whole command is passed as a single string argument. For the second case, using the subprocess module:

  
import subprocess

# Run grep and capture its standard output (as bytes) in out.stdout
out = subprocess.run(['grep', 'HP1', 'md.gro'], stdout=subprocess.PIPE)
  

Here the command (grep HP1 md.gro) is split into a list of individual tokens. Or, more conveniently,

  
import subprocess

cmd = "grep HP1 md.gro"
out = subprocess.check_output(cmd, shell=True, text=True)
  

Here the returned value is a string, since text=True decodes the command's output.
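Once the output is a Python string, it can be processed with ordinary string and list operations. The following is a minimal sketch of that idea, not a complete analysis script: the three-line sample file is a hypothetical stand-in for md.gro (written to a temporary file so the example runs without any simulation data), and the variable names are illustrative.

```python
import os
import subprocess
import tempfile

# Hypothetical stand-in for md.gro: residue+name, atom name, atom index.
sample = "    1HP1     N    1\n    1HP1    CA    2\n    2SOL    OW    3\n"
with tempfile.NamedTemporaryFile("w", suffix=".gro", delete=False) as fh:
    fh.write(sample)
    path = fh.name

try:
    # Same grep call as above, passed as a token list instead of a shell string.
    out = subprocess.check_output(["grep", "HP1", path], text=True)
finally:
    os.remove(path)  # clean up the temporary file

# Split the captured text into lines, then pick the second column of each.
lines = out.splitlines()
atom_names = [line.split()[1] for line in lines]
print(len(lines), atom_names)
```

Note that grep exits with status 1 when nothing matches, in which case check_output raises subprocess.CalledProcessError, so a script that expects empty results may want to catch that exception.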

The main difference between the two is that os.system only runs the command and returns its exit status, with the command's output going to the terminal, while subprocess can capture the output and hand it back to Python for further processing.
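The contrast can be seen in a small side-by-side sketch. It uses echo as a harmless placeholder command (an assumption for illustration; any shell command would do) and assumes a Unix-like system where /dev/null exists.

```python
import os
import subprocess

# os.system hands back only a status code (0 means success here);
# whatever the command prints is not available to Python.
status = os.system("echo hello > /dev/null")

# subprocess.run with capture_output=True collects stdout and stderr;
# text=True decodes them to str instead of bytes.
result = subprocess.run(["echo", "hello"], capture_output=True, text=True)

print("os.system status:", status)
print("subprocess return code:", result.returncode)
print("subprocess stdout:", repr(result.stdout))
```

In practice this means os.system is enough when a command only needs to run, while subprocess is the better fit when its output feeds into later Python steps.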

References

For more detailed information and complete usage of the two approaches, see:

  1. Python os.system
  2. Python subprocess