Multiprocessing with xargs

The other day I had to run a command multiple times with different input parameters and take notes. Each run would take a few hours. Say you have a bunch of e-books scattered around a 5TB hard disk. You have to pick a few selected ones and find their corresponding hardbacks on Amazon. You have a script to do that. I started running it by hand, but it's so distracting; after just 2-3 runs you go: what the hell am I doing? Can't I just tell someone to schedule these commands? So I started writing a Python script with a list of dictionaries holding the input parameters and multiprocessing to run the commands, 2 or 3 at a time. But that is so boring. I wanted something really quick, simple and sleek. Then I remembered xargs.

  xargs(1): build and execute command lines from standard input.

xargs(1) does just what I wanted, exactly the way I wanted. I had seen one-liners using xargs, but had never used it myself. So I spent some time collecting all the commands with their input parameters in a text file (cmd.sh, one command per line), and then, voila!

  $ xargs -rtP3 -L1 -a cmd.sh env
   -r : tells xargs not to run the command at all if the input is empty,
   -t : tells xargs to print each command (on standard error) as it is invoked,
   -P3: runs up to 3 processes at a time,
   -L1: tells xargs to use one input line per command,
   -a : reads the input from the given file (cmd.sh) instead of standard input,
   env: is the command to which the words of each input line are passed as arguments; env then runs them as a command.
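
If you want to try the invocation safely first, here is a minimal sketch, assuming GNU xargs (the -a option may not exist in other implementations). The file name cmds.txt and the echo commands are made-up stand-ins for the real long-running commands; they are not from my actual run.

```shell
# Hypothetical demo: cmds.txt and the echo lines stand in for real commands.
tmpdir=$(mktemp -d)

# One command per line, just like cmd.sh in the post.
cat > "$tmpdir/cmds.txt" <<'EOF'
echo alpha
echo beta
echo gamma
EOF

# Same flags as above, written separately for readability.
# Command output goes to out.txt; the -t trace goes to standard error,
# so it is captured in trace.txt.
xargs -r -t -P3 -L1 -a "$tmpdir/cmds.txt" env \
    > "$tmpdir/out.txt" 2> "$tmpdir/trace.txt"

# With -P3 the commands run in parallel, so the output order is not
# guaranteed; sort it before looking at it.
sort "$tmpdir/out.txt"
```

Because env just executes whatever words it is given, each line of the file runs as its own command, with at most three running at any moment.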

I really like it when tools behave exactly how I want them to. Bliss! 🙂
