by Leon Rosenshein

Built-In Functionality

A pocket knife with multiple tools available.

You can use all the tools, not just the large blade.

Most languages have a way to start an external process. It’s usually called some version of exec, as in “execute this process for me, please.” There are generally lots of ways to call it: synchronous or asynchronous; capturing the output, stdout and stderr; passing arguments or not, or even piping data in via stdin; capturing the exit code.

All those options are needed when you’re running external applications/executables. If you’re calling a 3rd party program to do some heavy lifting, you’ll probably want that level of control over what goes into the executable. You’ll want to know exactly what comes out: stdout, stderr, and any data persisted. If you need to do something with the output data, you’ll want to run synchronously and wait for it to finish, so you know it’s done and whether it succeeded. On the other hand, if it’s best effort, you might just want to know that it started successfully and have it keep running after you’re done. For all those reasons, and others, there are very good times and reasons to use the exec family of functions.
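To make that concrete, here’s a sketch of what that level of control looks like with Go’s os/exec package. The command (sort -u) and its input are invented for illustration; the point is having separate handles on stdin, stdout, stderr, and the exit code.

package main

import (
	"bytes"
	"errors"
	"fmt"
	"log"
	"os/exec"
	"strings"
)

func main() {
	// Run an external tool, feed it stdin, and capture stdout,
	// stderr, and the exit code separately.
	cmd := exec.Command("sort", "-u")
	cmd.Stdin = strings.NewReader("b\na\nb\n")

	var stdout, stderr bytes.Buffer
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr

	// Run is synchronous; use Start (and later Wait) for fire-and-forget.
	if err := cmd.Run(); err != nil {
		var exitErr *exec.ExitError
		if errors.As(err, &exitErr) {
			log.Fatalf("exit code %d, stderr: %s", exitErr.ExitCode(), stderr.String())
		}
		log.Fatal(err) // e.g. the executable wasn't found at all
	}
	fmt.Print(stdout.String())
}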

On the other hand, they’re also very easy to misuse. In many (most?) languages it’s pretty trivial to run a shell command, pipe its output to a file, then read the file. If that’s all you do, you’ve opened yourself up to a whole raft of potential issues.
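That trivially easy version might look like this sketch (the paths are invented for illustration), with every one of those issues left wide open:

package main

import (
	"fmt"
	"os"
	"os/exec"
)

func main() {
	// Shell out, redirect to a file, read the file back. Every error ignored.
	// This only works if sh, ls, and a writable /tmp are all where you expect.
	_ = exec.Command("sh", "-c", "ls -l /tmp/mydir > /tmp/listing.txt").Run()
	data, _ := os.ReadFile("/tmp/listing.txt")
	fmt.Println(string(data))
}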

The biggest is that if you’re exec’ing to a shell, like bash or zsh, you never know what you’re going to get. You’re at the mercy of whatever version of the shell is deployed on the node/container you’re running in. You can hope that the version you want is in the place you want, but unless you’ve made sure it’s there yourself, you don’t know. Sure, you could write your shell script against sh v1.0 and be pretty sure it will work, but that’s really going to limit you. The same goes for relying on the standard unix tools in a distro. That works fine until someone sticks the thing you’ve written into a distroless container (or tries to build/run it on a Windows box) and suddenly things stop working. That’s why most languages have packages/modules/libraries built into them that provide the same kind of functionality you would get from those tools.
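For example, a grep-style filter doesn’t need grep, or a shell, at all. Here’s a sketch using only the standard library (the file name and search string are made up):

package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"strings"
)

func main() {
	// Roughly `grep ERROR /tmp/mydir/app.log`, with no external tools,
	// so it behaves the same in a distroless container or on a Windows box.
	f, err := os.Open("/tmp/mydir/app.log")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		if strings.Contains(scanner.Text(), "ERROR") {
			fmt.Println(scanner.Text())
		}
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}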

Second, consider this little Go example. It’s much easier to just call

out, err := exec.Command("ls", "-l", "/tmp/mydir").Output()
fmt.Println(string(out))

than

infos, err := os.ReadDir("/tmp/mydir")
if err != nil {
	log.Fatal(err)
}

for _, info := range infos {
	entryType := "file"
	if info.IsDir() {
		entryType = "directory"
	}
	fmt.Printf("Found %s, which is a %s\n", info.Name(), entryType)
}

and have the output right there on the screen. And that’s how it’s often done. But that ease leads to some big gaps where problems can sneak in. There’s no input validation or error checking. In Go you at least have to capture the error in err, but nothing forces you to actually check it. And that snippet ignores stderr entirely.
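Closing those gaps doesn’t take much. Here’s one way the same call might look with the error actually handled (same made-up directory as above):

package main

import (
	"errors"
	"fmt"
	"log"
	"os/exec"
)

func main() {
	out, err := exec.Command("ls", "-l", "/tmp/mydir").Output()
	if err != nil {
		// Output saves anything the command wrote to stderr in
		// ExitError.Stderr, so surface it instead of dropping it.
		var exitErr *exec.ExitError
		if errors.As(err, &exitErr) {
			log.Fatalf("ls failed (exit %d): %s", exitErr.ExitCode(), exitErr.Stderr)
		}
		log.Fatal(err) // e.g. ls isn't on PATH at all
	}
	fmt.Println(string(out))
}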

At the same time, you have to properly escape your input. With ls it’s not too bad, but you still have to handle spaces, special characters, delimiters, and everything else your users might throw at you. Add in calling a shell script and it gets worse. The more interpreters sitting between the thing you type and the thing that gets executed, the more likely you are to miss escaping something so it gets to the next level as intended.
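Here’s a sketch of the difference, with a deliberately hostile, made-up input:

package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Hypothetical user-supplied path carrying a shell-injection payload.
	userPath := `/tmp/mydir; rm -rf "$HOME"`

	// Through a shell, the string gets re-parsed, and the payload would run:
	//   exec.Command("sh", "-c", "ls -l "+userPath)

	// As a direct argument there's no shell to re-parse it. The whole string
	// is a single argv entry, so ls just fails to find an oddly named file.
	out, err := exec.Command("ls", "-l", userPath).CombinedOutput()
	fmt.Println(string(out), err)
}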

Finally, if you’re calling a shell script, how robust is it, really? Code golf might be a game, but it’s a lousy way to write reliable, resilient code. Even if the correct version of bash is used, and you get the argument parsing and escaping right, executing a script becomes an undebuggable, fragile black box. And no one wants that.

So next time you think “I’ll just do a quick exec to get something done,” think again. Use the tools of your language and check your work.