by Leon Rosenshein

This Is The Way

I’ve talked about The Unix Way before. I recently came across a really good description of it. It applies whether you’re writing a tool or deciding what goes into your commit. It shows up in architecture, in Domain-Driven Design and Bounded Contexts. It shows up in non-code places as well. Reducing Work In Progress is just another way of implementing the Unix way.

The best definition of the Unix way I’ve ever seen is:

That’s it. Two things. Simple and straightforward. Of course, the devil is in the details.

First, what is the one thing? Where are the boundaries to that one thing? Are you writing a program, a library, a service, or a platform? Where the customer value comes from drives the boundaries. Where the boundaries are drives what you’re building. But it’s not a one-way flow done in isolation. It’s actually a system with feedback loops, so the boundaries influence what the “one thing” is, which then influences how it gets used, which influences customer value. And now we’re talking about levels of abstraction. You can always add another level of abstraction to help solve a problem (except the problem of too many levels of abstraction). The trick is to figure out if you should or not.

A really good way to approach this one is to recognize that it is an iterative process. Understand the domain and the context. Figure out what you need to do. Figure out what you shouldn’t do. Then look at everything you think you need to do and decide if you really need to do it. At any given level you can probably do less, then use composition at a higher level. Even if you never share that lower level with the end user, tighter domains and less coupling will help you. Now and in the future. At the same time, don’t be afraid to merge two contexts or rearrange a group of them into things that make more sense. As you design, your understanding grows. As your understanding grows, the boundaries will shift. Use that to your advantage.

Second, how do you tie things together? In classic Unix, the text output of one program is used as the text input to the second. For CLI tools that still makes sense. But lots of things don’t run from the command line. They’re driven by user clicks/taps on a screen, and they might not even run on the same computer. Which means you can’t use your favorite shell’s pipe mechanism. In fact, you need to think about how to tie things together across computers. Data size has grown as well. By multiple orders of magnitude. On Falcon 4 we shipped our entire game on a single CD. Less than 680 MB, including all of the art assets. More recently, for the Global Ortho project each individual image was ~550 MB, and we had over 2 million of them between the US and Western Europe. Even one of those images is a lot of data to pass via a pipe, let alone hundreds or thousands of them. And since you don’t know what’s going to be reading your data, you don’t know what language the next thing is going to be written in, so your access needs to be cross-language.
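To make the classic model concrete, here is a minimal sketch of two single-purpose stages chained so that the text output of one becomes the text input of the next. It runs in one process rather than through a real shell pipe, and the function names (`grep`, `count`) and the sample log lines are made up for illustration.

```python
from typing import Iterable, Iterator


def grep(lines: Iterable[str], needle: str) -> Iterator[str]:
    """Stage one: pass through only the lines that contain `needle`."""
    return (line for line in lines if needle in line)


def count(lines: Iterable[str]) -> Iterator[str]:
    """Stage two: emit a single line holding the number of input lines."""
    yield str(sum(1 for _ in lines))


# Hypothetical input, standing in for whatever an upstream tool printed.
log = ["ok: started", "error: disk full", "ok: retry", "error: timeout"]

# Composition: the output of grep is the input of count,
# roughly what `grep error | wc -l` does in a shell.
result = list(count(grep(log, "error")))
print(result[0])
```

Each stage knows nothing about the other. The only contract between them is "lines of text in, lines of text out", which is exactly the contract that gets strained once the data is large, remote, or read by something written in another language.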

The key to solving this issue is to add structure. But only as little structure as you need. And document it. Cleanly, clearly, and with examples. Consider using a formal IDL like Protobuf or Apache Thrift. They’re great for ensuring you have a single source of truth for the definition while letting you have language-specific implementations. They can be used on distributed systems (across the wire) or through persistent stores like file systems or databases. You can even use them as part of your API in a library to keep things consistent. Even if you’re using text for a CLI tool, a little structure will make your life easier. YAML and JSON are good choices. And remember that it’s often useful to have both text output that’s formatted in a way that makes it easy for people to understand (columns, colors, complete sentences, etc.) and a machine-parseable format. With clean, clear documentation.
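As a rough sketch of that last point, a tool can keep one set of records and render it two ways: aligned columns for people and JSON for whatever program comes next. Everything here is an assumption for illustration: the `render` function, the `fmt` parameter, and the sample records are not from any particular tool.

```python
import json
from typing import Dict, List


def render_table(records: List[Dict]) -> str:
    """Human-friendly output: a header row plus left-aligned columns."""
    if not records:
        return ""
    headers = list(records[0])
    # Each column is as wide as its longest cell (or its header).
    widths = {h: max([len(h)] + [len(str(r[h])) for r in records]) for h in headers}

    def row(values) -> str:
        cells = (str(v).ljust(widths[h]) for h, v in zip(headers, values))
        return "  ".join(cells).rstrip()

    return "\n".join([row(headers)] + [row([r[h] for h in headers]) for r in records])


def render_json(records: List[Dict]) -> str:
    """Machine-friendly output: the same records as a JSON array."""
    return json.dumps(records, indent=2, sort_keys=True)


def render(records: List[Dict], fmt: str = "table") -> str:
    """Pick a renderer; `fmt` stands in for a hypothetical --format flag."""
    if fmt == "json":
        return render_json(records)
    if fmt == "table":
        return render_table(records)
    raise ValueError(f"unknown format: {fmt}")


# Hypothetical records; the names and sizes are invented for the example.
records = [
    {"name": "tile_001", "size_mb": 550},
    {"name": "tile_002", "size_mb": 548},
]

print(render(records, "table"))
print(render(records, "json"))
```

The data is produced once and only the presentation changes, so the human view and the machine view cannot drift apart. The JSON form is also the one a downstream consumer in any language can parse without guessing at column widths.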

The Unix Way has been around for almost 60 years. Despite the advances in processor, storage, language, and network design and implementation, it’s still as relevant today as it was back then.