by Leon Rosenshein


I’ve talked about guardrails and Root Cause Analysis before. And one of the questions I often ask candidates during an interview is what they (or their team) did to prevent a recurrence of a mistake with significant impact. These are all ways to make things safer in the future.

A general term for building that kind of safety into a system is poka-yoke: the idea that you can, by design, put special steps or checkpoints into a system so that mistakes are hard to miss. These checkpoints are easier to acknowledge than to ignore, so mistakes in the process don’t turn into defects in the results.

There are many examples of this in manufacturing. Your typical Ikea bookshelf comes with a bunch of boards and a cardboard sheet with the exact number of nuts, bolts, anchors, pegs, and tools needed to build the bookshelf. When you get to the end of the build, if you’ve got parts left over you’ve missed something. You might not know exactly what the problem is, but it is clear that there’s a problem.

There are ways to do this with software as well. Some of it is input validation. It can be as simple as using an enum instead of a string in your API. If you take a type called HttpVerb, and the only valid values are HttpVerb.Get, HttpVerb.Post, HttpVerb.Put, HttpVerb.Head, HttpVerb.Delete, HttpVerb.Trace, HttpVerb.Connect, HttpVerb.Options, and HttpVerb.Patch, then the user can’t specify an invalid value. And if your API only supports a subset of those, you can define only that subset, again keeping the user from making a mistake.
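Here is a minimal sketch of the enum idea in Python. The three-verb subset and the send_request function are hypothetical, made up for illustration. Python doesn’t enforce type hints at runtime, so the sketch adds an explicit check to get the same effect a statically typed language would give you for free.

```python
from enum import Enum


class HttpVerb(Enum):
    # Hypothetical subset: this imaginary API only supports these verbs,
    # so the others are simply not defined.
    GET = "GET"
    POST = "POST"
    DELETE = "DELETE"


def send_request(verb: HttpVerb, url: str) -> str:
    # Reject anything that is not an HttpVerb, including look-alike strings.
    if not isinstance(verb, HttpVerb):
        raise TypeError(f"verb must be an HttpVerb, got {verb!r}")
    # Stand-in for real work: just describe the request.
    return f"{verb.value} {url}"
```

Calling send_request(HttpVerb.GET, "/users") works, while send_request("GET", "/users") fails immediately with a TypeError instead of failing somewhere downstream. Converting untrusted text with HttpVerb("FETCH") raises a ValueError, so the mistake surfaces at the boundary.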

The Builder pattern is a great example. You create a builder, set all the properties you want, then call the “Build” method. If you get something back, it will work. If you get an error, it tells you why. There’s no way to get a half-baked, inconsistent, or otherwise invalid thing. It might not be exactly what you want, but it is what you asked for. Proper separation of concerns works the same way. If the only way to get an instance of something is to ask the service/library/factory for it, then you don’t have to worry about someone creating an invalid one on their own.
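A rough sketch of the Builder idea in Python follows. Connection, ConnectionBuilder, and the validation rules are all hypothetical names and rules chosen for illustration. The point is only that every check lives in build(), so the caller gets back either a valid object or an error that says what is wrong.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Connection:
    # Frozen, so a built Connection can't be modified into an invalid state.
    host: str
    port: int
    use_tls: bool


class ConnectionBuilder:
    def __init__(self) -> None:
        self._host: Optional[str] = None
        self._port: Optional[int] = None
        self._use_tls = False

    def host(self, host: str) -> "ConnectionBuilder":
        self._host = host
        return self

    def port(self, port: int) -> "ConnectionBuilder":
        self._port = port
        return self

    def tls(self, enabled: bool) -> "ConnectionBuilder":
        self._use_tls = enabled
        return self

    def build(self) -> Connection:
        # All validation happens here: a valid object comes back, or an error.
        if not self._host:
            raise ValueError("host is required")
        if self._port is None or not 1 <= self._port <= 65535:
            raise ValueError("port must be between 1 and 65535")
        return Connection(self._host, self._port, self._use_tls)
```

With this, ConnectionBuilder().host("example.com").port(443).tls(True).build() returns a usable Connection, and ConnectionBuilder().port(443).build() raises a ValueError naming the missing host. Python won’t stop someone from calling Connection(...) directly, so in this sketch the “only way to get one” rule is a convention rather than something the language enforces.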

You can also have a poka-yoke step later in the process. If you’re doing some kind of image analysis and the number of answers is supposed to be the same as the number of images analyzed, you can add a step to the process that validates the number of results. It doesn’t prevent a problem, and it doesn’t make sure the analysis is correct, but it does ensure that everything is analyzed. You can make it even simpler. If a step in the process is supposed to have an output, then check if there is any output. If not, it’s an error, regardless of whether the step says it succeeded or not.
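As a sketch of that kind of late check, assume the analysis step returns a dictionary keyed by image name. The function names and the MissingResultsError exception below are hypothetical. Neither check says anything about whether the answers are right, only that they exist.

```python
from typing import Dict, List


class MissingResultsError(Exception):
    """Raised when the results don't line up with the inputs."""


def check_results(images: List[str], results: Dict[str, str]) -> None:
    # Every image must have a result, and every result must belong to an image.
    missing = [name for name in images if name not in results]
    extra = [name for name in results if name not in images]
    if missing or extra:
        raise MissingResultsError(f"missing={missing}, unexpected={extra}")


def require_output(step_name: str, output: str) -> str:
    # The even simpler version: an empty output is an error,
    # no matter what status the step itself reported.
    if not output:
        raise MissingResultsError(f"step {step_name!r} produced no output")
    return output
```

Running check_results(["a.png", "b.png"], {"a.png": "cat"}) raises an error that names b.png as missing, so the gap is visible right away instead of showing up later as a mysteriously short report.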

Whether it happens early or late in the process, the important part is that you detect the mistake and make sure it is recognized as soon as possible, so it can be handled then instead of turning into a problem for others.

How it gets handled is outside the scope of poka-yoke and a topic for another day.