by Leon Rosenshein

Where do defects come from?

For those that know me, it’s no secret that while I like a lot of Uncle Bob’s specific advice (clean code, unit tests), I have some problems with his overall approach. Consider his response to the question “Where do defects come from?”

According to him, the cause is:

  • Too many programmers take sloppy short-cuts under schedule pressure.
  • Too many other programmers think it’s fine, and provide cover.

And the obvious solution is:

  • Raise the level of software discipline and professionalism.
  • Never make excuses for sloppy work.

It’s true that as software engineers we need to do a better job, have more discipline, and not make excuses for those that don’t. Raising the bar on how we do our jobs is vitally important to eliminating defects.

But it’s not enough. You can always find someone who typed the line of code and blame them for the problem. There’s even a git command, blame, to do just that. But it’s disingenuous to stop there.

I’ve been doing this a long time. In all my years as a software engineer, I’ve never run into a co-worker who didn’t care. Some had more information than others, and some had different experience and viewpoints that let them see things others missed. But no one thought sloppy short-cuts were OK to take, or that it was OK if someone else took one.

I’m much more partial to Hillel Wayne’s approach to things. Not just in this case, but in many others. He takes a much more nuanced approach. He avoids the Tyranny of Or. Yes, someone typed that line, but the defect has a deeper cause than that. Why was that line allowed to be typed? Could we have added guardrails to at least prevent it from getting into users’ hands?

It’s because for all but the simplest bits of software, running in all but the simplest environments, it’s not just one developer, or even one team of developers, that wrote the code. The code uses libraries and OS functionality from somewhere else. Someone else built the environment. The hardware the code runs on and the network it’s connected to are managed by someone else. It’s part of a much bigger, interconnected, and interdependent system. To say that one person is responsible for an error in such a system isn’t accurate.

Which is not to absolve developers of responsibility for defects. They need to have discipline and be professional. There is no excuse for sloppy work. But it’s not enough. We also need to ensure that we have, and use, all of the tools at our disposal to not just find defects, but even better, keep them from happening in the first place. Because the goal is to make sure the defects don’t reach the customer, not to figure out who to blame for them.