by Leon Rosenshein

High Quality Quality

Before today’s topic, a little housekeeping. After a little over 14 months, I’m back at Aurora. I’ve taken on a Tech Lead Manager role, responsible for measuring and helping to improve software quality in support of Aurora’s mission to deliver self-driving, safely, quickly, and broadly.

Of course, before you can measure (or improve) something, you need to define what it is. In this case it’s software quality. There are lots of ways to measure software quality. To me, the most important is suitability to the problem at hand. Whatever software you write needs to add value and solve the problem it intends to solve.1

If you never intend to touch the code again, nor will it ever be used in a situation you didn’t expect, nor with inputs you hadn’t considered, you might have quality software. For the rest of us, writing code that will need to be changed, either to handle new requirements or new operating conditions or new environments, solving the problem at hand is a necessary, but not sufficient condition for having quality software.

Then there’s the definition of solving the problem. Have you solved the problem if you only solve it in some specific cases? Maybe, but it depends on the failure mode. And what the expectations are for that failure mode. A basic calculator is pretty simple. If it adds positive numbers correctly and says “invalid input” for negative numbers that might be quality. If it gives you the wrong answer, it’s not quality software.

You might be saying to yourself, wait a minute, this isn’t about software quality, it’s about correctness and suitability, or External Software Quality (ESQ). That’s sort of true, but you can’t have quality software without it, and if we let ourselves forget that, we’ve got a whole different problem.2

Having said all that, what about code quality, or Internal Software Quality (ISQ)? When you start talking about ISQ, you’re talking about some of the ilities. Readability. Maintainability. Extensibility. Scalability. Securability. Optionality. Things that matter when you come back to the code weeks or months later because you’ve got a new requirement, or you found a problem.

At the heart of ISQ is the notion that while we might write code occasionally, we spend way more time reading and modifying code that’s already written. That’s where things like coding styles and linting come in. Things that make it easier to read the code because things look the same. Loop boundaries are clear. Variables are named clearly and not re-used for strange purposes. Things that make it easier to understand not just the exact meaning of a specific line, but how that line fits into the overall flow of the code. Reducing nesting and indentation. Things that make it easier to change code and be confident that you haven’t broken something. Increased code and branch coverage. Static analysis. Reducing cyclomatic complexity. Things that keep the reader’s cognitive load down so that they can focus on the why, and not get caught up understanding the what.
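To make "cyclomatic complexity" a little more concrete, here is a rough sketch that approximates it for a piece of Python source by counting decision points with the standard `ast` module. The function name `approx_cyclomatic_complexity` and the two sample snippets are made up for illustration, and real tools count more constructs than this does, so treat it as a sketch rather than a reference implementation.

```python
import ast

# Node types counted as decision points in this rough approximation.
# Real tools also count things like comprehension filters and match cases.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in `source`."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands, so two extra paths.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical example: the same logic written nested and written flat.
NESTED = """
def classify(n):
    if n is not None:
        if n >= 0:
            if n % 2 == 0:
                return "even"
            return "odd"
        return "negative"
    return "missing"
"""

FLAT = """
def classify(n):
    if n is None:
        return "missing"
    if n < 0:
        return "negative"
    return "even" if n % 2 == 0 else "odd"
"""

if __name__ == "__main__":
    print(approx_cyclomatic_complexity(NESTED))
    print(approx_cyclomatic_complexity(FLAT))
```

Both versions come out with the same count under this approximation, even though the flat one has less nesting and is arguably easier to read. No single number covers everything on the list above.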

Things like coverage, static analysis, and cyclomatic complexity are nice because they are quantifiable. You can look at today’s numbers and compare them to the numbers from last week or last month and see if you’re getting better or worse. They can be calculated for small parts of the code and changes to the numbers can be attributed to specific changes to the code, so you know what areas need work. You can use the numbers as gates to allow (or block) changes if you don’t like the way the numbers are impacted by the change. Doing that can help your ISQ. Just like ESQ, you need to measure and respond to what those numbers are telling you. But you also need to be careful to understand what you’re doing. You might be moving the numbers in the right direction, but there’s no guarantee that you’re improving your ISQ. If they’re getting worse, ISQ is probably going down, but the inverse isn’t always true.
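As a minimal sketch of what using the numbers as a gate could look like, assume you already have some way to collect the metrics for the baseline and for the proposed change. The `Metrics` fields, the `gate` function, and the `coverage_slack` threshold below are all hypothetical names and values chosen for illustration, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Hypothetical snapshot of a few quantifiable ISQ signals."""
    branch_coverage: float  # percent, higher is better
    max_complexity: int     # worst function's cyclomatic complexity, lower is better
    static_warnings: int    # static analysis findings, lower is better


def gate(baseline: Metrics, candidate: Metrics, coverage_slack: float = 0.5) -> list[str]:
    """Return the reasons to block a change; an empty list means allow it."""
    reasons = []
    # Allow a little noise in coverage so tiny fluctuations don't block work.
    if candidate.branch_coverage < baseline.branch_coverage - coverage_slack:
        reasons.append(
            f"branch coverage fell from {baseline.branch_coverage:.1f}% "
            f"to {candidate.branch_coverage:.1f}%"
        )
    if candidate.max_complexity > baseline.max_complexity:
        reasons.append(
            f"max complexity rose from {baseline.max_complexity} "
            f"to {candidate.max_complexity}"
        )
    if candidate.static_warnings > baseline.static_warnings:
        reasons.append(
            f"static analysis warnings rose from {baseline.static_warnings} "
            f"to {candidate.static_warnings}"
        )
    return reasons


if __name__ == "__main__":
    last_week = Metrics(branch_coverage=82.0, max_complexity=12, static_warnings=3)
    this_change = Metrics(branch_coverage=80.0, max_complexity=15, static_warnings=3)
    for reason in gate(last_week, this_change):
        print("BLOCK:", reason)
```

A gate like this only catches the numbers getting worse. A change that passes it has not necessarily improved ISQ, which is why the numbers still need a human looking at what they mean.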

Or put another way, the set of things you need to do to improve quality is also the set of things that make developers more productive. Things like DORA and SPACE. High ISQ leads to happy developers and happy developers lead to high ESQ.3 Unfortunately, if you think measuring code quality is hard, measuring developer sentiment and productivity is even harder.

There’s no silver bullet or magic potion. Instead, you have to implement all the individual measures, think about them, and make a change that you think will drive things in the right direction. Then you reevaluate and see if you’ve made things better. When something works, you do more of it. You look at incentives and goals, then you make sure the incentives align with the goals. You take many more much smaller steps, improving things one step at a time.

Or, you start instrumenting your code reviews and minimize the WTFs/Min.

Image of two doors to rooms where a code review is happening. One door has a caption reading 'Good Code'. The other has the caption 'Bad Code'. The good code door has thought bubbles for 2 WTFs. The bad code door has 5.

WTFs/m


  1. It’s ok if the street finds its own use for your software. If it solves that problem, it has quality for that problem, even if it’s not the one you originally tried to solve. ↩︎

  2. It’s a problem for a different time, but for now, take a look at these search results ↩︎

  3. That’s also a topic for a different time ↩︎