Stuttering

There are two hard things in computer science: cache invalidation, naming things, and off-by-1 errors. Today’s post is about the middle one of the two.

Naming is hard. Noun clumps¹ for data. Verbs² for functions. Hungarian Notation³ for clarity? Everyone’s got an opinion on what’s right. Even languages have opinions on what’s allowed. Almost all languages allow ASCII letters, digits, and underscores in identifier names. Some have rules about what can come first in an identifier and how long it can be. Others assume certain types for different identifiers (I’m looking at you, FORTRAN).

And some languages are more opinionated than others. APL has its own character set (and keyboard). Python doesn’t care much. Go, the language, is not very opinionated. Go, the ecosystem and the community, on the other hand, are very opinionated.

gofmt, the Go formatter, is so opinionated that, unlike every other formatting tool I’ve ever come across, it has ZERO style options. You can add rewrite rules, but you can’t turn any formatting off. You can’t pick the brace pattern. You can’t pick tabs, spaces, how much to indent each block, or anything else.

It’s simple, it’s fast, and all Go code looks the same. I consider that a win.

Go’s linters are similar. Both vet and golint let you choose which checks to run, but there is no way to ignore a false positive. You’ve got to fix it or rewrite the code to remove the false positive. I consider that mostly a win, but I have seen bad rewrites made just to keep the linter happy.

Another thing golint will do for you is help prevent what it calls stuttering. According to golint, stuttering is when you have a public method in a struct with the same name as the struct itself. Something like run.Run(). I get that. Stuttering like that is both annoying and ambiguous. When you’re talking about the code and you mention run, are you talking about the struct or the method? You need to be careful when you talk about it, and the person you’re talking to needs to pay close attention to what you’re saying. That adds cognitive load, and one of Go’s design principles was that it should be easy to read, both for a person and for the compiler. The Go Proverbs really lean into this. Make it easy to understand. A few extra lines or methods is fine.
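To make the run.Run() case concrete, here is a minimal sketch. The names Run, Job, and Start are made up for illustration; nothing here comes from a real codebase. It shows the stuttering pair next to one possible renaming, and both behave the same.

```go
package main

import "fmt"

// Run is a hypothetical type, named after the run.Run() example.
type Run struct {
	Name string
}

// Run is a method with the same name as its type: the stutter.
func (r Run) Run() string {
	return "running " + r.Name
}

// Job and Start are one hypothetical way to rename the pair so the
// type and the method can be told apart in conversation.
type Job struct {
	Name string
}

// Start does exactly what Run.Run does, under a distinct name.
func (j Job) Start() string {
	return "running " + j.Name
}

func main() {
	run := Run{Name: "nightly"}
	fmt.Println(run.Run()) // a variable, a type, and a method all called "run"

	job := Job{Name: "nightly"}
	fmt.Println(job.Start()) // same behavior, but each name means one thing
}
```

The compiler is perfectly happy with both versions. The difference only shows up when a person has to read the code or say it out loud.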

But it’s not perfect. Sometimes code still stutters. And sometimes it’s hard to read and understand. Especially when there’s lots of text in the output to look at.

All of which is a very long way to say that naming is hard, and when a name stutters it adds to your cognitive load. The higher the cognitive load, the more likely you are to miss something. Missing something simple can cause you a lot of grief.

I was thinking of this because of an issue I tripped over the other day. I misread the name of a test in a test report because of a stuttering issue. Starting from that simple mistake, I spent 4 hours questioning how computers worked, the correctness of a very simple, core feature of our build system, bazel, and my sanity.

You see, bazel has the ability to filter tests based on the keywords you tag them with. Want to run only tests tagged with needs_network? Just add --test_tag_filters=needs_network to your bazel test command. Want to skip any tests with that tag? Just add --test_tag_filters=-needs_network. Simple and straightforward.
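The include and exclude behavior of that flag can be modeled in a few lines of Go. This is only a sketch of the semantics as I understand them, not bazel’s implementation; the function name shouldRun and the flaky tag are invented for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldRun is a simplified model of tag filtering, not bazel's code.
// filter is a comma-separated list of tags; a leading "-" excludes a tag.
func shouldRun(testTags []string, filter string) bool {
	has := make(map[string]bool, len(testTags))
	for _, t := range testTags {
		has[t] = true
	}

	anyIncludes, matchedInclude := false, false
	for _, f := range strings.Split(filter, ",") {
		f = strings.TrimSpace(f)
		switch {
		case f == "":
			continue
		case strings.HasPrefix(f, "-"):
			if has[strings.TrimPrefix(f, "-")] {
				return false // an excluded tag always wins
			}
		default:
			anyIncludes = true
			if has[f] {
				matchedInclude = true
			}
		}
	}
	// With no positive tags in the filter, everything not excluded runs.
	return !anyIncludes || matchedInclude
}

func main() {
	fmt.Println(shouldRun([]string{"needs_network"}, "needs_network"))  // true
	fmt.Println(shouldRun([]string{"needs_network"}, "-needs_network")) // false
	fmt.Println(shouldRun([]string{"flaky"}, "-needs_network"))         // true
}
```

The point of the model is that the filter only ever looks at tags. It has no idea which test you meant to tag.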

And it seemed to work. I’d been happily marking bad tests with a tag and then filtering them out. Then, all of a sudden, one of them showed up in my list of failures. Why was it running? It was supposed to be filtered out. And when I tried to run just that test with the filter applied, it didn’t run.

However, when I ran all of the tests in that directory with the same filter, it did run. This apparent contradiction led me to 4 hours of questioning myself and computers. I did all sorts of debugging. Printf, tracing, log grepping. And it just didn’t make any sense.

I started asking others if they had seen anything like that. No one had. Until finally, someone (thanx Alex) pointed out the simple thing that I was missing. Our test names stuttered. We had one test called //a/really/long/path//that/leads/to/the/thing:go_default_test⁴, and one test called //a/really/long/path//that/leads/to/the/thing/thing:go_default_test. I marked //a/really/long/path//that/leads/to/the/thing:go_default_test as a test that should be skipped, and when I tried to run it by itself, it got skipped. Then, when I ran all the tests under //a/really/long/path//that/leads/to/the/…⁵ and looked at the results, it was still skipped. But I didn’t notice it.

What I did notice was //a/really/long/path//that/leads/to/the/thing/thing:go_default_test. And because the prefix was the same for all the tests, my eye was drawn to the suffix. And I saw /thing:go_default_test, which I wasn’t expecting. It wasn’t supposed to be there. And because I was task-focused, I completely ignored the prefix. Instead of recognizing that these were two different tests, I thought the tools were broken. And down the rabbit hole I went.

To wrap this up, the real problem was that, because of the stutter, I had marked the wrong test. I was excluding a test that was fine, but running a test that sometimes failed. Then, when debugging it, I never noticed that there were two tests with almost identical names.

The moral of the story? First, beware of target fixation. Instead of looking at the bigger picture, I got stuck looking for how the tools were broken. Second, make it easier on yourself and don’t stutter in the first place. There didn’t need to be two tests that started with the same long prefix and ended with the same medium-sized suffix, with just a tiny difference in the middle. That was a choice we made. We should have chosen better. If the test names weren’t so similar, I wouldn’t have missed the obvious.
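If you’d rather have a machine spot this kind of name than rely on tired eyes, something like the following could flag labels whose package path repeats a segment. It’s a rough sketch: repeatedSegment is a made-up helper, it only looks for a directory name that appears twice in a row, and the labels are the two from this story.

```go
package main

import (
	"fmt"
	"strings"
)

// repeatedSegment reports the first path segment that appears twice in a
// row in the package part of a label (the part before ":").
// It is a hypothetical helper, not part of bazel or gazelle.
func repeatedSegment(label string) (string, bool) {
	pkg, _, _ := strings.Cut(label, ":")
	prev := ""
	for _, seg := range strings.Split(pkg, "/") {
		if seg == "" {
			continue // skip the empty pieces produced by "//"
		}
		if seg == prev {
			return seg, true
		}
		prev = seg
	}
	return "", false
}

func main() {
	labels := []string{
		"//a/really/long/path//that/leads/to/the/thing:go_default_test",
		"//a/really/long/path//that/leads/to/the/thing/thing:go_default_test",
	}
	for _, l := range labels {
		if seg, ok := repeatedSegment(l); ok {
			fmt.Printf("stutter on %q: %s\n", seg, l)
		}
	}
}
```

Run against the two labels above, only the second one is reported, which is exactly the one I kept misreading.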

¹ A bunch of nouns and adjectives that describe a thing. Generally, a good way to name a data item.
² Methods should be named for what they do. If you find you want to put “and” in the name, you’re probably wrong.
³ The real Hungarian Notation wasn’t about prefixing with base types, it was about prefixing with intent. And that’s not necessarily a bad thing.
⁴ Not only do we use bazel, we use gazelle, and we’ve been using it for a while, which makes things more complicated.
⁵ For the uninitiated, that weird notation basically means “all of the tests defined in or below this path”.
