Recent Posts (page 36 / 70)

by Leon Rosenshein

Discoverability

For a while we had a column on our sprint board called “Documentation”. It was a catch-all column to help us remember that things weren’t “done” until they were shared. They say that if you build a better mousetrap the world will beat a path to your door. That may be true, but only if the world knows you built it. Yes, it’s possible for word of something to leak out and for usage to grow, but that’s not the normal case. Better is to tell people what you’ve built and why it’s better for them. That’s where documentation comes in.

We wrote the infra tool to help us manage our infrastructure and help our customers use it. One of the first things we added was the auth module, because we wanted a good way to update all of the different contexts and tokens once and not have to worry about it. We didn’t tell anyone at first; we just used it ourselves and enjoyed it. Some people noticed and started using it. Then we started getting questions about how to use it. That didn’t make sense, since it was fully documented. Wasn’t it?

Well, it was, in that the docs, online and off, explained exactly what options were available and what every option meant. But there’s more to documentation than that.

It starts with a good description. What does this thing do, and why should a user care? What’s the benefit? Remember, people generally buy things for the benefits, not the features. In this case,

This utility allows you to add authentication profiles for external services
that can't integrate with uSSO (currently only AWS). Then, at the start
of each day, you can refresh the credentials for each profile en masse,
only entering your password once (though potentially 2FA-ing multiple
times).

So a clear benefit.

Then there’s how to get it. With the infra tool, as long as you’re in the rNA repo, it’s just there. And it auto-updates. No need to do anything. That’s the best kind of docs.

And of course you need to have a full description of the API (or CLI in this case). All of the functions and parameters, each with not just a what, but a why. In many ways this is the easy part. For the `infra` tool you can see it at any level with the `--help` option, like `infra auth --help`.

You want to have examples, both targeted ones and general ones. For the `infra` tool that includes things like a specific example of adding a new profile (`infra auth add-aws-profile --help`) as well as the more general codelab that shows you how to use it day to day.

If you’re writing a library (or expecting others to contribute to your tool) you also want a “how we did it and why” section of your docs. For the infra tool that part lives with the code and in our internal docs, but it’s there. We’ve even got a presentation on how you can extend the tool. But we keep it separate because it would get in the way of our users knowing how and why they will benefit from it.

So next time you think you’re done with something, look at the documentation and make sure you’re really done.

by Leon Rosenshein

Adapters, Facades, and Decorators - Oh My

The GoF gave us Design Patterns in 1994, and it's been helping and confusing folks ever since. There's good high-level advice and a bunch of patterns that can be reused. The confusion comes about because some of the patterns are very similar, but, as usual, the details are important.

Consider the adapter, facade, and decorator patterns. All of them take existing functionality and wrap it, changing the interface slightly. So why are there three different patterns, and which one should you use? The answer is, it depends. It depends on why you're doing whatever it is you're doing.

The simplest is the adapter. Consider a trip from the US to EMEA. You've got a bunch of electronics with you, and you'd like to be able to plug them in wherever you are. So you carry a bunch of adapters with you. They're just lumps of copper and insulator, carefully designed to let you plug your device in on one end and into the wall on the other. It just changes the shape of the plug. If you're in Austria the outlet has 220 V at 50 Hz, and that's what goes into and out of the adapter. Hopefully your device can handle it. Software adapters are the same. All of the same parameters go in/out, and they're not modified on the way through. The purpose is to let you adapt one system to fit another, but the functionality/meaning had better match, or else.
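A minimal sketch of that idea in Go (all of the type and method names here are hypothetical, invented for illustration): the adapter changes only the shape of the call, never the content passing through.

```go
package main

import "fmt"

// LegacySink is an existing component whose interface we can't change.
type LegacySink struct {
	Lines []string
}

func (s *LegacySink) WriteLine(msg string) {
	s.Lines = append(s.Lines, msg)
}

// Logger is the interface the rest of our code expects.
type Logger interface {
	Log(msg string)
}

// SinkAdapter makes a LegacySink fit the Logger interface.
// Like a travel plug, it changes the shape, not what flows through.
type SinkAdapter struct {
	Legacy *LegacySink
}

func (a *SinkAdapter) Log(msg string) {
	a.Legacy.WriteLine(msg)
}

func main() {
	sink := &LegacySink{}
	var l Logger = &SinkAdapter{Legacy: sink}
	l.Log("hello")
	fmt.Println(sink.Lines[0]) // prints "hello"
}
```

Note that `Log` does nothing but forward: if it started transforming the message, it would no longer be an adapter.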

Next is the facade. Facades are false fronts that make things simpler to use. If your API offers 20 different flags and options, but you really only care about changing one of them, you might make a facade that takes that one parameter, adds your own defaults, and then passes it on. Back to the power adapter analogy, consider the Qi charging on your phone. Instead of plugging in a 24-pin USB-C connector you throw your phone onto the charger and get more power. Very simple, but you lose the ability to transfer data.
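Sketched in Go (again with made-up names): the facade exposes one parameter and fills in defaults for everything else, trading flexibility for simplicity.

```go
package main

import "fmt"

// RenderOptions is an existing, flexible API with lots of knobs.
type RenderOptions struct {
	Width, Height int
	DPI           int
	Grayscale     bool
	Format        string
}

// Render is the full-featured entry point.
func Render(doc string, opts RenderOptions) string {
	return fmt.Sprintf("%s as %s at %dx%d", doc, opts.Format, opts.Width, opts.Height)
}

// RenderPNG is the facade: one parameter in, sensible defaults for the rest.
// Like Qi charging, it's much simpler, but you give up the other options.
func RenderPNG(doc string) string {
	return Render(doc, RenderOptions{Width: 800, Height: 600, DPI: 96, Format: "png"})
}

func main() {
	fmt.Println(RenderPNG("report")) // prints "report as png at 800x600"
}
```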

Finally, there's the decorator. What do you do if your US electronics can't handle 220V at 50 Hz? Then you need a decorator. Sure, on the surface it looks like an adapter, but under the covers it has a transformer and frequency converter. Plug it into an Austrian wall socket and you can plug your US device in and, if it's a good decorator, the device will never know. Software decorators are the same thing. They add functionality. If you have a data source that uses push to send its data and a sink that uses pull you need something between them. One example might be a statsd metrics exporter (push) and a Prometheus time-series DB (pull). You need a decorator between them if you want them to talk to each other. The decorator would take the push from the service, buffer it, then respond to the Prometheus pull. So not entirely unlike an adapter, but more.
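The push-to-pull bridge above can be sketched in a few lines of Go. This is a deliberately tiny model (a real statsd-to-Prometheus bridge does far more, and every name here is hypothetical): the decorator adds buffering functionality so the push side and the pull side can interoperate.

```go
package main

import "fmt"

// Gauge is a pull-style interface: callers ask for the current value.
type Gauge interface {
	Value() float64
}

// PushBuffer decorates a plain value with push-to-pull bridging:
// a service pushes updates in, and a scraper pulls the latest one out.
type PushBuffer struct {
	latest float64
}

// Push is the push side: the service calls this whenever it has data.
func (b *PushBuffer) Push(v float64) {
	b.latest = v
}

// Value is the pull side: it satisfies Gauge for the scraper.
func (b *PushBuffer) Value() float64 {
	return b.latest
}

func main() {
	b := &PushBuffer{}
	b.Push(1.5)
	b.Push(2.5)
	var g Gauge = b
	fmt.Println(g.Value()) // prints 2.5
}
```

Unlike a pure adapter, the buffer in the middle is new behavior, not just a new shape.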


by Leon Rosenshein

YAML

Hard coded config is bad. Making code flexible is good. We can debate that at another time, but let's go with that for now. One of the ways to make your code more generic is to move the instance specific data to an external config file that gets read and used at runtime. That way the same code can run in multiple environments and you don't have to worry about which version is deployed where. And one of the common ways to define that config file is yet another markup language, YAML.

The first thing to keep in mind is that YAML is a superset of JSON, which means that every JSON file is valid YAML, but the converse is definitely not true. The other thing to keep in mind is that while it's possible to have multiple YAML documents in one YAML file, a YAML document cannot span files. Which means that while a common YAML file can be used by code in multiple languages, every implementation of merging YAML files is custom and probably not portable across languages.

That said, there are a bunch of things you can do inside a YAML file that are supported cross-platform, and can make your life a lot easier.

Most important, to me, is the ability to define "variables" and then use them multiple times in the file. This lets you put together a set of information and use it in multiple places without having to worry about typos. It can be a single value, or it could be a complex sequence of values, maps, and sub-items. Whatever it is, it is exactly the same everywhere it is used. You can even use the "variable" as a base and extend it differently in different places.
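For example, anchors (`&`) define a "variable", aliases (`*`) reuse it, and merge keys (`<<`) let you extend it differently in different places. A small sketch (the keys and values here are invented for illustration, and note that merge keys come from YAML 1.1 and aren't supported by every parser):

```yaml
defaults: &defaults
  region: us-east-1
  timeout: 30

dev:
  <<: *defaults          # start from the shared base...
  endpoint: dev.example.com

prod:
  <<: *defaults
  timeout: 60            # ...and override just what differs
  endpoint: prod.example.com
```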

Another good one is the ability to handle multi-line strings or long strings without having to scroll 6 screens to the right or slam things against the left. Combined, these features let you keep your YAML file readable while maintaining the look/format you want for the end user.
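The two block scalar styles cover both cases. A literal block (`|`) keeps your line breaks exactly as written, while a folded block (`>`) joins wrapped lines into one long string (the keys here are made up for illustration):

```yaml
# '|' (literal): newlines are preserved exactly as written
motd: |
  Welcome to the cluster.
  See the docs before making changes.

# '>' (folded): these source lines become one single-line string
description: >
  This long sentence is wrapped in the source file to stay
  readable, but is delivered to the end user as a single line.
```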

The last one that gets used a lot, especially when working with Kubernetes and kubectl is multiple documents in one file. Instead of having multiple files that you need to keep in sync you can add a separator and then just keep them all in one file. I don't recommend putting ALL your YAML in a single file, but if you need to define multiple related Kubernetes resources like a role, role-binding, and namespace, putting them all in one file can help you keep things in sync.
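The separator is three dashes on a line by themselves. A minimal sketch of the role/namespace case (resource names here are placeholders; a real role-binding and fuller rules are omitted for brevity):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: example
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: example-role
  namespace: example
rules: []
```

A single `kubectl apply -f` on this file creates both resources together, which is exactly the keep-things-in-sync benefit.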

Which brings me back to merging multiple files. In this case I don't mean concatenating, or extending, but merging as in overriding a base for an instance specific result. Like having a base YAML file, then overriding the DB endpoint differently for dev, staging, and prod, but having the resulting document have the same structure. While there's no globally portable way (that I know of), there is Uber's config/configfx for Go. The base config library lets you specify how the different environment specific files are merged into the base, and configfx offers a pre-defined structure that uses the always-available ENV variables to automatically choose the correct files to merge together for you. If you're writing Go code, check it out.

by Leon Rosenshein

Roadmaps

One of the key things that makes a shared journey possible is a shared destination. If you're not all going to the same place the chances that you'll end up in the same place are pretty slim. But while a shared destination is key, it's not enough. To have a successful shared journey you need a shared roadmap.

It's not that everyone needs to be traveling the same path. Although many will travel the same path, there are multiple ways to get from A to B, and if you need a stop at C, D, E, F, G, H, and I to pick things up it's probably more efficient to split up and have parts of the group go to each destination. You don't just tell everyone that you'll see them in New York City and hope everyone gets there and they pick up everything that's needed along the way. Instead, you take some time up front and do a little planning. Where are you going? Who's going? Why are you going? What are the stops along the way? What will we need along the way? What will we need when we get there? That's more than a destination, that's a travel plan. A roadmap if you will.

That's just as true for an organization's journey. An org is just a group of people, and whether you're going from one city to another, or rAV to rNA, or say Uber ATG to Aurora, the same basic principles apply. Just saying "Go ahead. I'll see you there on Monday" and expecting everyone to get there on time with everything they need is, shall we say, optimistic? Better to have a roadmap.

So what goes into a roadmap? Every roadmap is different, because every roadmap starts from a different place/situation and goes to a different place, but there are some good questions to ask to figure out what goes into your roadmap. These questions apply not just to the "destination", but any stops (releases) along the way.

  1. What are we building/learning/proving?
  2. Who are the users/customers?
  3. How do we share this?
  4. What are the assumptions?
  5. What do we depend on?
  6. What do we need that we don't have?

by Leon Rosenshein

Momentum

Every object persists in its state of rest or uniform motion in a straight line unless it is compelled to change that state by forces impressed on it
-- Sir Isaac Newton

People have an enormous tendency to resist change. They love to say, ‘We’ve always done it this way.’ I try to fight that.
-- Grace Hopper

Inertia and momentum are real. Change is work. Sometimes hard work. Even when applied to non-physical things, like code. There’s a constant tension between changing everything and changing nothing. In development, as in life, the answer is somewhere in the middle because “it depends”.

Very often things are the way they are for a good reason, even if we don’t know the reason (there’s Chesterton’s fence again). So think critically before you change something. Just because you don’t know the reason for a method/process doesn’t mean it’s not needed. Figure out why it’s there. You might find that it’s actually more broadly applicable and you should be doing more of it.

On the other hand, sometimes we do things the same way out of habit, or cargo culting. We’ve seen it work before, and it’s worked for us, so don’t change it. But again, you need to think critically about it. It’s like Grandma’s cooking secret. Is what you’re doing the real cause of the outcome or would it have happened anyway? Does the current method/process/code solve a problem we once had, but don’t have now?

All of us are on the cusp of a large transition. That liminal state between the before times and the after times. That means we have an opportunity to think before we act and set ourselves up for success in whichever direction we end up going, both individually and collectively. So think about where you want to be, then figure out how to get there. Don’t pick a direction and see where you end up.

by Leon Rosenshein

Big Bang Theory

In the beginning was the command line (actually it was switches, buttons, and wire wrap). There was no "software industry" and everyone did everything themselves, or at least for in-house customers, and the pace of change was low. After all, that's how things were always done. In business, changes on the factory floor were expensive and there was lots of planning. Software was treated the same way.

As software became less bespoke we moved into the era of box products. Come up with an idea. Think about what it would look like. Build it. Put it in a box and send it to resellers. Never think about it again. Repeat every 2 years or so. Between 1980 and the mid-2000s that's how software was done. Updates were hard or impossible to deliver, and lots of (most?) people wouldn't bother to get them anyway.

And the way software was developed followed the same cycle. The big bang approach. Think about it for a while. Make plans. Execute on them. Slip the schedule to fix the problems that weren't understood. Release. Start again.

Things have slowly shifted over the last decade or so. We deliver, especially to internal customers and on the web, a lot more often. Like every day. Or multiple times a day. One might say continuously. Which leads me to test driven development.

Or at least automated unit and integration tests. One of the reasons for the long cycle time in days of yore was the need for extensive, manual testing. The only way to be sure something was right was to build it then throw it over the wall for testing, look at the bug reports, fix them, and repeat. That took time. Because every test cycle tested everything.

Now, it is important to test everything, and to test it together, at least at some point. But you don't always need to test everything. The better you are at identifying the tests that need to be run for a given change, the fewer tests you need to run. Fewer tests means less time. Automated tests mean less time. Less time for testing means shorter cycle times. Shorter cycle times mean more releases. More releases means more value over time.

So yes, it's really impressive if you can do a big reveal of the be-all, end-all on day one. But it's not the best way to make your customers happy.

by Leon Rosenshein

Hindsight

Things go wrong. We make mistakes. We don't always consider all the possibilities. New, different things can happen. The goal is to do better in the future. That's where Root Cause Analysis comes in. And hindsight is 20/20, so we're done, right?

First, as I talked about last year, there's a difference between the proximate cause and the root cause. We need to avoid the trap of thinking that the first time a person could have done something different to prevent the issue is the root cause. It almost never is so we need to keep digging.

Second, while our view of the past may be 20/20 (it probably isn't, but that's a different issue), it also tends to suffer from target fixation. As part of mitigating/solving the problem and then preventing it, we (hopefully) wrote down all the steps taken and noted problems along the way. And if we're really on top of things at the post-incident review we even created tasks on the backlog for the longer-term things that need to happen to prevent a recurrence.

But what we almost never do is document the struggles we had diagnosing the problem and getting to the root cause. We don't document/remember the dead ends we followed, the tools we built (or wish someone else had built), or the side learnings. Even when doing RCA, we follow the happy path through the solution and forget about many of the possible error cases.

And that's bad. For a bunch of reasons. Those dead ends are probably incidents waiting to happen. Yes, this time the problem wasn't bad input from a sensor we weren't properly filtering, but it can still happen. Unless we fix the underlying issue, by capturing it in the backlog and then completing it, in the fullness of time it will happen. And then we'll go through the whole process again.

At which point we'll reach for the same tool we wished we had but didn't, and find that it's still missing. So we'll throw something together, which will do the trick, but slow us down. Which means that we should have put building that tool on the backlog as well. As a side benefit, I'll bet you'll find that there are other uses for that tool as well.

When you're dealing with an incident and doing RCA you're an explorer. While it's possible that you got lucky and your path to the root cause was simple and straightforward, more likely it wasn't. And each of those bends and twists is a part of the map that currently says "terra incognita", but you've got an opportunity to update it and make the blank spot smaller. Next time someone reaches that spot and has a working map they'll be happy you did. And that person might even be you.

So remember: even when dealing with errors there's more to life than the happy path, and there are lots of learnings to be had if you keep your focus broad enough to see them.

by Leon Rosenshein

Naming Is Hard

There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton

We all know that. So what do you do? First, be consistent. A foolish consistency may be the hobgoblin of little minds, but this kind of consistency is not foolish.

Names, method or member, internal or external, form the basis of a ubiquitous language. Language lets us communicate what’s inside our heads with others, and having a consistent language reduces cognitive load. And as I’ve mentioned before, reducing cognitive load is a good thing. For example, if you decide to have a method called Length() that gives you the number of entries in an array, then you should have the same for sets, lists, maps, and any other collection. Calling it Length() in some, Count() in others, and having a member variable called NumberOfItems in one will drive your users insane.
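In Go terms, one way to enforce that consistency is a small shared interface, so every collection answers the same question with the same name (the wrapper types below are hypothetical, purely for illustration):

```go
package main

import "fmt"

// Counter is the single, consistent name for "how many things are in here".
type Counter interface {
	Count() int
}

type Stack struct{ items []int }

type Registry struct{ entries map[string]int }

// Every collection uses the same method name, so users never have to guess.
func (s *Stack) Count() int    { return len(s.items) }
func (r *Registry) Count() int { return len(r.entries) }

func main() {
	collections := []Counter{
		&Stack{items: []int{1, 2, 3}},
		&Registry{entries: map[string]int{"a": 1, "b": 2}},
	}
	for _, c := range collections {
		fmt.Println(c.Count()) // same call, whatever the collection
	}
}
```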

And not just consistency in names, but in style. Again, whether it’s CamelCase, SCREAMING_SNAKE_CASE, or kebab-case, pick one and stick with it. And when I say pick one, I don’t mean pick one for yourself and use it everywhere, I mean pick one for something with very clear boundaries, like a team, or a company, and then stick with it. Consistency is your friend.

That’s just the first thing though. If you are consistent your customers will eventually figure out your pattern/style and be able to follow along. Having a good name is also important. Something that lets the reader know immediately what the thing is/does. So what goes into a good name?

It should be Short, Intuitive, and Descriptive. Something like documentPageCount, sectionPageCount, and chapterPageCount. Just from looking you can tell that they’re page counts for different parts of a document. It’s a Count, so it measures things. You don’t expect it to change as you move around in the document like pageNumber might.

Manage your context. Think about those different counts. The first word sets the basic context, and each word after gets more precise. By the time you get to Count you know exactly what you’re talking about. And always be consistent (there’s that word again) with which way the context increases. Order matters.

Speaking of context, avoid stuttering. If you have a context, use it to disambiguate. Don’t make the name longer to re-add the context. golint even has a warning just for that case, telling you that a type name “will be used as &lt;package&gt;.&lt;Type&gt; by other packages, and that stutters”.

And finally, if you’re doing something, be very clear about what you’re doing. Put the verb up front. And be precise. If you’re updating an element in a collection by overwriting it, call it overwriteElement or updateElement. If you’re replacing an element in a collection, then call it replaceElement. And whatever you do, don’t call it createElement if the element already exists.
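A sketch of that verb-first precision in Go (the Store type and its methods are invented for illustration): the verb in each name tells you exactly what happens when the key does or doesn’t exist.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical element store.
type Store map[string]string

// CreateElement adds a new element and fails if it already exists.
func (s Store) CreateElement(key, val string) error {
	if _, ok := s[key]; ok {
		return errors.New("element already exists")
	}
	s[key] = val
	return nil
}

// ReplaceElement overwrites an existing element and fails if it doesn't exist.
func (s Store) ReplaceElement(key, val string) error {
	if _, ok := s[key]; !ok {
		return errors.New("element not found")
	}
	s[key] = val
	return nil
}

func main() {
	s := Store{}
	fmt.Println(s.CreateElement("a", "1"))  // succeeds
	fmt.Println(s.CreateElement("a", "2"))  // fails: already exists
	fmt.Println(s.ReplaceElement("a", "2")) // succeeds
}
```

A caller who reads only the name knows which of these is safe to call on a fresh store, and which isn’t.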

But again, and most importantly, be consistent. This is another example of when not to surprise your customer.

by Leon Rosenshein

Another Year In Review

At least I hope the year ends tonight, but I'll have to check covidstandardtime.com tomorrow to be sure. Going with the assumption that the year does end, some thoughts on the year that was.

It started like any other year. Lots of interesting challenges, what with shutting down one DC and turning up another. Lots of new tech to learn about and fit together. Then things changed. And they didn't change. Life went on. Work went on. We still turned down one DC and turned up another. Learned a lot about Authentication, AWS, Kubernetes and Prometheus, *and* how to fit them all together into a coherent whole. And I wrote a bunch of daily dev topics, over 225 of them. Here's a few of them, presented in chronological order, that either I liked or got a lot of interest. Enjoy and see you next year.

Jan 15th Naming Things

Feb 25th Breaking New Ground

Apr 29th Fear, Courage, and Professionalism

Jun 30th Vorlons vs. Shadows

Aug 3rd Careers

Sep 30th OKRs and KPIs

Oct 13th Silos vs Ownership

Oct 21st Stories vs Tasks

Oct 27th Commits vs PRs

Nov 20th Implicit vs Explicit

Nov 25th Take Time


by Leon Rosenshein

Unhappy Customers

Back at the beginning of the year, in the before times, I wrote that APIs are for life, and that if you change something, even if you make it better, you're likely to have unhappy customers. And that's true even if what you've changed isn't a public API.

One of the things we did with Combat Flight Simulator 3 (CFS3) was improve the graphics. We took advantage of newer hardware to increase resolution and fidelity. We did this in two main areas: the aircraft themselves and the terrain engine. We knew that the FlightSim community had already created hundreds of aircraft, and we wanted to allow them to be used, so while we improved rendering and animations on the aircraft we made sure to handle last year's formats as well. Since CFS3 had a lot more air-to-ground action than earlier versions we also wanted to make sure that we could get sufficient detail to get a sense of speed and action everywhere. Because we started with the FlightSim codebase and data set we already had the entire world covered, so we didn't worry about add-on scenery.

According to our customers, we chose poorly. Even though we didn't officially support 3rd-party updates to the scenery, a small portion of the customer base had figured out how to add scenery in earlier versions. They made changes not only for themselves; they shared those mods with others. There were even a few companies that had commercial products built on that scenery. Which meant that when those add-ons didn't work we didn't just have a few hard-core fans who were upset. All of their customers were upset with us too.

This was in the days of box products, so it was too late to do anything for CFS3, and that bad taste certainly impacted sales. Because of the low sales and reduced profit (CFS still made money, but not as much as expected) CFS4 was cancelled. Was that entirely because we changed a private API? Of course not. The entire market was down, flight sims were losing favor, and Microsoft was focusing on higher return on investment. But if we hadn't made that change and had happier customers and better press things might have turned out differently.

We did take the lesson to heart though, and while we incorporated many of the improvements to the terrain engine into the next version of Flight Simulator, we took the extra time to ensure that the improvements worked with the old formats instead of replacing them. Flight Simulator was much better received, and managed to survive for 2 more versions and some add-ons over 5 years before it, too, was shut down for not having a high enough profit.

So remember, surprising your customers isn't always a good thing.