Recent Posts

by Leon Rosenshein

New Code Vs. More Code

I’ve said before that as developers we read more than we write. And that’s still true. Here’s something else we do more often than we probably think. We modify (extend) existing code way more often than we write brand new code.

Writing green-field code is fun. It’s easy. There are far fewer constraints. But how often are you really writing green-field code? If you want to be pedantic about it (and sometimes I do), everything after the first time you save your code is working on existing code. You could go even further and say every character after the first one typed is extending existing code, but that’s a little extreme. Even if you count from the first time you share your code with others (git push for most of us these days), every subsequent push of the same file is modifying existing code.

Given that, it’s incumbent on all of us to make sure it’s as easy as possible to extend the code, and as hard as possible to break it. You want clean code, with good abstractions. You want the code to be DRY, SOLID, KISS, and a host of other acronyms. The question, of course, is how far to take that idea. The answer, like so many of my other answers, is It Depends. Because our ability to handle cognitive load is limited. Very limited.

Taken to the extreme, you end up with the Fizz Buzz Enterprise Edition. It’s fully abstracted, extensible, and compartmentalized. It’s exactly what you want, and what you’ll get, if you follow all of the rules to their logical extreme. It’s also terrible code to pick up and maintain. Which is the exact opposite of what you should be doing.

Instead of extremes, you have to balance them. Not too much of any one thing. But not none of anything either.

You want the code to be legible. Legible in that when you look at it you know what it does. And what it doesn’t do. You want comments explaining not what it does, but why it does it that way instead of another (see Chesterton’s Fence), so that you don’t change it without understanding its purpose. You want comments explaining the constraints that made it that way, so you know when you should change it. Legible code makes it easy to know where you stand at any given moment. That’s what enables easier extension and makes the code harder to break.
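
Here’s a hedged sketch of the kind of comment that earns its keep. Everything in it (the renewal scenario, the constant, the two-week figure) is invented for illustration; the point is that the comment records the constraint, not the mechanics.

import datetime

# Hypothetical example: the comment records why, not what.
# The (made-up) issuing CA can take up to two weeks to process a renewal,
# so renewing 30 days out leaves slack for a retry. Chesterton's Fence:
# don't lower this without understanding that constraint.
RENEWAL_THRESHOLD_DAYS = 30

def needs_renewal(expiry: datetime.date, today: datetime.date) -> bool:
    return (expiry - today).days <= RENEWAL_THRESHOLD_DAYS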

You want abstraction because it helps reduce cognitive load. When you’re working on one thing, you don’t want to have to worry about anything else. So you want just enough abstraction to let you focus on the task at hand, and understand the things related to it well enough to work with them, without having to know all the details. Too much abstraction and you don’t know how to work with the rest of the system. Not enough and the cognitive load overwhelms you and things get lost in the details. The right abstractions make things easier to extend.
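
As a rough sketch of what “just enough” can look like (BlobStore and save_report are hypothetical names, not from any particular codebase), a thin storage interface lets you work on the report logic without holding the storage details in your head at the same time:

from typing import Protocol

class BlobStore(Protocol):
    """Just enough abstraction: one method, no storage details."""
    def put(self, key: str, data: bytes) -> None: ...

def save_report(store: BlobStore, report_id: str, body: bytes) -> None:
    # The task at hand is naming and saving reports. Whether the bytes
    # land on local disk or in a cloud bucket is hidden behind put().
    store.put(f"reports/{report_id}", body)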

You want clear boundaries. Partly because they too help reduce cognitive load, but also because they tell you where your extensions should be. They belong with things in the same domain. Clean domain boundaries reduce coupling. They reduce unintended and unexpected side effects. They keep things legible for the maintainer. Maintaining clear boundaries also makes things easier to extend.

You want guardrails because they tell you when you’re drifting away from where you’re supposed to be, and help you get back. Unit, integration, and system tests give you that. These tests tell you when your interfaces start to function differently. Then you can decide whether those changes are wrong or not. Hint: Hyrum’s Law tells us that there are probably some people who will think they are wrong. But maybe not. Regardless, without those guardrails you wouldn’t even know to check. Good guardrails make it hard to break things unintentionally.
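
A minimal sketch of one such guardrail (top_scores is invented for illustration): a test that pins the observable behavior of an interface, so any drift trips the test and forces a deliberate decision.

def top_scores(scores: dict[str, int], n: int) -> list[str]:
    return sorted(scores, key=scores.get, reverse=True)[:n]

def test_top_scores_orders_highest_first():
    # Per Hyrum's Law, someone depends on this ordering. If a change
    # reorders the results, this fails and we decide whether the change
    # is wrong, instead of finding out from a user.
    assert top_scores({"a": 1, "b": 3, "c": 2}, 2) == ["b", "c"]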

Because when you get down to it, we almost never write new, green-field, code. Instead, we almost always extend existing code. So we should make it as easy on ourselves as possible.

by Leon Rosenshein

Listen to Your Data

I’ve touched on the importance of listening to your data before, but I decided that the topic is worth revisiting. That time it was about the difference between 0, 1, and many. As a side note, I mentioned the relationship between data and Object Oriented Programming, and how your data can tell you what your objects are.

That’s still true. When people ask me to take a look at their design and architecture and wonder what the right answer is, my first answer is usually, of course, It Depends. And when they ask what it depends on, I generally say it’s a combination of two things: the problem you’re trying to solve, and your data. They’ve usually thought about the problem they’re solving, but they often haven’t thought about the data. So I tell them, go listen to your data and figure out what it’s trying to tell you.

It’s about boundaries and bounded contexts. It’s about composition vs inheritance. It’s about cardinality, what things change together and what things change by themselves. It’s all of those things. How you store, access, and change your data will help you design systems that work with your data instead of fighting against it. From your internal and public APIs to your user interfaces to your logging and alerting.

But it’s also more than that. You have to listen to your data not only about how you store, access, and change it, but about how you process it. What parts do you need to process sequentially, and what parts can you process in parallel? Are you processing real-time streams, or are you analyzing huge piles of historical data? Do you want/need recency bias in your data or do you need to have long term trends? Or maybe both? All of this is going to impact your system.

The trick is to learn to listen to your data at small scale, where you have the luxury of trying something out and seeing what the pain points are while you can still get things to work. Try different data structures. See what kind of algorithms they push you towards. See what makes them work well, and what gets in the way. You can usually make any data structure work with any algorithm, but some things work better together. Trees lend themselves to depth first searches. There are other ways to do depth first, but it’s a lot easier with a tree than with an array.
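
Here’s a small sketch of that last point (the node class and the flattened layout are invented for illustration): the same depth first walk, once over a tree and once over the same data stored as (value, parent index) rows. Both work, but the flat layout makes you rebuild the structure before you can walk it.

class Node:
    def __init__(self, value, children=()):
        self.value, self.children = value, list(children)

def dfs_tree(node):
    # With a tree, the structure is the traversal: visit, then recurse.
    yield node.value
    for child in node.children:
        yield from dfs_tree(child)

def dfs_array(rows):
    # Same data as (value, parent_index) rows. DFS still works, but you
    # have to reconstruct the child lists before you can walk anything.
    children = {i: [] for i in range(len(rows))}
    for i, (_, parent) in enumerate(rows):
        if parent is not None:
            children[parent].append(i)
    def walk(i):
        yield rows[i][0]
        for c in children[i]:
            yield from walk(c)
    yield from walk(0)

tree = Node("a", [Node("b", [Node("d")]), Node("c")])
rows = [("a", None), ("b", 0), ("c", 0), ("d", 1)]
assert list(dfs_tree(tree)) == list(dfs_array(rows)) == ["a", "b", "d", "c"]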

One of the hard parts about learning like this is having a source of problems that have answers, so you can check yourself. One possible source is an undergraduate comp-sci class. In many cases you can find an online class with problems and valid answers. Another is interview prep systems, like leetcode problems. In general, I hate leetcode as an interview technique, for lots of reasons that I’ll get into another time, but as a learning opportunity, I think those problems are a great place to start. Or, if you want a bit of competition to spur you on, another good place is the Advent of Code. Once you’re done speed-running it for points and have a working answer, take some time to experiment with the problem space.

Regardless of how you do it, once you learn how to listen to your data, you’ll hear it talking to you in unexpected ways. You’ll be able to look at a problem with large scale data and see how to break it down, categorize it, and work with it. So your solution works with your data. Not just today, but tomorrow as well.

by Leon Rosenshein

What Is Performance Anyway?

Performance is important. It’s also very context dependent. High performance can mean different things at different times to different people. And what your target audience is going to consider important is never fully known until that audience actually gets your software in their hands.

That said, there are some areas that almost always go into what people consider high performing software. Things like responsiveness, latency, total run time, throughput, and resource efficiency. And of course, the actual result. If you’re talking about performance, you’re probably talking about one or more of those things.

Responsiveness

If the thing you’re building is responsive, whether you’re building hardware or software, people will feel good about it. People want to feel like they have some level of influence, if not outright control, over what they’re working on. That’s the autonomy part of what Daniel Pink talked about in Drive. From the audible click of a physical switch or on-screen button to the time a web page shows the first pixel, the shorter the time a user has to wait for something to happen, the more performant they’ll think it is.

Latency

Closely related to responsiveness is latency. Not the time between the user’s action and the first response, but the time between the user’s action and the thing the user wants being finished. One of the big differences between cheap digital cameras and higher performance ones, outside of actually taking a better picture, was their latency. When you pushed the button on a cheap camera, it would typically beep or click immediately (very responsive), then think for a while, adjust the focus, shutter speed, and aperture, and finally take the picture. By which time the subject had moved out of the frame. A higher end camera, on the other hand, would beep just as soon, but the time taken to adjust things before the picture was taken was much shorter. You got a picture of the thing you wanted because it didn’t have time to move out of the frame.

Total Run Time

Total run time is another big one. How long does it take to do the thing? The less time it takes, the more performant the system is. Going back to those cameras, the cheap camera might take 2 seconds to go from button click to image stored on disk, while the more expensive one could do it in a second. If you prefer car analogies, how long does it take the car to go 300 miles (assuming you’re not constrained by those pesky speed limits)? One car might take 4 hours to go 300 miles. A high-performance car might be able to do it in 2 or 3.

Throughput

Just like responsiveness and latency are related, total run time and throughput are related. It’s not just how long something takes, but how long between each one, and how many you can do at once. Throughput becomes important when you have a large pile of things to do. Throughput tells you how long it will take to get everything done, not just the first one. If you’re moving one person, a sports car has higher performance than a bus. If you’re moving 50 people, the bus has higher performance.

Resource Efficiency

Finally, there’s resource efficiency. For this discussion, resources consist of things like CPU cycles, memory, disk space, and power. Again, this becomes really relevant at scale. If you need to do one thing, it doesn’t matter much whether it takes 1 kilowatt-hour or 10. On the other hand, if you need to do one million of them, the difference between 1 and 1.1 kilowatt-hours per task adds up to 100,000 kilowatt-hours.

When it comes to building high performance systems you really need context. You need to know what you’re optimizing for before you try to maximize performance. Not just what’s important, but how things are important relative to each other. That’s real engineering.

Use case 1 – Moving people

Let’s say you’ve got two vehicles, a sports car, and a bus. Which one is higher performance? Like I said, it depends. It depends on whether you need to get the first person to the new location fastest, or the most people there. It depends on how many vehicles the road can handle. It depends on what kind of fuel you have. And what kind of drivers you have.

                  Sports Car   Bus
Top Speed         150 MPH      75 MPH
Turn Around Time  .1 hr        .5 hr
Count             4            2
Extra Seats       3            50
Miles / gallon    12           8

Assuming a 300 mile trip, the performance looks something like this:

                              Sports Car        Bus
Responsiveness                2 hrs             4 hrs
Latency                       2 hrs             4 hrs
Run Time                      4.1 hrs           9 hrs
Throughput                    ~¾ people / hr    ~5 people / hr
Fuel used / person delivered  ~16.6 gal/person  ~1.5 gal/person
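
For the curious, here’s a rough sketch of where those numbers come from. It assumes one vehicle of each type, a 300 mile leg each way, and one turnaround per round trip; those assumptions are mine, and by this math the bus’s round trip is 8.5 hours, so the 9 in the table looks like a round-up.

MILES = 300

def round_trip_hours(speed_mph, turnaround_hr):
    # Drive out, turn around, drive back.
    return 2 * MILES / speed_mph + turnaround_hr

sports_car = round_trip_hours(150, 0.1)  # 4.1 hrs
bus = round_trip_hours(75, 0.5)          # 8.5 hrs, ~9 in the table

print(3 / sports_car)      # ~0.73 people/hr, the table's ~3/4
print(50 / bus)            # ~5.9 people/hr, roughly the table's ~5
print(2 * MILES / 12 / 3)  # ~16.7 gal per person delivered (sports car)
print(2 * MILES / 8 / 50)  # 1.5 gal per person delivered (bus)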

The sports car can get the first 3 people there the fastest, so if nothing else is important the sports car has higher performance. If you need to get 50 people, then the bus can do it in 4 hours, while the sports car would take ~67 hours. In that case the bus is higher performance.

Use case 2 - Real time vs Batch

In a previous role I was responsible for processing terabytes of data with processes that took hours to complete and had multiple human-in-the-loop steps, while working at a company whose business was predicated on instant responses to millions of user requests per day. And those instant responses were where the money was. Literally. Those instant responses were about prices and payments and user choice. Performance there was all about getting the user a reasonable response as soon as possible. It had to be responsive immediately and quickly give the user a choice to accept. It wasn’t about the best answer. It was about the fastest. And to top it off, load was bursty. There were busy times and slow times, based on time of day, weather, and special events.

Almost all of the company’s systems were designed and built for that use case. Running systems at 50-70% capacity to handle surges in load or failover. Approximations instead of exacting calculations. Because the most important thing was to keep the user involved and committing to the transaction. The systems worked. Amazingly well.

But they didn’t work for my use cases. In my use cases there was always more work to do, and it was more important to get it right than to get the result fast. Step times were measured in hours, not milliseconds. Hell, in some cases just loading the data took longer than most of the steps in the user-facing system took. We didn’t have tight deadlines, but we did have priorities. Sometimes work would come in that was more important than the work we were doing, and we’d have to schedule it in.

While most of the company valued low latency and minimum run-time, we valued high throughput and efficient resource usage. Given that, the existing systems didn’t work for us. Sure, we made use of many of the same low-level components, observability, distributed processing systems, deployments, databases, etc. But we put them together very differently. We exposed things differently. Because our idea of performance was different.

The high performing system you end up building depends on what performance means to you.

So before you go and try to make your system performant, make sure you know what performance means to you. In your particular case.

Then you can optimize accordingly.

by Leon Rosenshein

Time Passes

Hot Take. Unit tests should pass, regardless of what day you run them. Time is hard. It has a way of passing when you’re not even thinking about it. When you’re writing simulations (or unit tests, which can be thought of as simulations of some small aspect of your code), one of the most important things to do is control time. As a general rule, unless you’re measuring performance or displaying/logging the current time, you probably shouldn’t be using your language’s equivalent of Time.Now(). In fact, even in those cases, I’ll assert that you shouldn’t be calling it directly. You should at least be using some kind of dependency injection, if not a whole façade¹.
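
To make that concrete, here’s a minimal sketch of injecting time, assuming nothing beyond the standard library. Clock, FakeClock, and greeting are hypothetical names; the shape is the point, not the specific API.

import datetime

class Clock:
    """Thin façade over the system clock."""
    def now(self) -> datetime.datetime:
        return datetime.datetime.now()

class FakeClock(Clock):
    """Test double that always returns an injected, fixed time."""
    def __init__(self, fixed: datetime.datetime):
        self.fixed = fixed
    def now(self) -> datetime.datetime:
        return self.fixed

def greeting(clock: Clock) -> str:
    # Production code asks whatever clock it was handed; it never
    # reaches for the global clock itself.
    return "Good morning" if clock.now().hour < 12 else "Good afternoon"

# The test pins time, so it passes no matter when it runs.
assert greeting(FakeClock(datetime.datetime(2024, 5, 28, 9, 0))) == "Good morning"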

The other day I was dealing with an error in unit tests. A test on a function that hadn’t been changed in a while started to fail. I was able to reproduce the error locally, and wanted to find out which change caused it. I tried to track it down, using git bisect to help me do it, but I wasn’t able to. Between the time I got the error to happen locally and when I got back to dealing with it, the error magically went away.

Heisenbugs are a terrible thing. The tests pass, but the bug is just sitting there waiting to bite you. It’s never a fun time when you have to find and fix one. One nice thing about them, to use the term nice loosely, is that they’re usually related to one of a few things. The environment (things like ENV variables, files on disk, current working directory), load on the machine (disk space, CPU or network load), multi-threading, or time.

In this case it was time. But not in the code. The code was just comparing the difference in working days between two dates. It was, in fact, correct. It used a calendar and the two dates and gave the right answer.

Instead, this was in fact a kind of Schrödinger’s test. Depending on when the test was run, sometimes it passed, and sometimes it failed. The test was checking that the number of working days between now and three days ago was always at least one.

That seems reasonable. Or at least on the surface, that seems reasonable. Since working days are Monday to Friday, with Saturday and Sunday being weekends, there are never more than two non-working days in any three day period, so there’s always at least one working day.

And that’s how the test worked. It looked something like

import datetime

today = datetime.date.today()
three_days_ago = today - datetime.timedelta(days=3)
result = working_days_between(today, three_days_ago)
assert result >= 1

The problem was, the test forgot that it’s not always true. Like on a three or four day weekend. Like Memorial Day in the US. Run the test on most Tuesdays and it passes. The number of working days between Tuesday and the preceding Saturday is one (the Monday between them). But run it on the Tuesday right after Memorial Day and the number of working days between them is zero. That Monday is not a working day. The function did the right thing. It normally returned 1, but that day it returned zero. And the test failed².

This is actually a hard function to test correctly. Any day can be a holiday. It’s a little better defined for official holidays, but add in company holidays, religious holidays, and personal holidays, and it’s unknowable at test time. There are just too many variables. If you don’t tell the function when the holidays are, you either have to know them when you’re writing the test or find them out at test time.

The most robust way to test this is to change the function to take a calendar, then have the test pass in not just the two days, but the calendar that should be used. Calculate in the test how many working days there are between the two dates, and assert that the return value is exactly the same. Then figure out the edge cases and use boundary value analysis to make sure you test all of them.
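
Here’s a sketch of what that might look like. The signature and the “count days strictly between the endpoints” semantics are my assumptions, chosen to match the story above; the real fix would follow whatever contract the real function has. Memorial Day 2023 fell on Monday, May 29.

import datetime

def working_days_between(start: datetime.date, end: datetime.date,
                         holidays: set) -> int:
    # Hypothetical variant that takes the calendar explicitly and counts
    # the working days strictly between the two dates.
    earlier, later = min(start, end), max(start, end)
    count = 0
    for i in range(1, (later - earlier).days):
        day = earlier + datetime.timedelta(days=i)
        if day.weekday() < 5 and day not in holidays:
            count += 1
    return count

# With the dates and the calendar pinned, the test asserts exact values.
memorial_day = datetime.date(2023, 5, 29)
tuesday_after = datetime.date(2023, 5, 30)
saturday_before = tuesday_after - datetime.timedelta(days=3)
assert working_days_between(saturday_before, tuesday_after, {memorial_day}) == 0

# An ordinary Saturday-to-Tuesday span still contains one working day.
assert working_days_between(datetime.date(2023, 5, 20),
                            datetime.date(2023, 5, 23), set()) == 1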

And by the way, don’t forget that your calendar will change over time. So when you ask how many working days there are between two dates, you need to think about when the question is being asked and know the calendar that was being used at that time. Just in case you didn’t think this was complicated enough.


  1. I’m not saying you should write your own time and date management/handling functions. Just like with security and cryptography, you better have a good reason. ↩︎

  2. NB: The correct answer here is NOT to change the test to be >= 0. ↩︎

by Leon Rosenshein

Licensed Engineers

Closing out this series¹, the third most common reason I’ve seen thrown around for why software engineering isn’t real engineering is:

Real engineers have a license. Software engineers don’t

In the United States you can become a Professional Engineer. Canada has licenses and the Iron Ring², which acknowledges the responsibilities an Engineer takes on towards society. Other countries have similar systems.

To the best of my knowledge, the only place that has a Software Engineering specialty for Professional Engineers is Texas, and while that’s called Software Engineering, it’s really more about computer hardware engineering, and the number of licenses issued is vanishingly small. In the 20+ years that specialty has existed, there has been no uptick in licensed Software Engineers, nor has there been any demand for them. Not from people in the field, not from industry, and not from any government.

With that as background, while it is true that some engineers have those licenses, most people with engineering degrees who work in their chosen field as engineers don’t. And no one says they’re not engineers. If most traditional engineers don’t bother to get a license when they could, but are still called engineers, it’s not reasonable to say Software Engineers aren’t engineers because they don’t have a license that, for all practical purposes, doesn’t exist.

All that said though, it is important to note that just because you write code, you’re not necessarily a Software Engineer. There are lots of extremely skilled, well trained, and talented people who can build infrastructure. In fact, you can’t build and maintain today’s society without them. But many (most?) of them aren’t engineers. They’re technicians. They’re operators. They’re builders.

The same is true for software. There are many people who develop software. From Lego Mindstorm robots to Excel macros to websites to astrophysical signal processing. There are no-code solutions like LabVIEW and, now, Vibe Coding. That’s all programming and software development. It’s important. It can be fun. And it can be crucial to advancing the state of the art in whatever field it’s being applied to.

But just as with your home contractor or heavy equipment operator, the fact that you’re building something doesn’t mean you’re doing engineering. Engineering is about why you make the choices you do, and how you understand and balance the competing constraints of the specific context you find yourself in to provide optimum value.

[Image: six-box engineering process loop: Ask, Imagine, Plan, Prototype, Test, Share]

And that right there is why Software Engineering really is Engineering.


  1. Part 1: Constraints, Part 2: Engineers Estimate ↩︎

  2. Fun fact: Rudyard Kipling, seen by many as the patron of the engineer (see The Sons Of Martha, McAndrew’s Hymn, and Hymn Of Breaking Strain), authored the Obligation recited by wearers of the Iron Ring. ↩︎

by Leon Rosenshein

Engineers Estimate

The other day I talked about the #1 excuse people use when they say software engineering isn’t engineering: that software has no constraints. If you think software engineers don’t have to deal with constraints, here’s the post. Or just go talk to a software engineer.

The second most common excuse I’ve seen is

Real engineers can and do estimate their work. Software engineers can’t (or won’t) accurately estimate.

First, let’s agree that if you’re trying to do something that isn’t even close to something that has been done before, the estimate is going to be wrong. It doesn’t matter if you’re trying to build a Mach 3+ jet, the tallest building in the world, the first steel suspension bridge, or an online service that responds to millions of requests a day in milliseconds.

Second, have you ever been involved in a large infrastructure project, like a highway system, a water system, or a multi-story building? What about a mid-sized project, like building a house or designing a home appliance? If not any of those, what about a small project, like a kitchen or bath remodel? Or even changing a lightbulb? If you’ve ever done any of those, you know how hard it is to come up with an accurate estimate. And if you’ve never done the work, but had the work done for you, you’ve seen that those estimates are just that: estimates. The reality is often different. Wildly different. Even for traditional engineers doing things that have been done before.

But it’s true. Traditional engineers are expected to, and do, estimate their work. And the smaller the delta between what is and what will be, the more accurate the estimate. Generally. And that makes sense. The better understood the problem and solution domain, the better an estimate will be. Until you get to edge cases. You can move a support piling a little bit and change nothing else. That’s easy. But if you find you need to eliminate a support piling entirely because of soil conditions, you suddenly find that you’ve changed from an arched bridge to a suspension bridge. That’s going to blow the schedule. Or the wall you wanted to remove isn’t load-bearing, but there’s plumbing in it. There are lots of surprises that can come up when you actually have to do the thing.

And all of that can happen when you have clear and stable requirements. When the requirements are in flux, anything can happen.

The same thing happens with software engineering. The closer the thing you want is to what we already have, the better the estimate. Want to add a button to the UI? Easy to do and estimate. Develop a new database query? No problem. Unless the screen is full, and adding a button means switching from one screen to two. Or redesigning the whole thing. Or finding out that the data is actually spread across three different databases. Discovering this new information means your estimates need to change.

In fact, change is the biggest reason that estimates in software aren’t as accurate as anyone, including software engineers, would like. It’s very common to start with only the vaguest idea of what is wanted, then iterate until it’s found. This may very well be the most efficient way of developing the software that best solves the user’s problems. We’ve seen how waterfall and BDUF projects end up. They have the same problems with estimation and then they add building the wrong thing just to make it worse.

There’s another thing that comes up as well. As often as not, what software engineers are trying to do is not build a mechanical system, but build a system that replicates a process. A process with people in it. People who do things a certain way, not all of them the same. With a myriad of edge cases. Going back to how things are done in medical offices, the computer-based system took all of the constraints of the old, paper system and somehow mashed them into the new system. Having to deal with both sets of constraints makes the system much more uncertain. As noted above, the more uncertainty and change, the worse your estimates are.

So there you have it. Estimation is hard in software engineering. Because estimation is hard in general. Even if you’re doing something very close to things that have been done before. You don’t know what you don’t know, and the goals can often change as well. Just like in traditional engineering.

by Leon Rosenshein

Constraints

Over the years I’ve seen many people say that software engineering isn’t real engineering. They tend to come up with the same reasons, even if they have different examples. In my mind I’ve grouped them into a few major reasons.

  1. Real engineers work with things in the physical world. Things made of atoms, and they’re constrained by physics. Software engineers, on the other hand, work on “bits”, and bits aren’t real¹. There are no constraints on bits other than the developer’s imagination.

  2. Real engineers can and do estimate their work. Software engineers can’t (or won’t) accurately estimate.

  3. Real engineers have a license. Software engineers don’t.

The other day I ran across another article saying that software development isn’t engineering. It used what I think of as major argument #1 for why software development isn’t engineering. I disagreed. Besides pointing towards the Crossover Project, there were a couple of other things that I mentioned.

First of all, as a person who was formally trained and started their career as an aerospace engineer, I have a decent idea of what goes into that work. I dealt with atoms. Mostly atoms making up aircraft in my case.

Second, it’s true that there are lots of constraints that go into aircraft design.

[Image: six-sided star showing different constraints and how they might relate]

Balancing weight vs. lift, thrust vs. drag, useful payload vs. takeoff weight. Range vs. loiter time vs. acceleration. All of these things have limits based on physics, available technology, and how you choose to balance them against each other. It’s multi-variate calculus. With no right answer, only different choices. In any given situation, the answer to the question of which design is “correct” is It Depends.

Taking them in turn, while your typical civil, mechanical, or aerospace engineer is working on buildings, infrastructure, vehicles, and other very large, very physical things, that’s not the only kind of traditional engineer there is. What electrical engineers are primarily interested in are electric fields and how they interact to transfer and transport energy. Sure, they deal with wires and physical components to do it, but that’s the medium, not the focus. After all, electric current is not the movement of atoms, but the movement of electrons and holes. When you’re concerned about negative space, that’s pretty far from being concerned with atoms.

With that in mind, software engineering is about managing information flow and storage. No one would say that the people who design hydro-electric power stations, with their dams, spillways, and internal plumbing, aren’t engineers. Information is handled in a very similar way. Pipelines, queues, and long-term storage. One is water, the other magnetic fields or electron holes, but it’s basically the same thing.

The other part of the argument is that real engineers are constrained by physics. That’s certainly true. Going back to those planes I worked on, they very much are constrained by physics. There’s only a certain amount of energy in the fuel. You can only convert some portion of that to thrust. For a given shape, the lift/drag ratio is known. You have to balance those things or the airplane doesn’t work. You can’t build a plane out of Unobtanium, no matter how much faster/better/easier it would be.

Similarly, software engineers face constraints. There are the prosaic ones, like clock speeds, amount of memory, and disk space. You can’t use more than you have. Then there are others that are more dependent on the current environment. Network bandwidth is a real limit. Available power is a limit. The speed of light is a limit on communication. The speed of a wavefront in a wire is a limit. Then there are things like the CAP theorem. There are lots of ways to balance these things. With no right answer, only different choices. In any given situation, the answer to the question of which design is “correct” is It Depends.

There you have it. Why reason #1 for software engineering not being real engineering is wrong. Reasons 2 and 3 are topics for a later post.


  1. On the subject of bits and atoms, way back in 2015 I sat in a company all-hands meeting while Travis Kalanick described the new Uber branding. How the company was all about bits and atoms. Using technology to move things in the physical world. ↩︎

by Leon Rosenshein

Slow is Smooth, Smooth is Fast

Move fast and break things. That’s the tech mantra, right? Do something. Might be right, might be wrong. Just do something and see what happens. Things will break. That’s OK. Just fix it later. As the Dothraki say, It is known.

There’s another saying. Slow is Smooth, Smooth is Fast. This one is courtesy of the Navy SEALs. It says the opposite. Slow down. Think about what you’re doing. Make deliberate choices. Every step will be a little slower, but overall things will get done faster. Again, it is known.

And just as with the Dothraki, just because it is known, it’s not necessarily true. Maybe they’re both true. It’s your classic dialectic thinking. It Depends on the context.

Or maybe, thinking about it with the dialectic lens, they’re really saying the same thing, but from different perspectives, so of course they’re both true. We just need to think about them the right way. A way that honors both sayings and leads us to the deeper truth.

From an outside-in perspective, move fast and break things is saying that you should perturb the system and see how it responds. Then, with that new knowledge, you make another change. Do that fast enough and often enough and you end up changing the entire paradigm. You will have broken the old system and replaced it with a new one. Quickly.

From an inside-out perspective, you want to be deliberate. You want to slow down just a bit and consider what you’re about to do. Then do something deliberately. Which leaves you well positioned to make the next deliberate step towards your goal. Do that deliberately enough and it looks like you’re moving smoothly. If you keep doing that, you’ll find that you’ve actually moved faster than if you had rushed each step, but spent more time between steps.

Bringing this back to software development, here’s something to keep in mind as you do your work. Neither of those sayings tells you to take shortcuts or write bad code. When you move fast and break things, the thing that you’re breaking isn’t your code. You’re changing your code, but you don’t break it. You break the outside paradigm.

When you’re moving slowly and smoothly, you are always being careful to not break your code. You keep things smooth so you can keep taking the next step. You don’t need to take time to throw out your code and start again because it can change with you. You don’t need to take an extended period of time to figure out why your code has collapsed under its own weight. You use your understanding of the system to keep it the best simple system for now.

In both cases you might need to back-track a bit occasionally because you’ve chosen to move and break some paradigm, which has taught you that something you’ve done needs to change. That’s expected and it’s fine. Since you’ve done things deliberately, maintaining your optionality, it’s easy to smoothly make that change and move forward.

Which brings us right back to the dialectic. Move fast and break things. Slow is Smooth, Smooth is Fast. Statements that sound like they contradict each other. But are both true. By moving slowly and smoothly, you’re able to move fast and break the paradigm. There’s even a study showing this is true¹.


  1. Code Red: The Business Impact of Code Quality – A Quantitative Study of 39 Proprietary Production Codebases. Details are a story for another blog. ↩︎

by Leon Rosenshein

Government Digital Services

A long time ago, in a country far away, the government released guidelines. Nothing unusual about that. It happens all the time. Usually, when I hear about it, I think of things that are well known, well understood, generally accepted, and now written down in obtuse language with lots of buzzwords and details. Enough fluff to make it largely incomprehensible. You know, standard bureaucratic language.

When I think about the government that did this, I think of powdered wigs, stiff upper lips, and traditions that date back hundreds, if not thousands of years. Very much rooted in what worked before, with only a passing nod to the current.

[Image: the UK House of Lords]

And then there’s this. The opposite of stuffy, hidebound, traditional, bureaucratic guidelines. From Government Digital Services in the UK, the Government Design Principles. First published in 2012. Largely unchanged since then. Very forward looking at the time. And still forward looking.

Before I get too far into this, I do want to acknowledge that the design they’re talking about is software design, not interface design. There are some principles that touch on interface design, but it’s about software design and the software design process more than anything.

It might not be quite as pithy as the Agile Manifesto, but it’s close. Remarkably close for a government publication. If nothing else, look where it starts: with the user’s needs. It includes talking to users and recognizing that what they ask for isn’t always what they need. That’s a great place to start for design.

There were 10 points in the original version, and all of them still apply. From doing only what is needed to making things open and interoperable. Because context matters and we don’t know what we don’t know.

I believe all of these principles are good principles, and I would never use an appeal to authority, but it’s nice when others agree with you.

by Leon Rosenshein

Best Simple System For Now

When you’re writing code you have lots of choices. Even when working with 20-year-old legacy code, you have options. Not all of those options are equal though. Some are cheap and fast now, but may have a large cost later. Others are expensive and slow now, but might make things easier in the future. Your job as a software engineer is to choose the right one.

[Image: a system without feedback and a system with a feedback loop]

Which one is right? You can probably guess what my answer is. It Depends. Of course it does. It always does. Without the context, there is no up-front answer. In fact, both extremes are usually wrong. You don’t want to choose the cheapest/fastest option, and you don’t want to choose the one that gives you the most options in the future.

Instead, you want to choose the one that gives you a good balance of things. You want what Dan North calls the best simple system for now. It’s a very deliberate phrase. There’s a lot to think about in there.

For Now

One of the most important parts of the phrase is at the end. For Now. Given what you know at the current moment about where you are, what the immediate goal is, what stands between you and that goal, and what you think the long-term goals are, what can you do right now? It’s going to change. You know that. You just don’t know how it’s going to change. So you want to maintain your options, not make more decisions than you need to.

Simple

One of the best ways to maintain that optionality is to keep things simple. Simple is easy to understand. It’s easy to reason about. And most importantly, it’s easy to change. But remember, simple doesn’t mean you get to ignore things. It still needs to work. It still needs to work at the scale you’re operating at. It still needs to work when the inputs change. Or at least it needs to work well enough to tell you that it can’t work in the new situation. Remember, KISS. The simpler it is the easier to get right and the harder to get wrong.

System

Another thing to keep in mind is that it’s a system. Even the simplest program is a system. And the important thing about systems is that the parts of a system interact with each other. Often in strange and unexpected ways. You need to remember, and minimize, emergent behavior. By keeping things simple. By remembering that you’re building a system for now.

You need to remember that systems have feedback loops. So you need to identify and understand those loops. So you can work with those loops, instead of against them. When you work against the feedback loops in a system you’re working against the entire system. If you keep trying to do that, you either change the entire system or you end up not changing anything. As John Gall said:

A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.

Best

Finally, we get to best. How can you make something the best? By ensuring that what you’re building is for now. By keeping it simple. And by working with the system. If you do all of those things, you’ve got a very good chance of ending up with the best simple system for now.