by Leon Rosenshein

What Is Performance Anyway?

Performance is important. It’s also very context dependent. High performance can mean different things at different times to different people. And what your target audience is going to consider important is never fully known until that audience actually has your software in their hands.

That said, there are some areas that almost always go into what people consider high performing software. Things like responsiveness, latency, total run time, throughput, and resource efficiency. And of course, the actual result. If you’re talking about performance, you’re probably talking about one or more of those things.

Responsiveness

If the thing you’re building is responsive, whether you’re building hardware or software, people will feel good about it. People want to feel like they have some level of influence, if not outright control, over what they’re working on. That’s the autonomy part of what Daniel Pink talked about in Drive. From the audible click of a physical switch or on-screen button to the time a web page shows the first pixel, the shorter the time a user has to wait for something to happen, the more performant they’ll think it is.

Latency

Closely related to responsiveness is latency. Not the time between the user’s action and the first response, but the time between the user’s action and the thing the user wants being finished. One of the big differences between cheap digital cameras and higher performance ones, outside of actually taking a better picture, was their latency. When you pushed the button on a cheap camera, it would typically beep or click immediately (very responsive), then think for a while, adjust the focus, shutter speed, and aperture, and finally take the picture. By which time the subject had moved out of the frame. A higher end camera, on the other hand, would beep just as soon, but the time taken to adjust things before the picture was taken was much shorter. You got a picture of the thing you wanted because it didn’t have time to move out of the frame.

Total Run Time

Total run time is another big one. How long does it take to do the thing? The less time it takes, the more performant the system is. Going back to those cameras, the cheap camera might take 2 seconds to go from button click to image stored on disk, while the more expensive one could do it in a second. If you prefer car analogies, how long does it take the car to go 300 miles (assuming you’re not constrained by those pesky speed limits)? One car might take 4 hours to go 300 miles. A high-performance car might be able to do it in 2 or 3.

Throughput

Just like responsiveness and latency are related, total run time and throughput are related. It’s not just how long something takes, but how long between each one, and how many you can do at once. Throughput becomes important when you have a large pile of things to do. Throughput tells you how long it will take to get everything done, not just the first one. If you’re moving one person, a sports car has higher performance than a bus. If you’re moving 50 people, the bus has higher performance.

Resource Efficiency

Finally, there’s resource efficiency. For this discussion, resources consist of things like CPU cycles, memory, disk space and power. Again, this becomes really relevant at scale. If you need to do one thing, it doesn’t matter much if it takes 1 kilowatt-hour or 10 kilowatt-hours. On the other hand, if you need to do one million of them, the difference between it taking 1 or 1.1 kilowatt-hours makes a big difference.
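To put a number on that, here’s a quick back-of-the-envelope calculation (this is my sketch, not the author’s; the per-task energy figures and task count come from the paragraph above):

```python
# The same efficiency gap, once per task vs. a million times over.
per_task_a = 1.0      # kWh per task for the more efficient version
per_task_b = 1.1      # kWh per task for the slightly less efficient one
tasks = 1_000_000

extra_kwh = (per_task_b - per_task_a) * tasks
print(round(extra_kwh))  # 100000 extra kilowatt-hours at scale
```

A 10% difference that’s invisible for one run becomes a hundred thousand kilowatt-hours across a million runs.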

When it comes to building high performance systems you really need context. You need to know what you’re optimizing for before you try to maximize performance. Not just what’s important, but how important each thing is relative to the others. That’s real engineering.

Use case 1 – Moving people

Let’s say you’ve got two vehicles, a sports car, and a bus. Which one is higher performance? Like I said, it depends. It depends on whether you need to get the first person to the new location fastest, or the most people there. It depends on how many vehicles the road can handle. It depends on what kind of fuel you have. And what kind of drivers you have.

                    Sports Car   Bus
Top Speed           150 MPH      75 MPH
Turn Around Time    0.1 hr       0.5 hr
Count               4            2
Extra Seats         3            50
Miles / gallon      12           8

Assuming a 300 mile trip, the performance looks something like this:

                               Sports Car         Bus
Responsiveness                 2 hrs              4 hrs
Latency                        2 hrs              4 hrs
Run Time                       4.1 hrs            9 hrs
Throughput                     ~¾ people / hr     ~5 people / hr
Fuel used / person delivered   ~16.6 gal/person   ~1.5 gal/person

The sports car can get the first 3 people there the fastest, so if nothing else is important the sports car has higher performance. If you need to get 50 people there, the bus can do it in 4 hours, while a single sports car shuttling back and forth would take ~67 hours. In that case the bus is higher performance.
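The arithmetic behind those tables can be sketched in a few lines. This is my Python, not the author’s; the function name is mine, and I’ve assumed turnaround is counted once per round trip, which reproduces the table values except the bus run time (8.5 hours here versus the table’s 9, depending on how turnaround is counted):

```python
# A rough sketch of the trip math, assuming a 300 mile one-way trip and a
# round trip of drive out, turn around, drive back. Function name and the
# exact turnaround accounting are my assumptions, not from the tables above.

def trip_metrics(top_speed_mph, turnaround_hr, extra_seats, mpg, miles=300):
    one_way = miles / top_speed_mph             # responsiveness and latency, hours
    round_trip = 2 * one_way + turnaround_hr    # run time for one load of riders
    throughput = extra_seats / round_trip       # people delivered per hour, per vehicle
    fuel_per_person = (2 * miles / mpg) / extra_seats  # round-trip fuel, split per rider
    return one_way, round_trip, throughput, fuel_per_person

car = trip_metrics(top_speed_mph=150, turnaround_hr=0.1, extra_seats=3, mpg=12)
bus = trip_metrics(top_speed_mph=75, turnaround_hr=0.5, extra_seats=50, mpg=8)

print(car)  # one-way 2.0 hrs, throughput ~0.73 people/hr, ~16.7 gal/person
print(bus)  # one-way 4.0 hrs, throughput ~5.9 people/hr, 1.5 gal/person
```

Note that throughput here is per vehicle; multiply by the vehicle count to get fleet throughput.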

Use case 2 - Real time vs Batch

In a previous role I was responsible for processing terabytes of data with processes that took hours to complete and had multiple human-in-the-loop steps, all while working at a company whose business was predicated on instant responses to millions of user requests per day. And those instant responses were where the money was. Literally. Those instant responses were about prices and payments and user choice. Performance there was all about getting the user a reasonable response as soon as possible. The system had to respond immediately and quickly give the user a choice to accept. It wasn’t about the best answer. It was about the fastest. And to top it off, load was bursty. There were busy times and slow times, based on time of day, weather, and special events.

Almost all of the company’s systems were designed and built for that use case. Running systems at 50-70% capacity to handle surges in load or failover. Approximations instead of exacting calculations. Because the most important thing was to keep the user involved and committing to the transaction. The systems worked. Amazingly well.

But they didn’t work for my use cases. In my use cases there was always more work to do, and it was more important to get it right than to get the result fast. Step times were measured in hours, not milliseconds. Hell, in some cases just loading the data took longer than most of the steps in the user-facing system took. We didn’t have tight deadlines, but we did have priorities. Sometimes work more important than what we were already doing would come in, and we’d have to schedule it in.

While most of the company valued low latency and minimum run-time, we valued high throughput and efficient resource usage. Given that, the existing systems didn’t work for us. Sure, we made use of many of the same low-level components, observability, distributed processing systems, deployments, databases, etc. But we put them together very differently. We exposed things differently. Because our idea of performance was different.

The high performing system you end up building depends on what performance means to you.

So before you go and try to make your system performant, make sure you know what performance means to you. In your particular case.

Then you can optimize accordingly.