by Leon Rosenshein

Empowered?

That’s what we all want, right? Managers should empower their teams. Individual contributors should be empowered to design the things they’re responsible for. That sounds right, but is it? When I hear about people being “empowered” I start to wonder. Before you think I’m going full-on Taylorism on you, think about the word empower.

Empower:


  • to give power or authority to; authorize, especially by legal or official means:
    I empowered my agent to make the deal for me. The local ordinance empowers the board of health to close unsanitary restaurants.


  • to enable or permit:
    Wealth empowered him to live a comfortable life.

Giving power, authorizing, permitting. Is that really what you want? It seems kind of backwards to me. We’ve all got roles and responsibilities. We’ve been put in them because people have made the conscious decision that we can execute the roles and meet the responsibilities. But if you have to be empowered to do something, what was the situation before you were empowered? Did you have responsibility without authority? Were you overly constrained?

Don’t get me wrong. There are many times when empowering someone is the right thing to do. Delegating decision authority to someone who normally wouldn’t have it can have lots of benefits. The person with the temporary authority gets to stretch and learn. The person delegating frees up time to work on something else. And if you empower the right person, the decision will get made closer to the action and with more context. Wins all around.

But when it comes to empowering a team or individual to do their job, I think using the term “empower” is wrong. It sends the wrong message. It says “This isn’t really your job. Normally you should do what you’re told, but in this case I’m giving you special permission to do more.” It implies that the person doing the empowering has all the power and is relinquishing it.

I believe that’s almost always not the case. When someone tells you that you’re empowered to do your job, what they really mean is that your job includes making a set of decisions, and you should do your job, making all of the decisions that are required. But the choice of words matters. It sends a message. And we should all be careful of the message we’re sending.

So next time you feel the urge to use the term “empower”, think about what message you’re trying to send. If you’re delegating, then empower away. If, on the other hand, you are trying to encourage someone to use the authority they already have, be explicit. Say something more like “What do you think we should do? It’s always been up to you.”

by Leon Rosenshein

Cost Of Time

“I have only made this code more complex because I have not had the time to make it simpler.”

   -- Grady Booch

Getting things right takes time. Making things simple takes more time. The question is, how much time do you have, and when do you have it? Expediency says make it work and move on. But that leaves you with the complexity. And complexity leads to drag.

The question is, how do you deal with the drag? That’s easy. You put in the time to make it simpler, less complex, easier to work with. But when? When you need it, of course.

The trick is figuring out when you need it. You don’t need it right after you implement and release it. It’s working, so doing extra work is taking away from adding more customer value. It’s probably not when you need to make the first change. Chances are the change is small and specific. You haven’t lived with the system that long, so you don’t have a good understanding of system interactions yet. So you make the simple change. You go through that cycle a couple of times and notice that it’s getting harder to make the simple changes.

That’s when it’s time to put the effort in to make things simpler. At some point the drag of the system means the amortized cost of making the system simpler and cleaner, over the next N changes, is lower than just making the changes as they are needed. Once you make things simpler everything is faster.
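To put rough numbers on that break-even point, here is a minimal sketch. The function name, its parameters, and the sample costs are all made up for illustration. The only thing taken from the argument above is the comparison itself: a one-time cleanup cost plus N cheaper changes, versus N changes at the current, draggy cost.

```python
import math
from typing import Optional


def break_even_changes(cleanup_cost: float,
                       cost_per_change_now: float,
                       cost_per_change_after: float) -> Optional[int]:
    """Smallest number of future changes N for which cleaning up first is cheaper.

    Cleaning up wins when:
        cleanup_cost + N * cost_per_change_after < N * cost_per_change_now
    All three inputs are hypothetical estimates in the same unit (say, days).
    """
    saving_per_change = cost_per_change_now - cost_per_change_after
    if saving_per_change <= 0:
        return None  # the cleanup never pays for itself
    return math.floor(cleanup_cost / saving_per_change) + 1


# Hypothetical numbers: a 10-day cleanup that turns 3-day changes into 1-day changes.
n = break_even_changes(cleanup_cost=10, cost_per_change_now=3, cost_per_change_after=1)
print(n)  # 6: at six changes, 10 + 6*1 = 16 days beats 6*3 = 18 days
```

The estimates will never be this tidy in practice, but even a back-of-the-envelope version makes it easier to see whether "the time is right" yet.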

And that’s how you maintain velocity. Not by adding people to a team. That just increases the number of connections and slows things down. It’s not by doing more big design up front (BDUF). You don’t know what you don’t know, and that just causes more rework. And it’s certainly not by taking on more tech debt. Over the long run the interest on the debt will make it so hard to get things done that nothing happens.

So don’t let perfect be the enemy of the good. Make it work, however complex it needs to be to work. But when the time is right, go back and make it simpler.

by Leon Rosenshein

Engineering Excellence

I maintain that we are engineers, and I’m not the only one. And of course, we all want to be excellent to each other. But what is engineering excellence, really?

It’s not doing code reviews or design docs or outage post mortems, but those are all part of it. It’s not SOLID or DRY or KISS, but those are part of it too. And it’s not staging environments, release processes, or alerting. Or at least it’s not just any of those things.

Engineering excellence, and especially a culture of engineering excellence, isn’t about the things you do. It’s about why you are doing those things. It’s the difference between trying to ensure bugs aren’t released with the product at the end and trying to ensure bugs aren’t added to the product during development. A sufficiently rigorous late stage QA process can make sure bugs will never be released, but it can’t ensure you actually release something. Conversely, if you make sure bugs aren’t added (or don’t live long if they are) you can release whenever you want.

It’s engineering excellence that takes you from executing all of the processes in the world because you’ve been told that process makes you better, to “We care enough to do the best,” so we want to do all those things which let us know that we’re doing excellent work. Engineering excellence is about craftsmanship and effectiveness. It’s about fit and finish. It’s about doing things in a way that makes things easier in the future, not what’s expedient now. It’s about doing things in a way that gives you more average velocity, not more instantaneous velocity.

What you end up with is pretty impressive as well. You end up being both happier and more productive. You do things you didn’t know you could, and before you expected to be able to do them, so more value is released to your customers. You find that the little things that annoyed you are gone. There’s less friction when you try to do something. The time you used to sit around waiting for something to finish is gone.

It happens at multiple levels, not just the individual. The same thing happens at the organizational level. Building roadmaps not just for the quarterly review, but with customers, so that the roadmap is built around customer needs. Collaborating with partners to reach a goal quicker. Having processes and norms that make things easier instead of using them as a constraint. Smoothing the rough edges between and around teams and goals so they work together better.

And of course, not a simple copy of some other company’s ideas, but done in a way that fits our context.

by Leon Rosenshein

Context Matters

I’ve talked about contexts before. The contexts of what you and your audience know. The ubiquitous language of bounded contexts. And the shared context of experience. There’s another kind of context that I haven’t talked about. The kind of context that comes with size and environment.

Because size and environment change things. In a company of 10 people it’s possible for everyone to know everyone else. And for everyone to know as much (or as little) as they want about everything else. A company of 100 people can probably do it, if they work hard at it. In a company of 1000 people it won’t work.

When TK started Uber, in one city with a few people, the size of the company was right and the environment he did it in was ready for what he was doing. Trying the same thing today, in this context, won’t get you far.

Consider three very large tech companies: Microsoft, founded in 1975, Amazon, founded in 1994, and Google, founded in 1998. They all have a search product. They all have a large part of their company dedicated to building and selling cloud services. They are all considered successful. But while their product portfolios have a lot of overlap, if you look under the covers no one would mistake one of them for either of the others. And if you tried to transplant a specific process or piece of the tech stack from one to another it wouldn’t work.

There are books by company insiders about how Microsoft and Google do software development. In both cases they’re distillations of hard-won lessons at the respective companies. And they’re different. Because they evolved over time in a specific context. A context that includes things like company processes, number of employees, company wide and org specific tools, and very different leaders at the top.

Which is a long way of saying that while there’s a lot to learn from the successes (and failures) of other companies and the tools and processes they used, we, or any other company for that matter, can’t just blindly apply those lessons and expect them to work as well.

Instead, we should view the experiences of others through the lens of our context and apply those learnings. Building the tools and processes, the culture, that works in our context.

by Leon Rosenshein

Measurement

"What gets measured gets managed - even when it’s pointless to measure and manage it, and even if it harms the purpose of the organization to do so."

Peter Drucker

"It is wrong to suppose that if you can’t measure it, you can’t manage it – a costly myth."

W. Edwards Deming

“Managers who don’t know how to measure what they want settle for wanting what they can measure.”

Russell Ackoff

"Tell me how you measure me, and I will tell you how I will behave."

Eliyahu Goldratt

You’ve probably heard shortened versions of the first two. “What gets measured gets managed” and “If you can’t measure it, you can’t manage it”. Now compare the common, well-known, shortened versions to the full quotes. In context the meaning is very different. But that never stopped a good quote from entering the zeitgeist. Or putting the new cover sheet on all your TPS reports. And it’s not just conventional wisdom. I’ve been in multiple training sessions over the years where the shortened, incorrect versions were used. That’s a real problem because all it does is cement the misunderstanding.

Beyond the misunderstanding, the real problem with the short versions is that they lead directly to the second two quotes. Measuring how much value a team has added, especially when a company is pre-revenue, is hard. You can’t look at a bump in sales or monthly active users. There are no downloads to count or even user comments to look at. What you can do though is measure activity. Like tickets opened/closed, PRs landed, or documents written. Things like that are simple and easy to measure. And get you a new minivan.

As is often the case, the antidote to incomplete understanding is to go back and look at what was actually said. Don’t just measure something because you can. First, measure something because the information conveyed by the measurement actually helps you manage things in the direction you want to go. If it doesn’t, stop measuring it.

Second, manage the things you need to manage. Some things, like the average response time of an RPC, are measurable. It makes sense to track them. And do something when one is demonstrably past a threshold. But when you don’t have a direct, explicit measurement for something, there’s probably a qualitative measurement. Unmeasurable doesn’t mean unknowable. You can’t measure the happiness or energy level of a team, but you can tell if it’s higher or lower than it has been. That’s important information. Use it.

So instead of measuring what you can and then managing that data, manage so that value is added. So that, even when no one is watching, the decisions that get made all day long, the little tiny ones, are made in a way that value gets added. Which means we’re back to talking about culture. Again.

by Leon Rosenshein

Broken Windows

A system that breaks randomly but frequently…and requires very short bursts of work to resolve the issue… will sap the morale of a team far out of proportion to the “size” of any ticket

-- John Cutler

When you overload a database and it falls over, no one is surprised that things stop. When that happens the team(s) involved dig in, pick it back up, and get things going again. Next they figure out the series of unfortunate events that caused the outage and do something to make sure it doesn’t happen again, or at least warn them before it does. Then they close out the issue and move on.

But sometimes that doesn’t happen. Instead, what happens is that an alert fires, but by the time someone starts looking the alert clears. Or maybe there aren’t any alerts, but throughput drops. Or someone tries to access a different S3 bucket and gets an error. There’s no outage, but there’s always something not quite right.

In many ways those on-call shifts are worse than ones with an outage. Because you never get a chance to catch your breath. Or think about root causes. Or do something about it.

What’s a team to do in that situation? The simple answer is to not ever let it get like that. And maybe that’s possible. But probably not. Because when the rubber meets the road you find out that things aren’t the way you expected. It could be because there was a constraint you didn’t see, or a set of known constraints were reached at the same time. Or there’s just a new use case with very different characteristics. It doesn’t really matter which of those reasons it is or if it’s something different. The situation will arise, so you need to deal with it.

IME the best way to deal with it is to own it. Acknowledge both the pain you’re feeling and that it needs to be dealt with. Acknowledge it as individuals and as a team. Take the time to understand what’s really going on. Then take the time to fix it right. Sure, it’s going to impact near term deliverables. If you spend a few days analyzing, experimenting, and delivering a fix you have slipped your immediate delivery by that many days. Own it. Tell your customers. But also recognize that after you put in the effort you’re going to be in better shape.

Drag will be lower. You’ll move faster. With everything else staying the same you’ll find that pretty soon you’re ahead of where you would have been if you hadn’t taken the time to fix the issue.

And you know what happens if you don’t take care of the issue? You’ve created a precedent. Similar to the broken windows theory. If it looks like things aren't being taken care of they won't be. When the next issue comes up people will look around and see that others haven’t done anything about their issues. Which makes them less likely to do something about the new one. Now you’ve got an even stronger precedent. Which makes it that much more likely that the third issue will be ignored.

Then one day you pick your head up and realize there’s a culture of not worrying about the little things. That’s not a culture anyone wants. Friction builds up. Progress slows. Deadlines are missed. You see things like this on the big display at the airport that's supposed to tell you when your flight is departing.

[Image: Windows setup screen]

So instead, build a culture where the little things are important. Where people and morale are important. Where problems are acknowledged, owned, and acted on. And you’ll find that you not only have happier, more fulfilled people and teams, but you’ll get more done as well.



by Leon Rosenshein

Drive

You’ve probably heard of Maslow’s Hierarchy of Needs. Especially the first four levels. What Maslow called the D-Needs. The basic life and safety needs. Things like air, food, and shelter. If those aren’t met, nothing else really matters. And that makes sense. And for a long time that was motivational theory. In life and at work. People are motivated to have their D-Needs met, and a steady paycheck ensured that you could meet those needs. But what happens when you’ve met those needs? How do you motivate people after that? Sure, Maslow had levels beyond the D-Needs, but they weren’t as well fleshed out and there wasn’t a coordinated way to approach motivation at that level.

Then, in 2009, Daniel Pink published Drive: The Surprising Truth About What Motivates Us. Pink took as his starting point not Taylorism and the assembly line/repetitive task worker, but the knowledge worker. According to Pink, once you meet those basic needs, a different approach is needed to motivate those people. Instead of a hierarchy, it’s a combination of three distinct, yet related things: autonomy, mastery, and purpose.

Autonomy: The desire to be self-directed. To figure out what the thing is that you should be doing.

Mastery: The desire to learn and grow. Being able to develop new capabilities for yourself.

Purpose: The desire to do something important. Something that will have a broad impact.

But how do you have a team, let alone a business, when all of the team members are doing what motivates them individually? How do you keep a team from turning into a group of individuals with a common manager, if not devolving into anarchy? That’s where the interrelatedness comes in.

The interrelatedness comes from vision. Because when the company’s vision aligns with the team’s vision/purpose, which aligns with the individual’s vision/purpose, anarchy is not the issue. If everyone’s purpose aligns, then, in general, work will align. That’s a two-fer right there. Making sure you have alignment of purpose means your autonomy won’t turn into anarchy (assuming you have trust and clear communications).

One thing that alignment doesn’t do though, is ensure that all of the needed work is being done. That’s where mastery can help. There’s lots of work to be done, and there are probably people who think it would be interesting, but don’t know how to do it. Instead of silo-ing folks into narrow areas, you can encourage them to learn new things in different areas. That’s another two-fer. You get more work done in the areas that need it and you bring different viewpoints/perspectives, which give you better work as well.

The real magic is in how to use that kind of motivation. To do that you’re talking about culture. Which is a whole different topic for another day.

by Leon Rosenshein

DNS Linkage

Question: How dynamic is your linker? Or put another way, can DNS be considered a really late binding, dynamic linker? Or put a third way, is the only difference between a monolith and a gRPC microservice ecosystem when the binding occurs?

Of course DNS isn’t a linker. A static linker pre-calculates the memory address of a function and fixes up each call with it. That address is the value that ends up in the instruction pointer. Then the normal operation of the program jumps to the function, and when it’s done the result (if any) is in a well-known place and execution continues. A dynamic linker does essentially the same thing, only it does it much later in the process, just before it’s needed.

DNS on the other hand just translates a well-known name to a routable address. It doesn’t set any registers. The normal operation of your program doesn’t go to a different memory location based on what DNS comes up with. Instead, your program does the exact same thing, regardless of what DNS returns. It sends a message to the address. Then it stops and waits for the answer (if any) to show up in a well known place. From an architectural perspective that really isn’t any different. That’s probably why it’s called a REMOTE procedure call. It’s just like any other procedure call, but it happens somewhere else.

That’s interesting and all, and it means there’s something really useful you can do with it. You can expose your API both as an RPC and as a library call. Yes, it’s more work up front, but it enables lots of interesting things.

Consider a system that, in mature, Uber-scale production, needs to handle thousands of concurrent operations. You’ll probably want to scale different parts differently, avoid single points of failure, and be able to update bits and pieces individually. This might naturally lead you to a microservice architecture. It meets all of those requirements. But before you get to that scale you’re handling a few concurrent operations. Or, as a developer building/testing the thing, in most cases you’re looking at one transaction at a time. So you don’t need the scale.

So maybe at the beginning you build libraries and link them together into one thing. It’s easier to build. It’s easier to test. It’s easier to deploy. Then as load and scale increase and you really need it, you turn your function calls into gRPC calls and split the monolith into a set of microservices.
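As a rough illustration of exposing the same API both ways, here is a minimal sketch. Everything in it is hypothetical: the `Pricing` interface, the `quote` method, the made-up pricing formula, and especially the `send` callable, which stands in for a real transport such as gRPC. It uses JSON over an in-process function so the example stays self-contained; it is not how gRPC itself works.

```python
import json
from typing import Callable, Protocol


class Pricing(Protocol):
    """The API callers depend on; they never know where it actually runs."""

    def quote(self, distance_km: float) -> float: ...


class LocalPricing:
    """Library flavor: a plain in-process function call."""

    def quote(self, distance_km: float) -> float:
        return round(2.0 + 1.5 * distance_km, 2)  # made-up formula


class RemotePricing:
    """RPC flavor: same method, but the call is serialized and sent elsewhere.

    `send` is a stand-in for a real transport. It takes request bytes and
    returns response bytes.
    """

    def __init__(self, send: Callable[[bytes], bytes]) -> None:
        self._send = send

    def quote(self, distance_km: float) -> float:
        request = json.dumps({"method": "quote", "distance_km": distance_km}).encode()
        response = json.loads(self._send(request))
        return response["result"]


def serve(request: bytes, impl: Pricing = LocalPricing()) -> bytes:
    """Server side of the fake wire: decode, call the library, encode."""
    payload = json.loads(request)
    return json.dumps({"result": impl.quote(payload["distance_km"])}).encode()


def trip_summary(pricing: Pricing, distance_km: float) -> str:
    """Caller code: identical whether the binding is local or remote."""
    return f"{distance_km} km costs {pricing.quote(distance_km):.2f}"


print(trip_summary(LocalPricing(), 10))        # monolith: a direct call
print(trip_summary(RemotePricing(serve), 10))  # "microservice": same answer
```

The caller only ever sees `Pricing`, so deciding whether a call stays in-process or crosses the network becomes a wiring decision made at startup, not something baked into the calling code.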

What if you’ve already got a set of microservices and you need to do some testing? Sure, you could spin up a cluster and deploy a whole new set of services and make your calls. Or you could docker compose the entire system. Good luck step debugging though. You’ll need great logs, tracing, and a lot of patience with the debugger of your choice as you chase data around the system.

The other option would be to recompose them as a monolith. Debugging becomes easy. Following the execution is step-into instead of finding the right instance of the right service and hoping your conditional breakpoint is correct so you catch what you’re looking for. And it runs on your local machine. Fewer hoops. Fewer hops. Less latency. More productivity.

So no, DNS isn’t a linker. But sometimes it acts like one. 

by Leon Rosenshein

That's a Drag

Speaking of flow and WIP, the reason to reduce WIP is to increase flow. The reason for increasing flow is to reduce time to value. To add value sooner. So obviously, if you want to add value sooner, you need to go faster.

But what do you need to go faster at? Increasing top speed can help, but not as much as you think. What you really want to increase is your average speed. After all, how much of your time is spent at “top speed” (whatever that means)? If you want a car analogy, it’s the difference between the big oval at Indianapolis Motor Speedway and the road course at Laguna Seca. At Indy you spend a lot of time at top speed on the long straights. Going a little faster there can make a lot of difference. Laguna Seca, on the other hand, has a top speed, but it’s not limited by the car’s top speed. Instead it’s limited by the shape of the course and how hard you can brake, how much grip you have around the turns, and then how quickly you can accelerate towards the next turn.

Developing large, complex and complicated software projects is much more like that road course than the brickyard. If you want to bring your average speed up the best way to do it is to bring up the speed of the slowest parts, letting you spend less time there and more time at the higher speeds. That’s where drag comes in.
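The arithmetic behind that is worth seeing once. Here is a small sketch with a made-up lap: the distances and speeds are invented for illustration and have nothing to do with the real tracks. It compares raising top speed by 10% against raising the speed of the slowest part by 10%.

```python
def average_speed(segments: list[tuple[float, float]]) -> float:
    """Average speed over a lap given (distance_km, speed_kmh) segments.

    Average speed is total distance over total time, so slow segments
    count for more than their share of the distance.
    """
    total_distance = sum(distance for distance, _ in segments)
    total_time = sum(distance / speed for distance, speed in segments)
    return total_distance / total_time


# Hypothetical lap: 3 km of straights at 200 km/h, 1 km of corners at 50 km/h.
baseline = average_speed([(3, 200), (1, 50)])
faster_straights = average_speed([(3, 220), (1, 50)])  # top speed up 10%
faster_corners = average_speed([(3, 200), (1, 55)])    # slowest part up 10%

print(f"baseline:         {baseline:.1f} km/h")
print(f"faster straights: {faster_straights:.1f} km/h")
print(f"faster corners:   {faster_corners:.1f} km/h")
```

With these numbers the lap averages about 114.3 km/h. Speeding up the straights gets you to about 118.9, while the same percentage improvement in the corners gets you to about 120.5, even though the corners are only a quarter of the distance.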

When I say drag, I mean the things that slow you down while you’re doing something else. Time to compile and time to run tests are part of it, and maybe the most visible parts, but by no means all of it. Time spent configuring your environment to your (and your team’s) liking is drag. Time spent trying to find the right piece of documentation about how a library or tool works is also drag. Learning a new process or tool is in that category. Planning can be a drag. Then there’s the big one. KTLO: keeping the lights on. The more time you spend on KTLO, the less time you’re spending at top speed. Heck, reading random blog posts on the corporate intranet could be considered drag.

You can’t eliminate all drag. It’s an inherent part of the system. You need to compile your code. You need to keep the lights on. You should always be learning (and reading blog posts :)). But you need to be aware of drag. And minimize it.

Simple things, like incremental builds and tests. Using remote builds to speed up your builds. Scripts to automate common tasks like setup and deploy.

More complex things like build/borrow/buy decisions or planning at the correct level for the time horizon. Taking a little bit longer up front to reduce KTLO later. Building test harnesses so that you can do integration testing without needing a full environment.

So next time you’re trying to figure out how to reduce the time it takes to add value, remember that increasing top speed is only one of the things you can do. In many cases you can add value sooner and easier by reducing drag.

by Leon Rosenshein

Flow and WIP

Way back in 1975 Fred Brooks told us, in The Mythical Man-Month, that adding people to a project that’s late will just make it later. There are lots of reasons for that. One of the biggest, according to Brooks, is that the number of communications channels between all of the people involved goes up roughly with the square of the number of people involved (actually n(n-1)/2), and all that communication at best slows things down, but more likely creates bottlenecks, which really limit things.
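To get a feel for how quickly n(n-1)/2 grows, here is a tiny sketch. The function name and the team sizes are just for illustration.

```python
def communication_channels(people: int) -> int:
    """Number of distinct pairs among `people`, i.e. n(n-1)/2."""
    return people * (people - 1) // 2


for team_size in (3, 5, 10, 20):
    print(team_size, communication_channels(team_size))
# 3 -> 3, 5 -> 10, 10 -> 45, 20 -> 190
```

Doubling a team from 10 to 20 people roughly quadruples the number of possible conversations.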

The other thing that can slow down a team is too much work in progress (WIP). With lots of WIP you have a few issues. First, you often end up doing work that isn’t quite right. If you build the dependent system (B) before the thing it depends on (A), you end up either constraining A to be something other than it should be, or you end up doing rework on B to match what A ended up being.

Second, if there is no B, but there’s some unrelated task that keeps you busy, you end up with extra context switches. And every one of those takes time. Which ends up taking time away from actually doing those tasks. So you actually make less progress.
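A toy model makes the cost of those context switches concrete. This is only a sketch under simple assumptions: every switch between tasks costs a fixed amount of time, work is done in fixed-size slices, and all the numbers are invented.

```python
def finish_times(work: list[int], slice_size: int, switch_cost: int) -> list[int]:
    """Round-robin over tasks, paying `switch_cost` whenever the task changes.

    `work[i]` is how much effort task i needs; returns when each task finishes.
    """
    remaining = list(work)
    finished = [0] * len(work)
    clock = 0
    current = None  # index of the task worked on most recently
    while any(r > 0 for r in remaining):
        for i in range(len(remaining)):
            if remaining[i] <= 0:
                continue
            if current is not None and current != i:
                clock += switch_cost  # the context switch itself takes time
            current = i
            step = min(slice_size, remaining[i])
            clock += step
            remaining[i] -= step
            if remaining[i] == 0:
                finished[i] = clock
    return finished


# Two hypothetical tasks of 5 units each, with a 1-unit cost for every switch.
one_at_a_time = finish_times([5, 5], slice_size=5, switch_cost=1)
always_busy = finish_times([5, 5], slice_size=1, switch_cost=1)
print(one_at_a_time)  # [5, 11]
print(always_busy)    # [17, 19]
```

Same two tasks, same total work. Finishing one before starting the other delivers the first at 5 and the second at 11. Bouncing between them delivers nothing until 17.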

But the biggest issue is that while on average tasks might take a certain amount of time, in practice they’re all different. And that difference creates bottlenecks and work queues. Especially when combined with the notion that you should be busy, so you start something else while you’re waiting. Which means you’re not ready when the work is, which delays it even more.

You’ve probably heard that before, but it’s hard to visualize. How can being busy slow you down? It just feels wrong. I recently came across this video which shows a simulation of just how this works, or doesn’t work. How adding stages makes things slower. Even when you add more people at the bottleneck. And how limiting WIP and expanding an individual's capability makes the overall system faster.

And that’s what we’re really going for. Making the system faster, not the individual pieces.