Recent Posts (page 19 / 70)

March 16, 2022 by Leon Rosenshein

Raspberry vs. Strawberry Jam

agile weinberg jeffries

“The wider you spread it, the thinner it gets.”

– Gerald M. Weinberg

Otherwise known as the Law of Raspberry Jam. You take some raspberry jam and spread it over so many slices of toast that it just add as little bit of color and a hint of sweetness. None of the wonderful flavor or texture of the raspberries you were trying to add. I’ve called it peanut buttering things before. Either way, you declare that you’re going to take advantage of something. Then you take the idea, set of resources, your time, or level of effort and spread it so thin that you don’t get any of the benefits you’re expecting. When it doesn’t work (or at least not as well as you expected) you declare that doing it was a waste of time and go back to the old way.

That just doesn’t seem honest to me. If you’re going to give something a chance, you should give it a chance. Really try it out. Put enough effort behind it to know if it works. Unfortunately, in reality you can’t always drop everything and put full effort into an attempt. There is, however, something you can do.

Because Weinberg has another law. The law of Strawberry Jam.

”As long as it has lumps, you can never spread it too thin.”

That’s the important difference between strawberry jam and raspberry jam. Strawberry jams, at least the good ones, have lumps in them. Things that you just can’t spread too thin. No matter how hard you try, those lumps of strawberry don’t get spread around. They might move around, but they don’t get smaller and they retain that great strawberry taste. The idea is to apply that idea to the thing you’re doing/trying. The indivisible nuggets of the idea, resource, time, or effort you invest.

Consider the simple idea of a sprint. If all you do is change your planning from every quarter to every 2 weeks all you’ve done is change the cycle length. You haven’t actually changed much. Instead of seeing how many stories you can fit in a sprint, start by deciding what the goal is. Define what benefit/value your customer/user will see when you’re done and then ensure that you’re working on the right things to make that happen.

That’s the indivisible nugget. Defining the customer value and implementing it. Do that and you’re not just changing the cycle time for the sake of changing the time. You’re changing the time so that you deliver more customer value sooner.

That’s giving the idea a chance.

March 14, 2022 by Leon Rosenshein

Planning and Time Sinks

planning agile eisenhower

Paraphrasing Eisenhower and von Moltke, planning is important, plans, not so much. The biggest reason for that is that no matter how much or how well you plan, things don’t always go the way you expect. Some things go faster, some go slower, and some just don’t work. Which leads to Hofstadter’s Law. Of course, Hofstadter’s Law is self-referential, so you can’t just plan it away.

What you can do, though, is understand why things take longer. There are lots of specific reasons, but generally they break down into 5 time thieves. They are:

Too much WIP: Context switching and always being busy makes you slower
Unknown Dependencies: It’s the unknown unknowns that you find later that slow you down
Unplanned Work: It’s hard to prevent the next fire while you’re fighting the current one
Conflicting Priorities: Not knowing what to do first/next. Changing focus (or even goals) leaves you with lost time (and potentially wasted effort)
Neglected Work: The tech debt that slows you down. All the things that make it harder to get things done

Their all related. Unplanned work is usually because of Conflicting Priorities. Something comes up that is suddenly higher priority than what you’re doing. Unknown Dependencies are often caused by Neglected work. A dependency pops up because you haven’t done the work to define or update your domains or context boundaries when things changed. And anything that has you switching tasks before their done leads to Too much WIP.

Together, all those things mean that your plans shouldn’t be written in stone. Things will change, and you need to be adaptable. But thinking ahead will help you know the things that might happen and lets you prepare for them. And being prepared lets you minimize their impact. You can’t eliminate the impact, but you can make reduce it.

Which brings us full circle. As Eisenhower said,

Plans are worthless, but planning is everything.

March 11, 2022 by Leon Rosenshein

Are Your KRs Really Tasks?

Planning OKRs

OKRs. Objectives and Key Results. The Objective comes from your vision, and the Key Results define what success looks like on your KPIs (Key Performance Indicators). One nice things about Objectives is that they can be decomposed into smaller, more focused objectives as you got down the org hierarchy. That works great at the company, SVP/VP, and director level, and can work for managers of managers.

When the org is using OKRs every level is expected to have their objectives support one of the OKRs at the level above. This helps with prioritization and ensuring the right level of effort/support is put behind each high level OKR. And it provides a path for issues to bubble up so they can be addressed while they’re still small issues, not big problems.

But it often breaks down at the individual team level. The team is trying to increase customer value and decrease developer workload. The problem comes when decomposing the higher level Objectives into team specific Key Results. At a task level the work looks something like “Document service X”, “Implement sharding for the database”, “Enable real-time queuing for requests”, or “Improve error handling”. Those may be critical tasks, and are probably the right things to do, but they’re tasks, not Key Results.

Unfortunately, what often happens is that those tasks end up in the OKR roll-up as Key Results. But they’re not Key Results. They’re not even results at all. They’re just tasks. Remember, Key Results are supposed to be the desired values of the KPIs.

What you need to do is figure out what the right KPIs are for the Objective. KPIs that define how things look from the outside. Then come up with how you expect the completion of those tasks to impact the KPIs. Those are the Key Results. The improvement (outcomes, not outputs) that comes about because of the work done.

That doesn’t mean the team doesn’t need to keep track of the tasks. In fact, it’s critical to understand what the work needed to reach the result is. But it’s even more important to ensure that the result is what’s being tracked, not the work being done.

March 9, 2022 by Leon Rosenshein

Heros

Agile devops

I’ve spent the last 10 years or so down in the metaphorical engine room. Working on developer experience, infrastructure, platforms, and tools. The things you often don’t notice or talk about unless there’s a problem. Then folks complain about it. Then, when it’s better they thank the person who told them about the fix. And I think that’s a problem.

Because it leads to things like this comic.

Two people each see a small problem. One fixes the one they see. The other watches it grow bigger, impacting lots of people. When everyone is aware of the big problem the second person fixes it. The first person is ignored, and the second is marked as a hero

Two people saw small problems but their responses were very different. The first person saw the problem, fixed it, and moved on. The second person just looked at theirs, or maybe didn’t even notice it. Then, when the problem was big and lots of folks saw it and were impacted, they fixed it. The first person was ignored. The second person was declared a hero.

And that’s wrong. Preventing the big fire is much more impressive. Noticing a small issue and keeping it from growing has a much bigger impact on the group as a whole. We need to be thanking the first person.

And understanding why the second person didn’t do anything originally. And getting them to fix problems when they find them.

The first person is being a hero. That’s what we should be celebrating. A culture of quality.

March 7, 2022 by Leon Rosenshein

Are we there yet?

Agile WIP

No, I’m not channeling my kids on a 20 hour car ride (although that has happened multiple times). If you’ve ever made the drive from New York to Miami you know what I’m talking about. You drive for hours. All day in fact. A long day. A very long day. You finally get to Florida. Yay. We’re almost there. Then you realize you still have 6 hours to go. You’re only 2/3 of the way there. Getting to Florida is a significant milestone, but you’re not done yet.

In many ways development in large, complex systems and environments is like that. You spend time figuring out where you want to go (identifying the requirements and sequencing them). You start out heading down the best route you’ve found (getting a tool/service/library working in your local or test environment). You make some side trips, either because you need more supplies (modify an upstream service) or something looks relevant (understanding a dependency). There can be detours along the way (pop-up tasks, outages, etc.), and sometimes you take a wrong turn (incorrect architecture and algorithm choices). And you have some intermediate stops planned along the way (milestones). Eventually you get to where you’re going (your customer is happy with the results).

Getting to Florida on the trip from NYC to Miami is like getting it to work in your local environment. It’s a key milestone, and you should celebrate it, but it’s not the then of the journey. You still need to get the work you’ve done into your customer’s hands. You have to deploy it. Until then, the journey isn’t finished.

Sometimes that’s easy. The change is small, or additive. Or the thing you’re deploying has no state. So after some testing in various environments you deploy it. Other times though, deploying it can be as (more?) complicated than writing it. Test environments can be hard to work with. There might not be enough automation. There might be stateful services that need careful coordination to replace. All of the above. And the final deploy to production, to your customers, can be all that and more.

So two bits of advice:

Don’t say something is done until it’s in your customer’s hands. If you’re using a Kanban style tracking system make sure you have a column for deployment. Track the time it’s in the To Be Deployed state and you’ll find your bottlenecks pretty quickly
Make your test/deployment process as automated as possible. Use scripts, throw away environments, stored data, live data streams, or whatever it takes to make it simple to go through the process. And since your task isn’t done until the deployment is done, you’ve got both the time and incentive to make it faster/easier to do.

March 4, 2022 by Leon Rosenshein

Tasking

PM eisenhower

As a general and President, Eisenhower had a big impact on the US. One of the more relevant to development is the Eisenhower Matrix. It breaks tasks down into 4 types and helps you prioritize which ones you should work on when. It also tells you which ones to ignore.

But what if 4 categories is too many for you? Shreyas breaks it down into 3 categories, giving you one less thing to remember.

TaskPriority

The three categories are:

Leverage: Tasks that have an inordinate amount of impact on the future. Do a great job on these. They’re the important ones and the time won’t be wasted.
Neutral: Tasks that are important, but the world goes on. Do what you can, but don’t obsess over them.
Overhead: Tasks that are overhead. Someone needs them done, so you need to do them. However, actively try to do as little as possible to meet the requirement.

If you do that you spend the time needed on the important tasks and get the overhead out of the way with minimal effort. You get to focus and your customer partners are happy.

Of course, the trick is to make sure you get the categorization right. Invert that and no one will be happy. Personally I prefer Eisenhower’s matrix, but I’ve seen many people be successful with this method.

February 28, 2022 by Leon Rosenshein

All Aboard

life

Today I started the latest chapter in my life. Just started my new job with Amazon, working on Just Walk Out technology. This is a big job change for me. It’s the first time in almost 25 years that I had to interview for a new job. It’s also the first time in almost that long that I made the transition alone.

Prior to Amazon I was at Aurora Innovation. I got there as part of an acquisition. Before that I was at Uber’s ATG subsidiary. I joined that organization, along with the rest of my team, as part of a re-org as Uber decided not to source its own maps from the ground up. I moved to Uber in a similar fashion. Microsoft decided that it wanted out of the map-making business and sold part of Bing Maps to Uber. While I was at Microsoft I spent most of my time in either the Bing Maps org or the Simulations org. Inside (and even between those two orgs) I moved to follow interesting work and advance my career. No (or very informal) interviews, and teams moved together.

This time it’s very different. I’m the only one moving. I had to interview with people I didn’t know and were unfamiliar with my work. Despite the fact that I’ve interviewed ~1500 people over the years, being interviewed was very different. Sure, the experience helped, and Amazon’s interview process was really smooth, but there’s still a lot of pressure.

And now, instead of working on the same things I’ve been working on, or learning new systems with the rest of my team, I’m drinking from the firehose. So much to learn. New technologies. New people. New processes. I can’t wait to see where it leads.

February 25, 2022 by Leon Rosenshein

Chesterton's Fence

context

I’ve mentioned Chesterton’s Fence in a couple of previous blogs, but I never did define it. Chesterton’s Fence is a parable about how lacking context can be dangerous. In his 1929 book, The Thing, Chesterton wrote:

In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.

Or, in other words, if you don’t know why something was done, don’t undo it without figuring out why it was done in the first place. Only when you know why something was put in place can you decide if it’s not be needed any more. Consider the typical bathroom sink.

Many of them have a small hole or two in the near side a little below the top. It just sits there. Nothing comes out of it. Nothing goes into it. So they should be removed, right?

Not so fast. In normal operation those holes are unused. But bathroom sink drains are notorious for being slow. Not blocked, because that’s obvious and gets fixed, but slow. And very often, when in the bathroom using the sink with the water running you’re doing something else. Brushing your teeth, shaving, removing makeup, or just distracted.

That’s when those holes come into play. As the drain runs slowly the water builds up in the sink. Eventually so much water builds up that it starts to run through those holes and down the drain. It’s not perfect, but it helps, and gives you a little more time to correct the problem. But if you never saw them used you might by a sink without them, then find a small lake in your bathroom.

The same thing can happen in code. You want to break long lines at a space if possible. As you’re going through the code you come across a strange byte-counting/examining loops. You’re confused because it doesn’t just start at column 80 and go backward looking for a ``. It starts at the beginning, checks values, skips spaces checks more, and keeps track of the last space it found. Very complicated. So you remove it and go with the simple solution.

You write your tests, deploy, and close out the task. Done and done. Then suddenly you start getting bug reports. Turns out you assumed that all of the input was simple ASCII code, but in reality it’s Unicode/MultiByte characters, so you can’t do the simple thing. Without enough context, you broke the code.

That’s Chesterton’s fence. So the next time you want to remove some seemingly unimportant code, make sure you know why it was added in the first place.

February 23, 2022 by Leon Rosenshein

Moving On

Life

Today was my last day at Aurora. Some of you I’ve worked with for 15+ years across 3 ½ companies, others only for a few months at one company. Regardless, to all of you, a big thank you. You’ve all helped me learn and grow. As a developer, an architect, a mentor, and as a person.

Even though it’s been just over a year, I look back at all that’s been accomplished and it’s truly impressive. Aurora merged two companies’ people, technology, and infrastructure. It’s gone public. It’s pulling loads for partners. It’s showing the world that they can move the goods and manage the fleets.

Although I’m moving on, I have complete confidence that they will successfully bring not just the Aurora Driver, but the entire ecosystem, to market, safely, quickly, and broadly.

Thank you again, and I’ll miss you.

February 22, 2022 by Leon Rosenshein

Exit Criteria

engineering excellence devops

Over my career I’ve written plenty of documents. Requirements docs, design docs, project specifications, white papers, and vision docs. They all have a few things in common. A definition of the current state, the problem with the current state, how the thing being written about will solve the problem (goals), and what issues it won’t address (non-goals). After that the different docs have different focuses and levels of detail appropriate for the role the doc was supposed to fill.

Over the weekend I ran across another section that should probably be in many of those docs. The exit criteria. Not the exit plan, which is a good idea, and says what you’ll do if/when you need to do something else. The exit criteria aren’t how you’ll change, it’s the set of questions/markers that tell you when to change.

Because one of the hardest things to do is to stop doing something that has been working well for a long time. After all, if it ain’t broke, don’t fix it. But what if it is broken, and you don’t recognize it, or don’t want to admit it? It’s also the sunk cost fallacy. It used to work. All I need to do is make one little tweak and all will be well. Besides, I built it, so it must be good.

This is especially true at the Process/Policy level. When I started at Uber there were 14 cultural values (14 is way too many, but that’s a separate issue). They worked really well for Uber at the beginning. They kept people working together, working quickly, and let the company grow much faster than outsiders expected. Then, over time, people started to use them not to make better decisions for the company, but to make themselves look better, sometimes at the expense of others. They took those values and used them to get what they wanted. Toe-Stepping and Be Yourself went from sharing diverse viewpoints to find the most effective to the loudest voice in the room wins. Let Builders Build went from don’t be blocked to promotion-based development. There are lots of other examples

But the values didn’t change. Instead, we started layering process on top of process to try and reduce the impact of weaponized values. Re-education and posters on the wall. An ever-changing promotion process. Locking the beer taps before 6:00 PM. Those things helped a little. Or at least helped reduce the expression of the problem, but they didn’t fix the problem. Eventually things got so bad that we got a new CEO and new values.

It’s by no means the only reason, but the fact that we didn’t revisit the values certainly contributed to the problem. And one of the big reasons they weren’t revisited was that everyone looked at the values themselves instead of the impact those values had. After all, who wouldn’t want to be themselves at work?

I don’t know exactly how to word it, but if, in the detailed description of each value it said something like “When this value starts being used to <XXX> we will reexamine ALL the values and ensure that they still represent and are being expressed as the values we want for Uber” things might have turned out a little differently.

This applies to documents, not just values. When you’re writing a design doc for a system you design it for a certain environment, with some allowed variation. Be explicit about that. If you’re designing a system to handle 100 QPS then it should probably be designed to handle 120-150 QPS just in case. And there should be a note in the doc that says “This design is no longer valid when QPS > 80 for longer than XX” so you know that you need to revisit the problem.

When you come across a system that is outside of its design parameters, make sure you understand how the current situation differs from them, and react accordingly

Older Newer