by Leon Rosenshein

A Problem Vs. The Problem

If you’re looking for the All Models Are Wrong that was here by mistake, just follow this link.

I’ve talked about Root Cause Analysis (RCA), otherwise known as “Why did this really happen?”, before. It’s an important tool in deciding if you’re fixing a problem or alleviating a symptom. Both are very doable, and both can be appropriate. The question you need to ask yourself is “What impact do you want the solution you’re coming up with to have?”

The answer to that question is very contextual. If you’re doing a market analysis or trying to find product/market fit, then you’re looking to solve a problem that your customers might not even know they have yet. At the other end of the spectrum are outages and incident response. Something that used to work is suddenly not working. Users know exactly what their problem is (they can’t do the thing they’re trying to do), but have no ability to get it working again. You, on the other hand, have no idea what the user’s problems are, but you can see that your software/service isn’t responding correctly.

In the first, arguably easier, case, you have a couple of luxuries. No one is upset with you (yet). They have limited, or no, expectations. You have the time to think deeply about the problem and figure out what the root cause is and come up with a solution that solves that root cause. Then, with the root cause addressed, you can meet the user’s specific needs and add value for them.

When you’re dealing with an outage, on the other hand, you don’t have the luxury of time. Something is broken and your users are suffering. You might just be slowing them down a little, but you could also be costing them time and money. At Uber, if the rider/driver matcher wasn’t working you had both problems. The riders were unable to get where they needed to be on time, and the drivers weren’t able to earn money. That puts more than a little urgency on solving the problem. That urgency drives you away from doing RCA and solving the underlying problem, and toward solving the proximate cause instead. However, you still need to solve the root problem.

Which leads right back to a problem vs the problem. Sometimes you need to solve a problem, the one right in front of you. Door A or door B? It’s highly unlikely you’ll be in the same situation again, so there’s no deeper, root cause to find and fix.

Consider the case where you’re at some kind of social event and you’ve walked up to the buffet and they have lots of choices you like. Take a little of everything, or a lot of one thing? Make a choice. If you’re still hungry after, make a choice again. There’s no need to investigate why you’re at the buffet, how the selection of dishes was determined, or figure out a way to change those things.

On the other hand, if you’re at the event and you’re not hungry, the situation is different. Even if you have some kind of restricted diet, you’re not hungry, so there’s no need to find something to eat. You can just grab a drink and think about what you could do if you were counting on the buffet having something you could eat. There are lots of ways to handle dietary restrictions at buffets, from carefully picking foods to talking to the provider to eating before or bringing your own food with you. There’s no pressure, so take the time to figure out what you want to do next time you end up at a social event that might have a buffet.

On the gripping hand, you’re at the event, you’ve been on the road for 12 hours, so you’re more than a little hungry, and you’ve got those dietary restrictions. Now you need to solve both problems. You’re hungry, and clearly you hadn’t fully thought through the situation. What can you do to get something to eat now (the proximate problem) and how can you make sure you don’t find yourself dealing with the same problem again (the root cause)? To make matters worse, you’re hangry because you haven’t eaten in 12 hours. In that case, fall back to the incident response pattern. Stop the damage and grab a drink. Mitigate by finding something to eat. Identify and implement the long-term fix by keeping some iron rations in the car so you don’t go 12 hours without eating again.

So next time you find yourself solving a problem, make sure you’re solving the right problem at the right time.