by Leon Rosenshein

Remediation Surprise

Some call them Postmortems, Post-Incident Reviews (PIR), or Root Cause Analyses. Some call them something else entirely. Regardless of what you call them, taking the time to figure out why something bad happened and what you’re going to do to make sure it doesn’t happen again is a crucial part of building stable, performant, flexible systems. Unfortunately, they can also lead to remediation surprise.

A key part of the PIR is the list of action items. The things you’re going to do to ensure it doesn’t happen again. Or at least that you’re alerted to it sooner so you can react faster. After all, if you don’t do anything with the new learnings, why bother to learn them?

Things on that list of action items have varying priorities. Some are things that need to happen soon. They’re not going to change the day-to-day, but they will make the future better. Some are anti-things. Don’t turn knob “X” without first adjusting knob “Y”. Put that one in the knowledge base and set a reminder to yourself to do some validation later.

And there are things that need to happen right now. Things that are critical to continuing to make forward progress. If you don’t do them, you’re going to continue to spend way too much time fighting fires. Some of those are simple changes, so you just do them immediately. They eat into your slack time, but that’s what it’s there for, and it doesn’t change any of your schedules.

Others, though, are bigger. An algorithm change. A dataflow change. Things far too big to fit into your slack time. You need to get them done, but something’s got to give. You just don’t know what. And neither do the folks depending on you. Which is bad. You want to surprise and delight your customer/user, but that’s not the kind of surprise they want.

By the time you get to this kind of remediation you’ve already surprised your customer once, so you really want to avoid surprising them again. One really good way to do that is to tell them what you’re not going to do. What you need to delay in order to do the work to stop giving them bad surprises. And then ask them if the trade-off makes sense from their perspective. Maybe there’s something else they’d rather delay. Or drop entirely. You won’t know until you ask. When you ask, you avoid the remediation surprise.