Firefighters, especially professional ones, have buckets of slack time. According to one study, firefighters should spend less than 25% of their time responding to incidents (their utilization) to avoid burnout. That’s at least 75% slack time, built into the fabric of the system. If they don’t have that much slack time, they scale up: cities build more firehouses, buy more engines, and hire more firefighters to get that slack time back.
That’s probably a reasonable goal for your on-call person. First of all, you want your on-call to be fresh if something happens. Second, slack time doesn’t mean idle time. It means there’s nothing specific scheduled to be done. There’s always plenty of maintenance work, so the on-call isn’t idle: updating runbooks, alerts, and documentation, automating common on-call tasks, and digging into perennial trouble spots.
But what about the rest of the team? No slack time there. That’s the way to get the most done, right? Wrong. I don’t know about you, but I know the planning I’ve been involved with does a pretty good job of spec’ing out the known knowns, and identifying the known unknowns, but the unknown unknowns and the things we know that just ain’t so always come up.
Even with perfect planning and no unknowns, you need some slack. Vacations. Injuries. Illnesses. Outages (other people’s). All of those things, and more, mean that if you schedule 100% of your time, you’re not going to complete your plan. That seems to be true even if you account for Hofstadter’s Law.
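To put rough numbers on that, here is a minimal sketch of the arithmetic. Every figure in it is made up for illustration (the 65-day quarter, the days off, the 10% lost to interruptions), and `available_days` is a hypothetical helper, not something from a real planning tool.

```python
# All numbers below are hypothetical, chosen only to illustrate the arithmetic.
WORKDAYS_PER_QUARTER = 65  # assumed: 13 weeks x 5 days


def available_days(workdays, vacation, sick, holidays, interrupt_fraction):
    """Days left for planned work after predictable losses.

    interrupt_fraction is the assumed share of each working day lost to
    things like other people's outages and unplanned requests.
    """
    present = workdays - vacation - sick - holidays
    return present * (1 - interrupt_fraction)


days = available_days(
    WORKDAYS_PER_QUARTER,
    vacation=5,
    sick=2,
    holidays=3,
    interrupt_fraction=0.10,
)
share = days / WORKDAYS_PER_QUARTER
print(f"{days:.1f} of {WORKDAYS_PER_QUARTER} days ({share:.0%})")
# prints "49.5 of 65 days (76%)"
```

With these invented inputs, a plan that schedules all 65 days is about a quarter over capacity before a single unknown unknown shows up.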
One option is to not commit to anything. Just work on the most important thing at any given moment. Things take as long as they take, but you’re never late. That works really well at a small scale, when there’s only one person or group deciding what the most important thing is and that thing doesn’t change often while you’re working on it. But when the most important thing changes a lot, and the cost of a context switch is high, you get lots of churn and end up peanut buttering your progress: spread thin across everything, finishing nothing. And that ignores anyone who might be waiting for you to be done.
If others depend on your work being done by a certain time and you miss, then they’re going to miss as well. And whoever depends on them will miss too. Small delays compound and you end up with long delays. So we do need to commit to dates and hit them, especially when working in a deeply interconnected system (which we do).
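A toy model makes the compounding visible. This is a sketch under loud assumptions: teams work in a straight chain, each one starts only when the previous one finishes, and the overruns in `slips` are invented numbers. `chain_lateness` is a hypothetical function written just for this example.

```python
def chain_lateness(slips, slack_per_team):
    """Days late at the end of a chain of teams, each waiting on the previous one.

    slips: days each team's own work overruns (hypothetical inputs).
    slack_per_team: unscheduled days each team can use to absorb lateness.
    """
    late = 0
    for slip in slips:
        # Each team inherits the upstream lateness, adds its own overrun,
        # and absorbs as much as its slack allows (never going below zero).
        late = max(0, late + slip - slack_per_team)
    return late


slips = [2, 1, 3, 2]  # made-up overruns, in days

no_slack = chain_lateness(slips, slack_per_team=0)
with_slack = chain_lateness(slips, slack_per_team=2)

print(f"no slack: {no_slack} days late")           # no slack: 8 days late
print(f"2 days slack each: {with_slack} days late")  # 2 days slack each: 1 days late
```

With no slack, every overrun is passed straight down the line and the last team inherits all of them. With a couple of unscheduled days per team, most of the lateness gets absorbed where it happens. Real dependency graphs are messier than a straight chain, so treat this as an illustration, not a forecast.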
Which brings us back to having enough slack time. Or, put another way, only committing a certain amount (less than 100%) of your time to work against deadlines. It won’t guarantee you meet those commitments, but if you don’t have enough slack, I can guarantee you won’t meet them.