Recent Posts

by Leon Rosenshein

Semantics

Debugging customer problems can be a challenge, especially over Zoom or email where you can't see the exact input/output. To make it easier I have a set of questions I usually ask at the beginning. One of them is "What version are you running?" That's really important info, since there can be many versions out in the field.

For the infra CLI it's pretty easy to get the version. Just run `infra version` and it will tell you. This is really important on *NIX based systems because, unlike Windows DLLs/executables, POSIX ELF files don't have a standard way to store/retrieve the version info of a file. Instead, for shared objects you use the file name and a system of symlinks with prescribed names to manage versions. Using that is a topic for another time.

The topic for today though is what to put in the version. The simplest is just a continuously incrementing number. Every time your build process produces a public version the version number gets incremented by one. This covers the most basic requirements of versioning. It ensures that all public versions can be uniquely identified and you can tell which of two versions is newer. That's a good start. But it leaves you with a problem. You can tell that one version is older than the other, but you can't tell how much older, and you can't tell from the version number whether it was released yesterday or 10 years ago.

Another versioning scheme is to use the build date/time as the version. Something like YYYY.MM.DD.hh.mm.ss.<buildnumber>: Year/Month/Day/Hour/Minute/Second. That's almost certainly unique, and if you have a parallel build system that can start multiple builds at the same time, the build number disambiguates them. It's important to use that order because it ensures that you can sort by version. The date scheme gives you the ordering of a simple number and lets you know when the releases were done. That's a lot more helpful, and users really like it because it's obvious to them if something is old.

As a developer though, that doesn't tell me something I really want to know. Should I expect that version to have what I'm looking for? And that's where semantic versions (SemVer) come into play. Semantic versions are of the form `<major>.<minor>.<patch>`. The theory is that all versions for a given `<major>` version are backward compatible, and increasing `<minor>` versions only add functionality. Removing functionality would be a breaking change and require a new major version. For a given `<major>.<minor>` the functionality is the same, and new `<patch>` versions only fix bugs/issues that are discovered after release. Pretty simple. SemVer trades the absolute time information for functionality/compatibility info. In my experience that's a trade worth making.
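To make that concrete, here's a minimal sketch of the kind of check SemVer enables, using the golang.org/x/mod/semver package (which expects a leading "v"); the feature threshold here is hypothetical:

```go
package main

import (
	"fmt"

	"golang.org/x/mod/semver" // expects versions like "v0.17.1"
)

func main() {
	have := "v0.17.1"         // the version the customer reports
	needsAtLeast := "v0.15.0" // hypothetical: the feature I care about shipped in 0.15.0

	if !semver.IsValid(have) {
		fmt.Println("not a semantic version:", have)
		return
	}

	// Same major version means it should still be compatible, and a
	// Compare result >= 0 means the minor/patch I need is present.
	if semver.Major(have) == semver.Major(needsAtLeast) && semver.Compare(have, needsAtLeast) >= 0 {
		fmt.Println("that version should have the feature")
	} else {
		fmt.Println("can't assume the feature is there")
	}
}
```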

As with any system, you'll find that sometimes you need to extend it. You might have a one-off release, a shared fork, or simply need to test a version locally, but still be able to keep it distinct from all of your current and future released versions. A common way to do that is with an extension, and that's where a user name and a date can be very helpful. Let's say I want to give you a test version I built based on the code in version `0.17.1`. In that case I might give it version `0.17.1-leonr-test_myfunc`. That way everyone who sees it would know the base version, who built it, and why.

And of course, you'll want this tied in to your version control system, so it's easy to get the code that went into a specified version, and to your build system, so you don't have to worry about getting it wrong.
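I don't know exactly how the infra CLI stamps its version, but a common way to do the build-system half of that in Go is to let the build overwrite a variable via the linker, so the binary can't disagree with the build that produced it. A minimal sketch:

```go
package main

import "fmt"

// version is overwritten at build time by the build system, e.g.:
//
//	go build -ldflags "-X main.version=0.17.1-leonr-test_myfunc" .
//
// Anything built outside the official pipeline keeps the obvious fallback.
var version = "0.0.0-unofficial"

func main() {
	fmt.Println("version:", version)
}
```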

by Leon Rosenshein

Rockall

Back in my early Microsoft days on the FlightSim team we were always looking for better perf. If nothing else we needed to give as much time to the renderer as possible, because our customers always wanted to run the graphics with everything set to 11. And since this was somewhere between C and C++ (home-made cross-module vtables and dynamic linking, but that's another story), memory management was a thing. And Intel's VTune told us that allocating memory was taking a measurable amount of time.

Our solution was to build free lists for each type of object we wanted to make faster. The initial implementation mostly replaced free/delete with putting the object into a linked list, and had malloc/new check the list before allocating. This worked, but we needed to be careful with re-use and re-initialization, and it didn't give us much protection from stale pointers to existing objects. Slowly we added more safety and protection. But it was always work and made things more cumbersome.
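The original was C/C++, but the pattern is simple enough to sketch in a few lines of Go (the type and names here are made up for illustration); note how easy it is to see where the re-initialization and stale-pointer hazards come from:

```go
package main

// Particle stands in for whatever per-frame object was being churned.
type Particle struct {
	x, y, z float32
	next    *Particle // intrusive link, so the free list itself allocates nothing
}

// freeList is the per-type list of recycled objects. Not thread-safe.
var freeList *Particle

// newParticle checks the free list before allocating.
func newParticle() *Particle {
	if p := freeList; p != nil {
		freeList = p.next
		*p = Particle{} // re-initialize; forgetting this was one of the hazards
		return p
	}
	return &Particle{}
}

// releaseParticle replaces "delete": the object just goes back on the list.
// Nothing stops a caller from holding on to p afterwards, which is exactly
// the stale-pointer problem mentioned above.
func releaseParticle(p *Particle) {
	p.next = freeList
	freeList = p
}

func main() {
	p := newParticle()
	releaseParticle(p)
	_ = newParticle() // hands the recycled object straight back
}
```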

One of the nice things about working for Microsoft was access to Microsoft Research (MSR), a whole division of folks whose job was to think about big problems in computer science and figure out solutions. And they were always looking for new problems and ways to get their solutions into shipping products. We got all sorts of interesting tools and ideas from them. One of them was a memory management system called Rockall.

Rockall was based on individual heaps, basically a heap per type of object, and gave us a couple of advantages over simple new/malloc. The biggest was that it was a drop-in replacement. We didn't need to do anything but set it up at the beginning, and then it did its magic. Don't ask me how that part worked. I just used it :) We would pre-allocate the space for however many of each of the different types we wanted, and then getting a new one was blazingly fast since the memory was already allocated and waiting. Initialization magically happened. The memory was also contiguous, so we got better packing. Just that level of usage gave us a lot of benefits.

But it also had a bunch of debug features which helped out. If you had enough memory you could tell it to put each object at the end of a page and mark the next page as unreadable. Any memory overruns were quickly pointed out in that case. You could set it to mark the page as unreadable on delete, and again, any use after delete quickly became apparent.

It was also faster on cleanup. Transitioning out of flying and into the UI just required a few calls to delete each heap instead of making sure each and every object was individually deleted, in the right order, so we didn't leak anything. At debug time it could also tell us which things hadn't been deallocated manually if we wanted, but since we used it on the way out we didn't care.

That kind of thing can help out in GC-based code as well. Memory that's always in use is never subject to GC, you don't end up with things moving around in memory to compact, and your allocation times go down. That was really important when we were doing 3D rendering in the browser for VirtualEarth. We didn't have Rockall, so we had to use a manual free list, but we still got most of the benefits.
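In Go, the standard-library version of that manual free list is sync.Pool; this isn't what VirtualEarth used (that was C#), just a sketch of the same idea in a garbage-collected world:

```go
package main

import (
	"fmt"
	"sync"
)

// Tile stands in for a large, frequently reused rendering object.
type Tile struct {
	pixels [256 * 256]uint32
}

// tilePool hands back recycled Tiles, so steady-state rendering does almost
// no allocation, which keeps GC pressure (and pauses) down.
var tilePool = sync.Pool{
	New: func() any { return new(Tile) },
}

func main() {
	t := tilePool.Get().(*Tile) // reuse one if available, allocate otherwise
	t.pixels[0] = 0xFFFFFFFF    // ... render into it ...
	tilePool.Put(t)             // return it instead of letting it become garbage
	fmt.Println("rendered a tile without adding garbage")
}
```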

Something to think about.

by Leon Rosenshein

In Defense Of Agile

And just to be clear, it's not that I think everything was better 30 years ago and all you kids these days should get off my lawn. Like everything else, there are cycles and sometimes things swing too far, or more often, people pick up the terminology / trappings of a new idea and then ignore the details.

Agile can be like that. Pair/mob programming. Just in time design. Customer feedback. Short cycles. On the surface, or blindly applied, it makes no sense. Two people working on the same line of code is half the efficiency. Just in time design turns into no design. We can't ask customers what they want. They just want a better buggy whip. And how can you get anything done in two weeks? It takes two weeks for us to agree on who's doing something, let alone get it done.

But really, these new ideas aren't about how. They're about why, and the how is the means to an end. Pair programming isn't about efficiency. It's about a broader perspective, making sure things are clear and understandable, knowledge sharing, and up-leveling a team. Just in time design isn't "Make a choice then correct it later by fixing the mistakes" it's "Make the informed decisions you need when you need them and know enough".

Customer feedback is really customer involvement. You need to understand what they're doing, speak their language, and understand their pain so you can collaborate on a solution. And short cycles are part of that. It's ongoing collaboration to iterate on a good solution instead of providing the wrong solution later. It's about providing value sooner so that in the long term everyone is better off.

And that's the big take-away/conflict. The focus on the short term requires a long-term view. You need to make the right choice at the moment that balances what you're doing today with what you're going to be doing the next day/week/year.

by Leon Rosenshein

Pick 2

We've all heard that when it comes to quality, scope, and time you can only optimize two of them at once. And for any given, localized, immediate decision that might be true. But what if I told you that over the long term you're better off with all three?

Following up on yesterday's topic, another one of my issues with two-week sprints is the emphasis they place on time and showing value quickly. If every decision is based on what's going to produce the most value in the current sprint then you end up pushing a bow wave of cruft ahead of you. Eventually the resistance from all that cruft brings your forward progress to approximately zero. That's the classic description of technical debt.

Or to use a car analogy, consider Harbor Freight tools vs Snap-on. Harbor Freight sells project tools. Buy them, use them for one project, two if you're lucky, then move on. Snap-on sells tools for life. If you've ever had the opportunity to A/B compare the two you know what I mean. If you need a specialized tool you can run down to Harbor Freight, pick up the tool, and be using it for less money long before you can get the Snap-on truck to stop by. The second time you need the tool it will probably still work, but the third time you'll need to go buy another one, wasting time and money. And if you need it regularly pretty soon you realize you would have been much better off, getting things done faster and with higher quality, if you had just spent the time and money to get the good tools the first time.

Software development is the same thing. There are times when you need to be fast or cost conscious. Back in the day when I was working on games we needed to have the game on the shelf the day after Thanksgiving. If we didn't we'd lose a large percentage of initial sales, and never make them back up, so making the trade-off made sense. If we didn't get the game out and get the revenue the company might have gone out of business. On the other hand, on Falcon 4.0 we missed the next product cycle because we were still digging ourselves out of the hole we dug. It was the right choice, but only because we literally couldn't afford to spend the money up front.

Over the long haul you're better off doing it right. You'll find it's cheaper, faster, and provides more customer value.

by Leon Rosenshein

Cycles

There's a lot of value in short sprints and frequent planning cycles. At any given time there's so much we don't know about the future needs and requirements of whatever piece of software we're working on that writing a detailed 6 month plan, let alone a 2 year plan, doesn't make a lot of sense. Moving away from waterfall development has codified that reality and helped to normalize the level of effort across the development cycle. Instead of the 3 month long crunch times (12+ hour days, 6-7 days/week) at the end of a 2 year development cycle we have a busy day or two every two weeks.

While short sprints might be better for the project, and definitely smooth out the level of effort, I'm not so sure that's good for the developers. We've done away with the crunch period at the end, but we've also done away with the low effort time at the beginning. Back in the day the first milestone was often a planning milestone. The output was a set of documents. Those documents themselves didn't have a lot of value and were often out of date before the virtual ink was dry, but writing them had a lot of value. Ideas were floated, adjusted, readjusted and prototyped. Some things were discarded and some things were discovered. Blind alleys were explored and opened or closed depending on what was found.

And for the developer there was very little pressure. Expectations were low. Time was available for random exploration. I think we've lost some of that now. In a sprint there's not a lot of time, so it all needs to be accounted for and there's pressure to make sure it directly accrues to some higher goal. And I'm not saying that's a bad thing. In fact, it's good to make sure that work done has some connection to the bigger goals. But it makes it harder to experiment. And it makes it harder to do things that take more than two weeks or don't have explicit outcomes. Yes, a Spike is a thing, but officially it shouldn't be longer than a Sprint. And it should have defined acceptance criteria. But it's not the same thing as "Here's a big hairy problem. You two folks go off and noodle on it for a month. Try things out. See what works and what doesn't. Come back with a design."

That's what we've lost. The ability for developers to have the scheduled downtime to decompress from the daily grind, then come up with something new without a ticking clock. Supersprints are good, but they're not providing the kind of freedom I'm talking about, and that's not what they're for. Whether we call it M0, a Planning Period, or something else entirely, I think we're worse off without it.

Don't get me wrong. I'm not suggesting we go back to waterfall. I much prefer the current model. I just think we need to figure out a way to make our cycle a little more dynamic. I don't want a 3 month crunch, but I do want to figure out how to get the periodic unstructured time back.

by Leon Rosenshein

Circle Of Life

Did you know that Microsoft released a 3D world map with the ability to look at any part of the globe from any angle? Roads were mostly accurate, as were rivers, lakes, and islands. There were thousands of highly detailed 3D models of significant buildings and landmarks, and over 20K highly detailed areas. Of course, since it was Flight Simulator 2000, those 20K areas were airports and your house almost certainly wasn't one of them. I was on the team then and it was a pretty significant accomplishment at the time. We used digitized satellite data, digitized topographical maps, survey data, government data, and some 3rd party content to make the world look real.

A few years after that Google Earth was released and Microsoft, in true fast-moving tail lights fashion, needed to have one. So a small group of us left the FlightSim team and went over to the newly formed VirtualEarth team to build a 3D world in the browser. We did that too. C# managed code in an ActiveX component over Managed DirectX. We used FlightSim data for roads, rivers, airports, and all of the landmark 3D models. It was cool and all, but not many people cared. They didn't have the interest, need, graphics card, or internet bandwidth to really use it. And the world was pretty flat. So we bought a company that made aerial cameras and turned the imagery into a 3D model, and helped them scale up. Oh, and we put a very simple 6DOF flight model in the renderer so you could fly around. No cockpit/instrument panel/navaids, but it felt "airplane-like". So Bing Maps is a direct descendant of FlightSim.

By the time MS sold our part of Bing Maps to Uber we were building 1 meter or better resolution 3D models of the world from 5-30 cm resolution imagery. We were generating our own road networks. We incorporated data we derived from street level imagery to get road names, turn restrictions, speed limits, and business names/addresses. Because we needed a routable map for our users.

And here we are in 2020. Microsoft just released a new version of Flight Simulator. This time outsourced to a 3rd party developer. And guess where they got their data? Bing Maps. So a new game, with a highly accurate 3D world, built from satellite data, aerial data, government data, and user generated content is now available on-line (or as a 10 DVD box set).

So we've come full circle. FlightSim started from existing maps, then turned into a map with VirtualEarth/Bing Maps, and now Bing Maps has turned back into FlightSim.

by Leon Rosenshein

Linkapolooza

I use StackOverflow as much as anyone when I am looking for the answer to a specific question/problem, but when I'm looking for a deeper description of a topic or approach there are usually better places to go. And sometimes I'm just looking to see how other folks do things and why. Not just to solve today's problem, but to add tools to my toolbox. You can never have too many tools, you just need to know when to use them.

The first few are very active, while the rest are more static but still have a lot of useful info and current comments. Enjoy these and add your favorites in the thread.

https://www.developertoarchitect.com/

https://martinfowler.com/

https://medium.com/@jamesagwa/38-company-engineering-blogs-to-follow-as-a-software-engineer-c369e4be9afe

https://www.joelonsoftware.com/

http://wiki.c2.com/

by Leon Rosenshein

No-Code

Lately I've been reading a lot about no-code platforms and the "end of programming", but I don't buy it. Not just because I'm a software engineer and want to keep my job, but because it's just another bubble. We've seen this happen before, in all sorts of fields.

First, as I wrote last year, no-code is just another domain specific language (DSL), albeit a visual one. And done well, DSLs are incredibly useful. LabView and Mindstorms are great examples. Any visual form builder is another. If you're building something visual then doing it visually is the way to go. For well understood situations where customization is limited, it makes a lot of sense.

Second, to use an automotive analogy, when cars were new you needed to be your own mechanic. You had to fix/fabricate things as you needed them, and only people who could (or could afford to have full-time people to do it for them) had cars. This peaked in the 40's and 50's when the dream was to get an old Ford and make it better than it ever was. Today, cars are (mostly) commodities. The majority of people in the US can operate one, but fixing/modifying them is left to the professionals.

Third, someone needs to build those visual DSLs. Doing one of them well is non-trivial. They take time, effort, and a deep understanding of the problem space. And even then, it's best to have an escape hatch that lets people actually modify what's happening.

Back in the Bing Maps days we had a tool called IPF, the Image Processing Framework. It was designed and built for embarrassingly parallelizable tasks, such as image processing, color balancing, image stitching, and mesh building. You could define steps, the tasks within them, and the dependencies between steps. It was the second-gen framework, and one of the things we added was a visual pipeline builder. Since it was Microsoft the tool for this was Visio: different shapes represented different things, and nesting them inside each other and drawing lines between them managed containment and dependencies. And you could add any custom or required values to any shape in the form of a note, and they would get translated into runtime parameters. Things like required cores/memory, timeouts, retries, and what to do on failure.
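IPF was an internal tool, so the details below are purely hypothetical, but a step definition along these lines is roughly what the shapes, lines, and notes boiled down to once they were translated for the runtime:

```go
package main

import "time"

// Step is a hypothetical sketch of what one shape in the diagram carried.
type Step struct {
	Name      string
	DependsOn []string // the lines drawn between shapes

	// The notes attached to a shape became runtime parameters.
	Cores     int
	MemoryGB  int
	Timeout   time.Duration
	Retries   int
	OnFailure string // e.g. "abort" or "continue"
}

// Pipeline nests steps the way shapes were nested inside each other.
type Pipeline struct {
	Name  string
	Steps []Step
}

func main() {
	_ = Pipeline{
		Name: "imagery",
		Steps: []Step{
			{Name: "ingest", Cores: 4, MemoryGB: 16, Timeout: time.Hour, Retries: 3, OnFailure: "abort"},
			{Name: "color-balance", DependsOn: []string{"ingest"}, Cores: 16, MemoryGB: 64, Timeout: 4 * time.Hour},
		},
	}
}
```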

And it was great when things were simple. Blocking out a pipeline was a breeze. Put in the big boxes. Draw some lines. Set the overarching parameters. Move things around as desired to see what happened and whether it could be made more efficient. We did a lot of that. And for pipelines of < 10 steps we made quick progress.

But as things got more complex and the pipelines got bigger it got pretty cumbersome. For many things we ended up falling back to the XML the visual tool generated for details and just used the visual tool to move the big blocks/dependencies.

There's no question that no-code and visual tools have their place and we should use them in those places. You could get all the info presented in RobotStudio from tables, graphs, and images, but playing time series data back visually is just more efficient and makes more sense. Add in the ability to swap out a node in an AXL graph, run the simulation again, and see what happens, and it can be a magical experience. That's no-code at work and we should definitely do it. But to get to that point requires a lot of code, so I don't see the end of coding any time soon.

by Leon Rosenshein

SESE

Single Entry, Single Exit. Is it a Draconian or outdated rule? I'd argue that tools and languages have changed enough that for what most people do, it's more of a guideline that should be part of separation of concerns.

When Dijkstra first wrote his paper on structured programming things were different. In assembly GOTO was one of your only branching mechanisms and you could go anywhere. Jump into the middle of a routine. Jump out of a loop into some other block of code and never come back. And when he talked about single exit it wasn't so much having one return statement, but always going back to where you came from, not going to some other place and continuing execution. So at that time ALL structure was the responsibility of the developer.

In today's languages those things aren't really possible. Pretty much everything is part of a function/method, and it's approximately impossible to jump into the middle of a function. Some languages have some form of exception handling for not returning to whence you came, but generally you go back up the call stack until something handles the exception. So how does SESE apply today?

It still applies because the underlying desire is to help the reader understand the code. If there's one way in and you always go back to where you came from that's a big reduction in cognitive load. And we still want/need that. We always want that.

But that doesn't mean there should only be one return per method. In some cases having multiple returns can make the code easier to understand. Guard clauses, or input validation, are a great example. Validating the input up front, with individual checks for each input and case, gives you a logical SESE for each check. It makes it easy to see what's been checked and lets you return (or throw) specific info back to the caller. Then, when you get to the meat of your method, you know that the inputs are at least mostly correct and you don't need to worry about them. A good check for this: if you find yourself indented halfway across the page when you're doing the actual work, you should probably have some guard clauses and early returns.
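Here's a minimal sketch of that shape in Go (the function and its checks are made up for illustration):

```go
package main

import (
	"errors"
	"fmt"
)

type Order struct {
	ID       string
	Quantity int
}

// processOrder validates up front with guard clauses. Each check is its own
// tiny single-entry/single-exit block, and each returns a specific error,
// so the "meat" below can assume the inputs are sane and stay un-indented.
func processOrder(o *Order) error {
	if o == nil {
		return errors.New("order is nil")
	}
	if o.ID == "" {
		return errors.New("order has no ID")
	}
	if o.Quantity <= 0 {
		return fmt.Errorf("invalid quantity %d for order %s", o.Quantity, o.ID)
	}

	// The actual work, free of input worries and nested ifs.
	fmt.Println("processing", o.ID)
	return nil
}

func main() {
	fmt.Println(processOrder(&Order{ID: "A-1", Quantity: 2}))
	fmt.Println(processOrder(&Order{ID: "", Quantity: 2}))
}
```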

Similarly, there's the finally or catch clause at the end of the method. Sometimes there are things that need to happen on the way out of a method, regardless of whether or not there has been an error. This is very common in languages like C# or Java where you are dealing with native resources and need to manage them yourself, but it applies to things that don't involve that kind of interop too. One of the most common causes of deadlocks in multithreaded code is having a return path in a method that doesn't release a lock. A single return path makes that much harder to get wrong.
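Go splits the difference with defer: you can keep multiple returns and still get the "always runs on the way out" behavior. A minimal sketch:

```go
package main

import (
	"fmt"
	"sync"
)

type counter struct {
	mu sync.Mutex
	n  int
}

// incrementIfBelow has two return paths, but the deferred Unlock runs on
// both, so neither path can forget to release the lock.
func (c *counter) incrementIfBelow(limit int) bool {
	c.mu.Lock()
	defer c.mu.Unlock() // the "finally" that covers every exit

	if c.n >= limit {
		return false // early return: the lock is still released
	}
	c.n++
	return true
}

func main() {
	c := &counter{}
	fmt.Println(c.incrementIfBelow(1)) // true
	fmt.Println(c.incrementIfBelow(1)) // false
}
```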

So while the notion of SESE as written 50 years ago doesn't strictly apply, the reason for it is still important, and there's a lot to be gained by following the principle instead of being Draconian about it or ignoring it completely.

by Leon Rosenshein

Dependency Injection - NOT

I first heard the term dependency injection (DI) a couple of years ago when some folks were talking about UberFX and how DI was the hot new thing and would save us all. And I was very skeptical, because why would I want someone injecting random code into my process? Code injection is an attack vector, not a design principle. Why would anyone build a system around that?

But the thing is, Dependency Injection isn't actually injection. And it's certainly not the injection of random code without the developer's consent. I wasn't in charge of figuring out the name for DI, but if you were to ask me, I'd say they got it almost completely backwards. It's not "injection", it's "demand" and "consumption". Instead of finding/creating the things you need, you just tell the "system" what you need and it hands it to you.

There are lots of advantages to this method. First, you're programming to bare interfaces, not specific implementations, so separation of concerns is baked right in. That's a big reduction in cognitive load right there.

Second, whatever system is providing you with your dependencies can update them, and the next time you start you get the update. That's great for bug fixes, vulnerability patching, feature updates, and infrastructure changes. As long as the interface doesn't change, the calling system can make whatever changes it wants.

Third, automated testing is much easier. When the system under test is created, the test framework can just hand it special implementations of the interfaces that have additional testability hooks. For instance, the infra CLI accesses both the network and the local file system. When testing, the network and FS interfaces passed in provide all kinds of hooks to let the tests control the results of calls, but to the infra CLI itself, absolutely nothing has changed.
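As a sketch of that idea (these interfaces and names are invented, not the infra CLI's actual ones), constructor injection is all it takes to make the swap possible:

```go
package main

import "fmt"

// FileSystem is the narrow interface the tool depends on.
type FileSystem interface {
	ReadFile(path string) ([]byte, error)
}

// CLI demands a FileSystem instead of creating one itself.
type CLI struct {
	fs FileSystem
}

func NewCLI(fs FileSystem) *CLI { return &CLI{fs: fs} }

func (c *CLI) ShowConfig(path string) (string, error) {
	b, err := c.fs.ReadFile(path)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// fakeFS is what a test hands in; it controls every result without the CLI
// knowing anything has changed.
type fakeFS struct{ files map[string]string }

func (f fakeFS) ReadFile(path string) ([]byte, error) {
	if s, ok := f.files[path]; ok {
		return []byte(s), nil
	}
	return nil, fmt.Errorf("no such file: %s", path)
}

func main() {
	cli := NewCLI(fakeFS{files: map[string]string{"/etc/app.conf": "region=us-west"}})
	out, err := cli.ShowConfig("/etc/app.conf")
	fmt.Println(out, err)
}
```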

UberFX (and JavaFX I'm told, but I haven't used it) goes one step further. It includes a bunch of standard interfaces (network, logging, metrics, etc.), and you can add your own interfaces, each with its own dependencies. Then, when you try to start the program, it analyzes those dependencies, works out the order to call the constructors in, and passes the results on to the next layer as needed.
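From memory, and so only a rough sketch rather than a faithful UberFX example, the wiring with go.uber.org/fx looks something like this: you provide constructors, the framework works out the order, and the invoked function gets the finished pieces:

```go
package main

import (
	"log"
	"net/http"

	"go.uber.org/fx"
)

// newMux is a plain constructor; fx calls it when something needs a *http.ServeMux.
func newMux() *http.ServeMux { return http.NewServeMux() }

// newServer demands a mux; fx figures out that newMux has to run first.
func newServer(mux *http.ServeMux) *http.Server {
	return &http.Server{Addr: ":8080", Handler: mux}
}

// report runs once the graph is built. A real app would hook the server into
// the fx lifecycle and start listening; that's left out to keep the sketch short.
func report(srv *http.Server) { log.Printf("wired up a server on %s", srv.Addr) }

func main() {
	fx.New(
		fx.Provide(newMux, newServer), // what you can construct
		fx.Invoke(report),             // what to run at startup
	).Run() // Run blocks until the app is told to stop
}
```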

So despite its terrible name, DI is actually a useful system that enforces separation of concerns and inversion of control. The next question is of course, "If DI is so good, shouldn't we use it everywhere and stop linking against libraries?", but that's a question for another time.