
by Leon Rosenshein

Not Even Wrong

Sometimes I run across articles that make me want to reach through the screen, find the author, and say "Stop it. You missed the point entirely and you're just making things worse." The article _The essential difference between microservices and APIs_ is one of those cases.

To use an analogy, consider an article titled _The essential difference between Ethernet and communication theory_. Then you read the article, and it talks about how Ethernet is better than the old networks because you can get better overall throughput when transmitters are relatively sparse. It never names the older network technology, it just talks about how much better Ethernet is. Yes, Ethernet can do better than a Token Ring network, but that's not what the title promised to talk about.

Basically, the title compares an implementation of one concept to a different (but related) concept itself, and then the article goes on to compare two implementations. That does a disservice to both implementations and to the concept.

Microservices and monoliths are architectural design patterns, and they both have their pros and cons. For example, microservices let you deploy small pieces and scale them up/out individually, but they're hard to test end to end. Monoliths are easy to test and debug end to end, but hard to scale up/out. There are lots of other differences, and other options besides those two, but that's a whole different topic. And choosing the correct implementation is important if you expect your system to be flexible enough to last. (If you want to dive deeper, let me know and maybe we can do a Kata event with your team.)

APIs, on the other hand, are all about high-level architecture and design decisions. Your API (and the way this article is written, we're talking about the public API) defines what you can do from outside the system and how you interact with it. It defines the interface but explicitly does not define the implementation. You could have a single monolith, you could have an API gateway over microservices, or something completely different behind the gateway. The whole point of an API is that the consumer of the API doesn't need to know (or care) what the implementation is. And defining the correct API for a project is critical to making it easy for others to use, easy to expand, and easy to adapt to future requirements.
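To make that concrete, here's a minimal sketch with a hypothetical endpoint and function name. The caller depends only on the contract; what's behind it can change without the call site ever knowing:

```bash
# get_user is the "API": give it an id, it prints the user record.
get_user() {
  curl -s "https://api.example.com/users/${1}"           # microservices today
  # sqlite3 app.db "SELECT * FROM users WHERE id=${1};"  # a monolith yesterday
}

get_user 42   # this call site never changes
```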

So, the next time you need to compare two things, make sure you know what you're comparing and make sure the things you're comparing are, in fact, comparable. Otherwise you might find yourself in this situation. Or, as Wolfgang Pauli is supposed to have said,

That's not right. It's not even wrong

by Leon Rosenshein

Happy Birthday

Way back in the depths of time (fall of 1988) I was a new grad working for a third-tier aerospace company called Eidetics, and I got the task of connecting an Apollo DN10000 to an SGI 70GTX to create my first flight simulator. It turned out that we had a cockpit mockup that was used for instrumentation development, and a batch simulation system that had been running faster than real-time since the latest CPU upgrade. _Unfortunately_, our Apollos were connected by a proprietary token ring network, and the SGI was sitting in the corner by itself. So what's a new college grad to do at a time like that? I became a network engineer/administrator.

Of course, first I had to figure out what a network was, and what could be used to connect two different systems, one running System V Unix and one that claimed to be "Just like real Unix" but wasn't. Luckily there was a relatively recent addition to the networking world designed for rapid communication between disparate systems: Ethernet. So I got out a catalog, bought some thicknet backbone, a few transceivers, and some drop cables, and hooked it up. Then I learned about terminating resistors. And kernel drivers. And TCP/IP. And a single source of identity in a cluster. And IRC. And email management. It was a great learning experience, and it taught me the importance of stepping up to a task I might not have known how to do when I started, but was able to figure out along the way.

Of course Ethernet has spread and grown since then. Gigabit Ethernet over Cat5 twisted pair is a thing, and no one blinks at connecting tens of thousands of computers in a datacenter as a single network/cluster. Now we have Ethernet clusters riding around in cars with more devices and compute power than the biggest network Eidetics ever put together.

So Happy Birthday Ethernet, and thanks for everything you've done for us.

by Leon Rosenshein

OKRs and KPIs

Here we are in the middle of planning, and people are talking about OKRs and KPIs. But what's the difference between them, and when should you use one over the other?

Starting from the definitions, OKRs are Objectives and Key Results, while KPIs are Key Performance Indicators. Sounds similar, no? Key Results and Performance Indicators sure sound like the same thing. And your KPIs might be some of your Key Results, but they're absolutely not Objectives.

And that's where the important difference is. Very much like the difference between a vision statement and a mission statement. A vision statement defines the goal, the end state, the objective. On the other hand, a mission statement defines what you do to enable the vision, and the KPIs/Key Results measure how well you're doing in that mission.

Or put another way, the KPIs are how you define done. Like everything else we do, the most important thing to do is decide what to do (the objective), and the second most important is to decide what it means to be done/succeed (the key results).

Objectives come in a few flavors, but the big ones are existential and aspirational. Existential objectives are the ones you need to achieve just to keep going. Things like RTR and NVO. Down here in the engine room it's things like "Increase Capacity" and "Secure the Data". If we fail to do those things, we hit a wall and everyone has problems.

Aspirational objectives, on the other hand, are longer term, and progress is important, but not reaching the end state at the specified time is not fatal. These are things like "Replacing personal car ownership" or, for Infra, "Workloads seamlessly and transparently transition between on-prem and cloud datacenters". We won't get there this year, let alone this quarter, but we'll make progress.

Similarly, KRs/KPIs often fall into two categories: actionable and vanity. Actionable metrics let you know what's going on quickly and accurately, and can help you make decisions in the face of uncertainty. Vanity metrics look and sound impressive, but don't really tell you much. As an example, consider a simple website. It's got a handful of static pages and a few images served from a database. Vanity metrics are epitomized by that hit counter at the bottom of the page. It just keeps going up. Millions and millions of views. Or maybe you're a little more advanced and you're tracking P95 page load times. And it's always under 100ms. That's great, right?

Not necessarily. On the surface those metrics look wonderful, but they don't tell you anything. Actionable metrics would be unique daily user counts, or even better, unique daily user trends. PLTs are important, but since you've got a 50ms timeout on your database calls, that P95 doesn't mean much. You should be looking at 2XX vs 4XX/5XX responses. And not just raw numbers: 1 error out of 3 is really bad, while 1,000 errors out of 100,000,000 isn't bad at all. It still might not be acceptable, though, and it could indicate a brewing problem you need to take action on.
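If you want a feel for the actionable version, here's a minimal sketch that turns raw response counts into an error rate, assuming a combined-format access log with the HTTP status in field 9 (adjust the field to match your format):

```bash
# Count all responses vs 4XX/5XX responses and report the error rate.
awk '{ total++; if ($9 >= 400) errors++ }
     END { printf "%d errors / %d requests = %.4f%%\n",
                  errors, total, 100 * errors / total }' access.log
```

Track that rate over time and you have a trend you can actually act on.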

So as you think about planning, not just for Q4, but longer term (1, 3, or 5 year plan) as well, think about how you choose your Objectives and Key Results and the Key Performance Indicators you're going to use to measure yourself. Are they really driving you where you want to go?

by Leon Rosenshein

OOP Isn't Dead

I've been hearing that object oriented programming is dead for a while now. Like 15 years or so. But I don't believe it. I don't believe it for a bunch of reasons, but the simplest are that it's still around, and it's still useful.

That's not to say that there aren't problems. There are problems with everything. With OOP the problems can get pretty tangled, and it can be almost impossible to sort them out without refactoring everything. Usually the problem can be traced back to one of two things: either an explicit decision that everything is an object, or, if only some things are objects, a poor choice of naming/segregation in the hierarchy.

When everything is an object, nothing can stand alone. You want your objects to interact, so they need to understand each other, at least partially. Or you create some other object that knows about the different objects so it can handle the interactions. Despite your best efforts, you end up with a big ball of mud.

For the second problem, having to make decisions with imperfect knowledge is a given, especially if you're being honest with yourself. And invariably something will come up that makes your current hierarchy less than ideal. Eventually you get to a point where what you really want is multiple inheritance, but that never works out well.

On the other hand, if you go back to how OOP was originally defined, there's no mention of inheritance or polymorphism. It was defined as a group of small things that communicate by message passing, or, in today's language, a distributed system of microservices.

And if you consider a function call and return as message passing (which it is), then there's still a lot of life left in OOP. What OOP is really about is encapsulation, decoupling, and late binding, which are all things that make your system more stable, easier to test, and more resilient to change.

So the next time someone tells you OOP is dead, tell them no, we're just now actually getting to use it the way it was intended.

by Leon Rosenshein

What is Scrum?

Do you do Scrum? Do you have an official Scrum Master who isn't your EM/PM? Do you rotate the job of Scrum Master? And what's the job of your Scrum Master anyway? According to the official Scrum Guide,

The Scrum Master is responsible for promoting and supporting Scrum as defined in the Scrum Guide. Scrum Masters do this by helping everyone understand Scrum theory, practices, rules, and values.

The Scrum Master is a servant-leader for the Scrum Team. The Scrum Master helps those outside the Scrum Team understand which of their interactions with the Scrum Team are helpful and which aren’t. The Scrum Master helps everyone change these interactions to maximize the value created by the Scrum Team.

The Scrum Master's responsibility is to make the Scrum better, both inside the team and in the team's interactions with others. The Scrum Master role is more about kaizen than anything else. It's not sprint management. It's not translating business requirements into a User Story. It's not presenting the team's work at demo days or delivering a burndown chart.

I've spent a lot of time on teams that said they were doing Scrum, but really were just doing two week planning cycles and daily standups. That's not Scrum. It can be agile, and it can be effective, but it's not Scrum.

Personally, the thing I like best about Scrum, and something you can do without "doing Scrum", is the retrospective. After a cycle, however you define it, look back at what you've done. And more importantly, look back at how you did it. What were the process problems, not the technical ones? What were the gaps in User Stories that you didn't notice until you were busy implementing them? How could you have found those gaps sooner? What are you going to do next time to prevent those gaps?

Because Scrum, at its core, falls under the scope of the Agile Manifesto. As such, it's about people over process and continuous improvement. So blindly following the forms of Scrum, without following the principles, isn't Scrum at all.


by Leon Rosenshein

How Big Is That File Anyway?

There are lots of hard problems in computing, but you wouldn't think counting bytes is one of them. Counting bytes is easy, right? If you want to know how big a file is just count the number of bytes in it.

Or maybe not. It depends on what you're counting and what you're going to do with the number. If you want to know how many bytes of RAM it will take to hold the data in a file (assuming it's just a blob of data) then that might be correct. But what if it's a compressed file?

Or maybe you want to know how much disk space you'll get back if you delete it. In that case you need the number of bytes in the file, rounded up to the next whole block, because your disk allocates space by block. Different OSes and different devices have different block sizes, so the space used on one storage device could be different from that on another, even in the same computer.

And you can't forget the overhead of actually remembering where you put that file and its blocks. That metadata gets stored on the disk somewhere, usually with multiple copies.

If you're trying to figure out where your disk space went, things get even more complicated. How do you count a soft link? What about a hard link? What are you really measuring: disk space used by a directory, or how much data you would transfer if you copied the directory?
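You can see several of these effects directly. Here's a quick sketch, assuming GNU coreutils on Linux (block sizes and flags differ on other systems):

```bash
printf x > tiny                        # a file holding exactly 1 byte
stat -c 'bytes=%s blocks=%b' tiny      # 1 byte long, but whole blocks (512B units) allocated

truncate -s 1G sparse                  # a sparse file: 1 GiB long, nothing written
ls -lh sparse                          # the length says 1.0G...
du -h sparse                           # ...but almost no disk space is used

ln tiny hardlink                       # two names for the same blocks
du -ch tiny hardlink                   # du counts the shared blocks only once
du -ch --apparent-size tiny hardlink   # the "bytes you'd copy" view
```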

And what if you have file versioning enabled (at the OS level)? Windows Shadow Copy/ZFS/LVM Snapshots all take space. Is that included in the file size? Should it be deleted when you delete a file? Replicated file systems like HDFS make this particularly complicated by sometimes reporting the number of bytes in a file and sometimes reporting the total bytes used for all replicas.

Or, to paraphrase Clausewitz: everything in computing is very simple, but the simplest thing is difficult.

https://devblogs.microsoft.com/oldnewthing/20041228-00/?p=36863

by Leon Rosenshein

Excuses, Excuses

Sometimes it's so tempting not to test your changes. You're in a rush, it's just a config change, it's a small change, or maybe you know it has no side effects. And if it's late on Friday, or you're about to go on vacation, who has time? What could possibly go wrong?

Everything, that's what. If you've been around for a while you might remember the entire SJC datacenter going dark over a 20-minute period. It was just a simple change. To an iptables config. That got distributed far more widely than it should have. With no oversight. By mistake.

The bottom line is that you do need to test all changes. Or at least run the automated tests you already have (you do have them, right?). And get code reviews. Even on the simple things.

But just in case you need it, here's a quick set of common excuses for not testing things. What's your favorite excuse?

by Leon Rosenshein

Code Golf

Code golf is fun and all, and sometimes it can be fun to see just how terse you can write something and still have it work, but it's a good idea to remember:

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live. Code for readability.

The comma operator in C/C++ is a great example. You can save a handful of keystrokes, but it's almost never a good idea. 

Another place where people are often a little too terse is bash scripting, especially when you're sharing scripts or saving them for future use. When you're writing a script it's very tempting to stop as soon as it's completed its task once. And if you delete it right after that, that's probably OK. But if you're going to share it or save it for future use, experience says that you probably want to be a little more verbose. And probably a little more rigorous. Maybe even a lot more.

Here are a few good ideas to use when building scripts that are going to be used multiple times, especially if they're going to be used by others, on random machines, as part of a complex workflow. (A skeleton pulling them together follows the list.)

Fail fast: Use `set -euo pipefail` to make your script stop as soon as something goes wrong, so you don't make things worse.

trap: Use handlers to ensure cleanup in error conditions. You can handle signals, ERR, and EXIT with something like:

    on_error() {
      # Log a message, then clean up any partial state (placeholder body)
      echo "command failed; cleaning up" >&2
    }

    trap on_error ERR

shellcheck: It's part of the `rNA` toolset. Use it and pay attention to the results. Consistency is a good thing.

Use default values for variables where possible: Bash variable expansion lets you get the value of a variable, or a default if it's unset. You can use it to make positional command-line arguments optional with something like:

    FOO=${2:-default}   # if positional argument 2 is unset or null, use "default"

Include common functions via shared files: The `rNA` repo has some common helpers to deal with macOS vs Ubuntu differences. They might be helpful. You might have something similar for your team.

Clean up after yourself: Consider doing your work in a folder under `/tmp`, then deleting it when you're done. Cleanup is a good use of an EXIT trap.
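Putting those tips together, here's a minimal skeleton of the shape I have in mind. The paths and the "real work" are placeholders, not a team standard:

```bash
#!/usr/bin/env bash
set -euo pipefail                        # fail fast

WORKDIR=$(mktemp -d)                     # scratch space under /tmp

cleanup() { rm -rf "${WORKDIR}"; }       # always remove the scratch space
trap 'echo "failed on line ${LINENO}" >&2' ERR
trap cleanup EXIT                        # EXIT fires on success and failure

TARGET=${1:-/tmp/output}                 # optional positional arg with a default

echo "doing the real work in ${WORKDIR}, writing to ${TARGET}"
```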

What are your favorite bash scripting tips?

by Leon Rosenshein

Kata Time

Who wants to put together a team and enter O'Reilly's Architectural Kata competition? If you're not familiar with the idea of architectural katas, check out the links below. Basically, it's a way to respond to this exchange:

How do we get great designers? Great designers design, of course.

    -- Fred Brooks

So how are we supposed to get great architects, if they only get the chance to architect fewer than a half-dozen times in their career?

    -- Ted Neward

Basically, you get a one-page definition of the problem, access to the "customer", and a short time (typically 2 hours or so) to come up with an architecture and identify the key technologies/components you're going to need to solve the problem. It's a lot of circles, boxes, and arrows on a whiteboard, plus justification for the choices and the potential problem areas. It's also about being able to present your architecture quickly and clearly to a group of people who haven't been studying the problem and don't have all the context you do.

They're a lot of fun. I've done them a bunch of times and led a few sessions at the various ATG sites and everyone had a great time and learned a lot along the way. If you're interested in doing this let me know in the comments. If your team(s) are interested in doing a session outside the O'Reilly event let me know and we can work something out. It's a good learning/team building exercise any way you slice it.

by Leon Rosenshein

Visualizing Code

Code is complex. Understanding complex code is either complex squared or complex to the power of complex, and I lean towards the latter. So what's a developer to do? I've thought about this on and off for a while now, and one of the best descriptions of how I "see" big projects is in this blog post.

Using those metaphors, I'm definitely a mapper, with some packer thrown in for the little nuggets that don't fit onto the map. I also like the idea of merging the spatial relationship of the code with the temporal. For me it helps with mapping the physical space (code files) into the virtual space (mental model) and the domain space.

The visualizations in the blog speak to me, and I really want to try out the tool, but the tool isn't open source yet, and at this point I'm afraid it never will be :( The tool for visualizing concurrency was finally open sourced, so there is hope.

Meanwhile, how do you get a feel for big new codebases?