Recent Posts

by Leon Rosenshein

Naming Is Still Hard

Back in January I wrote about naming things, and I recently came across another article with more info, so I figured I'd share it. The one thing I have some disagreement with is the emphatic advice to avoid Hungarian notation. While I agree that prefixing names with base type information is wasteful and makes things harder for your IDE, the original usage of Hungarian notation, particularly when writing C code for Windows, makes a lot of sense. When your language/compiler can't help you out, you help yourself. That's still relevant today with some DSLs. When I was writing shaders in HLSL, for instance, the compilers weren't very smart, so we had to watch out for those things ourselves.
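For what it's worth, the original idea (often called Apps Hungarian) was to encode a semantic kind the compiler can't check, not the variable's base type. Here's a minimal sketch of that flavor; the prefixes and helper functions are mine, not from the article:

```c
/* Sketch of "Apps Hungarian": the prefix encodes a semantic kind the compiler
 * can't verify (safe vs. unsafe string), not the underlying type.
 * sEscapeHtml and Emit are hypothetical helpers used only for illustration. */
char *sEscapeHtml(const char *usInput);   /* returns an escaped ("safe") copy */
void Emit(const char *sHtml);             /* writes already-safe HTML to the page */

void WriteComment(const char *usComment) {
    char *sComment = sEscapeHtml(usComment);  /* us -> s conversion is visible at a glance */
    Emit(sComment);                           /* Emit(usComment) would "look wrong" on sight,
                                                 even though both are plain char * to the compiler */
}
```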

by Leon Rosenshein

Circles

I was wandering around the internet the other day and came across an article describing something called the trinity architecture. It's a way to indicate composition, dependencies, and abstractions. And this is important because everything we do in the world of 0s and 1s is a model of something that we interact with. Ideally the model and the interactions develop together, or at least the model works to handle the expected interactions. Composition, dependencies, and abstractions are how you describe those things. And the better those things fit together, the easier the system is to reason about. The easier it is to reason about, the lower the cognitive load.

And in many ways, that's the goal of architecture. To reduce the complexity and cognitive load of understanding the entire system so you can focus on one part and get it right. That's Domain Driven Design in a nutshell. Make it possible for people to do their jobs mostly in isolation, while still being able to rely on what someone else has done or is doing.

So what does trinity bring to the table? It brings a slightly different naming scheme with three layers, and it uses nested and tangential circles to indicate composition, dependencies, and implementations. It's certainly a new way to look at things, but as I see it, it's a little too abstract. Sure, it's got a domain, which is the model of what you're doing, and a clearly defined Public API to interact with that model, but then everything else is thrown into the Aux bucket. Everything from the physical computers and networks to the libraries, databases, queues, and event buses that you build things out of. And there's no actual customer/user facing thing, just an API. Sure, in some B2B situations you're building an API, but in most cases, you're supposed to be solving a problem, not just building an API. Which means that unless you have the Domain exactly right, the API doesn't let your customers solve their problems. Maybe you know the space you're working in that well, but I'm pretty sure I don't.

The trinity is also a framework for building applications. And there's value in frameworks, especially when they have all the buzzwords. And of course it's an open source framework, so you can start using it for free. And since it's backed by a company you can probably buy all the support you want.

So what do you think of the trinity? Is it a good idea and a way to think about building software? Is there something we can take from it as we build massively scalable distributed systems? Is it just a thinly veiled marketing campaign for unwary developers? Or is it somewhere in between?

by Leon Rosenshein

Intro To Architecture

Name dropping today. Neal Ford and Mark Richards. I've met and talked to both of them a few times. Had some really interesting conversations around scope of influence and the difference between a software developer and a software architect. They're both very good at not only high level architecture discussions but also digging into the details and thinking about specific use cases and approaches. They've got a new book coming out, Fundamentals of Software Architecture, and they're doing a webinar about it next month. Since we're all spending all our time online anyway, if software architecture is something you're interested in, think about checking it out.

Another thing to think about if you're interested in architecture is an architectural kata. I've done them with both Neal and Mark, and I've led a few sessions at Uber and elsewhere. If you're interested in doing a kata session after we're back in the office, let me know. If there's enough interest I'd love to run another session.

by Leon Rosenshein

Write It Down 2

Documentation is a thing. And there are lots of different kinds, each with their own use case and requirements. But one thing is consistent. Keep the audience in mind. What do they know? What don’t they know? What are they looking to know? What do you want the reader to know?

The most important thing to remember is that it’s for the reader. Whether you’re writing API docs for the RNA Docset, end user documentation for a public facing website, a postmortem, a bug report/feature request, or notes to yourself, remember the reader. Even if you’re writing notes to yourself, what you’re writing isn’t for you now, it’s for future you, without all of the context you have in your head while you’re writing. The reader doesn’t know what you know now. So everything you think is obvious and straightforward now won’t be when you read it. And that’s the simplest case, when you have a chance to know the details.

Another important thing to keep in mind is what the reader wants to do. People are trying to get something done, and they’re looking for help. So provide it. To do <X>, perform these steps. Think about the common mistakes or problems people might have and discuss them. Provide troubleshooting tips. As the writer of a tool/UI/library you know not only what should happen, but also what to avoid. The person who has nothing but the documentation doesn’t. If you don’t say it, they don’t know it.

Provide levels of detail. For codelabs we have 100, 200, and 300 level labs. Even if you’re not structuring things that formally, you should tell people what they need when they need it, and make more detailed information easily available when they’re ready for it.

And then have someone you trust, but doesn’t know how to do the thing you’re documenting, do it. We’re working on turning up a new DC and I put together a runbook and a set of scripts to do it, then gave it to someone on my team to try. I thought about what he would know, what he would have available, and tried to document it as unambiguously as possible. Guess what happened. He got stuck on step one because there were some setup steps that I hadn’t documented. So we fixed that and tried again. Little things kept popping up where I wasn’t as clear as I thought. So we fixed those. Then he ran into an error that I hadn’t seen before. So we added more to the troubleshooting section. We’re still tweaking it and making it easier, but it’s at the point where someone else could come in, pick it up, and have a very good chance of making it work.

Having written all that, I find that I’ve actually missed the most important thing of all. You need to actually write something down. Once you have something you can make it better. But you need to start.

by Leon Rosenshein

Dates Are Hard Too

I mentioned a while back that one of the things developers make assumptions about is that time is a monotonically increasing function. If only that were so. But it's not just time, one second ticking into the next, that isn't as simple as it seems. Duration is hard too. Not just the number of seconds between two times, but how you turn seconds into weeks, months, and years. For example, the Windows calculator can tell you how many days, weeks, and months there are between two dates. Seems like a relatively straightforward thing to do. But sometimes it goes wrong.

For instance, it claimed that between July 31st 2019 and Dec 31st 2019 there were 5 months, 613566756 weeks, 3 days (152 days). The 152 days part is right, but the weeks? How in the world did they come up with that number? Turns out counting is hard. I'll leave the details to the article, but it comes from mixing signed and unsigned numbers and ambiguities in what it means to advance a calendar by one month, since months aren't all the same length. It usually works, but there are always edge cases. Something to think about as we work on safety critical systems.
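The article has the full story, but you can see the flavor of the bug in a few lines. This is just a sketch of the shape of it, not the actual calculator code: if the leftover day count comes out negative and gets treated as unsigned before being split into weeks and days, you get exactly those numbers.

```c
#include <stdint.h>
#include <stdio.h>

/* A sketch of the kind of bug described above (not the actual calculator
 * code): a leftover day count that should be small goes negative, gets
 * reinterpreted as unsigned, and is then split into weeks and days. */
int main(void) {
    int32_t leftover_days = -1;                      /* one day "short" after counting whole months */
    uint32_t as_unsigned = (uint32_t)leftover_days;  /* wraps around to 4294967295 */

    printf("%u weeks, %u days\n", as_unsigned / 7, as_unsigned % 7);
    /* prints: 613566756 weeks, 3 days */
    return 0;
}
```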

by Leon Rosenshein

For Want Of A Nail

We build fault tolerant distributed systems. Not just fault tolerant distributed systems, but fault tolerant systems built on top of fault tolerant distributed systems. And they're connected by a distributed, fault tolerant network, using diverse fibers from different companies. That's a lot of fault tolerance. You'd think something that tolerant couldn't fail. Yet sometimes we get failures.

Most of the time they're simple failures and the system is degraded. We lose a fiber and traffic gets a little slower. We lose a few nodes and the rest of the system picks up the work. Some bad data slips into the system, so instead of surge pricing the cost stays flat. There are problems, but they're localized, isolated, and the system gets better when the problem is fixed.

But sometimes the problem triggers a feedback loop. That's when things get interesting, and get interesting fast. Think back to that fiber we lost in the earlier example. Generally, no big deal. Traffic is rerouted and things move on. But what if we lost 10% of our capacity and we only had 15% overhead? Everything's fine, right? Usually, but not always. Suddenly things take longer. So we get a few timeouts. But since the system is fault tolerant, instead of throwing an error we retry the message. Now we've increased our network traffic, so we get more timeouts, which increases the traffic. Then some service somewhere gets overloaded because of the traffic increase. But that's ok because we've got auto-scaling that notices and spins up more instances of the service. That spreads the load, but increases the traffic further, and now our healthchecks are impacted, so the proxy decides to stop sending traffic to those "failed" instances. So the load on the remaining instances increases, and they actually fall over. Now we've got even more traffic, and nothing to respond to it.
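The retry in that story fires again immediately, every time, which is exactly what feeds the loop. A common counter, and this is a generic sketch rather than anything from the post, is to cap retries and back off with jitter so a blip gets absorbed instead of multiplied:

```c
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Capped retries with exponential backoff and jitter: transient blips get
 * absorbed instead of multiplying traffic, and a real outage fails after a
 * bounded number of attempts instead of retrying forever. */
bool send_with_backoff(bool (*try_send)(void), int max_attempts) {
    int delay_ms = 50;                              /* initial backoff */
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (try_send())
            return true;
        int jitter_ms = rand() % (delay_ms + 1);    /* spread retries so they don't sync up */
        usleep((useconds_t)(delay_ms + jitter_ms) * 1000);
        if (delay_ms < 5000)
            delay_ms *= 2;                          /* double the wait, up to a cap */
    }
    return false;                                   /* give up and surface the error */
}
```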

It's always the little things. Someone with an excavator dug a little too deep or a little too wide and broke a buried fiber-optic cable in Kansas. Some messages were retried while traffic rerouted around the gap. Retries caused healthchecks to fail. Failed healthchecks concentrated traffic onto a few instances of a service in Virginia. Increased load meant the service couldn't respond in time. Jane Public in Cape Town pushed a button and a car didn't show up. That's a cascading failure. And that's how well designed, isolated, localized, fault tolerant systems die.

One of the hardest problems when dealing with fault tolerant distributed systems is that the fault tolerance that keeps them working up to a point is the very thing that takes them out when you pass that point. And they're very hard to recover from, because all of the things you're doing to recover are suddenly making the problem worse. To the point where sometimes the best solution is turning it off and back on. Sometimes physically, and sometimes you can get away with a virtual reboot.

There are lots of ways to help prevent cascading failures, but they mostly come down to figuring out when to stop trying to be fault tolerant and just fail quickly. And restart quickly. You can read more (and find more links) in these articles.
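One of those ways is a circuit breaker: after enough consecutive failures you stop calling the dependency and fail fast until a cooldown passes. Here's a minimal sketch, mine rather than anything from the linked articles:

```c
#include <stdbool.h>
#include <time.h>

/* A minimal circuit breaker sketch: after too many consecutive failures,
 * stop calling the dependency and fail fast until a cooldown has passed. */
typedef struct {
    int    consecutive_failures;
    int    failure_threshold;   /* trip after this many failures in a row */
    int    cooldown_seconds;    /* how long to fail fast before probing again */
    time_t opened_at;           /* when the breaker last tripped */
} CircuitBreaker;

bool breaker_allows_call(CircuitBreaker *cb) {
    if (cb->consecutive_failures < cb->failure_threshold)
        return true;                                          /* closed: call normally */
    if (time(NULL) - cb->opened_at >= cb->cooldown_seconds) {
        cb->consecutive_failures = cb->failure_threshold - 1; /* half-open: allow one probe */
        return true;
    }
    return false;                                             /* open: fail fast, no retry storm */
}

void breaker_record_result(CircuitBreaker *cb, bool success) {
    if (success)
        cb->consecutive_failures = 0;
    else if (++cb->consecutive_failures >= cb->failure_threshold)
        cb->opened_at = time(NULL);
}
```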

by Leon Rosenshein

Generally Speaking

What shape are you? What shape do you want to be? What shape should you be? Is there a right shape? Is there a wrong shape? And what does shape even mean here? All good questions. Let's take them in reverse order.

In this case shape refers to a person's familiarity/ability across a cross-section of technologies. The generalist, jack of all trades but master of none. The deep specialist, who knows more and more about less and less until they know everything about nothing. The focuser, not as deep as the specialist, but with a broad base. And the serial specialist, who focuses on different things at different times in their career. It's a histogram with N bins and an average/variance across the bins.

For the individual, there is no wrong shape, with the possible exception of the generalist who thinks they're an expert in everything. That's never a good idea. Outside of that, every "shape" has value. It just might not be valuable to a specific team at a specific time. An expert in SQL query optimization isn't the best person when the team is working on the mobile UI. Similarly, for an individual there is no right answer. If you're a UI developer with a focus on Android then you might be the right person for that team I mentioned.

So there's no shape that you should be. There's a place for every shape, and every shape can add value.

That leaves the last two questions, and only you can answer those. Figure out what you want. What makes you happy and fulfilled. Then figure out what shape you are, and close the gap. While you're closing the gap figure out where that shape can add the most value. It will be good for you and whatever team/group/organization you end up with. And win-win outcomes are the best for everyone.

by Leon Rosenshein

Guides

Last year at the Software Architecture conference I saw a presentation by James Thompson on "Beyond Accidental Architecture". Afterward he and I got to talking about the topic and my position that everything is really about scope of work and scope of influence, and that the difference between a senior architect and a junior developer is not in the kinds of problems being solved (time, complexity, interfaces, etc.) but in the scope. Systems, platforms, and frameworks vs classes, functions, and interfaces. 3 year roadmaps vs sprint planning. Raising the effectiveness of a team vs a brown bag on a new library. Bridging the gap between disciplines (legal, finance, product) vs between developers on a team.

And not just what, but how. The wider the scope, the less direct control there is and the more influence is needed. You don't get to tell folks what to do; you work with them to find shared solutions and raise the bar for everyone. Which brings me to James' follow-on article about the "Software Architect as Guide". It's an interesting take and one that really resonates with me.

by Leon Rosenshein

Gazinta's and Gazouta's

A long time ago I was working on a simulated environment for a hardware mission computer on a 1553 bus, and my boss told me to make sure to match the gazintas and gazoutas. As a freshly graduated Mechanical and Aerospace engineer who found himself doing hardware-in-the-loop work with computers I was somewhat confused. He told me that they were the things (data/information) that go in to (gazinta) or go out of (gazouta) each device on the bus. That made sense. And for information to flow on a 1553 bus everything had to be defined up front. We had a data dictionary we had to implement, and we had to get it right because we had to talk to real hardware that already existed. And that was a challenge, because our simulation didn't necessarily work the way the real world did.

On the other hand, it was also liberating. As long as we met that requirement we were free to do whatever we wanted. It let us innovate on our side of the line and not worry too much about how it would work.

But that data dictionary also had a bunch of other good points. It defined not just the data that flowed between the subsystems, but it described the overall system. It let us understand not just what we were doing, but the bigger picture as well. It let us see how a change to one system would impact (or not) other systems. It let us reason about the architecture of the system.
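To make that concrete, here's a hypothetical sliver of what one entry in a data dictionary like that turns into once it's code. The message and field names are made up, but the point is that both the simulator and the real hardware implement exactly the same layout:

```c
#include <stdint.h>

/* Hypothetical data dictionary entry (illustrative only, not the real bus
 * spec): every field that goes into or out of a device is pinned down up
 * front, so the simulation and the real hardware agree on the wire format. */
typedef struct {
    uint16_t airspeed_knots;   /* gazouta from the air data computer */
    int16_t  altitude_x10_ft;  /* altitude in tens of feet */
    uint16_t status_flags;     /* bit 0: data valid, bit 1: self-test failed */
} AirDataMessage;              /* one message definition shared by both sides */
```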

Now data dictionaries (ICDs, thrift files, protobuf files, DB schemas) are critical, but they are very detailed and focused, and while it's possible to get the big picture from them, it's challenging to say the least. And when you're describing your system to someone, all the details in the world don't matter without context to put them in. And that's where your software architecture diagram comes in. It's the boxes-on-the-whiteboard diagram. The one with boxes for processes and durable stores and buffers, but no mention of the technology inside them. There are lots of ways to make them: UML, Visio, Rational. No matter how you make them, the goal is to understand the bounded context(s) of your system and how they fit together.

So whether you're writing UML, Thrift, gRPC, AXL graphs, or using LucidChart to describe the relationship between various doggos, the better you can document your architecture and data flow, the better you can work with others and be on the same page.

by Leon Rosenshein

Cycles Of History

Or, `The more things change, the more they stay the same`. A long long time ago computers were big. They took up entire rooms and required cadres of experts who would care for them, take your punch cards, and then some time later tell you that you punched the wrong hole on card 97. So you'd redo card 97, resubmit your cards, and get the next error. Rinse and repeat until lo, an answer appeared.

Then computers got smaller. You could put two or three in a room. Someone added a keyboard and a roll of paper, and you could type at it. And get an answer. And be miles away, connected by a phone line. Time and Moore's law marched on, and computers got smaller and smaller. Teletypes turned into vt100 terminals, then vector displays, and finally bitmap displays. Thus was born the Mother of all Demos. And it was good. Altair, then Sinclair, Atari, Commodore, and IBM started making personal computers. They got even smaller and Osborne gave us suitcase sized luggables. IBM and Apple made them easier to use. Then the Macintosh made them "cool". And all the power was on the desktop. And you could run what you wanted, when you wanted to.

Then Sun came along, and the network was the computer. At least for the enterprise. Centralized services and data, with lots of thin clients on desktops. Solaris. SPARC. X Windows. Display PostScript. Control. The cadre in the datacenter told you what you could run and when. They controlled access to everything.

But what about Microsoft? A computer on every desktop (running Microsoft software). And it became a thing. NAS, SAN, and Samba. File servers were around to share data, but they were just storage. Processing moved back out to the edge. All the pixels and FPS you could ask for. One computer had more memory than the entire world did 20 years earlier. We got Doom, and Quake, and MS Flight Simulator. But all those computers were pretty isolated. LAN parties required rubber chickens and special incantations to get 4 computers in the same room to talk to each other.

Meanwhile, over in the corner, DARPA and BBN had built MILNET, universities joined BITNET, and computers started talking to each other, almost reliably. Email, Usenet, and maybe even distributed computing. TCP/IP for reliable routing. Retries. Store and forward. EasySabre. Compuserve. Geocities. Angelfire. AOL, and the September that never ended. The internet was a thing, and the network was the computer again.

And now? The speed of light is a limit. You need things close by. Akamai isn't enough. Just having the data local doesn't cut it when you need to crunch it. And the big new thing is born. Edge computing. Little pockets of local data that do what you need, and occasionally share data back with the home office and get updates. Hybrid cloud and on-prem systems. It's new. It's cool. It's now.

It's the same thing we used to do, writ large. Keep your hot data close to where it's going to be processed. It doesn't matter if it's RAM vs. drum memory, on-chip L1 cache vs. SSD, or SAN vs. cloud. Or I could tell you the same story about graphics cards. Bandwidth limited, geometry transform limited, fill limited, power limited. Or disk drives. RPM, density, number of tracks, bandwidth, total storage. Or display systems. Raster, vector, refresh rate, pixel density. Or, for a car analogy, internal combustion engines.

In 1968 Douglas Engelbart showed us the future. It took 50+ years and some large number of cycles to get here, and we're still cycling. There are plenty of good lessons to learn from those cycles, so let's not forget about them while we build the future.