
by Leon Rosenshein

Current State

State is important. And state is hard. It's one of the two hard things in computer science. Even something as simple as checking whether a file exists, or reading the value of a variable, to decide whether or not to do something is a race. Between the time you check and the time you act, the state could change. Unless you use some kind of mutual exclusion to prevent it. A mutex, if you will. Having multiple cores and multiple CPUs in a system running multiple simultaneous processes makes it even harder, but with global locking and integrated test-and-set instructions in the silicon it's doable. Painful and hard on performance, but doable.
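In Go terms, a minimal sketch of guarding a check-then-act with a mutex (the types and limits here are invented for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// Counter guards check-then-act with a mutex so the check and the
// act happen atomically with respect to other goroutines.
type Counter struct {
	mu    sync.Mutex
	value int
	limit int
}

// IncrementIfBelow only increments if value is still below limit.
// Without the lock, two goroutines could both pass the check and
// both increment, overshooting the limit.
func (c *Counter) IncrementIfBelow() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.value < c.limit {
		c.value++
		return true
	}
	return false
}

func main() {
	c := &Counter{limit: 5}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.IncrementIfBelow()
		}()
	}
	wg.Wait()
	fmt.Println(c.value) // never exceeds 5
}
```

Remove the lock and run it enough times and you'll eventually see the limit blown past, which is the whole point.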

Move to an even more distributed system, say a bunch of computers connected over a mostly reliable network, and it gets even harder. For the times you need distributed, synchronized state there are things like Paxos, Raft, and Zookeeper. But those just provide you with a distributed, available source of truth. Using it is still hard, and it can have scaling issues.

One important thing that can help here is knowing how "correct" and "current" your view of the state needs to be. Sure, you can ask the system for everything each time you need to know, but as the size of the state grows and the number of things that need the state increases simple data transfer will become the choke point.

The traditional way around that is a cache. That's great for data with good persistence. And it matches the usual mental model of pulling data when you need it. In the right situation it works really well. That's all a CDN is, really, and the internet lives on them.

Another way is to turn the system around and switch from pull to push. Instead of having the consumer ask for state when needed, it maintains its own view and receives deltas from the source. If your rate of data change is small (relative to the size of the state) AND the changes don't happen at predictable intervals (time-to-live is effectively random), then having the source of truth tell you about them might make sense. It's more complicated: it requires changes on both ends of the system, makes startup different from runtime, and adds the requirement of reconciliation (making sure the local view does in fact match reality). But it can greatly reduce the amount of data flying around your system and make the data more timely.
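Here's one way the seed-then-apply-deltas shape might look, sketched in Go with invented types:

```go
package main

import "fmt"

// Delta is one change to the state: a key and its new value.
type Delta struct {
	Key, Value string
}

// View is a consumer's local copy of the state. It is seeded once
// at startup and then kept current by applying pushed deltas.
type View struct {
	state map[string]string
}

// Seed loads the full state once (the expensive "give me everything").
func (v *View) Seed(full map[string]string) {
	v.state = map[string]string{}
	for k, val := range full {
		v.state[k] = val
	}
}

// Apply folds a pushed delta into the local view; only the change
// crosses the wire, not the whole state.
func (v *View) Apply(d Delta) { v.state[d.Key] = d.Value }

func main() {
	source := map[string]string{"a": "1", "b": "2"}
	var view View
	view.Seed(source)

	// In a real system this channel would be fed by the source of
	// truth; here we just queue two deltas.
	updates := make(chan Delta, 4)
	updates <- Delta{"a", "10"}
	updates <- Delta{"c", "3"}
	close(updates)

	for d := range updates {
		view.Apply(d)
	}
	fmt.Println(view.state["a"], view.state["c"]) // 10 3
}
```

The reconciliation piece (periodically diffing the view against the source) is deliberately left out of the sketch, but you need it in production.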

We've run into this a few times with our Kubernetes clusters. As a batch cluster, the size (both in nodes and tasks) is pretty dynamic. Some things run per-node and try to keep track of what's going on locally (I'm looking at you, fluentd), and in steady state it's not too bad, but in rapid scale-ups all of those nodes, asking for lots of data at the same time, can put the system under water. It's partly the thundering herd, which we manage with splaying, but it's also the "give me everything every time" problem. That one we're solving with watchers. So far it looks promising. There will be some other scale issue we'll hit in the future, but that got us past this one.

martinfowler.com/articles/patterns-of-distributed-systems/state-watch.html
martinfowler.com/articles/patterns-of-distributed-systems/leader-follower.html
en.wikipedia.org/wiki/Thundering_herd_problem

by Leon Rosenshein

Moving Day, Part 2

Today is the first day in the new house. Of course many things are the same. This channel hasn't changed (yet). Still in the same office away from the office. Still using the same hardware. Still working on the same projects. Goals are still mostly the same.

But, lots of things have changed. And again, in the spirit of a physical move, some things to keep in mind as we settle into our new home.

Know who to go to when you have questions. Know both the new and old IT and HR links. Know the slack channel(s) for realtime tech questions.

First, make sure you're in the right place. Get out your map and read over the directions before you get started. It's always a good idea to understand the bigger picture before you dive into the details.

When you've gotten the all-clear and you're ready, unpack your stuff. For some (Windows) it's trivial. For some (Linux) it's easy, but make sure you don't miss a step. For others (macOS) it's a complete reset. It's not trivial, and it's not instant. Don't expect it to be fast and you won't find yourself disappointed.

If you're on macOS you'll have the opportunity to fix those little things that were annoying you. While you reinstall everything and reconfigure things, take the opportunity to make the adjustments you never quite got to. If nothing else you'll get rid of a lot of cruft and be on the latest version of things :)

Once you have things mostly settled, get out the list of TODOs you put together and get started. Don't be surprised if you realize there is more setup and adjustment needed. That's a normal part of getting back to work after a move. It's going to take a while to get moving again, and that's OK.

Finally, remember to stop occasionally and take a deep breath. There's a lot going on, and change is stressful. Instead of denying it, acknowledge it and work with it. It will make it easier for everyone. Don't expect to get too much done today, or for the next couple of days. Work back up to it and you'll see things beginning to work smoothly again.

by Leon Rosenshein

Moving Day, Part 1

In case you haven’t been paying attention, today is our last day in our old house. Which means it’s part 1 of moving day: the packing. Of course this is a virtual move, since there's been an acquisition and we're working for a new company, not really changing jobs. Also, we’re all pretty much working from home already, and the commute isn’t going to change much.

But lots of things are going to change. And today is the last chance to get ready. So, in the spirit of a physical move, here’s some things to keep in mind as you prepare:

Know who to go to when you have questions. Know both the new and old IT and HR links. Know the slack channel(s) for realtime tech questions.

Pack your stuff. Yes, yes, yes, it’s all digital, so there’s no real packing, but there are changes coming. Certificates will be changing. Macs will be wiped. Identities will change. You don’t want to lose anything along the way, so back it up. Google Drive is a good choice for work artifacts and config files. So is Git. Make sure you push any branches and stashes.

Close out work in progress. The more things you can close out, the fewer things you’ll have to keep track of/pick back up after. Get your PRs out and queued. If you have PRs to review, get it done. Don’t start something that you know will get interrupted. And don’t forget to write down what you were in the middle of and what should be done next. That’s a good idea every day, and especially helpful now.

Make sure you have a map of where you’re going. Sure, you’re not actually moving physically, but there are enough things changing that having a map makes sense. Or at least a printed page of the instructions you’ll need to follow. Stored in a place you KNOW you’ll have access to in the middle of the move. I know someone who didn’t have his wallet with him during a move because he left it in the bathroom when the packers came. They were good packers and asked him if he had his wallet with him, and he said yes, but he didn’t. That night when he got to the temporary housing he realized he had left it in the bathroom and it was now in a box on a truck heading for Seattle. Luckily his wife had credit cards and they all had passports with them, but it could have been a problem.

Finally, remember to stop occasionally and take a deep breath. There’s a lot going on, and change is stressful. Instead of denying it, acknowledge it and work with it. It will make it easier for everyone.

See you on the other side.

by Leon Rosenshein

Captain Crunch

This morning I ran across an article on crunch time in the game development industry. As many of you know, I’ve been doing the development thing for a while now. And much of the early years was gaming. Mostly flight simulations, but also other simulations and general gaming. And I can tell you from firsthand experience and countless discussions with peers at other companies that crunch time is real. And yes, it’s not just the last 18 months of a 12-month project. There’s always pressure. It just reaches ridiculous levels as the end-game approaches.

There are lots of reasons for it. The biggest two, though, don’t apply anymore. Back in the day games were sold in boxes, in retail stores. And what you sold was what people played. For years. No downloadable content. No user-generated content. No patches or bug fixes. Just what was in the box. So you had to get enough in the box to make people happy, and the pressure to “make it work” at the end was high.

Another sign of the times was that about 50% of sales happened in the 4 weeks between Thanksgiving and Christmas. And most of those sales were of the hot new games. Which meant that your box had to be on the shelves by Black Friday. Getting on the shelves before Christmas was so important that when we shipped Falcon 4 in England we put a person on a plane from San Francisco to NY to London (on the Concorde) to pick up a few hours of manufacturing time. If you missed the season your game was sunk. By next year it was dated and no one cared. So slipping the schedule by 3 weeks wasn’t an option from a business perspective.

Things are different now. Patches are expected. DLC is expected. Back then it was 4 or 5 developers and 1 artist. Now it’s 50+ artists and world builders and 10 developers using a game engine. Multiplayer has gone from turn based play by mail to split screen to MMORPG. Development and ad budgets have gone up by orders of magnitude.

But apparently some things haven’t changed. Crunch time is still around. Talk of unionization is still around. Burnout is still around. And lots of fresh new faces who want to get into gaming show up every year. Which, in my opinion, is the reason nothing changes. As long as there’s a steady stream of developers showing up to make the games there’s not a lot of incentive for things to change. And that makes me sad.

Because we know better. The constraints have changed. Boxes on shelves are no longer the driving force. Dates are still important, but much less so. And we know a lot more about the development process. We have better ways to get from idea to product that take into account the knowns, the known unknowns, and the unknown unknowns.

I’m not in the game industry any more. In large part because I wanted to get away from crunch time. Things are much better now. Work/Life balance is better. And I think the work is better too. Sure, for any given week I can get more things done by crunching, but over the longer development period a slower, steadier pace gives better results.


Bringing this back to the article I was talking about, I believe it to be an accurate depiction of the current state of affairs. Because I’m pretty sure I was living it before the author was born. I think there’s a better way. And I’ve been talking about it here for almost 2 years.

by Leon Rosenshein

Discoverability

For a while we had a column on our sprint board called “Documentation”. It was a catch-all column to help us remember that things weren’t “done” until they were shared. They say that if you build a better mousetrap the world will beat a path to your door. That may be true, but only if the world knows you built it. Yes, it’s possible for word of something to leak out and for usage to grow, but that’s not the normal case. Better is to tell people what you’ve built and why it’s better for them. That’s where documentation comes in.

We wrote the infra tool to help us manage our infrastructure and help our customers use it. One of the first things we added was the auth module, because we wanted a good way to update all of the different contexts and tokens once and not have to worry about it. We didn’t tell anyone at first. Just used it ourselves and enjoyed it. Some people noticed and started using it. Then we started getting questions about how to use it. That didn’t make sense; it was fully documented, wasn’t it?

Well, it was, in that the docs, online and off, explained exactly what options were available and what every option meant. But there’s more to documentation than that.

It starts with a good description. What does this thing do, and why should a user care? What’s the benefit? Remember, people generally buy things for the benefits, not the features. In this case,

This utility allows you to add authentication profiles for external services
that can't integrate with uSSO (currently only AWS). Then, at the start
of each day, you can refresh the credentials for each profile en masse,
only entering your password once (though potentially 2FA-ing multiple
times).

So a clear benefit.

Then there’s how to get it. With the infra tool, as long as you’re in the rNA repo, it’s just there. And it auto-updates. No need to do anything. That’s the best kind of docs.

And of course you need to have a full description of the API (or CLI in this case). All of the functions and parameters. Each with not just a what, but a why. In many ways this is the easy part. For the infra tool you can see it at any level with the `--help` option, like `infra auth --help`.

You want to have examples. Both targeted ones and general ones. For the `infra` tool that includes things like a specific example of adding a new profile (`infra auth add-aws-profile --help`) as well as the more general codelab that shows you how to use it day to day.

If you’re writing a library (or expecting others to contribute to your tool) you also want a “how we did it and why” section of your docs. For the infra tool that part lives with the code and in our internal docs, but it’s there. We’ve even got a presentation on how you can extend the tool. But we keep it separate because it would get in the way of our users knowing how and why they will benefit from it.

So next time you think you’re done with something, look at the documentation and make sure you’re really done.

by Leon Rosenshein

Adapters, Facades, and Decorators - Oh My

The Gang of Four gave us Design Patterns in 1994. It's been helping and confusing folks ever since. There's good high-level advice and a bunch of patterns that can be reused. The confusion comes about because some of them are very similar, but, as usual, the details are important.

Consider the adapter, facade, and decorator patterns. All of them take existing functionality and wrap it, changing the interface slightly. So why are there 3 different patterns and which one should you use? The answer is, it depends. It depends on why you're doing whatever it is you're doing.

The simplest is the adapter. Consider a trip from the US to EMEA. You've got a bunch of electronics with you, and you'd like to be able to plug them in wherever you are. So you carry a bunch of adapters with you. They're just lumps of copper and insulator, carefully designed to allow you to plug your device in on one end and into the wall on the other. It just changes the shape of the plug. If you're in Austria the outlet has 220 V at 50 Hz, and that's what goes into and out of the adapter. Hopefully your device can handle it. Software adapters are the same. All of the same parameters go in and out, and they're not modified on the way through. The purpose is to allow you to adapt one system to fit another, but the functionality/meaning had better match, or else.
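A sketch in Go, with a hypothetical third-party type: the adapter only reshapes the call, it doesn't change what passes through.

```go
package main

import "fmt"

// Logger is the interface our code expects: one string message in.
type Logger interface {
	Log(msg string) string
}

// ByteWriter stands in for a third-party library with a different
// shape: it only accepts byte slices. (Invented for illustration.)
type ByteWriter struct{}

func (ByteWriter) WriteLine(b []byte) string { return string(b) }

// WriterAdapter reshapes the call without changing what flows
// through it, like the travel plug: the same "voltage" goes in
// and comes out.
type WriterAdapter struct {
	inner ByteWriter
}

func (a WriterAdapter) Log(msg string) string {
	return a.inner.WriteLine([]byte(msg))
}

func main() {
	var l Logger = WriterAdapter{inner: ByteWriter{}}
	fmt.Println(l.Log("hello")) // hello
}
```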

Next is the facade. Facades are false fronts, and make things simpler to use. If your API offers 20 different flags and options, but really you only care about changing one of them, you might make a facade that takes that one parameter, adds your own defaults, and then passes it on. Back to the power adapter analogy, consider the Qi charging on your phone. Instead of plugging in a 24-pin USB-C connector you throw your phone onto the charger and it gets power. Very simple, but you lose the ability to transfer data.
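A sketch of the same idea in code, with invented option names: the facade takes one parameter and supplies defaults for everything else, with the full API still underneath.

```go
package main

import "fmt"

// RequestOptions is the full-featured API: lots of knobs, most of
// which most callers never touch. (Hypothetical, for illustration.)
type RequestOptions struct {
	Host       string
	Port       int
	TimeoutSec int
	Retries    int
	TLS        bool
	UserAgent  string
}

func doRequest(o RequestOptions) string {
	return fmt.Sprintf("%s:%d timeout=%ds retries=%d tls=%v",
		o.Host, o.Port, o.TimeoutSec, o.Retries, o.TLS)
}

// fetch is the facade: one parameter in, sensible defaults for
// everything else, and the full call passed on underneath.
func fetch(host string) string {
	return doRequest(RequestOptions{
		Host:       host,
		Port:       443,
		TimeoutSec: 30,
		Retries:    3,
		TLS:        true,
		UserAgent:  "infra-tool",
	})
}

func main() {
	fmt.Println(fetch("example.com"))
}
```

Like the Qi charger, you give up flexibility (the other 19 knobs) in exchange for simplicity.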

Finally, there's the decorator. What do you do if your US electronics can't handle 220 V at 50 Hz? Then you need a decorator. Sure, on the surface it looks like an adapter, but under the covers it has a transformer and frequency converter. Plug it into an Austrian wall socket and you can plug your US device in and, if it's a good decorator, the device will never know. Software decorators are the same thing. They add functionality. If you have a data source that pushes its data and a sink that pulls, you need something between them. One example might be a statsd metrics exporter (push) and a Prometheus time-series DB (pull). You need a decorator between them if you want them to talk to each other. The decorator would take the push from the service, buffer it, then respond to the Prometheus pull. So not entirely unlike an adapter, but more.
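In Go, a decorator wraps the same interface it implements and adds real work on the way through (names invented for illustration):

```go
package main

import (
	"fmt"
	"strings"
)

// Source is the interface both the real thing and the decorator
// satisfy, so callers can't tell them apart.
type Source interface {
	Get() string
}

type Plain struct{}

func (Plain) Get() string { return "hello" }

// Uppercase is a decorator: same interface in and out, but it adds a
// transformation on the way through, like the voltage converter that
// looks like a plug adapter but does real work inside.
type Uppercase struct {
	inner Source
}

func (u Uppercase) Get() string { return strings.ToUpper(u.inner.Get()) }

func main() {
	var s Source = Uppercase{inner: Plain{}}
	fmt.Println(s.Get()) // HELLO
}
```

Because decorators preserve the interface, they also stack: you could wrap Uppercase in a caching decorator, and callers would still just see a Source.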


by Leon Rosenshein

YAML

Hard coded config is bad. Making code flexible is good. We can debate that at another time, but let's go with that for now. One of the ways to make your code more generic is to move the instance specific data to an external config file that gets read and used at runtime. That way the same code can run in multiple environments and you don't have to worry about which version is deployed where. And one of the common ways to define that config file is yet another markup language, YAML.

The first thing to keep in mind is that YAML is a superset of JSON, which means that every JSON file is valid YAML, but the converse is definitely not true. The other thing to keep in mind is that while it's possible to have multiple YAML documents in one YAML file, a YAML document can not span files. Which means that while a common YAML file can be used by code in multiple languages, every implementation of merging YAML files is custom and probably not portable across languages.

That said, there are a bunch of things you can do inside a YAML file that are supported cross-platform, and can make your life a lot easier.

Most important, to me, is the ability to define "variables" and then use them multiple times in the file. This lets you put together a set of information and use it in multiple places without having to worry about typos. It can be a single value, or it could be a complex sequence of values, maps, and sub-items. Whatever it is, it is exactly the same everywhere it is used. You can even use the "variable" as a base and extend it differently in different places.
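As a sketch, anchors (`&`) define the value, aliases (`*`) reuse it, and the merge key (`<<:`), which is widely but not universally supported, extends it:

```yaml
# Define the shared values once with an anchor, reuse with aliases.
defaults: &defaults
  region: us-east-1
  replicas: 3

service-a:
  <<: *defaults        # start from defaults...
  replicas: 5          # ...and override just one value

service-b:
  <<: *defaults        # identical everywhere it's used; no typos
```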

Another good one is the ability to handle multi-line strings or long strings without having to scroll 6 screens to the right or slam things against the left. Combined these features let you keep your YAML file readable while maintaining the look/format you want for the end user.
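For example, the two block-scalar styles cover both cases:

```yaml
# '|' (literal block) keeps the line breaks exactly as written,
# preserving the look you want for the end user.
motd: |
  Welcome to the batch cluster.
  Jobs are scheduled best-effort.

# '>' (folded block) joins the lines into one long string, so the
# source stays readable without scrolling 6 screens to the right.
description: >
  This one long sentence is wrapped in the file
  but delivered to the consumer as a single line.
```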

The last one that gets used a lot, especially when working with Kubernetes and kubectl is multiple documents in one file. Instead of having multiple files that you need to keep in sync you can add a separator and then just keep them all in one file. I don't recommend putting ALL your YAML in a single file, but if you need to define multiple related Kubernetes resources like a role, role-binding, and namespace, putting them all in one file can help you keep things in sync.
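A minimal Kubernetes-flavored sketch, with `---` separating the documents (resource names are hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: batch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: batch-reader
  namespace: batch
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
```

One `kubectl apply -f` and the related resources stay together and in sync.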

Which brings me back to merging multiple files. In this case I don't mean concatenating, or extending, but merging as in overriding a base for an instance-specific result. Like having a base YAML file, then overriding the DB endpoint differently for dev, staging, and prod, but having the resulting document have the same structure. While there's no globally portable way (that I know of), there is Uber's config/configfx for Go. The base config library lets you specify how the different environment-specific files are merged into the base, and configfx offers a pre-defined structure that uses the always-available ENV variables to automatically choose the correct files to merge for you. If you're writing Go code, check it out.
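As a hedged sketch of the idea (file names and keys are hypothetical, and the exact merge semantics depend on the library), the environment file only states the differences:

```yaml
# base.yaml -- shared across every environment
db:
  endpoint: localhost:5432
  pool_size: 10
log_level: debug

# production.yaml -- merged over base.yaml; only the differences
db:
  endpoint: prod-db.internal:5432
log_level: info

# Result after merge: the prod endpoint and info logging from
# production.yaml, pool_size 10 inherited from base.yaml, and the
# same document structure in every environment.
```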

by Leon Rosenshein

Roadmaps

One of the key things that makes a shared journey possible is a shared destination. If you're not all going to the same place the chances that you'll end up in the same place are pretty slim. But while a shared destination is key, it's not enough. To have a successful shared journey you need a shared roadmap.

It's not that everyone needs to be traveling the same path. Although many will travel the same path, there are multiple ways to get from A to B, and if you need a stop at C, D, E, F, G, H, and I to pick things up it's probably more efficient to split up and have parts of the group go to each destination. You don't just tell everyone that you'll see them in New York City and hope everyone gets there and they pick up everything that's needed along the way. Instead, you take some time up front and do a little planning. Where are you going? Who's going? Why are you going? What are the stops along the way? What will we need along the way? What will we need when we get there? That's more than a destination, that's a travel plan. A roadmap if you will.

That's just as true for an organization's journey. An org is just a group of people, and whether you're going from one city to another, or rAV to rNA, or say Uber ATG to Aurora, the same basic principles apply. Just saying "Go ahead. I'll see you there on Monday" and expecting everyone to get there on time with everything they need is, shall we say, optimistic? Better to have a roadmap.

So what goes into a roadmap? Every roadmap is different, because every roadmap starts from a different place/situation and goes to a different place, but there are some good questions to ask to figure out what goes into your roadmap. These questions apply not just to the "destination", but any stops (releases) along the way.

  1. What are we building/learning/proving?
  2. Who are the users/customers?
  3. How do we share this?
  4. What are the assumptions?
  5. What do we depend on?
  6. What do we need that we don't have?

by Leon Rosenshein

Momentum

Every object persists in its state of rest or uniform motion in a straight line unless it is compelled to change that state by forces impressed on it
-- Sir Isaac Newton

People have an enormous tendency to resist change. They love to say, ‘We’ve always done it this way.’ I try to fight that.
-- Grace Hopper

Inertia and momentum are real. Change is work. Sometimes hard work. Even when applied to non-physical things, like code. There’s a constant tension between changing everything and changing nothing. In development, as in life, the answer is somewhere in the middle because “it depends”.

Very often things are the way they are for a good reason, even if we don’t know the reason (there’s Chesterton’s fence again). So think critically before you change something. Just because you don’t know the reason for a method/process doesn’t mean it’s not needed. Figure out why it’s there. You might find that it’s actually more broadly applicable and you should be doing more of it.

On the other hand, sometimes we do things the same way out of habit, or cargo culting. We’ve seen it work before, and it’s worked for us, so don’t change it. But again, you need to think critically about it. It’s like Grandma’s cooking secret. Is what you’re doing the real cause of the outcome or would it have happened anyway? Does the current method/process/code solve a problem we once had, but don’t have now?

All of us are on the cusp of a large transition. That liminal state between the before times and the after times. That means we have an opportunity to think before we act and set ourselves up for success in whichever direction we end up going, both individually and collectively. So think about where you want to be, then figure out how to get there. Don’t pick a direction and see where you end up.

by Leon Rosenshein

Big Bang Theory


In the beginning was the command line (actually it was switches, buttons, and wire wrap). There was no "software industry" and everyone did everything themselves, or at least for in-house customers, and the pace of change was low. After all, that's how things were always done. In business, changes on the factory floor were expensive and there was lots of planning. Software was treated the same way.

As software became less bespoke we moved into the era of box products. Come up with an idea. Think about what it would look like. Build it. Put it in a box and send it to resellers. Never think about it again. Repeat every 2 years or so. Between 1980 and the mid 2000's that's how software was done. Updates were hard to impossible to deliver, and lots of (most?) people wouldn't bother to get them anyway.

And the way software was developed followed the same cycle. The big bang approach. Think about it for a while. Make plans. Execute on them. Slip the schedule to fix the problems that weren't understood. Release. Start again.

Things have slowly shifted over the last decade or so. We deliver, especially to internal customers and on the web, a lot more often. Like every day. Or multiple times a day. One might say continuously. Which leads me to test driven development.

Or at least automated unit and integration tests. One of the reasons for the long cycle time in days of yore was the need for extensive, manual testing. The only way to be sure something was right was to build it then throw it over the wall for testing, look at the bug reports, fix them, and repeat. That took time. Because every test cycle tested everything.

Now, it is important to test everything, and to test it together, at least at some point. But you don't always need to test everything. The better you are at identifying the tests that need to run for a given change, the fewer tests you need to run. Fewer tests means less time. Automated tests mean less time. Less time testing means shorter cycle times. Shorter cycle times mean more releases. More releases means more value over time.
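As a sketch of the idea (suite and package names are invented), selecting only the affected tests can be as simple as a reverse dependency lookup:

```go
package main

import "fmt"

// affectedSuites maps each test suite to the packages it exercises.
// Given a set of changed packages, only suites that touch a changed
// package need to run; the rest can be skipped, shrinking the cycle.
func affectedSuites(testDeps map[string][]string, changed map[string]bool) []string {
	var run []string
	for suite, pkgs := range testDeps {
		for _, p := range pkgs {
			if changed[p] {
				run = append(run, suite)
				break
			}
		}
	}
	return run
}

func main() {
	deps := map[string][]string{
		"auth_test":    {"auth", "crypto"},
		"storage_test": {"storage"},
		"api_test":     {"api", "auth"},
	}
	changed := map[string]bool{"auth": true}
	// Only the two suites that depend on "auth" need to run.
	fmt.Println(len(affectedSuites(deps, changed)))
}
```

Real build systems (Bazel and friends) do this from the dependency graph automatically; the sketch is just the core lookup.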

So yes, it's really impressive if you can do a big reveal of the be-all, end-all on day one. But it's not the best way to make your customers happy.