Recent Posts (page 52 / 70)

by Leon Rosenshein

Murder Mystery Theater - Acting All Roles

-- By Andrew Gemmel

There’s been a mild annoyance bothering developers on our team - and likely others - for a few months now. Occasionally the ssh-agent on development machines will die. Since most remote actions need a ussh cert, the remedy for one terminal session is a quick `eval $(ssh-agent)`, or, more permanently, restarting the machine. We all chalked it up to a bad chef configuration or something similar, at least until today.

Today, @mike.deats was debugging a separate IDE issue on his machine and noticed something odd. Without fail, he could reproduce this issue by running all tests in the atg-services repo. OK, that’s disconcerting. A quick bisect effort isolated the problem to a single Golang package. One that I had written. Heavily unit tested, in fact notoriously so. This package is the taskhost program for the BatchAPI. If you’ve ever run a BatchAPI job, you can thank this code for its success.

The taskhost is the thin wrapper between kubernetes and your user code that reports any issues back to the BatchAPI and ensures that your logs end up in the right place. The tests for this program basically mimic various job scenarios in kubernetes, kicking off a number of taskhost processes masquerading as docker containers and observing the state of the filesystem and output streams that result. 

In order to do this in a test environment, the taskhost always interacts with the outside world through a dependency injection context that provides things like a filesystem, log writers, AWS clients, and a shell process runner. 

Or at least, that was true until pull request 981 was landed. This was a late-night code change that I deployed while on-call to mitigate an outage. Long story short, an issue with the rNA log-reader was overwhelming the disks in our cluster and causing machines to hit 100% disk usage and get wedged. To mitigate this, that change deletes the log-reader cache in /tmp between each BatchAPI task run.

If you read through that PR carefully, you’ll notice that the RemoveContents() function I so carefully copy-and-pasted from StackOverflow does not use the dependency-injection filesystem. That’s right: every single time the taskhost unit tests run on a machine, they delete everything in /tmp on the user’s machine, including the ssh agent’s ussh cert.

Wow. It’s a miracle that killing ssh agents was the worst thing that this mistake did. The corresponding fix was as simple as deleting that mitigation code, as the underlying log-reader problem has long since been remedied.

There are a few lessons here. First, hot-fix code is a necessary evil, but checking it in without a careful audit is A Bad Thing. Second, when that evil code is checked in, a ticket to ensure it’s removed as soon as possible would be A Good Thing. Third, debugging can often become a game of murder mystery theater where you are not only the detective, but the murderer and the victim too.

by Leon Rosenshein

GIGO

Even the greatest algorithm can't correct for bad data. Ever hear of photogrammetry? Probably. It's using images to understand the physical world. We use it to map the world. Using stereoscopic techniques and two (or more) pictures of a scene from known positions, you can extract 3D information. Roughly speaking, you find points in each image that correspond to the same thing, then, correcting for all sorts of distortions, use the difference in camera locations and the directions to the point from each camera to calculate the position of that point relative to the cameras. Do that for enough points and you get a depth map. One way to find those points is with the SIFT algorithm. It's really nice because it handles differences in scale and orientation. And with our SDVs the images are taken at the same time, so the world hasn't changed between the images.

For aerial photography that isn't the case. Typically there's one airplane, with one camera, flying over the area, taking one picture at a time, then looping around and flying a parallel track slightly offset. Repeat this pattern all day. To make the needed stereo pairs, images are taken with lots of overlap, typically 80+% in the direction of flight and 20+% between image strips. Using differential GPS, some Kalman filters, and lots of math, you can get pretty good location info for where the camera was when each image was taken, so that part is covered.

What isn't covered is that the world changes. Trees blow in the wind. Cars move. Waves wash up on the shore. Cows walk.

As part of the Global Ortho project we mapped the continental US and Western Europe with 30 cm imagery and generated a 2.5D surface map with about 4 meter resolution. We did this by splitting the target areas into 1° cells and collecting and processing data in those chunks. Turns out that flying each track, then turning around and flying back, takes a few minutes. That means that pictures taken at the beginning of one strip and the end of the next can be 3-5 minutes apart.

And lots can happen in that time. Fast things, like planes, trains, and automobiles, have moved far enough that the SIFT algorithm doesn't try to match them across images. Things that don't move far, like treetops blowing in the wind, get lost in the image resolution. But things that move slowly but keep going have a wonderful effect. Remember that cow that was walking? It probably gets the same SIFT id since it's a 3x5 black spot against a green pasture. And it didn't move that far, so it gets matched with the one from 3 minutes ago. The same thing happens with whitecaps on open water. Then we triangulate. And depending on which way it moved, you either get a spike or a well in the surface model. All because the cows don't stand still.

And those spikes kept lots of folks employed. Their job was to look at the model, find anomalies, then go into a 3D modeling program, and pound them flat. Yes, we gave them tools to find the issues and we did automatic fixup where we could, but we still needed eyes on all of the data to make sure it was good. All because a cow thought that patch of grass over there looked better. Which meant our data was a little messy. And the automation didn't understand messy data.

So keep your data clean. The earlier you identify/fix/remove bad data, the better your results, the less manual correction and explaining of what happened you need to do, and the more your results will be trusted.

by Leon Rosenshein

Test It Again Sam

tdd

Unit tests, integration tests, black box tests, end to end tests, user tests, test driven development, demo days. There are lots of kinds of tests. And they all can provide value. But only if you run the right tests at the right time. And as with so many things, it comes back to context, scope, and scale.

You want to have enough inputs to test that it works, that the different combinations of flags/features/datasets all work together the way you expect. But not just that the correct cases are handled correctly. You need to test that you detect and provide useful error information if the inputs don't make sense, you can't handle them, or something goes wrong during processing. That's the context part.

For scope, you want to run just enough code to test the system under test. There's lots of range to scope. From an individual algorithm to a class/package/executable to a service/distributed service/ecosystem. And your tests and framework need to reflect that.

If you're testing an algorithm then write the algorithm and enough code around it to test that it works per the above. Mock out everything but the algorithm. Provide the data in the expected format. Know what the answer is supposed to be. Remember what you're testing (the algorithm). These kinds of tests are generally called unit tests.

Unit tests can also have a slightly bigger scope. If you're testing the external interface of a class/library/exe then you need to provide enough environment around it to run, but you need to control that environment. This isn't the time to run against the live DB in production. You don't want to upset the production system, and it's hard to make it respond consistently to a test. You want to provide enough constraints so that you're sure what you're testing and that when there's a failure you know where to look.

The next step in scope is the integration test. This is where you're making sure that two things that you know work "correctly" (however that's defined) by themselves also work well together. In the Bing GlobalOrtho project we spent a lot of time using WGS84 coordinates. We threw around a lot of latitudes and longitudes. We did this in the image stitcher and the poisson color blender. And all of the unit tests worked. Perfect. Let's hook these things together. And it worked. Mostly. But the further east/west we went, the weirder it got. Then all of a sudden things started crashing. Turns out some things took latitude, longitude; others took longitude, latitude. It was only during integration that we found the problem. And of course, you need a more complex system to do integration testing, but it's still not the full thing.

Then there are end-2-end tests. *That's* where you run the whole thing, in something not entirely unlike the production environment. With known inputs. Expecting known outputs. Really good for making sure nothing has broken, but not good at all for telling you what went wrong. In Global Ortho, when the color of the output images changed by more than a certain amount, we first had to figure out why. And that usually took longer than the actual fix. But again, without that kind of testing we never would have known.

So what kind of testing is there after end-2-end? You've run out of scope, but now you get to scale. There are a few kinds of scale. Maybe your system can handle blending 50 images of roughly the same place, but what if you have 1,000? Or 10,000? Or your system behaves correctly at 100 queries/sec (QPS), but sometimes you get 10,000 QPS or more? What happens when your dataset grows by 10x? 1000x? More? What about parallelism? Breaking things into 10 pieces might cut your time by almost 90%, but at 100 pieces it fails or takes longer.

Then there's the kind of scale that describes the test space. Your system does the right thing in a few cases, but there's a combinatorial explosion of possible cases and there are millions of tests to run. How do you scale to that?

Then there's black box testing. Go outside your system. Act like a user. Using an entirely different mechanism, test what you're doing with no knowledge of the system other than the external APIs. Even here there are two kinds of tests. Those that make sure things work right, and those that make sure things don't break. Because those are two very different things. And remember, as Bill Gates saw 20+ years ago, even with all the testing, sometimes things go worng.

by Leon Rosenshein

Syntax Matters

But memorizing all of the possible syntaxes (syntaxi?) doesn't. In my career I've spent months/years with _at least_ Ada, Assembly (x86), Bash, Basic, Csh, C/C++, C#, Fortran (4/77), Golang, HTML, Java, Javascript, Pascal, Perl, Python, Scala, SQL, and VisualBasic (v6 and VBA). Then there are the "config" file formats: css, ini, json, xml, and yaml. What about .bzl, .csv, .proto, and .thrift? What about your favorite DSL? Are they config files? Languages? Who knows? Who cares?

Can I sit down in front of a compiler and pound out syntactically correct code in all those languages today? Not even close. I could manage "Hello World" in most of them, with a little help from the compiler/interpreter, but for others (Ada) I don't even remember where to begin, other than that there's a header that defines everything and a separate implementation.

And that's OK. The important thing is to be able to read what's there, understand what the impact is, and understand the structures and data flow well enough to make the change you want without having unintended impact on something else. And in any sufficiently large system the syntax can't tell you that. It can hint, it can guide, but it can't tell you what the class/package/method/library in the next file is actually doing.

Plus, there are lots of good resources available online to help with the syntax part. Between them and your IDE, memorizing where to put a `;`, the order of method parameters, or whether it's `int foo;` or `var foo int` isn't the best use of your time.

So focus on the important things. Understanding the code in front of you. Writing code that the next person can understand. Thinking about WHY you're doing the thing you're doing and if there is a better, more systemic solution. And look up the syntax when you need it.

by Leon Rosenshein

Rubber Ducky, You're The One

On the silver lining front, one nice thing about WDP is that I get to spend more time with my kids. My daughter has taken to sitting with me on and off during the day, sometimes doing her schoolwork, sometimes watching videos, and sometimes being my debugging aid.

The other day she noticed I was arguing with my computer, doing some Google searches, then yelling (quietly) at my computer again. After she got over being surprised that I was using Google to figure things out I started explaining to her what I was trying to do. I was writing a bash script to get the members of an LDAP group and then see which members of that group weren't in a different group. Sounds simple, right? Conceptually, yes, but I wanted to be able to share the code, so I was making it a little more "production ready" than I might otherwise have. It also involved some relatively simple usage of jq to extract some fields and I wanted to pretty print the results in a way I could pipe into the next part of the chain. And things weren't going exactly how I wanted.

So I explained to her the services I was calling, what I expected the results to be, and what I wanted to extract. I explained the weird symbology of bash variables and why there were single quotes, double quotes, pipes and what a /dev/null was. I told her what cerberus was and why I needed to use it. I even complained a little about yab and YARPC and why I wished I didn't have to use it. She asked me some questions and I explained the answers to her. And I got it figured out, got the results I needed, and was able to share the tool and the results I needed. Then I thanked her for being my rubber duck. Initially that confused her even more, but when I explained rubber duck debugging she got that immediately.

For those that don't know, rubber duck debugging is how you do pair programming when you're alone. You explain the problem, the invariants, the processes and the intermediate results to something, traditionally a rubber duck. And you go into as much detail as you need to make sure the duck understands it. What happens quite often is that you realize where your assumptions and understanding don't match reality. It could be a problem with your memory, the documentation, or something else entirely, but you find the disconnect, and you fix it. Or you find the disconnect and you go update your understanding and then you fix it. And even if that doesn't happen your understanding of the problem goes way up and you can then ask a much better question, which means you're much more likely to get an answer that helps. So next time you run into a problem and get stuck, ask a rubber duck.

by Leon Rosenshein

Remote Extreme Programming

You've probably heard of pair programming. That's where two programmers work on one screen with one keyboard, taking turns being the typist and the navigator/observer. Not my favorite way to work long term, but I've definitely taken advantage of it during debugging or exploring a new area with another developer. It's really easy to do in person, and not too hard even now with WDP. Zoom and screen sharing are almost like being there in person.

At the same time we've got codesignal, which we use for phone screens, zoom interviews, and even some in-person interviews. That's (usually) not pair programming, just following along, but the interviewer has the ability to edit at the same time, if desired. In my experience it works great with simple audio, even better when there's video.

What if there were a way to take that experience and use it for live development of code in our codebase, but inside a fully featured IDE? Turns out such a thing exists, at least if you use VSCode. It's a full co-editing session, with simultaneous editing of the same file with visible cursors, and live debugging, where both parties can look at/inspect the parts they want. Picture that. Working with someone, and if you want to inspect a variable, just check. No more saying, "Set a breakpoint on line X," or, "Hover over that variable for me." You could just do those things and explain what you're looking for. I haven't tried it out myself yet, but I'm going to Real Soon Now™. Hopefully someone reading this already has and can let us all know if it's as cool and useful as it looks.

by Leon Rosenshein

Bob The Builder

There are lots of design patterns. The most famous are probably the ones in the Gang of Four's book, Design Patterns. There are lots of good patterns in there. Each pattern has its strengths and weaknesses. Of course, like any good tool in your toolbox, you can use any particular pattern for multiple things. As important as it is to know when to use a pattern, knowing when to NOT use a pattern is more important. Just like you can use a screwdriver as a prybar when you need to, you shouldn't reach for a screwdriver when you need a prybar.

One such pattern is the Builder pattern for constructing new instances. There's a fairly narrow set of use cases, so most of the time it doesn't apply. The Builder pattern is most useful when you need to create an immutable object with lots of overridable defaults. You could create a set of overloaded constructors with the common sets of options, and then another with every option, but that's hard to write, hard to use, and hard to maintain. You could write single-use setters, but how do you ensure they're all called before anything is used? What about validation? How do you know that enough has been set to validate? How do you make it readable? Extendable?

Enter the Builder pattern. Basically, a collection of chained setters on a Builder that is then used to create the object with a final .Build() method. There are a few advantages. The selection and order of the setters is up to the user. There's no partially constructed object lying around to be misused. You get an explicit indication of when all the parameters are set and that it's time to do validation. Your immutable object springs into existence fully formed and ready to use. Got a new parameter with a sensible default? No problem: add it to the Builder and your users won't know until they need it and it's already there. Need more than one of something? Call .Build() multiple times. Need a bunch of things that are mostly the same? Instead of a single chain of builders, bifurcate it at the right point.

Of course, nothing is free. You still need to set all those parameters on your object, so that code doesn't go away. You still need to do validation. Now you need to create a whole new "friend/embedded" thing called the builder. And your builder needs a getter/setter for every parameter, with some validation. So there's a bunch of code you wouldn't otherwise need. If you only have a handful of parameters and they're always needed there's a lot of overhead you should avoid.

But when it's appropriate Builders make things much easier to read/maintain and can help reduce the cognitive load, so next time you find yourself in that situation, consider using a builder.

by Leon Rosenshein

Alles Ist In Ordnung

Everything is in order. That's what coding standards are all about. Making sure everything looks right. Especially the ones we have linters for. Correct?

Sort of. Yes, coding standards include formatting and layout. And that's the part linters are best at finding/fixing and annoying us with. And it's important too. Especially in languages that are _mostly_ agnostic to the amount and location of whitespace. Like C/C++. You can pretty much throw spaces and tabs anywhere you want (outside of string literals) and the compiler won't care. Of course, taken to the extreme that leads to the obfuscated C contest, and no one calls that code readable. And that's led to lots of "standards" for formatting.

Golang is similar, in that it doesn't care much about whitespace, but there's also `go fmt` which *will* format your code the correct way, as defined by the language. So you have two options. Do it your own way or do it the official way. Other languages have more rigid spacing requirements. Python makes you pick an indentation for a block and stick with it, but doesn't care what you pick. Some languages let you specify the type for any variable, some have a default type based on the first letter of the variable's name (FORTRAN).

So formatting is kind of arbitrary, but consistency helps us all. Whether it's conventions for variable/method/class/constant names or spacing, indentation, or one-true-brace, the less mental gymnastics you need to go through when you see a new piece of code, the lower the cognitive load, and as mentioned many times, that's a good thing. Especially when your code-base and programming team(s) are large. What works for 5 people doesn't work for 500.

But coding standards are about more than just raw formatting. That's just the most visible part. The meat of standards is really the set of recommendations about how to structure your code. When to use classes, methods, interfaces, libraries, etc. And those are much harder to quantify/lint for. But to me they're even more important. And it goes back to cognitive load. Making things digestible. Making things clear. Being up front about side effects and possible error cases. Making sure the code clearly identifies the programmer's intent.

And that's where it gets hard. Because there are going to be exceptions. And if you've mandated strict adherence to a standard then there's no clean way to express what you want. So how do you handle that?

First and foremost, by making sure the "standards" are recommendations. Very strict recommendations, but not absolute laws. You need to have a good reason to override, but there needs to be a way to silence the linters when needed. Without that, things get obfuscated just to placate the linter. And we should never let formatting define us.

Second, by making sure everyone knows why the standard includes that rule in the first place. What's it really there for? Is there a better way to achieve that goal than by strictly applying the rule?

Third, by reexamining the standards periodically. Not changing them on a whim, but looking to see which parts are helping, which are hurting, and what areas need better coverage.

Of course, not everyone agrees with me. I've included a link to someone who feels almost entirely the other way around. It's an interesting read, and I encourage all of you to check it out. And then share what you think in the thread. I'll start it off, then go make some popcorn. This should be fun.

by Leon Rosenshein

Time For A Walk

I don't know about you, but I've been spending a lot of time in one spot for the past few weeks. It's great that I've got a workspace with a stand-up desk and enough pixels to keep all the text I need to read big enough to see, but I don't even get to walk to the various conference rooms anymore. And that's a problem for me. No changes of scenery. No break in the pattern. Less exercise. So what's a developer to do?

One option is to take an older idea and update it for today's WDP times. The walking meeting. We've probably all had them. Sometimes it was a 1:1 with the boss, sometimes it was walking down the block to grab a cup of coffee that isn't from the office kitchen. Obviously we can't do things quite the same way now, but in some ways it's gotten easier to have a walking meeting. The meeting is already on zoom, so location doesn't matter. Especially if it's a brainstorming session or a 1:1, it might not matter where you are. Just grab your headphones and phone and take a walk during the meeting.

And, there are real benefits. A change in scenery. A little exercise. Fresh air. Some direct sunlight and vitamin D. And there's evidence that people are more creative when they're moving. Even back in the office when I got stuck on a problem a few circles around the floor often got me out of my rut and moving towards the solution again. Give it a shot. It can't hurt.

by Leon Rosenshein

Staying In Sync

Technology changes. A lot. All the time. So how do you stay on top of it? How do you know what’s coming so you can be prepared?

One important thing to remember is that you can’t be an expert in everything, so don’t try to be. You’ll have your strengths, things that you’re better at, or enjoy working on more, and you should always play to them. There are going to be things that are “outside your wheelhouse”, and that’s fine. Unless you’re working completely isolated, you probably shouldn’t work to make those things your strengths. But that doesn’t mean you should ignore those things.

It starts with knowing the tools you use. Knowing the capabilities of the standard libraries and what the common extensions are. What their strengths and limitations are, and when to use them and when to avoid them. That includes not just the built-in libraries, but also the open source and internal libraries. Don't forget about the other dev tools we use, such as simple editors (vim vs emacs), IDEs (VSCode, JetBrains, emacs), command shells (bash, zsh, fish?), Git, GitHub, and Buildkite, to name a few. Sometimes a bash one-liner is the right solution, sometimes you need PySpark or Zeppelin.

Then there’s the ecosystem we work in. AWS and all its offerings. DevEx/Infrastructure tools (bonsai, sq, infra, batch-api, HDFS, Piper, atlantis, etc). Core Business infrastructure (M3, umonitor, usso/ussh, uOwn, Querybuilder/Queryrunner, etc). To understand that we have the Product Catalog which lets our internal customers know what technologies are supported and should be used.

But what about new things? The up and coming tools/technologies that might not be ready, but you should be thinking about so you’re ready for them when they’re ready for you. For general software development/architecture Thoughtworks is a good place to start, particularly their Technology Radar. InfoQ and DZone are other good resources, and they’ll send you a daily list of articles for topics you sign up for if you want.

And however you learn about new things, share them. That way everyone benefits.