
by Leon Rosenshein

It Depends

Time for a car analogy. What's the right way to make your car faster? More reliable? More efficient? Have higher resale value?

There's really only one answer to all those questions. And that answer is "It depends." It depends on what your priorities are. It depends on where you're starting. It depends on what you mean by those questions. It depends on how much you can spend to meet your priorities. Does faster mean top speed, trap time in the ¼ mile, or 0-60 time? Is reliability about MTBF, cost to repair, or total downtime? Is efficiency about moving one person from home to office, 50 people from a suburb to an urban core, or moving 400T of stuff from one end of a strip mine to another?

The same is true in software development. Want your software to be faster? Want it to crash less? Use fewer resources? Reduce time to market? If someone comes in with a silver bullet and says they know the right answer a priori, they're almost certainly wrong, and if they happen to be correct in your exact case, they got lucky.

Sure, we have best practices, and we should probably follow them, but when you get down to it, those best practices are guidelines. If you really have no clue about what you're trying to do and why, then best practices are a good place to start, until you know better. And that's the thing.

When you know better, you should choose to do the right thing. Because the right thing depends on knowing why you're doing something. Engineering is about tradeoffs, but the only way to make informed decisions is to know what you're trading between, and why. Because *it depends*.

Once you know what you're minimizing and what you're maximizing and what the cost functions are between them, you can get something close to the right answer. For your specific situation. At that particular time. With those particular constraints.

by Leon Rosenshein

Real Engineering

Here’s a question for you. Are you a programmer, developer, computer scientist, software engineer, hardware engineer, or something entirely different? Maybe you’re an artist working in the medium of bits? A data wrangler? Some combination of all of these, depending on the day and the task at hand?

For the last 50 years or so people have been trying to figure out whether software development is an art or a science. Or is it engineering? When I was in college there was no such thing as a degree in software engineering. There were specialized electrical engineers who built computers, there were computer scientists who tried to figure out what to do with them, and there were the rest of us engineers who used them. The math department in the School of Arts and Sciences had a lot to say too, particularly around formal logic and correctness. But for most of us who were writing programs, the computers were tools to do a job. Sometimes we wrote programs to help other people do their jobs, but writing code was almost always in service of some other task. And we treated it that way. Just get it done. Small groups, late nights.

Then I got out into the real world and something changed. I became a “software” engineer instead of a Mechanical and Aerospace engineer. But really, nothing else changed. Then I went to work for a game company, and instead of building software to do something, we built software to sell. And we had deadlines. And we missed them. So we tried to engineer harder. And we still missed our dates. Then I went to work for Microsoft. And they really engineered hard. Waterfall development. Months of planning. Then start doing. Still missed our deadlines a lot, but at least we saw it coming. But it was engineering. Requirements. Design. Plan. Build.

Then came Scrum and Agile and Extreme. Throw all that planning out. Just do something. Figure out the goal along the way. Don’t worry about done, just move fast and adjust as you go. We did ship things more often, but big changes got hard and we never really knew where we were going. It sure didn’t feel like engineering.

So the debate continued. Is it art or science? Craftsmanship or engineering? Lots of people have thought about it and talked about it. I say it’s engineering. Engineering is not about doing the “perfect” thing. There is no perfect thing. It’s about tradeoffs and dealing with uncertainty and doing the best you can to meet the goals and priorities with what you have available. And one of the best explanations of not only that journey, but where we are now and how we can get even better at the process of what we do, comes from Glenn Vanderburg in his Real Software Engineering talk. It’s about an hour long (45 minutes at 1.5x), but well worth the time.

by Leon Rosenshein

Outdoor Sports

Continuing on with the string of GlobalOrtho stories, image capture, both aerial and terrestrial, is, just like operating a robot car, an outdoor sport.

At the heart of the GlobalOrtho project was the UltraCam-G, designed and built by our team in Graz, Austria. Something like 200 MP, taking simultaneous RGB, monochrome, and NIR images at 30cm resolution for the RGB image. And this camera was tested. Countless flights over Graz and the surrounding areas. Calibrated for physical construction, lens distortion, thermal drift, chromatic aberration, and anything else the designers could come up with. The pictures were stunning. The 3D modeling was amazing. Not just 2.5D shells, but full 3D models with undercuts and holes. So we sent it out into the field.

And the feedlots were purple. The edges of the images were red. As I mentioned the other day, there were spikes and holes. How could this have happened? These cameras were tested. Over and over again. And all the tests came back great. We sent one back for recalibration, but the before and after results showed no change, and the test images were spot on.

So we kept digging. And we realized a few things. Color balance. It turns out that Graz and the surrounding areas are Austrian Alps (who would have guessed). Lots of alpine forests and orange tiled roofs. And the software did great in those areas. But there aren't a lot of feedlots. And color correction was done in a lab. Yes, we used sunlight equivalent lighting, but the room was a few meters deep. Outside there were cloudy days, dusty days, humid days, and in some places smoggy days. Plus, the camera flew at 5000m, and with a +/-40° FOV, the amount of air between the camera and the ground was very different between the center of the image and the edge.
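
To put rough numbers on that last point (a back-of-the-envelope sketch of mine, not the project's actual analysis): the slant path from the camera to the ground grows as 1/cos of the view angle, so the edge of a ±40° image looks through roughly 30% more atmosphere than the center.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const altitude = 5000.0                // flight altitude in meters
	const halfFOV = 40.0 * math.Pi / 180.0 // edge of the ±40° field of view

	nadir := altitude                    // path length straight down
	edge := altitude / math.Cos(halfFOV) // slant path at the edge of the image

	// Prints: nadir: 5000 m, edge: 6527 m (31% more air)
	fmt.Printf("nadir: %.0f m, edge: %.0f m (%.0f%% more air)\n",
		nadir, edge, (edge/nadir-1)*100)
}
```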

Geometry. Lots of church steeples and building corners. But no mile-square corn fields with waving stalks. Or pastures with walking cows. Or large lakes. Or high-rise urban cores with deep canyons. Lots of environments that weren't part of the test set. And the software struggled.

Why? Because even though we captured hundreds of thousands of test images and ran hundreds of test jobs, they were all from basically the same operational domain. For all the hours we spent testing, we really only ran a few tests. Then we got out into the real world and the situations were different. So we had to evolve. Make things more dynamic and adaptive. Because that's the way the world is.

by Leon Rosenshein

Murder Mystery Theater - Acting All Roles

-- By Andrew Gemmel

There’s been a mild annoyance bothering developers on our team - and likely others - for a few months now. Occasionally the ssh-agent on development machines will die. Since most remote actions need a ussh cert, the remedy for one terminal session is a quick eval $(ssh-agent), or, more permanently, restarting the machine. We all chalked it up to a bad chef configuration or similar, at least until today.

Today, @mike.deats was debugging a separate IDE issue on his machine and noticed something odd. Without fail, he could reproduce this issue by running all tests in the atg-services repo. Ok, that’s disconcerting. A quick bisect effort isolated the problem to a single Golang package. One that I had written. Heavily unit tested, in fact notoriously so. This package is the taskhost program for the BatchAPI. If you’ve ever run a BatchAPI job, you can thank this code for its success. 

The taskhost is the thin wrapper between kubernetes and your user code that reports any issues back to the BatchAPI and ensures that your logs end up in the right place. The tests for this program basically mimic various job scenarios in kubernetes, kicking off a number of taskhost processes masquerading as docker containers and observing the state of the filesystem and output streams that result. 

In order to do this in a test environment, the taskhost always interacts with the outside world through a dependency injection context that provides things like a filesystem, log writers, AWS clients, and a shell process runner. 
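
For a picture of what that looks like, here's a minimal sketch of such a dependency injection context in Go. All the names here are hypothetical; the real taskhost interfaces surely differ in the details.

```go
package taskhost

import (
	"io"
	"os"
)

// Env is a hypothetical sketch of that dependency injection context;
// not the actual taskhost code.
type Env struct {
	FS    FileSystem    // all filesystem access goes through here, never os.* directly
	Logs  io.Writer     // where task output gets captured
	Shell ProcessRunner // spawns the "container" processes under test
}

// FileSystem is the narrow slice of filesystem behavior the taskhost needs.
// Tests swap in an in-memory implementation; production wraps the real OS.
type FileSystem interface {
	ReadDir(path string) ([]os.DirEntry, error)
	RemoveAll(path string) error
}

// ProcessRunner abstracts launching a child process so tests can fake it.
type ProcessRunner interface {
	Run(name string, args ...string) error
}
```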

Or at least, that was true until pull request 981 was landed. This was a late-night code change that I deployed while on-call to mitigate an outage. Long story short, an issue with the rNA log-reader was overwhelming the disks in our cluster and causing machines to hit 100% disk usage and get wedged. To mitigate this, that change deletes the log-reader cache in /tmp between each BatchAPI task run.

If you read through that PR carefully, you’ll notice that the RemoveContents() function I so carefully copied and pasted from StackOverflow does not use the dependency injection filesystem. That’s right, every single time the taskhost unit tests run on a machine, they delete everything in /tmp on the user’s machine, including the ssh agent’s ussh cert.
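
Here's a reconstruction of the shape of the bug, with hypothetical names rather than the actual PR's code. The pasted helper talks to the real OS, while everything around it goes through the injected filesystem from the sketch above.

```go
package taskhost

import (
	"os"
	"path/filepath"
)

// RemoveContents is a reconstruction of the pasted helper (hypothetical,
// not the actual PR's code). It talks to the real OS directly, so when
// the unit tests run it, it really does empty /tmp on your machine.
func RemoveContents(dir string) error {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if err := os.RemoveAll(filepath.Join(dir, e.Name())); err != nil {
			return err
		}
	}
	return nil
}

// removeContents is the dependency-injected version, using the FileSystem
// from the sketch above. Tests swap in a sandbox, and /tmp is never touched.
func removeContents(fs FileSystem, dir string) error {
	entries, err := fs.ReadDir(dir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if err := fs.RemoveAll(filepath.Join(dir, e.Name())); err != nil {
			return err
		}
	}
	return nil
}
```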

Wow. It’s a miracle that killing ssh agents was the worst thing that this mistake did. The corresponding fix was as simple as deleting that mitigation code, as the underlying log-reader problem has long since been remedied.

There are a few lessons here. First, hotfix code is a necessary evil, but checking it in without careful audit is A Bad Thing. Second, when that evil code is checked in, a ticket to ensure it’s removed as soon as possible would be A Good Thing. Third, debugging can often become a game of murder mystery theater where you are not only the detective, but the murderer and victim too.

by Leon Rosenshein

GIGO

Even the greatest algorithm can't correct for bad data. Ever hear of photogrammetry? Probably. It's using images to understand the physical world. We use it to map the world. Using stereoscopic techniques and two (or more) pictures of a scene from a known position, you can extract 3D information. Roughly speaking you find points in each image that are the same thing, then, correcting for all sorts of distortions, use the difference in camera locations and the directions to the point from each camera to calculate the position of that point relative to the cameras. Do that for enough points and you get a depth map. One way to find those points is with the SIFT algorithm. It's really nice because it handles differences in scale and orientation. And with our SDVs the images are taken at the same time, so the world hasn't changed between the images.
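
As a sketch of the core triangulation, here's the deliberately simplified rectified-stereo version in Go. Real photogrammetry pipelines add full camera models, distortion correction, and bundle adjustment; the numbers below are made up for illustration.

```go
package main

import "fmt"

// depth recovers distance from a rectified stereo pair. A matched point
// shifts disparityPx pixels between the two images; similar triangles
// give depth = focal * baseline / disparity.
func depth(focalPx, baselineM, disparityPx float64) float64 {
	return focalPx * baselineM / disparityPx
}

func main() {
	// Hypothetical numbers: 1000 px focal length, cameras 0.5 m apart,
	// a feature that shifts 20 px between the two images.
	fmt.Printf("depth: %.1f m\n", depth(1000, 0.5, 20)) // depth: 25.0 m
}
```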

For aerial photography that isn't the case. Typically there's one airplane with one camera flying over the area, taking one picture at a time, then looping around and flying a parallel track slightly offset. Repeat this pattern all day. To make the needed stereo pairs, images are taken with lots of overlap, typically 80+% in the direction of flight and 20+% between image strips. Using differential GPS, some Kalman filters, and lots of math, you can get pretty good location info for where the camera was when each image was taken, so that part is covered.

What isn't covered is that the world changes. Trees blow in the wind. Cars move. Waves wash up on the shore. Cows walk.

As part of the Global Ortho project we mapped the continental US and Western Europe with 30 cm imagery and generated a 2.5D surface map with about 4 meter resolution. We did this by splitting the target areas into 1° cells and collecting and processing data in those chunks. Turns out that flying each track, then turning around and flying back takes a few minutes. That means that pictures taken at the beginning of one strip and the end of the next can be 3-5 minutes apart in time.

And lots can happen in that time. Fast things, like planes, trains, and automobiles have moved far enough that the SIFT algorithm doesn't try to match them across images. Things that don't move far, like treetops blowing in the wind get lost in the image resolution. But things that move slowly, but keep going have a wonderful effect. Remember that cow that was walking? It probably gets the same SIFT id since it's a 3x5 black spot against a green pasture. And it didn't move that far, so it gets matched with the one from 3 minutes ago. The same thing happens with whitecaps on open water. Then we triangulate. And depending on which way it moved, you either get a spike or a well in the surface model. All because the cows don't stand still.

And those spikes kept lots of folks employed. Their job was to look at the model, find anomalies, then go into a 3D modeling program, and pound them flat. Yes, we gave them tools to find the issues and we did automatic fixup where we could, but we still needed eyes on all of the data to make sure it was good. All because a cow thought that patch of grass over there looked better. Which meant our data was a little messy. And the automation didn't understand messy data.

So keep your data clean. The earlier you identify/fix/remove bad data the better your results, the less manual correction and explaining of what happened you need to do, and the more your results will be trusted.

by Leon Rosenshein

Test It Again Sam


Unit tests, integration tests, black box tests, end to end tests, user tests, test driven development, demo days. There are lots of kinds of tests. And they all can provide value. But only if you run the right tests at the right time. And as with so many things, it comes back to context, scope, and scale.

You want to have enough inputs to test that it works, that the different combinations of flags/features/datasets all work together the way you expect. But not just that the correct cases are handled correctly. You need to test that you detect and provide useful error information if the inputs don't make sense, you can't handle them, or something goes wrong during processing. That's the context part.

For scope, you want to run just enough code to test the system under test. There's lots of range to scope. From an individual algorithm to a class/package/executable to a service/distributed service/ecosystem. And your tests and framework need to reflect that.

If you're testing an algorithm then write the algorithm and enough code around it to test that it works per the above. Mock out everything but the algorithm. Provide the data in the expected format. Know what the answer is supposed to be. Remember what you're testing (the algorithm). These kinds of tests are generally called unit tests.
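
As a minimal sketch of what that looks like in Go, for a made-up Clamp function: no mocks, known inputs, known answers, edge cases included.

```go
package clamp

import "testing"

// Clamp is a stand-in algorithm, made up for illustration.
func Clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// A table-driven unit test: test only the algorithm, and know
// what the answer is supposed to be, boundaries included.
func TestClamp(t *testing.T) {
	cases := []struct {
		name      string
		v, lo, hi int
		want      int
	}{
		{"inside range", 5, 0, 10, 5},
		{"below range", -3, 0, 10, 0},
		{"above range", 42, 0, 10, 10},
		{"at boundary", 10, 0, 10, 10},
	}
	for _, c := range cases {
		if got := Clamp(c.v, c.lo, c.hi); got != c.want {
			t.Errorf("%s: Clamp(%d,%d,%d) = %d, want %d",
				c.name, c.v, c.lo, c.hi, got, c.want)
		}
	}
}
```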

Unit tests can also have a slightly bigger scope. If you're testing the external interface of a class/library/exe then you need to provide enough environment around it to run, but you need to control the environment. This isn't the time to run against the live DB in production. You don't want to upset the production system, and it's hard to make it respond consistently to a test. You want to provide enough constraints so that you're sure what you're testing and that when there's a failure you know where to look.

The next step in scope is the integration test. This is where you're making sure that two things that you know work "correctly" (however that's defined) by themselves work well together. In the Bing GlobalOrtho project we spent a lot of time using WGS84 coordinates. We threw around a lot of latitudes and longitudes. We did this in the image stitcher and the Poisson color blender. And all of the unit tests worked. Perfect. Let's hook these things together. And it worked. Mostly. But the further east/west we went, the weirder it got. Then all of a sudden things started crashing. Turns out some things took (latitude, longitude) and others took (longitude, latitude). It was only during integration that we found the problem. And of course, you need a more complex system to do integration testing, but it's still not the full thing.
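
One cheap defense against exactly that bug (a sketch of a general technique, not what we did at the time): give the two axes distinct types so a swap is a compile error instead of a crash.

```go
package geo

// Distinct types for the two axes: one way to make the compiler
// catch a latitude/longitude swap. Names are made up for illustration.
type Latitude float64
type Longitude float64

type Point struct {
	Lat Latitude
	Lon Longitude
}

func NewPoint(lat Latitude, lon Longitude) Point {
	return Point{Lat: lat, Lon: lon}
}

// NewPoint(Longitude(-122.3), Latitude(47.6)) no longer compiles;
// swapped arguments become a type error instead of weirdness that
// only shows up far east or west of the prime meridian.
```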

Then there are end-2-end tests. *That's* where you run the whole thing, in something not entirely unlike the production environment. With known inputs. Expecting known outputs. Really good for making sure nothing has broken, but not good at all for telling you what went wrong. In Global Ortho, when the color of the output images changed by more than a certain amount, we first had to figure out why. And that usually took longer than the actual fix. But again, without that kind of testing we never would have known.

So what kind of testing is there after end-2-end? You've run out of scope, but now you get to scale. There are a few kinds of scale. Maybe your system can handle blending 50 images of roughly the same place, but what if you have 1,000? Or 10,000? Or your system behaves correctly at 100 queries/sec (QPS), but sometimes you get 10,000 QPS or more? What happens when your dataset grows by 10x? 1000x? More? What about parallelism? Breaking things into 10 pieces might cut your time by almost 90%, but at 100 pieces it fails or takes longer.
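
That last effect is roughly Amdahl's law plus a per-piece coordination cost. A sketch with made-up numbers shows why 10 pieces can cut your time by ~85% while 100 pieces starts giving the gains back:

```go
package main

import "fmt"

// speedup models Amdahl's law with a coordination cost: a fraction p of
// the work parallelizes across n pieces, (1-p) stays serial, and each
// piece adds a fixed overhead c to the total.
func speedup(p, c float64, n int) float64 {
	return 1 / ((1 - p) + p/float64(n) + c*float64(n))
}

func main() {
	// Made-up numbers: 95% parallel work, 0.1% overhead per piece.
	// Prints roughly: 1: 1.0x, 10: 6.5x, 100: 6.3x, 1000: 1.0x
	for _, n := range []int{1, 10, 100, 1000} {
		fmt.Printf("%4d pieces: %.1fx\n", n, speedup(0.95, 0.001, n))
	}
}
```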

Then there's the kind of scale that describes the test space. Your system does the right thing in a few cases, but there's a combinatorial explosion of possible cases and there are millions of tests to run. How do you scale to that?

Then there's black box testing. Go outside your system. Act like a user. Using an entirely different mechanism, test what you're doing with no knowledge of the system other than the external APIs. Even here there are two kinds of tests. Those that make sure things work right, and those that make sure things don't break. Because those are two very different things. And remember, as Bill Gates saw 20+ years ago, even with all the testing, sometimes things go worng

by Leon Rosenshein

Syntax Matters

But memorizing all of the possible syntaxes (syntaxi?) doesn't. In my career I've spent months/years with _at least_ Ada, Assembly (x86), Bash, Basic, Csh, C/C++, C#, Fortran (4/77), Golang, HTML, Java, Javascript, Pascal, Perl, Python, Scala, SQL, and VisualBasic (v6 and VBA). Then there are "config" file formats: css, ini, json, xml, and yaml. What about .bzl, .csv, .proto, and .thrift? What about your favorite DSL? Are they config files? Languages? Who knows? Who cares?

Can I sit down in front of a compiler and pound out syntactically correct code in all those languages today? Not even close. I could manage "Hello World" in most of them, with a little help from the compiler/interpreter, but with others (Ada) I don't even remember where to begin, other than that there's a header that defines everything and a separate implementation.

And that's OK. The important thing is to be able to read what's there, understand what the impact is, and understand the structures and data flow well enough to make the change you want without having unintended impact on something else. And in any sufficiently large system the syntax can't tell you that. It can hint, it can guide, but it can't tell you what the class/package/method/library in the next file is actually doing.

Plus, there are lots of good resources available online to help with the syntax part. Between them and your IDE, memorizing where to put a ;, the order of method parameters, or whether it's int foo; or var foo int isn't the best use of your time.

So focus on the important things. Understanding the code in front of you. Writing code that the next person can understand. Thinking about WHY you're doing the thing you're doing and if there is a better, more systemic solution. And look up the syntax when you need it.

by Leon Rosenshein

Rubber Ducky, You're The One

On the silver lining front, one nice thing about WDP is that I get to spend more time with my kids. My daughter has taken to sitting with me on and off during the day, sometimes doing her schoolwork, sometimes watching videos, and sometimes being my debugging aid.

The other day she noticed I was arguing with my computer, doing some Google searches, then yelling (quietly) at my computer again. After she got over being surprised that I was using Google to figure things out I started explaining to her what I was trying to do. I was writing a bash script to get the members of an LDAP group and then see which members of that group weren't in a different group. Sounds simple, right? Conceptually, yes, but I wanted to be able to share the code, so I was making it a little more "production ready" than I might otherwise have. It also involved some relatively simple usage of jq to extract some fields and I wanted to pretty print the results in a way I could pipe into the next part of the chain. And things weren't going exactly how I wanted.

So I explained to her the services I was calling, what I expected the results to be, and what I wanted to extract. I explained the weird symbology of bash variables, why there were single quotes, double quotes, and pipes, and what /dev/null was. I told her what cerberus was and why I needed to use it. I even complained a little about yab and YARPC and why I wished I didn't have to use them. She asked me some questions and I explained the answers to her. And I got it figured out, got the results I needed, and was able to share the tool and the results. Then I thanked her for being my rubber duck. Initially that confused her even more, but when I explained rubber duck debugging she got it immediately.

For those that don't know, rubber duck debugging is how you do pair programming when you're alone. You explain the problem, the invariants, the processes and the intermediate results to something, traditionally a rubber duck. And you go into as much detail as you need to make sure the duck understands it. What happens quite often is that you realize where your assumptions and understanding don't match reality. It could be a problem with your memory, the documentation, or something else entirely, but you find the disconnect, and you fix it. Or you find the disconnect and you go update your understanding and then you fix it. And even if that doesn't happen your understanding of the problem goes way up and you can then ask a much better question, which means you're much more likely to get an answer that helps. So next time you run into a problem and get stuck, ask a rubber duck.

by Leon Rosenshein

Remote Extreme Programming

You've probably heard of pair programming. That's where two programmers work on one screen with one keyboard, taking turns being the typist and the navigator/observer. Not my favorite way to work long term, but I've definitely taken advantage of it during debugging or exploring a new area with another developer. It's really easy to do in person, and not too hard even now with WDP. Zoom and screen sharing are almost like being there in person.

At the same time we've got codesignal, which we use for phone screens, zoom interviews, and even some in-person interviews. That's (usually) not pair programming, just following along, but the interviewer has the ability to edit at the same time, if desired. In my experience it works great with simple audio, even better when there's video.

What if there were a way to take that experience and use it for live development of code in our codebase, but inside a fully featured IDE? Turns out such a thing exists, at least if you use VSCode. It's a full co-editing session, with editing of the same file with visible cursors, and live debugging, where both parties can look at/inspect the parts they want. Picture that. Working with someone, and if you want to inspect a variable, just check. No more saying "Set a breakpoint on line X" or "Hover over that variable for me." You could just do those things and explain what you're looking for. I haven't tried it out myself yet, but I'm going to Real Soon Now™. Hopefully someone reading this already has and can let us all know if it's as cool and useful as it looks.

by Leon Rosenshein

Bob The Builder

There are lots of design patterns. The most famous are probably the ones in the Gang of Four's book, Design Patterns. There are lots of good patterns in there. Each pattern has its strengths and weaknesses. Of course, like any good tool in your toolbox, you can use any particular pattern for multiple things. As important as it is to know when to use a pattern, knowing when to NOT use a pattern is more important. Just like you can use a screwdriver as a prybar when you need to, you shouldn't reach for a screwdriver when you need a prybar.

One such pattern is the Builder pattern for constructing new instances. There's a fairly narrow set of use cases, so most of the time it doesn't apply. The Builder pattern is most useful when you need to create an immutable object with lots of overridable defaults. You could create a set of overloaded constructors with the common sets of options, and then another with every option, but that's hard to write, hard to use, and hard to maintain. You could write single-use setters, but how do you ensure they're all called before anything is used? What about validation? How do you know that enough has been set to validate? How do you make it readable? Extendable?

Enter the Builder pattern. Basically a collection of chained setters on a Builder that is then used to create the object with a final .Build() method. There are a few advantages. The selection and order of the setters is up to the user. There's no partially constructed object lying around to be misused. You get an explicit indication of when all the parameters are set and that it's time to do validation. Your immutable object springs into existence fully formed and ready to use. Got a new parameter with a sensible default? No problem: add it to the Builder, and your users won't notice until they need it and it's already there. Need more than one of something? Call .Build() multiple times. Need a bunch of things that are mostly the same? Instead of a single chain of setters, bifurcate the chain at the right point.
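
Here's a minimal sketch in Go, with made-up fields: unexported fields keep the built object immutable, the chained setters override defaults, and Build() is the single validation point.

```go
package server

import (
	"errors"
	"time"
)

// Config is the immutable object; unexported fields mean it can't be
// mutated after Build. The fields are made up for illustration.
type Config struct {
	addr    string
	timeout time.Duration
	retries int
}

// Builder accumulates settings before the Config exists.
type Builder struct {
	addr    string
	timeout time.Duration
	retries int
}

// NewBuilder starts with sensible defaults; callers override what they need.
func NewBuilder() *Builder {
	return &Builder{timeout: 30 * time.Second, retries: 3}
}

// Chained setters: the caller picks the selection and the order.
func (b *Builder) Addr(a string) *Builder           { b.addr = a; return b }
func (b *Builder) Timeout(d time.Duration) *Builder { b.timeout = d; return b }
func (b *Builder) Retries(n int) *Builder           { b.retries = n; return b }

// Build is the explicit "all parameters are set" signal, so validation
// happens exactly once, here.
func (b *Builder) Build() (Config, error) {
	if b.addr == "" {
		return Config{}, errors.New("addr is required")
	}
	return Config{addr: b.addr, timeout: b.timeout, retries: b.retries}, nil
}
```

Usage reads like a sentence: cfg, err := NewBuilder().Addr("db.internal:5432").Retries(5).Build(). And because the Builder outlives any one call, calling .Build() twice, or forking the chain partway, gives you independent objects.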

Of course, nothing is free. You still need to set all those parameters on your object, so that code doesn't go away. You still need to do validation. Now you need to create a whole new "friend/embedded" thing called the builder. And your builder needs a getter/setter for every parameter, with some validation. So there's a bunch of code you wouldn't otherwise need. If you only have a handful of parameters and they're always needed, that's a lot of overhead you should avoid.

But when it's appropriate, Builders make things much easier to read/maintain and can help reduce the cognitive load. So next time you find yourself in that situation, consider using a builder.