Recent Posts (page 52 / 71)

May 1, 2020 by Leon Rosenshein

What's In A URI?

The other day I came across an article that asked if you could identify every part of a URL, then listed the 6 parts of a URL, scheme://domain:port/path?query=string#anchor. Now that's not wrong, in that those 6 parts, put together that way do make up a valid URL, but that's hardly all of the story. In the article's defense, it does say there are other parts and that those are the most common, but if you went into an interview and insisted that was the definition of a URL you wouldn't be "acing" the interview.

In reality, URIs (which include URLs) are made up of 5 parts, scheme://[authority]path?query#fragment, with each of those having its own definition. Scheme and path are required, but authority, query, and fragment are optional.

Yes, domain:port is one example of an authority, but so is bob:Password@contoso.com:6543. You won't see that very often, but it is valid. Between the scheme and the path there are some number of /'s. Sometimes (usually) there's 2 of them, but sometimes there's 3, and occasionally 0. I'm pretty sure 1 / is also valid, but I'm not actually sure. And you can have pairs of them inside the path section. It only looks like an on-disk path. It's not.

According to the spec, query is pretty much open. It's a string. By convention, it's a set of key-value pairs, but that's not required. Fragment is even less clearly defined since it's defined as a sub-resource inside the main URI, so it's totally dependent on the scheme.

And then there's character set. A URI is basically limited to [a-z,A-Z,0-9,._~] with a bunch of caveats depending on which part of the URI you're talking about. Any other character needs to be encoded.

So, as noted, URIs are complicated. And hard to parse correctly. The solution? Don't. You'll only get it wrong and get tripped up later. Use the one built into your language or find an appropriate library. For c++ that seems to be cpp_netlib_uri, and for python, urllib. For Golang/Java/C# (anyone actually using C# ?), there are great implementations in the standard library.

April 30, 2020 by Leon Rosenshein

Else, Switch, And Map

cognitive load

Back when I was in 7th grade I entered some kind of scholastic programming competition. No idea what it was called, or what most of the tasks were. I do remember it was 3 hours on a Sunday afternoon, we were using Basic on a TRS-80, and one of the tasks was to come up with a frequency chart of characters in a block of text. The code I submitted looked a lot like

1 DIM A as string 2 DIM I as integer 3 DIM C as string 4 DIM N(26) as integer 5 INPUT A 6 FOR I = 1 to LEN (A) 7 C = MID(A, I, 1) 8 IF C <> "A" AND C <> "a" THEN GOTO LETTER_B 9 N(0) = N(0) + 1 10 LETTER_B: 11 IF C <> "B" AND C <> "b" THEN GOTO LETTER_C 12 N(1) = N(1) + 1 13 LETTER_C: 14 IF C <> "C" AND C <> "c" THEN GOTO LETTER_D 15 N(2) = N(2) + 1

° ° °

82 LETTER_Z: 83 IF C <> "Z" AND C <> "z" THEN GOTO NEXT_I 84 N(25) = N(25) + 1 85 NEXT_I: 86 NEXT I 87 FOR I = 0 to 25 88 PRINT "There were"; N(I); " "; CHR(65+I); "'s" 89 NEXT I

Hey, It's ugly, but it worked, or at least it mostly did. Lots of copypasta and lots of copypasta errors. In the comparisons, the indices, and the GOTOs. But that was early in my career. Not long after that I realized I could have just used the ASC() function to get the ascii character and use that (suitably adjusted) as the index. So all those sequential IFs turn into a simple set of assignments. With a more modern language a map of character or string to integer would be even easier. The body of the loop turned into N[ToUpper(C)]++ Much easier to read, much less error-prone to type, and handles things that aren't alphabetic characters to boot. My original attempt didn't crash on numbers or punctuation, but it didn't count them either.

The point of this isn't that my code wasn't very clean 40 years ago (it wasn't), but that while you can take a very procedural approach to coding, a better choice is almost always to let the data and data structures guide you. Cascading if's are rarely a good idea. For smaller numbers of choices consider a switch. That can make things a lot easier to read at least.

For multi-dimensional things, you could do nested switches, methods. One approach I like in such cases is a multi-dimensional map. Let's say you need to calculate something which is a function of color, width, shape, and language. You could have one function with all of the inputs and a bunch of internal logic, some of which might be very different (shape/language differences). You could come up with a class hierarchy that handles it, create the classes, make a factory, and use it. Or, write the functions that are different, use the inputs and a map, and decide which one to call. In pseudo-code something like map[shape][language](func (c color, w linewidth, s shape, l language)), and use

calculator := map[shape][language] { {CIRCLE, ENGLISH, CalcCircleRomance}, {TRIANGLE, ENGLISH, CalcTriangleRomance}, {CIRCLE, FRENCH, CalcCircleRomance}, {TRIANGLE, FRENCH, CalcTriangleRomance}, {CIRCLE, ITALIAN, CalcCircleRomance}, {TRIANGLE, ITALIAN, CalcTriangleRomance}, {CIRCLE, RUSSIAN, CalcCircleCyrillic}, {TRIANGLE, RUSSIAN, CalcTriangleCyrillic}, {CIRCLE, BULGARIAN, CalcCircleCyrillic}, {TRIANGLE, BULGARIAN, CalcTriangleCyrillic}, }

func := calculator(shape, language)

if (func == null) { throw Unsupported } result := func(color, width, shape, language) ...

With this pattern the decision logic is collected in one place and it's clear what each pair of shape/language is going to do. The list of supported pairs is easy to see, and it's easy to extend. All of which tends to reduce cognitive load. Which, as I've mentioned before, is a good thing.

Obviously you can take this too far, and at some point a polymorphic hierarchy makes sense. But if you don't need that complexity this is a good compromise.

April 29, 2020 by Leon Rosenshein

Fear, Courage, and Professionalism

Rumors of layoffs suck. Let's be upfront about that. There are no platitudes that make it easy. I know. I've been there. Take all the downsides and issues from a layoff and then layer in lots of fear, uncertainty, and doubt. You don't know if or when something might happen, or even who might be impacted, but life goes on. Work goes on. So what do you do?

I'm not suggesting that you simply ignore rumors, social media, Blind, or news reports. That's just unrealistic, and probably impossible. Yes, the rumors will impact you. It will impact your productivity. It will impact your attitude towards yourself, your work, your co-workers, and your friends/family.

What I have found that works for me, for teams I've worked on, and teams that have worked for me, is professionalism and staying focused. Regardless of the rumors, we all still have our jobs. The tasks that were there yesterday are still there now, and ignoring them won't make them go away. In all cases ignoring the work makes things worse overall, and in most cases ignoring it makes it worse for you personally.

Consider the best case scenario. It was just a rumor and nothing happened. If you had stopped working then you're just that much behind and now you and your team need to make up that lost time.

What if there were some layoffs, but you still have your job. Again, the work hasn't gone away. There will be some kind of schedule adjustment, but the amount of work is the same, just fewer people, so lost time is even harder to make up. How do you want to be thought of in 6 months or a year? How you respond now helps define how you are seen in the future.

And what if the rumors were true and you are impacted. I don't know about you, but I like my coworkers, and I want them to have a positive impression of me. It's highly likely we'll come in contact again, and I don't want to be "that person". Whether it's getting stuff done, sharing as much knowledge as possible, or just listening, it's preparing for my future. And who knows, you might want to work for the company again in the future. You don't want to burn any bridges.

But what about preparing for the future? Like I said, don't ignore the rumors. This is probably a good time to touch up your LinkedIn page and make sure your resume is up to date. Think about what's important to you in your job and your career. You should be doing that anyway, but this is a good reminder. Regardless of how this ends up you'll be having discussions about your career with your boss, so knowing what you want always helps.

Finally, a few quotes from a couple of Frank's (Roosevelt1, Roosevelt 2, Herbert). Always useful, but maybe more timely now.

April 28, 2020 by Leon Rosenshein

80/20 Rule

Ever notice how projects seem to go quickly at the beginning and then take longer and longer when you get close to the end? Or the opposite, that making a few small fixes in a codebase can make everything better? That's the Pareto Principle at work. 80% of the work takes 20% of the time, and the remaining 20% of the work takes 80% of the time. Fixing 20% of the bugs eliminates 80% of the issues customers are seeing.

The Pareto principle isn't a universal content, and it's not about cause and effect. It's a post hoc observation. It's really just fitting reality to a power law distribution. You can't use it to predict which 80% takes 20% of the time, or even to say it's why things were distributed the way they were. That's not just post hoc, that's Post hoc ergo propter hoc

But even with that said, knowing that something is going to come up and take the lion's share of the resources (time, manpower, compute, etc) is something you can use along the way. Especially when you're doing something for the first time and don't have lots of experience. If you have 10 things to do in 10 days and when you start they're all unknown that's OK. However, if you get to day 5 and you've only completed 5 of the tasks, it might be time to start worrying. Sure, you've done half the tasks in half the time, but you have no idea if you've done half the work in half the time. And the Pareto principle tells you that there's a good chance that you haven't. So maybe it's time to tell someone the situation.

The other thing to be aware of, especially when you're trying to have a big impact on things, is to remember that it often applies the other way as well. 20% of your feature set is used 80% of the time. 80% of the processing time might be spent in 20% of your code. So nailing that 20% of your total feature set first or optimizing the right 20% of the code will have a much bigger impact than starting with a random work item and then randomly picking them until you run out of work.

But remember what I said at the beginning. The Pareto principle is a post hoc explanation of what happened. It's not the cause and it isn't a real predictor. It's just something that happens a lot and you need to be aware of. And 80/20 isn't a magical pair. It could be 90/10, 70/30, or it might not even add up to 100. It's just one more thing to keep in mind. 80% of the time.

April 27, 2020 by Leon Rosenshein

Awareness

The world is always changing. And the rate of change appears to be changing as well. The stone age lasted 3000 years, and ended 5000 years ago. The bronze age took the next 1800 years, ending 3200 years ago. Iron ruled for about 1500 years, and technology stabilized for a while. Then things took off again. The industrial revolution, the atomic age, the space race, the silicon era, all in the last 250 years. Or, to put it in perspective, my grandparents grew up before phones, my parents had a party line, and the computing power you have in the phone in your pocket is probably more than the entire world had in 1965. And here we are on the cutting edge of technology. But technology isn't just deep, it's wide. So how do you keep track of things? Not only what's happening now, but what's just being talked about now and won't be a "thing" for 2 or 3 years?

One of the things you can do is find a person or group of people you trust who do think about things like that, and keep track of what they're thinking and saying. In the computer world one of those groups is ThoughtWorks. They're a cross between a consulting company and a think tank. On the think tank side they put out something called the Technology Radar. About twice a year they get together and think about the state of the industry, what's happening, what's hot, what's past its prime, and what to get ready for. I've been paying attention for the last 5 years or so, and they've done a pretty good job of prediction. I think that comes from the fact that they're consultants who need to understand how things really work if they want to get paid.

Regardless, they've been doing it for about 10 years now, and they put together a retrospective of the big changes. It's a pretty good synopsis of how things have changed, along with some predictions on the next 10 years. Something interesting to check out if you've got a few minutes.

April 24, 2020 by Leon Rosenshein

On Language

Just a little humor today.

ACHTUNG

ALLES TURISTEN UND NONTEKNISCHEN LOOKENSPEEPERS!

DAS KOMPUTERMASCHINE IST NICHT FÜR DER GEFINGERPOKEN UND MITTENGRABEN! ODERWISE IST EASY TO SCHNAPPEN DER SPRINGENWERK, BLOWENFUSEN UND POPPENCORKEN MIT SPITZENSPARKEN.

IST NICHT FÜR GEWERKEN BEI DUMMKOPFEN. DER RUBBERNECKEN SIGHTSEEREN KEEPEN DAS COTTONPICKEN HÄNDER IN DAS POCKETS MUSS.

ZO RELAXEN UND WATSCHEN DER BLINKENLICHTEN.

The inspiration for your favorite language:

Python: What if everything was a dict?
Java: What if everything was an object?
JavaScript: What if everything was a dict *and* an object?
C: What if everything was a pointer?
APL: What if everything was an array?
Tcl: What if everything was a string?
Prolog: What if everything was a term?
LISP: What if everything was a pair?
Scheme: What if everything was a function?
Haskell: What if everything was a monad?
Assembly: What if everything was a register?
Coq: What if everything was a type/proposition?
COBOL: WHAT IF EVERYTHING WAS UPPERCASE?
C#: What if everything was like Java, but different?
Ruby: What if everything was monkey patched?
Pascal: BEGIN What if everything was structured? END
C++: What if we added everything to the language?
C++11: What if we forgot to stop adding stuff?
Rust: What if garbage collection didn't exist?
Go: What if we tried designing C a second time?
Perl: What if shell, sed, and awk were one language?
Perl6: What if we took the joke too far?
PHP: What if we wanted to make SQL injection easier?
VB: What if we wanted to allow anyone to program?
VB.NET: What if we wanted to stop them again?
Forth: What if everything was a stack?
ColorForth: What if the stack was green?
PostScript: What if everything was printed at 600dpi?
XSLT: What if everything was an XML element?
Make: What if everything was a dependency?
m4: What if everything was incomprehensibly quoted?
Scala: What if Haskell ran on the JVM?
Clojure: What if LISP ran on the JVM?
Lua: What if game developers got tired of C++?
Mathematica: What if Stephen Wolfram invented everything?
Malbolge: What if there is no god?

Shooting yourself in the foot, language style

What's your favorite? Share in the thread.

April 23, 2020 by Leon Rosenshein

Priority -1

We all have lots of work to do. During the development phase there's often so much work that at any given moment there is more work to be done than there is time to do it. In what we used to call the test phase (now beta or pre-GA) there are defect reports. So how do you know what to work on next and/or when you're done? The traditional answer is priority. When you're doing feature development there are usually 3 or 4 priorities, P0, P1, P2, and sometimes P3. P0 is critical, P1 really important, P2 is nice, and P3 is thanks for mentioning, not gonna happen. Bugs are roughly the same, P0 means fix it immediately, P1 is can't ship/go GA with this problem, P2 people will be sad and we'll fix it in the first patch, and again P3 is thanks for the report, but …

Of course there are variations on this. There's "Unbreak Now" and "Recall class" bugs. Uber Core Business uses Level and Scope to define an outage, with Higher numbers being bigger problems. And as unusual as it is, in many ways that's better.

Because with any leveling scheme there's inflation. When I started at Microsoft tasks and bugs were P1 - P3. And we argued pretty loudly about it. My bug/feature is more important than yours, so I want to be higher priority. There was a lot of passion and energy, so the argument would continue and eventually end up with both at P1. Then the business would shift a little and suddenly there was a new "Most Important Thing". And instead of having the arguments again, the new thing became the most important. And we called it P0 to make sure everyone knew it was the most important thing.

Of course over the next few cycles, instead of everyone arguing for P1, they argued for P0. In general we held the line for a while, but, as with all things inflationary, we eventually lost the battle, and now P0 is the most important thing. Nothing changed except the labels in the tracking system.

I haven't seen a tool that supports P -1 yet, but I'll bet it's out there. We could do it with GitHub if we wanted to :)

But the real problem is that when you have more than 4 items in your list that breaks down. Instead of having actual priorities you just have buckets. And when you have buckets of work you don't have a priority list. You can't determine the order of things getting done because as long as you pull from the right bucket you can do whatever you want and by definition you're doing the right thing. But that's a topic for another day.

April 22, 2020 by Leon Rosenshein

It Depends

best practices context it depends

Time for a car analogy. What's the right way to make your car faster? More reliable? More efficient? Have higher resale value?

There's really only one answer to all those questions. And that answer is "It depends." It depends on what your priorities are. It depends on where you're starting. It depends on what you mean by those questions. It depends on how much you can spend to meet your priorities. Does faster mean top speed, trap time in the ¼ mile, or 0-60 time? Is reliability about MTBF, cost to repair, or total downtime? Is efficiency about moving one person from home to office, 50 people from a suburb to an urban core, or moving 400T of stuff from one end of a strip mine to another?

The same is true in software development. Want your software to be faster? Want it to crash less? Use less resources? Reduce time to market? If someone comes in with a silver bullet and says they know the right answer to that question a-priori, they're almost certainly wrong, and if they happen to be correct, in your exact case, they got lucky.

Sure, we have best practices, and we should probably follow them, but when you get down to it, those best practices are guidelines. If you really have no clue about what you're trying to do and why then best practices are a good place to start, until you know better. And that's the thing.

When you know better you should choose to do the right thing. Because the right thing depends on knowing why you're doing something. Engineering is about tradeoffs, but the only way to make informed decisions is to know what you're trading between, and why. Because *it depends."

Once you know what you're minimizing and what you're maximizing and what the cost functions are between them, you can get something close to the right answer. For your specific situation. At that particular time. With those particular constraints.

April 21, 2020 by Leon Rosenshein

Real Engineering

Here’s a question for you. Are you a programmer, developer, computer scientist, software engineer, hardware engineer, or something entirely different? Maybe you’re an artist working in the medium of bits? A data wrangler? Some combination of all of these, depending on the day and the task at hand?

For the last 50 years or so people have been trying to figure out if software development was an art or a science. Or was it engineering? When I was in college there was no such thing as a degree in software engineering. There were specialized electrical engineers that built computers, there were computer scientists that tried to figure out what to do with them, and the rest of us engineers that used them. The math department in the School of Arts and Science had a lot to say too, particularly around formal logic and correctness. But for most of us who were writing programs the computers were tools to do a job. Sometimes we wrote programs to help other people do their jobs, but writing code was almost always in service of some other task. And we treated it that way. Just get it done. Small groups, late nights.

Then I got out into the real world and something changed. I became a “software” engineer instead of a Mechanical and Aerospace engineer. But really, nothing else changed. Then I went to work for a game company, and instead of building software to do something, we built software to sell. And we had deadlines. And we missed them. So we tried to engineer harder. And we still missed our dates. Then I went to work for Microsoft. And they really engineered hard. Waterfall development. Months of planning. Then start doing. Still missed our deadlines a lot, but at least we saw it coming. But it was engineering. Requirements. Design. Plan. Build.

Then came Scrum and Agile and Extreme. Throw all that planning out. Just do something. Figure out the goal along the way. Don’t worry about done, just move fast and adjust as you go. We did ship things more often, but big changes got hard and we never really knew where we were going. It sure didn’t feel like engineering.

So the debate continued. Is it art or science? Craftsmanship or Engineering? Lots of people have thought about it and talked about it. I say it’s engineering. Engineering is not about doing the “perfect” thing. There is no perfect thing. It’s about tradeoffs and dealing with uncertainty and doing the best you can to meet the goals and priorities with what you have available. And one of the best explanations of not only that journey, but where we are now and how we can get even better at the process of what we do, comes from Glen Vanderburg in his Real Software Engineering talk. It’s about an hour long (45 minutes at 1.5x), but well worth the time.

April 20, 2020 by Leon Rosenshein

Outdoor Sports

Continuing on with the string of GlobalOrtho stories, image capture, both aerial and terrestrial, is, just like operating a robot car, an outdoor sport.

At the heart of the GlobalOrtho project was the UltraCam-G. Designed and built by our team in Graz Austria. Something like 200 MP, taking simultaneous RGB, Monochrome, and NIR images at 30cm resolution for the RGB image. And this camera was tested. Countless flights over Graz and the surrounding areas. Calibrated for physical construction, lens distortion, thermal drift, chromatic aberration and anything else the designers could come up with. The pictures were stunning. The 3D modeling was amazing. Not just 2.5D shells, but full 3D models with undercuts and holes. So we sent it out into the field.

And the feedlots were purple. The edges of the images were red. As I mentioned the other day there were spikes and holes. How could this have happened? These cameras were tested. Over and over again. And all the tests came back great. We sent one back for recalibration, but the before and after results showed no change, and the test images were spot on.

So we kept digging. And we realized a few things. Color balance. It turns out that Graz and the surrounding areas are Austrian Alps (who would have guessed). Lots of alpine forests and orange tiled roofs. And the software did great in those areas. But there aren't a lot of feedlots. And color correction was done in a lab. Yes, we used sunlight equivalent lighting, but the room was a few meters deep. Outside there were cloudy days, dusty days, humid days, and in some places smoggy days. Plus, the camera flew at 5000m, and with a +/-40° FOV, the amount of air between the camera and the ground was very different between the center of the image and the edge.

Geometry. Lots of church steeples and building corners. But no miles square corn fields with waving stalks. Or pastures with walking cows. Or large lakes. Or high rise urban cores with deep canyons. Lots of environments that weren't part of the test set. And the software struggled.

Why, because even though we captured hundreds of thousands of test images, and ran hundreds of test jobs. they were all basically the same operational domain. For all the hours we spent testing, we really only ran a few tests. Then we got out into the real world and the situations were different. So we had to evolve. Make things more dynamic and adaptive. Because that's the way the world is.

Older Newer