
by Leon Rosenshein

Java Isn't Dying

Today's entry in the Betteridge file. Is Java Dying? True to Betteridge's law, no. It's one of those perennial questions, and the answer has always been the same. It fills a need, and it's been deployed to production in countless places. And as COBOL reminded us yet again this year, if it's in production it will never die.

And if you look at the Java ecosystem, it's actually growing. JVM languages continue to grow and evolve. Scala is a fairly functional language on the JVM, and Apache Spark is written in Scala. For you LISP fans there's Clojure. If you're developing for Android there's Kotlin. And of course, if you want COBOL, Micro Focus has Visual COBOL. So developing in/on/with Java is far from dead.

And with all those different languages, there are some deeply ingrained Java/JVM use cases. For example, a lot of distributed processing, from general MapReduce to ML training to log processing, uses PySpark. Yes, there's a lot of Python written to process the data, but under the hood it's Spark and the JVM (Scala) doing the heavy lifting.

Even if you don't use PySpark, you're using Java every day at work without realizing it. Our monorepo is on the large side, and keeping track of dependencies and what's up to date vs what needs to be built is rather complicated. So we use Bazel. And Bazel is written in Java.

Yes, Java is verbose. So what? You've got an IDE to help you. A bigger issue is dependency management. Yes, transitive dependencies can be hard to manage, and sometimes they can even lead to incompatibilities, where one dependency you have transitively includes version X of something and another includes version Y, but there's no version that makes them both happy (I'm looking at you, guava). But you know what? Java at least has a way around the problem: the shaded JAR. If you need to, you can include both versions and the correct one will be used by each part. If you think that's ugly, try looking at npm and what gets dragged into the simplest JavaScript web app. Now that's scary.

But really, none of this matters. Because the programming language you use is a tool. A means to an end, not the end itself. I'm not using Java right now for my current work, and I probably wouldn't choose it if I found myself in a situation that really was a blank slate with nothing pushing me in a direction, but really, how often does that happen? I'm working with Kubernetes, which pushes me towards Go. There are also a lot of libraries and a lot of functionality around Kubernetes and gRPC, which makes Go a good choice. When we built/ran Opus, which grew up in a Spark/Hadoop world, we used Java and Scala because that's what the rest of the tools we needed to talk to used. And it was very successful.

So don't worry about whether the language is alive or dead. Look at what's being used to solve the problems you're solving by the people you'll be solving them with. And use that language. It's good to have the discussion about which language/framework to use, but if there's something already in use, you should probably use it.

by Leon Rosenshein

On Branching

There are lots of different branching strategies. And they have their strengths and weaknesses. And there is no one right strategy. Don't let anyone kid you. There are, however, strategies that are wrong, especially ones that are wrong for a specific time and place.

Given that there is no consistent best strategy, how do you pick which strategy to use? The answer depends heavily on why you're making the branch. Unless you know that, you can't make a good choice. The other thing you need to keep in mind when deciding how to branch is your exit strategy. How (if ever) are you going to merge the branch back in, and what happens to the branch after that first merge?

Are you optimizing for individual developer velocity? Product velocity? Size of development team? Stability? Each of these will push you towards a different strategy.

Long lived feature branches make it really easy for a developer to get a feature to a stable point for a demo, and that developer will feel really good about their progress. Right up until they need to share their work. Because the longer they've worked, the further their codebase has diverged from the shared view. If they're lucky their work is isolated enough that conflicts, physical and logical, are small, but that's usually not the case. So you pay for that individual velocity at the end when it comes time to bring things back together and someone, usually a build team, has to deal with those conflicts.

The most common way to avoid that divergence is to keep branches short lived, on the order of days or less. After all, how far can two branches diverge in a couple days? The corollary to that is of course, how much progress can you make in a couple of days? So lots of tension there. But there are ways to make it easier. Clear boundaries/interfaces. Feature Flags. Small changes. Lots of automated tests. At the limit this is Continuous Integration/Continuous Deployment.
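
Feature flags in particular are what let unfinished work merge to trunk without being visible. Here's a minimal sketch in Go (the flag name and env-var lookup are illustrative; real systems usually serve flags from a config service):

```go
package main

import (
	"fmt"
	"os"
)

// newCheckoutEnabled gates the half-built feature. The code merges
// to trunk daily but stays dark until the flag flips on.
func newCheckoutEnabled() bool {
	return os.Getenv("ENABLE_NEW_CHECKOUT") == "true"
}

func checkout() {
	if newCheckoutEnabled() {
		fmt.Println("new checkout flow") // in-progress work, off by default
		return
	}
	fmt.Println("existing checkout flow") // what everyone still gets
}

func main() {
	checkout()
}
```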

On the other hand, consider the commercially released product, nominally shipped in a box. Or maybe the firmware inside a physical LIDAR product. It's going to be out in the field for years. And you need to support it. And probably make some fixes/modifications. But the hardware itself won't change, and you want to be able to support new hardware/functionality going forward. So you create a release/support branch for that work. That branch is very static. And it's got a simple merge strategy. Never. Things that happen in that branch never go anywhere else directly. You might make the same logical change, but you don't do it as a merge. That lets you continue to do new cool things while supporting the old. In that case you've got a clearly defined reason for the branch and an end state. Go ahead and branch with confidence.

As an organization we've chosen, rightly I believe, to use trunk-based development and optimize for overall product velocity. That lets us do lots of good things. Like minimizing the time between getting a change approved and the time it's in production. Like minimizing the time between a change being done and it being available for others to build on. Like alerting us to problems early.

Taking full advantage of it requires some discipline on the part of developers. To write and maintain good tests. To have and maintain clear boundaries. To use feature flags. To keep changes small. To reduce the side effects of changes. All things that are straightforward to do.

We just need to do them.

by Leon Rosenshein

RCA

RCA can be a lot of things. Most of the usage of RCA I’ve seen has something to do with the Radio Corporation of America and all of the inventions and spin-offs from it. But more recently, RCA means root cause analysis. As in “Why did that really happen?”

As in “Why did the program crash?” Easy. Like it says right there in the stack trace:

```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4a52c6]

goroutine 1 [running]:
main.log(0x0, 0xc0000a9f68, 0x1, 0x1)
	/tmp/sandbox483371455/prog.go:28 +0x66
main.main()
	/tmp/sandbox483371455/prog.go:23 +0x74

Program exited: status 2.
```

So the fix is to test if format is nil and drop the message or log an error. That will always work. But is that the root cause? That’s certainly the proximal, or immediate, cause, but it doesn’t tell the whole story. Why did it really panic? I mean, the program usually works (outside the playground, that is; it always works there, but that’s a different issue). But is that the right fix? To know that, you need to know why it really failed.

Figuring that out is doing root cause analysis. And figuring out the root cause is crucial to making sure it doesn’t happen again. One way to figure out the root cause is to keep asking why.

Why was there a nil pointer dereference on line 23? Because nil was passed in. Why was nil passed in? Because it wasn’t set in the conditional. Why wasn’t it set in the conditional? Because the conditional checks for before/after noon, and before/after 5PM, but ignores those two specific times. Why are those two skipped? Because English is a slippery language and computers are very literal. The requirements say “Greet with good morning before noon and good afternoon after noon”. The code does that. The requirements don’t say what to do AT noon, and the code does nothing. Now depending on how you define noon, the chances of the test happening exactly at noon go down, but because computers are discrete, there’s always a chance of hitting exactly noon.

In this case the right answer is probably to check for hour <= 12 instead of hour < 12, but you’d have to go back to the user to be sure. Or you could go for the snarky developer solution and set the default format to ‘Good <indeterminate time of day because the specs were incomplete> %s\n’.
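
The original prog.go isn’t shown, so here’s a hypothetical reconstruction of the fixed logic in Go, with the original gaps noted in the comments:

```go
package main

import (
	"fmt"
	"time"
)

// greeting reconstructs the selection logic with the gaps closed.
// The buggy version used strict inequalities (hour < 12, then
// hour > 12 && hour < 17, then hour > 17), so at exactly noon or
// exactly 5PM no case matched and the format was left nil.
func greeting(hour int) string {
	switch {
	case hour <= 12: // was hour < 12; noon is now "morning", pending user confirmation
		return "Good morning %s\n"
	case hour < 17: // covers everything up to 5PM with no gap
		return "Good afternoon %s\n"
	default: // was hour > 17, which skipped exactly 5PM
		return "Good evening %s\n"
	}
}

func main() {
	fmt.Printf(greeting(time.Now().Hour()), "Leon")
}
```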

There’s one more why you should probably have asked in this case. Why does log take a *string as a parameter instead of a string? If you fix that problem the compiler will make it much harder to make the mistake. But before you go fix that, you need to understand Chesterton’s Fence. That’s a topic for another day.
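
For the curious, here’s a sketch of that signature change (logPtr is an illustrative name for the original *string version):

```go
package main

import "fmt"

// logPtr mirrors the original signature: a nil *string compiles
// fine and only blows up at runtime.
func logPtr(format *string, args ...interface{}) {
	fmt.Printf(*format, args...) // panics when format is nil
}

// log takes a plain string instead. A string can't be nil, so the
// compiler rules out this whole class of mistake at build time.
func log(format string, args ...interface{}) {
	fmt.Printf(format, args...)
}

func main() {
	log("Good evening %s\n", "Leon") // fine
	var format *string               // nil; never assigned
	logPtr(format, "Leon")           // compiles, then panics
}
```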

by Leon Rosenshein

This Would Never Occur To Me

I've written code for Apple ][ hi-res graphics with its odd interleaving and understood what I was doing. I've done (a little) Fortran 77 on a Cray Y-MP. I've done low-level graphics, from 2D lines to 3D projections to mip-mapped textures, on a CPU and bashed the results into a manually double-buffered framebuffer. I've written shaders for procedural texture mapping and Phong shading (don't remember much about it, but I know I did it). I've written code to use various versions of SSE on Intel chips, with runtime detection of capabilities. So I'm reasonably familiar with graphics and parallel operations.

I'm also familiar with id Software. I downloaded a version of Wolfenstein 3D from a General Atomics server back in 1992. I've played multiple versions of Doom, Heretic, and Quake. I've followed Armadillo Aerospace.

Which is to say that I've not only experienced the amazing things that John Carmack has put together, but I've done similar things (albeit without the commercial success he's had). But you know what? Some people just think differently. I think I follow what he's saying here, but it never would have occurred to me to do texture mapping that way.

by Leon Rosenshein

ISQ vs ESQ

There's a couple of TLAs for you. But what do they mean, and what do you do with them? Let's start with some definitions. ISQ is Internal Software Quality and ESQ is External Software Quality.

External software quality is what your customers see. How easy or hard is it for them to do the things you promised they could do with your software? Is it easy to use? Consistent? Reliable? Full of features? Does it provide useful benefits? Software with high ESQ surprises and delights your users.

Internal software quality is harder to measure and talked about much less often. A good way to think about ISQ is how easy or hard it is to change the software. When you need to add a feature or fix a bug, do you know where to start? How much confidence do you have that when you've finished making the change the change does what you expect, and has no other, unexpected impact? Software with high ISQ may delight developers, but it will never surprise them. 

But which is more important? We often talk about the software triangle, trading off features, quality, and time. And there's more than a little truth to that. But the important thing to remember is that that triangle is, in general, describing ESQ. And you can reduce ESQ overall to get even more time.

ISQ is a little different. Because increasing ISQ makes it easier to improve ESQ. From clear domains and separation of concerns to reduce cognitive load to unit tests and validation to help you be sure something hasn't broken, high ISQ code lets you get more done. Lets you add more value. Lets you be more agile as goals change.

So how do you get high ISQ? That's a question for another day.

by Leon Rosenshein

Long Live The CLI

I've talked about the command line before: how important it is, its history, and some recommendations. And it's still important.

My recommendations center mostly around usability and predictability. Usability, because if you build something and it's not usable, what was the point? Predictability, because while you have one or more use cases in mind, the street finds its own uses for things and you never know how someone else might want to use it. So let them.
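
Here's a minimal Go sketch of those two properties (the flags are hypothetical; the point is self-documenting options with sane defaults, data on stdout, and diagnostics on stderr):

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

func main() {
	// Every option gets a sensible default and a help string,
	// so the tool documents itself via -h/--help.
	name := flag.String("name", "world", "who to greet")
	verbose := flag.Bool("verbose", false, "log extra detail to stderr")
	flag.Parse()

	if *verbose {
		// Diagnostics go to stderr so they never pollute output
		// that some other program might be parsing.
		fmt.Fprintln(os.Stderr, "greeting:", *name)
	}

	// Data goes to stdout, one record per line, so the tool
	// composes predictably with pipes and scripts.
	fmt.Printf("Hello, %s\n", *name)
}
```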

All of that still applies. And it's nice to find someone who agrees with you. It's even better when that person or group takes the time to make it easily consumable. And the Command Line Interface Guidelines (CLIG) are just that. So check them out and keep them in mind next time you write a CLI.


by Leon Rosenshein

Another 'Just No.'

Every year Stack Overflow does a developer survey, and this year was no exception. Some of the questions they ask are about language, framework, library, and platform usage. What gets used the most/least and what is liked the most/least. Like all surveys, some things make sense right away, some are of the "I wouldn't have thought that, but it makes sense" variety, and others are in the "Huh???" category. The one in the last category that always makes me go WAT is Rust. It's #19 in the list of most used languages, but #1 in the most loved, 20% higher than #2. How can so many people love a language they don't use?

But still, that's just data. It's not information, let alone knowledge or wisdom. And some people have tried to turn that data into wisdom, such as this Medium article.

According to that article, we should all be learning and using Julia, since it's the language people are most satisfied with. We don't know what they're doing with it, but they're satisfied, and who doesn't want job satisfaction?

Alternatively, you could pick your language of choice based on popularity and paycheck. You know, what all the cool cats and kittens are using. In that case, C, Java, and JavaScript should be your choice. It certainly makes it more likely that you'll be able to use the Clone pattern from yesterday's list, but it doesn't say much about the quality of the result.

Or, put it all together and let the wisdom of the masses guide you. By this analysis the most well-rounded choices, in terms of community size, salary, and satisfaction, are the big shell languages: Bash, Shell, and PowerShell. I can just imagine a self-driving vehicle built using PowerShell. I can imagine it, but I don't think I want to ride in it.

The right answer to which language to use is, as with most things software, it depends. Here's some wisdom for you. Instead of bringing your favorite language to the problem space, look at the problem space first. See what's being done. See what the pros and cons in the space are. Remember, you're trying to solve a problem in a problem space, not trying to use a language. The language (and its ecosystem, including community, libraries, and toolset) is just a tool you use to solve the problem. It's not a solution you carry with you and then apply to whatever problem you come across.


by Leon Rosenshein

Real World Patterns

You've probably heard of the Gang of Four's Design Patterns, and I've talked about some of them already. There are some really good ideas in that book, and like all things architectural, using the right pattern at the right time makes things easier in the future. Get it wrong and not so much.

Probably the most common mistake I've seen with those patterns is overapplication. People learn a pattern and then apply it everywhere. I've done it. For me the singleton was the worst. When I first saw that I *knew* that it was what I had been looking for. Singletons started popping up in my code everywhere. Inside classes. Across classes. It wasn't until I ran into the Special Order pattern of singleton creation that I realized I had gone too far.
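
For what it's worth, the pattern I was scattering everywhere looks something like this in Go (a minimal, thread-safe sketch; the config type and its contents are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// config is the lone instance's type; the fields are illustrative.
type config struct {
	endpoint string
}

var (
	instance *config
	once     sync.Once
)

// Config returns the single shared instance, created lazily and
// thread-safely on first use. Convenient, and that convenience is
// exactly why it gets overused: it's a global in disguise, and
// every caller becomes invisibly coupled to it.
func Config() *config {
	once.Do(func() {
		instance = &config{endpoint: "localhost:8080"}
	})
	return instance
}

func main() {
	fmt.Println(Config().endpoint)
}
```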

There are plenty of other lists of patterns out there. Below is a list that I think has a lot of connection to code as seen in the wild. I have to admit that I've been known to use the Clone pattern a few times, and Retroactive Formalization is definitely the best way to make sure you meet the design requirements AND have the docs match the code, at least for a while. What other real-world patterns have you seen?



by Leon Rosenshein

Don't Cross The Streams

Domain driven design is all about bounded contexts. I like bounded contexts. They help you know what something is responsible for, and by extension, what it's not (that would be everything else). Having those boundaries (not crossing the streams) makes it easier to keep things straight and reduces cognitive load.

On the other hand, having all those different domains adds complexity, especially if those domains are each different services with some kind of RPC interface. In that case you not only have multiple domains to keep track of, but you could do everything correctly and the RPC could still fail because of a fiber-seeking backhoe, so you need to be able to handle that case. And that increases cognitive load too.
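
Here's a minimal Go sketch of that extra handling (callRemote is a hypothetical stand-in for any cross-domain RPC; the per-attempt deadline, retry loop, and backoff are the ceremony you inherit at every network boundary):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// callRemote stands in for any cross-domain RPC. It respects the
// caller's deadline, which is half the battle.
func callRemote(ctx context.Context) error {
	select {
	case <-time.After(50 * time.Millisecond): // pretend work
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// callWithRetry is code that only exists because the boundary is a
// network: a per-attempt deadline, a retry loop, and backoff.
func callWithRetry(ctx context.Context, attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		attemptCtx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
		err = callRemote(attemptCtx)
		cancel()
		if err == nil {
			return nil
		}
		time.Sleep(time.Duration(i+1) * 50 * time.Millisecond) // simple backoff
	}
	return fmt.Errorf("remote domain unreachable: %w", err)
}

func main() {
	if err := callWithRetry(context.Background(), 3); err != nil {
		fmt.Println(err)
	}
}
```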

Like everything else in software, when to keep things separate and when to bring them together depends on context. Remember, we're not here to build the perfect thing, we're here to add value to the customer. Value today is better than undelivered perfection.

It's a balance, and the right place for the boundary will change over time. How often things change can move the boundary. The more something changes, the more important it is to make it easy to change without impacting others. How many other people/teams/programs need to use the context can move it as well. If there are 100 use cases then having a single version is very important. If there's only one use then it's already used in every case.

There are sometimes good reasons to cross the streams (mix your domains). One of the most common is time. And that can be OK. As long as you document why you're choosing a short-term gain over long-term stability and what the criteria are that will tell you when you need to fix it. For example, you might mitigate an outage by crossing a domain barrier in a client rather than making changes to a few domains, because you need to fix it now. But once things are working and you have the time, go back and clean up the boundaries so everyone gets the benefits.

by Leon Rosenshein

Experience

Some say that experience is the best teacher. And barring learning styles, there's a lot of truth to that saying. And not just learning by doing. Learning from your mistakes. And even better than that, learning from others' mistakes.

But the real key to experiential learning is not the doing of the thing, it's the reflection afterward. In school you might call it the lab report or case study, and at work we call them tech sharing or incident reports, but regardless of the name, the goal is to identify the parts that made a difference and why, so that they can be incorporated into our internal understanding of the world. We can then use that understanding to inform future decisions.

Because if you don't reflect on what happened and why, you're not really learning, you're just setting up a conditioned response. The next time something happens that is similar enough to trigger your filter, you'll make the same response. Of course the situation is not the same. It might be close, but if nothing else time has passed. And if the trigger is close enough the same response will probably work. Thus reinforcing your "learning."

Which brings us to the story of the 5 monkeys. You know, the one with the ladder to a banana on a string. If you don't know it, the short version is that a group of monkeys learns that climbing the ladder brings on group punishment. Once they've all really learned that, the monkeys are slowly replaced. Each new monkey learns that climbing the ladder gets them punished by the other monkeys. Eventually none of the monkeys have ever experienced the punishment, but they continue to train new monkeys in the group not to climb the ladder. While the story is mostly apocryphal, there's still a lesson to be learned.

Learn from your experiences, but reflect upon them and understand what the learnings really are. As we go forward we come to inflection points where a small change can have a large impact. We're approaching such a point, so let's make sure that our decisions now are based on the learnings from our experiences, not just "that's the way we've always done it here."