by Leon Rosenshein

The Juice Shop

I didn't get this posted during security awareness month, but it's timely nonetheless. Security is everyone's business, whether you're building a robot, a two-sided marketplace, a shared infrastructure, or an online juice shop. Lots of things to keep in mind, and it's more than just "Who has access to what data?". What and how you deny access to something can tell an attacker a lot about your system and the protected data. Then there's the internal side, where good logging and ease of debugging are in direct conflict with keeping the data secure. And don't forget data combining. If you have enough "innocuous" pieces of data you can put them together into PII.

There are lots of don'ts when you think about security. At the top of the list is "Don't roll your own encryption". Instead, ask for help. Knowing you don't know is more than half the battle. Then there are other little things, like don't trust user input, don't assume good intent, and don't assume your API is only used internally.

There are also things you should do. At the top of that list is "Design security in from the start". Ask for help. Do threat modeling. Make your security model granular enough. Admin and non-admin might be enough, but do you need more? Make sure you maintain and update your systems as vulnerabilities are found (you do have a configuration-based, repeatable, verifiable deployment system, right?).

One way to play around with security and pen-testing is the OWASP Juice Shop. If you want to try it out and see what vulnerabilities you can find check out the opus link. Just set the User and Job Name fields then hit "Create". Follow the link that will show up in the top right of the screen, go to the details page of task 0 and click on the "http" port in the top right of that page. You should see your own little Juice Shop website to play with. Try it out and see how many vulnerabilities you can find. There's also a CTF version, which might be a fun little exercise for the group at large. If anyone is interested in getting that set up let me know.

by Leon Rosenshein

What Are You Reading?

There are lots of great books on being a developer, and not just books about how to code (although they are kind of important). Algorithms, design patterns, compilers, all important. But there are also books that aren't directly about writing code. Books like The Cathedral and the Bazaar, In The Beginning Was The Command Line, Debugging the Development Process, or maybe, Zen And The Art Of Motorcycle Maintenance. Books that talk about the bigger picture: how things fit together, and why we're doing the things we're doing anyway. I ran across this list that says it's got 7 Essential Books for Programmers. What books have made a difference in your career, both in your field, and in general?

by Leon Rosenshein

Options

We've got lots of them. Almost always there's not one right answer for a given set of constraints, but there are often wrong ones. One of our jobs as engineers is to balance those constraints and make the right trade-offs for now and what we know about the future. We also need to keep in mind that the constraints can/will change over time, and the "right" answer might shift.

Consider joining two text files and doing some simple analysis on the results. You can do it with everything from an awk one-liner to a giant Spark job that uses tens (hundreds?) of nodes. Which one you choose depends, among other things, on how big your data is, where your data lives, how often you want to do the analysis, and how important it is to ensure that the analysis happens unattended on a schedule. If it's just you working on a few MB of data, and you know awk, then that one-liner might be the right answer. If, on the other hand, you're dealing with TB of data in an overnight pipeline that needs to be done by 6:00 local the next morning, then ensuring the data is loaded into Hive on ingestion every day and having a distributed query that runs in a managed workflow system might be a better choice.
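To make the small end of that spectrum concrete, here is a minimal sketch of an in-memory join in Go. It is not the awk one-liner mentioned above, just an illustration of the same idea. The tab-separated "key, value" layout, the sample data, and the names readKeyed and joinOnKey are all assumptions made up for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readKeyed parses "key<TAB>value" lines into a map.
// Lines without a tab are skipped; if a key repeats, the last value wins.
func readKeyed(text string) map[string]string {
	out := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		if key, val, ok := strings.Cut(sc.Text(), "\t"); ok {
			out[key] = val
		}
	}
	return out
}

// joinOnKey performs an inner join of two tab-separated texts on their
// first column. The right side is held entirely in memory, which is the
// trade-off: fine for a few MB, not for TB.
func joinOnKey(left, right string) []string {
	lookup := readKeyed(right)
	var joined []string
	sc := bufio.NewScanner(strings.NewReader(left))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), "\t")
		if !ok {
			continue
		}
		if other, found := lookup[key]; found {
			joined = append(joined, key+"\t"+val+"\t"+other)
		}
	}
	return joined
}

func main() {
	// Hypothetical sample data standing in for two small files.
	users := "1\talice\n2\tbob\n3\tcarol\n"
	cities := "1\tSeattle\n3\tDenver\n"
	for _, row := range joinOnKey(users, cities) {
		fmt.Println(row)
	}
}
```

Everything here lives in one process and one map. Once the data no longer fits in memory, or the job has to run unattended every night, the constraints have changed and so has the right answer.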

by Leon Rosenshein

Assumptions

You know what happens when you assume. That happens everywhere. Especially with computers. They're logical. They do exactly what you tell them, even if that's not what you want. And of course, they're all the same. Or maybe they're not. Look at this from the 90s.

How many times have you heard the phrase "It works on my machine."? Of course it does. You built a bespoke system in an environment and it works. And our machines are centrally managed, so they're all alike, aren't they? Well, I don't know about you, but I know mine is a snowflake. It's been adjusted to make me happy. It has my toolset. In the places I want them. It has my bash setup. My path, my macros. You have the same setup, don't you? What do you mean you're not on a Mac? And you're running tcsh? How are we supposed to interop? Very carefully. Just look at what it takes to decide what OS you're on in a bash shell.

And then there's kernel and tool versions. Right now we have a mix of Ubuntu 16 and 18. I think we've gotten rid of the last of the Jessie machines, but I'm not sure. OS X tools are different from the tools on Ubuntu. And not esoteric tools; tools like du, head, ps, and netstat have different options and act differently. And the file system is laid out slightly differently, including your home directory. So you need to be aware of those differences when you write tools/scripts/libraries for others to use.
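One way to avoid baking in those assumptions is to ask the runtime instead of guessing. The sketch below is in Go rather than bash, so it is not the shell snippet referred to above. It uses runtime.GOOS and os.UserHomeDir from the standard library; the platformName helper and its labels are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// platformName maps a GOOS value to a friendlier label.
// Unknown values are passed through unchanged rather than guessed at.
func platformName(goos string) string {
	switch goos {
	case "darwin":
		return "macOS"
	case "linux":
		return "Linux"
	case "windows":
		return "Windows"
	default:
		return goos
	}
}

func main() {
	fmt.Println("platform:", platformName(runtime.GOOS))

	// Ask for the home directory instead of assuming a fixed layout;
	// it differs between macOS and Linux.
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("could not determine home directory:", err)
		return
	}
	fmt.Println("home:", home)
}
```

This only covers the OS and the home directory. Differences in tool options still have to be handled wherever a script shells out to those tools.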

And then there are the general assumptions developers make. Things like address formats, name formats, or that time is a monotonically increasing function. My personal favorite is assumptions about distributed programming, and we're all involved in distributed programming.

So what's your (least) favorite assumption you've had to fight against? Share it in the thread.

by Leon Rosenshein

Silos

They're not just places to store your grain. Information gets into silos. Processing gets into silos. And we all know that silos are a bad thing, right?

Duplication of effort. How many configuration management/payment/SWN systems does Uber need anyway? Everyone should be using Thrift, and custom Thrift over TChannel, not Apache Thrift, unless you're working with gRPC, or over Muttley, or with YARPC, or maybe even Hyperbahn.

Working at cross purposes. Ever hear of NetDocs? Microsoft once tried to build an alternative/replacement for the Office suite. Staffed it big. Got some great ideas, then collapsed the team and brought the ideas into Office. Or maybe Kin. At the same time that folks were working on Windows Phone, there was a team building the MS Kin, a feature phone for the smartphone era.

On the other hand, there are concepts like separation of concerns, DRY, IDLs, and the core Unix philosophy of do one thing and do it well (then chain them together into a pipeline). We think of these things as good. And they are. But they're also all silos. So maybe silos aren't all bad. Being able to focus on one thing and having clear, well-defined boundaries and interfaces with your neighbors makes you more productive.

Like most things, silos, in and of themselves, aren't bad. Focused responsibility is good. To use Dara's term, "Who's got the D?" What is bad is when required information, plans, and goals get stuck in one place and don't make it to where they're needed. And that's the hard part. How do you balance providing enough information without wasting people's time? How do you get enough info to make the decisions you're responsible for without getting buried in the deluge? I've found that the best approach is to make as much info as possible available in a well-known place (email group, file memos, architectural decision records, etc.) and let those who need it consume it at their pace. On the other hand, I like to see lots of information fly by and pick out the relevant nuggets and know where to go for more detail. YMMV. Share your best practices (and anti-patterns) in the comments.

by Leon Rosenshein

GOLANG Oddities

Don't get me wrong, I like using Go. It's clean, the standard library has almost everything I want/need, it's cross platform, and it's fast. VSCode, IntelliJ (and flavors), Emacs, and Vim have good support. Tabs vs spaces isn't a thing (gofmt has the one true answer). There are lots of people around that I can go to and get good answers. But...

Golang is opinionated about where things go. GOPATH is king, and woe betide any who think they know better. Including your favorite IDE. IntelliJ/VSCode have their own ideas about how to build things and where they should go, and it's NOT bazel's idea. So we have things like setup-gopath and copy-genfiles.

Dependency hell. Golang sort of eliminates that, because there are no dependencies. You just copy everything you need into your GOPATH and build/link. Simple. Works great in a monorepo with no external dependencies. But that's not our world. Go modules help, but there's still the version management problem.

Slices. Slices are not arrays. Repeat that a few times. Now do it again. There are lots of good reasons for slices to be what they are, and they let the compiler produce some very fast code. Just don't forget and you'll be OK. If you do forget things start to feel like quantum entanglement. Spooky action at a distance.
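A minimal sketch of that spooky action, assuming nothing beyond the standard slice semantics. The helper names appendToView and appendToCopy are invented for this example, and both expect a base slice with at least two elements.

```go
package main

import "fmt"

// appendToView takes a two-element sub-slice of base and appends v to it.
// The sub-slice shares base's backing array, so if base has spare capacity
// the append silently overwrites base[2].
func appendToView(base []int, v int) []int {
	view := base[:2]
	return append(view, v)
}

// appendToCopy does the same thing on an independent copy,
// so base is left untouched.
func appendToCopy(base []int, v int) []int {
	view := make([]int, 2, 3)
	copy(view, base[:2])
	return append(view, v)
}

func main() {
	a := []int{1, 2, 3, 4}
	got := appendToView(a, 99)
	fmt.Println(got, a) // a[2] has changed even though we never assigned to it

	b := []int{1, 2, 3, 4}
	got = appendToCopy(b, 99)
	fmt.Println(got, b) // b is unchanged
}
```

Another way to get the safe behavior is the full slice expression base[:2:2], which caps the sub-slice's capacity so that append has to allocate a new backing array.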

Type safety. But use ...interface{} since you don't know what's coming and that lets you get anything and figure out what it is later.
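For anyone who hasn't run into it, this is roughly what "figure out what it is later" looks like. The describe function is a made-up example; the type switch is the standard Go mechanism for recovering a concrete type at run time.

```go
package main

import "fmt"

// describe accepts any number of values of any type. The compiler can no
// longer check what callers pass in, so the types are sorted out at run
// time with a type switch.
func describe(values ...interface{}) []string {
	out := make([]string, 0, len(values))
	for _, v := range values {
		switch x := v.(type) {
		case int:
			out = append(out, fmt.Sprintf("int %d", x))
		case string:
			out = append(out, fmt.Sprintf("string %q", x))
		case error:
			out = append(out, "error "+x.Error())
		default:
			// Anything not anticipated above ends up here.
			out = append(out, fmt.Sprintf("unknown %T", x))
		}
	}
	return out
}

func main() {
	for _, line := range describe(42, "juice", 3.14) {
		fmt.Println(line)
	}
}
```

The default branch is where the lost type safety shows up: a value nobody planned for compiles fine and only gets noticed when the code runs.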

So what's your pet peeve and workaround? Share in the thread.

by Leon Rosenshein

The Gift Of Feedback

People keep saying feedback is a gift. And as the recipient of feedback, that's a good approach to take. If someone goes out of their way to give you feedback you should at least think about it. You certainly had enough of an impact to make the person take the time to provide the feedback, so take the time to consider it.

But what do you do if you're not getting feedback? You could assume you're perfect. That would be nice, but really? I know I'm not perfect so I have to ask. It's also built in to our perf process, so it must be important, right? So how do you effectively ask for feedback? How do you get someone to provide their valuable time to help you? I've found a couple of things that help.

First, enlightened self-interest is a great motivator. Don't just ask for feedback; explain that you want to make <insert area here> better for the person and you want their feedback on how you could do a better job of that.

Second, generally speaking, people want to help. So ask for advice, not feedback. Feedback is work and critical. Advice is easy and helpful.

So, practicing what I'm saying, I want these little notes to be interesting and impactful for you, the reader. What advice can you give me to make these notes more useful for you?

by Leon Rosenshein

Language Security

In the spirit of Security Awareness month and in today’s homage to bad research, I present What are the most secure programming languages. There is so much wrong with this doc I don’t know where to start. I’m sure there are many data scientists here who can quote chapter and verse about the flaws, but there are a couple I want to touch on.

The first has nothing to do with data science or research, it’s on the editorial/marketing side. 17 pages of words, charts, and graphs, and they never actually say what the most secure languages are. They list all kinds of problems, but they never follow through on their promise.

Second, the most common vulnerabilities listed, Cross-Site Scripting (XSS), Input Validation, Permissions, Privileges, and Access Control, and Information Leak / Disclosure, are stupid human tricks. Yes, C/C++ makes it harder to get memory right and has more buffer overrun errors, but come on folks. Let’s not blame the language for our mistakes. Input Validation? Permissions, Privileges, and Access Control? We should know better than that.

by Leon Rosenshein

Growing

How do you grow as a developer? How do you get better at your craft? What should you be doing? There are lots of recommendations online, lots of proposed approaches. I think it comes down to a couple of things though. The first is practice. Solve lots of different problems. Not the same thing over and over again.

One of my old marching band instructors said "Don't practice until you get it right, practice until you can't get it wrong." That works for marching band, where you're doing the same thing over and over again, but not for us where we're trying to solve new problems all the time. In "Outliers" Malcolm Gladwell wrote that it takes 10,000 hours to be an expert. And it's 10,000 different hours, not the same hour 10,000 times. Not just trying new things, but making new mistakes and learning from them.

The second is reading and reviewing code. Reading and reviewing code does lots of different things. It helps you know what else is going on around you and how your part fits in. It helps you learn new things by seeing how other people solved problems. It also makes you teach. When you see something that seems wrong in a PR and you need to explain yourself then one of the first things you need to do is deeply understand the point you're trying to make.

What do you all think? What makes a senior developer?