by Leon Rosenshein

Zip It

Tales from the trenches.

Geometry is hard. Let's say you need to take pictures of every street in the US, or at least every address. Because street-level imagery is fairly time-sensitive, you want to cover each area quickly, so you run "squads" of vehicles in an area. And for technical reasons you need to keep those vehicles from operating too close to each other. How would you break the country into workable areas? That was our challenge gathering street-level imagery for Bing Maps.

The answer we came up with was ZipCodes. Seemed like a reasonable idea. If a small number of mail carriers are supposed to be able to reach all of the addresses in a ZipCode every day, then a single vehicle should be able to cover the streets in a day or two. Every address is in one (and only one) ZipCode, so that will keep the vehicles apart, which is also good.

So we went to implement it. And automate it. There were a bunch of startup problems, like the fact that the Post Office doesn't provide ZipCode data, but we figured out ways around them. It mostly worked, but oh, the edge cases.

It turns out that ZipCodes aren't areas. They're sets of points, or more correctly, every address is associated with a ZipCode. But that's not too bad. ArcGIS can take sets of points and turn them into non-overlapping polygons. Except… While every address is in one ZipCode, the polygons covering all of a ZipCode's addresses are not always contiguous. Unless you're ArcGIS, which makes them contiguous. By creating tiny tendrils that connect what would be islands. Except where it can't. So you have islands there too. So we did the usual solution. We automated everything, including validation that the automation worked. Then we took the parts that failed validation and sent them to Marcus (thanks @eisen) to fix. Because Marcus is a wiz at ArcGIS and can make sense out of anything.
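
To give a feel for what "validation that the automation worked" can look like, here is a minimal sketch in Python. It is not the code we ran, and it doesn't touch ArcGIS at all. It assumes the map has already been rasterized into a grid where each cell holds a ZipCode label, and it flags any ZipCode whose cells fall into more than one connected piece (the islands). The names `count_components` and `find_fragmented_zips`, and the labels `Z1` and `Z2`, are made up for illustration.

```python
from collections import deque


def count_components(cells):
    """Count 4-connected groups in a set of (row, col) grid cells."""
    remaining = set(cells)
    components = 0
    while remaining:
        components += 1
        # Flood-fill outward from an arbitrary unvisited cell.
        queue = deque([remaining.pop()])
        while queue:
            r, c = queue.popleft()
            for neighbor in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    queue.append(neighbor)
    return components


def find_fragmented_zips(grid):
    """Return {label: piece_count} for labels split into more than one piece."""
    cells_by_zip = {}
    for r, row in enumerate(grid):
        for c, label in enumerate(row):
            cells_by_zip.setdefault(label, set()).add((r, c))
    counts = {label: count_components(cells) for label, cells in cells_by_zip.items()}
    return {label: n for label, n in counts.items() if n > 1}


# Toy rasterized map with hypothetical labels: Z1 is one piece,
# Z2 is split into a left island and a right island.
grid = [
    ["Z1", "Z1", "Z2"],
    ["Z2", "Z1", "Z2"],
    ["Z2", "Z1", "Z2"],
]

fragmented = find_fragmented_zips(grid)
print(fragmented)
```

Anything that shows up in `fragmented` is a candidate for the "send it to a human" pile. A real check would work on the polygons themselves and would also have to catch the tiny tendrils, which a simple connectivity test like this one happily accepts as contiguous.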

And if you're wondering why we needed to keep the vehicles far apart, the problem was that the original firmware in the SICK LIDAR units we used (along with cameras, INS, and GPS systems) was written in a way that assumed two units would never be pointed at each other. Because if they were, they would quickly burn out each other's sensors, rendering them useless. We eventually got SICK to fix that problem, but that's a whole different story.