Monthly Archives: February 2010

Truncating Floats in OpenLayers and SQLServer

A perfectly valid question when dealing with map coordinates is “How accurate do we need to be?” For some applications, a tenth of a degree is more than accurate enough while for others, several more decimal places are needed. Sometimes this question is answered for you: if your data source only stores four decimal places, then that’s all the precision you’re going to have. If you’re in the lucky (unlucky?) position of generating your own coordinates, one common answer to the “how accurate” debate is “store it all”. This is the path Sajara chose, mostly because we didn’t have a good reason to choose a less precise solution over a more precise one. It just so happens that the precision limit of SQLServer’s float data type is tied not so much to the number of decimal places as to the number of numeric digits to be stored. It allows up to 16 numeric digits, plus a decimal point and a negative sign as needed. Sajara works with coordinate systems in both meters and degrees, so depending on which system we’re using for a given implementation, we could be storing a value far more precise than is even visible to the naked eye.

Fast forward a few years and bring OpenLayers into the mix. We rewrote the asset editing portion of the software to allow data managers to move asset coordinates using an OpenLayers map. These coordinates were saved with still considerably more precision than we needed, but remember, we’re storing whatever precision we get. So far so good.

Now back to the present and we’re working on a comparison tool for our data managers. Suddenly values in the database are not matching the values coming out of our OpenLayers map. Almost, but not quite. In fact, only the last few digits are different. After a bit of digging, we discovered that OpenLayers was returning numbers with between 1 and 3 fewer decimal places than our stored coordinates. Remember that we’re talking about distance differences smaller than a crack in the sidewalk here, but programming languages don’t know anything about “close enough”. Either two numbers are the same or they aren’t, and -39.6827663878 is not the same as -39.682766387 no matter how small the physical difference is. So we started digging for the reason.

OpenLayers has a value tucked away in its utility files that sets the default precision of a floating point number to 14 characters. This limit was added when a user noticed that the edges of certain coordinate systems were not behaving correctly due to some floating-point math precision errors. While the OpenLayers community recognizes that most systems allow floats to have 16 digits, “14 significant digits are sufficient to represent sub-millimeter accuracy in any coordinate system that anyone is likely to use with OpenLayers”. So OpenLayers’ answer to the accuracy question is to save everything that will fit in a standard float, with a few decimal places pared off just in case.
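To make the mismatch concrete, here is a minimal sketch in JavaScript. It assumes the library limits precision in roughly the way `Number.prototype.toPrecision` does; the helper name `limitPrecision` and the sample coordinate are illustrative and are not taken from OpenLayers or Sajara.

```javascript
// Hypothetical helper: round a number to a given count of significant digits.
// A precision of 0 is treated as "leave the number untouched".
function limitPrecision(value, precision) {
  if (precision === 0) {
    return value;
  }
  return parseFloat(value.toPrecision(precision));
}

// Illustrative coordinate with 16 numeric digits, as it might be stored.
const stored = -39.68276638781234;
const fromMap = limitPrecision(stored, 14);

console.log(stored === fromMap);                   // false: the trailing digits were dropped
console.log(limitPrecision(stored, 0) === stored); // true: nothing was changed
```

The physical difference between the two values is far below anything visible on a map, yet a strict equality check still reports them as different.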

So the next question is: “So what?” The difference between 14 and 16 digits in a meter-based coordinate system is microscopic, and in a degrees-based one it’s not much bigger. As far as storing a coordinate in Sajara goes, we didn’t really care if we had 16 digits or 14 digits; the result wouldn’t look any different to our audience. However, since our initial coordinates had 16 digits and OpenLayers only preserved 14 of them, any programmatic comparison fails! No one likes to deal with false positives, but a 100% false positive rate was unacceptable.

We had a few choices here. First we could reset the default precision value in OpenLayers to zero, which would tell the library to never truncate anything. That’s a fairly simple change but we weren’t sure it wouldn’t have unforeseen data effects. Also, there’s a somewhat vague warning about problems with the Web Mercator projection when this value is zero, which is one of the projections Sajara can use. So that option was out.

Second, we could have told SQLServer to alter the precision of coordinate values to 14, which is a fairly major change. This option was ruled out because of a difference in the definition of “precision” between SQLServer and OpenLayers. I mentioned earlier that SQLServer will store a maximum of 16 numeric digits plus a decimal and a negative sign, so a total of 18 characters. OpenLayers, however, considers the default precision of 14 to mean 14 characters instead of 14 numeric digits. So if a number has a decimal and a negative sign, we’re down to 12 numeric digits. This little difference reintroduces the possibility of false positives, so it isn’t really a change for the better.

The solution we finally decided to use was to change the OpenLayers default precision value to 18. Why 18? That’s the maximum number of characters that SQLServer will store for a float, so OpenLayers will always be able to deal with any stored coordinates without having to truncate. Now, if we compare our stored coordinates with OpenLayers coordinates, we only get a change notice when an asset has actually been moved. Which is exactly what we wanted.

Here are some technical details for those interested:

The full variable name is OpenLayers.Util.DEFAULT_PRECISION and can be found in the Util.js file. There are a few good comments preceding the variable in the code, but more background can be found in OpenLayers ticket #1951. SQLServer information can be found on MSDN. Note that if you wind up changing the OpenLayers precision value, you should do it as soon as possible after loading the library, so that no code runs with a different precision value.
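As a rough sketch of that last point, the override could live in a small function that runs immediately after the library script loads. The function name `applyCoordinatePrecision` is hypothetical; `OpenLayers.Util.DEFAULT_PRECISION` is the variable named above, and 18 is the value we settled on.

```javascript
// Hypothetical helper: set the library-wide default precision in one place.
// `ol` is expected to be the global OpenLayers namespace object.
function applyCoordinatePrecision(ol, precision) {
  if (!ol || !ol.Util) {
    throw new Error("OpenLayers must be loaded before setting the precision");
  }
  ol.Util.DEFAULT_PRECISION = precision;
  return ol.Util.DEFAULT_PRECISION;
}

// Run this right after the OpenLayers script tag, before any map code executes.
if (typeof OpenLayers !== "undefined") {
  applyCoordinatePrecision(OpenLayers, 18);
}
```

Keeping the assignment in a single, early place makes it harder for one part of the application to read coordinates under the old default and another part under the new one.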

Getting an ArcGIS Server Map Cache in S3

When deciding how to best handle the air photos in the new Philadelphia Water Department Stormwater Map Viewer, we kicked around a few ideas. We decided to put the cache in Amazon’s Simple Storage Service to offload some of the local disk requirements and leverage their fast data storage and delivery infrastructure. In moving the process, we learned a few things:

Tune Your Cache

Make sure you spend time planning the cache. Not only will the cache look better in the final application, but it will also load to S3 faster and cost less in the long run.

  • Set the extents in the MXD or MSD before publishing to a map service. Transferring the 254-byte empty tiles put a lot of unnecessary burden on the upload process, and you also pay for them to be stored in the cloud. If it doesn’t need to be there, don’t build it.
  • Choose the correct image format for the cache. If you are caching a base map and do not need to support transparency, make it a JPEG. If it needs to support background transparency, use PNG. ESRI’s suggestions for planning a map cache can be found here.
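If a cache has already been built with empty tiles in it, one way to gauge the waste before uploading is to scan the cache folder by file size. The following Node.js sketch is only an illustration: the function name `classifyTiles` is made up, and the 254-byte figure is the empty-tile size we saw in our own cache, so treat it as an assumption for any other cache. Size alone is a heuristic, since a real tile could in principle have the same size.

```javascript
const fs = require("fs");
const path = require("path");

// Size of an empty tile as observed in our cache; treat it as an assumption.
const EMPTY_TILE_BYTES = 254;

// Recursively collect tile files, split into those worth uploading and
// those that match the empty-tile size.
function classifyTiles(dir, emptySize = EMPTY_TILE_BYTES) {
  const result = { keep: [], empty: [] };
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      const nested = classifyTiles(fullPath, emptySize);
      result.keep.push(...nested.keep);
      result.empty.push(...nested.empty);
    } else if (entry.isFile()) {
      const size = fs.statSync(fullPath).size;
      (size === emptySize ? result.empty : result.keep).push(fullPath);
    }
  }
  return result;
}
```

Calling `classifyTiles` on a scale-level folder returns two lists of paths, which gives a quick count of how many tiles would be uploaded and stored for no benefit.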

Get a Good Tool to Transfer the Files

I started out with the free version of CloudBerry Lab’s S3 Explorer, but I had to move over 90 GB of data to my S3 bucket. The Pro version of CloudBerry S3 Explorer supports multithreading, which allowed up to 5 threads to enumerate the folders, copy the files, or apply the ACL. It is a low-cost application that more than pays for itself when moving a lot of files up to a bucket.

When transferring the files up, I was working in blocks of directories, not the whole scale level. It was quicker for me to work in 20 to 30 subdirectories than grabbing a whole scale level. It did require a little bit more management on my end, but more steady progress was made.
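The batching itself is simple to script if you would rather not pick folders by hand. This is a minimal sketch, assuming you already have a list of subdirectory names for a scale level; the function name `toBatches` and the folder names in the example are hypothetical.

```javascript
// Split a list of subdirectory names into batches of at most `batchSize`.
function toBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Example: 65 hypothetical row folders, 25 per batch.
const rows = Array.from({ length: 65 }, (_, i) => "R" + i.toString(16).padStart(8, "0"));
console.log(toBatches(rows, 25).map((b) => b.length)); // [ 25, 25, 15 ]
```

Each batch can then be handed to whatever transfer tool you are using, one after another.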

Accessing the Tiles

ArcGIS Server does not support cloud-hosted caches at the 9.3.1 release, so you’ll have to roll your own. The ESRI JavaScript API and Flex API can be extended to use caches hosted in the cloud (Flex example from Mansour Raad). For the Philly Storm Water project, we were using OpenLayers, and someone had already rolled one for us: there is a patch that lets the client-side library access the cache directly, without communicating through ArcGIS Server. The one thing to note is that the Tile Origin is pretty touchy; we had to make some adjustments to the origin values to make sure everything lined up correctly.
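To show why the origin matters so much, here is a hedged sketch of the arithmetic a client has to do to find a tile. It is not the OpenLayers patch itself. It assumes the cache uses the exploded folder layout (a level folder with two decimal digits, then row and column names as eight hexadecimal digits); the function names and the bucket URL in the usage note are made up for illustration.

```javascript
// Zero-pad a number in the given radix to a fixed width.
function pad(value, width, radix) {
  return value.toString(radix).padStart(width, "0");
}

// Build the URL of one tile, assuming the exploded cache layout
// L<level, 2 decimal digits>/R<row, 8 hex digits>/C<column, 8 hex digits>.
function tileUrl(baseUrl, level, row, col, extension) {
  return [
    baseUrl.replace(/\/+$/, ""),
    "L" + pad(level, 2, 10),
    "R" + pad(row, 8, 16),
    "C" + pad(col, 8, 16) + "." + extension,
  ].join("/");
}

// Work out which tile contains a map coordinate. Rows count downward from the
// tile origin, which is why a slightly wrong origin shifts every tile.
function tileIndex(x, y, origin, resolution, tileSize) {
  const span = resolution * tileSize;
  return {
    col: Math.floor((x - origin.x) / span),
    row: Math.floor((origin.y - y) / span),
  };
}
```

For example, `tileUrl("https://example-bucket.s3.amazonaws.com/cache/_alllayers", 3, 10, 255, "jpg")` would request `L03/R0000000a/C000000ff.jpg` under that hypothetical bucket. Because every row and column number is derived from the origin, an origin that is off by even part of a tile makes the client ask for the wrong files.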

Summary

Now that the site is up and starting to get some traffic, it’s clear that putting the tiles in S3 was the right decision. There is no reason for ArcGIS Server to waste any cycles moving tiles around; let it do the heavy lifting with the vector layers and queries. Hopefully the rumors are true, and the ArcGIS Server 10 release will be more aligned with cloud computing. Until then, there are still plenty of ways to take advantage of the benefits.