Articles by Carissa Brittain

Geocoding The Centennial Expo… Without A Geocoder!

Set your wayback machine to 1876, the place: the Philadelphia Centennial Exhibition. Wandering around the festival grounds, you would have seen several cameramen snapping away. Some of those images have survived admirably as the Free Library of Philadelphia’s Centennial Exhibition Collection. Over at PhillyHistory.org, we recently added a selection of those images, and the geocoding presented quite a challenge.

The first hurdle to be overcome was that the vast majority of the Centennial Expo’s buildings (and some of the streets) don’t exist anymore! The Expo took place in a swath of Philly’s Fairmount Park, most of which has turned back into trails and fields over the years. Some of the buildings were torn down just after the Expo closed; some of them still exist, but in other places. Only a very few buildings and landmarks still occupy the same space they did back in 1876. But do not despair, for there was a map! Included in the Free Library’s records was a map of the entire fair, complete with all of the buildings and roads built for the Expo, along with neat things like fountains and plazas. Dana Bauer, one of Azavea’s employees and a geo-referencing expert, took charge of the map and managed to line up enough of the old landmarks with existing geography to give us very accurate coordinates for all of the Centennial Expo’s old buildings! Hooray!

So now we have coordinates, but we need to be able to tie those coordinates to the image records in our system. Here the data was slightly uncooperative. Many of the image records had a building mentioned in some fashion and in one field or another. The challenge here was that the text in the record didn’t always match the official name of the building from the map. For example: the Main Exhibit Hall was mentioned in records as Main Exhibition Building, Main Hall, Main Bldg and several other permutations. We spent some time figuring out around 20 aliases for some of the more popular buildings in the collection.

The last step was to build a parsing-and-geocoding module to find the building references in the record data, match them up with the coordinates we found and add that information to our database. Easy, right? Not so fast. All 150-some coordinate pairs and their building text keys needed to be accessible somehow to our data importer. We thought for a while about how best to do that. 150 entries isn’t a huge list, but it’s not short either. We didn’t want to have to mess with it at all once we set something up. This list also wasn’t likely to change over time, so it didn’t need to be very accessible or easily editable by people. However, we were also going to need it again to return to the Free Library in some format or another at the end of the project. The coordinates were in an Excel spreadsheet by this point, but we didn’t want to deal with the overhead of accessing it directly. We wound up creating a data dictionary inside our importer, kind of a mini geocoder table just for this swath of Fairmount Park circa 1876. We left the parent Excel spreadsheet alone as both a backup archive and the basis for the document we’ll be returning to the Free Library.

While looking through the data for building aliases, we also noted the fields where these names were found. So all we had to do was write a small function to loop through these fields, do some text matching on our data dictionary and add the coordinates to the record in our system. Out of over 1500 images, we wound up geocoding over 90% using this method, and most of the rest simply didn’t have a building reference to find. The entire import, images and all, took under 10 minutes. Not bad at all!
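
As a rough illustration of the approach described above, here is a minimal JavaScript sketch of an alias dictionary plus a field-scanning matcher. It is not the importer’s actual code: the field names, the function name geocodeRecord, the second building entry and every coordinate value are placeholders assumed for the example.

```javascript
// Canonical building name -> coordinates.
// Placeholder values only; the real numbers came from the georeferenced map.
const COORDINATES = {
  'Main Exhibit Hall': { x: 100.0, y: 200.0 },
  'Horticultural Hall': { x: 300.0, y: 400.0 },
};

// Lower-cased alias -> canonical building name.
const ALIASES = {
  'main exhibit hall': 'Main Exhibit Hall',
  'main exhibition building': 'Main Exhibit Hall',
  'main hall': 'Main Exhibit Hall',
  'main bldg': 'Main Exhibit Hall',
  'horticultural hall': 'Horticultural Hall',
};

// Hypothetical record fields in which building names were seen.
const SEARCH_FIELDS = ['title', 'description', 'notes'];

// Return the first building match found in the record, or null if none.
function geocodeRecord(record) {
  for (const field of SEARCH_FIELDS) {
    const text = (record[field] || '').toLowerCase();
    for (const alias of Object.keys(ALIASES)) {
      if (text.includes(alias)) {
        const building = ALIASES[alias];
        return { building, ...COORDINATES[building] };
      }
    }
  }
  return null;
}

const match = geocodeRecord({ title: 'View of Main Bldg from the east' });
const noMatch = geocodeRecord({ title: 'Crowd on opening day' });
```

With these placeholder tables, the first record resolves to the Main Exhibit Hall entry and the second returns null, which corresponds to the records that had no building reference to find.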

Truncating Floats in OpenLayers and SQLServer

A perfectly valid question when dealing with map coordinates is “How accurate do we need to be?” For some applications, a tenth of a degree is more than accurate enough while for others, several more decimal places are needed. Sometimes this question is answered for you: if your data source only stores four decimal places, then that’s all the precision you’re going to have. If you’re in the lucky (unlucky?) position of generating your own coordinates, one common answer to the “how accurate” debate is “store it all”. This is the path Sajara chose, mostly because we didn’t have a good reason to choose a less precise solution over a more precise one. It just so happens that the precision limit of SQLServer’s float data type is not tied so much to the number of decimal places as to the number of numeric digits to be stored. It allows up to 16 numeric digits, plus a period character and a negative character as needed. Sajara works with coordinate systems in both meters and degrees, so depending on which system we’re using for a given implementation, we could be storing a value far more precise than is even visible to the naked eye.

Fast forward a few years and bring OpenLayers into the mix. We rewrote the asset editing portion of the software to allow data managers to move asset coordinates using an OpenLayers map. These coordinates were saved with still considerably more precision than we needed, but remember, we’re storing whatever precision we get. So far so good.

Now back to the present and we’re working on a comparison tool for our data managers. Suddenly values in the database are not matching the values coming out of our OpenLayers map. Almost, but not quite. In fact, only the last few digits are different. After a bit of digging, we discovered that OpenLayers was returning numbers with between 1 and 3 fewer decimal places than our stored coordinates. Remember that we’re talking about distance differences smaller than a crack in the sidewalk here, but programming languages don’t know anything about “close enough”. Either two numbers are the same or they aren’t, and -39.6827663878 is not the same as -39.682766387 no matter how small the physical difference is. So we started digging for the reason.

OpenLayers has a value tucked away in its utility files that sets the default precision of a floating point number to 14 characters. This limit was added when a user noticed that the edges of certain coordinate systems were not behaving correctly due to some floating-point math precision errors. While the OpenLayers community recognizes that most systems allow floats to have 16 digits, “14 significant digits are sufficient to represent sub-millimeter accuracy in any coordinate system that anyone is likely to use with OpenLayers”. So OpenLayers’ answer to the accuracy question is to save everything that will fit in a standard float, with a few decimal places pared off just in case.

So the next question is: “So what?” The difference between 14 and 16 digits in a meter-based coordinate system is microscopic, and in a degrees-based one it’s not much bigger. So far as storing a saved coordinate in Sajara, we didn’t really care if we had 16 digits or 14 digits; the result wouldn’t look any different to our audience. However, since our initial coordinates had 16 digits and OpenLayers only preserved 14 of them, any programmatic comparison failed! No one likes to deal with false positives, but a 100% false positive rate was unacceptable.
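
To make the false positive concrete, here is a small sketch in plain JavaScript. It is an illustration under stated assumptions, not OpenLayers’ own code: limitPrecision is a hypothetical helper that rounds to a number of significant digits with the standard Number.prototype.toPrecision method, and the coordinate is a made-up value.

```javascript
// Round a number to a given count of significant digits.
// Stand-in for "a library that limits precision"; not the library's own code.
function limitPrecision(value, significantDigits) {
  return parseFloat(value.toPrecision(significantDigits));
}

const stored = -75.16352201234567;          // hypothetical stored coordinate, 16 digits
const fromMap = limitPrecision(stored, 14); // the same point after a 14-digit limit

// Strict comparison reports a change even though the asset never moved.
const positionChanged = stored !== fromMap;

// The actual discrepancy is far below anything visible on a map.
const difference = Math.abs(stored - fromMap);

// With a limit wider than the stored value, the number survives untouched.
const survivesWiderLimit = limitPrecision(stored, 18) === stored;
```

Here positionChanged is true while difference is on the order of 1e-13, which is the “almost, but not quite” mismatch described above.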

We had a few choices here. First we could reset the default precision value in OpenLayers to zero, which would tell the library to never truncate anything. That’s a fairly simple change but we weren’t sure it wouldn’t have unforeseen data effects. Also, there’s a somewhat vague warning about problems with the Web Mercator projection when this value is zero, which is one of the projections Sajara can use. So that option was out.

Second, we could have told SQLServer to alter the precision of coordinate values to 14, which is a fairly major change. This option was ruled out because of a difference in the definition of “precision” between SQLServer and OpenLayers. I mentioned earlier that SQLServer will store a maximum of 16 numeric digits plus a decimal point and a negative sign, so a total of 18 characters. OpenLayers, however, considers the default precision of 14 to mean 14 characters instead of 14 numeric digits. So if a number has a decimal point and a negative sign, we’re down to 12 numeric digits. This little difference reintroduces the possibility of false positives, so it isn’t really a change for the better.
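
The counting in that paragraph can be checked with a few lines of JavaScript. This sketch only illustrates the arithmetic of digits versus characters on a made-up coordinate written as text; it makes no claim about how either SQLServer or OpenLayers handles numbers internally, and countDigits is a hypothetical helper.

```javascript
// Count only the numeric digits in a coordinate written out as text.
function countDigits(text) {
  return (text.match(/\d/g) || []).length;
}

const stored = '-75.16352201234567';   // hypothetical coordinate as text

const storedCharacters = stored.length;       // digits plus '-' and '.'
const storedDigits = countDigits(stored);     // digits only

// Keeping 14 characters, sign and decimal point included.
const fourteenCharacters = stored.slice(0, 14);
const remainingDigits = countDigits(fourteenCharacters);
```

For this value, storedCharacters is 18 and storedDigits is 16, while a 14-character limit leaves remainingDigits at 12.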

The solution we finally decided to use was to change the OpenLayers default precision value to 18. Why 18? That’s the maximum number of characters that SQLServer will store for a float, so OpenLayers will always be able to deal with any stored coordinates without having to truncate. Now, if we compare our stored coordinates with OpenLayers coordinates, we only get a change notice when an asset has actually been moved. Which is exactly what we wanted.

Here are some technical details for those interested:

The full variable name is OpenLayers.Util.DEFAULT_PRECISION and can be found in the Util.js file. There are a few good comments preceding the variable in the code, but more background can be found in OpenLayers ticket #1951. SQLServer information can be found on MSDN. Note that if you wind up changing the OpenLayers precision value, you should do it as soon as possible after loading the library, so there is no chance of some code running with the old precision value and other code running with the new one.
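
A minimal sketch of that override follows. The variable name comes from the library as described above; the fallback object is only an assumption so the snippet can run where OpenLayers is not loaded, and on a real page the assignment would sit in a script placed immediately after the OpenLayers script tag.

```javascript
// On a real page the OpenLayers 2.x script defines the global OpenLayers object.
// The fallback below is a stand-in (an assumption) so this snippet runs elsewhere.
const OL = globalThis.OpenLayers || { Util: { DEFAULT_PRECISION: 14 } };

// Raise the limit before any map, layer or geometry code has a chance to run.
OL.Util.DEFAULT_PRECISION = 18;
```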

Nesting Comments Using Ext.XTemplate

When considering a nested comment implementation, we really only have to deal with two types of comment: root comments and child or nest comments. Root comments are easy to describe. They are not a child of any other comment. They’re what you get when someone has a brilliant new insight, hits the “New Comment” button and dazzles us all. Nest comments, on the other hand, seem like they should be more complicated. If you’re allowing replies to comments, then a nest comment could have its own nests. Wouldn’t that make a comment both a nest and a root? Not in this post. For the purposes of this example: roots don’t have parents; everything else is a nest. Simple.
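
The two-type rule can be written down in a few lines of plain JavaScript. This is only a sketch of the definition given above, not the template code from the full post; the comment shape (id, parentId, text) is an assumption made for the example.

```javascript
// Hypothetical comment records: parentId is null when there is no parent.
const comments = [
  { id: 1, parentId: null, text: 'A brilliant new insight' },
  { id: 2, parentId: 1, text: 'A reply' },
  { id: 3, parentId: 2, text: 'A reply to the reply' },
];

// Roots don't have parents; everything else is a nest.
const isRoot = (comment) => comment.parentId === null;

const roots = comments.filter(isRoot);
const nests = comments.filter((comment) => !isRoot(comment));
```

Comment 2 has a nest of its own, yet it still counts as a nest here because it has a parent.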

Using Multiple Resolution Images to Decrease Page Load Time

Ever since browsers began supporting images, web page authors have fought a constant battle to balance image use against file size. Weapons used in this battle have ranged from compression to spriting to delayed loading. Browsers and internet connections keep getting faster, which removes some of the pressure in the tiniest-but-nicest balancing act. However, perceived page load time is still arguably one of the more important ‘features’ of a web page.

OpenLayers Map Centering and the International Date Line

I recently came across another piece of OpenLayers to be aware of when working with maps that wrap the International Date Line. I store the map extent throughout a user’s session so they can leave the map page, come back, and still see the same set of results as when they left. Unfortunately, the map was sometimes taking a stored location in the North Pacific and displaying Northern Africa instead! Obviously not what I want it to do…

Using Ext.js Fx.slideIn For Image Rotating

PhillyHistory.org’s homepage will be getting a face lift soon, and I’ve turned to Ext.js and their built-in effects library to do some image sliding. The first thing I learned from the documentation was that the Fx methods apply to anything that is, or could be, an Ext.Element. Neat!

IE8 Menu Control – Skip Navigation Link Bug

Skip navigation links are a good thing, even if your average user can’t see them. They can become a bad thing if your layout assumes that they’re not visible to sighted users of IE8. For some of my menus (and I don’t know why one does and the others don’t), there’s this annoying space at the top. Using IE8’s marvelously upgraded debugger, I can see that it’s a skip link… with height and width of 0 pixels… that’s still showing up? How??

After much fruitless searching, I found a lonely little forum entry from last year that offers the following one-line fix:

SkipLinkText=""

Put that line in the menu tag declaration or the skin file and you’re golden… unless you can’t see. Then there’s unfortunately nothing for your screen reader to read. *sigh* Hopefully this little ‘feature’ gets an official patch soon.