When deciding how to best handle the air photos in the new Philadelphia Water Department Stormwater Map Viewer, we kicked around a few ideas. We decided to put the cache in Amazon’s Simple Storage Service to offload some of the local disk requirements and leverage their fast data storage and delivery infrastructure. In moving the process, we learned a few things:
Tune Your Cache
Make sure you spend time planning the cache. Not only will the cache look better in the final application, but it will also load to S3 faster and cost less in the long run.
- Set the extents in the MXD or MSD before publishing to a map service. Transferring the 254-byte empty tiles put a lot of unnecessary burden on the upload process, and you also pay to store them in the cloud. If it doesn’t need to be there, don’t build it.
- Choose the correct image format for the cache. If you are caching a base map and do not need to support transparency, make it a JPEG. If it needs to support background transparency, use PNG. ESRI’s suggestions for planning a map cache can be found here.
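If empty tiles have already been built, one way to keep them out of the upload is to filter them by size first. The sketch below is only an illustration: the function name and the `cache_root` argument are made up for this example, and it assumes every empty tile is exactly 254 bytes, as ours were. A real tile that happens to be the same size would be skipped too, so check a sample before trusting it.

```python
import os

# Size of an empty tile in our cache; an assumption for any other cache.
EMPTY_TILE_BYTES = 254


def tiles_worth_uploading(cache_root, empty_size=EMPTY_TILE_BYTES):
    """Yield paths of files under cache_root whose size differs from empty_size."""
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) != empty_size:
                yield path
```

The generator only reports paths; handing them to an upload tool is left to whatever client you use.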
Get a Good Tool to Transfer the Files
I started out using the free version of CloudBerry S3 Explorer, but I had to move over 90 GB of data to my S3 bucket. CloudBerry S3 Explorer Pro supports multithreading, which allowed up to 5 threads to enumerate the folders, copy the files, or apply the ACL. It is a low-cost application that more than pays for itself when moving a lot of files up to a bucket.
When transferring the files up, I worked in blocks of directories rather than whole scale levels. It was quicker for me to move 20 to 30 subdirectories at a time than to grab a whole scale level. It required a bit more management on my end, but progress was steadier.
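The batching itself is easy to script if you would rather not pick directories by hand. This is a rough sketch, not what we actually ran: the `batches` helper and the batch size of 25 are illustrative, and the `R%08x` directory names assume the row folders of an exploded ArcGIS cache.

```python
def batches(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical row directories for one scale level.
row_dirs = ["R%08x" % r for r in range(70)]

for n, batch in enumerate(batches(row_dirs, 25), start=1):
    print("batch %d: %s .. %s (%d dirs)" % (n, batch[0], batch[-1], len(batch)))
```

Each batch can then be queued as its own transfer, so a failure only costs you one block of directories.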
Accessing the Tiles
ArcGIS Server does not support cloud-hosted caches at the 9.3.1 release. The ESRI Javascript API and Flex API can be extended to use caches hosted in the cloud (Flex example from Mansour Raad), but you’ll have to roll your own. For the Philly Storm Water project, we were using OpenLayers, and someone had already rolled one for us: there is a patch that lets the client-side library access the cache directly, without communicating through ArcGIS Server. The one thing to note is that the tile origin is pretty touchy; we had to make some adjustments to the origin values to make sure everything lined up correctly.
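To see why the origin matters so much, it helps to look at how a map coordinate turns into a tile path. The sketch below is not the OpenLayers patch; it is a minimal illustration that assumes the exploded ArcGIS cache layout (a decimal level folder, then hexadecimal row and column names), an upper-left tile origin, and 256-pixel tiles. The function name and all parameter values are made up for the example.

```python
import math


def tile_path(x, y, level, resolution, origin_x, origin_y, tile_size=256, ext="jpg"):
    """Return the relative cache path of the tile containing map point (x, y).

    Assumes an upper-left origin: columns grow eastward from origin_x and
    rows grow southward from origin_y.
    """
    span = resolution * tile_size  # map units covered by one tile edge
    col = int(math.floor((x - origin_x) / span))
    row = int(math.floor((origin_y - y) / span))
    return "L%02d/R%08x/C%08x.%s" % (level, row, col, ext)


# The same point lands in a different column when the origin shifts by 50 units.
print(tile_path(300, 900, 3, 1.0, origin_x=0, origin_y=1000))
print(tile_path(300, 900, 3, 1.0, origin_x=50, origin_y=1000))
```

A small error in the origin shifts every computed row or column, which is why the tiles only lined up after we adjusted those values.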
Summary
Now that the site is up and we are starting to get some traffic, it is clear that putting the tiles in S3 was the right decision. There is no reason for ArcGIS Server to waste any cycles moving tiles around; let it do the heavy lifting with the vector layers and queries. Hopefully the rumors are true, and the ArcGIS Server 10 release will be more aligned with cloud computing. Until then, there are still plenty of ways to take advantage of the benefits.
Most of the web applications we build are either used internally by our clients or have a steady stream of public user activity. With our recent Redistricting the Nation launch we wanted to experiment with some optimizations to make our site more resilient to traffic spikes as well as to improve the user experience.
Our strategy is broken down into a few components:
This post covers the Cloudfront CDN.
Previously, we had experimented with Amazon’s Web Services stack to host applications, but we hadn’t experimented with their Cloudfront CDN product. Pricing for the CDN is quite similar to Amazon S3 and allows organizations to build scalable applications without the upfront cost of most CDNs. We decided to use the CDN to host some large Javascript assets as well as our image components.
Cloudfront is quite easy to set up. We simply created an Amazon S3 bucket called s3.azavea.com and pointed a CNAME record for s3.azavea.com to the full bucket domain, s3.azavea.com.s3.amazonaws.com. Then, we enabled a Cloudfront distribution for the s3.azavea.com bucket using the free tool Cloudberry. Finally, we set up a CNAME record for cdn.azavea.com pointing to the Cloudfront distribution domain d17ib0dlm1q8qa.cloudfront.net, and we were rolling.
Since the CDN is heavily cached, it was easiest to use s3.azavea.com links during development to reduce the amount of file versioning that was necessary. Once we were settled on our assets, we switched to cdn.azavea.com links and started using the CDN.
The speed of the CDN is quite astounding. Splitting assets across another domain name also lets the browser request more files at once, which improves the user experience. We were quite pleased with how easily we could offload assets to Cloudfront and realize gains with limited time investment.
A few notes to keep in mind when you are working with a CDN for the first time:
- Since there is no way to flush assets out of Cloudfront’s edge nodes, be sure to use file name versioning. This was a bit alien to us, but is easy to incorporate once you think it through. For instance, we decided not to set a far-future expiration header on our PDF assets as they are often directly linked to and we wanted to be able to update them regularly.
- Speaking of PDFs, it seems that while Cloudfront supports byte-range requests for assets, it doesn’t assert the “Accept-Ranges: bytes” HTTP header. This makes our large PDFs fully download before Adobe displays them within the browser. Unfortunately there is no way to add this header at the moment.
- Cloudberry is great for adding HTTP headers to S3 assets. We decided that most of our assets would have a six month cache lifespan by asserting the “Cache-Control: max-age=15552000” header.
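The file name versioning and cache header notes above can be sketched in a few lines. This is one possible approach rather than what we deployed: the helper names are hypothetical, and deriving the version from a content hash is an assumption (a hand-maintained version number in the file name works just as well).

```python
import hashlib
import os

# 180 days in seconds: 15552000, the max-age value quoted above.
SIX_MONTHS_SECONDS = 180 * 24 * 60 * 60


def versioned_name(filename, content):
    """Insert a short content hash before the extension, e.g. map.<hash>.js."""
    stem, ext = os.path.splitext(filename)
    digest = hashlib.sha256(content).hexdigest()[:8]
    return "%s.%s%s" % (stem, digest, ext)


def cache_headers(max_age=SIX_MONTHS_SECONDS):
    """Build the Cache-Control header to attach to an S3 object."""
    return {"Cache-Control": "max-age=%d" % max_age}


print(versioned_name("map.js", b"var map = null;"))
print(cache_headers())
```

Because the name changes whenever the content does, an updated asset gets a fresh URL and the stale copy sitting on the edge nodes simply stops being requested.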
When considering a nested comment implementation, we really only have to deal with two types of comment: root comments and child or nest comments. Root comments are easy to describe. They are not a child of any other comment. They’re what you get when someone has a brilliant new insight, hits the “New Comment” button and dazzles us all. Nest comments, on the other hand, seem like they should be more complicated. If you’re allowing replies to comments, then a nest comment could have its own nests. Wouldn’t that make a comment both a nest and a root? Not in this post. For the purposes of this example: roots don’t have parents; everything else is a nest. Simple.
I have been following the Ubiquity project from Mozilla Labs, and I gotta say, it’s pretty rad. If you are a Javascript Ninja or an aspiring one, Ubiquity can make your web surfing super slick.
I slapped together a browser command that I use to search our internal wiki in record time. Sample code included!
I recently came across another piece of OpenLayers to be aware of when working with maps that wrap the International Date Line. I store the map extent throughout a user’s session so they can leave the map page, come back, and still see the same set of results as when they left. Unfortunately, the map was sometimes taking a stored location in the North Pacific and displaying Northern Africa instead! Obviously not what I want it to do…
PhillyHistory.org’s homepage will be getting a face lift soon, and I’ve turned to Ext.js and their built-in effects library to do some image sliding. The first thing I learned from the documentation was that the Fx methods apply to anything that could also be an Ext.Element. Neat!