Visualizing data with CartoDB

A few weeks ago, I came across a data set published by OPEC that caught my attention. It documented global proven oil reserves, by country, from 1960 to 2011, and I was interested in exploring this data visually. Of course, there are a number of tools available to visualize data, but I had just signed up for a CartoDB account, so I decided to experiment with using CartoDB to serve data to an interactive web page that used Leaflet for mapping and Highcharts for charting. CartoDB is a service that stores geographic data in a spatially enabled database in the “cloud” and provides a variety of tools to access, analyze, and display the data. It is built using the open source PostGIS database and runs on a large-scale Amazon Web Services infrastructure.

My idea was to build a column chart of yearly global totals over time with a corresponding map that proportionally showed each country’s reserves. When hovering the mouse over any of the year columns, the country totals on the map would update to show their reserves for that year. I sketched out my plan that night on the back of a placemat, downloaded a few JavaScript libraries, and got to work.

I started by massaging the Excel table of OPEC data so that the field headers would play nice in PostGIS and as object properties in jQuery/JavaScript. I removed spaces and special characters from the field names, and edited the year field names so a letter preceded the number (like y1977), since an unquoted SQL identifier cannot start with a digit and neither can a JavaScript property accessed with dot notation. This made sure my SQL queries and JSON parsing would go smoothly. I created the country centroids using Esri’s ArcMap desktop software by joining the table on country name to the free “Countries” data Esri provides, and then “calculating geometry” into latitude and longitude fields. I realized after uploading the data that I could probably have skipped this step and used the CartoDB geocoding interface instead. I uploaded my .XLSX file using CartoDB’s data management UI, and it recognized my latitude and longitude fields and stored them as a geometry field. I was able to immediately see a map of country centroids. Wow. That was easy.
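For anyone who would rather script that header cleanup than do it by hand in Excel, here is a minimal sketch. The function name `sanitizeHeader` is hypothetical, and replacing spaces and special characters with underscores (rather than simply deleting them) is an assumption made for readability, not necessarily what was done to the OPEC table.

```javascript
// Hypothetical helper: turn a spreadsheet header into a name that is safe
// both as a PostgreSQL column and as a JavaScript object property.
function sanitizeHeader(header) {
  const cleaned = String(header)
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_") // runs of spaces/special characters become one underscore (assumed choice)
    .replace(/^_+|_+$/g, "");    // drop any leading or trailing underscores
  // Names that start with a digit (the year columns) get a letter prefix
  return /^[0-9]/.test(cleaned) ? "y" + cleaned : cleaned;
}

console.log(["Country Name", "1977", "Lat. (deg)"].map(sanitizeHeader));
// [ 'country_name', 'y1977', 'lat_deg' ]
```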

I noticed some of the centroids were in unusual places due to country shapes. For example, Vietnam is somewhat crescent-shaped, so its geographic center actually falls outside its border. I adjusted the centroids right there in the web interface. Wow. That was really easy. Then I started playing with the symbology and switched to a “bubble map” (or graduated symbol map) that sized each circle in proportion to that country’s oil reserves for a given year. Using only the visualization interface (I’ll get to the Carto language, a CSS-like language for styling maps, in a bit), I was able to see how the circle radii differed when looking at different years of data. Wow. I couldn’t believe how easy that was.
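The idea behind a bubble map is easy to express in code. The sketch below is not how CartoDB computes its symbols; it assumes square-root scaling, a common cartographic choice that keeps each circle’s area (rather than its radius) proportional to the value. The name `bubbleRadius` and the pixel sizes are made up for illustration.

```javascript
// Hypothetical helper: circle radius in pixels for one country's reserves.
// Square-root scaling (an assumption) makes circle AREA proportional to value.
function bubbleRadius(value, maxValue, maxRadius) {
  // Missing, zero, or negative values get no circle at all
  if (!(value > 0) || !(maxValue > 0)) {
    return 0;
  }
  return Math.sqrt(value / maxValue) * maxRadius;
}

console.log(bubbleRadius(100, 100, 40)); // 40 (largest value gets the largest circle)
console.log(bubbleRadius(25, 100, 40));  // 20 (a quarter of the value, half the radius)
```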

But CartoDB is for more than just visualizing data in their interface. It has a set of APIs that support geographic and conventional database queries, creation of map tiles, and integration into Google Maps and Leaflet. Time to test out CartoDB’s SQL API. The API is well documented and ridiculously straightforward. Request a URL that includes an account name, a SQL query, and a format parameter, and a well-formed JSON or GeoJSON object comes back. Since I was using Leaflet, I chose GeoJSON, for which Leaflet has built-in support. After parsing the data a bit and tweaking the way Leaflet rendered the points as circleMarkers, each country centroid was placed on the map with a radius reflecting its 2011 oil reserves.
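To make the shape of such a request concrete, here is a minimal sketch of building the URL. The endpoint pattern reflects my understanding of CartoDB’s v2 SQL API and should be checked against the current documentation; the account name `example-account`, the table name `oil_reserves`, and the function name `buildSqlApiUrl` are placeholders, not the ones used in this project.

```javascript
// Assumed endpoint shape for the CartoDB SQL API (v2); verify against the docs.
// The account and table names used below are placeholders.
function buildSqlApiUrl(account, sql, format) {
  const base = "https://" + account + ".cartodb.com/api/v2/sql";
  // The query must be URL-encoded because it contains spaces and commas
  const params = "q=" + encodeURIComponent(sql) + "&format=" + encodeURIComponent(format);
  return base + "?" + params;
}

// the_geom is the geometry column CartoDB creates for a table
const exampleUrl = buildSqlApiUrl(
  "example-account",
  "SELECT country, y2011, the_geom FROM oil_reserves",
  "GeoJSON"
);
console.log(exampleUrl);
```

In a page like this one, the GeoJSON that comes back could be fetched with jQuery and handed to Leaflet’s GeoJSON layer, using its `pointToLayer` option to return an `L.circleMarker` for each feature.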

Having reached this success with the SQL API so quickly, I decided I wanted to experiment with the Maps API, which returns map images in a tiled format. I found a USGS database of existing oil fields in Shapefile format. Again, it uploaded with no trouble. CartoDB allows users to set the cartographic styles of data sets directly in the UI, but also gives access to their Carto language. Carto gives CSS-like control over how a thematic map displays. These styles are preserved when the data is called using the Maps API: colors, transparency… amazing. And again, it was seamless. My map now displayed the oil field tiles beneath the circleMarkers I’d created from the country centroids.

The only stumbling block during this whole process was getting the interactivity between Leaflet and Highcharts to work the way I had imagined. That was mostly due to my clumsy JavaScript skills, and one of Azavea’s developers, Matt, was able to quickly point me in the right direction. I also took a few pointers from Mike, one of our UI/UX designers.
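The published page’s code is not reproduced here, but the interaction itself (hover over a year, resize every circle) can be sketched roughly as follows. Everything named in it is hypothetical: `showYearOnMap`, the `entries` structure pairing each GeoJSON feature with its marker, and the stand-in markers that let the sketch run without a browser. The square-root scaling is also an assumption.

```javascript
// Hypothetical sketch: resize every country's marker for the hovered year.
// Each entry pairs a GeoJSON feature with its marker. Any object with a
// setRadius(pixels) method works, which Leaflet circleMarkers provide.
function showYearOnMap(entries, year, maxValue, maxRadius) {
  const field = "y" + year; // matches the y1977-style column names
  entries.forEach(function (entry) {
    const value = Number(entry.feature.properties[field]) || 0;
    // Square-root scaling (assumed) keeps circle area proportional to the value
    const radius = value > 0 ? Math.sqrt(value / maxValue) * maxRadius : 0;
    entry.marker.setRadius(radius);
  });
}

// Stand-in for a Leaflet circleMarker so the sketch runs outside a browser
function fakeMarker() {
  return { radius: 0, setRadius: function (r) { this.radius = r; } };
}

// Made-up numbers, not OPEC data
const entries = [
  { feature: { properties: { y2010: 100, y2011: 400 } }, marker: fakeMarker() },
  { feature: { properties: { y2010: 25, y2011: 0 } }, marker: fakeMarker() }
];

showYearOnMap(entries, 2011, 400, 30);
console.log(entries.map(function (e) { return e.marker.radius; })); // [ 30, 0 ]
```

On the chart side, Highcharts lets you attach a `mouseOver` handler to each point through `plotOptions.series.point.events`. Assuming the years are the x-axis categories, that handler could call a function like this one with `this.category`.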

Years ago, after building blog sites with custom PHP, I was floored when I discovered WordPress. That software revolutionized the process of publishing content on the web, significantly lowering barriers to entry. CartoDB is like a user-friendly CMS for geographic data. It takes the pain out of setting up servers with PostGIS and GeoServer, and out of serving data to the browser (or writing data from it). So when friends now ask for advice on including geographic data in their web and mobile applications, I will have a few different options for them. CartoDB joins other frameworks that Azavea has used to publish geographic data. My colleague, Tamara, used MapBox to publish some analysis related to voter ID questions in August, and we have also used Esri’s ArcGIS Online infrastructure for other projects. I’m looking forward to seeing how these different frameworks evolve over the coming year.

See the result of this project here: http://sandbox.azavea.com/cartodbvisualization/