Open Data from OpenTreeMap: Visualizing temporal data with CartoDB’s Torque

I just wrote up a meaty Labs post on my idea to visualize tree, species, and user edits over time within exported data from PhillyTreeMap.org. It already covers all the joining, formatting, converting, and uploading necessary to get to this point, along with some simple visualizations at the end. If you haven’t read it, go ahead. I’ll wait here, because with this post I’m diving straight into the temporal visualization features of CartoDB’s Torque.

Briefly, though, to reiterate: What are my goals for visualizing the 2 years of PhillyTreeMap user edits over time? I wanted to create something parallel to Mark Headd’s homicide data visualization (also done with Torque) but that told a more uplifting story over time. (What’s more uplifting than trees?) I also hoped my visualization would give us a rough idea of which neighborhoods and areas around Philadelphia have the most active PhillyTreeMap user edits, as well as which times of year seem most active. One could use that knowledge to plan where or when to do outreach about PhillyTreeMap or the programs of our partners, like PHS Tree Tenders. What neighborhoods don’t have many user edits? When does participation drop off? On the flip side, where and when are urban forestry efforts succeeding in engaging the community? A time-based spatial visualization can help us answer those questions – and look really cool in the process!

[iframe src="http://azavea.github.io/philly-trees/index_trees.html" width="100%" height="480"]

One final caveat: Torque is under very active development at CartoDB. I was looking for pointers as I was writing this post, and the folks at CartoDB, including Andrew Hill, were very helpful on the mailing list and would be happy to answer other questions you have. But they told me the next-generation version of the library is due to come out “soon”, with better documentation, and it may differ greatly from what I write about below. The visual effect of time-based data in Torque is just so cool, though, that I couldn’t wait!

Testing and Tweaking

CartoDB have set up a number of Torque demos right on Github Pages. You can look at their demo data, or plug in your CartoDB user, table, and column name into the options sidebar to visualize your own date-based data. My “plots_and_trees” table (that I created in the last blog) is set to public (yours must be as well if you want to use Torque, as currently the library doesn’t do password authentication), so feel free to use it if you wish: User “andrewbt”, table “plots_and_trees”, columns “tree_last_updated” or “plot_last_updated”.

The Torque demo gives you a number of options that affect the visual effect of your visualization, but because it’s in development I couldn’t find any good explanations of what they are or how best to use them. So I wrote some up. Congratulations, dear reader, you get to enjoy the fruits of my copious amounts of experimentation.

  • Resolution: Here you have a choice of doubling numbers from 1 to 32. What does it do? This effectively changes the granularity of the data points CartoDB will stream from your table to Torque. Or, as Andrew Hill explained on the Google Group, “resolution relates to the actual X, Y dimensions that data will collapse to coming from the server and drawing to pixels.” Point is, I noticed lower values seemed like they would more accurately reflect the location of the actual record, whereas larger values created a larger data point “dot” that gave a looser indication of actual location. It may be the case that for very large datasets, a larger resolution would make the animation faster or smoother. However, a resolution of 1 or 2 is fine for our PhillyTreeMap table.
  • Cellsize: Slider from 0 to 50. This is one of the more befuddling options. Andrew Hill explained to me on the CartoDB Google Group that it deals with buffered trim around each of the data points. In my experience, whether you set it to 1 or 50, it doesn’t change much. Perhaps my eyes can’t see the difference or tree data isn’t the best use case for this.
  • Cumulative: Checkbox. This makes a big difference visually, and determines whether data points stay on your map persistently or fade out. In the case of our PhillyTreeMap data, my earlier concern about thousands of records from bulk imports having the same last updated date rears its head. If you check the cumulative checkbox, whether you’re using the “plot_last_updated” or “tree_last_updated” column, your map will be flooded with huge imports early in the summer of 2011, and for the next two years of animation you won’t really get to see much editing activity because of the thousands of trees and plots already drawn on the map. It makes for a much more exciting and compelling visualization of user activity if you turn the cumulative checkbox off.
  • Trails: Another checkbox that makes a sizable difference visually. Turned off (with cumulative also off), the data points on your map will appear and disappear rather quickly, like camera flashes, no matter what other options you have set. I found this made for a jittery visualization that made it tough to ponder the “where and when” questions I outlined above. Turned on (still with cumulative off), each data point will appear quickly, then slowly fade out with several “trail” points afterward – almost like a mushroom cloud. This has a really cool effect with moving data, like the Torque demo’s default WWI ships data, but even with stationary trees the lingering trails make it easier to comprehend just where the active and inactive neighborhoods are.
  • Steps: Slider from 0 to 750. This is actually pretty straightforward, if I understand it correctly. Given the start date and end date of your data, this determines the number of individual animation steps that Torque will take to loop through all the data once. So, higher values equate to a slower and smoother animation with fewer data points in each “frame,” whereas lower values equate to a more rapid visualization from start to finish, squeezing more data points into each frame. In other words, are you visualizing activity each day, or each month?
  • FPS: Slider from 0 to 50. Also straightforward if you’re familiar with animation: Frames Per Second. Given the number of steps, or frames, we fit our data into in the last setting, how quickly do we want to loop through all of those? 20 FPS and above are good numbers to make everything appear smooth to the human eye, but can be a bit fast with our tree data if you’re really trying to take a look at trends. In our data’s case, I found the best to be around 12-15 FPS.
  • Blendmode: So confusing at first! Andrew Hill helped clear this up a little bit by sending me this article from Mozilla with examples. I was also able to find this article from the W3C (scroll down and look at the pictures and operations) that uses the same language as the Torque demo, too. This is real HTML5 Canvas graphics at work. Abstractly, the blendmode deals with how to “blend” visual information already on the canvas with new visual information to be added. More concretely, given that Torque is temporal, and we may already have a data point drawn on the map, if another data point at a later date comes up near or on top of the point we already have, the blendmode determines the method for visualizing and coloring both of these points simultaneously. As I said on the Google Group, I’m a fan of using “source-over” for the tree data, which seems to give emphasis to the most recently drawn points by painting them over older points. But, if you have cumulative switched off, then points don’t stay around long, and how overlapping ones get blended becomes less of an issue. Still, it’s fun to experiment with the different modes.
  • Point_type: Circle or Square? This option describes what you want your data points to look like. I’ve noticed circles have more dramatic trails, leading to the mushroom cloud effect I mentioned earlier which perhaps makes individual data points easier to distinguish and visualize. Squares nest together better (they are effectively groups of pixels themselves), so each individual point appears less distinctive and they group more. This gives the squares an interesting “glowing” characteristic with trails enabled, too, but without trails they can seem to flash by even quicker than circles which makes them harder to notice.
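To make the Steps and FPS trade-off in the list above a little more concrete, here is a rough back-of-the-envelope sketch. It is plain arithmetic, not Torque code: the function names and the two-year date window are assumptions made up for illustration, and your table's real date range will differ.

```javascript
// Rough timing arithmetic for a Torque-style animation (illustrative only).
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// How much real-world time one animation step covers, in days.
function daysPerStep(startDate, endDate, steps) {
  return (endDate - startDate) / MS_PER_DAY / steps;
}

// How long one full loop of the animation takes on screen, in seconds.
function loopSeconds(steps, fps) {
  return steps / fps;
}

// Hypothetical two-year window (an assumption, not the table's actual range).
const start = new Date(Date.UTC(2011, 5, 1)); // 2011-06-01
const end = new Date(Date.UTC(2013, 5, 1));   // 2013-06-01

console.log(daysPerStep(start, end, 750).toFixed(2)); // "0.97"
console.log(loopSeconds(750, 12)); // 62.5
console.log(loopSeconds(750, 20)); // 37.5
```

Under those assumptions, 750 steps over two years works out to roughly one step per day, and 12 FPS gives a loop of a little over a minute.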

Making our Own

So, after experimenting with the interactive demo, I’d decided upon this Torque options mix for our OpenTreeMap data:

Resolution: 2
Cellsize: 1
Cumulative: false
Trails: true
Steps: 750
FPS: 12
Blendmode: source-over
Point_type: circle

As my next step, I forked Mark Headd’s philly-homicides repository as my own philly-trees project. The first change I made was to the Google Map style – I liked the darker appearance of the CartoDB Torque demo and found the dark yellow highways on Mark’s map to be distracting:

[github file = "/azavea/philly-trees/blob/master/index_trees.html" start_line = "53" end_line = "58"]

Next I went to update the TorqueOptions with the choices I had decided upon above:

[github file = "/azavea/philly-trees/blob/master/index_trees.html" start_line = "78" end_line = "93"]
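In case the embedded file doesn't load, here is a rough sketch of what an options object with those choices might look like. The key names simply mirror the demo's sidebar labels and are my assumption, not a verified copy of the file or of Torque's API, and checkOptions is a hypothetical helper that only checks the ranges the demo sliders expose.

```javascript
// Hypothetical options object: key names mirror the demo's sidebar labels
// and are assumptions, not a verified copy of the embedded file.
const torqueOptions = {
  user: "andrewbt",
  table: "plots_and_trees",
  column: "tree_last_updated",
  resolution: 2,
  cellsize: 1,
  cumulative: false,
  trails: true,
  steps: 750,
  fps: 12,
  blendmode: "source-over",
  point_type: "circle"
};

// Small sanity check against the ranges the demo sidebar exposes.
// Returns the names of any options that fall outside those ranges.
function checkOptions(o) {
  const problems = [];
  if (![1, 2, 4, 8, 16, 32].includes(o.resolution)) problems.push("resolution");
  if (o.cellsize < 0 || o.cellsize > 50) problems.push("cellsize");
  if (o.steps < 0 || o.steps > 750) problems.push("steps");
  if (o.fps < 0 || o.fps > 50) problems.push("fps");
  if (!["circle", "square"].includes(o.point_type)) problems.push("point_type");
  return problems;
}

console.log(checkOptions(torqueOptions)); // []
```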

And we have our first time-animated map of user edits to trees!

[iframe src="http://azavea.github.io/philly-trees/index_trees.html" width="100%" height="480"]

It is easy enough to change the map to display user edits to plots instead: simply change the column in TorqueOptions. I created a new file to do this:

[github file = "/azavea/philly-trees/blob/master/index_plots.html" start_line="81" end_line="82"]

The Big Bang

However, see how, as each of these animations loops, there are big explosions of thousands of trees and plots added in spring 2011? As mentioned in the last blog, these were the large bulk imports we originally seeded PhillyTreeMap with. While the rest of each animation looks fine as long as “cumulative” in TorqueOptions is set to false, as we discovered, I wasn’t quite satisfied. It would be great if we could filter our CartoDB table by date, display only results after the major initial bulk imports, and turn the cumulative option on so the data points would persist on the map. That way we could visualize all user edits simultaneously, and they wouldn’t be crowded out by the bulk imports.

But how to figure out which specific dates I needed to filter out? Part of CartoDB’s power comes from being able to run straight PostgreSQL queries. After looking at the Postgres documentation and finding out about the date_trunc() function, here’s what I came up with:

SELECT date_trunc('day', tree_last_updated) as day, count(date_trunc('day', tree_last_updated)) as counted from plots_and_trees
GROUP BY day

We had thousands of trees imported on June 21, 2011! So, querying our table for all records where the tree_last_updated date was June 22 or after would do the trick:

SELECT * from plots_and_trees where tree_last_updated > '2011-06-22'
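If it helps to see what those two queries are doing, here is a small JavaScript sketch of the same logic run against a handful of made-up rows. The sample data, function names, and ids are all hypothetical. Only the column name and the cutoff date come from the queries above.

```javascript
// Made-up sample rows standing in for the plots_and_trees table.
const rows = [
  { id: 1, tree_last_updated: "2011-06-21T09:15:00Z" },
  { id: 2, tree_last_updated: "2011-06-21T09:15:02Z" },
  { id: 3, tree_last_updated: "2011-06-21T09:15:04Z" },
  { id: 4, tree_last_updated: "2011-08-03T14:30:00Z" },
  { id: 5, tree_last_updated: "2012-04-10T18:05:00Z" }
];

// Rough equivalent of date_trunc('day', ...): keep only the YYYY-MM-DD part.
function truncToDay(isoTimestamp) {
  return isoTimestamp.slice(0, 10);
}

// Rough equivalent of the GROUP BY query, sorted so the busiest day comes first.
function countByDay(records, column) {
  const counts = new Map();
  for (const r of records) {
    const day = truncToDay(r[column]);
    counts.set(day, (counts.get(day) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Rough equivalent of the WHERE clause: keep rows strictly after the cutoff.
function updatedAfter(records, column, cutoff) {
  const t = Date.parse(cutoff);
  return records.filter((r) => Date.parse(r[column]) > t);
}

console.log(countByDay(rows, "tree_last_updated")[0]); // [ '2011-06-21', 3 ]
console.log(updatedAfter(rows, "tree_last_updated", "2011-06-22").length); // 2
```

The day with a suspiciously large count is the bulk import, and the filter keeps everything after it.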

I came up with similar queries for plots:

SELECT date_trunc('day', plot_last_updated) as day, count(date_trunc('day', plot_last_updated)) as counted from plots_and_trees
GROUP BY day

-- Here's the SQL for all plot records after the major import.
-- Note the OTM project introduced the concept of a plot separate from a tree in 2012.

SELECT * from plots_and_trees where plot_last_updated > '2012-05-24'
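You can also try queries like these outside the CartoDB dashboard. As far as I recall, CartoDB's SQL API accepts a query in a q parameter on a per-user endpoint, but treat the URL shape below as an assumption and check it against the current documentation. The sqlApiUrl helper is hypothetical and only builds the URL. It makes no request.

```javascript
// Hypothetical helper: builds a URL for a public-table query.
// The endpoint shape is an assumption based on CartoDB's SQL API as I recall it.
function sqlApiUrl(user, sql) {
  return "http://" + user + ".cartodb.com/api/v2/sql?q=" + encodeURIComponent(sql);
}

const plotQuery =
  "SELECT * FROM plots_and_trees WHERE plot_last_updated > '2012-05-24'";

const url = sqlApiUrl("andrewbt", plotQuery);
console.log(url);
```

Encoding the query matters here because of the spaces and the > sign in the WHERE clause.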

Now, how to visualize a SQL-filtered table with Torque and not just the whole thing? Luckily, Jesus on the CartoDB mailing list had already had this same question and I just happened to find the thread. What needs to be done to visualize an SQL query with Torque is to assign the query to an alias (Jesus and I used “i”, below), and use that instead of a table name in TorqueOptions:

[github file = "/azavea/philly-trees/blob/master/index_trees_cumulative.html" start_line="80" end_line="81"]

Then, you need to go into Torque’s grid_layer.js file, find where the SQL statement is constructed, and make a small modification to use your aliased query:

[github file = "/azavea/philly-trees/blob/master/src/grid_layer_for_date_queries.js" start_line="165" end_line="169"]
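The general idea behind the alias trick can be sketched like this. To be clear, this is not Torque's actual code: fromClause is a hypothetical helper that only illustrates why a filtered query needs parentheses and an alias before it can stand in for a table name in a larger SQL statement.

```javascript
// Hypothetical helper, not Torque's code: produce something usable after FROM,
// whether the source is a plain table name or a full SELECT statement.
function fromClause(source, alias) {
  const isQuery = /^\s*select\b/i.test(source);
  if (isQuery) {
    // A subquery must be wrapped in parentheses and given an alias in PostgreSQL.
    return "(" + source.trim().replace(/;\s*$/, "") + ") AS " + alias;
  }
  return source + " AS " + alias;
}

const filtered =
  "SELECT * FROM plots_and_trees WHERE tree_last_updated > '2011-06-22'";

console.log("SELECT count(*) FROM " + fromClause("plots_and_trees", "i"));
console.log("SELECT count(*) FROM " + fromClause(filtered, "i"));
```

Either way, the outer statement can refer to columns through the same alias, which is what lets a library swap a query in where it expected a table.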

I copied the grid_layer.js file to a new grid_layer_for_date_queries.js before I made this change, so my existing non-cumulative tree and plot visualizations would still work. In my new index_trees_cumulative.html and index_plots_cumulative.html visualization pages, I then linked to this new JS file:

[github file = "/azavea/philly-trees/blob/master/index_trees_cumulative.html" start_line="29" end_line="30"]

It was a happy coincidence that my files came from Mark Headd’s nine-month-old project and were the same version as Jesus’ old files, so I could use his modification directly. Torque is under development, and the current version no longer constructs its SQL statement the same way. So, while a different modification would need to be made to support queries, it’s still possible and fairly straightforward, as Jesus says.

The Big Reveal

Finally, we have four time-based visualizations of user edits to plots and trees in PhillyTreeMap over the last 2 years. Some ephemeral ones (first is trees and second is plots), which look cool:

[iframe src="http://azavea.github.io/philly-trees/index_trees.html" width="100%" height="480"]

[iframe src="http://azavea.github.io/philly-trees/index_plots.html" width="100%" height="480"]

And some persistent ones, which let you really see which neighborhoods are active and when (again, first is trees and second is plots):

[iframe src="http://azavea.github.io/philly-trees/index_trees_cumulative.html" width="100%" height="480"]

[iframe src="http://azavea.github.io/philly-trees/index_plots_cumulative.html" width="100%" height="480"]

Congratulations to all the tree mappers in East Passyunk and Center City, but it looks like the residents of Northeast Philly need to get outside and map more! In all seriousness though, it’s remarkable to see all of these user edits at once. These visualizations really bring home the fact that PhillyTreeMap is a living, breathing community of active individuals doing their small parts, which add up to a much greater whole.