What if you wanted to query all of OSM for all time, covering every change ever made? The data exists: OpenStreetMap publishes the full history of every edit made in OpenStreetMap, and Amazon Web Services (AWS) even publishes a version of that data weekly. However, handling that much data and transforming it into geospatial information that is usable for analytics is difficult.

Azavea, in partnership with Pacific Atlas, developed open source tooling based on GeoTrellis and Apache Spark to enable working with OpenStreetMap history at scale. The tooling has been used to backfill the Missing Maps leaderboard data, generate global vector tile sets for monitoring change in OpenStreetMap, and power the statistics behind the Scoreboard project. It leans on the big data capabilities of GeoTrellis and the power of JTS to transform the OpenStreetMap history and snapshot data into geospatial information that can be queried and transformed into other useful datasets, such as vector tiles.

The project is currently called OSMesa, although that name is going to change*. Here are some examples of what OSMesa can do:

Find out which efforts are driving OSM edits

This heatmap shows the location of historical OSM edits by hashtag. Mappers add hashtags to OpenStreetMap changeset comments to record why they made their edits, for example which Humanitarian OpenStreetMap Team campaign they are participating in, which lets analytics keep track of those efforts. The dropdown menu lists a few example hashtags; select one to see a heatmap of the historical node edits made as part of that campaign. OSMesa was used to generate the vector tiles that hold the historical node edits, which are then rendered using Mapbox GL.
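To make the hashtag idea concrete, here is a minimal sketch in Python of pulling hashtags out of changeset comments and tallying edits per hashtag. It is not OSMesa's code (OSMesa is built on GeoTrellis and Apache Spark); the regular expression, the function names, and the `(comment, num_changes)` input shape are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed pattern: '#' followed by letters, digits, underscores, or hyphens.
HASHTAG_RE = re.compile(r"#([\w-]+)")


def extract_hashtags(comment):
    """Return the lowercased, de-duplicated hashtags in a changeset comment."""
    seen = []
    for tag in HASHTAG_RE.findall(comment or ""):
        tag = tag.lower()
        if tag not in seen:
            seen.append(tag)
    return seen


def count_edits_by_hashtag(changesets):
    """Sum edit counts per hashtag.

    `changesets` is a hypothetical iterable of (comment, num_changes) pairs.
    """
    counts = Counter()
    for comment, num_changes in changesets:
        for tag in extract_hashtags(comment):
            counts[tag] += num_changes
    return counts
```

Lowercasing the tags means `#MissingMaps` and `#missingmaps` count toward the same campaign, which is a design choice in this sketch rather than a rule stated by OpenStreetMap.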

Reveal the best path between two points

This next example is a global friction surface: a raster layer derived from SRTM 30-meter-resolution raster data and OpenStreetMap road and waterway data. A friction surface describes how difficult the terrain is to cross. It can be used in a cost distance calculation to determine the best path from point A to point B on a map, whether or not a road network is available.
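As a rough illustration of a cost distance calculation, the sketch below runs Dijkstra's algorithm over a small friction grid in Python. It is not how OSMesa or GeoTrellis computes cost distance; the function name, the 4-neighbor movement, and the choice to charge each step the average friction of the two cells are assumptions made for the example.

```python
import heapq


def least_cost_path(friction, start, goal):
    """Find the cheapest path across a friction grid using Dijkstra's algorithm.

    `friction` is a list of rows of positive numbers; `start` and `goal` are
    (row, col) tuples. Moving between adjacent cells is assumed to cost the
    average of the two cells' friction values.
    """
    rows, cols = len(friction), len(friction[0])
    best = {start: 0.0}      # cheapest known cost to reach each cell
    came_from = {}           # back-pointers for rebuilding the path
    queue = [(0.0, start)]

    while queue:
        cost, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                step = (friction[r][c] + friction[nr][nc]) / 2.0
                new_cost = cost + step
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = cell
                    heapq.heappush(queue, (new_cost, (nr, nc)))

    if goal not in best:
        return None, float("inf")

    # Walk the back-pointers from the goal to the start.
    path = [goal]
    while path[-1] != start:
        path.append(came_from[path[-1]])
    path.reverse()
    return path, best[goal]
```

On a grid where the middle column has high friction (for instance, steep terrain), the cheapest route goes around the obstacle rather than straight across it, which is the behavior a friction surface is meant to capture.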

Highlight OSM edits throughout time

This map displays the history of OSM data in Rhode Island. OSMesa was used to create vector tiles with full historical data, which are then rendered in time slices based on the date selector at the top of the map:

Check out Rob Emanuele from Azavea and Seth Fitzsimmons talking about this project at State of the Map US (SOTMUS) 2018 for more information:

* Seth and Rob led the SOTMUS crowd through a renaming process at minute 21. Joe Flasher from AWS came up with “FireHosm”, which won out with the crowd.