How Philadelphians Get To Work: Calculating Mode Share and Dot Density Maps in CartoDB

To view the full screen map of journey to work in Philadelphia, click here.

We’ve seen some great examples of dot density maps that have been made the past few years, notably the excellent map of racial groups released in 2013. Essentially, these maps are created from polygon census shapefiles by randomly distributing points within each polygon based on the total of each attribute being displayed. For example, if 3,000 people live in a Census tract, 3,000 dots will be randomly distributed in the tract. There are plugins in ArcGIS and QGIS to do this, but what about PostGIS and CartoDB?

I thought I’d take a transportation-related dataset, journey to work mode, available in the latest release of the American Community Survey, and visualize that by the census block group in Philadelphia. The journey to work data contains fairly detailed information on the number of people and how they get to work — in this example I’m going to use drove alone, walking, bicycling and public transit. Most of the time if you’re visualizing data with multiple attributes like journey to work, you’d make separate choropleth maps, one for each mode. In this case, the dot density map will represent each transportation mode, with one colored dot for each category of response.

One of the common issues with dot density maps is that they can misrepresent location. Since we don’t have household level data from the Census, each of the dots are randomly distributed within the polygons, not actually placed on the location of the house. Therefore, it’s a good idea to clip out features where you know people do not live. In this case, I clipped out water features from the block group shapefile. One could imagine taking that a step further and clipping to only residential land use or parcels, but that might get a little messy if displaying multiple attributes in smaller polygons. Next, I calculated percentage share for each mode, using the SQL below after loading the data into a PostGIS database (in this case, CartoDB):

UPDATE azavea.phila_blockgroups
SET _14_perc_drovealone = round( round(_14_totalcar_drovealone,1) round(_14_total,1) * 100, 1) WHERE _14_totalcar_drovealone  > 0

Note that I used the round() function to play nice with the field types in CartoDB and display percentages with one decimal place. I’ll use the calculated mode share percentages that we just made to style an infowindow containing the proportion of each block group using a particular mode share. Next, I’ll have to create that dot density spatial table in CartoDB.

Since CartoDB uses PostGIS to store spatial data, you can write functions directly in the SQL editor and apply them to your datasets. It’s often a bit of a trial and error process, since not every function available in PostGIS is going to work in CartoDB. Stuart Lynn, Map Scientist at CartoDB, came up with a function to create dot density maps in PostGIS. This function can be added to any CartoDB instance or even right in the SQL editor. You can find the dot density function below, which you can just copy, paste, and then run in the Editor:

CREATE OR REPLACE FUNCTION 
dot_density
  (g geometry , 
   no_points Integer, 
   max_iter Integer DEFAULT 1000 )  
RETURNS setof geometry AS$$DECLARE extent GEOMETRY;    
        test_point Geometry;    
        width NUMERIC;
        height NUMERIC;
        x0 NUMERIC;
        y0 NUMERIC;
        xp NUMERIC;
        yp NUMERIC;
        no_left INTEGER;
        points GEOMETRY[];
BEGIN 
        extent:= ST_Envelope(g);
        width:= ST_XMax(extent) - ST_XMIN(extent);  
        height:= ST_YMax(extent) - ST_YMIN(extent);
        x0:= ST_XMin(extent);
        y0:= ST_YMin(extent);
        no_left:= no_points;
LOOP 
        if(no_left=0) THEN EXIT;
        END IF;
        xp = x0 + width*random();
        yp = y0 + height*random();
        test_point = CDB_LATLNG(yp,xp);
        IF(ST_Contains(g, test_point)) THEN no_left = no_left - 1;
        RETURN NEXT test_point;    
        END IF;
        END LOOP;
END$$LANGUAGE plpgsql VOLATILE

The function finds the extent of each of the polygons in the table, creates random points within that extent, then passes the points through the polygon and keeps them if they fall inside.

Next, I used that dot density function to create a table — with one point for each of the number of people driving, walking, taking public transit, or bicycling to work. The function requires two inputs; the geometry field and the attribute field you are mapping.

select 
ST_TRANSFORM(dot_density(the_geom,_14_totalcar_drovealone::integer),3857) as the_geom_webmercator, '14drovealone' as mode, 1 as t from azavea.philadelphia_block_groups_with_mode_share 

UNION 
select ST_TRANSFORM(dot_density(the_geom,_14_totalpublic_transit::integer),3857) as the_geom_webmercator, '14publictransit' as mode, 1 as t from azavea.philadelphia_block_groups_with_mode_share 

UNION 
select ST_TRANSFORM(dot_density(the_geom,_14_bicycle::integer),3857) as the_geom_webmercator, '14bicycle' as mode, 1 as t from azavea.philadelphia_block_groups_with_mode_share   

UNION 
select ST_TRANSFORM(dot_density(the_geom,_14_walked::integer),3857) as the_geom_webmercator, '14walked' as mode, 1 as t from azavea.philadelphia_block_groups_with_mode_share

The result is a table with points in each block group that contain an attribute for the type of journey to work mode. Finally, I used Leaflet to overlay that map with the journey to mode share polygons, with no styling, but applied a highlight on click. The resulting map is below:
In summary, a dot density map can be a cool way to visualize aggregate data and identify spatial patterns, keeping in mind the caveats. In the map of Philadelphia journey to work, one can easily start to identify commuting patterns across the city. In Center City, the map is dominated by walking and public transit use. In University City, there’s a lot of walkers, and a surprising number of people drive alone in Northern Liberties and Fishtown. In Point Breeze, there’s a few pockets with lots of residents commuting by bicycle. You can find the code here.