Measuring District Compactness in PostGIS

By Daniel McGlone on July 11th, 2016

Azavea has a long history of promoting good government through civic technology, including our collaborations with the Public Mapping Project and our work creating District Builder and Cicero. One major civic challenge we’ve tackled over the years is the redistricting process, which is inherently spatial and of interest to many of us at Azavea. This interest, along with our Cicero work, led us to author several white papers on gerrymandering and redistricting. The redistricting cycle that followed the 2010 Census was one of the most contentious in recent memory. Federal courts ruled on cases in Arizona, North Carolina, Virginia and Florida, sometimes upholding redistricting commissions and sometimes striking down gerrymandered districts. I recommend reading the All About Redistricting blog to keep track of the latest updates in redistricting across the U.S.

In our most recent white papers on the topic, we used four commonly cited metrics for calculating the compactness of legislative districts. Polsby-Popper and Schwartzberg measure the indentation of the district, while Area/Convex Hull and Reock measure the dispersion of the district. In this post, I’ll describe how to calculate each using PostGIS. If you’re interested, you can try this at home using the congressional districts we have made available on the Cicero GitHub repository, or pick up the state legislative compactness data already calculated. I should mention that compactness is only one measure to consider when it comes to gerrymandering, and there is a large body of other research on the topic. I don’t suggest that compactness alone can determine whether gerrymandering exists.

Most of the methods require some basic geometric statistics about the shapes, so it’s a good idea to calculate the perimeter and area of each district first (we have included both in the data). It’s important that the shapes are in an equal-area projection for these calculations. This keeps area distortion to a minimum, so districts in different parts of the country can be accurately compared to one another. In this case, I used North America Albers Equal Area Conic, which appears as SRID 102008 in the queries below. To prepare your PostGIS database, you may have to add this projection to your spatial_ref_sys table.
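As a minimal sketch of that preparation step, the statements below check for the projection and fill in the two base columns. They assume the geometry column is named geom (as in the queries later in this post), that the geometries are already stored in SRID 102008, and that your PostgreSQL version supports ADD COLUMN IF NOT EXISTS (9.6 or later). If you are using the precomputed area and perimeter values from the data, skip the UPDATE so they are not overwritten.

```sql
-- Returns one row if the Albers definition is already registered.
SELECT srid, auth_name
FROM spatial_ref_sys
WHERE srid = 102008;

-- Confirm which SRID the district geometries actually carry.
SELECT DISTINCT ST_SRID(geom)
FROM districts_table;

-- Add the base columns only if they are missing.
ALTER TABLE districts_table ADD COLUMN IF NOT EXISTS area double precision;
ALTER TABLE districts_table ADD COLUMN IF NOT EXISTS perimeter double precision;

-- Area and perimeter come back in the units of the projection.
UPDATE districts_table
SET area = ST_Area(geom),
    perimeter = ST_Perimeter(geom);
```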

Polsby-Popper

The Polsby-Popper measure is the ratio of the area of the district to the area of a circle whose circumference is equal to the perimeter of the district. The formula for calculating the Polsby-Popper score is:

$$PP = 4\pi \cdot \frac{A}{P^2}$$

where A is the area of the district and P is its perimeter.

Translated into PostGIS:

ALTER TABLE districts_table ADD COLUMN polsby_popper double precision;
UPDATE districts_table
SET polsby_popper = 4 * pi() * (area/(perimeter^2));
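A quick way to sanity-check the expression is to run it against a shape whose score is known in advance. The sketch below uses a hypothetical 1 x 1 square built from WKT rather than any real district, so it can be run on any PostGIS database without the districts table.

```sql
-- Hypothetical test shape: a 1 x 1 square (area 1, perimeter 4).
SELECT 4 * pi() * (ST_Area(g) / (ST_Perimeter(g) ^ 2)) AS polsby_popper
FROM (
    SELECT ST_GeomFromText('POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))') AS g
) AS square;
-- 4 * pi * 1 / 16 = pi / 4, roughly 0.785
```

A perfect circle would score 1, so a square coming in a little under 0.8 is a useful reference point when reading real district scores.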

Schwartzberg

The Schwartzberg score is the ratio of the perimeter of the district to the circumference of a circle whose area is equal to the area of the district. To generate the Schwartzberg score, the circumference of a circle with the same area as the district must be calculated first. To do so, start with the radius of that equal-area circle:

$$r = \sqrt{\frac{A}{\pi}}$$

where A is the area of the district.

Translated into PostGIS:

ALTER TABLE districts_table ADD COLUMN equal_area_radius double precision;
UPDATE districts_table
SET equal_area_radius = sqrt(area / pi());

With the radius calculated, use the following formula to generate the circumference (perimeter):

$$C = 2\pi r$$

where r is the radius of the equal-area circle.

In PostGIS:

ALTER TABLE districts_table ADD COLUMN equal_area_perimeter double precision;
UPDATE districts_table
SET equal_area_perimeter = 2 * pi() * equal_area_radius;

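Finally, generate the Schwartzberg score using the following ratio. Note that the perimeter ratio is inverted here, so the score falls between 0 and 1 and a higher value means a more compact district:

$$S = \frac{1}{P / C}$$

where P is the perimeter of the district and C is the circumference of the equal-area circle.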

In PostGIS:

ALTER TABLE districts_table ADD COLUMN schwartzberg double precision;
UPDATE districts_table
SET schwartzberg = 1 / (perimeter / equal_area_perimeter);
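The three steps can also be collapsed into a single expression, which is handy for checking the stored values. This is only a sketch: schwartzberg_check is a hypothetical alias, and the query assumes the area, perimeter and geoid columns used elsewhere in this post.

```sql
-- Circumference of the equal-area circle divided by the district perimeter,
-- which is the same value as 1 / (perimeter / equal_area_perimeter).
SELECT geoid,
       (2 * pi() * sqrt(area / pi())) / perimeter AS schwartzberg_check
FROM districts_table;
```

Written this way, the score is the square root of the Polsby-Popper score: squaring 2 * sqrt(pi * area) / perimeter gives 4 * pi * area / perimeter^2. The two measures therefore always rank districts in the same order.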

Area/Convex Hull

The Area/Convex Hull score is the ratio of the area of the district to the area of the minimum convex polygon that can enclose the district’s geometry. To generate convex hull polygons in PostGIS:

CREATE TABLE convexhull AS
SELECT gid, geoid, ST_ConvexHull(districts_table.geom) AS geom
FROM districts_table;

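This query creates a new table with the convex hull polygons. Next, make sure the geometries are tagged with the Albers Equal Area SRID for calculations. Note that ST_SetSRID only sets the SRID label and does not reproject coordinates, so this assumes the district geometries were already in that projection:

UPDATE convexhull SET geom = ST_SetSRID(geom, 102008);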

Now the area can be calculated.

ALTER TABLE convexhull ADD COLUMN area double precision;
UPDATE convexhull
SET area = (ST_Area(geom));

I’ll bring the convex hull area into the original districts table to calculate the ratio.

ALTER TABLE districts_table ADD COLUMN convex_hull_area double precision;
UPDATE districts_table t2
SET convex_hull_area = t1.area
FROM convexhull t1
WHERE t2.geoid = t1.geoid
  AND t2.convex_hull_area IS DISTINCT FROM t1.area;

Finally, the Area/Convex Hull score can be calculated as the following ratio:

$$ACH = \frac{A}{A_{hull}}$$

where A is the area of the district and A_{hull} is the area of its convex hull.
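In PostGIS, that last step might look like the sketch below. The column name area_convex_hull is a hypothetical choice, and the second query is an alternative that skips the helper table and computes the ratio straight from the geometry.

```sql
-- area_convex_hull is a hypothetical column name for the final score.
ALTER TABLE districts_table ADD COLUMN area_convex_hull double precision;
UPDATE districts_table
SET area_convex_hull = area / convex_hull_area;

-- Equivalent one-off query with no helper table.
SELECT geoid,
       ST_Area(geom) / ST_Area(ST_ConvexHull(geom)) AS area_convex_hull
FROM districts_table;
```

Because a district can never be larger than its own convex hull, the result should always fall between 0 and 1.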

Reock

The Reock score is the ratio of the area of the district to the area of the minimum bounding circle that encloses the district’s geometry. Therefore, a new table with minimum bounding circles for each district must be created.

CREATE TABLE minimum_bounding_circles AS
SELECT gid, geoid, ST_MinimumBoundingCircle(districts_table.geom) AS geom
FROM districts_table;

Update the table to make sure it has the proper SRID:

UPDATE minimum_bounding_circles SET geom = ST_SetSRID(geom, 102008);

Now the area can be calculated.

ALTER TABLE minimum_bounding_circles ADD COLUMN area double precision;
UPDATE minimum_bounding_circles
SET area = (ST_Area(geom));

Bring the minimum bounding circle area into the original districts table to calculate the ratio.

ALTER TABLE districts_table ADD COLUMN min_bounding_circles_area double precision;
UPDATE districts_table t2
SET min_bounding_circles_area = t1.area
FROM minimum_bounding_circles t1
WHERE t2.geoid = t1.geoid
  AND t2.min_bounding_circles_area IS DISTINCT FROM t1.area;

Finally, the Reock score can be calculated as the following ratio:

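$$R = \frac{A}{A_{mbc}}$$

where A is the area of the district and A_{mbc} is the area of its minimum bounding circle.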
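A sketch of that final step in PostGIS follows; the column name reock is a hypothetical choice. One detail worth knowing is that ST_MinimumBoundingCircle returns a polygon that approximates the circle, so its area is slightly smaller than the true circle's. The second query is an alternative that assumes PostGIS 2.3 or later, where ST_MinimumBoundingRadius is available, and uses the exact circle area instead.

```sql
-- reock is a hypothetical column name for the final score.
ALTER TABLE districts_table ADD COLUMN reock double precision;
UPDATE districts_table
SET reock = area / min_bounding_circles_area;

-- Alternative sketch: exact circle area (pi * r^2) from the bounding radius.
SELECT geoid,
       ST_Area(geom) / (pi() * (ST_MinimumBoundingRadius(geom)).radius ^ 2) AS reock_exact
FROM districts_table;
```

The two versions should differ only by a small amount, with the polygon-based score coming out marginally higher.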

If you’re interested in the compactness data we’ve calculated for legislative districts in the U.S., we have made it available on GitHub, along with other data used in our Cicero product.