Azavea Atlas

Maps, geography and the web

2014 Summer of Maps Fellows Announced

We are thrilled to announce the 2014 Summer of Maps fellows and the non-profit organizations they will work with.  Please join me in congratulating:

Tim St. Onge, M.S. Candidate, Geographic Information Science for Development and Environment, Clark University, working with:

  • DataHaven - Analyzing The Relationships Between Neighborhood Indicators In The Greater New Haven And Valley Region

  • Community Design Collaborative – Using Spatial Analysis to Prioritize Design Grants in Philadelphia

Amory Hillengas, Masters of Urban Spatial Analytics, Community and Economic Development concentration, University of Pennsylvania, working with:

  • GirlStart - Analysis Of Funding Resources And Program Adoption Of Girlstart In Central Texas

  • City Harvest - Analysis of Retail Food Access in Low Income Communities to Measure Need for and Impact of City Harvest Programming

Jenna Glat, B.A.  in Geography and Spanish, Colgate University, working with:


We are very excited to work with our 2014 Summer of Maps fellows and support them through their work with these nonprofits.

Fellowship Sponsors

We are also enormously grateful to the following organizations for sponsoring the 2014 Summer of Maps Program:



These generous sponsorships have enabled us to continue this very important program.

The United States of Social Media Part II: This Time It’s Partisan

A few months ago, I researched how many state legislators had social media accounts and which social media platforms they were most likely to use.

That post focused on Facebook, Twitter and YouTube and found that 46% of all state legislators have an account on at least one of these platforms, with Facebook being the most used of the three. As a follow-up, I decided to look at whether one of the two major parties was more likely to have social media accounts than the other.

My first assumption was that Democrats, being more popular with younger voters, would be more likely to have social media accounts. As we entered more social media data into the Cicero API database, I saw that Republicans more than held their own on this front, and I started to believe they might actually have more social media accounts than Democrats. I readjusted my expectations, but this was still only an assumption. It made me curious enough, though, to explore the data further and see if either of my hunches might be correct.

As it turns out, the data indicate that in both the upper and lower chambers of state legislatures, Republicans and Democrats hold almost identical patterns in their social media account membership.

Overall, nearly half of all Republicans and Democrats (48% and 47% respectively) have an account with one of the three major social media players that I was studying.

All Social Media Upper Chambers

Image 1

In the upper chambers, 51% of Republicans and 52% of Democrats have at least one account. Nearly 3 in 10 in each party (27% of Republicans and 28% of Democrats) use only one platform. That drops to 2 in 10 (21% of Republicans and 20% of Democrats) when it comes to using two platforms. Only a small number of legislators use all three platforms: 3% of Republicans and 4% of Democrats.

All Social Media Lower Chambers

Image 2

In the lower chambers, the similarities continue. Forty-seven percent of Republicans and 45% of Democrats have at least one social media account. Again, nearly 3 in 10 in both parties (28% of Republicans and 30% of Democrats) have an account with only one platform. That drops to 16% of Republicans and 13% of Democrats using two platforms. Again, only a small number of legislators use all three platforms: 3% of Republicans and 2% of Democrats.

Upper Chambers with Facebook

Image 3

The parties show similar account ownership when it comes to each platform as well. In the upper chambers, 44% of Republicans and 45% of Democrats have Facebook accounts.

Lower Chambers with Facebook

Image 4

While in the lower chambers, 41% of Republicans and 40% of Democrats have Facebook accounts.

Upper Chambers with Twitter

Image 5

The Twitter data tells a similar story: 30% of Republicans and 29% of Democrats in the upper chambers have Twitter accounts.

Lower Chambers with Twitter

Image 6

While in the lower chambers, 20% of Republicans and 19% of Democrats have Twitter accounts.

Upper Chambers with YouTube

Image 7

Only 5% of Republicans and Democrats in their states’ upper chambers have YouTube accounts.

Lower Chambers with YouTube

Image 8

In the lower chambers, 7% of Republicans and 3% of Democrats have YouTube accounts.

The data hammer the same point home: both parties are using social media in almost exactly the same fashion. It doesn’t matter if I look at the upper or the lower chambers, or if I tease the data out to specific social media platforms. In just about every circumstance, the results are mind-numbingly similar.

We are continuing to add more social media data to the Cicero API database, and will start tracking sites like Tumblr, LinkedIn, Pinterest and Instagram. My gut tells me that LinkedIn will overtake YouTube as the third most used social media platform in our data, but then again, I’ve been wrong before.

Analyzing Philadelphia Crash Data

Our first opportunity to work with crash data was last summer during our 2013 Summer of Maps Fellowship. Tyler Dahlberg, our fellow working with the Bicycle Coalition of Greater Philadelphia, analyzed bicycle crashes in Philadelphia from 2007 to 2012. His analysis showed bicycle crashes clustering around some of the wider streets in the city: Market Street, Broad Street and Spring Garden, to name a few. Any bicyclist in Philadelphia will tell you these are not their favorite places to ride a bike. Tyler also looked at bicycle crashes related to aggressive driving and found significant clustering in University City, specifically around 38th Street.

This type of crash analysis had never really been done in Philadelphia, at least not with the results available to the public. It caught the interest of some of the city’s civic-minded journalists, including blogger-extraordinaire Jon Geeting. Jon contacted Azavea with the idea of doing this analysis on all crashes in Philadelphia, with a specific interest in aggressive driving and incidents involving pedestrians. We typically think of crashes as “accidents”. Jon’s hypothesis, and one that finds wide favor in the planning sphere, is that these crashes aren’t really accidents at all. They can be a direct consequence of how our streets are designed. Therefore, if planners can design our streets in a way that favors lower speeds (traffic calming, as it’s called) and accommodates all kinds of travelers (bicyclists and pedestrians as well as cars), perhaps the crash rate can be reduced. I’ll leave that analysis up to Jon, and you can read some of it here, here and here.

In this blog I’ll share the workflow and tools used in the GIS part of this analysis. To understand where crashes are occurring, first the dataset had to be mapped. The software of choice in this instance was ArcGIS, though most of the analysis could have been done using QGIS. Heat maps are all the rage, and if you want to make simple heat maps for free and you appreciate good documentation, I recommend the QGIS Heatmap plugin. There are also some great tools in the free open-source program GeoDa for spatial statistics.

Getting the Data into ArcGIS

Our source was Pennsylvania’s Department of Transportation (PennDOT), which maintains the crash data and sent it as a Microsoft Access 2007 database. I’m not sure how the data is stored internally, but at least with Access we can use SQL to query out just what we need. The database contains 12 tables, each covering a specific aspect of every crash. For example, there’s a PERSON table that contains information about all people involved in the crash, such as their age, sex, drug and alcohol test results and even where they sat in the vehicle. Clearly there’s a ton of information we could look at, and hopefully we’ll see some more analysis of this dataset in the future. For the purpose of this study, we just need one table from the database, the CRASH table, which contains the most important information on the crash: where, when and item counts (how many people, vehicles, pedestrians, bicycles, fatalities, etc.).

The “where” information on the crash is stored in the degrees, minutes, seconds coordinate format, which ArcGIS doesn’t understand. Therefore, it had to be converted to decimal degrees. There are actually quite a few ways to do this. Starting in ArcGIS 10.0, there’s the handy Convert Coordinate Notation tool, which accepts a wide variety of formats. It’s also possible to do this with a Python script or VBScript in the Field Calculator. The PennDOT coordinate data doesn’t seem to be formatted the way ArcGIS’s Convert Coordinate Notation tool prefers, so I went the other way and used a Python script. Another way to go about this would be to convert the coordinates to decimal degrees inside Access before exporting the data into ArcGIS.
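As a rough illustration of the conversion itself, here is a minimal Python sketch. The exact layout of PennDOT's coordinate fields isn't described here, so the function assumes the degrees, minutes and seconds are already available as separate numbers. The function name and the sample coordinate are hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds, negative=False):
    """Convert degrees, minutes, seconds to decimal degrees.

    `negative` marks western longitudes or southern latitudes.
    The names here are illustrative, not PennDOT or ArcGIS fields.
    """
    decimal = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    return -decimal if negative else decimal


# Hypothetical point in Philadelphia (longitude is west, so negative)
lat = dms_to_decimal(39, 57, 9.0)
lon = dms_to_decimal(75, 9, 54.0, negative=True)
print(round(lat, 4), round(lon, 4))
```

The same arithmetic could be applied in the Field Calculator or in an Access query. Only the parsing of the source fields would differ.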

With properly formatted coordinates, the crashes can now be displayed on a map.


Click here for PDF version

That’s a lot of dots (53,260 to be exact). But it’s not a particularly useful map. A couple of ways to make the data more useful would be to look at clusters of crashes, such as hot spots, and to calculate crash rates on Philadelphia’s streets.

Hot Spot Analysis

With the data full of so many attributes describing the crash, I wanted to identify clusters of specific attributes. I used the ArcGIS Optimized Hot Spot Analysis tool, which calculates a Getis-Ord Gi* statistic for each feature. This determines if there are any statistically significant areas of high or low values of that attribute. Basically, it’s identifying crashes that are surrounded by other crashes with similarly high or low values (for example, aggressive or not aggressive). The settings here are really important. I used the SNAP_NEARBY_INCIDENTS_TO_CREATE_WEIGHTED_POINTS aggregation method since there were often multiple crashes at an intersection with slightly different geocoded coordinates. The resulting map shows, for each feature, whether it falls in a hot spot (a statistically significant cluster of high values) or a cold spot (a statistically significant cluster of low values). I ran the hot spot analysis on the aggressive driving attribute in the crash data. I didn’t include interstate roads in the analysis since I just wanted to look at where the hot and cold spots for aggressive driving were on city streets (a majority of the aggressive driving crashes overall were on interstate roads).
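To make the statistic a little more concrete, here is a minimal pure-Python sketch of Gi* with binary fixed-distance weights. It is not the ArcGIS implementation: the Optimized Hot Spot Analysis tool also aggregates incidents and picks an analysis scale on its own, and none of that is reproduced here. The function name, the distance band and the sample points are all assumptions for illustration.

```python
import math


def getis_ord_gi_star(points, values, band):
    """Return a Gi* z-score for every point.

    points: list of (x, y) tuples in projected units (e.g. feet)
    values: attribute per point (e.g. 1 = aggressive, 0 = not)
    band:   distance within which points count as neighbors
    Binary weights are an assumption; each point counts as its own
    neighbor, which is what distinguishes Gi* from Gi.
    """
    n = len(values)
    mean = sum(values) / n
    # Guard against a tiny negative variance from floating-point rounding
    s = math.sqrt(max(0.0, sum(v * v for v in values) / n - mean * mean))
    scores = []
    for xi, yi in points:
        # Binary weight: 1 if within the band (including self), else 0
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
             for xj, yj in points]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        weighted = sum(wj * v for wj, v in zip(w, values))
        denom = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append((weighted - mean * sum_w) / denom if denom else 0.0)
    return scores


# Hypothetical data: three aggressive crashes close together,
# three non-aggressive crashes close together somewhere else
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
values = [1, 1, 1, 0, 0, 0]
scores = getis_ord_gi_star(points, values, band=2)
# The first three scores are positive (hot), the last three negative (cold)
print([round(z, 2) for z in scores])
```

A large positive z-score marks a feature in a neighborhood of high values and a large negative one a neighborhood of low values. Deciding which scores count as significant is a separate step that this sketch leaves out.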


Click here for PDF version

Note on the map above the clusters of hot and cold spots for aggressive driving. The crashes that are not statistically significant are not displayed on the map. Aggressive driving crashes cluster along Roosevelt Boulevard, well known for its dangerous conditions. Other hot spots appear along City Line Avenue on the western border of the city. There also seems to be quite a bit of hot spot clustering around interchanges along Interstate 676 and the Ben Franklin Bridge. Jon Geeting hypothesizes that this is the result of traffic coming off the interstate and not adjusting to the slower city streets. It’s also interesting to look at the cold spots, or where aggressive driving crashes show a significant level of dispersion. That can be seen in Chinatown, the Center City West/Rittenhouse area, and the East Passyunk neighborhood, specifically right along the 9th Street market. All three areas have high amounts of pedestrian activity, slower traffic speeds and lots of mixed-use development. Could that be a deterrent to aggressive driving?


Click here for PDF version

The downside, of course, could be increased pedestrian crashes or deaths. However, this map of pedestrian crash hot and cold spots doesn’t necessarily indicate that. Center City appears as one giant hot spot. Since we don’t have any way to normalize the pedestrian data by the volume of pedestrians, it’s hard to say whether that’s just related to the higher number of pedestrians in Center City. Two of the neighborhoods that were cold spots for aggressive driving crashes, Chinatown and Center City West, are part of the greater Center City area, which is all a big hot spot for pedestrian crashes (though neither section seems particularly “hot” compared to the rest of Center City). The East Passyunk neighborhood doesn’t appear to have significant clustering either way. Perhaps this indicates that high pedestrian activity reduces aggressive driving but does not result in increased pedestrian crashes, at least in that area.


Click here for PDF version

There also doesn’t seem to be an unusually high number of pedestrian deaths in either of those neighborhoods, as you can see on the above map. It does appear that Roosevelt Boulevard in Northeast Philadelphia has a high number of pedestrian deaths. This is especially striking considering there’s much less pedestrian activity there than in Center City, due to the wide street and more suburban built environment of Northeast Philadelphia.

Calculating Crash Rates

Source: xkcd

One of the dangers of mapping without context is we may be accidentally making a map that simply serves as a proxy for population. A map of the total number of crash deaths per state is probably going to look similar to a map of the total number of people. But if we map the crash rate, we can see which states have a higher number of crashes based on the proportion of population. We can do the same thing by mapping the crash rate on each street.

So, how can we determine if crashes are simply happening because there are lots of cars? We know there will probably be more crashes on streets with more traffic, so we needed to normalize the number of crashes by the traffic on the street. Unfortunately, I could only obtain reliable traffic count information on PennDOT-maintained streets in Philadelphia. Therefore, crash rates were only calculated on those streets. First, the crashes had to be summarized by street segment, which can be done with a Spatial Join and Summary Statistics. After running those tools, we have the total number of crashes on each street segment.
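For readers without ArcGIS, the idea behind that join-and-summarize step can be sketched in plain Python: assign each crash to its nearest street segment, then count crashes per segment. This is only an approximation of what the Spatial Join and Summary Statistics tools do. It assumes planar (projected) coordinates and straight two-point segments, and every name and coordinate below is hypothetical.

```python
from collections import Counter


def distance_to_segment(p, a, b):
    """Planar distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0  # degenerate segment: a and b are the same point
    else:
        # Position of the closest point along the segment, clamped to [0, 1]
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    cx, cy = ax + t * dx, ay + t * dy
    return ((px - cx) ** 2 + (py - cy) ** 2) ** 0.5


def count_crashes_by_segment(crashes, segments):
    """Assign each crash to its nearest segment and count per segment.

    crashes:  list of (x, y) points in projected coordinates
    segments: dict of segment id -> ((x1, y1), (x2, y2))
    """
    counts = Counter()
    for crash in crashes:
        nearest = min(
            segments,
            key=lambda sid: distance_to_segment(crash, *segments[sid]),
        )
        counts[nearest] += 1
    return counts


# Hypothetical segments and crash locations
segments = {"A": ((0, 0), (10, 0)), "B": ((0, 5), (10, 5))}
crashes = [(1, 0.5), (4, 1.0), (9, 4.2)]
counts = count_crashes_by_segment(crashes, segments)
print(dict(counts))
```

A real workflow would also want a maximum search distance so that crashes far from any street in the network are not forced onto one.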

A crash rate can be calculated using the following formula, which is often cited in literature and used by state DOTs:

R = (C × 1,000,000)  ÷  (A × 365 ×  N × L)

Where R is the calculated crash rate, C is the number of crashes on the street segment, A is the Average Annual Daily Traffic volume on the street segment, 365 is the number of days in a year, N is the number of years in the study and L is the length of the roadway segment in miles.

What we end up with is a crash rate per one million vehicle miles driven on each street segment.
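Written as code, the formula is a one-liner. The sketch below is just the arithmetic from the formula above, with a hypothetical function name and made-up input values rather than figures from the PennDOT data.

```python
def crash_rate(crashes, aadt, years, length_miles):
    """Crashes per one million vehicle miles traveled on a segment.

    Implements R = (C * 1,000,000) / (A * 365 * N * L).
    """
    exposure = aadt * 365 * years * length_miles
    if exposure <= 0:
        raise ValueError("AADT, years and length must all be positive")
    return crashes * 1_000_000 / exposure


# Hypothetical segment: 12 crashes over 5 years,
# 20,000 vehicles per day, half a mile long
rate = crash_rate(crashes=12, aadt=20_000, years=5, length_miles=0.5)
print(round(rate, 3))
```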


Click here for PDF version

Addition on April 8, 2014: One point about the crash rate calculations. They’re calculated on each segment, and while the formula does take segment length into account, very short segments tend to show very high rates. This only seems to be an issue on a few street segments, but it should be taken into consideration when looking at the map.


One very important note about PennDOT’s crash data. Many have commented that a crash they were involved in (usually these are pedestrian or bicycle crashes) is not on the maps we’ve produced. Here’s one explanation for that: crash data maintained by PennDOT are only “reportable” crashes, defined in Title 75 of the Pennsylvania Consolidated Statutes, Section 3746(a):

An incident that occurs on a highway or traffic way that is open to the public by right or custom and involves at least one motor vehicle in transport. An incident is reportable if it involves:

  • Injury to or death of any person, or
  • Damage to any vehicle to the extent that it cannot be driven under its own power in its customary manner without further damage or hazard to the vehicle, other traffic elements, or the roadway, and therefore requires towing.

Since most bicycle and pedestrian crashes don’t do significant damage to a vehicle, it’s easy to see how many of them simply wouldn’t be reported by PennDOT. So, the problem of bicycle and pedestrian crashes could actually be a lot worse than what is actually shown. I’m hopeful that we’ll see more data released, perhaps by jurisdictional police departments, which may shed more light on this. It would also be great to combine this with some sort of crowdsourced pedestrian and bicycle crash map where users could report those minor crashes that don’t necessitate a police report. That could be a great way to further identify the most dangerous streets and intersections for bicyclists and pedestrians.

Districts of Westeros Now Available in Cicero!



The Cicero team is pleased to announce that we’ve added the highly contested and volatile political districts of Westeros. The Cicero API assembles legislative districts and elected official information for six countries and now the Lands of Ice and Fire! You can now send Westerosi addresses to the Cicero API to find detailed information (including lineage and sigil) about each king and lord ruling in any of the Seven Kingdoms. We have also added historical reigns dating back to Aegon the Conqueror.

We welcome you to send addresses to the Cicero API, for example “8600 N. Kings Road, Winterfell of Wolfswood” or “53 Eel Alley, Flea Bottom, King’s Landing”, which will return their respective kings.

We hope you will bear with us as we do our best to keep up with the ever contentious and constantly changing rulers. Though wildling territories are technically not considered part of the Seven Kingdoms, we are currently monitoring the movements of Mance Rayder, as we believe he has the potential to extend his rule south of the Wall, into Brandon’s Gift.

Because we pledge to maintain accuracy in our data, we plan only to update the Cicero database based on an official King’s decree and never based on the Spider’s whispers.  We’ve also begun adding the districts for Essos including the Dothraki Sea and their many khalasars though we have found this challenging due to the constant migration of the Dothraki horse lords.

Please stay tuned for the planned upcoming addition of the Shire, Rohan and Mordor.

Click the search link below to access the Districts of Westeros in Cicero.

Try out the Westeros Districts in Cicero!


Philadelphia Bursts with Resources and Support for Women in Tech (Part 2)

Continuing our discussion of the immense resources and support for women in tech in Philadelphia, we interviewed Yasmine Mustafa, the founder and co-organizer of Girl Develop It Philadelphia, to learn what makes Philadelphia a great place for women in tech.

Yasmine Mustafa

Sarah: What are the best things about women in tech in Philadelphia?

Yasmine: I would say the women in tech community in Philadelphia is unique in that we’re involved. Looking at Girl Develop It Philly, we’re the second largest tech group in the area with over 1,300 members, which is impressive considering [women in tech are] supposed to be a minority. I see our students giving back by coaching people next to them in class or coming back as TAs. Finally, the enthusiasm for learning is there and it’s wonderful to see.


S: Where do you see opportunities for improvement for women in tech in Philadelphia?

Y: I just blogged about this here: I’d love to see more women take on leadership roles by organizing tech events or speaking at conferences. I’m seeing the same faces and roster everywhere I turn. There’s a tremendous opportunity to see more women take on more active roles in being mentors to other women by setting an example and walking the walk.


S: What advice do you give to women who want to get involved in the tech scene or learn new skills?

Y: I also talk about this in the same blog post. Go to tech meetups, contribute to the discussions, get up and present, write or respond to blog posts, etc. In all, asserting themselves in the community and making themselves seen and heard.


S: What are you excited about for 2014?

Y: For GDI in general, we’re launching seven tracks to help beginner, intermediate and advanced developers increase their technical proficiency. Everything from business and design, to front-end, back-end and mobile development and data. We’re capturing every element of getting and improving a developer’s skillset.


S: What makes Philadelphia special?

Y: This is a difficult question because I don’t know enough about other cities to compare Philly. I will say that from what I’ve heard, we seem to have a more supportive community, which aligns with our slogan (City of Brotherly Love). For a snapshot of what is special for some people, check out Philly Love Notes – it’s a great blog that captures unique perspectives from residents themselves.