Azavea Atlas

Maps, geography and the web

2015 State Legislative Vacancy Report

2015 was a turbulent year for state legislatures in the US. It is not unheard of for legislators to step down mid-term, however the past year saw a whopping 88 seats vacated by State Senators and Representatives. When an elected official steps down before their term is over, the seat is either filled by an appointment or a special election. Twenty-three states appoint successors, either through the Governor’s office, or by the party last occupying the seat. A slightly greater number of states hold special elections to fill empty seats. In this case, a district’s constituents are without representation until the newly elected official takes office, which is often months after the seat was first vacated.

Why all the Change?

A legislator may choose to leave their post for various reasons. In 2015, the top reason for open seats was due to an official being appointed or elected to an alternate position. For example, a
A graph depicting the reasons for State Legislative VacanciesGovernor or Mayor may call on a State Senator to leave the legislature to serve in their cabinet. This appointment may then set off a series of special elections, as it is common for a State Representative to run for that open State Senate position (typically referred to as a “promotion” to a higher legislature). Similarly, a legislator may leave their seat to run in another election. This happened in February in Texas SD-26 when Senator Leticia Van de Putte left her seat to run for Mayor of San Antonio. Jose Menendez, an incumbent in Texas HD-124 won that election, setting off yet another special election for his newly vacant house seat.

The second most popular reason for an elected official retiring in 2015 was to accept a job in the private sector. Some state legislatures are considered “part-time” and many officials carry additional jobs. New Hampshire’s Rockingham 20 House District Representative Dennis B. Sweeney held a zero percent voting record four months into his 2015 term. He ultimately stepped down because his work schedule prevented him from attending sessions.

Repeat Offenders

So, which states had the most special elections in 2015? 2 or 3 elections might be typical for a state in any given year, but in 2015 Georgia held a monumental 12 special elections–a recent all time high. That is the more than any state going back to 2011. Interestingly, Georgia was the winner that year as well, with 11 polls for vacated positions. Coming up second this year was Pennsylvania, with 8 special elections, 6 of which occurred in the second half of the year.

A graph showing US States that had the most special elections in 2015


88 vacant seats may seem like a lot. But how does it compare to years past? In the previous 4 years, 2011 was the only year to exceed 2015 in terms of vacant seats up for special election, with a grand total of 94. This actually makes sense. In 4 year election cycles, the year before a Presidential election (in which many other states also hold state elections) typically sees a lot of activity.
2015 Vacancies compared to the past 4 yearsThis is due in part to legislators resigning seats to run for different, open ones. The previous years, (2010 and 2014) also saw mid-term elections during which new administrations may have taken office and appointed Senators and Representatives to cabinet positions. If appointments take time, the open seats they create may not be reflected until the following year.

2013 also saw a high number of vacant seats filled with special elections, with a total of 84. Because odd years see very few state elections, Special Elections must be called, rather than an even-year option of waiting until the next scheduled general election. Many state laws specify that if a seat becomes vacant within a few months of a scheduled, general election, the seat will wait to be filled at that time. This unfortunately leaves a district without representation, but it does save the state a significant amount of money. Special elections often cost hundreds of thousands of dollars and see very low turnout from the electorate.

Moving Forward

Wondering if 2016 will be more of the same? Heading into the second month of the year, 34 state legislative seats are already open. 20 special elections have been scheduled, and 3 remain unscheduled at this time. At least 6 more open seats are awaiting appointment. And you can probably bet there will be more.

The Cicero database keeps track of Special Elections for all 50 states in the US, at the National, State, and even local levels for our largest cities. If you’re interested in learning more, give us a shout via email ( or Twitter (@CiceroAPI). We’d love to talk elections with you!


Announcing Azavea’s Nonprofit Map Readiness Program

Where are more potential clients? How do we prioritize which service locations to improve? What programs do we choose to fund? What is the demographic breakdown of our donors? If you’re a nonprofit, you’ve probably asked yourself these questions many times. And the answer you’ve found time after time requires data. But with busy schedules and tight budgets, gathering, cleaning, analyzing, and sharing data can be a burden.

We here at Azavea understand the difficulties nonprofits face when working with data. We’ve partnered with nonprofits in the Philadelphia region for fifteen years to help them better understand their data and make compelling visualizations to share internally and with the public. As part of that continued effort we’re proud to unveil our new Map Readiness Program.

Map readiness logo with charts, a report, and a graph

Azavea’s Map Readiness Program aims to help nonprofits assess their data, improve processes, find opportunities for analysis, and become more data-driven organizations. For a low one-time fee our Data Analytics team will work with you to understand your nonprofit’s aspirations, challenges, and opportunities. The Map Readiness Program will produce a report tailored to your organization’s specific needs and circumstances with recommendations, resources, project ideas, and actionable next steps. Think of it as a roadmap to becoming a more data-driven organization, drawn just for you. With the Map Readiness Program Report in hand, you’ll be able to better understand how to use your data to improve your operations and reach the people and places you serve.

To learn more about the Azavea Map Readiness Program, visit and fill out the “More Information” request form.

Meet Hopeworks: the Camden nonprofit offering GIS and web development training to youth

Hopeworks Logo, skyline, world, and mouseRecently, I was invited to tour Hopeworks ‘N Camden and had the opportunity to meet Executive Director Dan Rhoton and GIS Director Luis Olivieri. Hopeworks does truly amazing work in Camden and I hope you enjoy learning about their organization. My interview with Dan and Luis is below.  

What is the Hopeworks organization and what do you do?

Hopeworks ‘N Camden is a nonprofit that has been working for over 15 years with Camden youth.  Utilizing an advanced training curriculum in website design/development, Geographic Information Systems (GIS) and Salesforce, Hopeworks works with youth 14-23 to get back in school and find a safe pathway to their future using the Sanctuary model!   Recently, the Greater Philadelphia Chamber of Commerce named Hopeworks ‘N Camden their 2016 Nonprofit Organization of the Year!

As part of it’s program, Hopeworks runs four businesses, offering a complete array of website design & development, along with online GIS mapping products, and a complete set of Salesforce implementation, optimization, training, and support offerings.  In addition, Hopeworks Youth Healing Team offers training and consultation to schools and nonprofits about working with youth and adults affected by trauma.  These businesses help to generate over 50 youth jobs a year.  

Can you tell us a little bit about the history of Hopeworks and how GIS became one of the focuses of your training and client work?

In 1999, members of three churches in north Camden, the most blighted section of the city, came together and organized the Hopeworks program. Our original concept was to offer training to high school dropouts. During 2000, we formed a Board of Directors, hired an Executive Director, wrote grant proposals, and got startup grants from the Campbell Soup Company and the Society of Jesus.  In March 2000, we opened our doors to the first trainees and started our non-profit website design business.

Early in our experience, we learned three vital things. First, we saw that our youth needed more than training; they also needed help in setting goals, making plans, and sticking to those plans. Second, given the weak academic backgrounds of our Camden youth and the advancing complexity of website design, we realized they would need college degrees to get jobs in website design or other high-tech fields. Third, we confirmed that in-school youth needed additional motivation to stay in school.  

Thanks to Matt Grove, a Univ of Pennsylvania Grad student who loved GIS, volunteered and Hopeworks expanded its tech expertise from Web into GIS early in 2003.  Our clients have included the City of Camden, the EPA, and many others.

In 2012, we made another important discovery.  Training just was not enough.  Our youth were successful, but many more dropped out, did not complete the training, or were not successful in internships.  Technical training was not enough.  We also needed to help our youth heal.  Hopeworks made the decision to become a certified Sanctuary organization.  What does this mean?  It means we have radically shifted our values/priorities/interactions to a “trauma” framework.  

Students doing fieldwork, data collection in wooded area.

What do you consider to be your greatest successes relating to the GIS work you do?

The fact that our youth are able to collect and process data faster, more accurately, and more economically than professional staff is powerful!

Where do you see your organization, and specifically, the GIS team heading in the next five years?

Hopeworks has been invited to grow both in Camden and replicate our model regionally.  The next few years are exciting!

We would like to have a GIS Program that is recognized as a good alternative to traditional consulting services in the area of GIS.  We want people to know that when they hire our GIS Program at Hopeworks they can expect a very professional level of service, at a very competitive cost.  

How can members of the greater Philadelphia tech or GIS community volunteer their time or get involved in your work?

We have so many ways to volunteer, mentor, and support our work!  We need help to grow!

Even though GIS has been around for years it’s still a technology that is developing.  Keeping up with everything is very hard.  We could use people that can help us with trainings or just by volunteering time to help the GIS Team.  

One example of how volunteers are helping is our GIS Advisory Board, a group of well known professionals in the area of GIS from academia, government and industry. The board provides advice, support, guidance and recommendations which are key components for the development of the GIS Program at Hopeworks ‘N Camden. Chalkboard with Hopes for Camden written on

If you’re interested in volunteering, get in touch with Hopeworks! Also check out their upcoming hackathon on April 16th!

Lastly, what is your dream for Camden?  

We dream of a Camden where our youth are starting their own tech entrepreneurial ventures, healing both their families and their communities!

Registration and Conference Grant Applications Now Open for FOSS4G-NA 2016!


As you may already know, this year’s Free and Open Source Software for GeoSpatial North America Conference (FOSS4G-NA) is located in Raleigh, North Carolina and will be held from May 2-5th.  As the Community Chair of the conference, I’m really pleased to let everyone know that registration is now open!  It’s also not too late to submit a presentation proposal or vote on the proposals.

I’d also like to share some information about the conference’s scholarship program. We realize that traveling to and attending a conference can be expensive. Not everyone who uses or develops open source software has someone to pay their way for them or the means to pay for it themselves. To help, FOSS4G-NA 2016 will be offering scholarships/conference grants to some attendees in need.

What is included?

Conference grants may include one or more of the following:

  • complimentary full access conference passes
  • lodging in the conference hotel
  • In exceptional cases, we may subsidize airfare or other transportation costs.
Who is eligible?

Postcard of Raleigh NC, Photo from Boston Public Library Flickr

The travel grants are available to anyone, not just students, but their purpose is to help those attend who would not have otherwise been able to. If you have the means to attend on your own, please do not apply. Because we believe that a diverse community is a stronger community, we’ve dedicated at least 60% of our funds for women and minority attendees. We believe in working towards ending the underrepresentation of these people in the open source community.

How do I apply?
  • Apply starting on January 8th, 2016
  • Be sure to apply by January 22, 2016
  • Applicants will be notified of their status by end of February, 2016.

Visit this form to Apply NOW

We hope to see you in Raleigh, North Carolina!

Questions? Contact us at

Using PostGIS to Summarize Philadelphia Inspections Data

Violations from Nat Henry's 2015 Summer of Maps Project for Legal Clinic for the Disabled

Violations from Nat Henry’s 2015 Summer of Maps Project for Legal Clinic for the Disabled

In August of 2015, Summer of Maps Fellow Nat Henry wrote a Python script to collect the entire universe of available Philadelphia License and Inspections Department data and hosted it on Github as part of a project for the Legal Clinic for the Disabled. There were more than 1 million inspection records in the database from January 2003 to August 1, 2014. The combined CSV text file is about 330mb in size, and balloons to more than 5 gigabytes when converted into a shapefile. A CSV that large will frequently crash Excel, and a million record file starts to slow desktop GIS noticeably (and also quickly fill up a hard drive with intermediate files). I wanted to try and use the spatial Postgres plugin, PostGIS to more efficiently clean up and query the data.

Getting the Data

You’ll first need to download the data from the Summer of Maps Github page specifically for Licenses and Inspections. After you unzip the data locally on your computer, you’ll see that it has been split into a 12-part file, and a 3-part file. I decided to use the 3-part file, and used Excel to copy-paste the data into one spreadsheet. You’ll now have one unwieldy 330mb CSV. I already had the Philadelphia Neighborhoods shapefile from Azavea loaded into my database. Now let’s get the inspections data into a PostGIS-enabled Postgres database.

Loading into PostGIS

In a previous blog post I described how to use QGIS to load spatial data into PostGIS databases. I followed that same method here, loading the CSV into QGIS and converting it to a shapefile, and then importing it into PostGIS with QGIS’ SPIT function. There are definitely other and better ways of doing this, but using QGIS is one of the most straightforward.

Cleaning the Table

If all goes well, you should now be able to use QGIS’ DB Manager utility, or PGAdmin to explore your database and begin writing SQL queries. First I dropped a number of unnecessary fields to cut down on clutter and reduce table size:

ALTER TABLE incidents_wgs84
DROP "ViolationI",
DROP "ViolationL",
DROP "CaseNumber",
DROP "LocationID",
DROP "CaseType",
DROP "CaseLocati",
DROP "ViolationD",
DROP "LastUpdate",
DROP "Resoluti_1",
DROP "StNum",
DROP "StDir",
DROP "StSuffix",
DROP "field_28"

Converting the Date Field

There are over a million records in the database, so I wanted to limit my queries to any inspections from 2014 and later. This turned out to be much more difficult than expected, mainly because the date is stored in a Unix format called, ‘Epoch’, where the dates are stored as milliseconds since 1970, not to mention the field had some extra characters that needed to be removed.

I removed the extra characters from the “AddedDate” field with the following two queries:

update incidents_wgs84 SET "AddedDate" = REPLACE("AddedDate", '/Date(','');
update incidents_wgs84 SET "AddedDate" = REPLACE("AddedDate", ')/','');

Now we should just be left with a giant string of numbers, still stored incorrectly. The next query will convert the string to a big integer (to store huge numbers), and then convert to a timestamp. It also uses a conditional regex statement I found on StackExchange to handle null value records so they don’t break the query.

I start out queries using a “WITH” statement. This is from a special syntax called Common Table Expressions (CTE) that allows you to write more procedural-style SQL queries. While less efficient, the queries are much easier to read and construct. You can learn more about CTEs at Craig Kerstiens’ blog.

WITH clean_table AS
(SELECT *, to_timestamp("AddedDate"::BIGINT/1000) FROM incidents_wgs84 WHERE "AddedDate" ~ E'^\\d+$')

This cleans up and converts the date field into an SQL variable called “clean_table” which I’ll continue to build on in this post.

This next query uses the “clean_table” and selects only those dates after December 31, 2013 to narrow down the number of records:

recent_violations AS
(SELECT * FROM clean_table WHERE to_timestamp > '2014-12-31')

When I put both queries together, it should look like this:

WITH clean_table AS
(SELECT *, to_timestamp("AddedDate"::BIGINT/1000) FROM incidents_wgs84 WHERE "AddedDate" ~ E'^\\d+$')

recent_violations AS
(SELECT * FROM clean_table WHERE to_timestamp > '2014-12-31')

Great, now we can build on those two queries to start actually asking questions about the data.

Summarizing the Data

One of the best parts about Postgis is that you can ask questions fairly quickly without any of the interstitial cruft that desktop GIS tends to accumulate. If you catch a mistake or want to change something in PostGIS you can just modify the code and run it again. Meanwhile, in desktop GIS you might create five different intermediate layers only to realize you made a mistake and have to repeat everything. Let’s build some Postgis queries to summarize the 2014-2015 L&I data.

Violations Per Neighborhood

This query performs a spatial join between the L&I violations and the Philadelphia neighborhoods shapefile. The


nested inside the spatial intersect (ST_Intersect) ensures that the incidents are projected into the local Philadelphia state plane coordinate system, EPSG 2272, to match the neighborhoods shapefile. The full query is below:

WITH clean_table AS
(SELECT *, to_timestamp("AddedDate"::BIGINT/1000) 
FROM incidents_wgs84 WHERE "AddedDate" ~ E'^\\d+$'),

recent_violations AS
(SELECT * FROM clean_table WHERE to_timestamp > '2014-12-31')

SELECT hoods."MAPNAME", COUNT(recent_violations.gid)
FROM recent_violations
JOIN "Neighborhoods_Philadelphia" AS hoods
ON ST_Intersects(ST_Transform(recent_violations.the_geom,2272),hoods.the_geom)
ORDER BY COUNT(recent_violations.gid) DESC
Limit 10

And here’s the result (I copied the table and used the Tables Generator website to convert it to Markdown):

Neighborhood Violations
Strawberry Mansion 2223
Mayfair 2082
Wissinoming 1983
Cobbs Creek 1882
Oxford Circle 1869
Point Breeze 1830
Frankford 1764
West Oak Lane 1759
North Central 1737
Haddington 1655

Most Common Violation Per Neighborhood

It’s great to see the number of violations per neighborhood, but I wanted to dig a little bit deeper. For instance, what is the most frequent violation per neighborhood? While Postgres doesn’t have a built in mode (statistically most frequent) function built in, I approximated one with this helpful stackoverflow post. I continue to use the CTE query style, calling the spatial join I did in the previous step “AS spatial_join” to use the result later for the SELECT DISTINCT:

WITH clean_table AS
(SELECT *, to_timestamp("AddedDate"::BIGINT/1000) 
FROM incidents_wgs84 WHERE "AddedDate" ~ E'^\\d+$'),

recent_violations AS
(SELECT * FROM clean_table WHERE to_timestamp > '2014-12-31'),

spatial_join AS (
SELECT hoods."MAPNAME", COUNT(*) as violcount, "Violatio_1"
FROM recent_violations
JOIN "Neighborhoods_Philadelphia" AS hoods
ON ST_Intersects(ST_Transform(recent_violations.the_geom,2272),hoods.the_geom)
GROUP BY hoods."MAPNAME", "Violatio_1"
ORDER BY violcount DESC)

FROM spatial_join
ORDER BY spatial_join."MAPNAME", spatial_join.violcount DESC, spatial_join."Violatio_1"

This query will return a table with every single neighborhood in Philadelphia that had recorded violations, with the count of and name of the most frequent violation. The following table is random selection from the most frequent violations per neighborhood. It appears that the most frequent violation per neighborhood across the city is usually a citation by the Community Life Improvement Program (CLIP), an anti-blight and community beautification effort by the City of Philadelphia.

Neighborhood Most Frequent Violation Count
Dickinson Narrows CLIP VIOLATION NOTICE 13
Northern Liberties ANNUAL CERT FIRE ALARM 25

Most Dangerous Violations Per Neighborhood

A photograph of a derelict building

Luckily, the table also has a “Priority” field, where violations are classified in a few different categories. I modified my original spatial join query to add a WHERE clause that looks for the “IMMINENTLY DANGEROUS” priority:

WITH clean_table AS
(SELECT *, to_timestamp("AddedDate"::BIGINT/1000) 
FROM incidents_wgs84 WHERE "AddedDate" ~ E'^\\d+$'),

recent_violations AS
(SELECT * FROM clean_table WHERE to_timestamp > '2014-12-31')

SELECT hoods."MAPNAME", COUNT(*) as violcount
FROM recent_violations
JOIN "Neighborhoods_Philadelphia" AS hoods
ON ST_Intersects(ST_Transform(recent_violations.the_geom,2272),hoods.the_geom)
WHERE recent_violations."Priority" SIMILAR TO 'IMMINENTLY DANGEROUS'
ORDER BY violcount DESC

The resulting table contains an entry for every single neighborhood in the city with at least one Imminently Dangerous Violation from 1/1/2014 to 8/1/2015. I used QGIS to export the query result as a layer, and then brought the resulting shapefile into CartoDB through the handy CartoDB QGIS Plugin. One major benefit of using a PostGIS database to perform these kinds of spatial queries is space savings: the file I uploaded to CartoDB was only 140kb, which is amazing considering it started out as a 5.2 gigabyte shapefile. Here is the map of Imminently Dangerous Violations per Neighborhood on CartoDB:


The License and Inspection data that Nat scraped is a great resource for civic hackers and data analysts. But at over 5 gigabytes as a shapefile, it can get unwieldy very quickly. Using PostGIS to perform some initial transformations and simple queries is a great way to get started digging deeper into the data without filling up a hard drive. Finally, Common Table Expressions are an excellent way of keeping your PostGIS queries legible. Try taking PostGIS for a spin with the License and Inspections data. Write me @geotyler on Twitter if you have any questions.