Articles by Robert Cheetham

OpenDataPhilly.org Launches Today

I’m excited to announce that we rolled out a new open data portal for the Philadelphia region today, OpenDataPhilly.org. Open data and government transparency have been increasingly visible concerns over the past few years. The City of Philadelphia was once a leader in this respect. The municipal government made its GIS data available to the public at no charge almost 10 years ago, and, at the time, was one of the first and largest municipalities in the world to do so. In order to do this, City staff worked through a number of challenging issues that included liability, homeland security and development of a common standard and process for vetting and releasing new data sets. That data has been available on PASDA, the state spatial data clearinghouse for Pennsylvania, ever since.

In the past few years, many municipal governments have been making a public and concerted effort to improve the transparency of their government operations by releasing significant and useful data sets. Washington DC deserves credit for playing a leadership role in this respect. DC was arguably the first major city to not only release downloadable data sets but create real-time streams of data from operational databases. Today the District provides access to 475 datasets from multiple agencies and in a variety of formats: CSV, RSS, KML, XML and shapefiles. In 2008, they doubled down. To increase exposure and expand usage, the government sponsored a contest, Apps for Democracy, to encourage software developers to create useful applications that consumed this data. The leader of that effort, Vivek Kundra, went on to become CIO under President Obama. In May 2009, the federal government launched Data.gov with just 47 data sets. Today there are 380,000 data sets (of which more than 376,000 are geospatial).

Many other cities have followed suit. A few of the most significant include:

And other organizations are getting into the act. The UK launched data.gov.uk in January 2010. The World Bank not only has a great data site, they’ve also sponsored a contest to encourage the development of new applications that use that data. The FCC has an open data site as well as a set of developer APIs. And the app contests have become sufficiently numerous that they are even starting to feel passé.

Philadelphia has been missing from the list. While the City was an early and unsung leader 10 years ago for releasing its GIS data, these recent efforts by other governments have left it far behind. There is no Philadelphia Open Data web site. But there are a lot of people who want to see that change. A BarCamp in late 2009, RefreshPhilly.org, Philly Startup Leaders, Young Involved Philadelphia and other groups have pushed repeatedly for this type of government transparency through publication of operational data. So why is Azavea building this? Well, we really have Roz Duffy to thank. She encouraged me to get involved with the Open Access Philly task force. I attended my first meeting in January and was impressed by the range and diversity of the people who have been attending these meetings. After the first meeting, I felt like Azavea was actually in a good position to create something that would both serve to bring the various City data sets together in a single catalog as well as extend the catalog to other resources.

While the Open Access Philly task force advocated for an online catalog of data, OpenDataPhilly.org is not a City project. The City government doesn’t have the resources to build something right now. I’m proud that Azavea is building this initial version, but, that said, this is not a typical project for us. That’s good and bad. We don’t build open data portals – we build spatial data analysis and visualization tools. And when I ask my colleagues to work on something that isn’t our main focus, it’s distracting and makes us all less productive. And we are a small company that can only afford to do a certain amount of pro bono work in a given year. And, in the long run, I’m not sure it’s actually a good idea for an open data catalog to be operated by a private firm.

Nonetheless, I felt this was important for a number of reasons. First, I kept hearing other technology people in the region lamenting how we were being left in the dust. That’s sad because there’s actually far more data available than most people realize. Second, much of Azavea’s work depends on open standards and the broad availability of useful data sets. By making it easier to find data, we are supporting the ecosystem that supports our business. Third, I buy into the idea that open government encourages both better government and a more engaged citizenry.

Because Azavea is not the City, OpenDataPhilly.org is different from other open government data portals. We have taken a look at a lot of these web sites, and we’ve done our best to incorporate what we thought were the best parts. But we’ve decided to try some different ideas that we hope will make the catalog more useful. First, the catalog is not limited to data from the municipal government – we have also incorporated data from non-profits, universities and commercial organizations. Second, this catalog is not just about downloadable data sets; we’ve also included data-centric web and mobile applications as well as developer-oriented APIs and other structured data feeds. Third, we realize that data for its own sake is not really all that helpful. To be useful, the data needs to actually be put to use in new applications, visualizations and stories. So the OpenDataPhilly.org site includes an Idea Gallery, a feature similar to London’s Inspirational Uses page.

These departures from the usual government-sponsored open data catalog have created opportunities, but they have made our task somewhat more difficult. Since we didn’t limit ourselves to government data sources, we needed to both track down these other data sets and develop a series of guidelines to determine what goes in and what doesn’t. I’m sure we missed a lot, and I don’t know if we got the guidelines right. We also didn’t have a lot of material for the Idea Gallery to start out, so we needed to develop some placeholder material. And, as I mentioned above, in the long run, I’m not sure Azavea is the best home for such a project. I think the best home might be a non-profit organization for which transparency and citizen engagement are part of its mission – perhaps a non-profit news organization or a similar entity.

What’s in it?

As our starting point, we took the extensive set of geospatial data sets that were already available on PASDA. We didn’t limit ourselves to City sources; we also added material from DVRPC, the USGS and other organizations when that data was specific to Philadelphia. We added several data-centric applications deployed at the City as well as some applications developed by local universities that use government data. We also included some of the resources we had discovered while working on a data inventory for the WHYY Newsworks web site last summer. OpenDataPhilly is not only a catalog of existing data sets, applications and APIs, it also includes a series of new geodata APIs that the City has implemented over the last few weeks. So the act of constructing the catalog has inspired the City to release some data sets in a new and useful way. That’s pretty exciting. From our perspective, that means the effort is already a success.

How did we build it?

This is not really a geospatial data application, so our usual tools were not going to be appropriate. Since OpenDataPhilly.org will primarily direct people to other data sets, it doesn’t need a lot of processing power. But we’re going to be maintaining this for at least the next few months, so we needed some simple and straightforward content management features. We settled on the following technology mix:

Why now?

Sometimes it’s good to have a deadline. Today’s rollout was timed to coincide with Philly Tech Week, a week-long celebration of technology and innovation in Philadelphia organized by TechnicallyPhilly. Open data serves as bookends for the week. Azavea is rolling out OpenDataPhilly.org today. On Saturday as part of the BarCamp NewsInnovation at Temple University, Tropo is organizing an Open Government Hackathon. The Hackathon will aim to build new applications that use the data listed in the catalog. We’ll be involved in some other events this week. There’s a full summary in a blog from last week.

Acknowledgments

While the City didn’t pay for the development of OpenDataPhilly, that doesn’t mean they didn’t make important and significant contributions. Jeff Friedman (City OIT) and Paul Wright (Fuzebox) have been organizing the Open Access Philly meetings for more than a year, and these meetings were the catalyst that got us moving. Several staff at the City’s Office of Information Technology, including Stuart Alter, Paul Wright, Jim Querry, Brian Ivey, Walter Svekla and others, have supported the OpenDataPhilly rollout and development through encouragement, suggestions and the hard work required to roll out these new geodata APIs. The vast majority of the data sets are ones to which a legion of City employees and residents have contributed over the course of many years. The William Penn Foundation has recently awarded a grant to NPower PA to encourage use of the data catalog and to support the implementation of the OpenDataPhilly features related to developing a community around the web site. And a large community of people have also contributed advice, encouragement, feedback and data sets to the effort. An incomplete list includes: Johnny Bilotta (developed early version of OpenDataPhilly logo); Roz Duffy; Mark Headd (Tropo); John Mertens, Mjumbe Poe and Aaron Ogle (Code for America fellows); Chris Wink (Technically Philly); and Deb Boyer, Carissa Brittain, Brian Jacobs, Rachel Cheetham-Richard, Claire Connelly, Abby Fretz, Jamal Alsarraj, Dana Bauer and Tamara Manik-Perlman (some of the Azavea folks who worked on the project).

Where do we go from here?

So OpenDataPhilly.org is released. What happens now? That depends on you. A catalog won’t be much use without people using and contributing to it. Want to get involved? Here are a few ways:

  • Show up on Saturday for the Hackathon and join a team.
  • Got data? We know we probably missed a bunch of useful data sets. There is a page for organizations to submit information about their data sets for inclusion in the catalog.
  • Is a critical data set missing? We also have a way for you to ask for missing data sets and vote on other people’s requests.
  • Write to your city, state and federal legislators and ask them to support open government data policies. We can help you with that too. Check out Azavea’s Cicero API.
  • Say something with the data. Download some data and develop a beautiful visualization that tells a story. Then submit it to the Idea Gallery.
  • If you are a developer, build some apps that use the data. Or, better yet, apply for Code for America, an innovative approach to public service where you can apply your skills to making government work better for everyone.
  • OpenDataPhilly.org needs a home. We’ve created it, but we don’t think we should own it in the long run. We’re ready to give it away. We estimate it’s going to be a few hours a week to maintain this. If you think you have a good home for it, we’d like to hear from you.

Esri Partner Conference and Dev Summit 2011

I just returned from the annual Esri Partner Conference and Developer Summit and wanted to jot down a few notes. The Partner Conference plenary was both exciting and stressful (for me). The layout and format were a significant departure from past events. Instead of the rows of chairs that are the usual layout, there were beanbags and couches (and even bleachers for the Esri staff). There were also dual stages with a small, circular TED-style “forum stage” placed in the middle of the audience. The lineup was a mix of reports from Esri and sneak peeks at future directions, interspersed with short “insightful ideas” and stories from Esri staff and partners. It was stressful for me because I was giving one of those three-minute “insightful ideas” talks (mine was about our B Corp status and a custom partner newsletter we prepare for Esri each month).

I think the event was a significant success.  Kudos to the Esri staff responsible for setting it up – there were a lot of really great ideas that made it fun to attend.  The highlights I saw last week included:

  • Ismael Chivite talking about how the Table-of-Contents and the Identify Button represent crappy design.  Developers need better friends and those friends should be Designers.
  • Demo of ArcGIS Server 10.1 in which a drive-time polygon and population summary was being calculated so fast that it could respond to the mouseover event as the cursor passed over the map…and do so with a national scale road centerline with 43 million segments – very impressive.  60 millisecond response time.  It was a great illustration of how high performance geoprocessing is not just faster, it changes what is possible from a user experience perspective.
  • ArcGIS Server 10.1 will be faster for many types of features – 64-bit goodness plus lots of work in the server.
  • Simpler architecture, fully REST-ful architecture – SOM, SOC, DCOM and Java dependencies are all gone.
  • Other ArcGIS Server 10.1 improvements will include: broad printing/PDF support; dynamic symbology (the final feature we had in ArcIMS that has been really hard in ArcGIS Server); private clouds; and WPS support (yeah!).
  • Lauren Rosenshein showed some really interesting ideas around geoprocessing packages that combine models and data for sharable processing elements.
  • Lots of love for Python in 10.1 including better ArcPy, faster cursors and NumPy support.
  • There are more than 11,000 public objects in ArcGIS 10.0.
  • New, simpler ArcGIS runtime that can be installed on Win or Linux with a simple file copy (and it’s smaller than Adobe Acrobat).
  • Demo by Morten Nielsen (@sharpgis) of a Kinect with OpenNI drivers being used to control a map with gestures – very cool – the YouTube video below is from a month or so ago, but you’ll get the idea.
  • Talk by Tim O’Reilly (@timoreilly) and Jennifer Pahlka (@pahlkadot) on the Code for America program. Azavean and CfA fellow, Aaron Ogle (@atogle), was there with CfA fellow, Ryan Risella (@RyanRisella), presenting some of the work they’ve already accomplished since January.
  • Tim O’Reilly also did a great talk in the Dev Summit plenary (starts at about 12 minutes into the 2nd video), where he spoke about a) the Internet as an operating system (with location as an important sub-system); b) government as a platform (with GPS as a prime example – a risky but innovative platform provider); and c) doing work that matters.
  • Met Eric Rodenbeck from Stamen Design – I’ve admired Stamen’s work for years.  I liked the structure of their talk, which used a very compelling diagram that related speed to power (fashion and business move at relatively high speed but have little long-term power, while nature and culture shift only very slowly but are enormously powerful), and gave lots of love to the many people that work hard to create data and systems that Stamen uses in their work.
  • Some excellent photography of the Partner Conference and Dev Summit has been posted on the respective home pages.

Esri Removes Usage Limits on ArcGIS Online Base Maps

Esri announced on Friday that they are lifting most of the usage restrictions on ArcGIS Online map services. As of February, ArcGIS Online base maps hosted by Esri will be freely available to all users, regardless of the use (commercial, non-profit, internal, external, etc.). The only restrictions will be on very high volume transactions of 50 million or more per year. While some of these services could be better, others have really terrific cartography. I really like the World Topographic Map, particularly for communities that have contributed to the Community Maps Program. And I remain excited that Esri is supporting OpenStreetMap as a base map option.

ArcGIS Online is evolving into an increasingly useful service with not just base maps but also high quality, specialized data sets, such as the US National Wetlands Inventory or the US National Soil Survey Map.   There is also the ability to embed the maps in personal web sites.  The ArcGIS Online blog has a nice set of examples for how these capabilities can be applied to a number of different scenarios.

ArcGIS Online Soil Survey with OSM base map

We have found ArcGIS Online to be useful for several of our projects, particularly those that need a high-quality base map with good cartography but for which there is no budget or no need for an actual web map server.  Since we frequently use the OpenLayers javascript library for many of these projects, we have recently submitted a new feature to the OpenLayers project that adds tiling support for ArcGIS Online base maps.  There’s more on the OpenLayers submission in a post by David Middlecamp on our Labs blog.
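For readers curious what tiling support involves on the client side, here is a minimal Python sketch of turning a longitude/latitude into a tile request. It is not the code from the OpenLayers submission. It assumes the base map is cached in the common Web Mercator tiling scheme (2^level tiles per side, rows counted from the top) and that tiles are addressed as tile/{level}/{row}/{column} under the map service URL. The service URL and function names are illustrative.

```python
import math

# Illustrative service root (an assumption); substitute the map service you use.
SERVICE_URL = (
    "http://server.arcgisonline.com/ArcGIS/rest/services/"
    "World_Topo_Map/MapServer"
)


def lonlat_to_tile(lon, lat, level):
    """Return (row, col) of the Web Mercator tile containing lon/lat."""
    n = 2 ** level  # number of tiles per side at this level
    col = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    # Mercator y, normalized so 0 is the top edge and 1 is the bottom edge.
    row = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so lon = 180 or extreme latitudes stay inside the grid.
    col = min(max(col, 0), n - 1)
    row = min(max(row, 0), n - 1)
    return row, col


def tile_url(lon, lat, level, service_url=SERVICE_URL):
    """Build the URL of the cached tile that covers the given location."""
    row, col = lonlat_to_tile(lon, lat, level)
    return f"{service_url}/tile/{level}/{row}/{col}"


# Example: the tile covering central Philadelphia at level 12.
print(tile_url(-75.16, 39.95, 12))
```

A tiling client does essentially this for every tile that falls inside the current map view, then places the returned images side by side.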

Esri File Geodatabase API Released

Over the holidays, Esri pre-announced a beta delivery date for the File Geodatabase API and today it was released in beta. The shortcomings of the shapefile have been apparent for a decade or more, but it’s less clear to me why something has not taken its place. SQLite Spatial has been a potential open source option, but it’s not one that has taken off. Esri’s File Geodatabase (FGDB) has had a great deal of potential as an alternative because it is:

  • Cross-platform – runs on Windows and Linux
  • Supports many data types including raster, vector, networks, 3D, relationships
  • Doesn’t require a full relational database (Oracle, SQL Server, MS Access, etc.)
  • Lots more headroom in terms of the size of the database than the shapefile ever had
  • High performance (Esri recommends considering File GDBs over SDE under some high capacity server scenarios)
  • Support for editing

But since its introduction at ArcGIS 9.2, we’ve only been able to use the File GDB via ArcObjects. Enterprise Geodatabases (née ArcSDE) have had a very useful C API for many years, and there’s been significant demand for something similar for the File GDB. Such an API would enable the potential replacement of the shapefile with a much more sophisticated cross-platform interchange format.

So, as I was saying, the Esri GDB team released some information in mid-December and released the API in beta today.  You’ll be able to use the API as a C++ library.  We now know that this initial version of the API includes:

  • Create, Open and Delete file geodatabases
  • Read the schema of the geodatabase
  • Create new schemas for simple features (tables, points, lines, polygons)
  • Read feature classes
  • Insert, Update, Delete support for simple features (tables, points, lines, polygons)
  • Perform attribute and some limited spatial queries

There are some limitations:

  • No editing for complex feature types – annotation, networks, topologies, terrains, representations and parcel fabrics
  • No raster support (bummer)
  • Only very limited spatial query functions (envelope intersects only)
  • Only supports ArcGIS 10 File GDBs
  • Only supports Windows (Linux support has been promised in a subsequent release)
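To make the “envelope intersects only” limitation concrete: such a query compares bounding boxes rather than exact shapes. The Python sketch below illustrates the idea only. It does not use the File Geodatabase API (which is a C++ library), and the Envelope type, the function names and the sample features are all hypothetical.

```python
from typing import NamedTuple


class Envelope(NamedTuple):
    """Axis-aligned bounding box."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def envelope_of(points):
    """Bounding box of a sequence of (x, y) vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return Envelope(min(xs), min(ys), max(xs), max(ys))


def envelopes_intersect(a, b):
    """True when the two boxes overlap or touch."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def envelope_query(features, search):
    """Names of features whose envelope intersects the search envelope."""
    return [name for name, points in features.items()
            if envelopes_intersect(envelope_of(points), search)]


features = {
    "diagonal_road": [(0, 0), (10, 10)],
    "small_park": [(20, 20), (22, 20), (22, 22), (20, 22)],
}
# The road never enters the box (8, 0)-(10, 2), but its envelope does.
print(envelope_query(features, Envelope(8, 0, 10, 2)))
```

The diagonal road is returned even though the line itself never passes through the search box, which is why an envelope test is usually treated as a fast first filter rather than a final answer.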

This API is something that people have been requesting for years. Why the heck did it take so long? My guess is that Esri developers needed to stabilize the internal structures before releasing an API for reading and writing those structures. The fact that there is only support for File GDBs created from ArcGIS 10 suggests that this may be correct.

So it’s out in beta now.  Go get it while it’s hot.

Create Neighborhood Maps with OSM and MapOSMatic

As regular readers of my articles may have noted, I’m a big fan of OpenStreetMap. I recently discovered a pretty cool service, MapOSMatic, that enables you to generate customized maps of neighborhoods and cities using the OSM database.  Each map generates two files:

  • the map with a label and border, nicely organized with a lettered and numbered grid
  • an index with the street names

You can generate maps in PNG, PDF or SVG formats, and the PDF versions are generated with vector graphics and text objects, so they can be printed at any resolution.   Further, since the data is available under the open OSM data license, you can re-use and distribute as you see fit.
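As a rough illustration of how a street index can be tied to a lettered and numbered grid, here is a small Python sketch. It is not MapOSMatic’s code. It assumes columns are lettered west to east, rows are numbered north to south, and the grid has at most 26 columns. The function names and sample coordinates are made up.

```python
import string


def grid_label(x, y, bbox, cols, rows):
    """Return a label such as 'B3' for the grid cell containing (x, y)."""
    xmin, ymin, xmax, ymax = bbox
    if not (xmin <= x <= xmax and ymin <= y <= ymax):
        raise ValueError("point lies outside the map extent")
    # Points on the east or south edge belong to the last column or row.
    col = min(int((x - xmin) / (xmax - xmin) * cols), cols - 1)
    row = min(int((ymax - y) / (ymax - ymin) * rows), rows - 1)
    return f"{string.ascii_uppercase[col]}{row + 1}"


def street_index(streets, bbox, cols, rows):
    """Return (street name, grid cells) pairs sorted by street name."""
    index = []
    for name in sorted(streets):
        cells = sorted({grid_label(x, y, bbox, cols, rows)
                        for x, y in streets[name]})
        index.append((name, cells))
    return index


streets = {
    "Spring Garden St": [(10, 50), (60, 50)],
    "Green St": [(30, 60)],
}
for name, cells in street_index(streets, (0, 0, 100, 100), 4, 4):
    print(name, ", ".join(cells))
```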

Based on an idea articulated by Gilles Lamiral, an OSM contributor in France, the application was developed by a small team during a one-week “hackfest” in August 2009.  The initial version was limited to French and English and was based on a static database. A second hackfest in December 2009 added daily OSM data updates, global coverage and redesigned UI.  To the original French and English versions, translations have been added  for a growing list of languages including: Spanish, Catalan, German, Italian, Russian, Arabic, Danish, Dutch, Portuguese, Croatian and Polish.  Features planned for the near term include adding legends, paper size selection, configurable styling, options for displaying different amenity layers and support for multi-page maps.

In a couple of minutes I was able to generate a map and street index of my neighborhood in Philadelphia.

Spring Garden Neighborhood Map

Spring Garden Street Name Index

And because OSM is global, it works the same way everywhere in the world.  Check out the neighborhood where my father grew up in Loughborough, England – the OSM map is sufficiently awesome in Europe that the building footprints are there for the entire downtown area.

MapOSMatic rendering process

map of loughborough neighborhood

ArcSquirrel, ZigGIS, OSM and Alternative ArcGIS Editors

I recently listened to a DirectionsMag podcast regarding a new product by a Welsh company, exeGesIS, called ArcSquirrel.  Apart from an awesome company name and humorous product name [since we changed our company name, I think a lot about names.  How cool is "exegesis" for a GIS company?], it’s a plugin for ArcGIS desktop that enables direct editing and data management of SQL Server 2008 spatial data layers.

While I think it is really great that Microsoft implemented a spatial data type as part of their flagship SQL Server database product, the initial release was a somewhat crippled product.  You could query spatial data stored in SQL Server using a wonderful series of extensions to the SQL language, but MS did not package any tools to actually load data.  Further, the ADO.Net and LINQ database access frameworks didn’t really support the new spatial data types very well either.  Some open source spatial data tools were posted on CodePlex and that was useful, but there weren’t really great tools for editing the data directly.

Enter ArcSquirrel. This extension for the Esri ArcGIS desktop tools will enable you to edit the SQL Server spatial data columns using your favorite desktop GIS tools. ArcSquirrel adds a new toolbar to the ArcMap application as well as tools for loading GIS data to SQL Server, support for multi-user editing, metadata integration with ArcCatalog and support for joins and spatial functions. At $240/seat, it’s pretty affordable.

This is not the first such specialized GIS data editor that extends the ArcGIS desktop product. Obtuse Software has created ZigGis, an extension to ArcMap for editing PostGIS data. More recently, Esri has developed and released an open source extension to ArcMap that supports editing the OpenStreetMap database. I’m particularly impressed that Esri has not only created an extension for OSM, but has elected to release it under an open source license. The beta version was released in July and the 1.0 release was out last week. Software like this is going to enable the hundreds of thousands of ArcGIS desktop users to contribute to the global OpenStreetMap database and thereby make it more useful for everyone. Based on the Esri demo at the US State of the Map event, Randal Hale has written up a nice review of the extension. Kudos to Marten Hogeweg and his colleagues at Esri.

UPDATE:

10/13/2010: ArcSquirrel has released an API that enables programmatic control over the data management and editing process.

Geoprocessing and the Esri GeoServices REST API

In my previous article, I wrote about the Open Geospatial Consortium (OGC) Web Processing Service (WPS) standard and how it can be used to enable different geographic data processing capabilities to work together.  In this article, I’m going to discuss a second example that has been under development by Esri for a few years, but was just released as a published specification.  At this summer’s International User Conference, Jack Dangermond announced that Esri would be publishing a REST API as a new standard.  A couple of weeks ago, Esri made good on that promise and released the GeoServices REST Specification.

What the heck is the GeoServices REST Specification?

While I’ll admit that I have not read the entire 220-page specification document, I’ll try to summarize the salient points. First, I should note that while I’m pairing this blog post with a related one on WPS, I do not see the GeoServices REST spec as an alternative to WPS. It’s actually much broader. And, unlike WPS, one could probably make the case that it’s already in fairly wide use by a large community. The spec hews closely to the ArcGIS Server REST API that is already supported by Esri’s entire client product line, including the Flex, Javascript, Silverlight, iOS and Android APIs as well as the ArcGIS Desktop, Engine and Server products. Anyone that elects to implement this new GeoServices REST spec will basically have a huge built-in client base that can take advantage of their services.

Rather than an alternative to WPS, one might actually see this as an alternative to the WMS, WCS, WFS, WPS and Catalog standards while also providing services for which there are no existing OGC standards, such as geocoding.  The REST-based specification supports JSON, HTML and KMZ responses, with JSON being the default format.  The full list of service categories includes:

  • Catalog Service – a list of available services.
  • Map Service – make maps as well as query, ID and other map functions; much like WMS, though with more functionality.
  • Geocode Service – turn addresses, intersections and place names into map coordinates; also includes reverse geocoding.
  • Geoprocessing Service – you can probably guess that this is my favorite service; both synchronous and asynchronous execution of tasks; this is the service that most closely resembles WPS.
  • Geometry Service – utility functions for commonly used vector geometry operations such as reprojection, simplify and densify, buffer, area/length calculation, label points for polygons, distance calculation, generalize, trim/extend, convex hull, cut, difference, intersect, union and reshape; these could also be implemented as WPS services (or through the Geoprocessing Service) but these are provided as a lighter weight, easy-to-use set of utilities; there’s a lot of overlap here with JTS and NTS and one could imagine a rapid implementation of this service using these toolkits plus a projection engine.
  • Image Service – provide access to existing imagery, in particular raster catalogs and mosaicked images; this service also includes local and neighborhood transformations of the imagery, such as recolor, hillshade, slope, aspect, NDVI, statistics, stretch and identify functions.
  • Feature Service – provides functions for querying and editing vector features stored in a geodatabase; the closest OGC equivalent is WFS.
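To give a feel for what requests to these services might look like, here is a short Python sketch that builds request URLs. Treat it as an assumption-laden illustration rather than a reading of the specification: the host, the “Basemap” service name, the URL layout (catalog root, then service name, service type and operation) and the export operation with its bbox and size parameters are all modeled on the ArcGIS Server REST API that the spec is said to follow closely. The f parameter asks for JSON explicitly instead of relying on a default.

```python
from urllib.parse import urlencode

# Hypothetical server root; a real implementation would publish its own.
BASE_URL = "http://example.com/arcgis/rest/services"


def service_url(base, name=None, service_type=None, operation=None, **params):
    """Build base[/name/type][/operation]?params, requesting JSON output."""
    parts = [base.rstrip("/")]
    if name and service_type:
        parts += [name, service_type]
    if operation:
        parts.append(operation)
    query = {"f": "json"}  # ask for JSON explicitly
    query.update(params)
    return "/".join(parts) + "?" + urlencode(query)


# Catalog Service: list the services the server offers.
catalog_request = service_url(BASE_URL)

# Map Service: request a map image for a bounding box (names are assumed).
map_request = service_url(
    BASE_URL, "Basemap", "MapServer", "export",
    bbox="-75.3,39.85,-74.95,40.15", size="400,300",
)

print(catalog_request)
print(map_request)
```

Because every service category hangs off the same catalog root and shares the same output parameter, a client that can talk to one service needs very little extra code to talk to the others.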

What will this mean?

On its own, the GeoServices REST spec does not mean much. It will need a community of developers that are willing to implement the specification. That will mean building back-end server processes that will respond to requests made according to the specification. The open question is whether developers will embrace the standard and whether it will catch on in the marketplace. That’s obviously impossible to answer right now, but some of the potential can be seen in Brian Flood’s work on the Arc2Cloud product. Brian and his brother have got to be feeling pretty smug at this point. By implementing many parts of the ArcGIS Server REST API, his Arc2Cloud product already supports the majority of the GeoServices REST specification with the server processes running in the Google App Engine cloud computing infrastructure. This is a very compelling concept – build geoprocessing services that operate against cloud infrastructure but enable many, many people to use them by doing so on top of an established standard.
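As a toy illustration of what “responding to requests made according to the specification” could mean, the Python sketch below dispatches a request path to a JSON response. It has nothing to do with how Arc2Cloud or ArcGIS Server is built. The service registry is invented, and the response field names (folders, services, serviceDescription, error) are assumptions modeled on the ArcGIS Server REST API and have not been checked against the spec.

```python
import json

# Hypothetical registry of services this toy back end exposes.
SERVICES = [
    {"name": "Parcels", "type": "MapServer"},
    {"name": "Buffer", "type": "GPServer"},
]


def handle_request(path, params):
    """Return (status, content_type, body) for a path and query parameters."""
    if params.get("f", "json") != "json":
        return 400, "text/plain", "only f=json is supported in this sketch"
    path = path.rstrip("/")
    if path == "/rest/services":
        # Catalog request: list everything we offer.
        body = {"folders": [], "services": SERVICES}
        return 200, "application/json", json.dumps(body)
    for svc in SERVICES:
        if path == f"/rest/services/{svc['name']}/{svc['type']}":
            # Service request: return a stub description.
            body = {"serviceDescription": f"{svc['name']} (stub)"}
            return 200, "application/json", json.dumps(body)
    error = {"error": {"code": 404, "message": "Service not found"}}
    return 404, "application/json", json.dumps(error)


status, content_type, body = handle_request("/rest/services", {"f": "json"})
print(status, body)
```

A real implementation would sit behind a web server and do actual work for each operation, but the contract is the same: a predictable URL goes in and a predictable JSON document comes out, which is what lets existing clients use a new back end unchanged.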

For Esri, this is a risky move.  Similar to the risks ERDAS faces by embracing WPS, Esri is creating a specification that, if broadly adopted, will make it easier for some people to not use their flagship ArcGIS Server product.  On the other hand, by demonstrating leadership in the geoprocessing market, they will both encourage the growth of that market and their broad product line puts them in a good position to capitalize on the larger marketplace.  I see this as a smart move by a company that feels sufficiently self-confident in its spatial analysis, geoprocessing and data management capabilities that it can invite both partners and competitors to the table.

There has also been some early criticism of the GeoServices specification. Some punters have remarked that this is not really an open standard since it hasn’t been submitted to an independent standards organization and is not open for public comment and changes. Browsing the specification, one of my colleagues also remarked on the extensive use of the “esri” prefix in things like enumerations. That’s something that we would generally not see in an open standard and suggests that this isn’t really intended as something to be used outside the Esri ecosystem.

On the other hand, the new specification is being made available under the Open Web Foundation agreement, which should make the spec free of copyright and patent claims as well as enable others to revise, share and implement as they see fit.  Further, there are many paths for specifications and standards as they evolve.  As the OGC has amply shown, submission to a standards body does not guarantee usefulness.  While the OGC has several standards that are in broad use (Simple Features, WPS, WMS, WFS, KML and WCS), it has also got a bunch of “standards” that have been submitted for narrow, commercial purposes and have failed to gain broad market support.  As the longevity of the shapefile has demonstrated, open publication of protocols can have a significant positive impact on interoperability, even if it’s not managed by a standards body.  Further, as Google showed with KML, commercial shepherding of a protocol for a few years can be a precursor to later submission to a standards organization.
