Articles by Robert Cheetham

Common Cause/PA Launches Our Philadelphia web site

Common Cause of Pennsylvania has launched a new web site and blog, Our Philadelphia, to educate the public about elected officials.  Unlike many states, Pennsylvania has no limits on campaign contributions, and the online contribution databases maintained by the state and by the City of Philadelphia are barely usable, with much of the data not available at all.  A search for contributions that would take minutes in a more transparent state, like Maryland, would take hundreds of hours in Pennsylvania.  So Common Cause is building its own web site and database to make this data available.  But wait, there’s more.  The site will include several features:

  • Elected Officials lookups – enter an address and find your representatives as well as a list of their top contributors [we’re excited that this lookup service is powered by our Cicero API]
  • Campaign Contribution database
  • Election Reform advocacy – including redistricting, campaign finance and ethics
  • Open Government and Transparency advocacy
  • City and State Government watchdog – with a diminished print media, there is an increasing need for other organizations to supplement the normal role of newspapers

Over the next year, Common Cause/PA hopes to add information for Pittsburgh, extend the contribution databases, and expand its ability to report on government activities.

Mashing up Google Calendar and a Javascript Timeline

Usually, this blog is about geography and Azavea’s work, but I thought an internal project might be of interest to others.  Our marketing team recently faced an interesting problem.  Our marketing approach is not based on advertising. Rather, we focus on spreading the word about our work by giving presentations at conferences, writing articles and book chapters, publishing our newsletter, etc.  We also respond to a fair number of RFPs and grant solicitations.  As our marketing and business development team has grown, the number of activities to track has also increased.  Lots of activities also create opportunities, but if we can’t effectively visualize how they all fit together, we run the risk of missing those opportunities.  In addition, the task of tracking all of the grant and proposal deadlines, conference attendance and other activities becomes pretty tough.

So we resolved to set up a shared calendar as a mechanism for collectively tracking all of these deadlines and activities.  We had switched our e-mail system to GoogleApps Premium in early 2008.  When we did this, we gained a number of capabilities in addition to e-mail including: shared calendars, document authoring/storage and customizable home pages for each staff person.  So our starting point was to create a Google Calendar for the marketing folks to share.  However, many of the marketing and business development activities span several days, and while Google Calendar is a great way to enter and store events, the usual daily/weekly/monthly calendar layout does not make it easy to see several weeks or months together.  We were really looking for a ‘timeline’ display of the calendar so we would be able to see the juxtaposition of several events and their relationship to each other.  So we looked around for a low-cost system that would enable us to both enter our marketing activities and visualize them in a timeline layout.  We looked at online project management tools, some of which support Gantt charts, but while a Gantt chart is great for decomposing tasks into subtasks, it arranges each task on its own line.  So if you have 20 tasks, that’s ok, but if you have 100 or 200 spread out over a year, it’s not very readable – the chart just keeps growing vertically.

marketing timeline calendar

So we decided to build something in-house.  When we had first set up our wiki, David Zwarg had shown off a tool called Simile Timeline, created by some folks at MIT.  So we went back to that project and learned that not only had it continued to develop but it was available as an open source toolkit that could be used in a broad range of applications.  David picked up Simile and within a couple of days, he had mashed up 6 calendars within the account we’d set up for the marketing crew into a timeline-based calendar.  He also experimented with incorporating a map, but we decided it consumed too much screen real estate and nixed it.  After all, we’re still small enough that we generally know what part of the country everyone is in. :-)
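The merging step of a mashup like this can be sketched roughly as follows. This is a minimal illustration, not David's actual code. It assumes each calendar's events have already been fetched into plain dictionaries (the keys `summary`, `start` and `end`, the calendar names and the sample events are all hypothetical), and it writes them into the JSON layout that, as far as I know, Timeline's JSON event source reads (`dateTimeFormat`, plus an `events` list with `title`, `start`, `end` and `durationEvent`).

```python
import json
from datetime import date


def to_timeline_events(calendars):
    """Merge events from several calendars into one Timeline-style event list.

    `calendars` maps a calendar name to a list of event dicts with the
    hypothetical keys 'summary', 'start' and 'end' (datetime.date objects).
    """
    events = []
    for calendar_name, items in calendars.items():
        for item in items:
            events.append({
                "title": item["summary"],
                "start": item["start"].isoformat(),
                "end": item["end"].isoformat(),
                # Multi-day activities are flagged as durations (bars, not points).
                "durationEvent": item["end"] > item["start"],
                # Record which calendar the event came from.
                "description": calendar_name,
            })
    # ISO dates sort chronologically as strings.
    events.sort(key=lambda e: e["start"])
    return {"dateTimeFormat": "iso8601", "events": events}


# Made-up sample data, for illustration only.
calendars = {
    "Conferences": [
        {"summary": "Example conference", "start": date(2010, 3, 8), "end": date(2010, 3, 11)},
    ],
    "Proposals": [
        {"summary": "Example grant deadline", "start": date(2010, 2, 15), "end": date(2010, 2, 15)},
    ],
}

timeline_data = to_timeline_events(calendars)
print(json.dumps(timeline_data, indent=2))
```

Keeping the calendar name on each event makes it possible to color or filter the bars by source calendar on the timeline side.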

While geography proved to not be very compelling for this application, the juxtaposition of space and time can be a very useful visualization.  Below are a couple of screenshots from one of the recent builds of our HunchLab product (it’s used for forecasting and geographic change detection), where there’s a critical need to view both spatial and temporal patterns in the same view.

Figure 1: The points on the map represent the span of time selected on the graph with a heat map of the points.

Figure 2: The graph below the map is a Time-of-Day/Day-of-Week graph, showing a "temporal heat map" of when the events in the map occurred.
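The idea behind a Time-of-Day/Day-of-Week graph can be sketched as a simple binning step: count events in a grid of 7 weekdays by 24 hours, then shade each cell by its count. The snippet below is only an illustration of that idea, not HunchLab's implementation, and the sample timestamps are made up.

```python
from datetime import datetime


def temporal_heat_map(timestamps):
    """Count events in a 7 x 24 grid: rows are weekdays (Monday = 0), columns are hours."""
    grid = [[0] * 24 for _ in range(7)]
    for ts in timestamps:
        grid[ts.weekday()][ts.hour] += 1
    return grid


# Made-up sample events, for illustration only.
events = [
    datetime(2010, 1, 4, 23, 15),   # a Monday, 11 pm
    datetime(2010, 1, 11, 23, 40),  # the following Monday, 11 pm
    datetime(2010, 1, 9, 2, 5),     # a Saturday, 2 am
]

grid = temporal_heat_map(events)
```

Each cell count would then be mapped to a color ramp, so clusters such as "Monday around 11 pm" stand out the same way spatial clusters do on the map.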

OSM Maps Port au Prince in Haiti Response

The OpenStreetMap community has really stepped up to the plate and delivered some amazing vector data using a mix of Yahoo! imagery, old CIA maps and new GeoEye imagery.  Some people were digitizing, while others were making sure updated shapefiles were generated every 5 minutes.  Hundreds of sessions were generated in a few days.  The images below, swiped from Mikel’s post at the OpenGeoData blog, demonstrate the dramatic progress:

OSM at the time of the quake

OSM after a couple of days

OSM, after quake, zoomed in

Sean Wohltman observed, however, that Google’s similar MapMaker effort was working at cross-purposes to the OSM efforts, leaving users of the maps needing to decide which version they should use.  A common effort would benefit more people, but the legal terms and conditions prevent a straightforward resolution.  Geospatial data developers and users have made great contributions to the Haiti relief efforts, but while the geo-geeks are playing a leadership role in one respect, they are also exposing some tough contradictions in our legal infrastructure.

Update 1/18/2010:

Some additional OSM Resources related to the Haiti quake:

OSM Haiti with Mapnik rendering and earthquake related locations

Google.org Builds Cloud-based Image Processing Platform

To coincide with the opening of the Copenhagen Climate Summit, Google.org announced a collaboration with the Carnegie Institution for Science to build an online version of the Carnegie Landsat Analysis System (CLAS).   The existing CLAS system is a desktop tool that supports conversion from the raw satellite imagery, calibration, atmospheric correction, cloud masking and spectral analysis to create maps of forest cover, deforestation, and forest disturbance that can be overlaid with other geographic data.  The new version of the software, called CLASLite, does all of this online.

The Google.org folks write:

What if we could offer scientists and tropical nations access to a high-performance satellite imagery-processing engine running online, in the “Google cloud”? And what if we could gather together all of the earth’s raw satellite imagery data — petabytes of historical, present and future data — and make it easily available on this platform? We decided to find out, by working with Greg and Carlos to re-implement their software online, on top of a prototype platform we’ve built that gives them easy access to terabytes of satellite imagery and thousands of computers in our data centers.

Geoprocessing in the cloud with petabytes of satellite imagery while reducing computation from days to seconds.  That’s a compelling vision. The prototype, Earth Engine, is not yet available to the public, but  Google has pledged to make it accessible for free to any tropical country.  And while the initial target of this effort is deforestation, it seems only logical that the Earth Engine could very well be extended to cover other types of geoprocessing.

Distributed geoprocessing has been on its way for a while. Wolfram Research has been offering the server version of its Mathematica product as a way to distribute mathematical and statistical processing across many machines in a network. Brian Flood has done a fair amount of work on cloud-based geoprocessing with his Arc2Earth Cloud Services.  At Azavea, we’ve designed our own DecisionTree raster processing framework both to distribute work across multiple machines/processors/cores and to run in the Amazon Web Services EC2 environment. Each of these examples is aiming at several benefits:

  • Speed: desktop processing can take many minutes and even hours to complete.  By distributing the work across dozens or hundreds of machines, we can get responses that are fast enough to display the results in “web time” – a second or two.
  • Lower Cost: If we can acquire processing power as we need it, rather than buying and maintaining hardware and disks ourselves, we can lower the cost of computing substantially.
  • Simpler UI: By enabling complex processing to be performed on the web, we can create crafted user interfaces that focus on the needs of a particular workflow rather than requiring that someone learn the far more complex tools in a Desktop GIS.
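The speed benefit rests on a simple pattern: split a raster into pieces that can be processed independently, hand the pieces to separate workers, and stitch the results back together in order. Here is a minimal sketch of that pattern. It is not how DecisionTree, Earth Engine or Arc2Earth are built; the function names are hypothetical, the raster is synthetic, and a local thread pool merely stands in for a fleet of machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(raster, rows_per_tile):
    """Split a raster (a list of rows) into horizontal strips."""
    return [raster[i:i + rows_per_tile] for i in range(0, len(raster), rows_per_tile)]


def reclassify(tile, threshold=50):
    """Example per-cell operation: 1 where the cell is at or above the threshold, else 0."""
    return [[1 if cell >= threshold else 0 for cell in row] for row in tile]


def process_distributed(raster, rows_per_tile, executor):
    """Process each strip independently, then stitch the results back in order."""
    tiles = split_rows(raster, rows_per_tile)
    result = []
    # executor.map yields results in the same order as the input tiles.
    for processed in executor.map(reclassify, tiles):
        result.extend(processed)
    return result


# Synthetic 8 x 6 raster, for illustration only.
raster = [[(r * 10 + c * 7) % 100 for c in range(6)] for r in range(8)]

with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = process_distributed(raster, rows_per_tile=2, executor=pool)

# The distributed result should match processing the whole raster at once.
serial = reclassify(raster)
```

Because the operation is purely per-cell, the strips need no knowledge of each other. Operations that look at neighboring cells would need overlapping strips, and real speedups would require separate processes or machines rather than threads.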

I’m pretty excited about the prospects for bringing analytical and statistical services to a much larger audience via cloud services.

Google Adds Spatial Search to Maps Data API

Google quietly slipped out a new feature in its Maps Data API over the holidays that I think is fairly substantial.  If you recall, back in April, Google released a new API designed for geographic data.  This original release included only a few features: the ability to store and manage spatial data and the ability to search it. A major limitation was that, despite being designed for storing spatial data, it didn’t support spatial searches.   With the latest version of the API, that’s now changed.

This is still a fairly limited capability, with only bounding box (rectangle) and point-plus-radius (circle) searches currently supported.  However, the ability to sort based on distance in combination with the spatial data services would be enough for many of the most straightforward mapping applications.
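To make the two search types concrete, here is a small local sketch of what a bounding box search and a point-plus-radius search with distance sorting compute. It does not use or imitate the Maps Data API's request syntax; the function names and feature dictionaries are hypothetical, and the city coordinates are approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bbox_search(features, west, south, east, north):
    """Features inside a rectangle (does not handle boxes crossing the antimeridian)."""
    return [f for f in features if west <= f["lon"] <= east and south <= f["lat"] <= north]


def radius_search(features, lat, lon, radius_km):
    """Features within radius_km of a point, nearest first."""
    hits = [(haversine_km(lat, lon, f["lat"], f["lon"]), f) for f in features]
    hits = [(d, f) for d, f in hits if d <= radius_km]
    hits.sort(key=lambda pair: pair[0])
    return [f for _, f in hits]


# Approximate coordinates, for illustration only.
features = [
    {"name": "Philadelphia", "lat": 39.9526, "lon": -75.1652},
    {"name": "Pittsburgh", "lat": 40.4406, "lon": -79.9959},
    {"name": "Baltimore", "lat": 39.2904, "lon": -76.6122},
]

in_box = bbox_search(features, west=-77.0, south=39.0, east=-75.0, north=40.5)
nearby = radius_search(features, lat=39.9526, lon=-75.1652, radius_km=200)
```

With this sample data both searches return Philadelphia and Baltimore and leave out Pittsburgh, and the radius search lists Philadelphia first because it is sorted by distance from the search point.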

David and Josh on Video

We had a busy autumn at conferences. Josh Marcus represented us at the first International Crisis Mapping Conference in Cleveland, Ohio.  He presented our work with HunchLab, the crime analysis, early warning and forecasting system we have been developing with support from the National Science Foundation.

Over the past year, David Zwarg has been devoting his 10% research time to supporting the mapping components on the SourceMap project at the MIT Media Lab.  He had a chance to present at the Boston Ignite Spatial a couple of weeks ago.  Check out his presentation on this video.

OpenStreetMap License is Changing

OpenStreetMap: the free wiki world map

Whether for commercial software or open source projects, the crafting of a license is one of the most important decisions a company or team can make.  The license determines who can use the software, how it can be used as well as how it can be shared.  Open data projects, while different from open source software, face the same types of questions.

OpenStreetMap is probably the single largest and most significant open data project in the geospatial realm.  The project was started because “most maps you think of as free actually have legal or technical restrictions on their use, holding back people from using them in creative, productive, or unexpected ways.” Up until now, OSM has been using a license called Creative Commons Attribution Share Alike (CC-BY-SA).  However, OpenStreetMap is more like a database than it is like a text document or photograph and database projects have run into some specific problems with the CC family of licenses.  The OpenStreetMap project is proposing a move to the Open Database License (ODbL).  Like many collaborative projects, the move is being made by submitting the change and the justification for it to the community for review, comment and vote.

Why make this move?  What’s wrong with the CC-BY-SA license? A lot of people use the CC licenses to publish their articles, photos, paintings and other creative work.  But the various forms of the Creative Commons licenses are designed to work within the legal infrastructure that surrounds the concept of copyright.  Structured databases are collections of facts.  When factual data (like streets drawn on a map) is arranged the way you’d expect it to be, it’s not necessarily protected by copyright law, particularly under U.S. copyright law, which only protects works that arise from creativity.  If copyright doesn’t apply to factual data and the CC licenses are based on copyright law, we have a problem.  This is the core of the issue.  Even the Creative Commons folks have said that the CC-BY-SA license should not be applied to databases.

The new proposal, ODbL, resolves the issues by applying copyright where it applies and applying contract law where it does not.  It attempts to take the best of both worlds and create a happy medium that applies to database projects like OSM. As perhaps the largest open database in the world, OSM was one of the touchstone cases that the Open Data Commons and Open Knowledge Foundation used to build the license.

But it’s also interesting to note what it won’t cover.  The ODbL will only apply to distribution of the OSM database.  Contributions to OSM (like GPX tracks and other database edits) are covered by a Contributor Agreement which will refer to the ODbL as the means of distributing their contributions.  It won’t cover image tiles generated based on the OSM database.  It won’t cover the OSM wiki, which, since it is text and therefore considered a creative work, will remain covered by CC-BY-SA.  And it won’t cover the software source code used to run the entire OSM system – that will usually, but not always, be covered by the GPL.

There remains some controversy within the OSM community. Many members, including one of the founders advocating for the change, feel that a completely free, Public Domain license (no limits on usage) would be preferable.  The ODbL will retain the “share-alike” concept of the current CC-BY-SA license (requiring both attribution and that changes be submitted back to the community and that distribution carry the same terms). Supporters of share-alike feel that the spirit of reciprocity codified in this approach is stronger. The new OSM license will include both attribution and share-alike because many members of the community feel that this limitation benefits the project.  Nonetheless, others feel strongly that a truly public domain situation would be better in the long run, encouraging broad usage without consideration for consequences.  In the best democratic tradition, however, both sides express their positions in Vote Yes and Vote No pages.  Check them out.  And if you are an active member of the OSM Foundation, make sure you cast your vote.

You may be tempted to file this under “boring”, but the nuances of licenses are an important part of the creative economy in which we operate.  They set the terms under which we interact with each other’s work.