Tag Archives: DecisionTree

Tyreek Elam’s Account of His Summer Internship with Azavea

Project H.O.M.E’s mission is to empower people to break the cycle of homelessness. Among its numerous educational and professional development programs is the John and Sheila Connors Youth Employment Program. Every summer, along with offering academic workshops and professional development classes, Project H.O.M.E places students into local businesses and city agencies for six-week, 20-hour-per-week internships. At these positions, students are exposed to business practices and professional activities they might not otherwise encounter. It is in this context that Azavea welcomed Tyreek Elam into our Philadelphia office this summer.

Why?  One of our core principles is to do work that is meaningful and encourages positive changes in the communities our clients serve.  Welcoming Tyreek amongst us seemed like a tangible and meaningful, albeit small, way to make a positive impact in the life of someone from our community.  During one of our Management Team meetings, I presented the idea and we all voted unanimously that Tyreek should join us for his internship.  This is his account of his stay with us.  It is my hope Tyreek will remain in touch with us.  We all wish him the best as he prepares to apply to college and develops his professional career.

“Though I was only here for six weeks, these six weeks were some of the most wonderful six weeks, I have had in my entire life.  My internship with Azavea was amazing, everyone in the office was kind, helping, and just plain, awesome.  I have never seen a place so vibrant, everyone is almost always busy working with something but when you go and ask them something there is never a bad atmosphere about them.  Each week I was assigned a different team and a different assignment, and as a result more insight on what Azavea had to offer.

The first week I worked with the Law Enforcement team, consisting of Bennet, Jeremy, and Kenny, as a beta tester, using a demo of their HunchLab product to find any problems or bugs in the software.  I greatly enjoyed the application as well as the way they explained things to me.  HunchLab is a web-based geographic crime visualization, early warning, and risk forecasting software.  HunchLab and the team developing it were so great that at the end of the week I reluctantly had to go.

But the fun did not stop there, the next week was the Cicero team, with Abby, Andrew, and Daniel.  During my week with Cicero, which is an address-based legislative district matching and elected official look up web API, I gathered and entered data about previous elections for various countries.  That was definitely a challenge, an interesting challenge, considering how little is known about a lot of old elections for a lot of countries.

The next week I was placed with the PhillyHistory / Sajara team, which consists of Deborah and Carissa. PhillyHistory.org offers a geographic search, mapping and display of historic assets in Philadelphia. This was also one of my favorite weeks because I really enjoyed surfing through all the historic photos they had of the city I live in. The entire week was spent with me going through the pictures and recording data, but the pictures I saw made me feel closer to Philadelphia.

The next week I worked with the DecisionTree team helping them install Ubuntu, which was awesome and gave me a feel for Ubuntu and an OS other than Windows or Mac OS X. I really enjoyed how Tamara, Josh and Erik let me get a feel for the software and the OS on my own but were there to help me when I stumbled or was stuck.

My last week, I was with the Land Records team and worked on their PWD Stormwater Billing Application.  Though I knew very little about the application it was still fun.  I was assigned with the task to find ways to break or hack the web app so they could fix it.  Matthew and Justin were extremely helpful when it came to parts of software that I found that did not work or had some bugs.

Overall my time here at Azavea was a great one and I wish I could do it again.  Everyone was approachable and reasonable, but I would like to personally thank Ms. Rachel, because my stay there was twice as wonderful because of her.  She always made sure I had what I needed, if I needed more of anything, if I was making out okay, and if there was ever anything that she herself could not help me with she tried hard to find someone that could.” – Tyreek Elam

Pushing the Boundaries of Geographic Data-Processing Over the Web

Most contemporary work in GIS involves one or more of three major types of activity: a) database development; b) spatial analysis and map production; and c) web-based map display. Applications of GIS analysis technology are enormously diverse: land planning, climate change modeling, assessing the impact of sea level rise, natural hazard risk assessment, military scenario planning, cell phone tower placement, business siting, and many more. Currently, these applications, which involve large amounts of geographic data processing, are usually tied to desktop workstations because of the significant amount of time, memory, and processing power required to execute the operations.

As computing power continues to grow, Azavea has become increasingly committed to making substantial improvements in the performance of GIS data computation (sometimes referred to as “geoprocessing”) over the web. Ultimately, we are seeking to make possible web-based GIS modeling that is so fast that you might think you’re playing a video game. That’s no small endeavor, but the possibilities are mind-blowing. Thanks to a National Science Foundation grant in 2010, we made significant progress on testing the feasibility of using graphics processing units (GPUs) to do just that. If you’re interested in how we can hijack GPUs for GIS, check out our blog series on the research.

It is in that context that this past January our colleague Tamara Manik-Perlman gave a presentation at the Esri GeoDesign Summit on community planning tools that prioritize place-based decisions. Make sure to watch her presentation and learn what we are up to.

Video credit: Esri. For more videos of the Esri 2011 GeoDesign Summit, visit: http://video.esri.com/series/13/2011-geodesign-summit

Geoprocessing and WPS

As you may have gathered from our recent newsletter article and other announcements about the CommonSpace project we’ve been developing in Philadelphia, we are in the middle of redesigning the geoprocessing engine that underpins our DecisionTree product.  The new engine, code-named “Trellis”, is leveraging our experience implementing high performance raster processing operations.  We are taking the lessons we learned with DecisionTree – distributed and parallel processing, binary messaging, caching, pyramiding, etc. – to create a more generic processing framework that will support a broader array of geoprocessing operations than was the case for the original single-purpose design created for the DecisionTree application.

We’ll unveil more details of the Trellis work as it evolves over the next few months, but as part of our design research, we’ve been looking at a number of the existing technologies related to server-based geoprocessing. This first article will focus on the Open Geospatial Consortium (OGC) Web Processing Service (WPS) standard.

WPS is one of an alphabet soup of geographic data and mapping standards overseen by a non-profit standards organization called the Open Geospatial Consortium (OGC).  It’s a particularly interesting one for Azavea because it is concerned with making geographic data processing available across a network – essentially enabling us to move geo-computation and spatial analysis from a desktop GIS to the server and enabling this type of analysis to be provided as a service over the web or even in a mobile application.  We think that this is a really important capability for two reasons:  a) it will allow sophisticated analysis that has previously required a GIS specialist and complex software to be made available in simple applications on the Internet; b) we think this will result in faster, more responsive applications that can serve more people at lower cost.

OGC standards like WPS have developed over the course of many years and have arisen in order to support interoperability across diverse platforms. The OGC standards that Azavea has found most useful for its web and mobile applications are of two basic types: services that return some kind of geographic data, and formats for organizing and transporting that data across a network.

  • WMS and WMTS (web map service and web map tile service) – service that provides map images for display in a web browser
  • WFS (web feature service) – service to request and filter vector feature data in a geographic database
  • WCS (web coverage service) – service that provides raster data (aerial and satellite imagery, for example)
  • GML (geography markup language) – this is an XML protocol for encoding geographic data
  • KML (keyhole markup language) – developed by Keyhole, the company Google later purchased, as part of the software that would become Google Earth, KML was submitted to the OGC after it had undergone a fair amount of development; it does not fit neatly into the other standards, but it’s broadly used for combining geographic data with styling
  • SLD (styled layer descriptor) – a way to describe how to apply color and other styling on a map

What the heck is WPS?

So if WMS is for getting map images, and WFS and WCS are for requesting vector and raster data, and data can be transferred using KML and GML and styled using SLD, that’s a lot of what is done in the web mapping world. What do we need WPS for? WPS provides a way to request transformations of existing geographic data. While much of contemporary web mapping remains a matter of simply displaying data on a base map and asking some basic questions about that data, the utility of geographic analysis goes beyond display of information on a map. For example:

  • in a flood scenario, we might want to know which properties are located within 100 meters of a flood plain boundary
  • to find the perfect site for a school, we might want to consider several geographic map layers, apply weights to them and generate a heat map
  • for crime analysis, we might want to create a density map based on crime locations

In each of these cases, we need to transform one or more geographic data sets.  To answer the first question, for example, we would need to buffer the flood plain polygons by 100 meters to create a new layer and then select records that fall within the new polygon.  For the second scenario, we need to read each of the relevant map layers, convert them to a common format and scale, apply weights and then create and apply colors to a map of the results.  WPS is a standard that supports requests for these types of geographic data transformations (processing)  in a common way.
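As a rough illustration of the flood plain question, here is a minimal Python sketch. It swaps in a direct distance test in place of building a buffer polygon: a property is selected when it lies within 100 meters of the nearest flood plain edge. The parcel names and coordinates are hypothetical, and the coordinates are assumed to be in a projected system measured in meters.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the result to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_boundary(p, polygon):
    """Distance from p to the nearest edge of a polygon (list of vertices)."""
    edges = zip(polygon, polygon[1:] + polygon[:1])
    return min(point_segment_distance(p, a, b) for a, b in edges)

def properties_near_boundary(properties, polygon, max_distance):
    """Names of properties within max_distance of the polygon boundary."""
    return [name for name, location in properties.items()
            if distance_to_boundary(location, polygon) <= max_distance]

# Hypothetical data: a rectangular flood plain and four parcels (meters).
flood_plain = [(0, 0), (1000, 0), (1000, 500), (0, 500)]
properties = {
    "parcel_a": (500, 560),   # 60 m north of the boundary
    "parcel_b": (500, 750),   # 250 m north of the boundary
    "parcel_c": (1080, 250),  # 80 m east of the boundary
    "parcel_d": (500, 250),   # inside, 250 m from the nearest edge
}

near = properties_near_boundary(properties, flood_plain, 100)
print(near)  # ['parcel_a', 'parcel_c']
```

Note that this measures distance to the boundary itself, so a parcel deep inside the flood plain is not selected. A buffer of the whole polygon, as described above, would include the interior as well.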

Like many of the OGC services standards, WPS is conceptually simple.  It supports only three functions:

  1. GetCapabilities – returns information about the available processing features
  2. DescribeProcess – returns metadata specific to each available processing function
  3. Execute – runs a process based on a series of inputs

It’s important to note that WPS does not actually do anything. Like other OGC services, it is simply a lingua franca for asking for work to be done. If you host a WPS service, you still have to have software that executes the processing tasks. But WPS defines a common protocol for making requests for almost any kind of geoprocessing task.
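To make the three operations concrete, here is a small Python sketch that builds key-value-pair GET URLs in the style of WPS 1.0.0. The endpoint, the `buffer` process name, and the `layer` and `distance` inputs are all hypothetical; a real server advertises its own process identifiers and input names through GetCapabilities and DescribeProcess, and the exact encoding rules for inputs should be checked against the specification.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
WPS_ENDPOINT = "https://example.com/wps"

def wps_url(request, **params):
    """Build a key-value-pair GET URL for a WPS operation."""
    query = {"service": "WPS", "request": request}
    if request != "GetCapabilities":
        # DescribeProcess and Execute name the protocol version explicitly.
        query["version"] = "1.0.0"
    query.update(params)
    return WPS_ENDPOINT + "?" + urlencode(query)

def encode_inputs(inputs):
    """Join process inputs as semicolon-separated name=value pairs."""
    return ";".join(f"{name}={value}" for name, value in inputs.items())

# 1. Ask the server which processes it offers.
capabilities_url = wps_url("GetCapabilities")

# 2. Ask for the inputs and outputs of one (hypothetical) process.
describe_url = wps_url("DescribeProcess", identifier="buffer")

# 3. Ask the server to run that process with (hypothetical) inputs.
execute_url = wps_url(
    "Execute",
    identifier="buffer",
    datainputs=encode_inputs({"layer": "flood_plain", "distance": 100}),
)

for url in (capabilities_url, describe_url, execute_url):
    print(url)
```

The sketch only builds the requests. Whatever sits behind the endpoint still has to do the actual processing, which is the point made above.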

Who’s Using WPS?

Unlike WMS, WFS and WCS and a few other standards, WPS is a relatively new standard that was only finalized in late 2007. There are not a lot of examples yet, but the reference implementation is a WPS server and sample clients being developed by 52° North, a non-profit based in Germany. The software project is being led by Bastian Schaeffer and Theodor Foerster and is a Java-based implementation that is available under an open source license. The project has an ambitious roadmap and a rapidly growing community. There is also ongoing work to create a connector to the geoprocessing capabilities in Esri’s ArcGIS Server as well as distributed (or “grid”) computing.

A second open source implementation is PyWPS, created for people using Python. Its primary purpose is to make GRASS-based processing available to web clients.

In 2009, ERDAS released its own WPS-based web geoprocessing service. ERDAS refers to its WPS implementation as an internet Spatial Modeling Service (iSMS). The WPS interface supports access to the IMAGINE Spatial Modeling Engine by making calls over the internet. The capabilities are integrated into the ERDAS APOLLO server product line under the APOLLO Professional package. And, of course, ERDAS has enabled its client technology, the IMAGINE and TITAN products, to consume WPS-compliant services.

What’s the point?

So why is WPS important? As I mentioned above, like most information technology standards, the purpose of WPS is interoperability. If a standard becomes broadly adopted in many software packages, it becomes easier to mix-and-match components for a particular purpose. By enabling their APOLLO server to speak WPS, ERDAS is enabling any software that can make WPS requests to use the server, even if it is not an ERDAS product. So I can use the uDig WPS plugin and make requests to spatial models defined and run in an ERDAS APOLLO server and display the results in uDig. For a commercial company like ERDAS, this is a double-edged sword. By supporting WPS, they are also saying that you don’t need to buy an ERDAS client software package in order to use an APOLLO server. But, by the same token, it also means that now many more people will be able to make requests to APOLLO servers, and this will grow the ERDAS ecosystem and may result in higher sales of APOLLO and IMAGINE licenses.

In a second article, I’ll focus on an even newer standard just published by Esri, the GeoServices REST API specification.

Google.org Builds Cloud-based Image Processing Platform

To coincide with the opening of the Copenhagen Climate Summit, Google.org announced a collaboration with the Carnegie Institution for Science to build an online version of the Carnegie Landsat Analysis System (CLAS).   The existing CLAS system is a desktop tool that supports conversion from the raw satellite imagery, calibration, atmospheric correction, cloud masking and spectral analysis to create maps of forest cover, deforestation, and forest disturbance that can be overlaid with other geographic data.  The new version of the software, called CLASLite, does all of this online.

The Google.org folks write:

What if we could offer scientists and tropical nations access to a high-performance satellite imagery-processing engine running online, in the “Google cloud”? And what if we could gather together all of the earth’s raw satellite imagery data — petabytes of historical, present and future data — and make it easily available on this platform? We decided to find out, by working with Greg and Carlos to re-implement their software online, on top of a prototype platform we’ve built that gives them easy access to terabytes of satellite imagery and thousands of computers in our data centers.

Geoprocessing in the cloud with petabytes of satellite imagery while reducing computation from days to seconds.  That’s a compelling vision. The prototype, Earth Engine, is not yet available to the public, but  Google has pledged to make it accessible for free to any tropical country.  And while the initial target of this effort is deforestation, it seems only logical that the Earth Engine could very well be extended to cover other types of geoprocessing.

Distributed geoprocessing has been on its way for a while. Wolfram Research has been offering the server version of its Mathematica product as a way to distribute mathematical and statistical processing across many machines in a network. Brian Flood has done a fair amount of work on cloud-based geoprocessing with his Arc2Earth Cloud Services. At Azavea, we’ve designed our own DecisionTree raster processing framework both to distribute work across multiple machines/processors/cores and to run in the Amazon Web Services EC2 environment. Each of these examples is aiming at several benefits:

  • Speed: desktop processing can take many minutes and even hours to complete.  By distributing the work across dozens or hundreds of machines, we can get responses that are fast enough to display the results in “web time” – a second or two.
  • Lower Cost: If we can acquire processing power as we need it, rather than buying and maintaining hardware and disks ourselves, we can lower the cost of computing substantially.
  • Simpler UI: By enabling complex processing to be performed on the web, we can create crafted user interfaces that focus on the needs of a particular workflow rather than requiring that someone learn the far more complex tools in a Desktop GIS.
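The speed benefit rests on a simple pattern: split a raster into pieces, process the pieces independently, and stitch the results back together. The Python sketch below illustrates that pattern only. It is not how DecisionTree is implemented, the grid and threshold are hypothetical, and a thread pool stands in for the separate machines or cores a real system would use (threads in CPython will not actually speed up CPU-bound work like this).

```python
from concurrent.futures import ThreadPoolExecutor

def split_rows(grid, chunk_count):
    """Split a grid (list of rows) into roughly equal bands of rows."""
    size = max(1, -(-len(grid) // chunk_count))  # ceiling division
    return [grid[i:i + size] for i in range(0, len(grid), size)]

def reclassify(band, threshold):
    """Example per-cell operation: flag cells at or above a threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in band]

def process_in_parallel(grid, threshold, workers=4):
    """Apply reclassify() to each band of rows in a pool of workers."""
    bands = split_rows(grid, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so bands stay in sequence.
        results = list(pool.map(lambda band: reclassify(band, threshold), bands))
    return [row for band in results for row in band]

# Hypothetical 6 x 5 raster with values 0..54.
grid = [[r * 10 + c for c in range(5)] for r in range(6)]
parallel_result = process_in_parallel(grid, threshold=25)
serial_result = reclassify(grid, threshold=25)
print(parallel_result == serial_result)  # True
```

Because each cell is computed independently, the bands need no coordination beyond the final merge, which is what makes this kind of work easy to spread across many workers.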

I’m pretty excited about the prospects for bringing analytical and statistical services to a much larger audience via cloud services.

Walkshed NYC Enters NYC Big Apps Contest

We’ve been wrapped up in walkability to bring you Walkshed NYC. Using 10 data collections drawn from the NYC.gov Data Mine, we’ve entered Walkshed into the NYC BigApps competition to provide custom walkability mapping to NYC residents.

Just how much customization?   Sixty billion custom walkability maps for each NYC resident — yes, we said each resident.    Walkshed NYC contains 17 preferences each of which can be set to 11 values—that’s 505,447,028,499,293,771 possible maps that you can select from. Plenty of possibilities for all 8,363,710 NYC residents.
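For anyone who wants to check the arithmetic, the figures above follow directly from 17 preferences with 11 settings each. A quick Python sketch, using only the numbers quoted in this post:

```python
preferences = 17
values_per_preference = 11
residents = 8_363_710

# Each preference is set independently, so the counts multiply.
possible_maps = values_per_preference ** preferences
maps_per_resident = possible_maps // residents

print(f"{possible_maps:,}")  # 505,447,028,499,293,771
print(f"roughly {maps_per_resident / 1e9:.0f} billion maps per resident")
```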

The complexity only begins there. Each of the 17 walkability preferences is made up of 157,715,256 values arranged in a grid to cover the city. The values in your selected preferences need to be combined on the fly to generate your distinct map. Thank goodness we have DecisionTree to power this immense calculation.

But measuring a city’s walkability is just the beginning. Planning water resources, land use, better sidewalk networks and bike lanes, and distance from diverse habitats are just a few of the ways that geographic technology can help make our towns and cities operate in a more sustainable manner. Also have an obsession with walkability or sustainability? We need your support and votes. Voting runs from December 15th – January 7th.

On December 15th, vote to put walkability on the map.

Explore Walkshed New York

Walkshed.org is Live — Walkability Calculations for the Public

We just wanted to share a quick note with our blog readers about today’s launch of Walkshed.org.

Walkshed provides the public with the ability to define what walkability means to them.   By generating a custom heatmap, they can explore Philadelphia and see what neighborhoods best match their factors.   For example, one person might define walkability based on living close to a library, coffee shops, and a shopping center while another person might define it as being close to public transit, carshare locations and a grocery store.   Thanks to DecisionTree, Walkshed enables each person to calculate the locations that best meet their weighted criteria and returns a map that reflects these scenarios “on the fly”.
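The idea of combining weighted criteria into a heatmap can be sketched in a few lines of Python. This is only an illustration of a weighted overlay, not DecisionTree’s actual implementation: the layer names, the tiny 2 x 2 grids, the 0 to 10 scoring and the weights are all hypothetical.

```python
def weighted_overlay(layers, weights):
    """Combine equally sized grids into one grid of weighted average scores."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    names = list(weights)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    return [
        [sum(weights[n] * layers[n][r][c] for n in names) / total
         for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical layers: each cell scores proximity from 0 (far) to 10 (close).
layers = {
    "library": [[10, 0], [5, 5]],
    "transit": [[0, 10], [5, 10]],
    "grocery": [[10, 10], [0, 5]],
}

# One person's preferences: transit matters most, groceries not at all.
weights = {"library": 1, "transit": 3, "grocery": 0}

heatmap = weighted_overlay(layers, weights)
print(heatmap)  # [[2.5, 7.5], [5.0, 8.75]]
```

Changing the weights changes the map, which is why every combination of preferences yields a different heatmap.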

We’re thrilled that you are finding the application of interest and sharing your feedback with us.  Check it out at walkshed.org.
