Tag Archives: Google

Quick Fingers Lead to Perfect Predictions & Geographic Models

We’re really excited for the Chromercise rollout that was announced today by Google.

At Azavea, we’ve long realized that by combining highly intuitive interfaces with high-performance geoprocessing, we could build web applications that simplify the user experience and push the limits of what is possible with web-based geographic visualization and modeling. What we’ve realized in the process is that delays in user input impact the accuracy of the models and forecasts that we can produce. Whether we are forecasting crime levels across a city or modeling watersheds, the millisecond delays of waiting on human button pushing reduce the potential of our software.

By rolling out a system to reduce this human-induced error, Chromercise is bound to improve every application that we’ve produced. Please join us in reaching out to Google to express our thanks for this truly revolutionary program.

Mashing up Google Calendar and a Javascript Timeline

Usually, this blog is about geography and Azavea’s work, but I thought an internal project might be of interest to others. Our marketing team recently faced an interesting problem. Our marketing approach is not based on advertising. Rather, we focus on spreading the word about our work by giving presentations at conferences, writing articles and book chapters, publishing our newsletter, and so on. We also respond to a fair number of RFPs and grant solicitations. As our marketing and business development team has grown, the number of activities to track has also increased. Lots of activities also create opportunities, but if we can’t effectively visualize how they all fit together, we run the risk of missing those opportunities. In addition, the task of tracking all of the grant and proposal deadlines, conference attendance and other activities becomes pretty tough.

So we resolved to set up a shared calendar as a mechanism for collectively tracking all of these deadlines and activities. We had switched our e-mail system to GoogleApps Premium in early 2008. When we did this, we gained a number of capabilities in addition to e-mail, including shared calendars, document authoring/storage and customizable home pages for each staff person. So our starting point was to create a Google Calendar for the marketing folks to share. However, many of the marketing and business development activities span several days, and while Google Calendar is a great way to enter and store events, the usual daily/weekly/monthly calendar layout does not make it easy to see several weeks or months together. We were really looking for a ‘timeline’ display of the calendar so we would be able to see the juxtaposition of several events and their relationship to each other. So we looked around for a low-cost system that would enable us to both enter our marketing activities and visualize them in a timeline layout. We looked at online project management tools, some of which support Gantt charts, but while a Gantt chart is great for decomposing tasks into subtasks, it arranges each task on its own line. So if you have 20 tasks, that’s ok, but if you have 100 or 200 spread out over a year, it’s not very readable: the chart just keeps growing vertically.

So we decided to build something in-house. When we had first set up our wiki, David Zwarg had shown off a tool called Simile Timeline, created by some folks at MIT. So we went back to that project and learned that not only had it continued to develop, but it was also available as an open source toolkit that could be used in a broad range of applications. David picked up Simile and, within a couple of days, had mashed up 6 calendars within the account we’d set up for the marketing crew into a timeline-based calendar. He also experimented with incorporating a map, but we decided it consumed too much screen real estate and nixed it. After all, we’re still small enough that we generally know what part of the country everyone is in. :-)
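To make the difference from a Gantt chart concrete, here is a minimal sketch of the general idea: merge events from several calendars into one list, then pack them into as few horizontal lanes as possible so the display does not grow by one row per event. This is not Simile Timeline’s API or layout algorithm; the function names, the event dictionaries and the sample calendars are all hypothetical, and the greedy lane packing is simply one common way to do it.

```python
from datetime import date


def merge_calendars(calendars):
    """Flatten {calendar_name: [(title, start, end), ...]} into one list sorted by start."""
    merged = [
        {"calendar": name, "title": title, "start": start, "end": end}
        for name, events in calendars.items()
        for title, start, end in events
    ]
    merged.sort(key=lambda e: (e["start"], e["end"]))
    return merged


def assign_lanes(events):
    """Greedy packing: put each event in the first lane that is free when it starts."""
    lane_ends = []  # end date of the last event placed in each lane
    for event in events:
        for lane, last_end in enumerate(lane_ends):
            if last_end < event["start"]:
                lane_ends[lane] = event["end"]
                event["lane"] = lane
                break
        else:
            # No existing lane is free, so open a new one.
            lane_ends.append(event["end"])
            event["lane"] = len(lane_ends) - 1
    return events


# Made-up sample data standing in for two shared calendars.
calendars = {
    "conferences": [
        ("Conference A", date(2010, 3, 1), date(2010, 3, 4)),
        ("Conference B", date(2010, 3, 10), date(2010, 3, 12)),
    ],
    "proposals": [
        ("Grant deadline", date(2010, 3, 3), date(2010, 3, 3)),
    ],
}

timeline = assign_lanes(merge_calendars(calendars))
for event in timeline:
    print(event["lane"], event["start"], event["title"], f"({event['calendar']})")
```

With this sample, the three events fit in two lanes, because the second conference reuses the lane freed by the first. A Gantt-style layout would use three rows.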

While geography proved to not be very compelling for this application, the juxtaposition of space and time can be a very useful visualization. Below are a couple of screenshots from one of the recent builds of our HunchLab product (it’s used for forecasting and geographic change detection), where there’s a critical need to view both spatial and temporal patterns in the same view.

Figure 1: The points on the map represent the span of time selected on the graph with a heat map of the points.

Figure 2: The graph below the map is a Time-of-Day/Day-of-Week graph, showing a "temporal heat map" of when the events in the map occurred.
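For readers curious how a Time-of-Day/Day-of-Week “temporal heat map” can be computed, here is a minimal sketch. It is not HunchLab’s implementation; the function name and the sample timestamps are made up for illustration. The idea is simply to count events in a 7 x 24 grid of (day of week, hour of day) cells, which a charting layer could then color by count.

```python
from collections import Counter
from datetime import datetime


def temporal_heat_map(timestamps):
    """Count events per (day-of-week, hour-of-day) cell.

    Returns a 7 x 24 grid: rows are Monday (0) through Sunday (6),
    columns are hours 0 through 23.
    """
    counts = Counter((ts.weekday(), ts.hour) for ts in timestamps)
    return [[counts.get((day, hour), 0) for hour in range(24)] for day in range(7)]


# Made-up sample events, not real incident data.
events = [
    datetime(2010, 1, 4, 22, 15),  # a Monday, 10 pm hour
    datetime(2010, 1, 4, 22, 40),  # same Monday, 10 pm hour
    datetime(2010, 1, 9, 2, 5),    # a Saturday, 2 am hour
]

grid = temporal_heat_map(events)
print(grid[0][22])  # events in the Monday 10 pm cell
print(grid[5][2])   # events in the Saturday 2 am cell
```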

OSM Maps Port au Prince in Haiti Response

The OpenStreetMap community has really stepped up to the plate and delivered some amazing vector data using a mix of Yahoo! imagery, old CIA maps and new GeoEye imagery. Some people were digitizing, while others were making sure updated shapefiles were generated every 5 minutes. Hundreds of sessions were generated in a few days. The images below, swiped from Mikel’s post at the OpenGeoData blog, demonstrate the dramatic progress:

OSM at the time of the quake

OSM after a couple of days

OSM, after quake, zoomed in

Sean Wohltman made the interesting observation, however, that Google’s similar MapMaker effort was working at cross-purposes to the OSM efforts, leaving users of the maps to decide which version they should use. A common effort would benefit more people, but the legal terms and conditions prevent a straightforward resolution. Geospatial data developers and users have made great contributions to the Haiti relief efforts, but while the geo-geeks are playing a leadership role in one respect, they are also exposing some tough contradictions in our legal infrastructure.

Update 1/18/2010:

Some additional OSM Resources related to the Haiti quake:

OSM Haiti with Mapnik rendering and earthquake related locations

Google.org Builds Cloud-based Image Processing Platform

To coincide with the opening of the Copenhagen Climate Summit, Google.org announced a collaboration with the Carnegie Institution for Science to build an online version of the Carnegie Landsat Analysis System (CLAS).   The existing CLAS system is a desktop tool that supports conversion from the raw satellite imagery, calibration, atmospheric correction, cloud masking and spectral analysis to create maps of forest cover, deforestation, and forest disturbance that can be overlaid with other geographic data.  The new version of the software, called CLASLite, does all of this online.

The Google.org folks write:

What if we could offer scientists and tropical nations access to a high-performance satellite imagery-processing engine running online, in the “Google cloud”? And what if we could gather together all of the earth’s raw satellite imagery data — petabytes of historical, present and future data — and make it easily available on this platform? We decided to find out, by working with Greg and Carlos to re-implement their software online, on top of a prototype platform we’ve built that gives them easy access to terabytes of satellite imagery and thousands of computers in our data centers.

Geoprocessing in the cloud with petabytes of satellite imagery while reducing computation from days to seconds.  That’s a compelling vision. The prototype, Earth Engine, is not yet available to the public, but  Google has pledged to make it accessible for free to any tropical country.  And while the initial target of this effort is deforestation, it seems only logical that the Earth Engine could very well be extended to cover other types of geoprocessing.

Distributed geoprocessing has been on its way for a while. Wolfram Research has been offering the server version of its Mathematica product as a way to distribute mathematical and statistical processing across many machines in a network. Brian Flood has done a fair amount of work on cloud-based geoprocessing with his Arc2Earth Cloud Services. At Azavea, we’ve designed our own DecisionTree raster processing framework both to distribute work across multiple machines/processors/cores and to run in the Amazon Web Services EC2 environment. Each of these examples is aiming at several benefits:

  • Speed: desktop processing can take many minutes and even hours to complete.  By distributing the work across dozens or hundreds of machines, we can get responses that are fast enough to display the results in “web time” – a second or two.
  • Lower Cost: If we can acquire processing power as we need it, rather than buying and maintaining hardware and disks ourselves, we can lower the cost of computing substantially.
  • Simpler UI: By enabling complex processing to be performed on the web, we can create crafted user interfaces that focus on the needs of a particular workflow rather than requiring that someone learn the far more complex tools in a Desktop GIS.
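The speed benefit comes from a split, process and merge pattern. Here is a minimal sketch of that pattern, not the design of DecisionTree, Earth Engine or any other product mentioned here. The raster is a plain list of rows, the reclassification rule is made up, and a local thread pool stands in for the pool of machines; in CPython, threads will not actually speed up pure-Python arithmetic, so this only illustrates the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_bands(raster, band_height):
    """Cut a raster (list of rows) into horizontal bands of at most band_height rows."""
    return [raster[i:i + band_height] for i in range(0, len(raster), band_height)]


def reclassify_band(band, threshold):
    """Per-cell operation: 1 where the cell meets the threshold, otherwise 0."""
    return [[1 if cell >= threshold else 0 for cell in row] for row in band]


def process_raster(raster, threshold, band_height=2, workers=4):
    """Process each band independently, then stitch the results back together in order."""
    bands = split_into_bands(raster, band_height)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input bands.
        results = list(pool.map(lambda band: reclassify_band(band, threshold), bands))
    return [row for band in results for row in band]


# Made-up 4 x 3 raster.
raster = [
    [1, 5, 9],
    [2, 6, 3],
    [7, 0, 4],
    [8, 8, 1],
]

result = process_raster(raster, threshold=5)
print(result)
```

This works cleanly because each cell’s output depends only on that cell. Operations that look at neighboring cells would need overlapping bands, which this sketch does not handle.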

I’m pretty excited about the prospects for bringing analytical and statistical services to a much larger audience via cloud services.

Google Adds Spatial Search to Maps Data API

Over the holidays, Google quietly slipped out a new feature in its Maps Data API that I think is fairly substantial. If you recall, back in April, Google released a new API designed for geographic data. That original release included only a few features: the ability to store and manage spatial data and the ability to search it. A major limitation was that, despite being designed for storing spatial data, it didn’t support spatial searches. With the latest version of the API, that has now changed.

This is still a fairly limited capability, with only bounding box (rectangle) and point-plus-radius (circle) searches currently supported.  However, the ability to sort based on distance in combination with the spatial data services would be enough for many of the most straightforward mapping applications.
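To illustrate what these two kinds of search amount to, here is a minimal, self-contained sketch of a bounding box test and a point-plus-radius search sorted by distance. It does not call the Maps Data API. The function names, the feature dictionaries and the approximate sample coordinates are assumptions made for illustration, and the distance uses the standard haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_bbox(lat, lon, south, west, north, east):
    """Rectangle test. Does not handle boxes that cross the antimeridian."""
    return south <= lat <= north and west <= lon <= east


def radius_search(features, lat, lon, radius_km):
    """Return the features within radius_km of (lat, lon), nearest first."""
    hits = []
    for feature in features:
        distance = haversine_km(lat, lon, feature["lat"], feature["lon"])
        if distance <= radius_km:
            hits.append((distance, feature))
    hits.sort(key=lambda pair: pair[0])
    return [feature for _, feature in hits]


# Approximate coordinates, for illustration only.
features = [
    {"name": "New York City Hall", "lat": 40.7128, "lon": -74.0060},
    {"name": "Independence Hall", "lat": 39.9489, "lon": -75.1500},
    {"name": "Philadelphia City Hall", "lat": 39.9526, "lon": -75.1635},
]

nearby = radius_search(features, 39.9526, -75.1635, radius_km=5)
print([f["name"] for f in nearby])
print(in_bbox(39.9489, -75.1500, south=39.9, west=-75.3, north=40.0, east=-75.1))
```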

Google Fusion Tables – First Look

In June, Google Labs released Fusion Tables.  According to Google’s description, Fusion Tables are:

a service for managing large collections of tabular data in the cloud…You can apply filters and aggregation to your data, visualize it on maps and other charts, merge data from multiple tables, and export it to the Web or csv files.

I wanted to take it for a spin, so I got some data from the Public Crime application we built for the Philadelphia Police Department and loaded it into a Google spreadsheet. You can import local files (.XLS, .XLSX, .CSV & .ODS) as well as import directly from Google Spreadsheets to the tables. One thing to note is that once the file or spreadsheet is imported into a Fusion Table, there is no bulk import functionality to update from outside files. The Fusion Table expects to become the application that manages the data.

After the import, an interface lets you create metadata that stays with the dataset for its life span.

Once the data is in the table, there are a number of ways of interpreting the data. There are filtering capabilities that allow you to build ad-hoc queries against the table and perform aggregations to generate reports on the data.

The functionality that most excited me about Fusion Tables was the visualization capabilities. Leveraging the filtering and aggregation, the charts can tell a pretty compelling story. One other note: the embeddable code seemed a little buggy, as the aggregations did not get carried over to the script tag.

Choosing the ‘Map’ option from the ‘Visualize’ menu brought up a Google Map with all of the points that could be geocoded.

I guess for performance reasons, Google is limiting the number of points on the map to 200. Maybe that number will be increased when Fusion Tables is released out of Labs; we’ll have to see.

There is a lot of functionality that I haven’t touched on; maybe I will in a future post. I have made the Fusion Table publicly accessible (but not editable), so feel free to go and play around with it.

Final Note: I did have to do some editing to the source data and it does not reflect the information that is directly downloaded from the PPD Public Crime site. Those edits included:

  • Removing the word ‘BLOCK’ from the address field
  • Appending the address with ‘,Philadelphia, PA’ to facilitate geocoding
  • Removing the columns that store the local coordinates
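The three edits above can be sketched in a few lines of Python. This is not the script I used, just a minimal illustration: the column names (ADDRESS, X_COORD, Y_COORD, TYPE) and the sample row are hypothetical, since the actual field names in the PPD download are not listed here.

```python
import re

# Hypothetical names for the columns holding local (projected) coordinates.
LOCAL_COORD_COLUMNS = {"X_COORD", "Y_COORD"}
CITY_SUFFIX = ", Philadelphia, PA"


def clean_row(row, address_field="ADDRESS"):
    """Apply the three edits to one record (a dict of column name -> value)."""
    # 1. Drop the local coordinate columns.
    cleaned = {key: value for key, value in row.items() if key not in LOCAL_COORD_COLUMNS}
    # 2. Remove the word BLOCK from the address and collapse leftover spaces.
    address = re.sub(r"\bBLOCK\b", "", cleaned[address_field], flags=re.IGNORECASE)
    address = " ".join(address.split())
    # 3. Append the city and state so the geocoder has enough context.
    cleaned[address_field] = address + CITY_SUFFIX
    return cleaned


# Made-up sample record.
sample = {"ADDRESS": "1400 BLOCK MARKET ST", "X_COORD": "1", "Y_COORD": "2", "TYPE": "Theft"}
cleaned_sample = clean_row(sample)
print(cleaned_sample)
```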

Google Earth Automatic Photo Geotagging

The Google Earth Blog posted about a presentation by Michael Jones from Google discussing the roadmap for Google Earth. It seems Google is working on an algorithm to automatically geocode uploaded photographs by comparing them with a large collection of known geo-tagged photos. These known geo-tagged photos could be drawn from a combination of Google Street View, Wikipedia, Picasa and other sources.

There also seems to be some work going on with landmark matching in particular — matching photos with landmarks (the Golden Gate Bridge as an example) as opposed to photos in general.   Google published a research paper a few weeks ago that outlines a system to match landmarks around the world with 80% accuracy.  There is definitely some way to go before we see high accuracy for geocoding photos in general if this very specialized set of photos only manages 80% accuracy at present.

We’ll certainly be thinking about this technology and how it might relate to our digital asset management system, Sajara. In particular, I wonder how their algorithms take into account how a neighborhood’s appearance can change over time. If an algorithm takes into account the date a photograph was taken, it could be used to automatically tag photos not only with their location, but perhaps also with the date they were taken.