Tag Archive:
Azavea Research

Azavea Behind the Scenes – Who Were Our Most Influential Teachers and Why?

It’s September again. The air is cooler, the sky looks grayer, and the trees we are feverishly documenting on PhillyTreeMap.org are going to burst into color soon. It’s also the month when every young software-engineer-to-be packs her or his backpack and trudges bleary-eyed off to school again. For this month’s newsletter, we thought it pertinent to reflect on the teachers who helped make us who we are today, before we had ever touched Django, JavaScript, or OpenLayers.


Research Project: Sourcemap

"I am proud to be a part of the project, and Azavea is happy to see its staff working on such cool projects."

One of the many perks of working at Azavea is the opportunity to conduct an independent research project. Full-time staff who have been with the company for at least 6 months are eligible to develop a research project plan and pitch it to the powers that be. We have profiled several other research projects in previous newsletters, each solving a different problem. Sometimes staff choose to take on pro bono projects, rebuild older Azavea applications, or learn more about a given technology.

Last year, I learned about ‘Sourcemap’, a project produced by the MIT Media Lab’s Tangible Media Group. Sourcemap is a tool “for producers, business owners and consumers to understand the impact of supply chains.” My personal interests initially attracted me to the project, and Azavea approached the Sourcemap team in November of 2008 to see if they could use any contributions of the mapping and/or web development kind. They were happy to have contributions, and I began working on the spatial database and mapping components of the project. They generously moved to an MIT Open Source License, partially in order to accept the mapping and web development contributions I would make.

Some of the components I have been working on have included:

  • Migrating from a proprietary web mapping API to OpenLayers
  • Implementing specialized “arcing” cartography between parts and objects
  • Rendering material networks across the International Date Line

The migration to OpenLayers improved the performance of object maps and enabled the maps to display a much greater number of features. This introduced a second problem: with so many features on the map, it became difficult to distinguish them. If any two parts of an object and the object itself were collinear, it would be impossible to see their connection. By slightly arcing the network, it became possible to discriminate parts in complex objects. Lastly, mapping a network across the IDL introduces many fun problems, one of which is that the line from a part in Japan to an object in Alaska went the wrong way around the globe! The solution I came up with involved creating networks that repeat across the globe and represent the shortest distance between points.
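The arcing and date-line ideas can be illustrated with a small Python sketch. This is not the Sourcemap code (which was built on OpenLayers) and it does not repeat networks across the globe as described above. Instead it assumes a simpler, related trick: "unwrapping" the destination longitude so that a straight line from the origin takes the short way around, then bending that line with a quadratic Bezier curve. The function names, the `bulge` parameter, and the treatment of longitude/latitude as flat coordinates are all assumptions made for illustration.

```python
def shortest_lon_delta(lon_from, lon_to):
    """Signed longitude difference in degrees, folded into [-180, 180)."""
    return (lon_to - lon_from + 180.0) % 360.0 - 180.0


def unwrap_destination(lon_from, lon_to):
    """Destination longitude, possibly outside [-180, 180], chosen so that
    a straight line drawn from lon_from goes the short way around."""
    return lon_from + shortest_lon_delta(lon_from, lon_to)


def arc_points(start, end, bulge=0.2, steps=16):
    """Quadratic Bezier arc between two (lon, lat) points.

    The control point is the chord midpoint pushed perpendicular to the
    chord by bulge * chord length, so collinear links no longer overlap.
    Coordinates are treated as planar, which is only a rough approximation.
    """
    x0, y0 = start
    x1 = unwrap_destination(x0, end[0])  # cross the date line if shorter
    y1 = end[1]
    dx, dy = x1 - x0, y1 - y0
    cx = (x0 + x1) / 2.0 - dy * bulge
    cy = (y0 + y1) / 2.0 + dx * bulge
    points = []
    for i in range(steps + 1):
        t = i / steps
        a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
        points.append((a * x0 + b * cx + c * x1, a * y0 + b * cy + c * y1))
    return points


# Approximate coordinates for a part in Tokyo and an object in Anchorage.
tokyo = (139.7, 35.7)
anchorage = (-149.9, 61.2)
arc = arc_points(tokyo, anchorage)
```

With these inputs the arc runs east from Tokyo to a longitude of about 210 degrees, which is Anchorage drawn on the next copy of the world, rather than west across Asia, Europe, and the Atlantic.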

Working with the team at MIT from Azavea’s offices in Philadelphia introduced some growing pains to the project, but the project lead, Leo Bonanni, was committed to opening up the project to contributors outside the Media Lab and to managing a distributed team.

Now, Sourcemap is getting ready to go live (http://www.sourcemap.org/), and they have some beautiful and informative maps. I am proud to be a part of the project, and Azavea is happy to see its staff working on such cool projects.


Interactive Sourcemap (Firefox only)!

Research: The Amazon Elastic Cloud

"I am exploring the use of the Elastic Compute Cloud (EC2) as a resource for some Azavea projects already in use. DecisionTree, our geographic prioritization system, was an ideal first candidate..."

I am very excited about my Azavea research project on the Amazon Elastic Compute Cloud (Amazon EC2), a technology from Amazon Inc. that is shifting a lot of people’s ideas about what computing is and can do. Amazon EC2 has arisen through the confluence of technological innovations of the past few years.

First, some background. One of the most basic pieces of infrastructure in the World Wide Web today is the ubiquitous entity known as “The Server”. This term is used for a computer which performs some task or tasks on behalf of other computers. For example, web pages come from a web “server”, which sends web pages to your computer for you to see. Moreover, this web server may in turn query other servers to complete the request: for example, it may contact a database server to get data or a geospatial server to produce a map image.

The idea behind a computing “cloud” (and there are others, as referenced in Robert’s ‘What the Heck is…’ article above) is a bunch of computers accessible from the internet which “instantiate” whole virtual computers, with all their associated operating systems, software, data, etc., that can be accessed on demand. One can instantiate one of these machines, connect to it via the internet through standard remote connection protocols, and voila! Your screen shows the desktop for this “computer”, which behaves exactly as if it were sitting under your desk.

While this approach is odd for desktops, for servers it can have many benefits. With a few clicks of a mouse, multiple copies of the same server can be up and running at the same time to handle increases in demand. They can be shut down again when not needed. The details and headaches of actually running and owning physical machinery are offloaded to the cloud provider. The cloud provider also provides bandwidth. Once you have a working version of a website, database, or geospatial server, it can be copied and reused, with no need to start from scratch with configuration.

For my research project, I am exploring the use of the Elastic Compute Cloud (EC2) as a resource for some Azavea projects already in use. DecisionTree, our geographic prioritization web system, was an ideal first candidate. This product requires strong computing resources and was designed from the ground up to be able to run on multiple computers. With EC2 we were able to run DecisionTree on 10 instances at once, dramatically speeding up its operations and providing a mechanism for running DecisionTree for customers who do not want to maintain their own server infrastructure.

In addition to DecisionTree, we are also experimenting with running our Cicero legislative and election data service on EC2, as well as with other ways to leverage Amazon Web Services. For example, last spring we tested a map image ‘tile cache’ service that generates and stores a set of map tiles, enabling an organization to reduce bandwidth usage and improve the responsiveness of a high-traffic web mapping application. While EC2 was originally limited to Linux-based software, the recent addition of Windows Server as a target platform has provided much more flexibility. Do you have ideas for how you could use Amazon Web Services for your GIS project? Let us know.

Mapping Walkability: Finding the Best Places in Philadelphia to be Carfree and Carefree

"'...how can I find a walkable community?' I'm glad you asked...."
Photo courtesy of Tony Fischer, Carpe Diem Photography, via Flickr.com

It is clear that we Americans face many challenges today. The prospect of global climate change has many of us looking for ways to reduce our carbon footprint. Record high energy prices earlier this year have made us all aware of how vulnerable we are to such price spikes. Such challenges are daunting, but many people have turned to an unlikely solution: walkability.

The core principle of walkability is quite simple: give people the option to live their lives without having to get in a car. Less need for a car instantly produces a number of positive individual benefits, including 1) paying less for fuel, maintenance, parking, and insurance, 2) less exposure to energy price spikes, 3) reduced greenhouse gas emissions, and 4) a healthier, more active lifestyle. And these benefits become more pronounced if you are able to shed a car altogether: no more car payment!

This is all well and good, but how can I find a walkable community? I’m glad you asked.

While living in Seattle, I became intrigued by Alan Durning of the Sightline Institute and his concept of a “walkshed” that scored a location based on the quantity and diversity of amenities within a one-mile radius. A year later, Walk Score, which drew heavily from Durning’s walkshed concept, went live as the first application in the world to map walkability. While Walk Score is a fantastic application with a clever methodology, it has a number of acknowledged limitations. Using Philadelphia as a prototype and as part of Azavea’s 10% research project program, I am currently researching ways to overcome some of these limitations to more accurately calculate and map walkability.

Map showing the walking distance from points in Philadelphia to the closest train, subway, or trolley stop.

The first requirement of this new methodology is the ability to measure the walkability of a location by determining the actual walking distance to a variety of assets. In many cases, “as-the-crow-flies” distances are accurate enough, but that accuracy can degrade quickly with the presence of barriers (rivers, highways, etc.), disjointed street networks, or extreme topography. In other words, I need to be able to programmatically detect the actual walking distance to my favorite restaurant on the other side of the Schuylkill River. It may only be a quarter mile as the crow flies, but being bound to the street grid could significantly increase that distance.

The most time-consuming step was the development of a friction layer for the entire city. This layer had to accurately represent the “friction” a person would encounter walking around the city. For example, a city street or park would represent low walking friction, while crossing a river or highway would represent very high friction. By taking streets, trails, parks, rivers, highways, and railroad tracks into account, I was able to calculate the walking friction of every point in Philadelphia. This friction layer now allows me to calculate the walking distance to any defined point in the city. Above, you see a sample screenshot that represents the walking distance from every point in Philadelphia to the closest train, subway, or trolley stop.
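To make the friction idea concrete, here is a minimal Python sketch of a cost-distance calculation over a tiny friction grid. It is not the tool used for the Philadelphia layer; it assumes a plain Dijkstra search over four-connected cells, where the cost of a step is the average friction of the two cells involved. The grid values, the `cost_distance` name, and the use of `None` for impassable cells are all made up for illustration.

```python
import heapq
import math


def cost_distance(friction, sources):
    """Accumulated walking cost from the nearest source to every cell.

    friction: 2D list of per-cell costs (None marks an impassable cell).
    sources:  list of (row, col) cells, e.g. transit stops.
    """
    rows, cols = len(friction), len(friction[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        dist[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry, a cheaper path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if friction[nr][nc] is None:
                continue
            # Moving between two cells costs the average of their frictions.
            nd = d + (friction[r][c] + friction[nr][nc]) / 2.0
            if nd < dist[nr][nc]:
                dist[nr][nc] = nd
                heapq.heappush(heap, (nd, nr, nc))
    return dist


STREET, RIVER = 1.0, 50.0  # hypothetical friction values
demo_grid = [
    [STREET, RIVER, STREET],
    [STREET, RIVER, STREET],
    [STREET, STREET, STREET],  # a bridge at the bottom of the grid
]
walk_cost = cost_distance(demo_grid, [(0, 0)])
```

In this toy grid the top-right cell is only two cells from the source as the crow flies, but the cheapest route detours over the "bridge" in the bottom row, so `walk_cost[0][2]` is 6.0 rather than 2.0.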

I find this research incredibly fascinating, but the best part is that this project is just getting started! I have several new walking distance layers in the queue for amenities like bus stops, car-share locations, parks, grocery stores, farmers markets, cultural venues, and more. After these data sets are complete, I have plans to roll out a publicly available web application built on Azavea’s DecisionTree product. This application will not only mimic most of Walk Score’s functionality, but will also allow each user to give personalized weights to each walkability indicator.

What the Heck is … OpenStreetMap?

"Inspired by collaborative information commons such as Wikipedia, OpenStreetMap is an editable map of the whole world..."
Mumbai as documented in OpenStreetMap.

In the United States, we have a general policy of the federal government sharing useful data with the public. This policy has led to open distribution of geospatial data sets that include the Census Bureau’s TIGER/Line files, USGS topographic maps, aerial photography, land cover and elevation, a plethora of NASA imagery and even several global data sets developed by the military. This overall openness has been replicated by many U.S. cities and states as well.

While Canada and Australia have a similar legal tradition to the U.S. and some government GIS data is available, most developed countries in Europe and around the world make little or no geospatial data available to the public. In the United Kingdom, the Ordnance Survey maintains the most comprehensive and high quality national GIS database in the world, but the data is only available to the public for a steep licensing fee. In the developing world, data is either not distributed due to national security concerns or simply does not exist.

With the Census Bureau’s TIGER data as a starting point, private companies in the United States began building high quality base maps for commercial sale. These companies have grown and consolidated until there are only a small number that dominate the market, and the two largest, TeleAtlas and NavTeq, are now held by consumer electronics firms. These companies maintain global data sets, but the cost of licensing them is substantial.

It is within this environment of high costs and limited access to data that a project called OpenStreetMap began in the U.K. Inspired by collaborative information commons such as Wikipedia, OpenStreetMap is an editable map of the whole world, which is being built largely from scratch using GPS traces and other personal surveys. It is released under an open content license and is available for free to anyone who wishes to use it. The project is a combination of software, data and knowledge. A variety of software tools have been developed to support online and off-line editing of the map data as well as its maintenance and distribution. A wiki is used to organize information about standards and processes.

Copenhagen as documented in OpenStreetMap.

Like other open data projects, such as Wikipedia or the Human Genome Project, the effort is not perfect. For example, there is not yet a standard mechanism for storing the data necessary to perform geocoding and routing; the concept of place name aliases is relatively weak; there is no formalized review process to identify and eliminate deliberate vandalism; the spatial data model is limited to points and lines; and while there are standards for what are valid attributes for each feature, they are not enforced, so the implementation of data elements is not yet very consistent. Nonetheless, the effort is growing rapidly and improving with time. There are now thousands of people building the map in almost every part of the world, and the size of the global database (known as ‘the planet file’) is now doubling every six months.

OpenStreetMap is a compelling example of how the power of loosely organized collective action can be brought to bear to create sophisticated new knowledge resources. In some parts of the world, OpenStreetMap is now more comprehensive than what is commercially available, and it will doubtless continue to develop. Azavea staff are both contributing to the database and exploring new ways to leverage the resulting map. If you would like to participate in its development, there are a number of resources online that will help you learn how. If you live in Philadelphia and would like to help improve the map, I’m organizing a Meetup in January and you are invited!

Why Make a Wild Guess on Where to Sit in the Office When You Can Use Geoprocessing?

"It's exciting to see how our staff research bears fruit at unexpected times ... Who knows what will crop up next?"

Recently Azavea went through another round of office expansion, almost doubling our office size. We knocked down walls, carved up new conference rooms, added a bike garage (as opposed to a bike tree), and more. We now have lots of new space, and quite a few new people. One of the questions that simmered while we watched the work complete was: where are we going to sit? Our staff is full of busy, smart, sophisticated people who can’t be bothered to do their own spatial analysis. Can’t we come up with some way to take the thinking out of this equation? In addition, this question is inherently spatial, so it sounded like a great opportunity to leverage our spatial research.

Map of Azavea’s office showing an employee’s ideal desk location based on entering weighted preferences in DecisionTree.

The basic premise is that when individuals move their desks, they will move toward something they desire and away from something they don’t. If you are allergic to printer toner, you don’t want to sit next to the printers, and if you really like the sun, you definitely want to sit next to the windows. The ultimate location of an employee’s desk takes into account all sorts of factors, and comes to a solution that is often unique to the individual. Does this sound familiar? Indeed! Managing these types of decision factors is the basis for Azavea’s DecisionTree® framework.

Given these principles, it became apparent that software developer David Zwarg’s research was well suited to address this problem. One of David’s ongoing research projects at Azavea is a collaboration with Dana Tomlin at the University of Pennsylvania to develop an advanced raster cost-distance algorithm. The innovation behind this raster cost-distance algorithm is a wave propagation model, which is not constrained to the grid imposed on the raster data. Bonus!

To start, David picked some key landmarks in the new office and generated a cost-distance raster for each of them. The raster datasets he generated include cost-distance to the refrigerator, cost-distance to the bike garage, cost-distance to the printers, cost-distance to the windows, and more. In all, there were 14 layers, or decision factors, that David was able to incorporate, based on the new office floor plan.

Next, he converted the raster datasets to the Azavea Raster Grid (ARG) format. What is this format, and why convert data from raster grids? ARG is a grid format that we use internally (not to be confused with “Argh!”, which is also used internally) and has been optimized for fast processing and storage speed, in addition to being the format used by DecisionTree.

Finally, David plugged the raster datasets into a demo DecisionTree application, and published the application on the Azavea Intranet a couple of weeks prior to the completion of the office expansion. The application contains a base map that is the architectural floor plan of the new office space. Azavea staff members could now use DecisionTree to locate the places in the office that suited their preferences. Adjust a few sliders, click update, and the application shows the best place in the office, based on your criteria! No more guesswork required.
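The "adjust a few sliders" step can be pictured as a weighted overlay of the factor rasters. The sketch below is only an illustration in Python: it does not use the ARG format or DecisionTree’s actual engine, and it assumes each layer has already been rescaled to a 0 to 1 "nearness" score. The layer names, values, and weights are invented.

```python
def weighted_overlay(layers, weights):
    """Sum weight * value for every cell across the named layers.

    layers:  dict of name -> 2D list of scores scaled 0..1 (1 = nearest).
    weights: dict of name -> slider value; negative means "keep me away".
    """
    names = list(weights)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    score = [[0.0] * cols for _ in range(rows)]
    for name in names:
        w = weights[name]
        for r in range(rows):
            for c in range(cols):
                score[r][c] += w * layers[name][r][c]
    return score


def best_cell(score):
    """Return the (row, col) of the highest-scoring cell."""
    cells = ((r, c) for r in range(len(score)) for c in range(len(score[0])))
    return max(cells, key=lambda rc: score[rc[0]][rc[1]])


# Hypothetical 2 x 3 office: nearness to windows and to printers.
office_layers = {
    "windows": [[1.0, 0.5, 0.0], [1.0, 0.5, 0.0]],
    "printers": [[0.0, 0.2, 1.0], [0.8, 0.1, 0.9]],
}
# This employee loves sunlight and is allergic to toner.
sliders = {"windows": 2, "printers": -1}
suitability = weighted_overlay(office_layers, sliders)
ideal_desk = best_cell(suitability)
```

Here `ideal_desk` comes out as the top-left cell, which is next to the windows and farthest from the printers. Changing the slider values changes the answer without recomputing any of the underlying cost-distance layers.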

It’s exciting to see how our staff research bears fruit at unexpected times. Across the gamut, from Open Source projects to geoprocessing to pro-bono cartography, our staff research brings a wealth of experience to their work (and play) – who knows what will crop up next?

What the Heck is…an SBIR?

"Through SBIR-funded research, Azavea is able to develop new technologies that we hope will both provide social value and grow into new products that create jobs and solve complex problems. "

Azavea was awarded a research grant by the National Science Foundation in December 2006. This was our third such award in two years, and we are pretty proud. Usually, private companies are not allowed to be recipients of government grants, which are primarily awarded to universities, government agencies, and non-profit organizations. But these grants are different. They are part of the Small Business Innovation Research (SBIR) program. The SBIR program was started in 1982 by the National Science Foundation and now includes a dozen federal agencies.

SBIR grants and contracts are awarded based on a competition, with each federal agency having its own variations on the rules but a largely similar format. The process is separated into two phases. In Phase I, companies submit innovative ideas for products and services that match a set of priorities issued by each agency. The agencies evaluate the ideas and award grants to the ones that both appear feasible and contribute to the objectives of the awarding agency. If a company wins a Phase I award (the process is extremely competitive, with only 1 in 10 applications being successful), it has six months to prove the feasibility of the idea. If the idea is proven feasible, the company is allowed to submit a Phase II proposal.

Phase II awards are for larger amounts of money and are two-year grants, during which the company must develop a commercial product or service and bring it to market. Usually, the SBIR awards are not sufficient to complete the effort and require additional investment from the firm before they can be delivered to the marketplace. However, while limited in size, SBIR grants serve as a sort of venture capital effort that can fund high priority but risky projects that might not otherwise receive funding from the private market.

So why does Azavea go after these SBIR grants? We fund much of our R&D efforts internally, but sometimes we are presented with a complex technical hurdle that we are not sure how to solve. Or we have an idea for a product, but no client yet, and need a way to jump-start the development. Our DecisionTree geographic prioritization software, for example, was partially funded by an SBIR research contract from the Department of Agriculture. Under the grant, we tested the feasibility of developing a faster raster calculation engine. Through SBIR-funded research, Azavea is able to develop new technologies that we hope will both provide social value and grow into new products that create jobs and solve complex problems.

Azavea Research: Historic Geocoder


This photo states that it was taken in 1894 at the NW corner of 15th and Pennsylvania Ave.
In 1895 Pennsylvania Ave. ran along the railroad tracks that are now between Hamilton and Callowhill.
That intersection no longer exists, as Pennsylvania Ave. now ends around the intersection of 22nd and Hamilton.
We have the photo geocoded as 15th and Hamilton as that is the current address for the same location.

Most people have experienced typing an address, intersection, or other location description into an online application, which then converts it into coordinates that can be used to pinpoint the location on a map. This conversion step is called “geocoding”.

Creating geocoding software is almost never a simple process. The more variables involved in the software, the more complex the geocoding process becomes. One of these variables is time and the change of place names over time. In our spare time we have been developing an application called a Historic Geocoder, through which we aim to address the difficulties of geocoding historic pieces of information with a ‘current’ set of location data.

A good example of an Azavea application that uses geocoding is PhillyHistory.org, a publicly accessible site run by the City of Philadelphia Department of Records and City Archives. The site is a searchable collection of some of the approximately 2 million historic photos stored in the City Archives. A unique feature of the site is that a visitor can search by a current address and find pictures near that address.

Over the years the City photographers have documented the location of each photograph by using addresses. However, the catch is that sometimes street names change. When this happens, a historic photograph with a location description that has since changed is geocoded to the wrong coordinate location or cannot be geocoded at all.

Our Historic Geocoder research project consists of three parts: a) a record of street name changes; b) a database of street segment changes; and c) software to enable time-based geocoding.

By recording not only where current streets are and what they are named, but also where streets were in the past and what they used to be called, the Historic Geocoder will provide us with the ability to geocode based on both space and time. Instead of only entering a location, a user will be able to enter a location and a date, and the system will then locate where the historic address was during that time period on a current map.
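As a rough illustration of the first part, a record of street name changes, the Python sketch below looks up a street segment by name and year. It is not the Historic Geocoder’s actual data model: the record fields, the segment identifier, and the street names are all invented for the example, and a real system would also need the segment geometry changes and address ranges described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StreetNameRecord:
    segment_id: str          # stable id of the street segment
    name: str                # normalized street name
    valid_from: int          # first year the name was in use
    valid_to: Optional[int]  # last year in use, None if still current


# Hypothetical history: one segment that was renamed in 1920.
RECORDS = [
    StreetNameRecord("seg-001", "OLD MILL RD", 1850, 1919),
    StreetNameRecord("seg-001", "LIBERTY AVE", 1920, None),
]


def normalize(name):
    """Upper-case and collapse whitespace so lookups are forgiving."""
    return " ".join(name.upper().split())


def find_segment(records, name, year):
    """Return the segment that carried this name in the given year."""
    wanted = normalize(name)
    for rec in records:
        in_range = rec.valid_from <= year and (
            rec.valid_to is None or year <= rec.valid_to
        )
        if rec.name == wanted and in_range:
            return rec.segment_id
    return None


def current_name(records, segment_id):
    """Return the name the segment carries today, if it still exists."""
    for rec in records:
        if rec.segment_id == segment_id and rec.valid_to is None:
            return rec.name
    return None


segment = find_segment(RECORDS, "Old Mill Rd", 1894)
todays_name = current_name(RECORDS, segment)
```

A photo labeled "Old Mill Rd" and dated 1894 resolves to the same segment that is called "LIBERTY AVE" today, so it can be placed on a current map. The same name with a 1950 date finds no match, because the name was no longer in use.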

Historic photos are not the only records with potential historic address problems. Surveys, censuses, and legal records all use addresses to describe locations. Being able to geocode these locations with relation to time is a very important first step towards the analysis of these data. Through our R&D work on a Historic Geocoder, we hope to make it possible to more accurately assign locations for historic data.

OLAP: Online Analytical Processing

Online Analytical Processing (OLAP) is a technology that extends conventional database technology by enabling rapid analysis of aggregated data. Like most information technology, OLAP comes with its own vocabulary. Whereas data in a traditional database is stored in two-dimensional tables, OLAP databases store data in multi-dimensional cubes that enable people to quickly change their view of aggregated data with less effort. The cube is made up of numeric facts called measures, like the ‘number of packages of widgets shipped to a client’. Measures are organized along dimensions. Some typical dimensions might include time, product categories, delivery areas and so on.

OLAP cubes can be queried in a similar manner to a conventional database, but while most databases use Structured Query Language (SQL), their OLAP brethren have their own language, called MultiDimensional eXpressions (MDX). You wouldn’t want to use MDX to run your sales transaction database, but it’s ideally suited to create a report such as ‘Total Packages Delivered by Route by Product Source per Quarter for the last 5 years’.
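To show what measures and dimensions look like in practice, here is a small Python sketch that rolls up a measure along whichever dimensions are requested. It is not MDX and not how an OLAP server works internally (real engines typically precompute and index aggregates); it is a plain in-memory illustration, and the fact rows, dimension names, and `roll_up` function are invented.

```python
from collections import defaultdict

# Each fact row carries dimension values plus one measure ("packages").
FACTS = [
    {"year": 2007, "quarter": "Q1", "route": "North", "source": "Plant A", "packages": 120},
    {"year": 2007, "quarter": "Q1", "route": "North", "source": "Plant B", "packages": 80},
    {"year": 2007, "quarter": "Q1", "route": "South", "source": "Plant A", "packages": 60},
    {"year": 2007, "quarter": "Q2", "route": "North", "source": "Plant A", "packages": 100},
    {"year": 2007, "quarter": "Q2", "route": "South", "source": "Plant B", "packages": 40},
]


def roll_up(facts, dimensions, measure):
    """Total a measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


by_route = roll_up(FACTS, ("route",), "packages")
by_quarter_and_route = roll_up(FACTS, ("quarter", "route"), "packages")
```

Switching from `by_route` to `by_quarter_and_route` is the kind of "change of view" that a cube makes cheap: the facts stay the same and only the list of dimensions changes.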

The output of an MDX query can be represented in all of the traditional ways including tables and charts, but we are obviously interested in the geography and maps. While OLAP systems have been used in large businesses to analyze sales and other data for many years, their use with geographic data has been limited. Geospatial information has special properties that are not captured in most OLAP systems, such as proximity and cartographic hierarchies (like various zoom levels). The distribution of events in space and time has much to say about those events, and the spatial part of that equation is not yet incorporated fully in many of the tools on the market today. By incorporating these special properties into OLAP cubes, more powerful data analysis can be performed, revealing new and important patterns in information. My research seeks to bring spatial analysis into the OLAP world and broaden the power and applicability of this technology. I am particularly interested in real estate data and am working with several years of Vermont real estate sales.

Azavea (Brown) Bags-It

"Relaxed and fun, the Brown Bag Lunches let us share our unique talents (show off) and learn a bit more about each other in the process."
Chip Hitchens builds his own guitars.

Every once in a while the staff at Azavea needs to stand up, stretch the keyboard-itis out of our fingers, and run around the office. Rather than induce group mayhem we have found an organized, extra-curricular outlet for our office angst. ‘Brown Bag Lunches’ take place once a month during the lunch hour. Everyone congregates in one of our conference rooms to enjoy a collegial lunch and a presentation from one (or two) of our own. We like to consider ourselves well-rounded people to begin with, but we each have now learned a wealth of new information about topics ranging from the martial art Aikido to the ‘art and mystery’ of guitar making, and the ancient Chinese game of ‘Go.’

During a recent lunch, Robert and Rachel Cheetham donned full fencing gear and turned the center of the office into a makeshift piste (the name of the strip along which fencing bouts are held, which we learned during their presentation). A few months earlier, David Zwarg demonstrated how he uses a cell phone to blog and map his location on trips. Relaxed and fun, the Brown Bag Lunches let us share our unique talents (show off) and learn a bit more about each other in the process.

What’s in a Widget?

"Development of widgets as new faces for existing web services pushes geographic data to lightweight and easily distributable clients."

As a technology company, Azavea is competing in an ever-changing market. One way Azavea stays current with emerging technologies is by sanctioning research projects in a variety of technologies, headed by full-time staff members. Research projects range from Open Source projects to pro bono GIS services for the community. One of these research projects involves the Yahoo! Widget Engine and builds upon Azavea’s existing Web Services expertise.

Widgets are cross-platform applications that run inside of a runtime engine. These widgets, while compact, are built upon an internet-ready engine that provides connectivity to URL resources and Web Services with minimal programming.

In addition, widgets have strong support for innovative interface designs, challenging the “window” interface modality that is ubiquitous across all desktop applications.

Development of widgets as new faces for existing Web Services pushes geographic data to lightweight and easily distributable clients, in addition to providing Web Services for application level support. Azavea is continuously exploring ways to disseminate geographic knowledge; exploring and challenging the way existing knowledge is transacted enhances the company’s repertoire of geographic solutions.