Tag Archive:
What the Heck Is…

What the Heck is… Scala?

When we decided it was time to build a next generation version of DecisionTree, I started a research project  (with my 10% R&D time) to carefully evaluate the current state of the art in concurrent programming.  When I say “concurrent programming”, I am talking about two different but related concepts.  One way to make a computational task complete more quickly is to chop up the work that needs to be done into smaller parts and then to divide the work across multiple CPUs in a single computer (“parallel programming”), or to divide up the work across different computers (“distributed programming”).  During this research, I spent some time learning a new and very exciting programming language called Scala.  It was ideal for the cloud and multicore programming challenges we were facing, fulfilled our stringent criteria for a programming language, and was fun to learn and use while enabling us to be very productive.  All of these reasons led us to decide to use Scala as the core programming language for our next generation DecisionTree framework.  So what exactly is this programming language, and why did we choose it?  Why does the programming language matter, anyway?

Martin Odersky began designing Scala in 2001. Odersky also wrote the compiler that became the basis of the modern Java compiler. Java is an extremely widely used programming language in the enterprise, especially popular because of the way it enforces “type safety” and helps ensure the correctness of a programmer’s program. While Java was designed to be state of the art in 1995 and to help programmers solve the problems they were facing at the time, Odersky wanted to take a few steps back when he began work on Scala and think about what kind of language could help programmers tackle the new types of challenges they were beginning to face: for example, high-level domain modeling, rapid development and concurrent programming. With these goals in mind, he built Scala on the Java Virtual Machine (JVM), which means that organizations can keep using existing software libraries written in Java. Now, Scala is being used to solve those problems, and it has a quickly growing user base with some significant adopters who have needed its power, most visibly including companies like Twitter, Foursquare, LinkedIn, the Guardian, Novell, and companies in the UK and US financial services sectors.

One of Odersky’s core intentions when creating Scala was to make programmers happy by making their work easier and more productive. Scala is very concise and eliminates the boilerplate code that you see in languages like Java and C#. This means that programmers can focus on the logic of their problems; it’s like when you can think of the perfect phrase or metaphor that exactly captures the problem. Some languages feel very heavyweight and verbose, but they offer the safety assurances and the high performance that you need. While Scala has all of the same safety assurances and performance characteristics, it feels like a lightweight and elegant “dynamic” language.

Here’s a simple example, comparing Scala to Java. Say we want to create a dictionary where we can use the English word for a number to find the actual number. For example, we could use the word “one” to look up the number 1. For numbers 1 to 3, this would look like the following in Java:


import java.util.HashMap;
import java.util.Map;
Map<String, Integer> numberMap = new HashMap<String, Integer>();
numberMap.put("one", 1);
numberMap.put("two", 2);
numberMap.put("three", 3);

In Scala, it looks like this:

val numberMap = Map("one" -> 1, "two" -> 2, "three" -> 3)

Scala is also very expressive, as it combines two different programming paradigms: object-oriented programming and functional programming. While I can’t fully explain the two paradigms here, let me just say that most programming these days is in the object-oriented paradigm, but functional programming is having a powerful resurgence. In functional programming, you “compose” your program with functions. The basic idea is that you are building something complex from simpler parts, and everything is the same kind of thing (technically, everything is an expression or function). But these parts need to be entirely self-contained (they can’t have side effects).
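To make “composing” a bit more concrete, here is a minimal sketch in Scala. The function names are made up for illustration; the only library feature used is `andThen`, which the Scala standard library provides on one-argument functions.

```scala
object ComposeExample {
  // Two small, side-effect-free functions: each result depends only on the input.
  val double: Int => Int = n => n * 2
  val describe: Int => String = n => s"The answer is $n"

  // andThen builds a new function from the two parts: double first, then describe.
  val doubleThenDescribe: Int => String = double.andThen(describe)

  def main(args: Array[String]): Unit = {
    println(doubleThenDescribe(21))
    // Functions are ordinary values, so they can be handed to other functions.
    println(List(1, 2, 3).map(double))
  }
}
```

Because neither part touches any outside state, the composed function can be tested, reused or run on another core without surprises.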

The name “Scala” itself is a combination of the words “scalable” and “language”, because Scala is designed to be extensible: it can grow and change as the needs of programmers change. One reflection of this is the support for the Actor Model that the developers baked into the language, which simplifies the development of parallel and distributed programs. Another advantage of this flexibility is that when other developers weren’t satisfied with the implementation of the Actor Model in Scala, they just wrote their own, and other folks could use it as if it were baked into the language from the start. This started the “Akka” project, whose developers have now joined together with the core Scala team in a company called Typesafe to provide enterprise support and tooling for Scala and Akka.

This is just the tip of the iceberg in terms of Scala and why we chose it, but it’s worth mentioning that it is a very practical language. We can use a wide selection of GIS libraries available in Java. And Scala gives us enough control to optimize our code to run extremely fast. (See Erik’s technical blog about his R&D work on high performance Scala.)

If you’re a programmer and want to check out Scala, I highly recommend checking out the Scala website and the Typesafe website and the Typesafe blog.  If you’re especially interested in concurrent programming with Scala and Akka, I recommend the Typesafe ‘getting started’ tutorial which will walk you through putting together a parallel implementation of an algorithm to compute the digits of Pi.

What the Heck is… GPU?

Most of us know that the CPU, or “central processing unit”, is the brains of our laptops, workstations, servers, and smart phones. As games and other 3D applications have grown in popularity, a specialized processor known as the “graphics processing unit” or GPU has become more important. GPUs are the chips that perform the math and geometry calculations necessary to render the 3D scenes in a game or render the battle scenes and animate Gollum in the Lord of the Rings movies.

The power of CPUs has continued to grow at extraordinary rates since the 1960s, enabling devices that are faster, smaller, lighter, and more powerful. Despite this accelerating computing power, however, we have found that many types of analytical tasks, from calculating the effects of climate change on sea level rise to calculating walksheds, remain too costly (in terms of computing time) to be run on the web with more than a small number of users. In 2006, we began attacking this performance and scaling problem with the development of our DecisionTree platform. Originally designed for supporting business siting and real estate decisions, DecisionTree optimized the performance of each calculation by breaking it up into small chunks, distributing the work amongst several “workers” and then reassembling the results. This “distributed computing” approach worked for this particular scenario, and we are now able to perform a calculation that previously required several seconds in 500 milliseconds or less, enabling the development of software like the City of Asheville’s Priority Places.
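The chunk, distribute and reassemble pattern can be sketched in a few lines of Scala. This is not DecisionTree’s code: the `score` function is a hypothetical stand-in for an expensive per-cell calculation, and the “workers” here are just threads on one machine (via the standard library’s `Future`) rather than separate servers.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ChunkedSum {
  // Hypothetical stand-in for an expensive per-cell calculation.
  def score(cell: Int): Long = cell.toLong * cell

  // Break the input into chunks, hand each chunk to a worker (a Future),
  // then reassemble the partial results into a single answer.
  def parallelTotal(cells: Vector[Int], chunkSize: Int): Long = {
    val partials: List[Future[Long]] =
      cells.grouped(chunkSize).toList.map { chunk =>
        Future(chunk.map(score).sum)
      }
    Await.result(Future.sequence(partials), 30.seconds).sum
  }

  def main(args: Array[String]): Unit = {
    val cells = (1 to 1000).toVector
    println(parallelTotal(cells, chunkSize = 100))
  }
}
```

The answer is the same as a plain sequential sum; the only thing that changes is how many chunks are being worked on at once.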

GPUs enable us to potentially extend this approach. The CPU in a contemporary laptop or workstation may have two or perhaps even four “cores”. However, while multi-core CPUs are a relatively recent development, GPUs have been multi-core for many years: a contemporary GPU may have 250, 500 or even 1,000 cores on a single chip. While these GPU cores are smaller, simpler and tightly focused on the mathematical calculations used for rendering images, a few years ago some scientists got the idea that they might be able to hijack all those cores to perform certain types of scientific computing tasks more rapidly by breaking up the work and performing it on all of those hundreds of GPU cores at once. Thus was born General-Purpose computing on Graphics Processing Units, or “GPGPU”. GPU computing is transforming medical imaging, fluid dynamics, and other fields that can take advantage of this type of capability.

Using GPUs is not a straightforward task. In order to use all of those cores, we frequently need to completely re-think a given algorithm. For the past six months, with support from the National Science Foundation’s SBIR program, we have been developing GPU versions of several Map Algebra operations aimed at testing the feasibility of using GPUs to make radical improvements in the speed of raster GIS processing. We have made good progress, with some operations accelerated more than 75 times. We are not the only ones thinking about this, and we are incredibly excited about the potential to have a big impact on the responsiveness and scalability of GIS applications.

If you are interested in learning more about the technical details of working with GPUs, check out David Zwarg’s 6-part series on GPU computing in the Azavea Labs blog.  What new GIS applications do you think will be made possible with GPUs?  If you have some ideas, get in touch.

What the Heck is … Cloud Computing?

"Cloud Computing...got started several years ago with attempts to engage global networks of PC's into large-scale science problems"
From L to R: Amazon Web Services, Google AppEngine and Microsoft’s Azure Services offer some of the leading cloud computing platforms.

In the beginning, computers were devices that filled rooms and whole buildings. They slowly shrank in size until, in the 1980s, computing underwent a revolution, bringing Apple Macs and IBM PCs to our desks. In the 1990s, we began to connect all those personal computers to each other using the internet, creating a global network of computers. We are now in the midst of another revolution. The current transformation is again returning computing power to machines that fill rooms and even entire warehouses, but this time, instead of a single computer filling that space, there are thousands of them filling data centers run by new, old and unexpected companies. These new data centers are being used to create a ‘cloud’ of network-accessible services and have recently been rebranded with the latest buzzword (at Azavea, we always seek to be fully buzzword-compliant) as ‘cloud computing’.

DecisionTree geographic calculation tools running on the Amazon cloud computing services will enable you to run high performance geographic calculators without requiring your own infrastructure.

Cloud computing has actually been around for a while. Even before the internet, networked computers that could break up tasks into small chunks were said to be engaged in ‘distributed computing’ or, more recently, ‘grid computing’. Cloud computing is the same concept applied to internet-connected computers. It really got started several years ago with attempts to engage global networks of PCs in large-scale science problems. The SETI@home project enabled people to contribute their idle PCs’ computing power toward examining radio signals for evidence of extraterrestrial life. Similar projects have tackled protein folding diseases, decryption and the processing of Large Hadron Collider (LHC) experiments. They have been joined by global networks of spammers and hackers who manage thousands of compromised computers to form ‘botnets’ that are used to attack government computer systems or blackmail companies.

Aside from those bent on curing cancer or instigating global mayhem, contemporary cloud computing efforts are frequently aimed at more modest objectives. Amazon.com, the retailer, is one of the leaders in this field. Amazon Web Services (AWS) began as a way for Amazon to sell unused capacity in its data centers and is now an entire suite of reusable services being leveraged for all sorts of activities that have nothing to do with selling books and movies. The AWS Simple Storage Service (S3) is an online data storage service. The AWS Elastic Compute Cloud (EC2) enables software developers to create ‘virtual’ computers running Linux or Windows that can be applied to any computing task. Other AWS services include credit card transactions, message queues, web search, and order fulfillment. AWS has been joined by similar services at Google and a new Microsoft effort called Azure.

Many cloud computing providers provide dashboards displaying system availability.

Now imagine you are a small company that has a new idea that will require lots of computer servers. Before AWS and other services, you would have purchased your own servers and built a data center. Now, you can skip all that hassle by hosting your new idea on an infrastructure maintained at a much lower cost by Amazon, Google or Microsoft. These services are priced like your electricity and gas — you pay by the unit of storage, computing time or other metric. So as you need more capacity, you fire up another virtual server, but you only pay for what you use.

So what does cloud computing mean for geospatial services? Cloud-based geospatial services are already common. The APIs for Google Maps, Yahoo! Maps, Microsoft Virtual Earth, and ESRI ArcGIS Online already provide some basic map display, geocoding, routing and other geospatial information services as hosted services. While none of these are based on the metered pricing that Amazon offers, I’m confident this type of business model is coming. A new company, Cloudmade, is focused on creating commercial services that leverage the OpenStreetMap database.

At Azavea, our cloud computing work has focused on two of our services: Cicero and DecisionTree. To learn more about Dave Felcan’s research project on the AWS Elastic Compute Cloud (EC2), read his article below.

What the Heck is … OpenStreetMap?

"Inspired by collaborative information commons such as Wikipedia, OpenStreetMap is an editable map of the whole world..."
Mumbai as documented in OpenStreetMap.

In the United States, we have a general policy of the federal government sharing useful data with the public. This policy has led to open distribution of geospatial data sets that include the Census Bureau’s TIGER line file, USGS topographic maps, aerial photography, land cover and elevation, a plethora of NASA imagery and even several global data sets developed by the military. This overall openness has been replicated by many U.S. cities and states as well.

While Canada and Australia have a similar legal tradition to the U.S. and some government GIS data is available, most developed countries in Europe and around the world make little or no geospatial data available to the public. In the United Kingdom, the Ordnance Survey maintains the most comprehensive and high quality national GIS database in the world, but the data is only available to the public for a steep licensing fee. In the developing world, data is either not distributed due to national security concerns or simply does not exist.

With the Census Bureau’s TIGER data as a starting point, private companies in the United States began building high quality base maps for commercial sale. These companies have grown and consolidated until there are only a small number that dominate the market, and the two largest, TeleAtlas and NavTeq, are now held by consumer electronics firms. These companies maintain global data sets, but the cost of licensing them is substantial.

It is within this environment of high costs and limited access to data that a project called OpenStreetMap began in the U.K. Inspired by collaborative information commons such as Wikipedia, OpenStreetMap is an editable map of the whole world, which is being built largely from scratch using GPS traces and other personal surveys. It is released with an open content license and available for free to anyone that wishes to use it. The project is a combination of software, data and knowledge. A variety of software tools have been developed to support online and off-line editing of the map data as well as its maintenance and distribution. A wiki is used to organize information about standards and processes.

Copenhagen as documented in OpenStreetMap.

Like other open data projects, such as Wikipedia or the Human Genome Project, the effort is not perfect. For example, there is not yet a standard mechanism for storing the data necessary to perform geocoding and routing; the concept of place name aliases is relatively weak; there is no formalized review process to identify and eliminate deliberate vandalism; the spatial data model is limited to points and lines; and while there are standards for what are valid attributes for each feature, they are not enforced, so the implementation of data elements is not yet very consistent. Nonetheless, the effort is growing rapidly and improving with time. There are now thousands of people building the map in almost every part of the world, and the size of the global database (known as ‘the planet file’) is now doubling every six months.

OpenStreetMap is a compelling example of how the power of loosely organized collective action can be brought to bear to create sophisticated new knowledge resources. In some parts of the world, OpenStreetMap is now more comprehensive than what is commercially available, and it will doubtless continue to develop. Azavea staff are both contributing to the database and exploring new ways to leverage the resulting map. If you would like to participate in its development, there are a number of resources online that will help you learn how. If you live in Philadelphia and would like to help improve the map, I’m organizing a Meetup in January and you are invited!

What the Heck is … ArcGIS Server?

"ArcGIS... enables access to the entire basket of GIS analysis capability included in the ArcObjects component framework."


Azavea was founded to build web-based software tools that support geographic analysis. For the past seven years, most of these applications have been based on the ESRI ArcIMS platform. ArcIMS was designed for map display and geographic queries and it does this well, but, apart from visualization, geocoding and routing, the platform’s analytical capability is limited. As ArcIMS has evolved, ESRI has also been steadily extending the analytical capability of its flagship ArcGIS platform, but these capabilities were largely inaccessible from ArcIMS.

ArcGIS Server (AGS) changes all of that. It enables access to the entire basket of GIS analysis capability included in the ArcObjects component framework. For the first time, it also packages the full capabilities of a geographic database, ArcSDE, with the map serving and analysis capabilities. In other words, it is a complete platform for server-based geographic analysis and visualization.


City of Philadelphia, Department of Records’ ParcelExplorer application, developed by Azavea using ArcGIS Server.

What do I mean by analysis? Well, anything that you can do with Spatial Analyst, toolboxes, ArcObjects and the modeling and geoprocessing platform can now be done on the server including: Map Algebra (for raster analysis); feature calculations such as merge, dissolve, buffer and intersect; routing; geographic searches; and models (sequences of processing steps that answer a question or transform a data set). And ArcGIS Server is not just about analysis. It enables you to publish maps on the web with the cartographic flexibility that you have with ArcMap and even supports digitizing and editing of map features. Finally, it is packaged with a set of software development tools that make building compelling web applications easier and faster.

With the release of ArcGIS Server 9.3, Azavea has seen substantial performance improvements as well as the release of new and powerful toolkits such as the REST, JavaScript and Flex APIs that support the rapid development of responsive and lightweight web applications.

ESRI will continue to support ArcIMS for a few years, but will not develop the platform further. All new R&D will be rolled into this ArcGIS Server product, so this is the platform for the future. Do you have questions about ArcGIS Server? Don’t hesitate to get in touch.

What the Heck Is … FLEX?

"Flex is an excellent choice for applications that need animation or complex controls that push the bounds of what is possible in a web browser."

Since Apple started automatically pushing out Safari to Windows users, nerds everywhere have been metaphorically beating each other up over browser benchmarks, hackability, and anti-aliasing schemes. But regardless of any particular loyalties, it’s a fact that things on the web look (and sometimes act) differently in different browsers.

Web pages are, at some level, just a set of instructions that need to be interpreted by a web browser to make the picture on your screen. Differences in web browsers such as Microsoft Internet Explorer, Mozilla Firefox, Apple Safari, and the various versions of each (and even differences between the same versions on different operating systems) make it hard for web developers to provide a consistent experience to users. Adobe’s Flex is an open source collection of tools that help developers make consistent, rich Internet applications, independent of a person’s choice of browser.

Applications made with the Flex framework run in Flash Player, a common browser plug-in which has been around since the late 1990s. Flash was initially a popular way to add interactive graphics, animations, or video to websites, but has evolved into a platform for developers to build entire web and desktop applications. Flex includes a standard set of user interface objects (such as buttons, forms, and the usual features that people expect to see on web and desktop applications), and an object-oriented programming model familiar to web developers. Flex makes it easier for a developer to create feature-rich applications that operate consistently regardless of the user’s operating system or browser. While there are other similar platforms for web-based user interfaces, such as ExtJS (a JavaScript library we use) and Microsoft’s Silverlight, Flex is an excellent choice for applications that need animation or complex controls that push the bounds of what is possible in a web browser.


The Washingtonpost.Newsweek Interactive’s The Root: example of a Flex-enabled mapping interface that lets users map their family trees.

Azavea has already used Flex in a couple of different scenarios, both in and out of map applications. The “Roots” section of Washingtonpost.Newsweek Interactive’s www.theroot.com uses Flex to display an interactive graphical family tree. While the data is stored in a conventional database and uses a conventional server behind the scenes, the interface is implemented in Flash using Flex and Flex-based diagramming tools. The Flex application interacts with the back-end server using web services (check out “What the Heck is a Web Service?”). Azavea’s DecisionTree uses Flex to power the interactive map page, providing enhanced browser interoperability and enhanced graphics, such as overlays with variable transparency that can be adjusted by the user on the fly.

We’re excited by what we’ve been able to do so far with Flex, and are looking forward to the forthcoming release of ESRI’s ArcGIS API for Flex, which brings the visual sparkle of Flex to ArcGIS Server applications.

What the Heck Is … PostGIS?

Every Azavea project has some kind of database component. Most of Azavea’s early projects used commercial databases such as Microsoft SQL Server, Oracle or Microsoft Access. I am a particular fan of SQL Server. Its low price point is paired with high performance and sophisticated features such as OLAP and data mining.

However, there are some terrific open source alternatives as well. PostgreSQL is one that we have been using increasingly over the past year. PostgreSQL is an advanced relational database engine with support for stored procedures, full-text indexing, sub-queries, replication and (yeah!) geospatial data. The GIS support is provided by a project called PostGIS. The founders and lead maintainers of the project are developers at Refractions Research, a small company based in Victoria, British Columbia. PostGIS extends the core PostgreSQL database engine by adding support for geographic objects, including the ability to execute geographic queries using simple SQL. Put another way, it ‘spatially enables’ the PostgreSQL database.


Example of a PostGIS enabled database.

We have used PostgreSQL and PostGIS on several new projects including The Root and the Election Incidents Tracking and Mapping application we built for the Committee of 70 (previous article), with more waiting in the wings. We’re also pleased to see that with ArcGIS 9.3, ESRI will be adding support for PostgreSQL in the ArcSDE component of the ArcGIS Server platform.

Puzzle: What the Heck Is … That Photo!?

Photo used courtesy of the City of Philadelphia’s Water Department. www.phillyhistory.org

Occasionally, as the PhillyHistory.org team is posting historic photographs to PhillyHistory.org (from the Philadelphia Department of Records’ City Archives collection or the recently added Philadelphia Water Department collection), we come across some beautiful, bizarre, and sometimes inexplicable images. The photograph above is a great example of one of these discoveries.

This month’s puzzle is a bit different from our typical newsletter puzzles. We’re asking you to awaken the right side of your brain and come up with a creative caption describing what you think might be happening in the above photograph.

Head to www.phillyhistory.org to explore the photo collections of the City of Philadelphia Department of Records’ City Archives and of the Philadelphia Water Department.

Send your caption to info@azavea.com. The winner (chosen by the super-savvy PhillyHistory.org team) will receive a $25 gift card to Barnes & Noble! The winning caption will also be published in our next newsletter.

What the Heck is…OpenLayers?

"[OpenLayers] is a good example of how open source software can support and extend commercial software in a positive way."

As you’ve just read, we rolled out a new feature in Sajara: we changed the map search feature from a custom component that Azavea had built to an open source tool called OpenLayers. OpenLayers is a toolkit that was originally developed by MetaCarta, but they gave it away to the public (nice people!) and it has since gathered quite a following. What is it? It’s a JavaScript library that makes it easy to put a dynamic map in any web page. Furthermore, it does not rely on any particular map server technology and will work equally well with ESRI ArcIMS, ArcGIS Server, UMN MapServer, GeoServer and even Google Maps or Microsoft Virtual Earth. OpenLayers does not actually generate the maps (you still need a map server to do that), but it provides a simple and intuitive interface for interacting with map data. In the case of Sajara, we are using it with ESRI ArcIMS and the WMS Connector generating the map images.


Example of how OpenLayers can be used with baselayers from Google Maps in the genealogy mapping tool we created for the Washingtonpost.Newsweek Interactive’s ‘The Root’.

Why did we add OpenLayers to Sajara? The short answer is that we wanted to be able to support multiple map servers in order to give our clients more flexibility, but by incorporating an open source toolkit like this, we are also leveraging the thousands of hours of time invested by a global community of developers. While it is evolving rapidly (meaning that we get new functionality every few months), it is a well-tested and responsive set of components. It also has two sister projects called TileCache and FeatureServer that add some additional capabilities.

Our initial foray with OpenLayers has been very positive. We have used it in some of our pro bono projects as well as work for other clients, and we are now testing it for potential future use in DecisionTree and Kaleidocade. It is a good example of how open source software can support and extend commercial software in a positive way.

What the Heck is…Map Algebra?

"...Map Algebra provides a vocabulary and conceptual framework for classifying ways to combine map data to produce new maps."

I have written before about GIS models, toolboxes and geoprocessing. But long before those concepts and products existed, many of us interacted with GIS software through a command-line interface. These command line interfaces were quite different from one product to the next. But for those of us who started out with raster processing after 1990 (I learned my first GIS concepts using the command line version of Idrisi) there was a unifying language that we could use to describe the various functions and processes: Map Algebra.

Developed through the 1980s by Professor C. Dana Tomlin as part of his PhD thesis work, Map Algebra provides a vocabulary and conceptual framework for classifying ways to combine map data to produce new maps. While primarily applied to raster data sets (GRID and image data), the same concepts can be applied to many types of cartographic information, and it has since been extended by Dr. Tomlin and others to 3D, time and other domains. People use Map Algebra for a broad array of applications including: suitability modeling, surface analysis, density analysis, statistics, hydrology, landscape ecology, real estate, and geographic prioritization. Azavea has used Map Algebra on several projects and it is at the heart of our DecisionTree product.

What does it look like? Usually, Map Algebra can be expressed as text – for example, one might add several maps of rainfall like this: Rain_total = Rain_April + Rain_May + Rain_June, but it also lends itself to graphical flow charts like those in the ModelBuilder application.
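For the curious, the rainfall sum can be sketched in Scala as a “local” operation, where each output cell depends only on the cells at the same position in the input maps. This is an illustration rather than any product’s implementation: the `Raster` type, the `localAdd` name and the tiny 2x2 rainfall grids are all made up for the example.

```scala
object LocalAdd {
  // A raster is modeled here as a grid of cell values (rows of columns).
  type Raster = Vector[Vector[Double]]

  // Local operation: each output cell depends only on the cells at the
  // same row and column in the two input rasters (assumed to be the same size).
  def localAdd(a: Raster, b: Raster): Raster =
    a.zip(b).map { case (rowA, rowB) =>
      rowA.zip(rowB).map { case (x, y) => x + y }
    }

  def main(args: Array[String]): Unit = {
    // Made-up rainfall values, for illustration only.
    val rainApril: Raster = Vector(Vector(1.0, 2.0), Vector(3.0, 4.0))
    val rainMay: Raster = Vector(Vector(0.5, 0.5), Vector(1.0, 1.0))
    val rainJune: Raster = Vector(Vector(2.0, 0.0), Vector(0.0, 2.0))

    // Rain_total = Rain_April + Rain_May + Rain_June
    val rainTotal = List(rainApril, rainMay, rainJune).reduce((a, b) => localAdd(a, b))
    rainTotal.foreach(row => println(row.mkString(" ")))
  }
}
```

Since the result of `localAdd` is itself a raster, it can be fed straight into the next operation, which is what makes it easy to chain operations into larger models.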

Map Algebra is organized into four major groups of operations: local, focal (or neighborhood), zonal, and incremental. Each of these operations combines maps or transforms map data to create a new map. Part of the elegance of Map Algebra operations is based on the idea that all operations result in a new map. This makes it easy to group and string functions together into larger models. While there are different flavors of Map Algebra, the overall concept is used in every GIS that supports raster calculations. Early in its development, Dr. Tomlin made the decision to openly share all of the source code, documentation and algorithms with anyone that asked (pretty nice guy!). Consequently, the ideas and source code were incorporated into many commercial software packages. In the ESRI product suite, Map Algebra capabilities are provided through the Spatial Analyst extension.

Illustration of a Neighborhood Maximum operation applied to produce a new data set.
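A neighborhood (focal) maximum can be sketched in Scala along the following lines. The assumptions here are mine: a 3x3 window, integer cell values, a non-empty rectangular raster, and windows that are simply clipped at the raster edges, which is only one of several possible edge conventions.

```scala
object FocalMax {
  type Raster = Vector[Vector[Int]]

  // Focal (neighborhood) maximum: each output cell is the largest value
  // found in the 3x3 window centered on it. Windows are clipped at the
  // raster edges. Assumes a non-empty, rectangular raster.
  def focalMax(r: Raster): Raster =
    Vector.tabulate(r.length, r.head.length) { (row, col) =>
      val neighbors = for {
        i <- (row - 1) to (row + 1) if i >= 0 && i < r.length
        j <- (col - 1) to (col + 1) if j >= 0 && j < r(i).length
      } yield r(i)(j)
      neighbors.max
    }

  def main(args: Array[String]): Unit = {
    // Made-up cell values, for illustration only.
    val input: Raster = Vector(
      Vector(1, 2, 3, 0),
      Vector(4, 9, 5, 1),
      Vector(6, 7, 8, 2)
    )
    focalMax(input).foreach(row => println(row.mkString(" ")))
  }
}
```

As with every Map Algebra operation, the output has the same shape as the input, so it can be used as the input to another operation.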

If you would like to learn more about Map Algebra, the original book is Geographic Information Systems and Cartographic Modeling by C. Dana Tomlin. He is also the co-director of a research lab called the Cartographic Modeling Lab. And if you want to use Map Algebra and cartographic modeling in your organization, give us a call.

What the Heck is…an SBIR?

"Through SBIR-funded research, Azavea is able to develop new technologies that we hope will both provide social value and grow into new products that create jobs and solve complex problems. "

Azavea was awarded a research grant by the National Science Foundation in December 2006. This was our third such award in two years, and we are pretty proud. Usually, private companies are not allowed to be recipients of government grants, which are primarily awarded to universities, government agencies, and non-profit organizations. But these grants are different. They are part of the Small Business Innovation Research (SBIR) program. The SBIR program was started in 1982 by the National Science Foundation and now includes a dozen federal agencies.

SBIR grants and contracts are awarded based on a competition, with each federal agency having its own variations on the rules but a largely similar format. The process is separated into two phases. In Phase I, companies submit innovative ideas for products and services that match a set of priorities issued by each agency. The agencies evaluate the ideas and award grants to the ones that appear feasible and that contribute to the objectives of the awarding agency. If the company wins a Phase I award – the process is extremely competitive with only 1 in 10 applications being successful – it has six months to prove the feasibility of the idea. If the idea is proven feasible, the company is allowed to submit a Phase II proposal.

Phase II awards are for larger amounts of money and are two-year grants, during which the company must develop a commercial product or service and bring it to market. Usually, the SBIR awards are not sufficient to complete the effort, and additional investment from the firm is required before the product or service can be delivered to the marketplace. However, while limited in size, SBIR grants serve as a sort of venture capital effort that can fund high priority but risky projects that might not otherwise receive funding from the private market.

So why does Azavea go after these SBIR grants? We fund much of our R&D efforts internally, but sometimes we are presented with a complex technical hurdle that we are not sure how to solve. Or we have an idea for a product, but no client yet, and need a way to jump-start the development. Our DecisionTree geographic prioritization software, for example, was partially funded by an SBIR research contract from the Department of Agriculture. Under the grant, we tested the feasibility of developing a faster raster calculation engine. Through SBIR-funded research, Azavea is able to develop new technologies that we hope will both provide social value and grow into new products that create jobs and solve complex problems.

What the Heck Is a Toolbox?

We all know what a ‘toolbox’ is in the physical world, but what do we mean by a toolbox in a GIS context? Toolboxes are a way to wrap up a series of GIS processes into a small software program. The ESRI ArcGIS platform includes several toolboxes with the desktop ArcView, ArcEditor and ArcInfo licenses. These toolboxes include things like ‘Data Management’, ‘Conversion Tools’ and ‘Analysis Tools’. Additional toolboxes are provided with extensions such as Spatial Analyst.

But toolboxes are not limited to functionality delivered by ESRI. Any GIS software process can be automated and turned into a toolbox for use in your organization. Toolboxes can be created from GIS models, Python scripts or custom ArcObjects programs.

At Azavea, we are using the toolbox technology to automate the integration of the legislative districts that drive our Cicero web service. Our DecisionTree product also includes a custom toolbox that helps to create the raster GRID files that can be used as inputs in the online application. But the most exciting development with toolboxes arrived last year with the release of ArcGIS Server.

ArcGIS Server is much more than the successor to the internet map server technology in ArcIMS. While it is able to perform tasks such as map generation and geocoding, the full range of capabilities in the ArcObjects framework can be accessed. In addition, many types of toolboxes and models can be ‘published’ as web pages that enable users of an ArcGIS Server application to run those tools without the desktop application. This is an incredibly powerful capability. It means that not only can you build models and toolboxes to automate your desktop processes, but you can now enable visitors to your website to perform many of the same tasks. So, for example, let’s say that you work at a land trust. You might have built a conservation prioritization model to enable people inside your organization to quickly assess properties based on a series of input data sets. ArcGIS Server now makes it possible to make that model available to the town planning boards, citizen groups and other stakeholders in your region.

What the Heck is…a GIS Model?

"When I founded Azavea seven years ago, one of my dreams was to make the process of building and executing GIS models easier."
Robert Cheetham

We usually think of a ‘model’ as a way of representing the world. But the term model can be a bit confusing in the GIS world. There are data models – a way of representing the world in a database. We have many ways of representing the world in a GIS database – points, lines, polygons, images, surfaces and 3D volumes are the most common but there are many variations on these basic building blocks. In recent years standard data models have been developed to encompass common concerns in particular domains. ESRI and other organizations have published data models for transportation, land records, hydrology, telecom, water/wastewater, to name just a few. Contemporary software is usually structured in terms of objects. Object models help us to represent the world in a software program.

A third type of model represents our world in terms of processes. In this sense, a GIS model is a sequence of processes that generate a measurement, create a map, transform existing data sets into new ones or run repeatedly to create a simulation. The objectives of a process model can vary broadly. Very commonly, a model is simply a way to automate a sequence of actions that we would otherwise have to perform manually. In other cases, the model may be generating a measurement or other output for a particular set of inputs.

Azavea has worked on a few projects that were composed almost entirely of this type of model. The Natural Lands Trust developed the SmartConservation model, a methodology for scoring any location in SE Pennsylvania by calculating more than 40 different conservation and landscape ecology metrics. These scores were combined into a single SmartConservation score for a property that indicated its conservation value. Azavea wrote software using ArcIMS and ArcObjects to automatically calculate these metrics with only a web browser.

These types of models have existed on paper for as long as people have been using GIS software, but it became much easier to chain together a series of operations with the advent of flowchart-style tools now present in several GIS software packages. In the ArcGIS environment, models are created by either writing a script or using ModelBuilder. ModelBuilder is a visual programming language that enables an ArcGIS user to drag data sets and GIS processes onto a drawing surface where they can be connected together and turned into sequences of operations. The models (which are also known as ‘tools’) can be strung together into larger models, can be shared amongst users with common data sets and can even be published on the web using ArcGIS Server.
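
To make the idea of a process model a little more concrete, here is a rough sketch in Python of a model as a chain of steps, where a finished model can itself be dropped into a larger model. Everything here (build_model, scale, threshold, invert and the tiny sample grid) is a hypothetical stand-in invented for illustration; it is not the ModelBuilder or ArcGIS scripting interface.

```python
def scale(factor):
    """Build a step that multiplies every cell by `factor`."""
    def step(grid):
        return [[value * factor for value in row] for row in grid]
    return step


def threshold(cutoff):
    """Build a step that marks cells at or above `cutoff` as 1, others as 0."""
    def step(grid):
        return [[1 if value >= cutoff else 0 for value in row] for row in grid]
    return step


def invert(grid):
    """Step that flips a 0/1 grid."""
    return [[1 - value for value in row] for row in grid]


def build_model(*steps):
    """Chain steps into a single model.

    The returned model takes a grid and returns a new grid, so it can be
    used as a step inside a larger model.
    """
    def model(grid):
        for step in steps:
            grid = step(grid)
        return grid
    return model


# A small model, then a larger model that reuses it as one of its steps.
hotspots = build_model(scale(2), threshold(10))
coldspots = build_model(hotspots, invert)

raw = [[3, 5], [6, 4]]
print(hotspots(raw))
print(coldspots(raw))
```

Because every step takes a grid and hands back a new one, the order of the steps is the whole model, much like boxes connected by arrows on a drawing surface.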

When I founded Azavea seven years ago, one of my dreams was to make the process of building and executing GIS models easier. Are there processes you would like to automate or geographic models you would like to build? Give us a call.

What the Heck is a Web Service?

In addition to development of custom GIS web applications, Azavea has been developing web services for the past few years. What exactly are web services? They are a standards-based way to provide software building blocks over a network. They are not complete web applications on their own. Rather, they are small pieces of capability that can be combined to build new applications. A web service is also sometimes called a Web API (Application Programming Interface).

One web service that Azavea developed and hosts is Cicero. Cicero is a legislative district locator, elected official database, and legislative mapping service that provides data on local, state, and national legislatures. It is being used to support political advocacy campaigns and data integration.

ESRI also offers a suite of web services known as ArcWeb Services that provides geocoding, spatial query, and map generation capabilities that can be integrated into any application with access to the web. Several of Azavea’s web applications use ArcWeb Services, including Delaware Valley Association for the Education of Young Children’s (DVAEYC) CONNECT Services. This application uses ArcWeb for routing and geocoding. The key advantage is that the service provider (ESRI) takes responsibility for providing up-to-date street data, and we can focus on how we want the application to use the data instead of managing it ourselves.

Web services can also be chained together so that one building block is used by another to provide a new capability. For example, in Cicero, we use ArcWeb Services for locating addresses, but then we use the Cicero data for looking up the legislative districts, creating maps or finding data about legislators for the location. When web services are linked together like this into a more complex system, it is sometimes known as a Service-Oriented Architecture (SOA).
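
The chaining described above can be sketched in a few lines of Python. To keep the example runnable without a network connection, both services are faked with small in-memory lookups; the function names, addresses, coordinates and district boxes are all made up for illustration and are not the actual Cicero or ArcWeb Services interfaces.

```python
# Made-up data standing in for two remote services.
ADDRESSES = {
    "1 Example St": (2.0, 3.0),
    "9 Sample Ave": (7.5, 1.0),
}
DISTRICTS = {
    "District A": (0.0, 0.0, 5.0, 5.0),   # (xmin, ymin, xmax, ymax)
    "District B": (5.0, 0.0, 10.0, 5.0),
}


def geocode(address):
    """Stand-in for a geocoding service: address in, (x, y) out."""
    return ADDRESSES[address]


def find_district(x, y):
    """Stand-in for a district lookup service: point in, district name out.

    Returns None when the point falls outside every known district.
    """
    for name, (xmin, ymin, xmax, ymax) in DISTRICTS.items():
        if xmin <= x < xmax and ymin <= y < ymax:
            return name
    return None


def district_for_address(address):
    """New capability built by chaining the two building blocks."""
    x, y = geocode(address)
    return find_district(x, y)


print(district_for_address("1 Example St"))
print(district_for_address("9 Sample Ave"))
```

In a real system each function would wrap a request to a separate service, and the calling application would not need to know or care who maintains the street data behind the first one.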

Web services are a fundamental part of the Web 2.0 revolution that focuses on making data open and easily sharable. There are several web sites that facilitate working with web services. Programmable Web is sort of like a phone book for public web APIs. OpenKapow enables users to develop their own web services that consist of sequences of actions one would take on a web site to view data or perform an activity. And Yahoo! Pipes enables users to combine sequences of RSS feeds into customized data streams.