What the Heck is … Cloud Computing?
From L to R: Amazon Web Services, Google App Engine and Microsoft’s Azure Services offer some of the leading cloud computing platforms.

In the beginning, computers were devices that filled rooms and whole buildings. They slowly shrank in size until, in the 1980s, computing underwent a revolution that brought Apple Macs and IBM PCs to our desks. In the 1990s, we began to connect all those personal computers to each other using the internet, creating a global network of computers. We are now in the midst of another revolution. The current transformation is returning computing power to machines that fill rooms and even entire warehouses, but this time, instead of a single computer filling that space, there are thousands of them filling data centers run by new, old and unexpected companies. These new data centers are being used to create a ‘cloud’ of network-accessible services and have recently been rebranded with the latest buzzword (at Azavea, we always seek to be fully buzzword-compliant) as ‘cloud computing’.

DecisionTree geographic calculation tools running on the Amazon cloud computing services will enable you to run high-performance geographic calculators without requiring your own infrastructure.

Cloud computing has actually been around for a while. Even before the internet, networked computers that could break up large tasks into small chunks were said to be engaged in ‘distributed computing’ or, more recently, ‘grid computing’. Cloud computing is the same concept applied to internet-connected computers. It really got started several years ago with attempts to engage global networks of PCs in large-scale science problems. The SETI@home project enabled people to contribute their idle PCs’ computing power toward examining radio signals for evidence of extraterrestrial life. Similar projects have tackled protein folding diseases, decryption and the processing of Large Hadron Collider (LHC) experiments. They have since been joined by global networks of spammers and hackers who manage thousands of compromised computers to form ‘botnets’ that are used to attack government computer systems or blackmail companies.

Aside from those bent on curing cancer or instigating global mayhem, contemporary cloud computing efforts are frequently aimed at more modest objectives. Amazon.com, the retailer, is one of the leaders in this field. Amazon Web Services (AWS) began as a way for Amazon to sell unused capacity in its data centers and is now an entire suite of reusable services being leveraged for all sorts of activities that have nothing to do with selling books and movies. The AWS Simple Storage Service (S3) is an online data storage service. The AWS Elastic Compute Cloud (EC2) enables software developers to create ‘virtual’ computers running Linux or Windows that can be applied to any computing task. Other AWS services handle credit card transactions, message queues, web search and order fulfillment. AWS has been joined by similar services at Google and a new Microsoft effort called Azure.

Many cloud computing providers provide dashboards displaying system availability.

Now imagine you are a small company that has a new idea that will require lots of computer servers. Before AWS and other services, you would have purchased your own servers and built a data center. Now, you can skip all that hassle by hosting your new idea on an infrastructure maintained at a much lower cost by Amazon, Google or Microsoft. These services are priced like your electricity and gas — you pay by the unit of storage, computing time or other metric. So as you need more capacity, you fire up another virtual server, but you only pay for what you use.
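To make the metered model concrete, here is a minimal sketch of how a monthly bill might be computed from usage. The unit prices and the `monthly_cost` function are hypothetical placeholders chosen for illustration. They are not actual rates from Amazon, Google or Microsoft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices (illustrative only, not real provider rates)."""
    per_server_hour: float = 0.10  # dollars per virtual server per hour
    per_gb_month: float = 0.15     # dollars per GB stored per month


def monthly_cost(server_hours: float, stored_gb: float, rates: Rates = Rates()) -> float:
    """Return one month's bill: you pay only for the units you actually used."""
    compute = server_hours * rates.per_server_hour
    storage = stored_gb * rates.per_gb_month
    return round(compute + storage, 2)


# One virtual server running all month (720 hours) with 50 GB stored.
quiet_month = monthly_cost(server_hours=720, stored_gb=50)

# Same storage, but four extra servers were fired up for a 48-hour traffic spike.
busy_month = monthly_cost(server_hours=720 + 4 * 48, stored_gb=50)

print(quiet_month, busy_month)
```

In this sketch, the extra capacity only shows up on the bill for the hours it was running, which is the point of the utility-style pricing described above.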

So what does cloud computing mean for geospatial services? Cloud-based geospatial services are already common. The APIs for Google Maps, Yahoo! Maps, Microsoft Virtual Earth and ESRI ArcGIS Online already provide some basic map display, geocoding, routing and other geospatial information services as hosted services. While none of these are based on the metered pricing that Amazon offers, I’m confident this type of business model is coming. A new company, CloudMade, is focused on creating commercial services that leverage the OpenStreetMap database.

At Azavea, our cloud computing work has focused on two of our services: Cicero and DecisionTree. To learn more about Dave Felcan’s research project on the AWS Elastic Compute Cloud (EC2), read his article below.

Research: The Amazon Elastic Cloud

I am very excited about my Azavea research project on the Amazon Elastic Compute Cloud (Amazon EC2), a technology from Amazon Inc. that is shifting a lot of people’s ideas about what computing is and can do. Amazon EC2 has arisen through the confluence of technological innovations of the past few years.

First, some background. One of the most basic pieces of infrastructure in the World Wide Web today is the ubiquitous entity known as “The Server”. This term is used for a computer which performs some task or tasks on behalf of other computers. For example, web pages come from a web “server”, which sends web pages to your computer for you to see. Moreover, this web server may in turn query other servers to complete the request: for example, it may contact a database server to get data or a geospatial server to produce a map image.

The idea behind a computing “cloud” (and there are others, as referenced in Robert’s ‘What the Heck is …’ article above) is a bunch of computers accessible from the internet that “instantiate” whole virtual computers on demand, complete with their associated operating systems, software, data, etc. One can instantiate one of these machines, connect to it via the internet through standard remote connection protocols, and voilà! Your screen shows the desktop for this “computer”, which behaves exactly as if it were sitting under your desk.

While this approach is odd for desktops, it can have many benefits for servers. With a few clicks of a mouse, multiple copies of the same server can be up and running at the same time to handle increases in demand. They can be shut down again when not needed. The details and headaches of actually running and owning physical machinery are offloaded to the cloud provider. The cloud provider also provides bandwidth. Once you have a working version of a website, database or geospatial server, it can be copied and reused, with no need to start from scratch with configuration.
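The idea of starting more copies when demand rises and shutting them down afterwards can be sketched as a simple sizing rule. This is only an illustration under stated assumptions: the `instances_needed` function, the capacity figure of 100 requests per second per server and the load numbers are all hypothetical, and a real deployment would use the cloud provider's own tools to start and stop machines.

```python
import math


def instances_needed(requests_per_second: float,
                     capacity_per_instance: float,
                     min_instances: int = 1,
                     max_instances: int = 20) -> int:
    """Return how many identical server copies are needed for the current load.

    The result is clamped so at least `min_instances` stay up and the bill
    can never exceed `max_instances` running servers.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    wanted = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, wanted))


# Hypothetical load samples over a day, in requests per second.
load_samples = [30, 45, 260, 900, 400, 20]

# Assume (for illustration) that one server copy handles 100 requests per second.
plan = [instances_needed(load, capacity_per_instance=100) for load in load_samples]
print(plan)
```

The plan grows during the spike and falls back to a single server afterwards, so the extra copies exist only while they are needed.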

For my research project, I am exploring the use of the Elastic Compute Cloud (EC2) as a resource for some Azavea projects already in use. DecisionTree, our geographic prioritization web system, was an ideal first candidate. This product requires strong computing resources and was designed from the ground up to be able to run on multiple computers. With EC2 we were able to run DecisionTree on 10 instances at once, dramatically speeding up its operations and providing a mechanism for running DecisionTree for customers who do not want to maintain their own server infrastructure.
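As a rough illustration of what running on multiple computers involves, the sketch below divides the rows of a grid calculation into near-equal blocks, one per instance. It is not how DecisionTree actually distributes its work. The `split_rows` function and the grid size of 1,005 rows are assumptions made up for this example.

```python
def split_rows(total_rows: int, workers: int) -> list[range]:
    """Divide row indices 0..total_rows-1 into contiguous, near-equal blocks."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    base, extra = divmod(total_rows, workers)
    blocks = []
    start = 0
    for i in range(workers):
        # The first `extra` workers take one additional row each.
        size = base + (1 if i < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


# Hypothetical job: a grid with 1,005 rows shared among 10 instances.
blocks = split_rows(total_rows=1005, workers=10)
for instance_id, block in enumerate(blocks):
    print(f"instance {instance_id}: rows {block.start} to {block.stop - 1}")
```

Each instance would process only its own block, and the partial results would then be combined. Contiguous blocks are just one possible choice, and a real system might split work by tile or by task queue instead.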

In addition to DecisionTree, we are also experimenting with running our Cicero legislative and election data service on EC2, as well as other ways to leverage Amazon Web Services. For example, last spring, we tested a map image ‘tile cache’ service that will generate and store a set of map tiles, enabling an organization to reduce bandwidth usage and improve responsiveness of a high-traffic web mapping application. While EC2 was originally limited to Linux-based software, the recent addition of Windows Server as a target platform has provided much more flexibility. Do you have ideas for how you could use Amazon Web Services for your GIS project? Let us know.
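The tile cache idea mentioned above can be sketched in a few lines. This is a minimal illustration, not the service we tested: the `TileCache` class and `fake_render` function are hypothetical, and the in-memory dictionary stands in for whatever durable storage a real service would use.

```python
from typing import Callable, Dict, Tuple

TileKey = Tuple[int, int, int]  # (zoom level, tile column, tile row)


class TileCache:
    """Generate each map tile once, then serve the stored copy on later requests."""

    def __init__(self, render: Callable[[int, int, int], bytes]) -> None:
        self._render = render                  # the expensive tile generator
        self._store: Dict[TileKey, bytes] = {}  # stand-in for durable storage
        self.hits = 0
        self.misses = 0

    def get(self, zoom: int, x: int, y: int) -> bytes:
        key = (zoom, x, y)
        if key in self._store:
            self.hits += 1
        else:
            # First request for this tile: render it and keep the result.
            self.misses += 1
            self._store[key] = self._render(zoom, x, y)
        return self._store[key]


def fake_render(zoom: int, x: int, y: int) -> bytes:
    """Placeholder renderer; a real one would draw a map image."""
    return f"tile {zoom}/{x}/{y}".encode()


cache = TileCache(fake_render)
first = cache.get(12, 1205, 1539)   # rendered and stored
second = cache.get(12, 1205, 1539)  # served from the cache
print(cache.hits, cache.misses)
```

Because popular tiles are rendered only once, repeated requests avoid the rendering cost, which is where the improvement in responsiveness comes from.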