Scala for Distributed Processing

Concurrent programming, in which a program is organized as processes that can run in parallel or be distributed across a cloud of separate computers, is at the heart of many of Azavea's technologies. Josh Marcus' research focuses on the new generation of programming languages and libraries that communicate through message passing instead of shared memory, which makes interactions between processes or threads explicit rather than implicit. Josh is interested in the Actor Model, a model of concurrent computation built on message passing, as an approach that is potentially safer and easier than traditional threads and shared memory, and one that unifies parallel and distributed computing.
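To make the message-passing idea concrete, here is a minimal sketch of an actor-like counter in Scala. It is not Akka and not the Scala actors library; it assumes only the standard JVM classes, and every name in it (CounterActor, Add, Get, Stop, ActorSketch) is hypothetical. The state is private to one worker thread, and other code can only affect it by sending messages to a mailbox.

```scala
import java.util.concurrent.LinkedBlockingQueue

// Messages the counter understands (hypothetical names, for illustration only).
sealed trait Message
case class Add(n: Int) extends Message
case class Get(replyTo: LinkedBlockingQueue[Int]) extends Message
case object Stop extends Message

// An actor-like worker: private state that changes only in response to messages.
class CounterActor {
  private val mailbox = new LinkedBlockingQueue[Message]()
  private var total = 0 // read and written only by the worker thread below

  private val worker = new Thread(new Runnable {
    def run(): Unit = {
      var running = true
      while (running) {
        // Messages are handled one at a time, in arrival order.
        mailbox.take() match {
          case Add(n)       => total += n
          case Get(replyTo) => replyTo.put(total)
          case Stop         => running = false
        }
      }
    }
  })
  worker.start()

  // Sending is asynchronous: the caller only enqueues the message.
  def send(msg: Message): Unit = mailbox.put(msg)

  // Wait for the worker thread to finish after Stop has been sent.
  def join(): Unit = worker.join()
}

object ActorSketch {
  def run(): Int = {
    val counter = new CounterActor
    (1 to 100).foreach(i => counter.send(Add(i)))
    val reply = new LinkedBlockingQueue[Int]()
    counter.send(Get(reply)) // ask for the total; the answer arrives as a message
    val result = reply.take()
    counter.send(Stop)
    counter.join()
    result
  }
}
```

Because no caller ever touches `total` directly, there are no locks in the calling code; the mailbox is the only point of interaction. A real actor library adds much more (supervision, remote delivery, scheduling many actors on few threads), none of which this sketch attempts.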

Recently, Josh has been exploring the use of Scala and the open source Akka project in a distributed geoprocessing platform. Scala is a programming language developed in 2003 that runs on the Java Virtual Machine. It was designed to integrate functional and object-oriented programming, and it includes an implementation of the Actor Model (based on Erlang). For geoprocessing in particular, it has the advantage of allowing access to a wide array of Java libraries and of having strong support for building domain-specific languages. Scala's status as a hybrid language also allows the use of tools (e.g. mutable variables) that are important for raster processing but are not allowed by other languages with a functional or concurrent programming focus. Scala is also a very fun and concise language to program in. Josh has a long-standing interest in finding a bridge between the expressiveness of the dynamic languages (Python, Ruby, etc.) and the stability and power of type safety, and Scala is a good step in that direction.
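As a rough illustration of the hybrid style mentioned above, the sketch below applies a function to every cell of a raster. The Raster type and the mapCells helper are hypothetical and are not taken from any Azavea code; the point is only that a mutable array and a while loop can sit inside a function that looks purely functional from the outside.

```scala
// Hypothetical raster type: cell values stored row by row in a flat array.
final case class Raster(cols: Int, rows: Int, cells: Array[Int])

object RasterOps {
  // Apply f to every cell and return a new raster.
  // Internally this uses a mutable output array and a mutable loop index,
  // which avoids allocating an object per cell; the input raster is not modified.
  def mapCells(r: Raster)(f: Int => Int): Raster = {
    val out = new Array[Int](r.cells.length)
    var i = 0
    while (i < out.length) {
      out(i) = f(r.cells(i))
      i += 1
    }
    r.copy(cells = out)
  }
}

object RasterDemo {
  // Usage: double every value in a small 2 x 2 raster.
  def doubled(): Raster =
    RasterOps.mapCells(Raster(2, 2, Array(10, 20, 30, 40)))(_ * 2)
}
```

Callers pass an ordinary function such as `_ * 2` and get a new Raster back, so the mutation stays local to mapCells.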