In terms of raw data, the earth observation industry is undeniably exploding. Investments in freely available data from satellite constellations like MODIS, Landsat, and Sentinel have democratized access to timely satellite imagery of the entire globe (albeit at a lower resolution than you’re accustomed to seeing on Google Maps). Meanwhile, cloud providers like AWS and Google Cloud have gone so far as to store satellite data for free, further accelerating global usage of these images.
The trouble, naturally, is that interpreting the content of satellite imagery is not an easy task. In the field of remote sensing, researchers have been applying algorithmic techniques to the challenge of earth imagery interpretation for over 70 years. Until relatively recently, even simple tasks like identifying building footprints or distinguishing tree canopy in urban areas were laborious sub-specialties of the field. Then, in 2012, the “deep learning revolution,” opened up an entirely novel frontier of useful algorithms that could be applied to satellite imagery with state-of-the-art results.
Modern machine learning techniques, and deep learning, in particular, have made tasks like object detection, object counting, semantic segmentation, and generic image classification much more straightforward to create. Through a process called supervised learning, you simply provide the model as many hand-annotated examples as you can feasibly gather so that it can “train” itself to faithfully mimic those handmade examples when it makes predictions. Deep learning has proven to be an incredibly versatile technology–everything from apps on your phone that make you look younger, to the Alexa voice assistant, to Tesla’s Autopilot feature.
Deep learning traces its origins back to the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). The field of deep learning was initially focused on the kinds of images in datasets like ImageNet (which contains over 1 million individually annotated images scraped from the internet). These images tend to be relatively small files with three “channels” in the visible spectrum (red, green, and blue) and are stored in common file formats like PNG or JPEG. None of these characteristics resemble a typical satellite image, which is a principle reason why it’s still difficult to apply deep learning techniques to satellite imagery.
Most popular deep learning architectures are not designed for imagery that is often a gigabyte or larger, may contain over a dozen channels (most of which are not in the visible spectrum), and is stored in spatially referenced file formats like GeoTIFF and JPEG2000. So while advances in machine learning for computer vision tasks have led some, like Google’s former CEO Eric Schmidt, to declare image recognition a “solved problem,” for folks interested in applying these techniques to satellite imagery there remain many obstacles for even basic workflows.
Unlike scrapable imagery on the web, satellite imagery has traditionally been difficult to access. However, in recent years, analogs to ImageNet focused specifically on satellite imagery have filled the void. SpaceNet, for instance, is a semi-annual competition and collection of datasets focused on extracting information from satellite imagery. Similar competitions like xView and Functional Map of the World have dramatically expanded the availability of high-quality geospatial datasets for benchmarking new algorithms.
Still, even though training datasets for satellite imagery are freely available, the problem of actually wrangling that data or amending the architecture of common machine learning models to work with that data is still mostly in the research phase. Azavea has invested significant resources into making this final piece of the puzzle easier, namely via our open source python library for applying machine learning to satellite imagery called Raster Vision. Raster Vision allows users to do three messy things in an elegant way:
- Transform satellite imagery into a format that plays nicely with most machine learning frameworks. You can “chip” a large image into hundreds or thousands of smaller images that can be used to train a model and then retrospectively stitched back together while maintaining all of the relevant geospatial information crucial to most mapping tasks.
- Abstract the process of using common machine learning libraries, like PyTorch, so that you can easily train models, evaluate their results, and manage different experiments in parallel.
- Package trained models so that you can easily deploy them in different settings (e.g. online vs. “at the edge”) and use them to predict on new data.
Today, the availability of satellite imagery still far outpaces the commercial and scientific communities’ capacity to analyze it. Tools like Raster Vision are only a small, foundational step toward a future where extracting answers from satellite imagery is as easy as asking questions. Information contained in satellite imagery is an irreplaceable asset for tackling challenges as wide-ranging and important as quantifying the effects of climate change, predicting crop yields, and calculating progress toward the Sustainable Development Goals globally. Machine learning, and in particular, fast-evolving sub-disciplines like deep learning come with the promise of making satellite imagery analysis easier, more scalable, and even more broadly applicable.