Cloud Optimized GeoTIFF Support in GeoTrellis 2.0 Release

The GeoTrellis project just published the official 2.0.0 release. The most significant change is native Cloud Optimized GeoTIFF (COG) support. Additionally, there are a number of smaller new features, bug fixes, and performance improvements. This post provides an overview of the most important improvements, but you can find the complete list of changes here.

GeoTrellis 2.0.0 supports Cloud Optimized GeoTIFFs

Impact on users

Moving from the previous version, GeoTrellis 1.2, to 2.0 is a major release under SemVer, which means there are breaking API changes. However, 2.0 is not a library overhaul. For many users, no changes to existing codebases will be required to continue using GeoTrellis as before.

Projects that depend on the HBase, Cassandra, or S3 backends may require some code changes. Additionally, this is the first release with support for GeoTiff Overviews (which are required to support COGs), and that required some changes to GeoTiff types.

To use GeoTrellis 2.0, you will need to upgrade your project and change the organization to “org.locationtech” as shown below:

libraryDependencies += "org.locationtech.geotrellis" %% "geotrellis-raster" % "2.0.0"
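For projects that pull in more than one GeoTrellis module, a build.sbt fragment along the following lines keeps the versions in step. This is only a sketch: the choice of modules here (geotrellis-spark and geotrellis-s3 alongside geotrellis-raster) and the `geotrellisVersion` name are assumptions for illustration, so list whichever modules your project actually uses.

```scala
// build.sbt (sketch): the module selection below is an assumption,
// not a required set. Keep only the modules your project depends on.
val geotrellisVersion = "2.0.0"

libraryDependencies ++= Seq(
  "org.locationtech.geotrellis" %% "geotrellis-raster" % geotrellisVersion,
  "org.locationtech.geotrellis" %% "geotrellis-spark"  % geotrellisVersion,
  "org.locationtech.geotrellis" %% "geotrellis-s3"     % geotrellisVersion
)
```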

GeoTrellis will still be available on Maven Central via Sonatype’s Nexus repository, in addition to repo.locationtech.org.

Any questions about migrations or how these changes may affect your project should be posted to the GeoTrellis Gitter.

Cloud Optimized GeoTIFFs – COGs

There has been a lot of well-deserved excitement around Cloud Optimized GeoTIFFs (COGs) recently. We can thank Chris Holmes from Planet / Radiant.Earth / everywhere else for his ability to push for focus and move folks forward together to make it the standard. Among many other resources, see this community COG website, Chris’ 3-part series on COGs, this description of COGs here, and this discussion of COGs for Deep Learning.

So what does it mean to have COG support in GeoTrellis? At the highest level, it means users can build internal overviews, read both internal and external overviews, and create real COG rasters that are valid according to this Python validation script. Previously, GeoTrellis employed a custom Avro format for reading and writing layers. This meant that source data needed to be copied and rewritten in a format optimized for GeoTrellis that could not be easily consumed elsewhere. With the new Layers API, the meaning of the term GeoTrellis Layer has changed. Moving forward, we will call the old GeoTrellis layers Avro Layers, as all of their Tiles are stored as compressed Avro blobs.
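To make the idea of overviews more concrete, here is a minimal, self-contained sketch of how the sizes of successive overview levels might be derived. It does not use the GeoTrellis API. The names `OverviewSketch`, `Dimensions`, and `overviewLevels` are hypothetical, and the sketch assumes each overview halves the previous level (rounding up) until the whole raster fits in a single 256-pixel tile, which is a common convention rather than a requirement.

```scala
import scala.annotation.tailrec

// Hypothetical helper, not part of GeoTrellis: illustrates overview sizing only.
object OverviewSketch {
  /** Width and height of one raster level, in pixels. */
  final case class Dimensions(cols: Int, rows: Int)

  /**
    * Returns the full-resolution level followed by each overview level.
    * Assumption: every overview halves the previous level, rounding up,
    * and we stop once the level fits inside a single tile.
    */
  def overviewLevels(base: Dimensions, tileSize: Int = 256): List[Dimensions] = {
    require(base.cols > 0 && base.rows > 0 && tileSize > 0, "sizes must be positive")

    @tailrec
    def loop(current: Dimensions, acc: List[Dimensions]): List[Dimensions] =
      if (current.cols <= tileSize && current.rows <= tileSize) (current :: acc).reverse
      else {
        val next = Dimensions((current.cols + 1) / 2, (current.rows + 1) / 2)
        loop(next, current :: acc)
      }

    loop(base, Nil)
  }
}
```

For example, `OverviewSketch.overviewLevels(OverviewSketch.Dimensions(1024, 512))` yields three levels: 1024x512, 512x256, and 256x128. A reader that only needs a zoomed-out view can fetch the smallest level that satisfies the request instead of the full-resolution data.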

COGLayers is the new Layers API that enables users to store scenes as valid COGs with generated VRTs, which in turn makes it possible to view ingested layers via common GIS tools (like QGIS and ArcGIS!).

Use Cases

It is exciting to visualize COGs in action. One good way to do that is through Azavea’s Raster Foundry project.

We incorporated native COG support into Raster Foundry. The following two GIFs show a user supplying a remote URL to define a COG source image in Raster Foundry and then using a GUI to build a model to perform analysis on a COG layer.

This is exciting because data can be consumed quickly and without duplication while analysis is performed on large rasters, all in the browser. Raster Foundry uses AWS S3 with COGs stored in Postgres to query the layers.

 

Highlighted changes

We refactored HBaseInstance and CassandraInstance to be more flexible and to fit a more general pattern of user requirements.

Dependence on cats

We moved from scalaz and our own Functor implementation to the cats library. Read more about our evaluation here.

ETL package deprecation

The ETL package is being deprecated, and we don’t plan to actively maintain it. However, feel free to report issues here, and we are also happy to provide migration support via our Gitter channel.

Slick package deprecation

We are happy to report that the slick package has been deprecated! As a result, some objects were moved to new locations (see the change log for clarifications).

Collection Layer Readers SPI interface

This release introduces SPI interfaces for CollectionLayerReaders.

Behavior changes

All TIFFs are read compressed and streaming by default; there is no decompress flag on reads at the moment.

What’s next

COGs

There is still more work to be done. Using COGs incurs a performance cost compared to Avro-encoded layers. We’d like to get this performance closer to parity.

We’d also like to improve query performance for large spatiotemporal layers by implementing more advanced space-filling curve indexing techniques, based on the approach used in GeoWave.
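For readers unfamiliar with space-filling curve indexing, the sketch below shows the basic idea using a plain two-dimensional Z-order (Morton) curve. It is not the GeoWave-based approach mentioned above and it does not use GeoTrellis classes. `ZCurveSketch` and `index` are hypothetical names, and a spatiotemporal layer would additionally need a time dimension, which is left out here.

```scala
// Hypothetical helper, not part of GeoTrellis: a basic 2D Z-order (Morton) index.
object ZCurveSketch {
  /**
    * Interleaves the bits of a tile's column and row into one Long.
    * Column bits land on even positions, row bits on odd positions,
    * so tiles that are close in space tend to get nearby index values.
    */
  def index(col: Int, row: Int): Long = {
    require(col >= 0 && row >= 0, "col and row must be non-negative")
    (0 until 31).foldLeft(0L) { (acc, bit) =>
      val colBit = ((col >> bit) & 1L) << (2 * bit)
      val rowBit = ((row >> bit) & 1L) << (2 * bit + 1)
      acc | colBit | rowBit
    }
  }
}
```

Sorting the four tiles of a 2x2 grid by `ZCurveSketch.index` visits them in the order (0,0), (1,0), (0,1), (1,1). Because a one-dimensional key-value store can only scan ranges of a single key, the quality of this mapping largely determines how many ranges a spatial query has to touch, which is why more advanced curves are worth investigating.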

Pipeline Project

The old ETL process was not as effective as we had hoped; many GeoTrellis ingests required an ETL process that was not as simple as it should have been. Instead, we are planning to introduce a pipeline process – following a model used in the PDAL pipeline project. Read more about it here.

GDAL Native

We plan to increase the interoperability of GeoTrellis systems by allowing them to read a wider range of formats using standard tools like GDAL.

Virtual Tile Mosaic

Improve the ability to work with COG layers.

GeoTrellis Server

GeoTrellis Server is a lightweight server for developers that simplifies interaction with GeoTrellis-compatible formats (for instance, Avro / COG Layers and GeoTIFFs). GeoTrellis Server is based on http4s and uses MAML (Map Algebra Modeling Language) to perform operations on tiles. MAML allows for the definition of image processing pipelines over a variety of different data sources, allowing stored imagery to be recombined and altered in interactive applications. With MAML, GeoTrellis Server aims to provide a framework to build web servers more easily from your existing GeoTrellis layers. This will include support for OGC’s WCS standard, as well as TMS in the not-so-distant future.
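The general idea of describing an image processing pipeline as data can be sketched with a tiny expression tree. To be clear, this is not MAML's actual API: `MapAlgebraSketch`, `Expr`, `Source`, `Add`, `Subtract`, `Divide`, and `eval` are all hypothetical names, and a tile is simplified to a `Vector[Double]`. The NDVI expression is just a familiar example chosen for illustration.

```scala
// Hypothetical sketch, not MAML: a pipeline described as a small expression tree.
object MapAlgebraSketch {
  /** A tile simplified to a flat vector of cell values. */
  type Tile = Vector[Double]

  sealed trait Expr
  final case class Source(name: String) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr
  final case class Subtract(left: Expr, right: Expr) extends Expr
  final case class Divide(left: Expr, right: Expr) extends Expr

  /** Combines two tiles cell by cell; both tiles must have the same size. */
  private def zipWith(a: Tile, b: Tile)(f: (Double, Double) => Double): Tile = {
    require(a.length == b.length, "tiles must have the same size")
    a.zip(b).map { case (x, y) => f(x, y) }
  }

  /** Evaluates an expression against named source tiles. */
  def eval(expr: Expr, sources: Map[String, Tile]): Tile = expr match {
    case Source(name)   => sources(name)
    case Add(l, r)      => zipWith(eval(l, sources), eval(r, sources))(_ + _)
    case Subtract(l, r) => zipWith(eval(l, sources), eval(r, sources))(_ - _)
    case Divide(l, r)   => zipWith(eval(l, sources), eval(r, sources))(_ / _)
  }

  /** NDVI = (nir - red) / (nir + red), written as a tree rather than as code. */
  val ndvi: Expr =
    Divide(Subtract(Source("nir"), Source("red")), Add(Source("nir"), Source("red")))
}
```

Because the expression is plain data, the same tree could in principle be evaluated against tiles drawn from different sources, which is the property that makes this style attractive for interactive applications.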

Connect with us

We appreciate hearing about the projects that GeoTrellis supports – please get in touch via Twitter, our mailing list, our Gitter channel, or email to share what you are working on.

  • GitHub – Issues, codebase, documentation, everything you need
  • Our mailing list – Stay informed about releases, bug bashes, and GeoTrellis updates
  • Gitter – Scala is hard. We can help. Come ask questions about your GeoTrellis project
  • Twitter – We send team members to conferences, workshops, and share Big Data Open Source Geo project news
  • Email – Have questions about a project idea that could benefit from processing rasters at scale? Reach out to us via email – we’d love to hear from you!