Machine Learning to Drive Urban Resilience: Mapping Tree Canopy with the World Bank

While we may not exactly be tree-huggers, it’s fair to say that Azavea has been friendly with trees. We’ve mapped the change in Philadelphia’s tree canopy, developed a mobile tree mapping application, and even planted trees in front of our Spring Garden Street office. Throughout our 20-year history, we and our arboreal companions have been as tight as bark on a…well, I’m sure you can figure the rest out.

We recently had another opportunity to work on a tree-related project with the World Bank’s Digital Works for Urban Resilience program. In one of seven pilot programs to address environmental issues in Africa, we trained student workers to label satellite imagery using GroundWork and created a machine learning model to identify tree canopy.

Driving urban resilience in Africa

Digital Works for Urban Resilience addressed two issues simultaneously: the need for better information to drive decision-making in the rapidly growing urban areas of Africa, and the increased difficulty of obtaining safe employment during the COVID-19 crisis. It explored various ways of creating rich datasets to drive urban planning and disaster preparedness by training and employing young Africans affected by the pandemic.

Azavea’s “rapid pilot” centered on the greening efforts of Freetown, Sierra Leone, where one of the city’s objectives is to plant 1 million trees and increase vegetation coverage by 50 percent. To monitor progress, the city needed a baseline map, a change detection model, and a replicable and successful data labeling protocol. Azavea stepped in to help test such a workflow, map tree canopy in both Freetown and Dar-Es-Salaam, Tanzania, and train a model from the resulting data.

Mapping tree canopy

As part of our managed annotation services, we maintain a team of CloudFactory (CF) data analysts trained in the use of GroundWork and annotation of satellite imagery. Our data analysts provided professional-level annotation of the 30-50 cm Maxar imagery obtained by the World Bank. The main objective of the pilot, though, was to test the feasibility of developing labeling workforces for urban resilience across Africa, so we also trained 50 students of the Tanzania Resilience Academy to label tree canopy. This allowed them to complete a paid work placement program in lieu of the one interrupted by the COVID-19 pandemic. The students mapped over 4,500 GroundWork tasks in imagery of Freetown and Dar-Es-Salaam.

Resilience Academy students used GroundWork to label tree canopy in Freetown, Sierra Leone. (Source: The World Bank)

Rooting out issues

In this two-class semantic segmentation campaign, image interpretation was a bit challenging due to the image resolution. Large shrubs proved to be the trickiest feature for both the professional and student labelers to identify, as the resolution made it hard to distinguish between shrubs and tree canopy. The large size of both the imagery and the individual tasks within GroundWork made data labeling more challenging still.

Despite a short pilot period, student labelers were able to map significant amounts of Freetown’s tree canopy (in yellow), as shown in the GroundWork task map.

With more people and hours to devote to the campaign, the Resilience Academy students were able to label significantly more of the Freetown imagery than CF. However, some tasks were so detailed that they required an hour or more to complete. This caused fatigue and posed technical problems given network limitations. In contrast, CF analysts were more experienced at handling large tasks, but their smaller numbers meant we needed to triage (or rather, tree-age?) the task map to get a representative sample of labels. We developed two new GroundWork features to address such problems: the ability to split a task into smaller parts and to open nearby tasks while in the labeling interface.
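To make the task-splitting idea concrete, here is a minimal sketch of dividing one rectangular task extent into a grid of smaller sub-tasks. It is only an illustration and not how GroundWork implements the feature: the `split_task` function, the `Box` type, and the pixel extent used in the example are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical task extent: (min_x, min_y, max_x, max_y) in any consistent unit.
Box = Tuple[float, float, float, float]


def split_task(box: Box, rows: int = 2, cols: int = 2) -> List[Box]:
    """Split one rectangular task extent into rows x cols equal sub-tasks."""
    if rows < 1 or cols < 1:
        raise ValueError("rows and cols must be at least 1")
    min_x, min_y, max_x, max_y = box
    width = (max_x - min_x) / cols
    height = (max_y - min_y) / rows
    parts: List[Box] = []
    # Walk the grid row by row, left to right.
    for r in range(rows):
        for c in range(cols):
            parts.append((
                min_x + c * width,
                min_y + r * height,
                min_x + (c + 1) * width,
                min_y + (r + 1) * height,
            ))
    return parts


# Example: an assumed 512 x 512 pixel task split into four quarters.
quarters = split_task((0.0, 0.0, 512.0, 512.0))
```

Smaller sub-tasks like these take less time each and involve less data per request, which is the kind of relief the fatigue and network problems called for.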

Developing a new digital workforce

To facilitate workflow development and assure quality, we held a virtual “face-to-face” session for the students and their technical advisors. We also held weekly calls, provided routine visual feedback, and guided the advisors in developing management documentation, such as “burndown” charts. In return, our partners provided local knowledge and surfaced edge cases and technical issues. Collaboration was especially important given the speed with which Resilience Academy labelers had to learn not only how to identify tree canopy, but also how to use GroundWork, and in many cases, to label satellite imagery at all.

A Resilience Academy student uses GroundWork to label tree canopy.
(Source: Chris Morgan, World Bank)

Student participants reported that they enjoyed the collaborative work environment, which they felt contributed to their skills development and productivity. Several even used their new skills to participate in a competition we ran to label cloud cover. Despite their success, the short time frame of the pilot made the learning curve almost too steep. We all agreed that future efforts should include a longer training period.

Modeling tree canopy

Once labeling was complete, machine learning engineers James McClain and Adeel Hassan developed a model to identify tree canopy from the results. They used Azavea’s machine learning framework Raster Vision to process and consume the training data and train the models. The engineers employed several deep learning approaches to developing an optimal model, including comparing Panoptic FPN and DeepLab V3 model architectures, pre-training, and making use of other imagery bands. The final model was an “ensemble” of those approaches that produced the best output. The ensemble model showed promising results: it was able to accurately (based on the F1 score) predict tree canopy in both Freetown and Dar-Es-Salaam. While it is probable the model will return accurate predictions on similar geographies in a similar season, we find it’s always best to retrain the model with data that address any differences in geography or season.
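For readers unfamiliar with the terms, the sketch below shows one common way to ensemble segmentation models (averaging their per-pixel probabilities) and how an F1 score is computed for a two-class mask. The post does not say how the models were actually combined, so the averaging step is an assumption, and the function names and toy pixel values are hypothetical. This is plain Python, not Raster Vision code.

```python
from typing import List, Sequence


def ensemble_mean(prob_maps: Sequence[Sequence[float]]) -> List[float]:
    """Average per-pixel tree-canopy probabilities across several models."""
    if not prob_maps:
        raise ValueError("need at least one probability map")
    n = len(prob_maps)
    return [sum(pixel) / n for pixel in zip(*prob_maps)]


def f1_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """F1 for the positive (tree canopy = 1) class of a binary mask."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example: two hypothetical models scoring four flattened pixels.
model_a = [0.9, 0.2, 0.6, 0.1]
model_b = [0.7, 0.4, 0.2, 0.3]
mean_probs = ensemble_mean([model_a, model_b])

# Threshold at 0.5 (an assumed cutoff) to get a canopy / not-canopy mask.
prediction = [1 if p >= 0.5 else 0 for p in mean_probs]
ground_truth = [1, 0, 1, 0]
score = f1_score(prediction, ground_truth)
```

Because F1 combines precision and recall for the canopy class, it is less flattering than plain pixel accuracy when most of an image is not tree canopy.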

Sample output of a model trained on student labels. The results look good despite the low validation scores.

During the modeling process, we compared the student and CF labels, and the results showed that despite their limited experience, student labelers were able to produce good quality annotations. Though the CF dataset was more accurate, the students’ data produced a usable model. This bodes well for the development of a labeling workforce in Freetown and other cities in Africa.
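One simple way to compare two sets of labels directly is intersection over union (IoU) on the canopy class. The sketch below is illustrative only: the post does not describe which comparison method was used, and the `label_iou` function and the toy masks are hypothetical.

```python
from typing import Sequence


def label_iou(a: Sequence[int], b: Sequence[int]) -> float:
    """Intersection over union of two binary canopy masks (1 = tree canopy)."""
    if len(a) != len(b):
        raise ValueError("masks must cover the same pixels")
    intersection = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    union = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Two empty masks agree completely, so treat that case as perfect overlap.
    return intersection / union if union else 1.0


# Toy example: flattened masks for the same six pixels from each labeling team.
student_mask = [1, 1, 0, 0, 1, 0]
cf_mask = [1, 0, 0, 0, 1, 1]
agreement = label_iou(student_mask, cf_mask)
```

A value near 1 means the two teams outlined nearly the same canopy; lower values point to areas, such as large shrubs, where interpretations differed.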

Going out on a limb for climate change

The next step for the Digital Works for Urban Resilience program is to incorporate these microtasking workflows into long-term projects. The program also hopes to better understand how data labeling “microtasks” can address skill-building and short-term employment opportunities for populations beyond university students.

At Azavea, we believe you know a tree by its fruit, and we’re always looking for ways to create impact for good in the world. If you’re looking to join us, we have a suite of machine learning tools, frameworks, and knowledge to help you use artificial intelligence to combat climate change. And if you have a project you need help with, don’t hesitate to reach out to us.