Ethical Machine Learning for Disaster Relief: Avoiding the Second Disaster

After the deluge, there’s the…well, deluge. Or so it might seem to aid workers and agencies familiar with the phenomenon known as the “second disaster.” Following a crisis, a sea of donations and volunteers from across the globe floods relief agencies. The second calamity? How few of them are suitable for a crisis.

At best, abandoned items create environmental hazards. In 2004, for example, Indonesians burned rotting piles of unneeded clothing donations following the Indian Ocean tsunami. At worst, they divert (often scarce) time and resources away from victims.

Several people sift through a large pile of donated clothing sitting on an Indonesian beach.
Donations pile up on a Banda Aceh beach after the 2004 tsunami. (Source: USAID)

In many ways, technology in disaster relief has advanced exponentially since 2004. But, if the bottled water left to expire on Puerto Rican airport runways after Hurricane Maria is any sign, much has stayed the same.

Phenomena like “the second disaster” invite those of us committed to ethical machine learning to contemplate our humanitarian efforts. Good intentions don’t always build good roads. A modern landscape shaped by the exploitation of human and natural resources further complicates the ethics.

Azavea strives to use geospatial technology for good. Many in the industry feel likewise. Given that, before we aim the powerful capabilities of machine learning at vulnerable communities, which questions should we ask? How can we best harness that power to produce just outcomes for the communities we are trying to help?

Ethical machine learning and Earth observation: Redistributing power

The ability to photograph the Earth from above has had a massive cultural influence. In the nineteenth century, Nadar’s hot-air balloon photography inspired author Jules Verne. In the twentieth, The Blue Marble (an image of Earth taken during the Apollo 17 mission) inspired a generation of environmental activists.

An 1862 caricature by Honoré Daumier shows the photographer Nadar taking pictures from his hot-air balloon.
Photographer Gaspard-Félix Tournachon, known as Nadar, was the first to practice aerial photography. (Honoré Daumier, 1862)

Military and colonial uses have also molded its legacy. The First World War propelled the development of aerial reconnaissance. Trench warfare made the cavalry obsolete, and the airplane and camera took its place. This militaristic legacy was further entrenched as hot war tensions ossified in the Cold War era and the satellite rocketed surveillance capabilities beyond the stratosphere.

Now, even celebrities can get in on the espionage game. See: George Clooney, who, with the Satellite Sentinel Project (SSP), “kept an eye on Omar al-Bashir” using satellites over the border of Sudan and South Sudan. 

Though efforts such as Clooney’s are laudable, it is fair to question why the Sudanese have less access to their airspace than an actor half a world away. To avoid perpetuating disparities, we must recognize how history has biased our datasets and produced asymmetries of access.

De-biasing the dataset

The World Bank discusses the dangers of geographically biased datasets, noting the difference in building characteristics between African and European cities. Algorithms trained on one dataset are likely to perform worse on the other. Even the mechanical eye of the satellite is biased towards what it can see, and areas lacking urban infrastructure are in some ways less visible. The Black Marble, which shows how different North and South America appear from space at night, is a well-known example of this phenomenon.

Image of the Earth as seen from space. North America is well-lit compared to South America. It illustrates a difficulty of applying machine learning to satellite imagery--not all places are equally visible.
Satellite imagery reveals the difference in infrastructure in North America v. South America (lower right). (Source: NASA)
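The World Bank’s point about geographic bias can be made concrete with a toy experiment (our illustration, not the World Bank’s; all data here are synthetic). A classifier is trained on one “region,” where two per-building features relate to the label one way, and then evaluated on a second region where the same features relate to the label differently. In-region accuracy stays high while cross-region accuracy collapses:

```python
# Toy illustration of geographic dataset bias using synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def make_region(n, rule):
    # Two synthetic per-building features (think: roof area, height).
    X = rng.normal(size=(n, 2))
    y = rule(X).astype(int)
    return X, y

# Region A: the label depends on the sum of the features.
X_a, y_a = make_region(2000, lambda X: X[:, 0] + X[:, 1] > 0)
# Region B: the same features relate to the label differently.
X_b, y_b = make_region(2000, lambda X: X[:, 0] - X[:, 1] > 0)

model = LogisticRegression().fit(X_a[:1000], y_a[:1000])
acc_in = model.score(X_a[1000:], y_a[1000:])    # held-out, same region
acc_cross = model.score(X_b, y_b)               # different region

print(f"in-region accuracy:    {acc_in:.2f}")
print(f"cross-region accuracy: {acc_cross:.2f}")
```

The cross-region score hovers near chance: the model learned a relationship specific to the region it was trained on, which is the risk any algorithm trained on European building footprints runs when deployed over African cities.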

How can we analyze what we cannot see? Scientists studying landslide risk and climate change note the relative lack of information about the Global South. Without adequate datasets against which to train models, the power of machine learning to effect change in these regions is limited.

Following a disaster, these areas are also at greater peril than developed nations. The United Nations Department of Economic and Social Affairs (UN DESA) cites research that the flood disaster risk for low-income countries is 26 times higher than that of high-income countries. UN DESA also points out that 95 percent of deaths resulting from disaster occur in developing nations. With such stark statistics, expanded datasets are imperative for ethical machine learning.

Mapping Africa

Recently, our GeoTrellis team worked with the Agricultural Impacts Research Group (AIRG) at Clark University to help address data disparities in West Africa. Mapping Africa hopes to address hunger by improving the accuracy of cropland maps. Despite the continent’s large amounts of cultivable land, poverty remains high in many regions. Precise maps can inform better decision making at the policy level. And, better infrastructure improves disaster resiliency.

An animated gif displaying a machine learning model for detecting fields in satellite imagery of Northern Ghana.
Identifying fields in Northern Ghana (Growing season, off growing season, pixel-based predictions)

For the part of the project Azavea worked on, the Kenyan company Spatial Collective annotated images for a machine learning model to predict agricultural fields in Ghana. We leveraged our experience to clean up AIRG’s prototype and use Raster Foundry to serve out images to the application. This type of work helps bridge the digital divide via data as well as via people.
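The pixel-based prediction shown in the animation above can be sketched in miniature. The following is our simplified illustration, not AIRG’s actual model: each pixel of a synthetic single-band “vegetation index” raster is classified as field or not-field using a small set of annotated pixels, mirroring the annotate-then-predict workflow described here.

```python
# Minimal sketch of pixel-based cropland prediction on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic 64x64 "vegetation index" raster: a bright rectangular field
# on a darker background, plus sensor-like noise.
image = rng.normal(0.2, 0.05, size=(64, 64))
truth = np.zeros((64, 64), dtype=int)
truth[16:48, 16:48] = 1                      # the field's footprint
image[truth == 1] += 0.4                     # cultivated pixels are brighter

# Annotators label a random sample of pixels; we train on those alone.
pixels = image.reshape(-1, 1)
labels = truth.ravel()
idx = rng.choice(labels.size, size=500, replace=False)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(pixels[idx], labels[idx])

# Predict a label for every pixel to produce a cropland mask.
mask = clf.predict(pixels).reshape(64, 64)
accuracy = (mask == truth).mean()
print(f"pixel accuracy: {accuracy:.2f}")
```

A production model would use many spectral bands, spatial context, and imagery from multiple points in the growing season, but the core loop is the same: sparse human annotations train a model that labels every pixel.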

Reliance on big data alone exacerbates inequalities caused by the digital divide. Many nations lack the institutional knowledge and infrastructure to gather and maintain large repositories of data, much less to analyze them. Diversifying our datasets is only the first step. People, not data, will bridge the digital divide.

Decolonizing the workforce

The humanitarian industry relies on volunteers. The catalyzing of goodwill following a crisis is not only inspiring, it produces compelling results. After the 2010 earthquake in Haiti, volunteers used OpenStreetMap to document infrastructure on the island. Their work informed the distribution of resources in ways that would not otherwise have been possible.

This reliance on volunteers is not without consequences, however.

Responses that use data crowdsourced from the site of impact still rely on outside volunteers for technical analysis. These analysts often lack both the cultural competency to accurately interpret the data and the established avenues of communication needed to ensure that the data result in action. It may be that volunteers “lose” as much data as they gather.

A small group of people gathered around a computer for a training session in GIS techniques.
Members of Azavea’s Data Analytics team lead a GIS training in Guyana. 

Aid groups should consider broadening their base of volunteers. People in crisis-prone areas can be pre-trained in crisis mapping to increase the efficacy of their output. Agencies that expand their volunteer force will also be able to draw on non-traditional sources of data, such as the experiences and knowledge of locals who have survived previous disasters.

A 2016 paper also documents the complexities of sustaining and motivating a volunteer workforce long-term. Nor should crisis mapping be regarded as a cure-all. Evidence suggests that as much of the response in Haiti was effected via radio and the efforts of the Haitian diaspora as was effected by foreign aid groups. As much effort should be extended to fortifying the capacity of local actors to analyze data as is extended to recruiting volunteers from more technically developed areas.

Capacity building in local economies

Developing nations are leading the calls for further capacity building and skill development. Azavea’s Fernando Ramirez has heard the call at many of the international conferences he’s attended in support of our work to develop tools for the UN Sustainable Development Goals (SDGs). He recently documented the gap between what commercial entities think developing nations want and what leaders of these countries say they actually need.

What does it mean if the power structures of our industry mimic those of the colonial era? Changing these dynamics starts with listening to the people we are trying to help. At Azavea, this customer-centered approach has led to projects such as building GIS capacity in Guyana with the Inter-American Development Bank (IDB) and the country’s Central Planning and Housing Authority.

Ethical frameworks for collecting and sharing data

Who “owns” the data? Who owns the analysis? Was the data consensually gathered? Will the data empower underrepresented groups, or further exploit them? Who profits from it? These concerns are paramount to producing ethical machine learning.

Many are working on frameworks to address such questions, but few are listening to communities on the ground.

A macro-graphic of the FAIR (FIndable, Accessible, Interoperable, Reusable) and CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) Principles for collecting data for ethical machine learning.
The CARE Principles supplement the FAIR Principles by considering people as well as data. (Source: GIDA)

Current open data management standards such as the FAIR Principles focus on the characteristics of data. According to the Global Indigenous Data Alliance (GIDA), this ignores “power differentials and historical contexts.” GIDA proposes the CARE Principles for Indigenous Data Governance as a supplement to FAIR. CARE engages with “Indigenous Peoples’ rights and interests,” especially their right “to create value…in ways that are grounded in Indigenous worldviews and realise opportunities within the knowledge economy.”

Crisis mapping: Finding the forest in the trees

The use of machine learning on satellite or aerial imagery is now widespread in providing immediate disaster relief. Machine-trained algorithms inform decision-making about the dispersion of aid. Open datasets empower even small agencies to hold the privileged accountable.

Still, bad actors have found ways to make this information work to their benefit. Anticipating adverse outcomes is necessary for ethical machine learning.

The good

This type of work provides substantial aid for communities in need. In the Near East, organizations are using machine learning to develop better resettlement programs for refugees. Human Rights Watch (HRW) also used YouTube videos to counter the Syrian government’s claim that the 2013 Ghouta chemical attacks were the work of rebels. Using interviews with survivors and satellite imagery (amongst other data), HRW deduced that Bashar al-Assad’s regime had most likely carried out the attack.

A 2006 satellite image of Damascus, Syria. Machine learning applied to such imagery is used in disaster relief and raises many ethical questions.
A 2006 satellite image of Damascus, a city in the Ghouta area. (Source: Wikipedia)

The bad

The new wealth of data is open to those with ill intentions as well. And, the chaos following a disaster can provide the perfect opportunity to exploit it. SSP (heralded by George Clooney) did an analysis of satellite imagery that documented war crimes in Sudan and predicted future incursions. But, armed militias used the data made public by SSP to better target civilians.

A close-up of George Clooney as he listens to President Barack Obama who is out-of-focus in the bottom of the image. The two are discussing the ethics of using machine learning to aid the Sudanese.
George Clooney and President Barack Obama discuss the 2010 Sudan conflict. (Source: Wikimedia)

The organization PakReport.org, which launched following the 2010 floods in Pakistan, found itself in a similar dilemma. Open datasets of crowdsourced crisis maps were restricted after the Taliban threatened to use them to target foreign aid workers.

The necessary 

The gravity of crisis situations demands robust policies for responsible data use. And, their urgency requires policies established well in advance of need. Developments in this area are accelerating. In 2018, the Center for Democracy & Technology (CDT) analyzed 18 frameworks organizations are building to help them navigate issues of data and aid ethically.

The United Nations Office for the Coordination of Humanitarian Affairs (OCHA) outlines a four-step process for data responsibility:

  • Identify the need
  • Assess core competencies and capacities
  • Manage risk to vulnerable populations
  • Adhere to legal and ethical regulations

OCHA further delineates four characteristics of organizations that use data responsibly:

A graphic illustrating the OCHA's standards for and characteristics of data responsibility represented as interlocking puzzle pieces. Those who practice machine learning for disaster relief should consider such ethical frameworks.
OCHA’s “Data Responsibility in Action” (Source: OCHA)

The UN Global Pulse’s Due Diligence Tool for Working With Prospective Technology Partners provides relief agencies with guidelines for researching, interviewing, and selecting responsible technical partners.

The second disaster 

A torrent of technology is inundating disaster relief management. As they have since the field’s inception, remote sensing and its subsequent analysis promise extraordinary returns. History has shown, however, that they also incur heavy costs. The gravity of disaster relief demands that we carefully examine our practices, partners, and assets to be ethical practitioners. Stay tuned for more on the subject in an upcoming blog.

From helping to develop educational tools for shaping green infrastructure policy, to collaborating to create technical jobs across the globe, Azavea is working to prevent the first and second disasters. Learn more about how you can team with us to produce ethical technology using our machine learning products and services Raster Foundry, Raster Vision, and GeoTrellis.