Read this section to explore how data needs to be used responsibly, the role of artificial intelligence, and the effects of data on people.
Better Data for Doing Good: Responsible Use of Big Data and Artificial Intelligence
Using Big Data and AI as a Force for Social Good
AI and big data are generating new tools and applications creating actionable insights, real-time awareness, and predictive analysis on numerous topics for sustainable development and humanitarian action. More and more compelling examples illustrate the value of this technology to improve early warning systems and inform policy and programmatic response. These individual use cases represent a small but significant innovation in learning about the world around us. Taken together, they provide new ways to detect and respond to world events, influence policy debates, and drive development, in a way that is both safe and fair (figure 3.1, table 3.1).
The following sections examine the benefits and applications of big data and AI – including for (a) speech and audio processing, (b) image recognition and geospatial analysis, and (c) text analysis. They also describe how AI is being leveraged to support the SDGs and address the emerging challenges and risks that accompany the uptake of these technologies.
Figure 3.1 The Sustainable Development Goals
Table 3.1 Examples of artificial intelligence applications for the Sustainable Development Goals
SDGs | Value of artificial intelligence | Case study | Risks and challenges |
SDG 1: No poverty |
Artificial intelligence (AI) can be used to monitor income and track policies to identify progress and successful practices. |
Combining satellite imagery and machine learning resolution daytime satellite images to obtain estimates of household consumption and assets. Using survey and satellite data from five African countries – Malawi, Nigeria, Rwanda, Tanzania, and Uganda – the study showed how a convolutional neural network can be trained to identify image features that can explain up to 75 percent of the variation in local-level economic outcomes. |
There is a risk of omitting segments of the population that cannot be captured by remote sensing signatures because of their lack of footprint or the given sociocultural context. |
SDG 2: Zero hunger |
AI can be used to maximize yields and improve agricultural practices based on multiple data sources. |
Detecting patterns in big data saves Colombian rice Agriculture mined 10 years of weather and crop data to understand how climatic variation affects rice yields. The project fed the patterns into a computer model and predicted a drought in the region of Córdoba. The center subsequently advised the Rice Producers Federation of Colombia (FEDEARROZ) against planting in the first of two annual growing seasons. This advice saved farmers from incurring significant losses. |
Overexploitation, based on local optimization, could lead to exhausted lands and lack of resources at the systemic level. |
SDG 3: Good health and well-being |
AI can be used to support diagnosis and personalized medical treatment. |
Revolutionizing personalized medicine using AI: Watson, IBM's "cognitive computing" platform, uses natural language processing to quickly sort through millions of journal articles, government listings of clinical trials, and other existing data sources to help diagnose patients and provide personalized treatment plans. University of Tokyo doctors reported that the artificial intelligence took less than 10 minutes to diagnose a 60-year-old woman's rare form of leukemia that had been incorrectly identified months earlier. |
Overpersonalized medicine could lead to abuse from the insurance industry and other stakeholders based on private personal information. |
SDG 4: Quality education |
AI can be used to tailor the delivery of education based on each student's needs and capabilities. |
Detecting dyslexia in children in Spain: Ten percent of the population has dyslexia, a neurological learning disability that affects reading and writing but does not affect general intelligence. Children with dyslexia can learn coping strategies to deal with its negative effects. Unfortunately, in most cases dyslexia is detected too late for effective intervention. Change Dyslexia is a project that uses cutting-edge, scientifically based computer games, such as Dytective Test and DytectiveU, that screen for and support dyslexia at large scale. |
There is the danger that harmful media can be easily accessed by children. For example, some YouTube Kids videos are optimized with AI and bots that create long, repetitive, and sometimes frightening videos meant to keep children entertained for as long as possible. |
SDG 5: Gender equality |
AI can help correct for gender bias in insights derived from big data and nontraditional data sources. |
Mapping indicators of female welfare at high spatial geo-located cluster data from the Demographic and Health surveys on rates of literacy, stunting, and use of modern contraception methods to produce high-resolution spatial gender-disaggregated maps, using predictive modeling techniques. The study focused on three countries in Sub-Saharan Africa (Kenya, Nigeria, and Tanzania), one country in South Asia (Bangladesh), and one country from the Western hemisphere (Haiti). |
AI applications are at risk of reinforcing existing gender biases present in the data used to train the algorithms. |
SDG 6: Clean water and sanitation |
AI can predict consumption patterns from sensor data to optimize water and sanitation provision. |
Monitoring coastal water quality in real time in system strategically deployed around Singapore's coastline. The system integrates hydrodynamic and water quality modeling into a forecasting framework that forms the backbone of a central operational management system. Eight specially outfitted buoys act as miniature labs, collecting data on pollutants, including oil and nutrients, and send live updates to the authorities on how they could spread. |
AI (or simple malware) can be used to attack or disable critical public infrastructure by means of remote warfare. |
SDG 7: Affordable and clean energy |
AI can be used to make existing infrastructure more intelligent and energy efficient. |
Preventing power supply failures in domestic Railways has trialed remote condition monitoring of the power supply systems, leveraging AI to predict possible outages. The measure is set to be rolled out on two sections of the Western and South-Western railway network. |
As noted above, critical network infrastructures may be subject to cybersecurity threats. |
SDG 8: Decent work and economic growth |
AI can be used to optimize recruitment for both employers and jobseekers. |
Optimizing online job searches: LinkedIn, a well-known business- and employment-oriented social networking service, uses AI and big data to help recruiters automate much of the candidate screening process. The tool is also integrated in different applicant tracking systems and, for example, automatically synchronizes with the different open jobs, ranking candidates against them. |
If algorithms learn hiring practices from biased data that prefer, for example, Caucasian names over others, they can make biased hiring decisions. |
SDG 9: Industry, innovation and infrastructure |
AI can be used to automate and eliminate rote or routine work, freeing up labor to focus on more creative tasks. |
Speeding up toy production in Denmark: A factory in Denmark uses autonomous robots and precision machines to make 36,000 Lego pieces per minute, or 2.16 million pieces every hour. |
AI will transform and could eliminate some jobs. McKinsey estimates that some 60 percent of all jobs will see a third of their activities automated. |
SDG 10: Reduced inequalities |
AI can support translation of less-known languages to ensure all voices are accounted for in decision-making processes. |
Accelerating development in Uganda with speech in South Africa used machine learning to develop speech-to-text technology to filter the content of public radio broadcasts for less-known languages spoken in Uganda. Once converted into text, the information can reveal sentiment around topics relevant for sustainable development. |
Advances in robotics and AI could increase inequality within societies, further entrenching the divide between rich and poor. |
SDG 11: Sustainable cities and communities |
AI can measure traffic in real time, monitor commuting statistics, or improve transportation services. |
Inferring commuting statistics in Indonesia population at more than 30 million. In response to the needs of the authorities, UN Global Pulse – Pulse Lab Jakarta initiated a project to test whether location information from social media on mobile devices could reveal commuting patterns in the area. The results of the research confirmed that geo-located tweets have the potential to fill current information gaps in official commuting statistics. |
AI may lead to cascading failures of interconnected systems in smart cities. Failures in machine learning algorithms need to be accommodated in urban emergency planning. |
SDG 12: Responsible consumption and production |
AI can improve efficiency of recycling processes, which can eliminate waste and improve yields. |
Supporting smart recycling in the United States artificial intelligence, are helping to make municipal recycling facilities run more efficiently in the United States. Through deep learning technology, robotic sorters use a vision system to see the material, AI to think and identify each item, and a robotic arm to pick up specific items. The technology could help make recycling systems more effective and profitable. |
AI can also be used to increase the scale of extractive or manufacturing industries, creating a larger environmental footprint over time. |
SDG 13: Climate action |
AI and climate science can help researchers identify previously unknown atmospheric processes and rank climate models. |
Predicting road flooding for climate mitigation in from the Georgia Institute of Technology developed a framework to improve the resilience of road networks in Senegal to flooding, including recommendations on how to prioritize road improvements given a limited budget. The results showed how roads are being used, how they are damaged, and how policy makers can allocate budget in the most efficient way to repair them. |
Heavy computation required to power AI may lead to increased energy costs. |
SDG 14: Life below water |
AI can help detect, track, and predict the movement patterns of vessels engaged in illegal fishing. |
Supporting sustainable legal fishing in Indonesia: Indonesia and Global Fishing Watch – a partnership between Google, Oceana, and SkyTruth – are cooperating to deliver a vessel monitoring system for all Indonesian-flagged fishing vessels and generate data that is publicly available. The project aims to promote transparency in the fishing industry. |
The data collected might be incomplete, as some vessels may be undetectable when switching off their transmitters. |
SDG 15: Life on land |
AI can be used to map and protect wildlife on land using computer vision systems. |
Identifying, counting, and describing wild animals 225 camera traps, across 1,125 square kilometers, in Serengeti National Park to evaluate spatial and temporal dynamics. The cameras accumulated some 99,241 camera-trap days, producing 1.2 million pictures between 2010 and 2013. Members of the general public classified these images via a citizen-science website. The project then applied an algorithm to aggregate the classifications to investigate multi-species dynamics in the local ecosystem. |
Monitoring technologies can be used by poachers just as easily as conservationists. |
SDG 16: Peace, justice and strong institutions |
AI can reduce discrimination and corruption and drive broad access to e-government. |
Turning information into knowledge and action in education, justice, health care, banking, taxes, policing, and so on – have been digitally linked across one platform, "wiring up" the nation. Estonia is also exploring ways to leverage AI to improve e-government and other public services. |
Citizen monitoring could be misused to repress political practices (such as voting, demonstrations). |
SDG 17: Partnerships for the Goals |
AI should be a public good. |
Leveraging partnerships to improve AI for global ethical, and beneficial development of AI. The Partnership on AI represents a collection of companies and nonprofits that have committed to sharing best practices and communicating openly about the benefits and risks of AI research. Another example is the annual "AI for Good Global Summit" organized by the International Telecommunication Union, the UN's specialized agency for information and communication technologies. |
Collaboration must also result in action. |
Speech and audio processing
Arguably, one major achievement of big data and AI has been to facilitate real-time translation of a growing number of the world's languages. Although language translation is not an SDG per se, greater language and cultural understanding could help increase the efficiency and effectiveness of development efforts across all SDGs – for example, by helping to map public opinion (see box 3.3). Google and Microsoft systems, for example, are now able to translate over a hundred languages. Also, new systems have been developed that perform real-time translations – such as a Skype system that can translate voice calls into 10 different languages in real time.
Early models of machine translation used statistical methods that translated words based on a short sequence, that is, within the context of several words before and after the target word, which did not always work for long and complex sentences. New neural network architectures, such as long short-term memory, have drastically improved efficiency. Such systems can now learn from millions of examples and are able to translate whole sentences at a time, rather than word by word.
Box 3.3 Using machine learning to analyze radio broadcasts in Uganda
Radio remains a primary source of information for communities in many parts of the world, particularly in remote rural areas where coverage and access to other forms of connectivity are limited. Radio is also an accessible medium for the millions who remain illiterate.
In Uganda, where a majority of the population lives in rural areas, radio is a vibrant platform for community discussion, information sharing, and news broadcasting. Radio talk shows and dial-in discussions are popular forums for voicing local needs, concerns, and opinions.
UN Global Pulse collaborated with Stellenbosch University in South Africa to develop speech-recognition technology to automatically convert these radio broadcasts into text for several of the languages spoken in Uganda, including English, Luganda, Acholi, Lugbara, and Rutooro. "Radio mining" consisted of two automated software stages and two human analysis stages. This semi-automated approach allowed a relatively small team of analysts to process many audio recordings quickly and affordably.
Several projects were piloted with UN partners to understand the value of talk radio in providing information on topics relevant to the Sustainable Development Goals, such as health care service delivery, response to disease outbreaks, and the efficiency of public awareness-raising radio campaigns, among others.
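Once broadcasts have been converted into text, one of the automated stages could be as simple as flagging transcript segments that mention development-related topics so that analysts only review relevant material. The following is a minimal sketch of that idea, not the pipeline used in the project: the keyword lists, segment identifiers, and function names are all hypothetical, and a real system would need vocabularies curated for each language.

```python
import re
from collections import defaultdict

# Hypothetical topic vocabularies (assumption, not from the project).
TOPIC_KEYWORDS = {
    "health": {"clinic", "hospital", "malaria", "nurse", "medicine"},
    "disaster": {"flood", "drought", "landslide"},
}


def tag_transcript(text, topic_keywords=TOPIC_KEYWORDS):
    """Return the set of topics whose keywords appear in one transcript."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {topic for topic, kws in topic_keywords.items() if words & kws}


def filter_transcripts(transcripts, topic_keywords=TOPIC_KEYWORDS):
    """Group segment ids by topic; segments matching no topic are dropped."""
    by_topic = defaultdict(list)
    for seg_id, text in transcripts.items():
        for topic in sorted(tag_transcript(text, topic_keywords)):
            by_topic[topic].append(seg_id)
    return dict(by_topic)


# Made-up transcript segments for illustration only.
transcripts = {
    "seg-001": "The clinic in our village has had no medicine for two weeks.",
    "seg-002": "Today we play your favourite songs from the nineties.",
    "seg-003": "After the flood, cases of malaria went up near the river.",
}
flagged = filter_transcripts(transcripts)
print(flagged)
```

A keyword filter like this is deliberately crude; it only narrows the volume of text that the human analysis stages have to read.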
Computer vision, image analysis, and geospatial data
Accurate population information is critical for authorities to plan and deliver quality public services and coordinate crisis-relief efforts. However, collecting such data has long been a challenge for development practitioners and policy makers. For example, gathering national household survey data on poverty is typically time-consuming and expensive, requiring elaborate data collection and analysis techniques. This exercise is particularly challenging in fragile states, where limited capacity and security concerns typically hinder data collection and processing. In this setting, for example, satellite imagery has been used to gain an overview of population density and assess poverty and access to energy – covered by SDG 1 and SDG 7 (see boxes 3.4 and 3.5).
In the health sector – covered by SDG 3 – current advances in medical imaging and computer analysis of tumors can complement and refine radiologists' analysis. Mobile phone call records have also been combined with satellite data to build dynamic population maps and estimate cross-border flows of migrants to enable development actors to track the spread of disease. This technique was leveraged in southern Africa to map the movements of cross-border communities to better understand malaria infection patterns.
In the environmental field – SDGs 12, 13, 14, and 15 – AI-assisted analysis of satellite imagery can be used to monitor damage to coastal areas due to floods or typhoons, or drought-affected areas, or the retreat of wetlands or encroaching land use in deltas or river basins. Combined with meteorological models and large data sets on changes in ocean temperature and currents, such mapping can help improve forecasting and early warning systems of future major weather events. Moreover, GPS data has been used to analyze traffic patterns to reduce pollution (see box 3.6).
Another AI application getting considerable attention is automated or self-driving cars – a potential solution for optimizing transportation in ways that can minimize car accidents. Debate is ongoing about what a fully automated car really is, but considerable progress has been made toward solving problems of visual recognition, object identification, and reaction processing, which are critical to this endeavor. Building on humble beginnings and minor innovations (including cruise control, assisted steering, lane assist, automatic braking, and "Traffic Jam Assist"), the race toward a fully automated car is now underway (box 3.7).
Box 3.4 Estimating population counts and poverty in Afghanistan and Sudan
In Afghanistan, the United Nations Population Fund and the UN Country Team collaborated with Flowminder, an organization that collects, aggregates, and analyzes anonymous mobile, satellite, and household survey data to generate population maps. The project used survey data, geographic information systems, and satellite imagery data to estimate populations in areas with no such data.
In Sudan, the United Nations Development Programme used satellite data to estimate poverty by studying changing nighttime energy consumption. The team used data pulled from nighttime satellite imagery, analyzing illumination values over two years, in conjunction with electric power consumption data from the national electricity authority. The study was also informed by desk research, including similar World Bank work in Kenya and Rwanda. Electricity consumption was used as a proxy indicator for income, as poorer households were assumed to be lower energy consumers. The exercise demonstrated how satellite imagery can help measure poverty.
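The proxy logic in the Sudan example can be illustrated with a small calibration exercise: where both night-light values and electricity consumption are known, fit a line relating the two, then use it to estimate consumption where only imagery is available. The sketch below is an assumption about how such a calibration might look, not the method used by the study; the district figures are invented and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented calibration data: districts where both measures are known.
luminosity = [2.0, 5.0, 9.0, 14.0, 20.0]        # mean night-light value
consumption = [10.0, 24.0, 47.0, 68.0, 101.0]   # kWh per household per month

slope, intercept = fit_line(luminosity, consumption)


def estimate_consumption(light_value):
    """Estimate consumption for an area with imagery but no utility data."""
    return slope * light_value + intercept


unsurveyed_estimate = estimate_consumption(7.0)
print(round(unsurveyed_estimate, 1))
```

A single-variable linear fit is the simplest possible choice here. Any real analysis would need to check how well light output tracks consumption, and consumption tracks income, in the setting being studied.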
Box 3.5 Mapping energy access in India
Satellite night-light data has also been leveraged in India. A team from the University of Michigan, the U.S. National Oceanic and Atmospheric Administration, and the World Bank Group's Energy and Extractives Global Practice analyzed the daily light signatures of more than 600,000 villages from 1993 to 2013 (see map B3.5.1).
Electrification trends were visualized on NightLights.io, an open-source platform for processing big data in a scalable and systematic way. The platform features an application programming interface that enables technical partners to query light output, and its interactive maps allow users to explore light output trends. Through the project, the research team gained a high-level overview of rural electrification, compared villages and plotted trends, and shared data, which can help inform government policy.
Map B3.5.1 Night lights in India
Box 3.6 Cleaning Mexico City's air with big data and climate policy
Mexico City's congestion, among the world's worst, worsens local air quality. According to 2016 data, city dwellers are exposed to twice the level of ozone and fine particulate matter (PM2.5) recommended by national standards, resulting in some 10,850 annual deaths. A team of researchers from the University of California, Berkeley, and the Instituto Nacional de Ecología y Cambio Climático in Mexico used data from Waze, a GPS navigation application, to evaluate various transport electrification options based on their ability to reduce urban air pollution and emissions – including (a) the electrification of the entire city taxi fleet, (b) the electrification of public transit buses, and (c) the electrification of all light-duty vehicles.
The team first measured the number, location, and duration of traffic jams throughout the city, estimating related emissions using the MOVES-Mexico model. The team then used data from Google's "popular times" function to map urban population movement.
Using this information, the team was able to identify the best policy options and optimal locations for electric vehicle charging stations.
Box 3.7 Self-driving cars
Human error causes about 90 percent of all car accidents. Artificial intelligence (AI) and autonomous driving might therefore help reduce accidents and save lives. Self-driving cars have to identify, assess, evaluate, and respond to fast-changing circumstances, and predict likely events in real time. A fully automated car has to master vehicle dynamics, control systems, and sensor optimization. For example, detecting pedestrians from images or video is a very specific image-classification problem.
Driverless cars require robust data capacity for image processing and recognition. Navigation and mapping data is also essential, with GPS coordinates used extensively. Mercedes, BMW, and Audi purchased the mapping business Here from Nokia for US$2 billion; Here combines "static" mapping data taken from cars with 3D cameras with live information supplied by a network of connected devices, including cars (Bell 2015). In January 2016, Volkswagen partnered with Mobileye, a technology company that develops vision-based advanced driver-assistance systems, to produce its real-time image-processing cameras and mapping service for driverless cars. Ford became the first manufacturer to road test a fully autonomous car in snow on public roads in March 2016 after working with researchers from the University of Michigan to create an algorithm recognizing snow and rain (Ford 2016). Ford has already tested autonomous Fusion cars on public roads in the U.S. states of Arizona, California, and Michigan.
Despite these groundbreaking developments, the move toward autonomous driving is not without its problems. Many worry that a car-centric vision detracts from more sustainable solutions related to public transportation and urban design (covered by Sustainable Development Goal 11). Driverless vehicles are also likely to wipe out millions of jobs, including taxi drivers, couriers, and truck drivers, something new policies must address urgently. Moreover, legal frameworks will need to keep pace and be redesigned. Although a few countries are moving to issue new legal frameworks for autonomous driving, significant legal gaps remain.
Text mining and text analysis
Also known as text mining, text analytics is the science of turning unstructured text into structured data. Text analytics is focused on extracting key pieces of information from conversations. By understanding the language, the context, and how language is used in everyday conversations, text analytics uncovers the "who" of the conversation, the "what" or the "buzz" of the conversation, "how" people are feeling, and "why" the conversation is happening. Conversations are categorized and discussion topics identified.
The technology is being leveraged, among other things, to support agricultural development and build food security – covered by SDG 2. Kudu, a mobile auction market application, is using text analysis algorithms to match farmers looking to sell their produce with suitable market traders. The system allows any farmer or trader to send a message by phone. Once matched, compatible buyers and sellers are notified. Kudu not only limits unnecessary travel and dependency on intermediaries, but encourages competition by overcoming critical information gaps. The application was developed by the AI Research Group, which specializes in the application of AI to problems in the developing world and operates out of the College of Computing and Information Sciences at Makerere University in Kampala, Uganda.
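To make the idea of matching buyers and sellers from text messages concrete, the sketch below parses short messages into structured offers and pairs each seller with buyers of the same crop whose bid meets the asking price. It is an illustration under stated assumptions only: the message format, the matching rule, and every name in the code are hypothetical and do not describe how Kudu itself works.

```python
import re
from dataclasses import dataclass

# Hypothetical format: "<SELL|BUY> <crop> <quantity>kg <price per kg>"
MESSAGE = re.compile(r"^(SELL|BUY)\s+(\w+)\s+(\d+)\s*kg\s+(\d+)$", re.IGNORECASE)


@dataclass
class Offer:
    sender: str
    side: str
    crop: str
    quantity_kg: int
    price_per_kg: int


def parse_message(sender, text):
    """Turn one free-text message into an Offer, or None if unparseable."""
    m = MESSAGE.match(text.strip())
    if m is None:
        return None
    side, crop, qty, price = m.groups()
    return Offer(sender, side.upper(), crop.lower(), int(qty), int(price))


def match_offers(offers):
    """Pair sellers with buyers of the same crop whose bid meets the ask."""
    sells = [o for o in offers if o.side == "SELL"]
    buys = [o for o in offers if o.side == "BUY"]
    return [
        (s.sender, b.sender, s.crop)
        for s in sells
        for b in buys
        if b.crop == s.crop and b.price_per_kg >= s.price_per_kg
    ]


# Made-up inbox for illustration only.
inbox = [
    ("farmer-1", "SELL maize 200kg 900"),
    ("trader-1", "buy Maize 150 kg 950"),
    ("trader-2", "BUY beans 50kg 2000"),
    ("farmer-2", "hello"),
]
offers = []
for sender, text in inbox:
    offer = parse_message(sender, text)
    if offer is not None:
        offers.append(offer)

matches = match_offers(offers)
print(matches)
```

Real messages are far less regular than this fixed pattern, which is why text analysis rather than a rigid parser is needed in practice.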
Analysis of text from Twitter feeds has also been used to track food prices in real time in Indonesia. UN Global Pulse worked with the Ministry of National Development Planning and the World Food Programme to "nowcast" food prices based on Twitter data. The outcome was a statistical model of daily price indicators for four commodities: beef, chicken, onion, and chili. When the modeled prices were compared with official food prices, the forecast and actual prices were closely correlated, demonstrating that near real-time social media signals can serve as a proxy for daily food prices.
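The comparison described above can be sketched in two steps: pull quoted prices out of message text to build a daily indicator, then measure how closely that indicator tracks official prices. The example below is a simplified illustration, not the statistical model built in the project; the messages, official figures, regular expression, and function names are all invented for the purpose.

```python
import re
from collections import defaultdict
from math import sqrt
from statistics import median

# Hypothetical pattern: a commodity name followed closely by a 4-7 digit price.
PRICE_MENTION = re.compile(
    r"\b(beef|chicken|onion|chili)\b\D{0,30}?(\d{4,7})\b", re.IGNORECASE
)


def daily_price_index(messages):
    """Median quoted price for each (day, commodity) pair."""
    quotes = defaultdict(list)
    for day, text in messages:
        for commodity, price in PRICE_MENTION.findall(text):
            quotes[(day, commodity.lower())].append(int(price))
    return {key: median(values) for key, values in quotes.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented messages and official prices, for illustration only.
messages = [
    ("day-1", "Beef is 98000 per kg today"),
    ("day-1", "beef at 102000 in the market"),
    ("day-1", "I love chicken soup"),
    ("day-2", "Beef now 105000!"),
    ("day-3", "beef 110000 per kg, too expensive"),
    ("day-3", "Beef only 108000 here"),
]
official_beef = {"day-1": 99000, "day-2": 104000, "day-3": 110000}

index = daily_price_index(messages)
days = sorted(official_beef)
modeled = [index[(d, "beef")] for d in days]
actual = [official_beef[d] for d in days]
r = pearson(modeled, actual)
print(f"correlation between modeled and official beef prices: {r:.3f}")
```

Taking the median of each day's quotes is one simple way to dampen the effect of outliers and mistyped prices; other aggregation choices are equally plausible.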
Similar techniques are being used to analyze a host of other development issues. For example, the ability to monitor public sentiment toward policy measures in real time, via social media, can provide critical information on the impact of policy and how it is playing out in practice, especially for vulnerable groups or households (box 3.8). Data from social media can also help estimate the number of expats around the world (box 3.9).
As mentioned earlier, conducting household surveys is often expensive. New approaches such as monitoring social media could help address data gaps in developing economies. Moreover, these approaches may capture marginalized or migrating communities not always accounted for by traditional means such as national censuses.