Read this section to explore how data needs to be used responsibly, the role of artificial intelligence, and the effects of data on people.
Better Data for Doing Good: Responsible Use of Big Data and Artificial Intelligence
The Big Data Revolution
As chapter 1 notes, the concept of big data typically describes data sets so large, or so complex, that traditional data processing techniques often prove inadequate. The term "big data" thus captures not only the large volumes of data now available, but also the accompanying processes and technologies for collecting, storing, and analyzing it. In other words, "big data" is not just about data – "no matter how big or different it is considered to be" – it is primarily about "the analytics, the tools and methods that are used to yield insights," including the frameworks, standards, and stakeholders involved in the field and ultimately the knowledge generated.
Although businesses increasingly are mining the digital trails we leave behind to predict consumer behavior, track emerging trends in the market, and monitor operations in real time to improve sales and profit margins, big data analytics also holds enormous potential to help understand and address pressing socioeconomic and environmental issues. Big data can help inform policy and interventions that set us on a more sustainable development path and improve responses to humanitarian emergencies.
Innovation labs across academia, government, the international development community, civil society, and the private sector have been using big data and AI to develop a wide range of applications, from mapping discrimination against refugees in Europe to facilitating the rescue of migrants at sea based on shipping data, detecting fires in the Indonesian rainforest, using Twitter data to predict food insecurity caused by changing food prices, or fighting the effects of climate change. Box 3.1 describes how big data is also being used to predict and respond to disease outbreaks.
Box 3.1 Using big data to predict dengue fever outbreaks in Pakistan
Dengue fever is the most rapidly spreading mosquito-borne viral disease in the world. It is endemic in Pakistan, where human mobility and hospitable conditions for mosquitoes have helped it spread. Those infected typically suffer from severe illness, and mortality rates are high.
A partnership involving Telenor Research, the Harvard T.H. Chan School of Public Health, Oxford University, the U.S. Centers for Disease Control and Prevention, and the University of Peshawar used big data to anticipate and track the spread of dengue in Pakistan. The partnership leveraged anonymized call data records from 40 million Telenor Pakistan mobile subscribers during the 2013 outbreak to map the geographic spread and the epidemiological timeline of the disease. The analysis combined transmission suitability maps with estimates of seasonal dengue virus importation to generate detailed and dynamic risk maps, helping to inform national containment and epidemic preparedness in Pakistan and beyond.
More broadly, the project illustrates the potential of mobile data to reveal mobility patterns that can help accurately predict the spread of disease. The insights it generated helped predict the spread days or even weeks earlier than traditional methods could.