Better Data for Doing Good: Responsible Use of Big Data and Artificial Intelligence

From Design to Responsible Use: Ethical Challenges with Using Big Data and AI

Although we are only scratching the surface of what is possible in the new age of big data and AI, and how they can be leveraged for social good, we also need to grapple with both the unintended risks and malicious use of the same technology. These benefits and looming risks were aptly articulated by the UN Secretary-General at the 2017 "AI for Good Global Summit":

We face a new frontier, with advances moving at warp speed. Artificial intelligence can help analyze enormous volumes of data, which in turn can improve predictions, prevent crimes and help governments better serve people. But there are also serious challenges, and ethical issues at stake. There are real concerns about cyber security, human rights and privacy. . . The implications for development are enormous. Developing countries can gain from the benefits of AI, but they also face the highest risk of being left behind.

Algorithm-based systems, powered by big data and AI, increasingly both learn from and autonomously interact with their environments, as well as with one another. This tends to generate behavioral patterns that cannot always be predicted or explained. Where this evolution in AI will ultimately take us is not yet clear. Some point to autonomous weapons, or viruses targeting individuals with a particular defective DNA trait, as one frightening scenario. And rising concerns about the malicious use of AI, for instance for profiling, merit a stronger ethical governance and regulatory framework that covers how related methods are developed and deployed. The risk of unintended consequences of AI should be accounted for at each stage of innovation, beginning with design.

Box 3.8 Monitoring public sentiment about policy reforms using social media in El Salvador

In April 2011, the government of El Salvador removed a countrywide subsidy on liquid petroleum gas, the most common domestic cooking fuel. Instead of subsidizing prices at the point of sale, eligible households were given an income transfer. The reform triggered considerable public debate and controversy. UN Global Pulse and the World Bank teamed up to investigate whether social media signals from Twitter could be used to understand public perceptions and social dynamics surrounding the fuel subsidy reform, specifically reactions and concerns about political partisanship, the level of information reaching communities about the reform, and trust in the government's commitment to deliver the subsidy. A taxonomy of keywords was developed to filter Twitter for relevant content, and regional experts were consulted to ensure slang words and synonyms were included in the taxonomy. Tweets were then filtered to assess relevance and to isolate content originating from El Salvador. The study suggests that social media analysis, using big data and AI, can help inform policy implementation, as the sentiment observed was similar to public opinion measured by household surveys.

Source: Adapted from UN Global Pulse 2015.

Box 3.9 Shedding light on migration patterns using social media information

Data from social media can be used to help estimate migrant populations. For example, studies based on Facebook data yield estimates of approximately 214 million "expats" in the world (people stating that they live in a country other than their self-reported "home country"), close to the 2017 estimated total of 258 million international migrants globally. Among the issues surrounding the use of social media data to estimate migrant populations are the difficulty of defining who an international migrant is, selection bias, and the reliability of self-reported information. But scholars are working on reducing selection bias via model fitting, and results are promising.

Source: Adapted from Rango and Vespe 2017.
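The keyword-taxonomy filtering step described in box 3.8 can be sketched in a few lines of code. This is an illustrative sketch only: the taxonomy topics, keywords, and tweets below are invented for the example, and the actual study used a far richer taxonomy, relevance classification, and sentiment analysis.

```python
# Hypothetical sketch of keyword-taxonomy filtering as in box 3.8:
# tweets are matched against topic keywords (which would include slang
# and synonyms supplied by regional experts) and kept only if they
# originate from the country of interest. All data here is illustrative.

TAXONOMY = {
    "subsidy_reform": {"subsidio", "glp", "transferencia"},
    "trust_in_government": {"gobierno", "promesa", "corrupcion"},
}

def classify_tweet(text, country_code, target_country="SV"):
    """Return the taxonomy topics a tweet matches, or an empty set
    if it does not originate from the target country."""
    if country_code != target_country:
        return set()
    words = set(text.lower().split())
    return {topic for topic, keywords in TAXONOMY.items() if words & keywords}

tweets = [
    ("el subsidio no llega a mi casa", "SV"),
    ("great weather today", "US"),
    ("no confio en el gobierno", "SV"),
]
matched = [classify_tweet(text, country) for text, country in tweets]
# → [{"subsidy_reform"}, set(), {"trust_in_government"}]
```

In practice, the matched tweets would then be scored for sentiment and aggregated over time to compare against household survey results.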

Technologies and algorithms by themselves have no intrinsic morality; technology can be used for good or ill depending on how it is employed. Looking at existing technologies, ethical considerations need to address questions such as what life-and-death decisions self-driving cars should make. Although privacy norms have long been established to protect personal data from misuse and ensure individual privacy in the digital world, ethics has become an additional tool in AI applications, used to protect fundamental human rights and to help make decisions in areas where the law has no clear-cut answers. The UN Special Rapporteur on the right to privacy recommends that formal consultation mechanisms be instituted, "including ethics committees, with professional, community and other organizations and citizens to protect against the erosion of rights and identify sound practices" (Cannataci 2017). A recent example in which ethics and the moral obligations of data handling were included in an official UN document is the "Guidance Note on Big Data for the achievement of the 2030 Agenda" adopted by the UN Development Group (UNDG 2017). The note, the first official UN document on big data and privacy, stresses the importance of ensuring that data ethics is included as part of standard operating procedures for data governance (box 3.10).

Box 3.10 Data privacy, ethics, and protection: A guidance note on big data for achievement of the 2030 Agenda

1. LAWFUL, LEGITIMATE AND FAIR USE Data should be obtained, collected, analysed or otherwise used through lawful, legitimate and fair means, taking into account the interests of those individuals whose data is being used.

2. PURPOSE SPECIFICATION, USE LIMITATION AND PURPOSE COMPATIBILITY Any data use must be compatible or otherwise relevant, and not excessive in relation to the purposes for which it was obtained.

3. RISK MITIGATION AND RISKS, HARMS AND BENEFITS ASSESSMENT A risks, harms and benefits assessment that accounts for data protection and data privacy as well as ethics of data use should be conducted before a new or substantially changed use of data (including its purpose) is undertaken.

4. SENSITIVE DATA AND SENSITIVE CONTEXTS Stricter standards of data protection should be employed while obtaining, accessing, collecting, analysing or otherwise using data on vulnerable populations and persons at risk, children and young people or any other data used in sensitive contexts.

5. DATA SECURITY Robust technical and organizational safeguards and procedures should be implemented to ensure data management throughout the data lifecycle and prevent any unauthorized use, disclosure or breach of personal data.

6. DATA RETENTION AND DATA MINIMIZATION Data access, analysis or other use should be kept to the minimum amount necessary to fulfill the purpose of data use.

7. DATA QUALITY All data-related activities should be designed, carried out, reported and documented with an adequate level of quality and transparency.

8. OPEN DATA, TRANSPARENCY AND ACCOUNTABILITY Appropriate governance and accountability mechanisms should be established to monitor compliance with relevant law, including privacy laws and the highest standards of confidentiality, moral and ethical conduct with regard to data use.

9. DUE DILIGENCE FOR THIRD PARTY COLLABORATORS Third party collaborators engaging in data use should act in compliance with relevant laws, including privacy laws as well as the highest standards of confidentiality and moral and ethical conduct.


Data ethics should be treated holistically, using a consistent and inclusive framework that considers a diverse set of outcomes, rather than an ad hoc approach that accounts for only limited applications. Mechanisms for embedding data ethics include codified data ethics principles or codes of conduct, ethical impact assessments, ethics training for researchers, and ethical review boards.

Privacy impact assessments, in general, allow developers and organizations to effectively assess the risks posed to privacy by big data and AI, thereby ensuring compliance with privacy requirements, identifying mitigation measures, and effectively classifying the impacts of data and algorithm use. Including issues of ethics and human rights in any impact assessment, including a privacy impact assessment, could prove more effective than developing a separate analysis or ethical review framework.

For example, UN Global Pulse builds ethical considerations into its data practices by conducting a "risks, harms, and benefits assessment," which helps identify anticipated or actual ethical and human rights issues that may arise during a data innovation project. The assessment weighs the proportionality of potential benefits against the risks of harm from data use, as well as the risk of harm from the data not being used. If the risks outweigh the benefits, the project does not proceed. In its "Guide to Personal Data Protection and Privacy," the World Food Programme also builds ethics into its procedures through the application of humanitarian principles and risk assessments. Although ethics may not have clear-cut rules, when assessing the risk of harm along with the benefits, "any potential risks and harms should not be excessive in relation to the [likely] positive impacts of data use".
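The proportionality logic of such an assessment, including the point that harm prevented by using the data counts on the benefit side, can be sketched as follows. This is a hypothetical simplification: the real assessment is a qualitative, documented review, not a numeric formula, and the scores and threshold below are invented for illustration.

```python
# Hypothetical sketch of the proportionality test in a risks, harms,
# and benefits assessment. Scores (0-10 scale) and the decision rule
# are illustrative inventions, not UN Global Pulse's actual method.

def assess_project(benefits, risks_of_use, risk_of_nonuse):
    """Proceed only if expected benefits, including the harm that would
    be prevented by using the data, outweigh the risks of harm from use."""
    total_benefit = benefits + risk_of_nonuse  # harm avoided counts as benefit
    return "proceed" if total_benefit > risks_of_use else "do not proceed"

# A project with modest benefits but high risk of harm is rejected:
decision = assess_project(benefits=2, risks_of_use=8, risk_of_nonuse=1)
# → "do not proceed"
```

The key design point is the third argument: omitting the risk of non-use would bias the assessment against any data use at all, which the text explicitly warns against.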

Incorporating privacy by design is also crucial for innovative applications that operate with limited human supervision. The rapidly evolving nature of AI algorithms can give rise to algorithmic bias and unverified results. Similar to privacy by design is the concept of AI ethics by design, which suggests seven principles, including recommendations to proactively identify security risks by using tools such as the privacy impact assessment to minimize potential harm. In addition, ensuring oversight of the entire data innovation process, from design to use, is vital to securing true incorporation of ethics into AI systems.

Moreover, accountability and transparency are critical ethical principles that must accompany any AI innovation project. "[T]ransparency builds trust in the system, by providing a simple way for the user to understand what the system is doing and why". To maintain transparency, the Institute of Electrical and Electronics Engineers recommends developing new standards that describe measurable, testable levels of transparency, so that systems can be objectively assessed and their level of compliance determined. Although keeping algorithms transparent is increasingly difficult because of the heavily interlinked and layered processes of algorithmic programming, the AI ethics by design approach suggests that ensuring the transparency and accountability of algorithms is essential to determining the intended outputs and preventing algorithmic bias.

The overall data ethics program may also include recurring data ethics reviews at every critical juncture, conducted by review boards. A similar approach already exists in research institutions, where it usually takes the form of institutional review boards. For example, in its published procedures for ethical standards in data collection, the United Nations Children's Fund (UNICEF) relies on mechanisms such as internal and external review boards, as well as basic ethics training for researchers. Any UNICEF project involving surveys, focus groups, case studies, physical procedures, games, or diet and nutritional studies is subject to ethical review.

A stakeholder-inclusive approach that features "the proactive inclusion of users" is also desirable: "their interaction will increase trust and overall reliability of these systems". "[T]he context of data use" should also always be considered, requiring human intervention and, at times, context-specific expertise, such as the presence of a humanitarian expert during a humanitarian response or of a transportation planning expert in a project that looks at transportation policy.

Finally, ethical approaches to AI should be human-rights-centric, incorporating substantive, procedural, and remedial rights. Just as misuse of AI may lead to harm, nonuse of AI may allow preventable harm to occur. Decisions to use or not to use applications of AI can therefore infringe on fundamental rights. As the UN Special Rapporteur on the right to privacy suggested in his recent report to the UN General Assembly, "commitment to one right should not detract from the importance and protection of another right. Taking rights in conjunction wherever possible is healthier than taking rights in opposition to each other". But undoubtedly, incorporating ethics into every stage of the design and implementation of AI projects can mitigate harm and maximize the positive impact of rapidly developing new technologies, ensuring they are used for social benefit.