Discussion and recommendations

The recent requirements for data sharing by an increasing number of funders, publishers and regulatory agencies risk exacerbating existing inequities between researchers in high-resource and low-resource settings, and data reuse is unlikely to produce the expected public health benefits unless critical challenges are addressed. The case studies in this paper provide concrete examples of real challenges and some potential solutions related to equitable data sharing. We recommend the following ways forward:


1. Capacity building

Planning and collecting good quality data requires significant investment in terms of expertise, experience, skills, time and effort on the part of primary data collectors. In light of this, specific funding should be allocated for capacity building programmes to improve data management as well as data reuse capacity among researchers in low-resource settings. Requiring collaboration between researchers in high- and low-resource settings as a condition for sharing may strengthen such capacity building efforts. Collaborations with the primary researchers are especially important where interpretation of the data requires in-depth understanding of the population the data are drawn from and the context in which the data were collected and curated. Initiatives including those led by WWARN (Case study 3) demonstrate that equitable sharing can be achieved, following considerable investment in human resources, technology and infrastructure for the curation and sustainable sharing of research outputs. Efforts to develop data management and data sharing courses that will be made freely available are underway.


2. Investments

Funding for data management and sharing platforms supporting poverty-related disease research communities should be increased. Research grants for primary studies should include designated funding to cover the costs and time spent on activities specific to data sharing, such as the additional curation needed, data storage, staff time, hardware and software. Investments are also needed in the management of platforms supporting complex data integration and analyses. Without these investments, the recent requirements for data sharing by an increasing number of funders, publishers and regulatory agencies risk exacerbating inequities between researchers in well-resourced and resource-limited settings, and data reuse is unlikely to produce the expected public health benefits.


3. Data sharing policies

Although data sharing has been widely promoted and researchers have increased their data sharing activities, very few research groups and institutions have formal data sharing policies. Institutional data sharing policies are important for several reasons: they give members of the institution a shared understanding of their own data sharing processes, and they safeguard the interests of the institution's researchers as well as those of their data subjects. The data sharing policy should provide guidelines on how secondary users can request data and should identify priority secondary analyses, such as those that are consistent with institutional aims. It should also specify when special conditions of access should be put in place, such as requirements for collaboration on secondary analyses. In addition, an institution may set embargo periods and preferential access provisions (e.g. for collaborators and LMIC researchers, and for secondary analyses that directly benefit the communities that generated the primary data). These policies should take into account their context, the type of data and database, and relevant existing regulations and policies (e.g. funders'). For example, in the case of the Maternal & Newborn Health Registry (Case Study 1), the policy may state that the data underlying a published study will be shared, and not the entire registry.


4. Incentives and attributions

In order to avoid disincentivising primary research, appropriate recognition and credit should be provided to primary researchers and their teams. In light of current developments in data sharing, mainstream international guidelines on authorship criteria should be revisited. The current International Committee of Medical Journal Editors criteria may not be adequate to account for the different levels and types of contributions of the primary researchers in secondary analyses. The discussions and decisions around authorship should involve both primary and secondary researchers, including those in low-resource settings. Creative solutions such as the "CRediT taxonomy" system and "data authorship" have been suggested, but these have not been widely accepted. While the CRediT taxonomy system specifies the roles of authors, it does not provide guidance on when an individual qualifies to be an author. Data authorship does not yet carry the same academic kudos as manuscript authorship.


5. Consent and community engagement

For new studies, researchers should ensure that participants have given consent for sharing their data with researchers external to the primary research team. 'Broad consent' for unspecified future use is currently the most widely accepted mechanism to obtain participant consent for sharing data beyond the primary research teams. Research staff who are tasked with obtaining broad consent must be appropriately trained. For multicentre studies, it is necessary to engage with collaborators to ensure that clinical study agreements include provisions for data sharing and for obtaining appropriate consent.

For primary studies, what is appropriate information and what constitutes adequate understanding on the part of potential research participants remain enduring ethical questions. Studies have shown that communications about data sharing add another layer of complexity to the informed consent process. Community and public engagement may help to improve general understanding of data sharing among research communities. Such engagement is also important to discern what constitutes sensitive data, what secondary uses might cause harm or stigma to communities, and what limitations should be placed on sharing with external parties. A combination of conventional engagement approaches, such as holding public talks and consultation with community advisory boards, and creative initiatives, such as arts-science collaborations and café-style talks, may be necessary to refine both the core information about data sharing to be provided to all research participants and appropriate solutions for the context-specific challenges that arise when explaining data sharing.