#### **2. Rare disease data sharing initiatives**

This section discusses the importance of data sharing for rare disease across the research lifecycle, from diagnosis and basic research, to clinical trial transparency, to health technology assessment.

#### **2.1 Diagnosis and drug discovery**

A number of national and international initiatives have emerged to demonstrate the potential of data to improve rare disease patient care and to accelerate research. All of these initiatives seek to adopt data-intensive approaches to accelerate rare disease diagnosis. They adopt learning health system strategies, which involve collecting rich data as part of routine clinical care and making these data available for research to improve diagnostics and therapies. Finally, the initiatives all recognize the importance of international data sharing in the rare disease context.

The Genomics England 100,000 Genomes Project has "committed to sequencing 100,000 whole human genomes, from 70,000 patients, by the end of 2018" [11], with a focus on rare and infectious diseases [12]. This project will facilitate the introduction of genomic medicine into NHS care while contributing to more personalized medicine [11]. Clinicians hope to achieve earlier diagnoses and develop more effective treatments with these data [13]. Researchers also hope to gain a better understanding of cancer.

Genome Canada has proposed a national, clinical genomics project, which aims to advance precision medicine for all Canadians, with an initial pilot focused on rare disease [14]. The proposal is to introduce genomic testing as part of clinical care. The data will then be made available as a research platform. The vision is to establish a national cohort, perhaps through a federation of provincial datasets.

The European Joint Program on Rare Diseases (EJP RD) brings 130 institutions together across 27 EU Member States as well as Canada, Armenia, Georgia, Israel, Norway, Serbia, Switzerland and Turkey, to accomplish its two main goals: [15].


social, economic, health service) coupled with accelerated exploitation of research results for benefit of patients" [15].

To achieve its objectives, the EJP RD has developed a five-pillar structure subdivided into various themes and activities, such as Joint Transnational Calls for collaborative research projects; a common virtual platform for discoverable data and resources for rare disease research; and capacity building and training of patients and researchers in rare disease research and processes. All of these aim to accelerate the validation, use, and development of innovative methodologies tailored for clinical trials in rare diseases [15].

Through the Breaking Barriers to Health Data Project, the World Economic Forum is "partnering with genomics institutes in the United Kingdom, the United States, Canada and Australia" [16] to pilot a governance framework "to support the effective and responsible use of federated data systems to advance rare disease diagnostic and treatment-related research" [16]. Federated data systems enable researchers to query a distributed network of secure databases. The individual patient data remain hidden in each of the secure nodes. This pilot project aims to demonstrate a proof-of-concept for federated data systems, accompanied by an economic analysis and a scalable governance framework [16].
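The idea of querying a distributed network while individual patient data stay inside each node can be illustrated with a minimal Python sketch. It is not the World Economic Forum pilot's design: the `Node` class, the `federated_count` function, the example records, and the small-count suppression threshold (`min_cell_size`) are all hypothetical, introduced only to show the general pattern of returning aggregates rather than records.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]


@dataclass
class Node:
    """One secure database in a hypothetical federation."""
    name: str
    _records: List[Record]   # never returned to the caller
    min_cell_size: int = 5   # assumed disclosure threshold, not from the source

    def count(self, predicate: Callable[[Record], bool]) -> Optional[int]:
        """Answer a query with an aggregate count only.

        Counts below the threshold are suppressed (None) so that a
        very small group of patients cannot be singled out.
        """
        n = sum(1 for r in self._records if predicate(r))
        return n if n >= self.min_cell_size else None


def federated_count(nodes: List[Node],
                    predicate: Callable[[Record], bool]) -> dict:
    """Send the same query to every node and combine the aggregates."""
    per_node = {node.name: node.count(predicate) for node in nodes}
    total = sum(c for c in per_node.values() if c is not None)
    return {"total": total, "per_node": per_node}


# Synthetic example data (made up for illustration).
uk = Node("uk", [{"gene": "CFTR"}] * 7 + [{"gene": "DMD"}] * 2)
ca = Node("ca", [{"gene": "CFTR"}] * 6)
au = Node("au", [{"gene": "CFTR"}] * 3)

result = federated_count([uk, ca, au], lambda r: r["gene"] == "CFTR")
print(result)
```

In this sketch the researcher only ever sees per-node counts and their sum. The third node's count falls below the assumed threshold and is withheld, which is one possible way a governance rule could be enforced in software.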

An example of a commercial initiative to overcome the geographic barriers to rare disease research is the start-up RDMD [17]. This company aims to generate a rich, regulatory-grade biobank, database, and registry of patients with rare disease from across the United States (US) and internationally. The start-up leverages the rights of patients in the US and in other countries to request access to their health records and biospecimens for onward transfer to RDMD. RDMD then looks to enter into partnerships with pharmaceutical companies to accelerate their research into rare disease therapies. Patients are provided with access to their aggregated and structured medical record through an app.

#### **2.2 Clinical trial transparency**

Improving the transparency of clinical trials has been an important public health priority for regulators and policy-makers in recent years. Clinical trial transparency encompasses the registration of clinical trials before recruitment, the timely dissemination of results, whether positive or negative, and the sharing of individual patient data supporting those results [18]. Sharing of individual patient data enables reproducibility studies to confirm the validity of results, and facilitates meta-analyses. Transparency can also accelerate research and reduce duplicative trials that waste resources and expose participants to unnecessary risks. Regulators increasingly publish the clinical data submitted by pharmaceutical companies seeking market approval [19]. Some sponsors also proactively make individual patient data available. There are now several platforms that facilitate clinical trial data sharing, including the Yale Open Data Access (YODA) project [20], ClinicalStudyDataRequest [21], and Vivli [22].

Ensuring that the results of clinical trials as well as the underlying data are made available is perhaps even more important for rare disease clinical trials. Regulators sometimes allow more flexibility and accept greater clinical uncertainty to accelerate approvals of drugs for rare diseases with high unmet need. This is because of the "unique challenges that hinder efficient and effective traditional clinical trials, including low patient numbers, limited understanding of disease pathology and progression, variability in disease presentation, and a lack of established endpoints" [1]. Where there is greater uncertainty over the meaning of research data, there is a greater need for transparency to support regulators, prescribing physicians, and patients. Sharing individual patient data does raise concerns about patient privacy, discussed below. The tension between transparency and privacy, however, tends to be overstated, as benefits can be promoted and risks can be reduced through governance mechanisms. Moreover, rare disease patients are generally supportive of greater transparency, as long as their privacy is protected, appropriate steps are taken to seek their consent, and patient groups are involved in the design of data sharing governance.

*International Data Sharing and Rare Disease: The Importance of Ethics and Patient Involvement DOI: http://dx.doi.org/10.5772/intechopen.91237*

#### **2.3 Access to medicines**

Even where approved medicines are available for rare disease, an additional hurdle is convincing health technology assessment bodies that these medicines, which are often very expensive per patient, are cost-effective [23]. There is often significant uncertainty over the clinical and economic value offered by rare disease therapies, in part because of the limits to generating clinical evidence in small patient populations. One potential solution to accelerate patient access is through managed access programs. Where countries offer these programs, drugs may be given early approval despite some uncertainty over value, under the condition of ongoing collection of data to fill in evidentiary gaps. Real world evidence is collected through post-market surveillance to confirm that the drug delivers value. Post-market surveillance, however, is challenging and requires effective data sharing strategies and infrastructure. Moreover, it is important to involve patients in decisions to approve drugs where there is greater uncertainty over benefits and risks. Patients can also be engaged in establishing the conditions under which a drug would meet or fail to meet the terms of a managed access agreement. Indeed, patients are increasingly involved in health technology assessment to ensure that the drugs are delivering the clinical, economic, and personal value that matters to them [24]. Given the diverse burdens of disease on rare disease patients and their caregivers, they have important perspectives on the true value that can be delivered by new therapies.

#### **3. Privacy**

Data-intensive medicine, and the research, biobanking, and data sharing that necessarily accompany it, all raise privacy concerns for patients. In the Big Data era, increasingly rich data are being generated as part of clinical care and research protocols. Traditionally, privacy in research was primarily protected by removing or separating identifiers from research data. Rich, multi-dimensional health data, however, can no longer be definitively de-identified. Genomic data, for example, are rich, unique to the individual, stable over time, and shared across families. They also contain potentially sensitive information about the health predispositions of individuals and their families. Genomic data therefore raise particular concerns about the limits of de-identification [25]. But the problem is broader than just genomic data. A recent study showed that 99.98% of Americans could be re-identified in a dataset using 15 demographic attributes [26]. Re-identification is increasingly seen as an inherent risk in research. This risk increases as the dimensionality of data increases, as more data become publicly available, and as new statistical re-identification tools emerge. If patients are re-identified, sensitive information about their health may be disclosed to unauthorized third parties, including employers, insurers, and family members, and may be used to discriminate against or stigmatize the individual or their family. Sharing patient data with clinicians and researchers around the world may heighten concerns over privacy. Where data are copied and distributed to many different parties, there is a greater potential for a breach of confidentiality or security, and lower confidence that the breach will be identified and rectified.

Rare disease patients may face a greater risk of re-identification or subsequent harm. They may be easier to single out in a dataset, given their unique genotypes and phenotypes, and the small number of participants in a study. Rich data are often collected or generated about rare disease patients, such as whole genome sequences and pictures and videos of their phenotypes. In order to match similar patients to inform a diagnosis, or to conduct a study with an acceptable sample size, information about rare disease patients must often be shared beyond institutions and national borders. Moreover, in part because of institutional and geographical barriers to care and participation in research, many rare disease patients share rich health information about themselves online with patient support groups or researchers. In fact, there are numerous academic and commercial research efforts that enable remote participation of rare disease patients to overcome geographic barriers [27]. The public availability of patient information could potentially increase the risk of re-identification in research datasets.
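The way singling out becomes easier as more attributes are combined, and as a rare characteristic is added, can be sketched in a few lines of Python. This is an illustrative calculation on made-up records, not the method of the study cited above: the function name `unique_fraction`, the attribute names, and the six-person cohort are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def unique_fraction(records: List[Dict[str, str]],
                    attributes: Sequence[str]) -> float:
    """Fraction of records whose combination of values is unique.

    A record that is the only one with its combination of attribute
    values can be singled out by anyone who knows those values.
    """
    if not records:
        return 0.0
    keys = [tuple(r[a] for a in attributes) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(records)


# Synthetic cohort, invented purely for illustration.
cohort = [
    {"birth_year": "1990", "region": "North", "diagnosis": "common"},
    {"birth_year": "1990", "region": "North", "diagnosis": "common"},
    {"birth_year": "1990", "region": "South", "diagnosis": "common"},
    {"birth_year": "1990", "region": "South", "diagnosis": "rare"},
    {"birth_year": "1985", "region": "North", "diagnosis": "common"},
    {"birth_year": "1985", "region": "North", "diagnosis": "common"},
]

for attrs in (["birth_year"],
              ["birth_year", "region"],
              ["birth_year", "region", "diagnosis"]):
    print(attrs, round(unique_fraction(cohort, attrs), 2))
```

In this toy cohort nobody is unique on birth year and region alone, but adding the diagnosis makes the one patient with the rare diagnosis unique, along with the patient who otherwise shared that patient's birth year and region. Real datasets have far more attributes, so the effect is typically much stronger.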

At the same time, many patients with rare disease see the important clinical and scientific value of data sharing and are willing to participate if research involves appropriate consent processes, safeguards, and patient involvement. A number of solutions have evolved to reduce the tension between privacy and openness. The first solution is to develop more transparent consents about how data are shared. This is recommended by the Global Alliance for Genomics and Health (GA4GH) Consent Policy [28]. Consent is discussed in greater detail in the next section. The second solution is through safeguards and governance, including robust de-identification, security protections, and access controls. Responsible data governance aims to maximize uses of data that benefit science and society, minimize risks to data subjects, and strike a proportionate balance where these interests come into conflict [29]. Risks of data breaches or misuse when sharing data can be significantly reduced through governance mechanisms including due diligence review of access requests by an expert committee, data access agreements that protect participant privacy, and ongoing monitoring of data use. Sharing data within secure cloud environments can enhance security and accountability by limiting the distribution of copies of datasets. Federated network technologies now allow researchers to submit search queries or run research analyses across multiple secure patient databases, without ever having to access the underlying individual patient data. The World Economic Forum is exploring such an approach specifically for international rare disease research (see Section 2.1). A third solution, also discussed below, is greater patient involvement in the design of research and data sharing governance, to ensure their input on priorities and the balancing of risks and benefits under uncertainty.

There is also a risk of too much privacy protection in the rare disease context. Data privacy laws are tightening globally in response to concerns over commercial and law enforcement surveillance practices. Europe's *General Data Protection Regulation* (GDPR) is now in force, and California will soon be introducing its own comprehensive consumer data privacy regime [30]. The GDPR imposes stricter, more formal procedural and security safeguards for the protection of personal data, particularly for special categories of data such as health and genetic data. It also imposes higher consent standards with regard to the purposes of processing, and transfers between organizations and across borders. Different national and institutional interpretations of the GDPR have hampered international health research collaborations [31]. Formal legal safeguards and strict transparency requirements leave organizations with less flexibility to share samples and data about rare disease patients, especially internationally, even where researchers seek explicit patient consent and/or patient involvement in data sharing governance.
