Data collection, processing, controlling

Introduction

The development of medical research, and more broadly scientific research, as well as healthcare, including in the field of gene and cell therapy, is today inseparable from the processing and analysis of health data, the production of which has greatly accelerated in recent years. Indeed, medical research and healthcare seem to be guided by the exploitation of this data, which is seen as a mine of information that is essential for research (data-driven research) today. As health data reveal sensitive information about the individuals concerned, their use is strictly regulated in order to preserve the rights of the individuals concerned (also referred to as « data subjects »). Beyond their use, it is their re-use (also referred to as « further processing » or « secondary use ») that is of interest in the field of medical research especially. Indeed, the ever-increasing production of data today, combined with the evolution of analysis techniques for data processing, including artificial intelligence techniques, makes it possible to extract useful information for medical research and healthcare from the masses of data collected. Nevertheless, due to the high sensitivity of health-related data, the use of such techniques must be strictly controlled so as not to infringe the rights and freedoms of the data subjects.

The General Data Protection Regulation (Regulation (EU) 2016/679, hereinafter referred to as the GDPR), which became applicable on 25 May 2018, strictly regulates the collection and processing of personal data. The GDPR imposes obligations on public and private organisations, regardless of their size and whether they process personal data on their own behalf or not, that are established in Europe or that, without being established in Europe, target or collect data relating to individuals within the European Union. Chapter VIII of the GDPR provides for severe fines for non-compliance with its rules (up to 20 million euros or, in the case of a company, up to 4% of annual worldwide turnover, whichever is higher). These penalties may be made public.
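As an illustration of the fine ceiling mentioned above, the following minimal sketch computes the upper bound for a company, assuming the higher of the two amounts applies (as in Article 83(5) GDPR). The function and constant names are hypothetical, and the result is only a ceiling: the fine actually imposed depends on the circumstances of each case.

```python
from decimal import Decimal

# Ceiling values quoted in the text above.
FIXED_CAP_EUR = Decimal("20000000")  # 20 million euros
TURNOVER_RATE = Decimal("0.04")      # 4% of annual worldwide turnover


def max_fine_eur(annual_worldwide_turnover_eur: Decimal) -> Decimal:
    """Return the upper bound of the fine for a company.

    Assumption: the higher of the fixed cap and the turnover-based cap applies.
    """
    turnover_cap = TURNOVER_RATE * annual_worldwide_turnover_eur
    return max(FIXED_CAP_EUR, turnover_cap)
```

For a company with a turnover of 1 billion euros, the turnover-based cap (40 million euros) exceeds the fixed cap, whereas for a turnover of 100 million euros the fixed cap of 20 million euros remains the ceiling.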

Regarding the re-use of data for scientific research purposes, the GDPR provides a legal framework which continues to fuel debates in its interpretation.

Regarding health data and genetic data specifically, they are considered by the GDPR as belonging to special categories of personal data (please see also the subentry "Data classification"). Indeed, because of the potential information on the state of health of a person that they are likely to reveal, their sensitivity has led to the establishment of a general principle of prohibition of their processing (Art. 9(1) GDPR) - a principle to which there are several exceptions (Art. 9(2) GDPR).

The protection of personal data is a fundamental right recognised by the European Union (EU) Charter of Fundamental Rights in its article 8. Individuals' data must be processed fairly, for specified purposes and on the basis of the consent of the data subject or other legitimate basis laid down by law (Please see also the subentry “Main principles”). Every individual has the right to access and rectify the data collected about him/her.

Stakeholders

Data subject: The identified or identifiable person from whom the data has been collected or who is concerned by the data processing carried out if the data has not been collected directly from him/her and to whom the GDPR has granted rights regarding the use of his/her personal data.

Data controller: The controller is the legal person (company, municipality, etc.) or individual who determines the purposes and means of a processing operation, i.e. the objective and the way it is carried out. In practice and in general, it is the organization as such, and not an individual within the organization, that acts as controller. The GDPR gives a definition of the controller in its article 4(7).

The data controller is therefore the one who has the responsibility for the processing, who is in charge of the "controlling". According to the principle of accountability foreseen by the GDPR, the controller is responsible for the compliance of the personal data processing with the rules of personal data protection.

It is also possible to have several controllers: they are then joint controllers for the processing and define, in a specific arrangement between them, their respective obligations for ensuring the compliance of the processing operation(s) carried out (Article 26 of the GDPR).
For more information: European Data Protection Board, Guidelines 07/2020 on the concepts of controller and processor in the GDPR.

Processor: The processor is a natural or legal person, public authority, department or other body which processes personal data on behalf of the controller (Article 4(8) GDPR). The processor(s) must provide sufficient guarantees as to the implementation of appropriate technical and organisational measures so that the processing complies with the law on the protection of personal data and guarantees the protection of the rights and freedoms of the individuals. Articles 28 and 29 GDPR lay down the rules governing the delegation of processing operations to processors by the controller(s). For more information: European Data Protection Board, Guidelines 07/2020 on the concepts of controller and processor in the GDPR.

Data Protection Officer: The Data Protection Officer (referred to as « DPO ») is responsible for ensuring compliance with the GDPR within the organisation that has appointed him/her with regard to all processing carried out by that organisation. He/She is responsible for monitoring compliance with the rules of the GDPR, but also with other provisions of EU or Member States' law on personal data protection, and with the internal rules put in place by the controller and the processor on personal data protection. In certain cases defined in Article 37 of the GDPR, the designation of a DPO by the controller is mandatory (in particular when the core activities of the controller include the large-scale processing of special categories of data, including health data). The designation, position and tasks of the DPO are detailed in Section 4 of Chapter IV of the GDPR (Articles 37 to 39). In particular, where a DPO is designated, data subjects may contact him/her with any questions relating to the processing of their personal data and the exercise of their rights. The DPO's involvement in the framing of any personal data processing is highly important, in particular where sensitive categories of personal data are at stake and where a data protection impact assessment is necessary.

Supervisory authority: The supervisory authority is an independent public authority established by a Member State to monitor the application of data protection laws in order to protect the fundamental rights and freedoms of individuals with regard to the processing of their personal data and to facilitate the free flow of personal data. Data subjects affected by the processing of their personal data may refer to this authority if their rights are violated. The supervisory authorities liaise with data controllers and/or DPOs to ensure the compliance of the processing of personal data in the territory.

In the case of cross-border data processing, the supervisory authority of the main establishment or of the single establishment of the controller or processor is competent to act as the lead supervisory authority (Article 56 GDPR).

Chapter VI of the GDPR (Articles 51 to 59) describes their status, organisation, competences, tasks and powers.

(Please see also the subentry “Data protection authorities”)

European Data Protection Board (EDPB): The European Data Protection Board is an independent body of the European Union provided for by the GDPR (Articles 68 to 76). It replaced the Article 29 Working Party (Art.29 WP). Its main task is to ensure the harmonised interpretation of the GDPR in all EU Member States. On the basis of Article 64 of the GDPR, it is called upon to issue formal opinions, and on the basis of Article 65, binding decisions in the event of disputes between authorities. Furthermore, through the development of guidelines and opinions, the European Data Protection Board contributes to the common position of EU data protection authorities by providing clarification on the interpretation of the GDPR provisions.

Definitions

Data collection: Refers to the first step in the data processing activity. It consists of gathering, by whatever means, the data that will be useful for the purpose of the processing. As Article 5(1)(b) of the GDPR reminds us, personal data must be collected for specified, explicit and legitimate purposes and may not be further processed for a purpose incompatible with the original purpose for which they were collected.

The collection operations are considered to be personal data processing activities and must therefore comply with the requirements of the GDPR.

This collection may be made by the controller directly from the data subjects. Also, the data may not have been collected directly from the data subjects, in particular in the case of re-use of data collected by another controller.

Data subjects must be informed of this processing, even if the data have not been collected directly from them (the information to be provided is listed in Articles 13 and 14 GDPR), unless an exception provided for in Article 14 of the GDPR applies.

Data acquisition: Refers to the situation where the data concerned is acquired not by collection from the data subjects but from a third party who has already collected the data for other purposes. For example, data may be acquired from a data repository, transferred from a data logger or obtained through data capture. In this regard, Recital 10 of Regulation (EU) 2022/868 on European Data Governance uses the term data acquisition with reference to Directive (EU) 2016/943, which sets out the framework for the lawful obtaining, use or disclosure of trade secrets.

Data processing: Refers to an operation or set of operations which relates to personal data, whatever the process used (collection, recording, organisation, storage, adaptation, alteration, retrieval, consultation, use, disclosure by transmission or dissemination or otherwise making available, alignment, through automated or non-automated means). (Article 4(2) GDPR)

This notion includes a broad range of practices such as data preparation and data analysis, regardless of the purpose of the processing operation.

Data processing must meet the conditions of lawfulness laid down by the GDPR and must in particular have one or several purposes determined prior to the collection and use of the data.

A distinction can be made between two types of processing: primary use (or initial processing) and secondary use of the data (also referred to as “reuse” or “further processing”):

  • Primary use refers to the situation where data have been collected and processed for a specific purpose.
  • Secondary use refers to the processing of data previously collected for a purpose different from that originally envisaged at the time of collection. This secondary use potentially includes controllers other than the one(s) who originally collected the data.

There are different data processing techniques:

  • Manual processing (i.e. non-computerised processing),
  • Automated processing (i.e. computerised processing that can include artificial intelligence systems).

Thus, personal data processing is not necessarily computerised: paper files are also concerned and must be protected under the same conditions. Whatever the personal data processing technique used (automated or not), the rules of the GDPR will apply.

Artificial Intelligence System (AIS): According to the second version of the compromise text on the proposed Regulation on Artificial Intelligence dated 15 July 2022, artificial intelligence system means a system designed to operate with a certain level of autonomy and that, based on machine and/or human-provided data and inputs, infers how to achieve a given set of human-defined objectives using machine learning and/or logic- and knowledge based approaches, and produces system-generated outputs such as content (generative AI systems), predictions, recommendations or decisions, influencing the environments with which the AI system interacts.

Among the artificial intelligence techniques used for data processing, two are frequently encountered, machine learning and deep learning:

  • Machine learning: Machine learning is an artificial intelligence technique that aims to give machines the ability to "learn" from data, via mathematical models. More specifically, it is the process by which relevant information is drawn from a training data set.
  • Deep learning: Deep learning refers to systems that learn via neural networks without being guided by humans and whose logic may be difficult for the developer to explain. Deep learning is a sub-field of machine learning techniques.

Supervisory Authority: Refers to one or more independent public authorities established within a Member State to supervise the application of data protection rules in order to protect the fundamental rights and freedoms of individuals with regard to the processing of their personal data, but also to facilitate the free flow of personal data within the European Union. Individuals, data controller(s) and processor(s) can interact with this authority in case of a breach of rights, but also for any other question related to the application of the GDPR and the national laws in force regarding the protection of personal data. See Articles 51 to 59 (Chapter VI) of the GDPR.

Data Protection Impact Assessment (DPIA): Refers to a specific study of personal data processing risks, data protection threats and assets that must be carried out when a processing of personal data is likely to result in a high risk to the rights and freedoms of data subjects. The DPIA helps identify suitable risk mitigation/elimination measures at technical and organizational levels. See Articles 35 and 36 GDPR.

Record of processing activities: The record of processing activities provides an inventory of personal data processing and an overview of what the controller does with personal data. In particular, it identifies: the parties involved in the processing, the categories of data processed, what the data is used for, who accesses it and to whom it is communicated, how long the personal data is kept and how it is secured. This record of processing activities must be kept by the controller and, where applicable, the controller's representative. See Article 30 GDPR.
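The elements listed above can be sketched as a simple data structure. This is only an illustrative sketch: the class name, field names and completeness check are hypothetical and do not reproduce the exact wording or the full list of items required by Article 30 GDPR.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class ProcessingRecord:
    """One entry in a record of processing activities (illustrative fields only)."""
    controller: str               # party responsible for the processing
    purposes: List[str]           # what the data is used for
    data_categories: List[str]    # categories of data processed
    recipients: List[str]         # who accesses the data or receives it
    retention_period: str         # how long the personal data is kept
    security_measures: List[str]  # how the data is secured
    processors: List[str] = field(default_factory=list)  # optional: processors involved

    def missing_entries(self) -> List[str]:
        """Return the names of mandatory fields that were left empty."""
        optional = {"processors"}
        return [
            f.name
            for f in fields(self)
            if f.name not in optional and not getattr(self, f.name)
        ]
```

A controller could run `missing_entries()` on each entry before an internal review in order to spot incomplete documentation early.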

Code of conduct: Refers to a practical guide detailing concrete application of the applicable data protection regulation to a given sector of activity and consisting of good practices (retention period, information mentions, operating methods [SOPs], etc.) that can be adhered to by the controller and processor for a specific processing.

Within the meaning of the GDPR, a code of conduct is developed by professional actors (federations, professional organisations) representing categories of data controllers or processors. An organisation can freely adhere to a code of conduct. The GDPR provides that a third-party body will be responsible for monitoring compliance with a code, which gives this compliance tool a binding character. Codes of conduct may be European or national; in the latter case, the national supervisory authority responsible for monitoring the application of the GDPR will validate their content before publishing them. See Articles 40 and 41 GDPR.

Certification mechanism: Article 42 of the GDPR encourages the establishment of data protection certification mechanisms to demonstrate that processing activities comply with data protection regulations. The certification mechanism is a procedure whereby an external certification body or the competent supervisory authority gives written assurance that a product, process or service complies with the requirements given in a standard. Certification is binding for the certified body and gives rise to regular checks by the third-party certifier on compliance with the standard through audits and examinations. This certification has a limited duration and must therefore be renewed. See Articles 42 and 43 GDPR.

Challenges

Because of the rapid evolution of technologies, the growing production of health data and the promises that their processing holds for medical research and healthcare, there is a real desire to enhance the value of health data by allowing their accessibility and reuse, including cross-referencing with other data (geographical or environmental data, for example), in order to enable innovations and improvements in medical knowledge and patient care, notably through the acceleration of collaborative scientific research. However, aside from the benefits that could be extracted from personal data processing, the collection of personal data entails risks for the rights and freedoms of data subjects that must be duly considered. Indeed, information revealing directly or indirectly the health status of individuals is potentially at risk of misuse (please see also the subentry “Mission creep/Data misuse”) if it is used for other purposes, such as discrimination by an insurance company or a bank. It is therefore essential to plan appropriate measures that will ensure responsible and ethical personal data collection and processing operations in research and enhance the trust of data subjects.

Thus, a balance must be struck between protecting the rights of individuals and making data available to the research and health community. Indeed, it is necessary to find a balance between the under-exploitation of data, which would jeopardise research carried out in the interest of the population, and unregulated sharing and accessibility, which would create major risks for the rights and freedoms of individuals.

It is in particular this balance between preserving the rights of individuals and enhancing the value of health data that is sought in the future regulation on the European Health Data Space (EHDS). This future EHDS aims to develop a specific ecosystem for health allowing the establishment of rules, practices, common standards, infrastructures and a governance framework. This space aims:

  • To facilitate the use of health data within Europe to improve patient care across Europe by allowing European citizens to control their own health data within their own country but also across borders, and
  • To promote the re-use of health data for research, innovation and policy-making.

The forthcoming establishment of the European Health Data Space (EHDS) has as a main objective to strengthen the effectiveness of individuals' rights over their data and to open new avenues for controlled health data access and processing for public interest purposes, including for scientific research.

In order for the individuals concerned to be able to exercise their rights under the GDPR and national data protection laws, they need to be informed in a clear and transparent manner about the processing carried out and their rights in relation to that processing. This information and preservation of individual rights is essential in order to guarantee so-called "participatory" research where individuals are consulted or at least informed about the uses of their data. Since medical research relies on the participation of individuals, it is essential to preserve a link with individuals in order to foster their trust. This trust is at the heart of the current regulatory movements in order to ensure the long-term responsible use of health data.

In this respect, the TEHDAS (Towards European Health Data Space) project, a joint action mobilising twenty-five European countries to provide the concepts to be included in the future European legislative act for the creation of the European Health Data Space, has placed citizen engagement at the heart of its work. This joint action will soon publish recommendations on this subject. A citizen e-consultation was launched on this subject, running between the end of 2021 and the end of 2022, in order to collect the views of European citizens on the use and re-use of their health data, and thus to propose recommendations on this topic. Overall, this joint action aims to create governance conditions that allow all actors involved in the creation of this European Health Data Space to access health data in a secure and transparent way, regardless of where it is stored within Europe. Find out more and read their findings here.

Specifically with respect to genomic research, the Global Alliance for Genomics and Health (GA4GH) produces guidelines that are internationally recognized as standards for developing frameworks to enable responsible genomic data sharing within a human rights framework. Regarding citizens/patients’ engagement issues, they have produced guidelines for establishing a framework for involving and engaging participants, patients and publics in genomics research and health implementation.

Moreover, the future EU Regulations on artificial intelligence and on the future European Health Data Space (EHDS) offer an important place to ethical considerations related to the use of artificial intelligence and the use of health data. In particular, the European Commission has published guidelines on the concept of ethics by design for the design, development and use of ethical and responsible artificial intelligence solutions including data management and data protection rules.

Overall, it is important to recall that ethical considerations are an important aspect of medical research and are reflected in the World Medical Association's Declaration of Helsinki, which is considered to be the founding text of bioethics. This declaration sets out the ethical principles applicable to medical research involving human beings, including research on human biological material and identifiable data. Moreover, ethical considerations related to medical research are also reflected in the European Regulation on clinical trials on medicinal products for human use of 16 April 2014. Indeed, its Recital 80 recalls that the Regulation is in line with the principles and good practices emanating from the Helsinki Declaration. In this respect, it is recalled that any individual (or his/her legal representative) must have given informed consent to participate in a clinical trial (Article 28(1)(c) of Regulation (EU) No 536/2014 on clinical trials on medicinal products for human use).

Genetic data is particularly sensitive because of the intimate and unique nature of genetic heritage, which makes it particularly personal and identifiable (in particular where human genomic sequence is used); it can also involve information relating to third parties such as family members. It has a real informative character but also a potentially predictive character concerning the state of health of an individual, which explains the special protection afforded to it at European level by the rules of the GDPR. In this respect, the Council of Europe's Oviedo Convention of 4 April 1997 prohibits any form of discrimination on the basis of genetic heritage. Also, the Charter of Fundamental Rights of the European Union in its article 21 prohibits any form of discrimination based on an individual's genetic characteristics.

Thus, because of the particular characteristics of genetic data, it is legally preferable not to consider it as data that can be anonymised but as pseudonymised, in order to guarantee appropriate protection and individual rights. For some specific cases, such as somatic genetic testing (e.g. using the genome of tumour cells), it is still considered possible to anonymise the data provided that strict anonymisation conditions are respected (reference text on the subject: Article 29 Data Protection Working Party, Opinion 05/2014 on Anonymisation Techniques, 10 April 2014).
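To illustrate the difference between pseudonymised and anonymised data, here is a minimal sketch of one common pseudonymisation technique, a keyed hash (HMAC-SHA256) applied to a direct identifier. The function name and the sample identifier format are hypothetical, and this is only one possible approach: it replaces the identifier attached to a record but does nothing about the identifying nature of the genomic sequence itself.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    Anyone holding secret_key can recompute the pseudonym for a known
    identifier and re-link the record, so the output remains personal data.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

Because the same key always yields the same pseudonym for a given identifier, records about one person can still be linked across datasets. This is useful for research, and it is also the reason such data is pseudonymised rather than anonymised.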

Also, one of the challenges at European level is the fragmentation of the regulatory landscape regarding the rules applicable to the processing of genetic and health data. Indeed, due to the specificity of such data, the GDPR has provided for the possibility for Member States to introduce additional conditions, including limitations on the processing of such data (Article 9(4) GDPR). Thus, the legal framework is likely to change from one State to another. The data controller will therefore have to ensure that, in addition to the rules of the GDPR, it complies with national laws.

Specifically for the re-use of data for scientific research, the regulatory framework at EU level has yet to be clarified. In practice, it is still subject to divergent interpretations in the various Member States. The European Data Protection Board has responded to some of the questions raised in this regard. Guidelines focusing on the processing of personal data for scientific research purposes are expected from the European Data Protection Board.

Artificial intelligence will soon be regulated by the future EU Regulation on artificial intelligence (AI Act), which was proposed on 21 April 2021. This future regulation aims to provide a harmonised framework for AIS developed and commercialised in the EU. The approach adopted is a risk-based classification, reflecting the different risks of rights violations that arise in the use of AIS (for example, data quality bias potentially leading to discriminatory bias), all the more so when dealing with sensitive data such as health and genetic data. This future regulation will also establish harmonised rules for the marketing, commissioning and use of AI systems in order to ensure a high level of protection within the EU and to guarantee the respect of EU values and fundamental rights and principles. The EU aims to promote the adoption of ethical and trustworthy AI. The High-Level Expert Group on AI constituted by the European Commission has already established guidelines on ethical and trustworthy AI systems. It has also provided recommendations for the development of AI systems that are ethical by design, promoting respect for human beings, privacy, personal data protection and data governance, fairness, individual, social and environmental well-being, transparency, accountability and oversight.

UNESCO has also set out principles and values to be taken into account as a basis for the development of AI systems that are ethical, i.e. at the service of individuals, societies, the environment and ecosystems.

Also, the European Commission has recently proposed two directives to protect consumers and foster innovation: one adapting product liability rules to the digital age and one on AI liability rules.

Opportunities and incentives

Data sharing and re-use initiatives of health and genetic data for medical research are a real opportunity for the development of medical research and the improvement of patient care. In this respect, the future European Health Data Space has focused one of its objectives on promoting the re-use of data for research, innovation and public policy. Several European projects promote this re-use of data for research purposes in a framework that complies with data protection laws and preserves the rights and freedoms of individuals. We can cite as examples BBMRI-ERIC, the European project CINECA (Common Infrastructure for National Cohorts in Europe, Canada, and Africa), and the Data Analysis and Real World Interrogation Network (DARWIN EU).

About the Data Analysis and Real World Interrogation Network (DARWIN EU), this initiative has the ambition to establish a coordination center to provide timely and reliable evidence on the use, safety and efficacy of human medicines including vaccines from real-world healthcare databases across the EU. In particular, this data can be used by the European Medicines Agency and national medicines control authorities at any time throughout the life cycle of a medicine. The DARWIN project is directly linked to the establishment of the European Health Data Space, as the DARWIN project will contribute to the development of this data space. Indeed, their objectives are similar as the DARWIN initiative will enable the exchange of healthcare data for use in healthcare delivery, policy making, and research in Europe while fully respecting data protection requirements.

Interactions with regulators

Interactions between the controller and/or processor and a regulatory authority

Cooperation of the controller and processor with the supervisory authority (Article 31 GDPR). At the request of the competent supervisory authority, the controller and the processor must cooperate with that authority. In particular, the documentation (including the record of processing activities) attesting to the compliance of the processes for collecting and processing personal data must be available at the request of the supervisory authority.

Prior consultation of the supervisory authority by the data controller in the case of personal data processing presenting a high risk (Article 36 GDPR). As a reminder, when the planned data processing involves sensitive data, the controller must carry out a Data Protection Impact Assessment prior to processing the data (Article 35 GDPR). This impact assessment must therefore be carried out in the case of the processing of health and genetic data. If this analysis shows that the processing would present a high risk for the data subjects that the controller cannot mitigate through appropriate measures, he/she must consult the supervisory authority prior to implementing the processing. National supervisory authorities have adopted lists of processing operations subject to a mandatory DPIA.

Personal data breach (Articles 33 and 34 GDPR). In the event of a personal data breach, the controller must notify the competent supervisory authority without undue delay and, where feasible, within 72 hours of becoming aware of the breach, unless the breach is unlikely to result in a risk to the rights and freedoms of natural persons. The processor shall notify the controller of any personal data breach as soon as possible after becoming aware of it. Article 33 GDPR details the information that must be included in the breach notification. The controller must document any personal data breach and the steps taken to address it. This documentation may allow the controller to demonstrate to the supervisory authority, if necessary, that the breach procedure has been followed.

If the breach in question is likely to result in a high risk to the rights and freedoms of an individual, the controller must communicate as soon as possible to the data subject in clear and simple terms the nature of the breach and must include the information described in Article 34(2) GDPR. In certain specific cases described in Article 34(3) GDPR, this communication is not necessary.
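The 72-hour notification window can be tracked with a small helper. This is a minimal sketch that assumes the window starts when the controller becomes aware of the breach (as in Article 33(1) GDPR); the function names are hypothetical, and the decision whether notification is required at all remains a legal assessment.

```python
from datetime import datetime, timedelta, timezone

# Notification window for the controller, counted from awareness of the breach.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Return the latest time at which the supervisory authority should be notified."""
    return became_aware_at + NOTIFICATION_WINDOW


def is_late(became_aware_at: datetime, notified_at: datetime) -> bool:
    """Return True if the notification was sent after the 72-hour window."""
    return notified_at > notification_deadline(became_aware_at)


# Example: awareness on 1 March 2024 at 09:00 UTC gives a deadline of 4 March 2024 at 09:00 UTC.
aware = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(aware)
```

Using timezone-aware timestamps avoids ambiguity when the controller, the processor and the supervisory authority are located in different time zones.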

Interactions between the Data Protection Officer and a regulatory authority

Interaction between the Data Protection Officer and the national supervisory authority. By virtue of the tasks conferred by the GDPR (Article 39) on the Data Protection Officer, the latter must cooperate with the supervisory authority if necessary (for example by communicating the documentation attesting to the compliance of the data processing activities implemented) and must act as a contact point for the supervisory authority if necessary (in particular in the case of consultation prior to a data processing operation presenting a high risk).

Interactions between data subjects and a regulatory authority

Complaints by individuals to a supervisory authority (Article 77 GDPR). Any data subject has the right to lodge a complaint with a supervisory authority if he/she considers that the processing of personal data relating to him/her constitutes a breach of data protection rules. The supervisory authority shall inform the person lodging the complaint of the status and outcome of the complaint, including the possibility of a judicial remedy as provided for in Article 78 GDPR.

Right to an effective judicial remedy against a supervisory authority (Article 78 GDPR). Any natural or legal person has the right to an effective judicial remedy against a legally binding decision of a supervisory authority concerning them. Any data subject also has this right where the supervisory authority fails to deal with a complaint or does not inform the data subject within three months of the progress or outcome of the complaint lodged.

Right to judicial remedy against a controller or processor (Article 79 GDPR). Any data subject has the right to an effective judicial remedy if he/she considers that his/her rights under the GDPR have been infringed as a result of the processing of his/her personal data. This remedy may be sought in addition to any administrative or extrajudicial remedy including the right to lodge a complaint with the supervisory authority as provided for in Article 77 GDPR. Any person who has suffered material or non-material damage as a result of a breach of data protection rules has the right to obtain compensation from the controller and processor for the damage suffered (Article 82 GDPR).

Practical steps

On the collection, acquisition and processing of personal data

What are the principles to be respected in order to establish lawful data collection and processing? The collection and processing of data must meet certain conditions to be considered lawful. These conditions of lawfulness relate to the purpose for which the data is collected and processed, but also to the way in which the data is collected and processed.

Article 5 of the GDPR sets out the principles to be respected regarding the processing of personal data:

  • The data must be processed in a transparent manner: this implies that the data subjects are informed of the processing.
  • The purpose limitation principle: The purpose of the collection must be specific (i.e. not described in terms that are too broad or too vague), explicit (clarity and precision of the information given in advance and preferably in writing), and legitimate (legal requirement or justified by the research). This principle prohibits the processing of data for purposes that are not compatible with those for which they were originally processed. Specifically for scientific research, Recital 33 of the GDPR allows a more flexible approach to the specificity requirement: when it is not possible to fully identify the purpose of processing personal data for scientific research purposes at the time of collection of the data, individuals may give their consent for certain areas of research or certain parts of a research project, in compliance with the ethical standards of scientific research. In practice this can be a research programme comprising several research projects with the same overall aim.
  • The data minimisation principle: data must be adequate, relevant and limited to what is necessary for the purpose of the processing.
  • The accuracy principle: data must be accurate and kept up to date and corrected if necessary.
  • The principle of limited retention: identifying data will only be kept for as long as is necessary for the purpose of the processing operation. However, it is possible to exceed this period for scientific research purposes, provided that technical and organisational measures are implemented to guarantee the rights and freedoms of the data subjects (such as pseudonymisation of data to minimise the risk of identification of the data subjects).
  • The principle of integrity and confidentiality: This principle implies the implementation of technical or organisational measures limiting any misuse of data and guaranteeing data security.

Please see also the subentry "Main principles".

Who is responsible for implementing these principles? Article 5(2) GDPR designates the controller as the person responsible for compliance with these principles, who must also be able to demonstrate that compliance.

How to identify the data controller? The data controller is the legal entity or the natural person who determines the purposes and means of the data processing, i.e. the objective and the way to achieve it. In practice, it is generally the legal entity (for example a research institute), represented by its legal representative. There may be several data controllers if more than one actor is involved in determining the purposes and means of the processing. In this case, these different actors will be jointly responsible for the processing. See European Data Protection Board, Guidelines 07/2020 on the concepts of controller and processor in the GDPR.

Practical documentation on the designation of the data controller and data processor.

When should these principles be implemented? According to the principle of Privacy by Design and by Default (foreseen and described in Article 25 GDPR), the controller must implement appropriate technical and organisational measures (such as pseudonymisation) both when determining the means of processing and when carrying out the processing itself, in order to effectively implement the data protection principles (such as data minimisation) and to ensure that the rights and freedoms of data subjects are respected. A certification mechanism (provided for in Article 42 GDPR) can serve as a demonstration of compliance with the requirements of this principle.

How to ensure that the data processing carried out is lawful? As defined in article 5 GDPR, the processing must be lawful, i.e. any processing must be based on one of the legal grounds set out in article 6 GDPR. The legal basis for processing is the first condition for the lawfulness of the processing. It is prohibited to process personal data without a legal basis.

What are the legal bases under the GDPR for processing personal data? Article 6 GDPR requires that the processing rely on at least one of the following legal bases:

  • consent: the person has consented to the processing of his/her data; or
  • a contract: the processing is necessary for the performance or preparation of a contract with the data subject; or
  • legal obligation: the processing is imposed by legal texts; or
  • public interest task: the processing is necessary for the performance of a public interest task; or
  • legitimate interest: the processing is necessary for the pursuit of legitimate interests of the body processing the data or of a third party, in strict compliance with the rights and interests of the persons whose data are processed; or
  • safeguarding vital interests: the processing is necessary to safeguard the vital interests of the data subject, or of a third party.

Of note, when a legal basis other than consent is chosen, data subjects must nevertheless be informed of the data processing. Where the processing is based on public interest or legitimate interest, they also have the right to object to it (opt-out mechanism, Article 21 GDPR).

Who defines the legal basis for the processing? The legal basis must be defined by the data controller on a case-by-case basis, in a manner appropriate to the situation. In accordance with the principle of responsibility of the data controller, the choice of the legal basis should be documented in order to attest the regulatory compliance of the data processing. Furthermore, the chosen legal basis should be mentioned in the information letters sent to the data subjects of the data processing. In practice, for research activities, the legal basis chosen is most commonly:

  • the public interest (Art.6(1)e) GDPR): used by public organisations or private organisations as long as they pursue a mission of public interest or are endowed with prerogatives of public power; or
  • the legitimate interest (Art.6(1)f) GDPR): commonly used by private organisations for processing that does not significantly affect the rights and interests of data subjects; or
  • the consent (Art.6(1)a) GDPR).

Of note, the processing of special categories of data (such as health-related data or genetic data) must comply with a processing purpose foreseen by Article 9 GDPR (as explained below).

What are the specific rules applicable to the collection and processing of health and genetic data? As explained above, the GDPR has provided for a general prohibition principle for the collection and processing of special categories of personal data, including health and genetic data (Article 9(1) GDPR). However, there are several exceptions to this prohibition principle. They allow the collection and processing of data in very limited cases, including the consent of the data subject, the public interest of the processing, and scientific research purposes (Article 9(2) GDPR).
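The two-layer logic described above (an Article 6 legal basis for any processing, plus an Article 9(2) exception when special categories such as health or genetic data are involved) could be checked roughly as follows. This is a minimal sketch under stated assumptions: `ProcessingActivity` and `lawfulness_gaps` are hypothetical names, and only the three Article 9(2) exceptions mentioned in the text are listed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class LegalBasis(Enum):
    """The six legal bases of Article 6(1) GDPR."""
    CONSENT = "6(1)(a)"
    CONTRACT = "6(1)(b)"
    LEGAL_OBLIGATION = "6(1)(c)"
    VITAL_INTERESTS = "6(1)(d)"
    PUBLIC_INTEREST = "6(1)(e)"
    LEGITIMATE_INTEREST = "6(1)(f)"


class SpecialCategoryCondition(Enum):
    """A subset of the Article 9(2) GDPR exceptions (not exhaustive)."""
    EXPLICIT_CONSENT = "9(2)(a)"
    SUBSTANTIAL_PUBLIC_INTEREST = "9(2)(g)"
    SCIENTIFIC_RESEARCH = "9(2)(j)"


@dataclass
class ProcessingActivity:
    """Hypothetical description of one processing activity."""
    name: str
    legal_basis: Optional[LegalBasis] = None
    special_categories: bool = False  # e.g. health or genetic data
    art9_condition: Optional[SpecialCategoryCondition] = None


def lawfulness_gaps(activity: ProcessingActivity) -> List[str]:
    """List what is still missing before the activity can be considered lawful."""
    gaps = []
    if activity.legal_basis is None:
        gaps.append("no Article 6 legal basis documented")
    if activity.special_categories and activity.art9_condition is None:
        gaps.append("special categories processed without an Article 9(2) exception")
    return gaps
```

An empty list means only that both layers are documented; it says nothing about whether the chosen basis is actually appropriate, which remains a case-by-case assessment by the controller.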

What are the specific rules applicable to the processing of health and genetic data for scientific research purposes? The purpose of scientific research is one of the exceptions provided for by the GDPR (Art.9(2)j) allowing the processing of sensitive personal data in accordance with specific provisions: the processing must be subject to appropriate safeguards for the rights and freedoms of the data subjects such as the implementation of technical and organisational measures, in particular to ensure compliance with the principle of data minimisation (Article 89(1) GDPR). In particular, the technique of pseudonymisation is explicitly referred to in the GDPR.
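As an illustration of what pseudonymisation combined with data minimisation can look like in practice, the sketch below replaces a direct identifier with a keyed hash (HMAC-SHA256) and keeps only the fields a study needs. The GDPR does not prescribe this or any other particular technique; the function names, field names and key are hypothetical, and the key would have to be stored separately from the data set.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a direct identifier using HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymise_record(record: dict, id_field: str, secret_key: bytes,
                        keep_fields: set) -> dict:
    """Drop every field not needed for the study and replace the identifier."""
    # Data minimisation: only the explicitly listed fields are kept.
    out = {k: v for k, v in record.items() if k in keep_fields}
    # Pseudonymisation: the same identifier and key always give the same
    # pseudonym, so records of one person can still be linked.
    out["pseudonym"] = pseudonymise(str(record[id_field]), secret_key)
    return out


if __name__ == "__main__":
    key = b"example-key-kept-apart-from-the-data"  # hypothetical key
    patient = {"patient_id": "P-0042", "name": "Jane Doe",
               "age": 54, "diagnosis": "example"}
    print(pseudonymise_record(patient, "patient_id", key, {"age", "diagnosis"}))
```

Whoever holds the key can recompute the pseudonym from a known identifier, so the output remains personal data under the GDPR; it is pseudonymised, not anonymised.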

These required additional measures are justified by the fact that the collection and processing for research purposes benefit from certain exemptions which mainly concern the rights of the data subjects (exemptions to right of access, right of rectification, right to restriction of processing, right to object), insofar as these rights make it impossible or seriously hinder the achievement of the purpose of the research (Article 89(2) GDPR).

Also, as explained previously, some flexibility is provided for as to the delimitation of the research purpose. Indeed, Recital 33 of the GDPR provides that it is not always possible to fully identify the purpose of processing personal data for scientific research purposes at the time of collection of the data. Thus, a flexible understanding of this purpose is allowed at the time of the collection of the data subjects' consent: individuals may give their consent for certain areas of research or certain parts of the research project in compliance with the ethical standards of scientific research. Nevertheless, the description of the research aim must remain precise.

In addition, the GDPR provides for the obligation to carry out a Data Protection Impact Assessment (DPIA) for any processing likely to result in a high risk to the rights and freedoms of individuals (Article 35 GDPR). As genetic and health data are considered to be special data due to their sensitivity, their processing will require this impact assessment to be carried out. This impact assessment ensures that the processing will be compliant with the GDPR and will respect the rights of the individuals concerned.

How to carry out a Data Protection Impact Assessment? The Data Protection Impact Assessment (DPIA) must be carried out before the processing operation is set up and must be reviewed during the course of the processing operation, particularly if major changes occur in the way the data are processed. The participants in the conduct of the DPIA are the controller, the Data Protection Officer, any processor(s), IT staff and the data subjects of the processing operation. The DPIA consists of three parts:

  1. A detailed description of the processing implemented, including the technical and operational aspects of the processing;
  2. An assessment of compliance with the fundamental principles of data protection, namely: an examination of the necessity of the processing and compliance with the proportionality principle (the data collected and processed are strictly necessary for the purpose of the processing) as well as a description of the measures put in place to guarantee the rights of the data subjects.
  3. A more technical study of the risks to data security (confidentiality, integrity and availability) and their potential impact on privacy. This study must be completed by a description of the technical and organisational measures envisaged to deal with these risks and protect the data.

The CNIL (French data protection supervisory authority) has developed a pedagogical tool for carrying out this DPIA, available in English.

For more information: Article 29 Data Protection Working Party, Guidelines on Data Protection Impact Assessment (DPIA) and determining whether processing is "likely to result in a high risk" for the purposes of Regulation 2016/679.

What are the rules on informing data subjects about the processing of their personal data? The data controller must provide information to data subjects on the envisaged processing, in accordance with the principle of transparency. The GDPR distinguishes two situations in this respect:

  • Where data are collected directly from the data subject (Article 13 GDPR),
  • Where the data have not been obtained from the data subject (Article 14 GDPR).

This information should include in particular:

  • the identity and contact details of the controller,
  • the purposes of the processing and the legal basis for the processing,
  • the recipients of the personal data if applicable,
  • information on the individual rights attached to this data processing (Articles 15 to 21 GDPR) and on the procedure to be followed to exercise these rights,
  • the intention, if applicable, to carry out a transfer of data to a third country,
  • if applicable the intention to carry out further processing of the data for another purpose.

For a complete reading of the requirements related to individual information, please refer to articles 13 or 14 of the GDPR depending on the applicable situation.

As regards the situation where the personal data used have not been obtained from the data subject, Article 14(5) GDPR provides for exceptions to this information obligation, such as in cases where:

  • the data subject already has this information,
  • or the provision of such information would require disproportionate efforts, in particular in the case of processing of data for scientific research purposes.

Please refer to Article 14(5) GDPR to see all the exceptions provided for.

If the controller relies on such exceptions, it will have to document and justify their use.

This information to the data subjects should be renewed in case of substantial modification of the processing activities (i.e. concerning the main characteristics of the processing, such as a new purpose, an addition of sensitive data collection, change of controller, etc.) or in case of a specific event related to the data processing (e.g. in case of a data breach).

Is it mandatory to obtain consent from data subjects prior to the collection and processing of their health and genetic data? Consent to the processing of data is not always required. It depends on the legal basis for the processing. If the processing is based on the consent of the data subjects, then the collection of consent prior to the implementation of the processing is mandatory. This consent must comply with the requirements laid down by the GDPR, namely: free, specific (given for one or more purposes), informed (information given to the person about the processing) and unambiguous (given by a clear positive act without ambiguity).

Data subjects may change their mind at any time and withdraw their consent.

In addition to these requirements, in the case of processing of sensitive data (including health and genetic data), the criterion of explicitness is added, i.e. the data subject must expressly declare his/her consent (for example in writing).

The collection of consent must be documented by the data controllers: they must be able to prove that consent meeting these requirements has been obtained.
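One way to keep such proof is a structured consent record per data subject and purpose. The sketch below is a hypothetical illustration (the class `ConsentRecord`, its fields and the helper `covers` are assumptions, not a prescribed format): it stores what was consented to, which version of the information notice was shown, whether the consent was explicit, and when it was given or withdrawn.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentRecord:
    """Hypothetical proof-of-consent entry for one data subject and one purpose."""
    subject_pseudonym: str
    purpose: str                 # consent is specific to a purpose
    information_version: str     # which information notice was given (informed)
    explicit: bool               # expressly declared, e.g. in writing
    given_at: datetime
    withdrawn_at: Optional[datetime] = None

    def withdraw(self, when: Optional[datetime] = None) -> None:
        """Record a withdrawal; data subjects may withdraw at any time."""
        self.withdrawn_at = when or datetime.now(timezone.utc)

    def is_active(self, at: Optional[datetime] = None) -> bool:
        """True if consent had been given and not yet withdrawn at time `at`."""
        at = at or datetime.now(timezone.utc)
        if at < self.given_at:
            return False
        return self.withdrawn_at is None or at < self.withdrawn_at


def covers(record: ConsentRecord, purpose: str, sensitive_data: bool,
           at: Optional[datetime] = None) -> bool:
    """Check whether a consent record covers a given processing operation."""
    if sensitive_data and not record.explicit:
        return False  # health and genetic data call for explicit consent
    return record.purpose == purpose and record.is_active(at)
```

The record keeps the withdrawal date rather than deleting the entry, so the controller can still show that processing carried out before the withdrawal was covered by a valid consent. All timestamps are assumed to be timezone-aware.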

However, in the context of scientific research, often the legal basis of scientific research purposes will be preferred (Art.9(2)j and Art.6(1)e or f) GDPR). Recourse to this legal basis does not require the prior consent of individuals for the processing of their data. However, the obligation to inform must always be respected by allowing data subjects to object to the processing (informed opt-out mechanism and right to object).

Nevertheless, it is necessary to recall that Member States may introduce additional conditions, including limitations, regarding the processing of health and genetic data (Article 9(4) GDPR). Thus, national law may provide that the processing of these categories of data may require consent even if this is not required by the GDPR and regardless of the legal basis chosen for the data processing. The controller will then have to ensure compliance with the national laws in force in the Member States in which the data are collected and/or processed.

What should be considered before choosing consent as the legal basis for processing personal data? All the requirements of the GDPR regarding consent must be respected, namely that the consent is freely given, specific, informed and unambiguous (Article 4(11) GDPR). In addition to these criteria, the explicit nature of the consent (Article 9(2)a GDPR) must be respected in cases of processing of personal data considered as sensitive (such as health and genetic data). Moreover, as recalled by the European Data Protection Board, particular importance should be attached to the condition of "freely given" consent. Indeed, this implies that the person has made a choice and has real control. If there is a clear imbalance between the data subject and the controller, consent should not be a valid legal basis for processing personal data (Recital 43 of the GDPR). These elements are important in the context of medical research, where situations of power imbalance between the sponsor/investigator of the research project and the participants may often exist (children, persons in a situation of institutional or hierarchical dependence, economically or socially disadvantaged categories of persons, etc.). The investigator and/or controller should take all these elements into account when choosing the legal basis for the data processing they intend to carry out.

What is the difference between consent to process data for research and consent to participate in research? It is necessary to distinguish between informed consent to research and consent to data processing for research. Indeed, participation in research is governed by national laws which may require informed consent under specific conditions. At the European level, Article 28 of the EU Clinical Trials Regulation recalls the mandatory nature of informed consent for any participation in a clinical trial. As the European Data Protection Board pointed out, these provisions are primarily a response to the essential ethical requirements relating to the supervision of research projects involving human beings and which derive from the Helsinki Declaration (European Data Protection Board, Opinion 3/2019 on questions and answers on the interaction between the Clinical Trials Regulation and the General Data Protection Regulation (GDPR) (Article 70(1)(b)) Adopted on 23 January 2019). Informed consent implies the provision of exhaustive, complete and intelligible information on the planned research. In this respect, Article 29 of the Clinical Trials Regulation describes the list of elements that must be provided to the individuals concerned in order for the information to be valid.

Furthermore, it is necessary to recall that informed consent to research is an ethical requirement under the Oviedo Convention, the Declaration of Taipei and the Declaration of Helsinki.

Thus, this informed consent to participate in research is to be distinguished from the consent to the processing of personal data provided for by the GDPR, which can be used as a legal basis for processing and which must meet several criteria (free, informed, unambiguous, specific and explicit for the processing of sensitive data such as health and genetic data). The European Data Protection Board has reiterated this distinction between informed consent under the Clinical Trials Regulation and the notion of consent as a legal basis for processing personal data under the GDPR, in its Opinion 3/2019 on questions and answers on the interaction between the Clinical Trials Regulation and the General Data Protection Regulation.

Beyond the clinical trials situation, this informed consent has been described by the European Data Protection Board as a potential "appropriate safeguard" (provided for in Article 89(1) of the GDPR) to be put in place to safeguard the rights and freedoms of individuals in the context of data processing for scientific research purposes. Thus, even if the legal basis chosen is that of scientific research purposes and there is no legal obligation to obtain consent from data subjects, an ethical approach would require it. This position is also defended by the European project CINECA (Common Infrastructure for National Cohorts in Europe, Canada, and Africa).

More information on the difference between informed consent to participate in a clinical trial and consent as a legal basis for processing personal data under the GDPR: European Data Protection Board, Opinion 3/2019 on questions and answers on the interaction between the Clinical Trials Regulation and the General Data Protection Regulation, 23 January 2019.

Guide on Consent Policy: The Global Alliance for Genomics and Health (GA4GH) Consent Policy aims to guide the international sharing of genomic and health-related data in a way that respects autonomous decision-making while promoting the common good of international data sharing.

How to ensure the protection of the rights of data subjects to the processing of their personal data under the GDPR? The GDPR recognises several rights of data subjects to the processing of their personal data. The controller must take appropriate measures to provide the information referred to in Articles 13 and 14 of the GDPR to data subjects in order to ensure transparent processing of their personal data. This information must explain how to exercise the data subject's rights in order to guarantee their effectiveness. This information must be concise, transparent, comprehensible and easily accessible in simple and clear terms. In addition, the information must be adapted to the target audience, such as children.

These rights are listed in Chapter III of the GDPR and include:

  • the right to information: respect of the principle of transparency (Articles 13 and 14),
  • the right to access his or her data (Article 15),
  • the right to rectify personal data that are inaccurate (Article 16),
  • the right to erasure or otherwise known as the right to be forgotten on certain grounds (Article 17),
  • the right to restrict processing in certain specified situations (Article 18),
  • the right to data portability (Article 20),
  • the right to object: at any time to the processing of data (Article 21).

Of note, the GDPR provides for exceptions to some of these rights where data are processed for scientific research purposes. Notably, the right to information, the right to erasure and the right to data portability may not be applicable (if the exercise of these rights is likely to make it impossible or seriously compromise the achievement of the objectives of the research). These exceptions must be justified and documented.

What are the rules to be respected when an artificial intelligence system is used in the context of health data processing? The use of these systems in the context of personal data processing is subject to the rules of the GDPR, namely:

  • Define a purpose for the processing.
  • Respect the principle of transparency: this means informing data subjects of the use of artificial intelligence in the processing of their personal data, and also ensuring the explainability of the AI, i.e. individuals should be able to understand the results and conclusions produced by the algorithm.
  • Respect the principle of data minimisation: use only the data necessary for the training and operation of the AI system in relation to the purpose of the processing.
  • Ensure the exercise of individuals' rights with regard to the processing of their personal data: First of all, it is important to note that Article 22 of the GDPR gives data subjects the right not to be subject to a decision based solely on automated processing which produces legal effects concerning them or similarly significantly affects them. Furthermore, the rights of individuals in relation to the protection of personal data apply throughout the life cycle of the AI system using the personal data of the data subjects. The exercise of these rights concerns the data contained in the databases used for training the system, the data processed by the system, and the data produced by the system. If AI is used for scientific research purposes, the exceptions to the individual rights provided for in the GDPR may also apply, provided that they are justified and documented.

Thus, the controller must be aware of all these requirements, which must be respected from the design stage (privacy by design, ethics by design). Furthermore, it is important that these systems are supervised and monitored on an ongoing basis, particularly in the case of machine learning, due to the highly evolving nature of these systems. This reflects the idea of human oversight, which is part of the European Commission's guidelines for trustworthy AI.

On the re-use/secondary use/further processing of personal data

Is it possible to reuse data for a different purpose than the one for which it was collected? Often referred to as "further processing" or "reuse of data", these practices consist of processing data for a purpose other than that for which it was initially collected. Health research today is largely based on the reuse of health data.

The GDPR does not explicitly address this issue but provides for the possibility of further processing of data for compatible purposes (Article 5(1)b). In order to verify whether the envisaged processing is compatible with the original purposes of processing, the controller must carry out a compatibility test if the further processing of the data is not based on the consent of the individual or on EU law.

Is it necessary to have a separate legal basis for further processing of health data? Recital 50 of the GDPR provides that if the purposes of the initial collection and further processing are compatible, a new legal basis separate from the one on which the data were collected is not necessary.

What is the compatibility test? The test consists of a case-by-case analysis of the context of the initial data collection and processing to ensure the compatibility of the further processing of the data, taking into account the legitimate expectations of the data subjects.

In order to carry out this test, the controller must take into account (Article 6(4) GDPR):

  • whether there is a link between the original and intended purposes,
  • the context in which the data were collected and the relationship between the data subjects and the controller,
  • the nature of the data and in particular whether the data belong to the special categories of data (thus including health data and genetic data),
  • the possible consequences of further processing for the data subjects,
  • and the existence of appropriate safeguards for the preservation of the rights and freedoms of the individuals concerned, including for example pseudonymisation.
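Because the compatibility test is a case-by-case analysis that the controller should be able to justify, it can help to record a written finding for each of the five factors listed above. The following is a hypothetical sketch of such a record (the class and field names are illustrative and not taken from the GDPR); it only checks that every factor has been documented, not whether the conclusion is correct.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class CompatibilityAssessment:
    """One free-text finding per factor of Article 6(4) GDPR."""
    link_between_purposes: str = ""
    collection_context: str = ""         # includes the subject/controller relationship
    nature_of_data: str = ""             # e.g. special categories such as health data
    consequences_for_subjects: str = ""
    safeguards: str = ""                 # e.g. pseudonymisation


def undocumented_factors(assessment: CompatibilityAssessment) -> List[str]:
    """Return the names of the factors that have no written finding yet."""
    return [f.name for f in fields(assessment)
            if not getattr(assessment, f.name).strip()]
```

For example, an assessment created with only `safeguards="pseudonymisation before sharing"` would report the four other factors as still undocumented.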

What is the framework for the further use/re-use of data for scientific research? As explained above, scientific research today relies heavily on the re-use of data. Thus, specific provisions are foreseen for scientific research: further processing of data for scientific research purposes is not considered to be incompatible with the initial purposes (presumption of compatibility). That is, the processing will be considered a priori compatible with the initial purposes of the processing provided that appropriate safeguards for the rights and freedoms of data subjects are put in place (implementation of technical and organisational measures to respect the principle of data minimisation, including pseudonymisation; Article 89(1) GDPR). However, as the European Data Protection Supervisor recalls, this presumption is not a general authorisation for further use of data for all cases of research purposes; each case must be considered according to its context.

Data subjects will have to be informed of this further processing before it is carried out unless one of the exceptions to the right of information under Article 14(5)b GDPR applies.

The rights of the data subjects shall be guaranteed unless one of the exceptions provided for in Article 89(2) of the GDPR is applicable.

What specific rules apply to the secondary use of clinical trial data outside the clinical trial protocol for scientific purposes? The Clinical Trials Regulation specifically addresses this issue in Article 28(2). This article focuses in particular on consent. The situations covered are those where the sponsor wishes to process the data of a clinical trial participant outside the planned protocol but only and exclusively for scientific purposes. According to this article, the sponsor must seek consent for this specific purpose of processing (secondary use outside the protocol) from the data subject or his/her legal representative at the time when informed consent to participate in the clinical trial is sought.

However, as explained above, consent under Article 28(2) of the Clinical Trials Regulation must be distinguished from consent as a legal basis for the processing of personal data as provided for by the GDPR. Thus, the sponsor or investigator wishing to subsequently use personal data collected for scientific purposes different from those foreseen by the clinical trial protocol, will have to establish a legal basis which may not be consent (as understood under the GDPR). Moreover, as explained above, the presumption of compatibility provided for in Article 5(1)b. of the GDPR may apply, subject to compliance with the conditions laid down in Article 89 of the GDPR (appropriate safeguards for the rights and freedoms of the data subjects).

In any case, the rules on the processing of personal data as set out in the GDPR shall apply.

On monitoring compliance with the legal framework for the protection of personal data

Who is responsible for the compliance of the collection and processing of personal data? The GDPR has been designed according to a logic of accountability of the actors (articles 5 and 24 GDPR), i.e. apart from the procedures put in place in each country concerning the use or reuse of health data (for example, opinion of an ethical and scientific committee, specific authorisation of a competent authority in the matter), the data controller is obliged to implement data protection measures, which he/she will update if necessary, and must be able to prove the compliance of the data processing he/she has implemented with the applicable regulatory framework. This compliance work can be done in relation to the data protection officer (Article 37 GDPR). A data protection officer will have to be appointed by the controller and the processor when the activities implemented by the latter consist of large-scale processing of special categories of data, including health data and genetic data.

This documentation should include:

  • The record of the processing activities carried out,
  • the DPIAs of processing activities likely to result in high risks to the rights and freedoms of individuals,
  • the supervision of the data transfers carried out,
  • the information given to individuals on the use of their data (and, where appropriate, a description of the reasons for not informing individuals),
  • consent forms if applicable,
  • the measures in place to guarantee the rights of individuals,
  • and contracts with processors (subcontractors), if applicable.

What should the record of processing activities contain? The controller(s) must keep a record of the processing activities carried out under their responsibility in written form. Article 30(1) of the GDPR details all the information that must be included in the record in order to be able to attest to the compliance of the processing activities. For example, the contact details of the controller, joint controllers and the data protection officer, if applicable, should be described; the purposes of the processing operation; a description of the categories of data subjects and personal data, etc. In practice, it is the DPO who keeps and updates the record of processing activities.
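To give an idea of what an entry in such a record might look like, here is a minimal sketch with one field per item of Article 30(1) GDPR that a controller typically fills in. The class name, field names and JSON export are assumptions made for illustration; the GDPR requires the record to be in writing (including electronic form) but does not impose a file format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ProcessingRecord:
    """Hypothetical entry in a controller's record of processing activities."""
    controller_contact: str
    purposes: List[str]
    data_subject_categories: List[str]
    personal_data_categories: List[str]
    dpo_contact: str = ""                                   # if a DPO is appointed
    recipient_categories: List[str] = field(default_factory=list)
    third_country_transfers: List[str] = field(default_factory=list)
    erasure_time_limits: str = ""                           # where possible
    security_measures: str = ""                             # general description


def to_json(record: ProcessingRecord) -> str:
    """Serialise the entry so it can be handed to a supervisory authority on request."""
    return json.dumps(asdict(record), indent=2, ensure_ascii=False)


if __name__ == "__main__":
    entry = ProcessingRecord(
        controller_contact="Example Research Institute, dpo@example.org",
        purposes=["scientific research on gene therapy outcomes"],
        data_subject_categories=["study participants"],
        personal_data_categories=["health data", "genetic data"],
        security_measures="pseudonymisation, encryption at rest",
    )
    print(to_json(entry))
```

Keeping the entries in a structured form makes it easier to update them when a processing activity changes.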

Each processor must likewise keep a record of all categories of processing activities carried out on behalf of a controller. Article 30(2) of the GDPR details the information that must be included.

The supervisory authorities may ask the DPO, controller and/or processor to make this record available in order to verify the compliance of the processing activities carried out.

What are the personal data security rules to be respected? Article 32 of the GDPR makes the controller and the processor responsible for implementing appropriate technical and organisational measures to ensure a level of security appropriate to the risk. These measures include:

  • pseudonymisation and encryption of personal data,
  • the means to guarantee the confidentiality of the data,
  • the means to restore access to data in the event of a physical or technical incident,
  • a procedure to regularly assess the effectiveness of the security measures put in place to ensure the security of the processing.
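
To make the first measure more concrete, the sketch below shows one possible way of pseudonymising a direct identifier with a keyed hash (HMAC-SHA256) from the Python standard library. This is an assumption chosen for illustration: the GDPR does not prescribe any particular technique, and the function names `generate_key` and `pseudonymise` are hypothetical. Pseudonymised data remain personal data, and the secret key is the "additional information" that must be kept separately and protected.

```python
import hashlib
import hmac
import secrets


def generate_key() -> bytes:
    """Create a random 256-bit secret key.

    The key must be stored separately from the pseudonymised dataset.
    """
    return secrets.token_bytes(32)


def pseudonymise(identifier: str, key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    The same identifier and key always give the same pseudonym, so records
    belonging to one person can still be linked within the dataset.
    """
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Usage with a fictitious patient identifier.
key = generate_key()
pseudonym = pseudonymise("patient-000123", key)
print(len(pseudonym))  # hexadecimal string of 64 characters
```

A keyed hash is used here rather than a plain hash so that someone without the key cannot recompute pseudonyms from a list of known identifiers. Whether this is sufficient in a given project depends on the risk assessment described in this section.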

To determine which security measures to put in place, the controller and the processor must take into account the risks that the processing poses to the rights and freedoms of data subjects, in particular the accidental or unlawful destruction, loss or alteration of personal data, or the unauthorised disclosure of, or access to, such data.

An approved code of conduct (Article 40 GDPR) or an approved certification mechanism (Article 42 GDPR) can be used as an element to demonstrate compliance with these security requirements.

Codes of conduct can also be used as a tool to provide appropriate safeguards for data transfers to third countries or international organisations. See European Data Protection Board Guidelines 04/2021 on Codes of Conduct as tools for transfers.

European Union Legislation

Charter of Fundamental Rights of the European Union, OJ C 326, 26.10.2012, p. 391-407

Regulation (EU) No 536/2014 of the European Parliament and of the Council of 16 April 2014 on clinical trials on medicinal products for human use, and repealing Directive 2001/20/EC (Text with EEA relevance), OJ L 158, 27.5.2014, p. 1-76, CELEX number: 32014R0536

Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance), OJ L 119, 4.5.2016, p. 1-88, CELEX number: 32016R0679

Proposal for a Regulation of the European Parliament and of the Council on the European Health Data Space, 3 May 2022, COM/2022/197 final, CELEX number: 52022PC0197

  •  Original text (available in the 24 official languages of the EU)

Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union Legislative Acts, 21 April 2021, COM/2021/206 final, CELEX number: 52021PC0206

Regulation (EU) 2022/868 of the European Parliament and of the Council of 30 May 2022 on European data governance and amending Regulation (EU) 2018/1724 (Data Governance Act) (Text with EEA relevance), PE/85/2021/REV/1, OJ L 152, 3.6.2022, p. 1-44, CELEX number: 32022R0868

European Union Guidance documents

On the concepts of controller and processor in the GDPR

On transparency under GDPR

On secondary use of health data

On the interplay between the Clinical Trials Regulation and the GDPR

On artificial intelligence systems

On the Data Protection Impact Assessment (DPIA)

On personal data breach notification under GDPR

Acknowledgements

Published: 22/03/2023

Authors: 

Lisa Feriol, PhD student, CERPOP UMR1295 (Inserm and University of Toulouse Paul Sabatier), Ekitia (Not-for-profit data sharing organisation)

and Gauthier Chassang, lawyer, CERPOP UMR1295 (Inserm and University of Toulouse Paul Sabatier), Genotoul Societal Platform, Toulouse (GIS Genotoul)

Reviewed by Aurélie Mahalatchimy, EuroGCT WP4 Convenor