The development of gene and cell therapy necessarily involves the processing of data, whether personal or non-personal. In this regard, please refer to ‘Data Classification’.
Personal data processing activities should be undertaken with due regard to data protection principles.
Among other principles encompassed in the GDPR and in other legal instruments referred to below, such as UNESCO’s International Declaration on Human Genetic Data, the following illustrate some of the main principles which apply to the processing of personal data:
These and other relevant principles ought to be read in light of the ever-changing technological and scientific landscape, which entails the need to constantly reinterpret them, as well as to reshape and update data protection rules in accordance with new risks and threats.
Moreover, health data and human genetic data, which are likely to be used in the development of gene and cell therapy, are considered a special category of personal data (also commonly referred to as “sensitive data”) which is specifically regulated by the law in light of its unique and sensitive nature.
This is particularly relevant for human genetic data which are processed not only for the development of gene therapy, but also for a myriad of other purposes, for instance for assuring effective and efficient screening in gene and cell therapy.
Human genetic data are of a very special and sensitive nature due to the quantity and quality of information (and the conclusions drawn from it) which may be obtained through their processing. For illustration, the processing of human genetic data may predict genetic predispositions and allow very accurate health (and non-health) related conclusions about individuals, which may have a substantial impact not only on the person concerned but also on the family, extending over generations, and on groups to which the person concerned belongs. Not infrequently, the processing of genetic data involves uses, analyses and findings whose significance may not be known at the time of processing.
Thus, human genetic data has a particularly singular status, as it entails higher thresholds of protection as well as a more sophisticated consideration of its corresponding risks and threats.
Considering the scientific, medical and technical capabilities developed over time, especially during the 21st century, the processing of human genetic data has become an increasingly relevant concern from a societal, philosophical and legal perspective alike. In this context, the scientific and legal communities have been investigating an extensive list of topics related to the risks and threats brought about by the processing of human genetic data, such as the legal protection of human genetic data, the safeguarding of genetic privacy and the prevention of genetic discrimination.
Among other national, regional, and international legal instruments, the EU’s General Data Protection Regulation (GDPR) established provisions concerning genetic data and the risks associated with their processing. As the most relevant EU legal instrument for the enhancement, effectiveness and harmonization of personal data protection principles, the GDPR has shaped and influenced ethical and legal standards related to the processing of genetic data. As an illustration of the prominence given to human genetic data within the scope of the data protection principles enshrined in the GDPR, it is important to recall that the definition of personal data itself includes a specific reference to genetic factors as being one of the identifiers of a natural person.
The GDPR imposes a general prohibition on the processing of “personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person's sex life or sexual orientation” (i.e., special categories of personal data).
Therefore, pursuant to the GDPR the processing of genetic data, i.e., “personal data relating to the inherited or acquired genetic characteristics of a natural person which give unique information about the physiology or the health of that natural person and which result, in particular, from an analysis of a biological sample from the natural person in question”, is generally prohibited.
Despite this general prohibition, the GDPR sets a list of exceptions (functioning as derogations from the main rule) determining the scenarios in which it is lawful to process special categories of data, including genetic data. According to Article 9(2), the processing of genetic data may be authorized for specific purposes and in accordance with specific conditions.
Among others, the processing of genetic data may be lawful when (i) the data subject has given explicit consent to the processing; (ii) the processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices; or (iii) the processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) of the GDPR.
It is worth noting that where the processing of genetic data is lawful such processing should be undertaken on the basis of Union or Member State law, which provides for suitable and specific measures to safeguard the rights and freedoms of the data subject and in accordance with the general data protection principles.
The data protection principles set by the GDPR further build on the existing foundational groundwork, establishing tangible rules and thresholds of protection on which personal data legislation and processing activities should take place.
The importance of adopting thorough data protection principles and safeguards in the context of human genetic data processing has been further stressed in several binding and non-binding international and European instruments and guidelines, such as:
 Including the corresponding amendments and the Additional protocol to Convention 108 regarding supervisory authorities and transborder data flows (ETS No. 181).
Human genetic data may be processed and used in a wide variety of circumstances and contexts. Personal data relating to the inherited or acquired genetic characteristics of a natural person may result from the analysis of biological samples, such as chromosomal, deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) analysis, or from the analysis of other elements enabling equivalent information to be obtained. For instance, it has been recently suggested that it is possible to diagnose genetic disorders through the processing of facial biometric data using technologies such as computer imaging and deep learning algorithms, which demonstrates that genetic information can be collected from non-genetic data.
Thus, human genetic data may be processed in a wide range of industries, such as the pharmaceutical and health industries. Concerning the latter, genetic data may be used for the research and treatment of many diseases, such as cancer, neurological and psychiatric diseases, as well as cardiovascular diseases. The use of genomic technologies in healthcare increases our biological understanding and enables scientists, researchers, doctors, and healthcare professionals around the world to establish connections between molecular and physiological factors and specific clinical symptoms of the disease.
With the advent of the big data economy, the massive and automated processing of genetic data has become a common practice. Artificial intelligence is used to analyse and interpret genomic data at a large scale for clinical, pharmaceutical and biomedical purposes, such as medical diagnostics, risk prediction, precision medicine and drug discovery and development.
The processing of genetic data is not constrained to the healthcare and pharmaceutical industries but may also be relevant for other purposes and contexts, for example in fundamental research, anthropology, conservation biology, forensics and criminal justice/law enforcement.
Furthermore, human genetic data has become a business. In 2019, MIT (Massachusetts Institute of Technology) Technology Review estimated that more than 26 million consumers had added their DNA to four leading commercial ancestry and health databases. In this context, genetic data may be collected through genetic testing kits in direct-to-consumer genetic testing services provided by companies which give clients details about their ancestry and an unprecedented level of access to their personal genome. Additionally, genetic data may also be used for novel marketing strategies and practices, such as the offering of DNA-based discounts or DNA-centric services.
The broad range of activities and purposes which make use of genetic data or entail its processing makes it very challenging to pinpoint the exact stakeholders which are of relevance in this context. However, the list below aims to summarize the major stakeholders involved in genetic data processing activities:
Patients, consumers and, in general, the natural persons whom the genetic data concern;
Healthcare providers (including doctors, clinics and hospitals);
Labs and genetic testing providers;
Scientists, researchers and academia (such as universities and scientific research organizations);
Direct-to-consumer private companies and businesses (such as genetic testing kit services, ancestry and genetic information services and, in general, other companies processing genetic data for DNA/genetic-based services and commercial applications not foreseen above);
Governments and judicial authorities (such as governmental agencies, law enforcement agencies and courts);
IT and technological providers (such as cloud and database service providers and AI developers).
Generally speaking, entities processing genetic data would do so in the role of data controllers or data processors, in which case specific obligations contained in the GDPR would apply and these entities would be responsible for ensuring that the main data protection principles are respected.
At the same time, data protection authorities – as the stakeholders responsible for monitoring and ensuring compliance with the applicable data protection principles and obligations – would be responsible for verifying that the main data protection principles are respected by all stakeholders.
Self-determination: As a general principle, self-determination was coined in the context of international law and corresponds to the legal right of people to decide their own destiny in the international order. The core principle of self-determination arises from customary international law and is recognized in various international treaties, such as the United Nations Charter and the International Covenant on Civil and Political Rights.
The self-determination principle has an external and an internal dimension. Internal self-determination relates to the right of people to govern themselves without outside interference, which underpins a relatively personal dimension. External self-determination relates to the right of people to determine their own political status and to be free of foreign interference or alien domination, including the formation of their own independent state, which underpins a political and social dimension.
In the context of human genetic data processing, the internal dimension of the self-determination principle would be expressed as a right to informational self-determination. This principle was first coined by the Federal Constitutional Court of Germany in its landmark decision on the Federal Census Act of 1983, commonly referred to as "Census Verdict".
As the right to decide independently and freely what happens to a person’s personal data, informational self-determination protects personality and privacy, constituting one of the basic ideas underlying the fundamental right to the protection of personal data, as established by Art. 8 of the Charter of Fundamental Rights of the European Union.
Despite the relevance of self-determination and individual control over data in the GDPR, the fact is that personal data, including genetic data, can be processed without the data subject’s consent, provided that certain conditions are met and the remaining data protection principles are respected. Consent is one of the legal grounds provided for in Articles 6(1)(a) and 9(2)(a) of the GDPR, but both provisions include other possible legal grounds that do not require consent.
Lawfulness, fairness and transparency – Article 5(1)(a) GDPR: Personal data must be processed lawfully, fairly and in a transparent manner in relation to the data subject. Although there is some overlap between the three elements of lawfulness, fairness and transparency, stakeholders should satisfy all three. According to this principle, it would not suffice to prove that data processing is lawful if it is, at the same time, unfair or non-transparent.
Lawfulness: Personal data must be processed in accordance with a specific and previously identified lawful basis. The legal grounds on which personal data may be processed are identified in articles 6 (for all personal data) and 9 (an additional legal ground for sensitive data only) of the GDPR. Where there is no lawful basis for the data processing activity, data processing is unlawful and in breach of this principle.
The lawfulness principle may also be interpreted more broadly, meaning, for instance, that the data controller may not process personal data in a manner which breaches other legal obligations, whether of a criminal or civil nature. For illustration, data processing activities may be considered unlawful if they entail a breach of industry-specific legislation or regulations.
The GDPR further includes detailed provisions on lawfulness in its articles 6 to 10.
Fairness: Processing of personal data must be fair. If there are aspects of a personal data processing activity which are unfair, this principle would be breached regardless of whether there is a lawful basis for the processing.
Fairness in data processing means that the data controller should handle personal data in a manner that data subjects may reasonably expect. Additionally, it also means that personal data should not be used in ways which result in unjustified or disproportionate consequences or adverse effects for the data subject. Thus, the fairness principle is closely intertwined with the concept of proportionality and with the expectations reasonably held by the data subject in relation to his/her data.
Fairness in data processing is also connected to the particular method used for its collection. For illustration, if the data subject is deceived or misled during the personal data collection operation, chances are that such data processing activity will be unfair.
In order to assess whether a personal data processing operation is fair, attention should be paid to how it affects the interests and fundamental rights of the data subjects concerned. However, it should be noted that personal data may sometimes be used in a way that negatively affects a data subject without necessarily being unfair.
Transparency: The transparency principle is fundamentally intertwined with the fairness principle. Transparency means that data processing activities must be undertaken in a clear, open and honest manner and that the data subject is given thorough but tangible information (in clear and plain language, as well as in an easily accessible manner) concerning all aspects related to the data processing activities. Such aspects include the identity and characteristics of the data controller and processor (if applicable), the reasons for and manner in which the data processing activities will take place, the purposes for which personal data is being used, and whether any personal data transfers will take place, amongst other relevant information.
Such information is vital for a correct and fair assessment by the data subject in relation to the data processing activities, which is fundamental for his/her choice on whether to proceed with the activities entailing such data processing.
Pursuant to the GDPR, the transparency of data processing operations demands that data subjects must be informed in accordance with articles 12, 13 and 14.
Purpose limitation – Article 5(1)(b) GDPR: The purpose limitation principle requires that personal data is collected and processed for specified, explicit and legitimate purposes and that such data should not be further processed in a manner which is incompatible with those previously defined purposes (essentially, the purpose must be defined before the time of collection). It also means that the data controller should clearly define and state what that purpose is, as well as that the personal data collected should correspond to personal data which are necessary for fulfilling that purpose.
There are, however, some exceptions to the purpose limitation principle:
When the data subject consents (Recital 50 and Art. 6(4) GDPR);
When the secondary purpose is authorised by European or national law (Recital 50 and Art. 6(4) GDPR);
When the secondary purpose meets the requirements of the so-called research exception (Arts 5(1)(b) and 89(1) GDPR);
When, according to the second part of Article 5(1)(b), the secondary purpose is not incompatible with the original one, which is assessed under the so-called “compatibility test” (Recital 50 and Article 6(4) GDPR).
Data minimisation – Article 5(1)(c) GDPR: Data minimisation means that data processing operations should solely entail the collection and processing of data which are adequate, relevant and limited to what is necessary to achieve the corresponding purpose. This means that the quantity and quality of data should be kept to the minimum necessary in light of the objectives pursued.
Accuracy – Article 5(1)(d) GDPR: Personal data should be correct and, where necessary, kept up to date. This means that data controllers are responsible for taking reasonable steps to ensure the erasure or rectification of personal data which are inaccurate or untrue, having regard to the purposes for which they are processed. Data subjects have the right to request that inaccurate or incomplete data be erased or rectified, and the data controller must respond without undue delay and in any event within one month of receiving the request (Article 12(3) GDPR).
Storage duration limitation – Article 5(1)(e) GDPR: Personal data should be kept in a form which permits the identification of data subjects for no longer than is necessary for the purposes for which the personal data are processed. Personal data may not be stored or kept for periods longer than necessary. In order to ensure that personal data are not kept longer than necessary, time limits and data retention policies should be established by the data controller for erasure or periodic review.
Integrity and confidentiality – Article 5(1)(f) GDPR: Personal data should be processed in a manner that ensures its appropriate security, including protection against unauthorised or unlawful processing and against accidental loss, destruction or damage. The data controller must ensure that it has put in place appropriate technical or organisational security measures to protect personal data.
FAIR (findable, accessible, interoperable, reusable): This principle/acronym has been suggested by the scientific community as a relevant principle in the domain of personal data, but it is not recognised in the GDPR, and may be in contradiction with its spirit and content. It is in the future European Health Data Space (EHDS) that the FAIR principle finds suitable recognition. Given the importance and complexity of human genetic data and of DNA-based information, it is of paramount importance to ensure that human genetic data are kept to the highest data science standards. This is particularly important in situations where such data are processed by digital, automated or electronic means.
Provided that all security measures are in place (for instance, authentication, encryption, pseudonymization, among others) it is important to highlight the advantages of keeping human genetic data findable, accessible, interoperable and reusable in order to assure its smooth, efficient and secure management and processing.
The FAIR principles emphasise the capacity of computational systems to find, access, interoperate, and reuse data (i.e., machine-actionability), as well as underpin the right of the data subject to access and receive personal data in a structured, commonly used, and machine-readable format and the right to transfer those data to another party without hindrance.
Since the handling of human genetic data is increasingly reliant on computational networks and entails ever greater data volume, complexity and speed, the need for data integration and interoperability with applications and workflows for analysis, storage, and processing may be met by upholding the FAIR principles together with the general data protection obligations established by the GDPR, especially in matters of data security, availability, confidentiality and integrity.
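As an illustration of one of the security measures mentioned above, the following is a minimal sketch of pseudonymisation by keyed hashing, under stated assumptions. The record layout, the identifier `P-0001`, the key and the function name `pseudonymise` are all hypothetical and are not taken from the GDPR or from any specific system. It should also be borne in mind that pseudonymised data remain personal data under the GDPR, and that genetic data may identify a person even once direct identifiers have been replaced.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a keyed-hash (HMAC-SHA256) pseudonym for a direct identifier.

    The same identifier and key always give the same pseudonym, so records
    can still be linked, but the identifier cannot be recomputed without
    the key, which should be stored separately from the dataset.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical record and key, for illustration only.
record = {"patient_id": "P-0001", "finding": "variant-A"}
key = b"demo-key-kept-separately-from-the-dataset"

# Replace the direct identifier; the remaining fields are left untouched.
pseudonymised = {**record, "patient_id": pseudonymise(record["patient_id"], key)}
```

A keyed hash is used here rather than a plain hash so that a party without the key cannot rebuild the mapping by hashing candidate identifiers; other techniques (for instance, a separately stored lookup table) could serve the same purpose.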
Genetic data is unique and distinguishes an individual from other individuals;
Genetic data may also reveal information about the individual’s blood relatives (biological family), including those in succeeding and preceding generations, and carry legal implications for them;
Genetic data can characterise a group of individuals (e.g., ethnic communities);
Genetic information is often unknown to the individual and does not depend on the bearer's individual will since genetic data are in principle non-modifiable (it is important to recall, however, that contemporary scientific developments enable the modification of genetic data through, e.g., gene editing technologies such as CRISPR);
Genetic data can be easily obtained or extracted from raw material although this data may at times be of dubious quality;
Considering the developments in research, genetic data may reveal more information in the future and be used by an ever-increasing number of agencies for various purposes.
The above and other characteristics illustrate the very particular nature of human genetic data, which brings about substantial challenges in the application of data protection legislation and principles. Among other challenges, the processing of human genetic data presupposes a conflict between two essential interests which should be balanced. On the one hand, the rights and interests of data subjects whose genetic data are processed, as well as a general interest in protecting privacy and, at the same time, controlling the use of human genetic data and associated technologies so as to safeguard its lawfulness, fairness and transparency. On the other hand, interests related to the role of human genetic data for the advance of science, for medical research and other purposes which are instrumental to the progress, development and well-being of our society, notably in matters related to biology and biotechnology, health and security.
The collective dimension of human genetic data
Due to common biological genetic characteristics, the processing of genetic data may assess health risks or determine biological relationships pertaining not only to the concerned data subject but also to others, including extended family. Thus, data processing operations may not only affect the right to privacy of one individual but may also entail consequences to the privacy of a group of individuals.
This raises very serious and complex legal issues related to data confidentiality, such as the right of access to genetic information by the biological family of the data subject. In this context, it is challenging to strike a balance between the data subject’s right not to disclose (or to disclose) his/her genetic information and the potential implications and consequences of its disclosure (or non-disclosure) for the members of the family. In essence, there are two major, interrelated concerns: i) do genetic data belong exclusively to the specific individual from whom these data are collected or, in contrast, to a group of individuals? ii) do family members have the right to access such data in the absence of the concerned individual’s consent?
In its Working Document on Genetic Data, the Article 29 Working Party (hereinafter also “WP29” or “Working Party”) stated that “to the extent that genetic data has a family dimension, it can be argued that it is “shared” information, with family members having a right to information that may have implications for their own health and future life”.
This would, however, raise major implications in the interpretation and application of data protection legislation and principles. Such a solution would entail, for instance, that other family members could also be considered “data subjects” or, alternatively, albeit not being considered data subjects in relation to such data, that family members would nonetheless have the right to access and receive this information due to the importance of the interests at stake from their perspective (traditionally, the exercise of the right of access and information is generally legally contingent on the condition of data subject and is not exercised by third parties on its behalf).
The most relevant legal instruments regulating data protection take an individualistic approach to data protection principles, which is challenged by, or at least seems somewhat incompatible with, the legal issues raised by the nature and characteristics of human genetic data. It should be noted, however, that other relevant instruments, such as Recommendation R (97) 5 on the Protection of Medical Data, the Statement on DNA Sampling by the HUGO Ethics Committee and UNESCO’s 2003 International Declaration on Human Genetic Data, take a different approach which takes into account the “significant impact on the family”, referring to the characteristics within a “related group of individuals”, as well as information concerning “family members of the individual” and going as far as stating that “special consideration should be made for access by immediate relatives”.
Legal basis, purposes and rights in the context of human genetic data processing
Considering the variety of uses to which human genetic data may be put, the vast array of information which may be extracted from its processing, as well as the quantity and quality of different purposes for which such data may be used and further processed after collection, it may often be challenging for data controllers to determine in a precise and timely manner the purposes, legal basis and data protection rules associated with such activities. For instance, the idiosyncrasies of human genetic data processing make it challenging to uphold the purpose limitation, data minimisation and accuracy principles, especially in light of the complexity and sensitivity of the information extracted from genetic data – which entails a substantial risk of misuse and/or re-use through additional analysis of the original data.
Furthermore, the determination of such elements (such as legal basis and data processing purpose) will most likely have to occur before the data processing operations take place, at a very early stage (data controllers ought to determine and clearly explain data processing details to data subjects upfront) and in terms which may hamper the task of properly defining the corresponding legal requirements. For instance, the conclusions which will be drawn from human genetic data processing, as well as the specific context and ends to which these data will be used, may not be entirely known at the time of collection (and may vary according to several dynamic factors, such as time, budget, technical capabilities, among others). This would probably be the case in the context of long-term medical and scientific research projects when they are at an early stage.
The determination of the legal basis pursuant to articles 6 (lawfulness of processing) and 9 (processing of special categories of personal data) of the GDPR is contingent on the specific purpose for which such data will be used and, in turn, will have a significant impact on the application and relevance of the data subject’s rights. All such elements are intertwined in some way and all of them vary substantially according to the context, operational reality and background in which human genetic data is to be processed. This means that it may be a complex challenge to determine the rights and obligations which arise from a particular data processing operation with the necessary degree of precision. This could jeopardize the fulfilment of data subjects’ rights which, in turn, could subsequently hinder data controllers’ operations.
Moreover, the specific manner in which certain principles and rights are to be applied is not always entirely clear in a human genetic data processing setting.
For instance, applying the data accuracy principle and the right to rectification may be challenging, since the processing of genetic data often yields insights and results containing errors that are intentionally kept within certain margins for various purposes (e.g., in order to assess and continuously improve the efficacy and efficiency of genomic data analysis). In addition, certain conclusions derived from the results of genetic data processing may be a matter of scientific opinion, which depends on several factors, such as the available scientific knowledge or the specific scientific area. Similarly, despite the existence of certain genetic data modification methods and technologies, e.g., gene editing and genetic engineering technologies, genetic data and the media in which they are kept have fundamentally non-modifiable characteristics which make rectification difficult.
Non-discrimination and non-stigmatisation
The characteristics of human genetic data (such as its permanent and essentially irrevocable nature) bring about severe risks of discriminatory or stigmatising treatment, for instance, in the context of employment and insurance. In an employment scenario, for example, an employer might decide to dismiss a worker because that employee has a genetic predisposition to cancer and thus might be on sick leave for several months. Likewise, in an insurance scenario, the insurance company might deny a health insurance policy based on the knowledge of the genetic predisposition of the individual to develop cancer.
The Article 29 Data Protection Working Party considers that the processing of genetic data in the context of employment should be prohibited in principle. Such processing should only be authorised under exceptional circumstances. The Recommendation of the Committee of Ministers on the processing of personal data in the context of employment stresses that human genetic data must not be processed for purposes related to the assessment or evaluation of the suitability/performance of employees or job applicants, regardless of whether the concerned individual has provided his/her consent for such processing.
At the same time, the Working Party considers that the processing of genetic data for insurance purposes should be prohibited in principle and only authorised in exceptional circumstances, which must be clearly foreseen in the law. The Working Party further stresses that “the use of genetic data in the insurance field could lead to an insurance applicant or members of his family being discriminated against on the basis of their genetic profile”. The Recommendation of the Committee of Ministers on the processing of personal health-related data for insurance purposes sets tangible provisions with the objective of preventing genetic discrimination in the insurance context, such as the practice of predictive genetic tests and the imposition of higher insurance premiums and taxes based on a person’s increased risk according to the conclusions obtained by the processing of genetic data.
Furthermore, the use of biometric and genetic data has also been observed in the creation of massive databases with genetic profiles for a range of purposes which may interfere with data subjects’ fundamental right to data protection. For instance, human genetic data may be processed by law enforcement agencies in the context of criminal investigations and is often kept at the governmental level for various purposes (although the processing of data for law enforcement purposes is regulated not by the GDPR but by the Law Enforcement Directive, the LED). This data may often be stored independently of the outcome or purpose of such activities. At the same time, the access to and use of these genetic profiles and data is frequently undertaken without the data subjects’ awareness or consent.
Lastly, the integration of human genetic data within other data sets enables the establishment of links between different types of personal data (such as contact details or an address) and may allow the tracking and surveillance of an individual across several different data points and with an unprecedented level of detail, potentially affecting them in ways they neither desire nor expect. This is the case, for example, in genomic data applications such as direct-to-consumer genetic testing, whose results may be further used for profiling and commercial strategies based on an individual’s genetic information combined with other personal data.
The processing of human genetic data may also “redefine family relationships, for example, by confirming or disproving paternity, locating previously unknown relatives, or identifying anonymous gamete donors. (…) Thus, when they design, conduct and discuss their research, investigators need to consider how genomic data are used and how the type of use affects whether or not the data are controlled outside the research setting as well”. 
The issue of storage duration/data retention is yet another topic of extreme relevance in the context of the processing of human genetic data. In a nutshell, the principle of storage limitation means that personal data must not be kept for any longer than it is necessary.
In the context of medical and scientific research, however, the application of this principle is not as straightforward as one might think, for two main reasons among others: (i) a substantial proportion of the human genetic data processed for medical and scientific research purposes is often collected in another context, such as in the clinical and healthcare context, and every so often the processing of such data constitutes a further processing activity pursuant to Article 6(4) (conformity test) or 89(3) (research exception) of the GDPR; (ii) most data used in this context undergoes some sort of digital or automated process which aims to conceal the identity of the data subject; however, such processes do not always guarantee the total and irreversible anonymization of the data at stake, but are rather mere pseudonymization or encryption processes in which some link with the identity of the original data subject is still kept.
Typically, genetic data collected for research purposes should be anonymous. This would ensure compliance with the storage limitation principle. However, several scientific and medical activities which make use of human genetic data require that the results of its analysis can be linked to an individual, and such a link is often necessary to achieve the purposes of the research itself.
At the same time, the characteristics of human genetic data, such as stored DNA, could enable a permanent link to a particular person (even if this person is not directly identified through other factors such as name or gender), given that certain genetic factors are, in themselves, identifiers of a natural person (i.e., these factors may inextricably be a part of an individual’s identity, without the need of further identifying an individual through other means).
As mentioned in Article 29 Data Protection Working Party’s Working Document on Genetic Data, “according to a definition of the task group established by the Danish government to assess the need for further legislative proposals in Denmark, a bio-bank is defined as a structured collection of human biological material which is accessible under certain criteria and where information contained in the biological material can be traced back to individuals”.
Furthermore, it is important to recall that several jurisdictions foresee the obligation to conserve certain data for long periods, such as data related to clinical trials. The EU Regulation on clinical trials on medicinal products for human use, for instance, foresees in its article 58 the archiving of the content of the clinical trial master file by the sponsor and the investigator for at least 25 years after the end of the clinical trial - such medical files shall be archived in accordance with national law.
Similarly, according to the WP29 Working Document on Genetic Data, “the Dutch data protection authority has been confronted with situations where anonymisation or deletion of data kept in biobanks could substantially diminish the value and functions of such data bases, since the data would no longer be linked to identifiable individuals. Examples are databases for longitudinal researches, sometimes encompassing several generations, such as the cancer registration. Arguments from the field for longer retention periods should be taken into account in such cases.”
This illustrates that, in certain cases, it is challenging to balance the need for data storage and conservation with the principle of storage limitation and, at the same time, that the application of the latter in the context of the processing of human genetic data for research purposes is not always entirely clear.
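To make the clinical-trial archiving period mentioned above concrete, the following is a minimal sketch of how the earliest lapse date of a 25-year period could be computed. The function names are hypothetical, and the sketch assumes the end date of the trial is already known; how that date is determined, and whether national law imposes a longer period, is outside its scope.

```python
from datetime import date

# Minimum archiving period cited in the text (clinical trial master file).
MIN_ARCHIVE_YEARS = 25


def earliest_archive_end(trial_end: date, years: int = MIN_ARCHIVE_YEARS) -> date:
    """Return the earliest date on which the archiving period can lapse."""
    try:
        return trial_end.replace(year=trial_end.year + years)
    except ValueError:
        # trial_end is 29 February and the target year is not a leap year.
        # Falling back to 1 March is an assumption of this sketch.
        return date(trial_end.year + years, 3, 1)


def must_still_archive(trial_end: date, today: date) -> bool:
    """True while the minimum archiving period has not yet elapsed."""
    return today < earliest_archive_end(trial_end)
```

A check of this kind only marks the point before which deletion is not permitted; it does not by itself decide how long data may be kept afterwards under the storage limitation principle.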
A different case, however, is the storage of human genetic data for criminal investigation. The matter is not regulated by the GDPR, but by the LED. Article 5 of the LED, about “Time-limits for storage and review”, states that “Member States shall provide for appropriate time limits to be established for the erasure of personal data or for a periodic review of the need for the storage of personal data. Procedural measures shall ensure that those time limits are observed”.
According to Article 11 of Convention 108+, any exception to the principles applicable to the processing of human genetic data, including storage limitation, must constitute a necessary and proportionate measure to pursue aims provided for by law.
A leading case in this regard comes from the European Court of Human Rights. In S. and Marper v. the United Kingdom, the European Court of Human Rights decided that the indefinite storage of biometric and genetic data (such as fingerprints, cellular samples and DNA profiles) after the investigation against the suspect has ended was not a necessary and proportionate measure in a democratic society.
Likewise, in M.K. v. France (2013), the European Court of Human Rights found a violation of Article 8 of the European Convention on Human Rights, as the retention of an innocent person’s data for 25 years was not necessary in a democratic society.
Pursuant to paragraph 8 of the Recommendation No. R (92) 1 of the Committee of Ministers of the Council of Europe on the Use of Analysis of Deoxyribonucleic Acid (DNA) within the Framework of the Criminal Justice System: “samples or other body tissues taken from individuals for DNA analysis should not be kept after the rendering of the final decision in the case for which they were used, unless it is necessary for purposes directly linked to those for which they were collected”. Additionally, “measures should be taken to ensure that the results of DNA analysis and the information so derived is deleted when it is no longer necessary to keep it for the purposes for which it was used. The results of DNA analysis and the information so derived may, however, be retained where the individual concerned has been convicted of serious offences against the life, integrity or security of persons. In such cases strict storage periods should be defined by domestic law.”
Bearing in mind the sensitivity and value of human genetic data, and the fact that cloud computing and other data storage and sharing technologies are becoming increasingly important for genetic data storage and analysis, such data may be highly targeted by cybersecurity attacks and external intrusion attempts. Thus, the integrity and security of the networks and technologies used in the processing of such data is a vital aspect. In this regard, the implementation of the appropriate technical and organizational measures is of the utmost importance for all stakeholders involved.
A key underlying and cross-cutting challenge is regulatory and legal fragmentation, especially in the context of international or regional law. Despite the existence of international and European legal instruments (both binding and non-binding), the divergent and overlapping interpretation of the rules and terms introduced in international and European law may potentially render the existing rules, best practices, guidelines and standards ineffective or non-enforceable, especially given that human genetic data processing activities are often undertaken in a cross-border context.
In this respect, the rapid development of science and technology increases the number of public and private entities engaging in human genetic data processing activities, which in turn leads to an overall increase in legislative and regulatory activities and, ultimately, to the emergence of national and international rules in matters of data protection in the context of human genetic data processing.
In sum, there are several international legal instruments which contain essential definitions and rules pertaining to human genetic data processing (e.g., the GDPR, Convention 108+ or Recommendation R (97) 5 on the Protection of Medical Data), and international and European bodies, such as the European Commission, also hold significant powers. Nonetheless, it should be noted that “the monitoring and enforcement of the application of data protection legislation falls primarily under the competence of national authorities, in particular data protection authorities and courts”. Additionally, pursuant to Article 9(4) of the GDPR, Member States may provide further conditions or limitations in relation to the processing of genetic, biometric or health-related data.
Highly dynamic and fast-paced environment
As previously detailed in the section “Stakeholders”, human genetic data is increasingly relevant for a wide variety of purposes, both scientific and commercial. Developments in genetic data science and processing technologies make it substantially harder to tackle the corresponding risks and threats, as well as to adapt the current legal framework and principles to adequately safeguard the fundamental rights of data subjects. Given the ever-changing conditions of technological and scientific development, it is challenging to retain flexibility and adequacy and at the same time comply with the legal provisions and principles (which may become obsolete).
The current legal framework aims to tackle such challenges by maintaining deliberately general, flexible and broad language (both in terms of definitions and of principles). One illustration is the definition of genetic data in the GDPR, which is detailed but at the same time based on general terms so that it does not become irrelevant. Article 4(13) of the GDPR, for example, defines genetic data as personal data which result, in particular, from the analysis of a biological sample. Per se, this definition could fall short of depicting the entire reality at stake. However, the joint interpretation of Article 4 (which states “in particular”, suggesting that it is a non-exhaustive definition) and recital 34, according to which genetic data are also obtained from the analysis of other elements (and not only biological samples), provides an all-encompassing conceptual definition aiming to include realities which may not be known at the present time.
Nevertheless, genetics technology has only recently started to be widely used and it is fair to assume that genetic data analysis techniques will only increase both in complexity and dimension. The agile development and swift changes in such processes, as well as the expected reduction in the costs, depict a future that may hold complex legal data protection challenges in the context of human genetic data processing. Such challenges become increasingly relevant in light of new risks and threats and the necessity to provide appropriate legal safeguards and guarantees which may not be yet in place.
 Article 29 Data Protection Working Party, Working Document on Genetic Data, 17 March 2004, p. 4-5
 Article 29 Data Protection Working Party, Working Document on Genetic Data, 17 March 2004, p. 10
 Recommendation CM/Rec (2015)5 of the Committee of Ministers to member States on the processing of personal data in the context of employment, article 9(3).
 Article 29 Data Protection Working Party, Working Document on Genetic Data, 17 March 2004, p. 10
 Recommendation CM/Rec (2016)8 of the Committee of Ministers to the member States on the processing of personal health-related data for insurance purposes, including data resulting from genetic tests, principle 4.
 Wan, Z., Hazel, J.W., Clayton, E.W. et al. Sociotechnical safeguards for genomic data privacy. Nat Rev Genet 23, 429–445 (2022). https://doi.org/10.1038/s41576-022-00455-y
 Regulation (EU) No 536/2014 of the European Parliament and of the Council of 16 April 2014 on clinical trials on medicinal products for human use.
Case of S. and Marper v. The United Kingdom, Applications nos. 30562/04 and 30566/04, 4 December 2008, ECHR [GC].
Case of M.K. v. France, Application no. 19522/09, 18 April 2013, ECHR.
European Parliament (Committee on Petitions), Notice to Members, 15/03/2019, https://www.europarl.europa.eu/doceo/document/PETI-CM-637225_EN.pdf.
 Christopher Kuner, Lee A. Bygrave, Christopher Docksey, The EU General Data Protection Regulation, A Commentary, Oxford University Press, 2020, p. 201.
 Explanatory memorandum to Recommendation CM/Rec(2019)2 of the Committee of Ministers to member States on the protection of health related data, par. 69.
Opportunities and incentives
Despite all challenges raised above, the processing of human genetic data is essential for a wide range of activities in our society and, as previously conveyed, future technological and scientific breakthroughs will unravel the full potential which might be achieved with the processing of genetic data.
Especially in the health, medical and pharmaceutical industries, namely in the context of academic research laboratories, research based on human genetic data is of paramount importance. In this respect, it is essential to emphasize that international, regional, and national legal instruments alike include a wider degree of freedom and a less demanding legal treatment in relation to the processing of human genetic data for research and archiving purposes in the public interest, as well as scientific, historical, or statistical purposes.
In fact, Convention 108+ establishes that human genetic data processing should be carried out based on the free, specific, informed and unambiguous consent of the data subject or some other legitimate ground set out by law, and the GDPR establishes that the processing of special categories of data (which includes human genetic data) is prohibited unless specific exceptions defined by the regulation apply. Nevertheless, both legal instruments establish clear derogations concerning the processing of human genetic data for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes.
In any event, such processing should be based on the law and subject to appropriate safeguards. For instance, in certain cases, it might be required that the data subject is not identifiable through technical and organizational measures such as anonymization (e.g., in the results or statistics which derive from such data processing). However, if the identity of the data subject is essential for the processing activity, an exception to this rule may potentially be applicable.
The GDPR contains several exemptions to the personal data protection principles and somewhat restricts the rights of the data subject in the interest of science, such as specific derogations to the principles of purpose limitation and storage limitation. For instance, Article 5(1)(b) establishes that the further processing of personal data for scientific research purposes shall not be considered incompatible with the initial purposes, and Article 5(1)(e) establishes that personal data may be stored for longer periods for such purposes.
Moreover, Article 14(5) of the GDPR includes an exception to the data controller’s information obligations in cases where personal data have not been obtained from the data subject if the provision of such information involves a disproportionate effort in the context of the processing for archiving, scientific or historical research purposes. Similar exceptions apply to the right to be forgotten (Article 17 of the GDPR) and to the right to object (Article 21 of the GDPR) in cases of data processing for scientific research.
Interactions with regulators
National data protection authorities oversee compliance with the data protection principles and with the GDPR. According to the accountability principle established by the GDPR (Article 5(2)), data controllers shall demonstrate compliance with data protection obligations, rules and provisions.
In addition to acting on complaints lodged by data subjects, the relevant data protection authorities may also actively engage in monitoring activities. Non-compliance with data protection obligations may be sanctioned with administrative fines of up to 20 million euros or 4% of the total worldwide annual turnover of the company, whichever is higher (Article 83 GDPR).
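The upper limit of such fines can be expressed as a short calculation. The sketch below assumes the two figures are alternative ceilings of which the higher one applies, as provided in Article 83(5) GDPR; the function name is hypothetical, and the result is only the statutory maximum, not a prediction of the fine an authority would actually impose.

```python
from decimal import Decimal

FIXED_CAP_EUR = Decimal("20000000")  # 20 million euros
TURNOVER_SHARE = Decimal("0.04")     # 4% of total worldwide annual turnover


def upper_fine_limit(annual_worldwide_turnover_eur: Decimal) -> Decimal:
    """Return the higher of the fixed cap and the turnover-based cap."""
    return max(FIXED_CAP_EUR, TURNOVER_SHARE * annual_worldwide_turnover_eur)
```

For a company with a turnover of 100 million euros the fixed ceiling of 20 million euros applies, whereas for a turnover of 1 billion euros the turnover-based ceiling of 40 million euros is the higher one.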
In addition, other authorities may also be involved in the context of the processing of human genetic data (and related activities), such as sectoral and industry-specific regional and national authorities. In the context of medical and clinical research, for instance, there might be the need to obtain certain authorizations/certifications or to prove compliance with certain sectoral laws. For illustration, this is the case of clinical trials for the investigation of the safety and efficacy of human medicines. In this context, other European (such as the European Medicines Agency) and national authorities may also engage in compliance monitoring or overall regulatory activities.
As mentioned before, the processing of genetic data is generally prohibited unless one of the exemptions and conditions provided for in Article 9(2) GDPR is fulfilled. Thus, the first and foremost practical step which should be undertaken by a potential controller of human genetic data is to assess whether the data which is to be processed is, in fact, personal data and, additionally, if such data constitutes genetic data, i.e., a special category of data.
Once such evaluation is undertaken, it becomes essential to evaluate and determine the purposes, legal basis and data protection safeguards associated with such activities. The lawful, fair and transparent processing of human genetic data entails, among others, the following aspects: i) the data controller has identified an appropriate lawful basis (both pursuant to Article 6 and 9 of the GDPR, cumulatively); ii) the activities pursued with this personal data are not unlawful in a general sense; iii) the data controller has assessed and considered the manners in which the processing may affect data subjects’ rights (including any necessary balance tests, proportionality exercises and impact assessments); iv) personal data will be processed in manners which may be reasonably expected by the data subject (or in manners which the data controller is able to clearly and transparently explain and justify); v) data collection and processing activities do not deceive or mislead data subjects.
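Part of this checklist can be kept as a structured record for each processing activity. The following is a minimal sketch, assuming a controller documents the purpose, the Article 6 basis, the Article 9 condition and whether the impact on data subjects has been assessed; the class, field and function names are hypothetical, and the check only flags missing entries without assessing whether a documented basis is legally valid.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProcessingActivity:
    purpose: str
    article_6_basis: Optional[str]      # free-text label, e.g. "6(1)(a)"
    article_9_condition: Optional[str]  # free-text label, e.g. "9(2)(a)"
    impact_assessed: bool


def open_issues(activity: ProcessingActivity) -> List[str]:
    """List the documentation gaps found for one processing activity."""
    issues = []
    if not activity.purpose.strip():
        issues.append("no purpose documented")
    # Both bases are required cumulatively for special-category data.
    if not activity.article_6_basis:
        issues.append("no Article 6 lawful basis identified")
    if not activity.article_9_condition:
        issues.append("no Article 9(2) condition identified")
    if not activity.impact_assessed:
        issues.append("impact on data subjects not assessed")
    return issues
```

An empty list means only that every entry has been filled in; the remaining aspects, such as fairness and the reasonable expectations of the data subject, still call for a case-by-case legal assessment.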
Frequently, the processing of genetic data may be based on the data subject's consent. In this context, it is essential to ensure that such consent was voluntary, clearly expressed, free and informed, by a written or an oral statement. Thus, consent should be given by a statement or by clear affirmative action. Inactivity, passive behaviours or pre-validated forms or checkboxes do not constitute consent (Recital 32 GDPR). Additionally, the data subject has the right to withdraw his/her consent at any time without prejudice to the data processing activities undertaken until such withdrawal.
The data controller should provide appropriate information to the data subject pertaining to the purposes for which human genetic data will be processed, the risks and consequences associated with the processing of such data, as well as the possibility to transfer personal data to data processors (in essence, entities which process personal data on their behalf). Concerning the latter, measures should be put in place and instructions should be given by the controller to the processor (e.g., through contractual instruments such as data processing agreements) to ensure that all processing activities undertaken by the processor are lawful, fair, transparent and secure.
Furthermore, the principle of non-discrimination and non-stigmatization entails that the processing of human genetic data is not used for purposes which discriminate in a way that is intended to infringe, or has the effect of infringing, human rights, fundamental freedoms or human dignity of an individual or for purposes that lead to the stigmatization of an individual, a family, a group or communities. The principle of prohibition of discrimination and/or stigmatization in the processing of genetic data is reinforced by a number of important international legal instruments, inter alia, the GDPR (for instance, recitals 75 and 85), the International Declaration on Human Genetic Data (for instance, Articles 3 and 7) and Convention 108+.
In this context, it is of paramount importance to ensure informational self-determination in a way which assures that data subjects can decide independently and freely what happens to their personal data. This includes special consideration of the purpose limitation principle. Data controllers may only process genetic data with the data subject’s consent or acknowledgement (as applicable), transparently sharing information regarding any data-sharing agreements and data retention periods. Additionally, data controllers must limit the collection of genetic data to limited and clearly defined purposes, in order to minimise negative impacts on the data subject’s rights. The purpose of the processing and/or the retention of genetic data must be properly justified, as well as necessary and proportionate in accordance with the objectives it seeks to achieve.
Concerning the latter, Article 16 of the International Declaration on Human Genetic Data provides that the processing of human genetic data (as well as human proteomic data and the biological samples collected) should not be undertaken for a different purpose which is not compatible with the original purpose.
Due regard must also be given to the minimization of the human genetic data processed, in accordance with what is strictly necessary for the corresponding purpose. Data controllers must ensure that any processed data are adequate, relevant, and not excessive considering the purpose for which they are collected and/or further processed. This principle has a strong link to the proportionality principle, given that it presupposes a proportionality test aimed at achieving the balance between the data processing purpose and data subject’s rights, with a view to assessing and determining the minimum data necessary for the purpose at stake (i.e., distinguishing the data which are needed to achieve it from the data which exceed it).
For illustration, according to the WP29’s Working Document on Genetic Data, the Spanish Data Protection Authority “deemed that the creation of a file of genetic samples to identify newborns through DNA testing was not in order. The aim of such files would be to prevent mother-infant mismatches. The Spanish Data Protection Authority took the view that the creation of a genetic file would contravene the principle of proportionality since the same result could be reliably obtained with other means (e.g., identity bracelets or footprints)”.
Also closely associated with the proportionality, purpose limitation and data minimization principles, data controllers may only keep human genetic data for as long as such personal data is necessary. This means that personal data may not be stored or kept for periods longer than those needed in light of the purpose for which they were collected, as well as that such periods must be clearly and thoroughly defined in relation to each purpose (i.e., storage duration limitation principle).
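In technical terms, the minimization principle described above is often implemented as a per-purpose allow-list of fields. The sketch below is illustrative only: the purpose name and field names are hypothetical, and deciding which fields are actually necessary for a purpose remains the outcome of the proportionality test, not of the code.

```python
from typing import Any, Dict, Set

# Hypothetical allow-list: fields considered necessary for each purpose.
FIELDS_NEEDED: Dict[str, Set[str]] = {
    "variant_screening": {"sample_id", "variants"},
}


def minimise(record: Dict[str, Any], purpose: str) -> Dict[str, Any]:
    """Keep only the fields that the allow-list marks as needed for the purpose."""
    allowed = FIELDS_NEEDED[purpose]  # raises KeyError for an undefined purpose
    return {key: value for key, value in record.items() if key in allowed}
```

Raising an error for an undefined purpose is a deliberate choice in this sketch: data is not released for a purpose that has not been defined in advance.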
For illustration, Article 21 of the International Declaration on Human Genetic Data imposes, among other provisions, that “human genetic data, human proteomic data and the biological samples collected from a suspect in the course of a criminal investigation should be destroyed when they are no longer necessary, unless otherwise provided for by domestic law consistent with the international law of human rights”, as well as that such data “should be available for forensic purposes and civil proceedings only for as long as they are necessary for those proceedings, unless otherwise provided for by domestic law consistent with the international law of human rights”.
The security, quality, reliability and accuracy of human genetic data entail that data controllers should seek to prevent errors, mistakes or inaccuracies in any output of human genetic data processing (such as genetic test results), and that they are under an obligation to update, rectify or modify the data whenever necessary (which also links to the data subject’s right to data rectification).
Article 15 of the International Declaration on Human Genetic Data establishes that “the persons and entities responsible for the processing of human genetic data, human proteomic data and biological samples should take the necessary measures to ensure the accuracy, reliability, quality and security of these data and the processing of biological samples. They should exercise rigour, caution, honesty and integrity in the processing and interpretation of human genetic data, human proteomic data or biological samples, in view of their ethical, legal and social implications”.
Furthermore, data controllers should ensure that human genetic data are kept in a way which is consistent with thorough security, integrity and confidentiality measures and safeguards. This means that human genetic data should not be disclosed or made accessible to third parties (i.e., should be treated as confidential) and that the privacy of data subjects participating in human genetic data processing activities should be protected, but also that data controllers should implement all necessary technical and organizational measures in order to protect its security and integrity. For instance, with a view to preventing unauthorised access, illegal modification, or data loss/destruction (in general, protection against any unauthorised or unlawful processing), data controllers must ensure a robust data infrastructure which is not permeable to cybersecurity risks, such as external cyberattacks and internal misuse.
In this regard, Article 14 of the International Declaration on Human Genetic Data states that “human genetic data collected for the purposes of scientific research should not normally be linked to an identifiable person. Even when such data or biological samples are unlinked to an identifiable person, the necessary precautions should be taken to ensure the security of the data or biological samples”.
Article 32 of the GDPR imposes that data controllers and processors “shall implement appropriate technical and organisational measures to ensure a level of security appropriate to the risk”, taking into account the state of the art, the costs of implementation of such measures and the nature, scope, context and purposes of processing, as well as the risk of varying likelihood and severity for the rights and freedoms of natural persons. This includes, inter alia, the pseudonymisation and encryption of personal data, ensuring the confidentiality, integrity, availability and resilience of processing systems and services, as well as the ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident. Such measures should be continuously reviewed and improved through processes established for the regular testing, assessment and evaluation of the effectiveness of the technical and organisational measures in place.
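As an illustration of one of the measures named above, the following is a minimal sketch of pseudonymisation by keyed hashing (HMAC-SHA256) using the Python standard library. It is one possible technique among several and is not prescribed by the GDPR; the function names are hypothetical, and secure storage of the key, kept separately from the data, is assumed rather than shown.

```python
import hashlib
import hmac
import secrets


def new_key() -> bytes:
    """Generate a random 256-bit secret key, to be stored apart from the data."""
    return secrets.token_bytes(32)


def pseudonymise(identifier: str, key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (hex-encoded)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same identifier and key always yield the same pseudonym, so records can still be linked for longitudinal research, and whoever holds the key can re-link a pseudonym to a known identifier by recomputing it. The output is therefore pseudonymised rather than anonymised data and remains personal data.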
Article 29 Data Protection Working Party, Working Document on Genetic Data, 17 March 2004, p. 6.
European Union Legislation
Directive 2001/83/EC of the European Parliament and of the Council of 6 November 2001 on the Community code relating to medicinal products for human use, OJ L 311, 28.11.2001, p. 67-128, CELEX number: 32001L0083
Regulation (EU) No 536/2014 of the European Parliament and of the Council of 16 April 2014 on clinical trials on medicinal products for human use, and repealing Directive 2001/20/EC Text with EEA relevance, OJ L 158, 27.5.2014, p. 1-76, CELEX number: 32014R0536
Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance), OJ L 119, 4.5.2016, p. 1-88, CELEX number: 32016R0679
Directive (EU) 2016/680 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data by competent authorities for the purposes of the prevention, investigation, detection or prosecution of criminal offences or the execution of criminal penalties, and on the free movement of such data, and repealing Council Framework Decision 2008/977/JHA, OJ L 119, 4.5.2016, p. 89-131, CELEX number: 32016L0680
Convention for the protection of Human Rights and Dignity of the Human Being with regard to the Application of Biology and Medicine: Convention on Human Rights and Biomedicine, European Treaty Series - No. 164, Oviedo, 4 April 1997
Explanatory Report to the Additional Protocol to the Convention on Human Rights and Biomedicine concerning Genetic Testing for Health Purposes, Council of Europe Treaty Series - No. 203, Strasbourg, 27 December 2008
Additional protocol to the Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data, regarding supervisory authorities and transborder data flows, Council of Europe, European Treaty Series - No. 181, Strasbourg, 8 November 2001
Recommendation CM/Rec(2016)8 of the Committee of Ministers to the member States on the processing of personal health-related data for insurance purposes, including data resulting from genetic tests, 26 October 2016