Privacy of Clinical Data

In 1997, the Institute of Medicine [47] reported on significant lacunae in both technology and policy in protecting confidential patient data. Among the problems of greatest concern emphasized in this report were the relatively unrestricted access by third parties to these data for secondary uses and the inadequacy of the anonymization process (in both practice and theory [170, 171]). Subsequent to this effort, the clinical informatics community developed several model confidentiality policies [153] and cryptographic identification systems [111, 174].

As the fruits of the Human Genome Project are translated first into clinical research protocols and then into clinical practice, personally identifiable genomic data will find their way into several instances of the information system. The challenges posed to security and privacy of such data might seem to dwarf any encountered to date with conventional clinical data. At first blush, genomic information is likely to be much more predictive of current and future health status than most clinical measurements. And, with very few exceptions, an individual's genome is uniquely identifying and applicable to that individual's entire life. This identifiability is apparently more reliable, persistent, and specific than typically cited identifiers, including a person's name, social security number, date of birth, or address. In practice, however, the widespread nature of clinical databases and the ease with which a unique individual can be identified [170] allow for breaches of privacy to be effected relatively trivially, even without genetic information. Furthermore, as demonstrated by insurance companies, highly accurate actuarial and prognostic knowledge can be developed solely on the basis of clinical history, habits such as smoking, family history, a physical examination, and a handful of laboratory tests. It remains to be demonstrated, given all the genomic reductionist fallacies reviewed at the beginning of this book, just how much additional prognostic accuracy will be delivered by the use of personal genomic data.

Nonetheless, at the very least, the architects of information systems storing genetic data should learn from all the mistakes and designs developed for the security architectures and privacy policies of conventional clinical information systems. Conversely, the dire concerns voiced over the storage of personal genetic data will likely generate new policies and security architectures that will enhance the confidentiality of clinical information systems. Moreover, when personal genetic data become incorporated into routine medical practice, the safeguards in place for the medical record's confidentiality will have significant import for the confidentiality of the genetic data that will be referenced in the medical record. It also may further drive the increased degree of patients' control of their own medical record [125].

Why should we bring up these concerns at all in a book on functional genomics? Simply because there are two strong imperatives to keep the privacy of the patient or tissue donor intact. First, on the simple grounds of human rights, there is a basic or implicit contract between researchers and patients. The second and less altruistic motivation is that if the confidentiality of the patient is breached, even the perception of the loss of confidentiality (particularly if it involves a disease for which there is attached social stigma or risk of loss of employment or income) could jeopardize an entire study. Or it could well lead to a broad societal reaction against genomic studies. Even now, there exist multiple recourses for the patient to obtain both civil and criminal penalties against the researcher. These penalties have further been formalized by the recent components of the Health Insurance Portability and Accountability Act (HIPAA) as amended in the final days of the Clinton administration in 2000 [32].[2] In this brief section we outline the major obligations involved in protecting the confidentiality of genomic patient data and common pitfalls in implementing them.

6.3.1 Anonymization

In theory, if a database were truly anonymous, then even if it fell into the possession of an unscrupulous party, no harm could come of it. Therefore, a first attempt at the design of a functional genomics database will typically involve one of the following two strategies for anonymizing a database.

The first strategy is to delete or replace common identifying attributes, such as name, date of birth, or address, which could be used very easily with external databases to identify the patient.[3] The risk with this approach is that although eliminating these common "face recognizable" identifiers does indeed reduce the ease of identification, the records can still be readily identifiable. The underlying insight is that there will be a list of attributes, such as which doctor the patient visited, when the patient was first diagnosed, or what kind of cancer the patient had, which together can serve to uniquely identify the patient. At the very least, these attributes can be used to reduce the set of matching patients to a handful of individuals. Then, if there is an incentive to do so, there is little or no barrier to reidentification of the patient record. This line of work has been most extensively developed and documented by Latanya Sweeney [169, 170], who showed that even meticulously deidentified records, such as those from the billing systems of Medicare, could be used to reidentify patients, including prominent politicians. She developed a collection of programs, called SCRUB, which provide an increasing degree of anonymity by eliminating data fields that could be used to single out a particular patient. Nonetheless, one of the clearer messages of that research was that the more clinical detail a database provides to characterize the patient, and hence the more useful it is, the greater the risk of a breach of the anonymity of that record. With this in mind, designers or architects of functional genomic databases should be parsimonious in their choice of attributes used to annotate these clinical databases, implementing only the minimum needed for the purposes for which the databases are designed.
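The point about residual attributes can be made concrete with a small sketch. The records, field names, and helper functions below are hypothetical and invented purely for illustration; they are not taken from SCRUB or from any real data set. The sketch counts how many records become unique as more of the remaining attributes are combined, even though names and addresses are already gone.

```python
from collections import Counter

# Hypothetical, made-up records from which names and addresses were already removed.
records = [
    {"physician": "A", "diagnosis_year": 1998, "cancer_type": "breast"},
    {"physician": "A", "diagnosis_year": 1997, "cancer_type": "breast"},
    {"physician": "A", "diagnosis_year": 1999, "cancer_type": "colon"},
    {"physician": "B", "diagnosis_year": 1998, "cancer_type": "breast"},
    {"physician": "B", "diagnosis_year": 2000, "cancer_type": "lung"},
]


def group_sizes(rows, attributes):
    """Count how many rows share each combination of the given attributes."""
    return Counter(tuple(row[a] for a in attributes) for row in rows)


def unique_fraction(rows, attributes):
    """Fraction of rows whose attribute combination matches no other row."""
    sizes = group_sizes(rows, attributes)
    unique = sum(
        1 for row in rows if sizes[tuple(row[a] for a in attributes)] == 1
    )
    return unique / len(rows)


if __name__ == "__main__":
    for attrs in (
        ["cancer_type"],
        ["physician", "cancer_type"],
        ["physician", "diagnosis_year", "cancer_type"],
    ):
        print(attrs, unique_fraction(records, attrs))
```

In this toy data the fraction of uniquely identifiable records rises as attributes are added, which is the trade-off between clinical detail and anonymity described above. A check of this kind, run before release, is one way a designer might decide which annotations to drop.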

The second, alternative approach is to segregate those fields, including name and address, that would be most revealing of the patient's identity and place them into a separate identification database controlled by a trusted third party (trusted by the patient). This party would allow the identifying data to be "joined" to the nonidentifying data only with the explicit permission of the patient or of the institutional review board of a research institution. That is, the reidentification of the patient would be determined by consent and disclosure policies and practice. We note that the anonymity of the nonidentifying data could still be compromised by the same reidentification techniques applied to the scrubbed databases described above.
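A minimal sketch of this segregation follows, under stated assumptions: the class `TrustedThirdParty`, the function `split_record`, and all field names are hypothetical, and the in-memory dictionaries stand in for what would in practice be separately administered databases. The two halves of a record are linked only by a random key, and the join is refused unless permission has been recorded.

```python
import secrets


def split_record(record, identifying_fields):
    """Split one record into an identity part and a clinical part.

    The two parts are linked only by a random key that carries no
    information about the patient.
    """
    link_id = secrets.token_hex(16)
    identity = {f: record[f] for f in identifying_fields}
    clinical = {k: v for k, v in record.items() if k not in identifying_fields}
    return link_id, identity, clinical


class TrustedThirdParty:
    """Hypothetical holder of the identification database."""

    def __init__(self):
        self._identities = {}    # link_id -> identifying fields
        self._permitted = set()  # link_ids with recorded permission

    def register(self, link_id, identity):
        self._identities[link_id] = identity

    def record_permission(self, link_id):
        """Record that the patient or the review board has allowed the join."""
        self._permitted.add(link_id)

    def rejoin(self, link_id, clinical):
        """Return the full record, but only if permission was recorded."""
        if link_id not in self._permitted:
            raise PermissionError("no permission recorded for this record")
        return {**self._identities[link_id], **clinical}


if __name__ == "__main__":
    full = {"name": "Jane Doe", "address": "1 Main St", "diagnosis": "asthma"}
    link_id, identity, clinical = split_record(full, ["name", "address"])
    ttp = TrustedThirdParty()
    ttp.register(link_id, identity)
    ttp.record_permission(link_id)
    print(ttp.rejoin(link_id, clinical))
```

The research database would hold only `link_id` and the clinical part. As noted above, this does nothing to stop reidentification from the clinical attributes themselves.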

In our own investigations of multicenter large-scale genomic databases, we have investigated the use of cryptographic health identification systems which could be used locally or nationally to create systems that enabled either the patient, the provider, or the institution to determine very specifically under which circumstances the patient record could be joined to the identifying information [111]. Furthermore, we have argued for patient-controlled encryption of their record [125] to prevent the application of reidentification techniques.
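To give a flavor of what a cryptographic identifier can look like, the sketch below derives a pseudonym from a patient identifier using a keyed hash (HMAC-SHA-256) from the Python standard library. This is an illustrative assumption on our part and not a description of the scheme in [111]; the function name `pseudonym`, the sample identifier, and the keys are all hypothetical.

```python
import hashlib
import hmac


def pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a patient identifier and a secret key.

    Without the key, the pseudonym cannot be recomputed from the identifier,
    so whoever holds the key controls when records can be linked.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical keys; in practice these would be generated randomly
    # and stored securely by the patient, provider, or institution.
    key_site_a = b"example-key-held-by-site-A"
    key_site_b = b"example-key-held-by-site-B"

    # Same key and identifier give the same pseudonym, so records link up.
    print(pseudonym("MRN-0001", key_site_a) == pseudonym("MRN-0001", key_site_a))
    # A different key gives an unrelated pseudonym, so records do not link.
    print(pseudonym("MRN-0001", key_site_a) == pseudonym("MRN-0001", key_site_b))
```

The design choice illustrated here is that linkage depends on possession of a key rather than on the identifier itself, so the circumstances under which records are joined can be decided by the key holder.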

In the end, however, it should be recognized that no matter how secure the means used to encrypt the database or to anonymize it, if there is a sufficiently large financial or personal incentive to breach the privacy of a particular individual, it will be done, whether through a direct assault on the protection measures used or through a corruption of internal controls, be they technical or sociological. In this light, it is most reasonable to ensure that patient samples are entered into a database only after the patient has been very thoroughly briefed on the measures that will be used to protect the record, as well as on the potential risks. With full disclosure and consent, there is much less likely to be acrimony and post hoc disruptions that may affect the integrity of the entire functional genomic investigation undertaken by a particular investigator.

6.3.2 Privacy rules

By passing HIPAA, Congress established standards governing the privacy of identifiable health information. Though this legislation covers many aspects of privacy in clinical medical care, there is a significant area of overlap with clinical research and the practice of obtaining consent from human beings involved in such research.

Those organizations covered by the HIPAA legislation and the privacy rule can continue to use properly deidentified health information for research purposes. However, those organizations can use and disclose protected health information for research purposes only with permission from each individual, or without individual authorization in a restricted set of cases:

1. An institutional review board has approved a waiver of the research participants' authorization. This can be done if all of the following conditions are met:

♦ Use of the information involves no more than minimal risk to the individuals;

♦ The waiver will not adversely affect the privacy rights of the individuals;

♦ The research could not be done without a waiver or without use of the information;

♦ The risk to the research subjects is low compared to the potential benefits to those subjects;

♦ A plan is in place to protect against improper use;

♦ Identifiers to the information will be destroyed as early as possible;

♦ The information will not be re-used.

2. The researcher documents that the data will be used only to design or assess the feasibility of a research project.

3. The researcher documents that the protected health information of deceased individuals is being used solely for research on those decedents and is necessary for that research.

This section is not meant to provide specific legal advice on this subject. More information about the implications of HIPAA on clinical research is available at … There are other U.S. laws to protect research subjects, including the Common Rule (the Federal Policy for the Protection of Human Subjects) and the regulations of the Food and Drug Administration. Privacy rules in other countries, especially in the European Union, will differ. The reader is strongly encouraged to consult his or her own institutional review board, or equivalent organization, for further guidance on this subject.

[2]And as pursued through at least the early months of the George W. Bush administration.

[3]A list of such identifiers that have been deemed of particular risk by the U.S. Department of Health and Human Services as part of HIPAA can be obtained at
