Confidentiality, Data Security, and Cancer Research: Perspectives from the National Cancer Institute
March 23, 1999
This paper explores the tension in cancer research between the need to protect the confidentiality of individuals and the need for access to information. It proposes a process and a series of measures that would move toward satisfactorily reconciling these needs. The measures include creation of barriers to unimpeded information flow in order to prevent the inappropriate identification of individuals, coupled with a sensible consent policy allowing prospective participants in studies to make informed choices. The biomedical research community has a generally successful record of safeguarding the confidentiality of individuals participating in research. There is no reason to think that a systemic problem with data confidentiality presently exists within the research enterprise. On the other hand, striking advances in biomedical science and the proliferation of electronic data storage, linkage, and transmission have created significant new challenges in maintaining confidentiality. New legislation aimed at redressing lapses in other areas of medicine - for example, in the management of the medical record in the ordinary setting of day-to-day hospital business - should not impose broad and inflexible restrictions on researchers' access to data. The goal of the research community, and of society generally, should be to develop and employ effective confidentiality protections without compromising the critical research necessary to improve human health. More specifically:
- In general, society can attain absolute security only by an unacceptable sacrifice of the common good. Absolute security is virtually never a realistic goal, and American society does not strive for it in other vital areas. Threats posed by terrorists, bank robbers, or computer hackers are countered by high levels of pre-emptive security, bolstered by criminal penalties for violations. In health care, as society institutes appropriate security measures that suit the current scientific and technological environment, it should simultaneously promote the development of legislation that provides criminal penalties for willful, unauthorized attempts to gain access to private medical information.
- The research community must make certain that strong, state-of-the-art security measures are developed and in place to ensure the confidentiality of data relating to research participants. Research organizations of similar types should develop standards, policies, and procedures for the handling of patient-identifiable information, where these do not yet exist. They should define the extent to which patient-identifiable data are necessary for research use, eliminate wherever possible the collection, storage, and transmission of information with identifiers, and, whenever identifiers are necessary, protect against unauthorized access or disclosure. Audit processes to assure adherence to policies and procedures safeguarding confidentiality should be developed, along with sanctions for lack of compliance. Consumer advocates with an interest in research should participate actively and contribute to developing these policies, procedures, and practices.
- Potential problems of discrimination in employment or insurance relating to health status are very real. These should be addressed directly by legislation banning discrimination, not by making data inaccessible to researchers. Such restrictions will not attack the central problem but will slow the generation of new knowledge and prevent progress in preventing and treating disease.
- The research community should initiate an inclusive process aimed at developing a national consensus about the circumstances under which informed consent is needed for studies using data or tissues already stored in repositories. Workable, balanced policies and procedures should be developed to provide people with the opportunity to give consent for the future use of tissues and data obtained under conditions of both routine care and clinical investigation.
- To develop effective and practical approaches to assuring the confidentiality of participants in research, the concerned communities should come together to identify best practices where they exist, pinpoint gaps where they do not, and propose new procedures or mechanisms wherever these may be required.
Table of Contents
- What is an Individual Identifier?
- What Kinds of Access to Information Require Individual Informed Consent?
- Clinical Trials
- Human Genetics and the Study of Inherited Susceptibility to Disease in Populations
- Other Uses of Human Tissue in Research
- Surveillance Databases and Registries
The achievements of research in preventing disease and improving the care of the sick will surely stand as a major legacy of the 20th century. The conquest and prevention of disease and our ability to track disease incidence in the population and obtain clues about causation depend ultimately on being able to relate exposures or interventions to medical outcomes. Consider some familiar examples. What is the relationship between use of estrogens and the risk of developing breast cancer? Or between exposure to toxic chemicals and birth defects? Or between fluoridation of the drinking water supply and the incidence of dental caries? Does the administration of aspirin or beta-blockers reduce the risk of death after a heart attack? Can breast cancer be prevented by the use of antiestrogen hormones? Can the use of anti-HIV drugs reduce the transmission of HIV from pregnant women to newborns?
These questions, and others like them in all areas of medicine, are answered in various ways. In some cases, researchers invite people to participate in studies in which an intervention (a drug, a diagnostic, a diet, etc.) is applied, and the result is compared with the experience of a similar group of people given a different intervention. Prospective participants are informed of the potential risks and benefits and must give written informed consent before their entry into the study. In other cases, researchers examine data or stored tissue already acquired by others, in order to track trends in the occurrence of particular diseases or relate outcomes to particular characteristics of the stored tissue. In yet other cases, people with certain histories or medical characteristics are identified and, after giving formal consent, answer questionnaires that solicit certain kinds of information about experience, exposure, or family history. What all these approaches have in common is that careful observations on individual people, and sometimes on tissue specimens from them, are made, recorded, and analyzed.
In the past, scientists, the public, and the government have cooperated to collect information relevant to the health of the public and of individual people. These efforts have yielded invaluable information about epidemics, disease causation, trends in diseases over time, and the effectiveness of medical or public-health interventions in treating and preventing disease. Many of these studies have included the collection of potentially sensitive information, which has been effectively held in confidence.
In recent years, however, the public has become increasingly concerned about the extent to which sensitive medical information is actually protected. Why now? The lax security accorded medical records in some health-care facilities has received wide publicity. The availability of large computerized data banks, the ability to transmit vast repositories of information electronically, and the dissemination of electronic informatics systems throughout the health-care delivery system are additional causes for concern. Certain types of medical information are particularly sensitive. Disclosure of HIV status or of the results of studies of inherited susceptibility to disease can render a person ineligible for employment or insurance; additionally, genetic information can stigmatize relatives of the tested person. We are told that some cancer survivors have had disastrous experiences with denials of insurance coverage and discrimination in employment. Finally, at a time of great changes in the health-care system, many people are no longer interacting with a trusted family physician but with less personal "provider" systems, which they may neither understand nor trust.
Intensifying concern about confidentiality in health care is leading to action by policymakers. Several states have enacted legislation barring casual access to patient records by health professionals not directly involved in the patient's care. The U.S. Congress must pass a privacy law by August 1999; if it does not, the Secretary of Health and Human Services must promulgate regulations by February 2000 governing the confidentiality of health information included in electronic transactions. The National Research Council recently issued the recommendations of a committee of experts on confidentiality and security of health-care information in the electronic era[1]. All these efforts have the laudable goal of keeping information that should be confidential secure from inappropriate access.
Unfortunately, in the current concern about confidentiality protections, the needs of clinical research are not often explicitly addressed. Some of the contemplated or actual legal remedies, or their interpretation, seriously threaten the conduct of important medical research. One of the difficulties in the formation of policy to cover a broad and complex area like health care is that laws or regulations designed to protect against worst-case scenarios may exert unintended and highly adverse consequences on the conduct of medical research, which depends on access to and exchange of information.
We shall now outline potential problems with data confidentiality in several key areas of cancer research, show how these problems have been addressed by the research community, and indicate what more needs to be done. First, however, we turn to two fundamental questions that underlie the entire discussion.
WHAT IS AN INDIVIDUAL IDENTIFIER?
Adequate provision of confidentiality requires that "identifiers" be stripped away from data used in research, so that the researcher can work without being able to associate the data with particular individuals. What kind of information can lead to identification of an individual? Unfortunately, this question has no simple answer. Name, address, and social security number would probably satisfy anyone's definition. Consider, however, the case of individuals with certain rare diseases. Such people - rare and perhaps conspicuous for one reason or another - may be well known in their local communities. It might require very little information to violate their confidentiality. Even innocent information like age, sex, and hair color might link potentially sensitive medical details with them unambiguously. In a small community, even release of a pedigree might lead to identification of individuals without actually naming anyone. Looking ahead toward the era of molecular medicine, imagine a large databank consisting of the genetic code for certain genes for all people having a certain condition. A clever snoop knowing a particular gene sequence could, with access to the databank, deduce the identity of the individual having that sequence. This would be possible, of course, only if the data were not adequately protected against such intrusions. Admittedly fanciful today, this possibility may not seem like science fiction much longer.
These examples show that the term "individual identifier" does not easily lend itself to absolute definitions. Certain pieces of information about people might or might not serve as identifiers, depending on the time, energy, and access to other information that a snoop brings to the search. In other words, the designation of certain data elements as an "identifier" is, in part, a function of the risk that society is willing to tolerate of inappropriate disclosure of personal information.
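The point that innocuous attributes can act as identifiers in combination can be made concrete with a small sketch. The records, field names, and the function `smallest_group` below are all hypothetical and fabricated for illustration; the sketch simply counts how many records share each combination of attributes and reports the size of the smallest group, under the assumption that a group of size 1 marks a person who could be singled out by anyone who knows those attributes.

```python
from collections import Counter

# Fabricated records with no direct identifiers (no name, address, or SSN).
records = [
    {"age": 43, "sex": "F", "town": "A", "diagnosis": "rare condition"},
    {"age": 43, "sex": "F", "town": "B", "diagnosis": "common condition"},
    {"age": 43, "sex": "F", "town": "B", "diagnosis": "common condition"},
    {"age": 61, "sex": "M", "town": "B", "diagnosis": "common condition"},
    {"age": 61, "sex": "M", "town": "B", "diagnosis": "common condition"},
]

def smallest_group(rows, attributes):
    """Return the size of the smallest group of rows that share the same
    values for the chosen attributes (1 means at least one row is unique)."""
    groups = Counter(tuple(row[a] for a in attributes) for row in rows)
    return min(groups.values())

print(smallest_group(records, ["sex"]))                 # 2
print(smallest_group(records, ["age", "sex"]))          # 2
print(smallest_group(records, ["age", "sex", "town"]))  # 1
```

In this toy data set no one is unique by sex or by age and sex, but adding the town isolates one record, and with it a sensitive diagnosis. How small a group is tolerable is the kind of risk judgment the text describes, not something the code can decide.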
WHAT KINDS OF ACCESS TO INFORMATION REQUIRE INDIVIDUAL INFORMED CONSENT?
For many years the law has stipulated that the entry of persons onto research studies requires their formal, written consent. Regulations emanating from the Department of Health and Human Services describe in detail what informed consent shall entail for studies supported with federal funds, and how the protection of participants in research shall be assured. For prospective[2] studies of any kind involving more than minimal risk, the need for informed consent is quite clear, and much scholarly and practical discussion has centered on how informed consent - a subtle and difficult concept - is to be achieved operationally.
For retrospective studies, however, the requirements are less clear. If a researcher wishes to utilize information or tissue already in some kind of repository (for example, a hospital chart, a pathology department, a centralized database, or a tissue bank), when is it appropriate for the researcher to go back and obtain informed consent from the person? Always? Never? Only when the data can be traced back to the individual being studied? This is currently unresolved and the subject of lively national discussion. The National Bioethics Advisory Commission is currently considering the special case of tissue resources, but the issue extends to any stored medical or informational resource relating to or derived from people.
The answers to these questions have enormous practical consequences. If a researcher wishes to work with large informational or tissue repositories collected in the past, the logistical problems in attempting to secure informed consent are formidable. Some of the subjects will have died, moved, or changed names since the repository was created, and the process of re-contacting them will be labor-intensive, expensive, time-consuming, and only partially successful. This is especially true when the research study goes beyond the walls of health-care institutions entirely and focuses on people without disease living in the community. These problems are of such magnitude, in fact, that they will effectively discourage such studies from being done at all. When the insertion of new research questions into a large study currently being performed requires re-consenting of the participants, the process is usually possible, since the population under study is being actively followed and the location of the study subjects is known. Here too, however, the practical difficulties may be immense.
Although these issues are relevant to all of biomedical research, we shall now focus on the implications for cancer. We shall consider examples of the major types of cancer studies in which confidentiality considerations loom large. Having identified actual or potential problems, we then suggest a process for solving them.
CLINICAL TRIALS
As elsewhere in the research arena, trials examining new interventions for prevention, treatment, detection, or diagnosis of disease require that a balance be set between the meticulous protection of research information and the need for efficient acquisition, transfer, and future use of this information to answer important questions.
The past record of clinical researchers in safeguarding the confidentiality of participants in cancer clinical trials has been excellent; over the more than four-decade history of NCI's extensive multi-center clinical trials program, we know of no breaches of security at research organizations. It is also true, however, that no standard operating procedures for data security are currently in force across NCI's cooperative trials program. Individual data centers employ a variety of strategies to ensure confidentiality. All employ security measures appropriate for paper files containing identifiers, including use of locked storage locations and mandatory escorting of non-employees within work areas. Methods for handling, archiving, and transferring clinical data are evolving rapidly with developments in medical informatics. In large clinical trials, timely and efficient management of study information is crucial for assuring the safety of participants, identifying unexpected toxicities, and monitoring the quality of study data. Encryption can allow a high level of security as information is transferred and shared with investigators. The research process can then proceed without compromising the confidentiality of participants.
The informed consent of individual participants undergirds the entire research enterprise. An individual's willingness to consent is determined partly by the extent to which personally identifying and potentially sensitive information will remain confidential. Consent allows the researcher to collect and analyze personal data according to the plans described in the research protocol. If future uses of these data are anticipated, these uses are specifically included in the informed consent. But one can never anticipate all future uses of research information that may benefit either the study participants or people in the future. These uses may relate to long-term side effects experienced by participants, possible risks or benefits to organ systems not anticipated by the original investigators, or psychosocial effects of the disease or its treatment. For example, the ability to pursue long-term follow-up on a completed clinical trial in breast cancer allowed investigators to identify an increased incidence of endometrial cancer in women treated with the hormone tamoxifen. This finding had implications not only for those patients enrolled on the study but also for the thousands of women outside the trial who were taking the drug as part of their routine care. The finding led to development of closer endometrial monitoring policies for healthy women taking tamoxifen as part of a large breast cancer prevention study. It has also given researchers greater impetus for the discovery and clinical testing of more specific anti-estrogen hormones that will have the potential benefits of tamoxifen without the adverse effects on the uterine endometrium. In another example, the use of past research data permitted the identification of the increased risk of secondary leukemia in women receiving adjuvant chemotherapy for breast cancer with cyclophosphamide and doxorubicin. This finding permitted notification of women treated with this regimen and allowed investigators to inform future patients more adequately of the balance of anticipated risk and benefit before beginning therapy.
Secondary and previously unanticipated use of data may have a highly beneficial effect on the efficiency and effectiveness of planning the next major study in a particular area. In many childhood cancers, for example, cure rates are quite high, but the curative treatment has significant toxicities. The thrust of much research in pediatric oncology is, therefore, to maintain or increase the effectiveness of treatment while simultaneously decreasing unwanted side effects. Planning such studies has required the ability to link a particular toxicity with patient-specific information about therapy received.
Many issues concerning future use of data apply also to the future use of tissue; this is discussed later under "Other Uses of Human Tissue in Research."
In summary, practices for the handling of an individual participant's data in clinical trials have been adequate for assuring data security, at least with the paper-file storage, transfer, and analysis procedures of the past. As informatics technology evolves, however, and as data transmission on public-use electronic highways becomes commonplace, clinical trialists must consider what additional kinds of security measures should be developed and adopted as standard practice.
- Under the auspices of the NCI and led by experts in the clinical cooperative groups, which have decades of experience in handling clinical data, the cancer clinical trials community and representatives of advocacy groups should assess the adequacy of the confidentiality protections that are in place or planned for the cooperative trials program. The community should then establish standards for the collection, storage, and transmission of clinical trials data that will minimize the risk of disclosing the identity of participants. Once developed, these standards should be made widely available, and any research group conducting clinical trials, including individual investigators and data-management centers, should adopt them as the minimum standard.
- On an ongoing basis, clinical trialists should define the extent to which data that can identify individuals are necessary for research use. Where such identifiers are actually necessary, trialists should provide protections against their unauthorized disclosure. Where they are not, they should not be included in clinical trial databases.
HUMAN GENETICS AND THE STUDY OF INHERITED SUSCEPTIBILITY TO DISEASE IN POPULATIONS
Probably more than any other recent scientific advance, the development of techniques to characterize the genes of individuals has raised a host of concerns about the impact of research results on people's lives. Much of the concern centers on the potential for loss of health care and life insurance and for discrimination in employment. We limit ourselves here to those concerns that have more or less direct relevance to issues of confidentiality. These include measures to ensure confidentiality of research participants, adequacy of informed consent, and the extent to which participants should be notified about the outcome of research results that may have implications for their health or that of their family members.
In general discussions and in the deliberations of institutional review boards (IRBs), the phrase "genetic testing"[3] is applied loosely to a wide variety of settings, including, for example, assessing somatic mutations in tumors and evaluating germline mutations in individuals. The current level of public discourse does not often distinguish between these and seems to regard them as equivalently risky. Clearly, however, there are very different health implications for the family of an individual whose tumor harbors somatic mutations than for one whose DNA contains a germline mutation. Somatic mutations carry no risk to an individual's family members, while germline mutations can clearly stigmatize both the individual and the family. Even within the spectrum of germline mutations, there is great variation in clinical import. Some germline changes are simply common variants of genes (so-called polymorphisms) that may have no relation to disease; other germline mutations lead to changes in the function of the genes that contain them and these changes may confer increased risk for a particular disease. To treat all these cases under the same assumptions is clearly not appropriate.
Many important research hypotheses involving variations in genes or in gene expression[4] can be addressed efficiently and quickly by using specimens stored in repositories. The use of these specimens is threatened by the apparent reluctance of many IRBs to approve studies requiring access to tissue specimens. To the extent that this reluctance is scientifically informed, one might simply view it as evidence that the IRB system works well. If, however, IRB refusal to approve a study is based on misperceptions of the goals and risks of research, the refusal does not serve to protect participants in research and is counterproductive scientifically. In either case it appears likely that concerns about the confidentiality of individually identifiable data weigh heavily on IRB members. A number of professional organizations have become concerned about increasing restrictions surrounding the use of specimen repositories, including the College of American Pathologists, the Association of American Medical Colleges, and the National Academy of Sciences.
Many effective safeguards to protect the confidentiality of research subjects are already in place. Many possible problems relating to confidentiality can be dealt with by adequately "anonymizing" the specimens. Although IRBs will, in principle, approve the use of samples for genetic characterization when "identifiers" are removed, individual IRBs differ as to what qualifies as an identifier. Some IRBs require delinking of even most risk-factor information because of concern, discussed previously, that an investigator might discover the identity of a study subject by assembling a sufficiently detailed demographic profile (for example, a 43-year-old white woman born in Marlboro, VT, now living in Lake Oswego, OR, who had smoked 40 pack-years when the sample was collected). Security or encryption methods could be devised that would make impossible the linking of participant identity with specimens but still allow linking to risk-factor information. If these methods were available, maximal use could be made of large stored collections of biospecimens. This is one possible area for attention. Encryption, access barriers, and audit trails can all serve to protect sensitive data.
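One way such a linking method might look is sketched below. This is an illustration under stated assumptions, not a method described in this paper: it substitutes keyed hashing (HMAC-SHA256 from Python's standard library) for encryption, and the key, the record layouts, and the names `pseudonym` and `release` are all hypothetical. A trusted custodian who holds the key replaces each participant identifier with a stable code before release, so that a researcher can join specimens to risk-factor data on the code without ever seeing the identifier.

```python
import hashlib
import hmac

# Assumption: a secret key held only by a trusted custodian, never by researchers.
CUSTODIAN_KEY = b"example-key-kept-by-the-custodian-only"

def pseudonym(participant_id: str, key: bytes = CUSTODIAN_KEY) -> str:
    """Derive a stable code from an identifier using HMAC-SHA256."""
    return hmac.new(key, participant_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Fabricated source data held by the custodian.
specimens = [{"participant_id": "P-001", "specimen": "tissue block 17"}]
risk_factors = [{"participant_id": "P-001", "pack_years": 40}]

def release(rows):
    """Replace the identifier in each row with its pseudonym before release."""
    return [
        {"code": pseudonym(r["participant_id"]),
         **{k: v for k, v in r.items() if k != "participant_id"}}
        for r in rows
    ]

released_specimens = release(specimens)
released_risk = release(risk_factors)

# The researcher joins the two releases on the code alone.
risk_by_code = {r["code"]: r for r in released_risk}
linked = [{**s, **risk_by_code[s["code"]]} for s in released_specimens]
print(linked)
```

Without the key, the code cannot feasibly be traced back to the identifier. The sketch does nothing about the separate concern raised above, that a detailed risk-factor profile may itself identify a person.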
When subjects undergo genetic characterization in a study involving large collections of stored samples from prospective cohorts, should they be informed of the results? This question is highly relevant to confidentiality considerations, since the answer will help determine what kinds of data security measures should be put in place. The most common rationale for notification is that the information will be useful to the participants in some way - either in treating or preventing disease or in making life decisions. Usefulness presupposes two things. The first is that we have enough information to make the results interpretable and meaningful to people[5]. In other words, we should understand the effects of the presence of certain genes on the risk of specific diseases and the health impact on an individual carrying the genes. For many cancer-related genes under investigation, this knowledge is incomplete or absent. Second, the results should be reliable. Current U.S. law stipulates that only laboratories regulated under the Clinical Laboratory Improvement Amendments (CLIA) can generate results for use in patient care. Thus, if investigators should find a mutation known to confer higher risk for a specific disease, and if the mutation were found in a research laboratory rather than a CLIA-authorized one, many assume that the finding would have to be confirmed by an independent analysis in a CLIA laboratory before the results could be given to an individual. One solution might be to inform study subjects initially that they will not receive their own individual results, but that newsletter updates at suitable intervals will provide study results in summary form, along with recommendations for follow-up. Each individual could then decide whether to seek additional information.
Another possibility is to inform patients initially that any positive results emanating from a research lab will be confirmed in a CLIA laboratory and then discussed with them, and appropriate counseling provided. Any viable solution will have to be such that the costs of confirmation and subsequent genetic counseling do not represent an undue financial burden.
In summary, our emerging ability to determine genetic susceptibility to disease raises particularly sensitive confidentiality issues, made more pressing by the fragmentary state of our present knowledge about how to handle the information and the potential for real harm to individuals if confidentiality is violated. On the other hand, overly stringent measures to protect confidentiality and a societal decision to permit research participants absolute authority over their tissue long after its donation may seriously impair the generation of new knowledge at a time of unprecedented opportunity.
In collaboration with experts in the cancer research and advocacy communities, NCI should coordinate the development of:
- methods that will inhibit the potential for violations of confidentiality of sensitive health information used in research. Encryption, for example, would permit the removal of comprehensible individual identifiers from demographic and outcome information but would enable linking of this information with samples from biorepositories. A number of other technological security tools, such as access barriers and audit trails, can help restrict and monitor access to sensitive information.
- systems to assure the secure, one-way flow of information about the study subject from whom the specimen derives. Such systems would prevent anyone unconnected with clinical care from identifying an individual participant from the research record and would prevent research results from reaching the participant's health-care record. They would allow follow-up information about outcome to continue to flow to the researcher (see next section).
- educational materials for members of IRBs about risk, research on genetic susceptibility to cancer, and the associated confidentiality protections, so that IRB decision-making can be well-informed.
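The access barriers and audit trails mentioned in the first recommendation above can be sketched in a few lines. The class `AuditedStore`, its fields, and the user names are hypothetical; chaining each log entry to the hash of the previous one is only one possible design, chosen here on the assumption that an audit trail is most useful when later alterations to it can be detected.

```python
import hashlib
import json
from datetime import datetime, timezone

class AuditedStore:
    """Toy record store: every access attempt is logged, and only
    explicitly authorized users may read."""

    def __init__(self, records, authorized_users):
        self._records = dict(records)
        self._authorized = set(authorized_users)
        self.audit_log = []

    def _log(self, user, record_id, allowed):
        # Chain each entry to the previous entry's hash so that later
        # edits to the log can be detected by verify_log().
        prev_hash = self.audit_log[-1]["hash"] if self.audit_log else ""
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "record_id": record_id,
            "allowed": allowed,
            "prev_hash": prev_hash,
        }
        payload = json.dumps(entry, sort_keys=True).encode("utf-8")
        entry["hash"] = hashlib.sha256(payload).hexdigest()
        self.audit_log.append(entry)

    def read(self, user, record_id):
        allowed = user in self._authorized
        self._log(user, record_id, allowed)  # log refused attempts too
        if not allowed:
            raise PermissionError(f"{user} may not read {record_id}")
        return self._records[record_id]

    def verify_log(self):
        """Recompute the hash chain; return False if any entry was altered."""
        prev_hash = ""
        for entry in self.audit_log:
            body = {k: v for k, v in entry.items() if k != "hash"}
            payload = json.dumps(body, sort_keys=True).encode("utf-8")
            if body["prev_hash"] != prev_hash:
                return False
            if hashlib.sha256(payload).hexdigest() != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True

store = AuditedStore({"S-17": {"marker": "positive"}}, authorized_users={"analyst_a"})
store.read("analyst_a", "S-17")
try:
    store.read("visitor", "S-17")
except PermissionError:
    pass  # the refusal has still been recorded in the audit log
print(len(store.audit_log), store.verify_log())
```

Both the permitted read and the refused attempt appear in the log, which is what allows an audit process to review who tried to reach sensitive data. A production system would also need to protect the log itself and the list of authorized users, which this sketch does not attempt.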
OTHER USES OF HUMAN TISSUE IN RESEARCH
In addition to use in genetic studies, human tissue specimens are used in a number of other ways, including study of the effects of exposures to toxic or carcinogenic agents, evaluation of new therapeutics, and identification of cellular components that may lead to the early detection or better prognostication of disease. Some of these studies do not depend on maintaining a link to an individual's identity. On the other hand, studies attempting to correlate biological characteristics of tissue specimens with long-term outcome (for example, with the development or course of a disease) require that information about the individual be obtained over time and be linked to the test results on the tissues from that individual. The actual identity of the participant need not be known, but the linkage between information on the tissue and information on the participant must be maintained and available to the researcher.
Study of human specimens, such as blood or tissue, is covered by the regulations that govern all research in people, so long as a link to the participant's identity is maintained. In order to realize fully the promise of modern biology and medicine, we must avoid the development of regulations that do not protect research participants but do inhibit the development of new tools to analyze the biological characteristics of tissue as they relate to exposures and outcome. The current regulations address levels of risk to participants and prescribe protections for these different levels. These protections involve how the research is reviewed by the institution and whether informed consent is required.
In addition to some of the needs already mentioned under Genetics:
- Institutions should widely obtain informed consent that permits the future research use of tissue obtained in routine care, provided that the research does not link the specimen to individual identifiers. A current procedure and form suitable for this purpose is about to undergo field-testing under the auspices of the NCI.
- Information systems must be developed and deployed that will assure the confidentiality of research participants while facilitating access to the clinical and outcome data necessary for research on tissue specimens. Such systems would securely encrypt individual identifiers while maintaining an encrypted link between tissue specimens and the medical information from individuals that is necessary for research. These systems would, in effect, interpose firewalls between the researcher and the medical care setting; they would prevent the researcher from identifying the participant, the participant's physician, or the participant's family. Firewalls would also block unvalidated research results from being used in medical decision-making. These barriers would not, however, impede the flow of essential information, stripped of personal identifiers, from the medical-care setting to the researcher.
SURVEILLANCE DATABASES AND REGISTRIES
Tracking of disease incidence and patterns in large populations depends on the accurate and timely reporting of new cases to a data repository. In the past, the control of certain infectious diseases such as polio, syphilis, and tuberculosis has depended on legally mandated reporting to departments of health. Following the creation of a state tumor registry in Connecticut early in this century, many states have established state-based cancer registries, and hospital-based registries located in health-care institutions themselves have been common for more than 60 years. State or regional population-based registries have been instrumental in measuring progress in cancer control on a national basis for more than 20 years, principally through NCI's Surveillance, Epidemiology, and End Results (SEER) reporting system. In 1994 Congress established the National Program of Cancer Registries (NPCR), which is managed by the Centers for Disease Control and Prevention (CDC). NPCR objectives are to assist states in developing model legislation and regulations to establish, enhance, and maintain population-based state cancer registries. Data from such registries satisfying established criteria for quality can be used to identify cancer trends, patterns, and clusters; to evaluate the results of implementing health practices on a large scale and the effectiveness of public health programs; and to support planning. Centralized registry management requires consolidation of multiple medical or health records containing detailed information on individuals. Registry personnel must follow established guidelines, policies, and procedures for collecting, transferring, maintaining, processing, and protecting these records, whether paper or electronic, as well as for ensuring their quality. A range of organizational practices and technological approaches exists to assure the confidentiality of sensitive information.
SEER and other population-based registries have developed a wide range of written and implied policies and procedures to assure secure data collection, storage, and processing, as well as secure access to their confidential databases, including research studies involving linkage of records. Typically, employees of the registries are required to sign pledges to maintain and protect confidential information. Paper and electronic files are locked in secure areas with restricted access. Almost all SEER registries are managed by academic centers with extensive experience in clinical and epidemiological research governed by the IRB process. They have established research review committees that govern access to confidential information by persons external to the registry and the use of linked data files. In addition, written agreements outline the responsibilities of investigators requesting registry data and requirements for maintenance of confidentiality. Signed research agreements are prepared as part of the review and approval process, often involving an IRB. Procedures to ensure confidentiality exist for data processing by registry staff or independent third parties. Protocols developed by SEER staff in Seattle and Los Angeles provide for secure data handling in the presence of patient contact, either directly or through physicians. Registry personnel mediate contact with patients, typically through the physician, who asks the patient for permission for the researcher to make contact.
National discussions regarding implementation of the Health Insurance Portability and Accountability Act (HIPAA), along with the general concern about individual confidentiality, have focused attention on population-based cancer registries. In particular, continued access to health records by federal agencies such as the NIH and the CDC, as well as state and local government agencies for public health surveillance, appears to be in some jeopardy. Some maintain that consent should be required to transmit such information to an organization with statutory authority to monitor health status. As already noted, there is concern in some quarters that the removal from databases of obvious individual identifiers may not be sufficient when patient records are used for analysis and publication of aggregate data. A number of standard procedures have been developed to protect confidentiality further. These include not transmitting unique identifiers to central databases; limiting inclusion of specific address information on public-use data tapes; and restricting analyses to groups large enough that a single individual, institution, or health-care provider cannot be identified.
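The last of these procedures, restricting analysis to sufficiently large groups, can be sketched as a simple suppression rule applied before aggregate counts are released. The sketch below is illustrative only. The minimum cell size of 5 is an assumed default rather than a threshold stated in this paper or attributed to any registry, and the function and field names are hypothetical.

```python
from collections import Counter


def suppressed_counts(records, group_fields, min_cell_size=5):
    """Count records per group and withhold any count below min_cell_size.

    min_cell_size=5 is an assumed default for illustration, not a published standard.
    Suppressed cells are reported as None rather than as a small number.
    """
    counts = Counter(tuple(r[f] for f in group_fields) for r in records)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }
```

For example, a table of cases by county and cancer site would report a count for a cell with six cases but would withhold the count for a cell with only two. A production system would also need to guard against recovering suppressed cells from row or column totals, which this sketch does not attempt.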
Changes in the health-care system have made the collection of population-based information even more complex, as confidential data must be consolidated from a variety of delivery systems such as managed-care organizations; this increases the technical requirements for managing and protecting confidential health records. Most central population-based cancer registries have established policies and procedures that guide their work. In a few states, legislation has been proposed, or has already passed, that severely limits access to linked health-care data and human tissue without informed consent, even when the study in question requires only linked data and not individual identification. Clearly, views have shifted on the proper balance between permissible uses of public-health information systems and individual confidentiality.
- Population-based cancer registries should assess the extent to which their current policies and procedures adequately protect the confidentiality of individuals. They should develop standards for the collection and management of confidential data and access to it, including protocols for research studies. Professional and trans-organizational groups, such as the North American Association of Central Cancer Registries (NAACCR), provide an infrastructure that should lead this effort with NCI support.
The issues outlined here are genuinely difficult and not susceptible to quick fixes or easy answers. Many of the specific recommendations for action noted above are in various stages of formulation and implementation. Concern about these issues is widespread within the scientific and advocacy communities, and a number of professional societies are attempting to develop recommendations. The education of policy makers, health-care professionals, researchers, and advocates will be a long and complex process. It begins with a clear articulation of the issues and the establishment of a dialogue in which all concerned constituencies share. Basic, clinical, and population scientists, biomedical ethicists, advocacy groups, the Office for Protection from Research Risks, the Food and Drug Administration, and the National Institutes of Health are all key stakeholders. A degree of consensus on the best course of action will be difficult to obtain, but the need is urgent.
1 For the Record: Protecting Electronic Health Information. Committee on Maintaining Confidentiality and Security in Health Care Applications of the National Information Infrastructure. National Academy Press, Washington, 1997. This monograph provides an excellent general overview of the nature of concerns with the privacy and confidentiality of health care information in the electronic age, as well as specific recommendations.
2 "Prospective" means that the intervention of interest is performed and all relevant information and observations on its effects are gathered after entry onto the study. By contrast, "retrospective" studies focus on information that has already been collected.
3 The term "genetic test" refers to any laboratory assay on tissue that gives information about the sequence or code present in a particular segment of a person's DNA. Since each person inherits a full set of gene sequences from both parents, and since genes are normally duplicated with only very rare mistakes, the gene sequences in normal body tissues track very closely with parental sequences, except where the normal process of genetic recombination has caused mixing of sequences from both parents after fertilization. If a parent has a mutation (a change from the normal sequence) in a particular gene, this mutation may be passed on through the germ (sperm or egg) cells to a child. This transmission of mutations through the "germline" is the basis for the inheritance of disease susceptibility from one generation to the next. Germline mutations are passed on to all the cells in the body. Cancers are very different from most normal tissues, however. Malignant cells also carry inherited (germline) mutations. In addition, however, as cancer cells reproduce and tumors grow, the genes of cancer cells are very unstable and mutations often accumulate in them over time. These are called "somatic" mutations, since they are not inherited from either parent but are produced in the tumor cells themselves.
4 All normal cells in the body contain the same genes. What makes a brain cell different from a kidney cell is in the particular set of genes that each turns on (or "expresses") in the course of growth and development. When compared to normal cells, cancer cells have abnormalities not only in their genetic sequences but also in their patterns of gene expression.
5 For example, after sequencing a particular gene segment, we might find a mutation at a particular place in the sequence. We might not yet know, however, whether this mutation has any significance in changing a person's risk for developing a disease. Should the person be made aware of this mutation without such information?