Multipurpose Databases

Large data sets based on demographic information, disease occurrence and prescribing information are now available from several sources for use by trained and competent observers. We have extensive populations, particularly in the United States, Canada and the United Kingdom, for whom routine information about demography, drug exposure and disease experience is available in reasonably standardised formats. We have skilled analysts available to review such data sets for important causal associations between drugs and events. These information sets are extremely powerful tools and must be used with skill and great care lest the results reported turn out to be erroneous. In such circumstances great damage could be done both to public health and to the data sets themselves. It is therefore crucially important that investigators ensure the validity of their observations by careful scrutiny of at least a sample (if not all) of the basic records. To rely solely on computer codes for disease identification, without returning to verify the basic written records, is, in the opinion of this author, likely to lead to serious error. Failure to undertake basic validation could easily do untold collateral damage to the reputation of individual medicines, and indeed to the parent data set itself.

Recently there have been some examples of conflicting conclusions emanating from different investigators reviewing the same topic from within the largest database in the United Kingdom, the General Practice Research Database. This may seem surprising at first sight. However, it must be clearly understood that the world-wide experience in this exciting area is confined to relatively small groups of investigators, as there are formidable logistical problems to overcome in entering and conducting research on these data resources. For example, drug, symptom and disease codes tend to change with time during the years of data accrual. This is not territory for the amateur or the unwary! One simply cannot go to these extremely complex information systems and expect to perform high-quality research overnight. The issues are usually technically challenging and epidemiologically extremely complex.

Classical epidemiology is well used to dealing with fixed properties of individual patients, such as sex, height, weight, parity, smoking habits, etc., or one-off exposures to toxic substances, such as chemicals or infective agents. It is not so comfortable dealing with intermittent exposures at varying doses, as is customarily the case in drug epidemiology studies. There are some areas where the exposure status can be relatively constant. Examples would be the use of oral contraceptives and hormone treatments (replacement therapies with oestrogens, insulin, thyroxine, etc.). Even here, however, patients regularly change individual preparations, and great care must be taken to ensure accuracy and fairness in data interpretation. In other areas intermittent exposures are the norm.

In embarking upon a drug safety study in a large database, the investigator must clearly specify the hypothesis to be tested. (Such databases are so complex as to be generally unsuitable for hypothesis-generation, except under very confined circumstances usually arising within an individual study.) Once one has defined the hypothesis, exposure and outcome status have to be assessed accurately. The nature of the study design has to be identified. Is it a follow-up study, a case-control study, or will it be a nested case-control study within a large group of subjects exposed to an individual medicine or class of medicines?
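For the nested case-control option, controls are conventionally drawn from the "risk set" of each case: subjects still under follow-up and not yet cases at the time the case occurred. A minimal sketch of such risk-set sampling follows; the patient records, field layout and follow-up years are entirely invented for illustration, not drawn from any real database:

```python
import random

# Hypothetical records: (patient_id, entry_year, exit_year, case_year or None).
cohort = [
    ("p1", 1990, 1998, 1995),
    ("p2", 1991, 1999, None),
    ("p3", 1990, 1996, None),
    ("p4", 1992, 1999, 1997),
    ("p5", 1990, 1999, None),
]

def at_risk(subject, year):
    """A subject is at risk at `year` if under follow-up and not yet a case."""
    _, entry, exit_, case_year = subject
    if not (entry <= year <= exit_):
        return False
    return case_year is None or case_year > year

def sample_risk_sets(cohort, controls_per_case=2, seed=0):
    """For each case, draw controls from subjects at risk at the case's event year."""
    rng = random.Random(seed)
    risk_sets = []
    for subject in cohort:
        pid, _, _, case_year = subject
        if case_year is None:
            continue  # not a case
        eligible = [s[0] for s in cohort
                    if s[0] != pid and at_risk(s, case_year)]
        controls = rng.sample(eligible, min(controls_per_case, len(eligible)))
        risk_sets.append((pid, case_year, controls))
    return risk_sets

print(sample_risk_sets(cohort))
```

Note that a subject who becomes a case later (here p4, a case in 1997) is still eligible as a control for an earlier case (p1, in 1995), which is a standard feature of risk-set sampling.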

Failure of clarity at this stage could doom the study from the outset. Investigators interested in a particular hypothesis can often be mesmerised by the apparent abundance of information available to them. They should keep in mind that it is crucial to restrict themselves to appropriate comparisons. Thus if one is looking at the effect of, say, hormone replacement therapy on osteoporosis, the relevant outcome measure available in such databases is generally a fracture. However, not all fractures are relevant. Indeed, most are irrelevant to the hypothesis, as they will have an obvious and sufficient cause, such as a road traffic or other accident, an underlying neoplasm or pre-existing bone disease. Similarly, not all exposures to hormones are relevant. For example, it would seem unlikely (biologically implausible) that a single prescription for such treatment would be relevant to the outcome of interest.

Trained epidemiologists are used to thinking of chance, bias and confounding as explanations for any associations they see in data. Although the items mentioned above are forms of bias, they tend to be obscure to all but those trained in the complexities of pharmacoepidemiology. Yet they are crucial issues to consider before one embarks on a seemingly large and promising study. Reflect that a negative outcome to a project could arise because the study drug does not cause the outcome of interest. However, it could also arise because there is so much "noise" in the system that an investigator cannot see the true link between drug and disease when it is in front of him, owing to inappropriate inclusions in the disease and drug exposure categories and inappropriate inclusions and exclusions in the comparator population. Finally, there is the problem of missing information, found in all systems yet requiring particularly careful handling in a multipurpose database. Such information rarely leads to a false-positive conclusion, but it could result in missing a key finding. The main safeguard here is familiarity with the data set itself.
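The restriction described above amounts to a pair of filters, one on outcomes and one on exposures. A simple sketch follows; the record layout, field names and the twelve-prescription threshold are purely hypothetical assumptions for illustration, not taken from any real database schema:

```python
# Hypothetical fracture records; field names are illustrative only.
fractures = [
    {"patient": "p1", "site": "hip",      "cause": "fall"},
    {"patient": "p2", "site": "radius",   "cause": "road traffic accident"},
    {"patient": "p3", "site": "vertebra", "cause": None},
    {"patient": "p4", "site": "femur",    "cause": "metastatic neoplasm"},
]

# Causes that provide an obvious and sufficient alternative explanation,
# making the fracture irrelevant to an osteoporosis hypothesis.
SUFFICIENT_OTHER_CAUSES = {"road traffic accident", "metastatic neoplasm"}

# Hypothetical counts of hormone replacement prescriptions per patient.
prescriptions = {"p1": 24, "p2": 1, "p3": 18, "p4": 30}

# Assumed threshold: a single prescription is biologically implausible
# as a cause of the outcome, so require sustained exposure.
MIN_PRESCRIPTIONS = 12

relevant = [
    f for f in fractures
    if f["cause"] not in SUFFICIENT_OTHER_CAUSES
    and prescriptions.get(f["patient"], 0) >= MIN_PRESCRIPTIONS
]
print([f["patient"] for f in relevant])  # → ['p1', 'p3']
```

The point of the sketch is only that both the outcome and the exposure definitions narrow the data before any comparison is made; the appropriate clinical criteria must of course be specified by the investigator, not the programmer.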

I have spent some time on this topic because I fear that the availability of more and more powerful information systems could lead to an epidemic of poorly undertaken studies that would reflect badly on the fledgling science of pharmacoepidemiology. This would be a matter of great regret, as the subject is of major importance for the future safety of patients, prescribers, dispensers and manufacturers alike. All have different perspectives, yet all share a common goal of getting the safest medicines to the appropriate patients at the right dose and at the right time. For a guide to some of the less obvious pitfalls in this type of research see the recent paper by Jick and colleagues in the Lancet (1998; 352: 1767-1770).

The development of pharmacovigilance is now at a critical stage. With powerful new tools at our disposal we have at last the opportunity to provide the public with some of the reassurances it requires from the industry and the professions. Ironically, it has taken over 35 years since David Finney originally recommended this approach in a seminal article in the Journal of Chronic Diseases (1965; 18: 77-98). It is crucial that we now rise to this challenge with enthusiasm and skill, seizing the opportunities that present themselves to us in these powerful information systems and surmounting the local difficulties relating to anonymisation of data sets, scientific rigour and credibility. For once we in the United Kingdom are in possession of a world-beating facility for research in the form of the General Practice Research Database, due solely to the foresight of its founding practitioner, the large numbers of collaborating general practitioners, and the analytic skills of the supporting Drug Surveillance Program.
