Dave Peck

Bebbington (Chapter 1) has outlined current thinking about the classification of depression. Although some debate exists, it is generally recognised that depression is best construed as a unitary disorder along a continuum of severity. The main exception to this simple classification is bipolar depression, in which periods of depression are interspersed with periods of excitement and mania.

Snaith (1993) examined the content of a range of depression rating scales and noted great divergence, some apparently focusing on items more associated with anxiety than with mood. He concluded with a plea for a more sophisticated approach to the construction and validation of rating scales; in particular, that they should be more closely related to accepted clinical definitions of depression. Despite such pleas, the most commonly used measures of depression, in clinical practice and in research studies, are still standard rating scales and questionnaires.

There are several alternatives to the traditional questionnaires and rating scales in the measurement of depression, using quite different methodologies. These alternatives can and perhaps should be used in conjunction with more traditional methods; they tap aspects of depression that are often neglected by scales and questionnaires, such as slowness. Many of them have levels of validity that suggest that they could be used to monitor changes in the level of depression. In this chapter, these alternatives will be termed 'behavioural' methods. They will receive more attention than might seem to be justified on the basis of the frequency of their reported use, in the hope that readers will consider using them in their own clinical and research endeavours. All measurements are subject to measurement error; however, the sources of error in behavioural methods are different from those of the traditional scales. A more rounded and comprehensive assessment of depression should result if behavioural measures are included in assessment batteries, despite their being more time-consuming than questionnaires.

Bebbington has already noted that many instruments have been devised to detect or diagnose mental illness, including depression, such as Schedules for Clinical Assessment in Neuropsychiatry (SCAN), Structured Clinical Interview for DSM-IV (SCID) and the Present State Examination (PSE). Although some of these diagnostic instruments are used to assess levels of depression, this is not their main function; they will not be covered in this chapter.

Mood Disorders: A Handbook of Science and Practice. Edited by M. Power. © 2004 John Wiley & Sons, Ltd. ISBN 0-470-84390-X.

The focus of this chapter is therefore on psychometric instruments and other methods that are used to assess, and monitor change in, levels of depression. Most of the rating scales are copyrighted and are only available commercially. However, some are available free on the Internet, courtesy of pharmaceutical firms; for example, the Hamilton and the Zung scales can be downloaded from www.wellbutrin-sr.com/hcp/depression/hamilton.html.


It is possible to classify psychometric measures of depression in several ways. One useful way is according to whether the measure is nomothetic or idiographic; that is, whether the measure is a standard instrument with set questions and a set of norms against which an individual score can be compared (nomothetic); or whether the instrument has a standard format, but with the content entirely determined by the specifics of the individual's problems (idiographic). A second way is according to whether the instrument is completed by the client (self-completed) or by the clinician (observer-completed); the latter can be subdivided into instruments based on standard rating scales and those based on structured interviews. A third classification is according to whether the instrument is designed to assess depression alone, or whether it includes assessments of other mental states (multistate instruments).

These differences are important because the different kinds of instrument often tap different aspects of depression, and may be differentially sensitive to the occurrence and rate of clinical change.
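The three classifications described above can be treated as independent attributes of an instrument. The following is a minimal sketch only, not part of any published scheme: the class names, the `select` helper and the two example entries are hypothetical, and the `scope` values assigned to the HDRS and BDI are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Approach(Enum):
    NOMOTHETIC = "nomothetic"    # set questions, scores compared against norms
    IDIOGRAPHIC = "idiographic"  # standard format, content specific to the individual


class Completer(Enum):
    SELF = "self-completed"          # completed by the client
    OBSERVER = "observer-completed"  # completed by the clinician


class Scope(Enum):
    DEPRESSION_ONLY = "depression alone"
    MULTISTATE = "multistate"        # also assesses other mental states


@dataclass(frozen=True)
class Instrument:
    name: str
    approach: Approach
    completer: Completer
    scope: Scope


# Illustrative entries only; the scope values are assumptions.
INSTRUMENTS = [
    Instrument("HDRS", Approach.NOMOTHETIC, Completer.OBSERVER, Scope.DEPRESSION_ONLY),
    Instrument("BDI", Approach.NOMOTHETIC, Completer.SELF, Scope.DEPRESSION_ONLY),
]


def select(instruments, **criteria):
    """Return the instruments whose attributes match every given criterion."""
    return [
        inst for inst in instruments
        if all(getattr(inst, key) == value for key, value in criteria.items())
    ]


self_completed = select(INSTRUMENTS, completer=Completer.SELF)
print([inst.name for inst in self_completed])
```

Keeping the three attributes separate makes it easy to check that an assessment battery draws on more than one kind of instrument.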

A comprehensive guide to psychometric measures of depression, including sample copies of some instruments, has been published (Nezu et al., 2000).

Nomothetic observer-completed instruments

These instruments should be administered by a trained interviewer, but, in practice, it is unlikely that all clinicians will have received the relevant training. The advantages of observer-completed instruments are several: they are more suitable if clients are very depressed, uncooperative or distracted; clients can be given the opportunity to elaborate on their responses; and clinicians often have a different but equally valuable and valid perspective on client problems. A disadvantage is that clients may be unwilling to respond openly to certain items if they are face-to-face with a clinician.

There are many instruments in this category, but, in practice, the instrument overwhelmingly used in research publications is the Hamilton Depression Rating Scale (HDRS). For example, Snaith (1996) reported that the HDRS was used in 66% of publications that employed a depression rating scale. Similarly, a survey (carried out in the preparation of this chapter) of the British Journal of Psychiatry and the American Journal of Psychiatry from 1990 to 2002 showed that the HDRS was employed in 152 papers, with the next most common instrument in this category being the Montgomery-Asberg Depression Rating Scale (three papers). Although it is more difficult to estimate its use in clinical practice, it is likely that the HDRS predominates here as well. This predominance is puzzling because much dissatisfaction has been expressed with the HDRS (Snaith, 1996).


The HDRS comprises 17 items, rated 0-2 or 0-4, and emphasises somatic symptoms. There are several alternative versions of the HDRS, with extra items to cover more psychological symptoms, but mostly the 17-item version is employed. It focuses on symptoms over the last week or so. Little guidance is provided on how to administer the scale; the lack of published guidance suggests that the observer should be an experienced clinician. Specific training is required for its use, and it should be used only as part of a comprehensive assessment, along with information from a variety of other sources; unfortunately, such recommendations in the use of the HDRS are widely ignored (Snaith, 1996). Hamilton (1967) is the key reference. The psychometric properties of HDRS have been extensively investigated. Interrater reliability is generally good (O'Hara & Rehm, 1983).

Questions have been raised about the sensitivity of the HDRS to detect differences between drug treatments, and this lack of sensitivity may reduce the statistical power of studies using this scale (Faries et al., 2000). A six-item version of the HDRS has been developed; O'Sullivan et al. (1997) reported that this short version was as useful as the longer versions in terms of sensitivity to change after drug treatments.


The Montgomery-Asberg Depression Rating Scale (MADRS) contains 10 items, rated on a four-point scale; useful descriptions of all the items are provided, as well as cues at each rating point. The administration should be preceded by a flexible clinical interview. The MADRS focuses entirely on psychological aspects of depression; this lack of somatic items is said to make it particularly suitable for use in general medical populations because it omits aspects of depression such as poor appetite that could also occur in a physical illness with no depressive component.

Nomothetic self-completed instruments

The same survey of the two journals referred to above indicated that the Beck Depression Inventory (BDI) is similarly predominant, especially in research studies. The BDI was employed in 54 papers in the journals examined; the next most frequent (the Zung Self-Rating Depression Scale) was employed in six papers. Similarly, Richter et al. (1998) claimed that the BDI has been used in more than 2000 research studies. It is likely that the BDI has a similar predominant position in clinical practice. Again this predominance is puzzling because the BDI has been the subject of much criticism (see below). However, the extensive knowledge that has accumulated about this inventory, and practitioner inertia, may explain its continued popularity.


The BDI has 21 items, rated on a four-point scale of severity (0-3), focusing mainly on psychological aspects of depression. Items were derived from the authors' clinical experience. The original reference is Beck et al. (1961). According to Richter et al. (1998), scores on the BDI tend to be markedly skewed, most scores being in the lower ranges. These authors also note that even in a sample of psychiatric patients, the mean item score on the 0-3 scale rarely exceeds 2, suggesting that the BDI scales do not discriminate well between levels of severity; internal consistency is acceptable but test-retest reliability is poor. Some factor-analytic studies report that the BDI is multifactorial; others, that it measures just one factor. A more recent version of the BDI (BDI II) has been developed (Beck et al., 1996). Some of the original items have been replaced (such as symptoms of weight loss), and the BDI II is now explicitly linked to DSM-IV criteria, with a common time frame of 2 weeks. The psychometric properties of BDI II are promising (Dozois et al., 1998; Steer et al., 2001), and many of the deficiencies of the original version appear to have been rectified.
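The relation between a total score and the mean item score referred to by Richter et al. can be illustrated with a minimal sketch. It assumes only what is stated above (21 items, each rated 0-3); the function name `summarise_responses` and the example ratings are hypothetical, and no severity cut-offs are implied.

```python
N_ITEMS = 21                    # number of items, as described in the text
MIN_RATING, MAX_RATING = 0, 3   # each item is rated 0-3


def summarise_responses(ratings):
    """Return the total score and mean item score for one completed inventory."""
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(ratings)}")
    for rating in ratings:
        if not isinstance(rating, int) or not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError(f"rating out of range: {rating!r}")
    total = sum(ratings)
    return {"total": total, "mean_item": total / N_ITEMS}


# Hypothetical respondent: ten items rated 1, the remaining eleven rated 0.
example = summarise_responses([1] * 10 + [0] * 11)
print(example["total"], round(example["mean_item"], 2))
```

Reporting the mean item score alongside the total makes it easier to see how much of the 0-3 range respondents actually use.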

Although the BDI was not designed to detect cases, several studies have indicated that it can be usefully employed in this way; for example, after myocardial infarction (Strik et al., 2001) and in low back pain (Love, 1987).


A self-completed version of the observer-completed MADRS is also available. Svanborg and Asberg (2001) compared it with the BDI. The instruments correlated at +0.87 and were equally effective in discriminating between different diagnoses and in assessing sensitivity to change. The authors criticised the content of the BDI, claiming that the items are unduly influenced by 'maladaptive personality traits'.


This scale comprises 20 items rated on a four-point scale covering symptoms over the last week. Psychological and somatic items have similar weight. It correlates moderately with other scales, but there is inconsistent evidence on its sensitivity to change. Becker (1988) has provided a useful short review.
