Common measures of mental health
This section gives brief details on the most widely used measures listed in the Catalogue. For each measure, it provides information on:
- The nature of the measure (interview or questionnaire)
- The main mental health problem(s) assessed
- The reporter(s) and reporting period
- The number of items included and how they are scored
Most questionnaire measures are designed to generate dimensional (total scale) scores; some also have well-validated sub-scales. For many questionnaire measures it is also possible to apply cut-points to identify individuals with high scores, likely to reflect clinically significant problems. We include details of widely-used cut-points where these are available. Where past studies suggest that different cut-points may be appropriate for different population sub-groups, researchers are advised to consult relevant publications to guide their selection.
For each measure we include reference(s) to papers describing the development of the instrument; these typically also include some data on reliability and validity. For many of the more widely-used instruments, further psychometric data (including on the suitability of the measure for use in different population sub-groups) are available in subsequent publications. Once again, researchers are advised to consult such additional sources if they are planning studies of samples that differ markedly from the original validation samples.
Alcohol Use Disorders Identification Test (AUDIT)
The AUDIT is a 10-item self-report screening tool developed by the World Health Organization to assess alcohol consumption, symptoms of dependence, and harmful alcohol use. Most items relate to use in the past year. The AUDIT was designed to be used internationally, and has been validated in a wide range of population groups. Each item is scored on a 0-4 point scale (total score range 0-40, higher scores indicating more difficulties). A score of 8 or more is considered to indicate hazardous or harmful alcohol use in many samples; other cut-points have been proposed for particular populations/sub-groups.
The AUDIT-C is a brief 3-item self-report screen for heavy drinking and/or active alcohol abuse or dependence. The 3 items are each scored 0-4 (total score 0-12). A score of 3 or more in women and 4 or more in men is considered optimal for identifying hazardous drinking or active alcohol use disorders. The AUDIT-C has sound psychometric properties, with good sensitivity and specificity for identifying problematic alcohol use in community and primary care samples.
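As a minimal sketch of the AUDIT-C scoring described above, the following Python function sums the three item scores and applies the sex-specific cut-points quoted in the text. The function name, the input format and the "female"/"male" coding are illustrative assumptions, not part of the instrument.

```python
def audit_c_screen(items, sex):
    """Return (total, positive) for three AUDIT-C items, each scored 0-4.

    `sex` is assumed to be coded "female" or "male" (hypothetical coding).
    """
    if len(items) != 3 or any(i not in (0, 1, 2, 3, 4) for i in items):
        raise ValueError("AUDIT-C needs exactly 3 items, each scored 0-4")
    cutoffs = {"female": 3, "male": 4}  # cut-points quoted in the text
    total = sum(items)
    return total, total >= cutoffs[sex]


print(audit_c_screen([1, 1, 1], "female"))  # (3, True)
print(audit_c_screen([1, 1, 1], "male"))    # (3, False)
```

Studies that adopt other cut-points for particular sub-groups would only need to change the `cutoffs` mapping.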
Babor, T., Higgins-Biddle, J.C., Saunders, J.B., & Monteiro, M.G. (2001). The Alcohol Use Disorders Identification Test: Guidelines for Use in Primary Care. World Health Organization, Geneva.
Saunders, J.B., Aasland, O.G., Babor, T.F., & Grant, M. (1993). Development of the alcohol use disorders identification test (AUDIT): WHO collaborative project on early detection of persons with harmful alcohol consumption-II. Addiction, 88, 791–804.
Bush, K., Kivlahan, D.R., McDonell, M.B., Fihn, S.D., & Bradley, K.A. (1998). The AUDIT alcohol consumption questions (AUDIT-C) - An effective brief screening test for problem drinking. Archives of Internal Medicine, 158, 1789-1795.
CAGE
The CAGE is a brief (4 item) screening test designed to identify problem drinking and potential alcohol problems. (The name is an acronym of its 4 questions: Have you ever felt you needed to Cut down on your drinking? Have people Annoyed you by criticizing your drinking? Have you ever felt Guilty about drinking? Have you ever felt you needed a drink first thing in the morning (Eye-opener) to steady your nerves or to get rid of a hangover?) Two ‘yes’ responses are indicative of alcohol problems. The CAGE has been validated in a wide range of population sub-groups.
Mayfield, D., McLeod, G. & Hall, P. (1974). The CAGE questionnaire: validation of a new alcoholism screening instrument. American Journal of Psychiatry, 131, 1121-1123.
Center for Epidemiologic Studies Depression Scale (CES-D)
The Center for Epidemiologic Studies Depression Scale (CES-D) is a 20-item self-report measure designed to measure current (past week) levels of depression in general population samples. Items are rated on 4-point Likert scales (scored 0-3, total score 0-60), reflecting the frequency and severity with which symptoms are experienced. The CES-D has demonstrated high levels of reliability in both general population and patient samples. A cut-off score of >=16 is widely used to identify depressed respondents, though different cut-offs may be appropriate in some groups (e.g. older adults). Briefer versions of the CES-D, based on 8, 10 and 11 items, have also been used in some studies.
Radloff, L.S. (1977). The CES-D scale: a self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385–401.
Clinical Interview Schedule – revised (CIS-R)
The CIS-R is a structured interview examining the presence of symptoms of common mental disorders (CMD) in the past week. It covers 14 types of CMD symptoms (somatic symptoms, fatigue, concentration and forgetfulness, depression, depressive ideas, worry, anxiety, sleep problems, irritability, worry about physical health, phobias, panic, compulsions and obsessions), and six (non-mutually exclusive) ICD-10 disorders (generalized anxiety disorder, depression, phobias, obsessive compulsive disorder, panic disorder, and CMD not otherwise specified [NOS]), together with a continuous scale that reflects the overall severity of CMD psychopathology. The CIS-R has been shown to be equally reliable when administered by interviewer or in a computer-assisted self-administered format. It has been widely used in population surveys.
Lewis, G., Pelosi, A.J., Araya, R., & Dunn, G. (1992). Measuring psychiatric disorder in the community: a standardized assessment for use by lay interviewers. Psychological Medicine, 22, 465–486.
Development and Wellbeing Assessment (DAWBA)
The DAWBA is a package of interviews, questionnaires and rating techniques designed to generate ICD-10 and DSM-IV or DSM-5 psychiatric diagnoses for 2-17 year olds. (Versions of the DAWBA are now also available for adults, but have not yet been used in any of the cohorts included here.) Information can be collected from up to three sources:
- An interview with the parents of 2-17 year olds
- An interview with 11-17 year olds themselves
- A questionnaire for completion by teachers of 2-17 year olds
The full DAWBA package covers the following diagnoses:
- Separation anxiety
- Specific phobia
- Social phobia
- Panic disorder/agoraphobia
- Post-traumatic stress disorder
- Obsessive compulsive disorder
- Generalised anxiety disorder
- Body dysmorphic disorder
- Disruptive mood dysregulation disorder
- Major depression
- Oppositional defiant disorder
- Conduct disorder
- Eating disorders, including anorexia, bulimia and binge eating
- Autism spectrum disorders
- Tic disorders, including Tourette syndrome
- Bipolar disorders.
The interviews can be completed in person or online. The DAWBA includes a mix of ‘closed’/structured questions and open-ended questions, where respondents describe their difficulties in their own words. Information from different informants is drawn together by a computer program that predicts the likely diagnosis or diagnoses from responses to the closed questions, and generates six probability bands, ranging from a probability of less than 0.1% to a probability of over 70% that the child has the relevant diagnosis. In some studies all the data, including the verbatim transcripts, are then reviewed by an experienced clinical rater who accepts or overturns the computer-generated diagnosis.
Further details of all aspects of the DAWBA are available on the web-site dawba.info
Goodman, R., Ford, T., Richards, H., Gatward, R. & Meltzer, H. (2000) The Development and Well-Being Assessment: Description and initial validation of an integrated assessment of child and adolescent psychopathology. Journal of Child Psychology and Psychiatry, 41, 645-55.
GAD-7 Generalized Anxiety Disorder Assessment
The GAD-7 is a brief self-report scale designed as a screen for symptoms of Generalized Anxiety Disorder (GAD). The 7 items are scored 0-3 (total score range 0-21), reflecting the frequency of experiencing symptoms of GAD in the past 2 weeks. The GAD-7 shows acceptable/good internal reliability, and has been validated as a screen for GAD in both clinical and population samples. Total scores of 5, 10 and 15 represent cut-points for mild, moderate and severe anxiety.
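The GAD-7 cut-points above can be applied as in the following sketch, which sums the seven item scores and looks up the severity band. The function name and the label used for totals below the first cut-point are assumptions made for illustration.

```python
from bisect import bisect_right

GAD7_CUTS = [5, 10, 15]  # cut-points quoted in the text
GAD7_LABELS = ["below mild cut-point", "mild", "moderate", "severe"]


def gad7_band(items):
    """Return (total, band) for seven GAD-7 items, each scored 0-3."""
    if len(items) != 7 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("GAD-7 needs exactly 7 items, each scored 0-3")
    total = sum(items)
    # bisect_right counts how many cut-points the total has reached
    return total, GAD7_LABELS[bisect_right(GAD7_CUTS, total)]


print(gad7_band([1, 1, 1, 1, 1, 0, 0]))  # (5, 'mild')
```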
The GAD-2 is a very brief (2 item) self-report scale designed as an initial screening tool for core symptoms of Generalized Anxiety Disorder (GAD). The 2 items are each scored 0-3 (total score range 0-6), reflecting the frequency of experiencing core symptoms of GAD in the past 2 weeks. A score of 3 is the preferred cut-point for identifying possible cases, where further diagnostic evaluation for GAD would be warranted in clinical settings. The GAD-2 has sound psychometric properties, with good sensitivity and specificity for identifying cases of GAD in primary care.
Spitzer, R.L., Kroenke, K., Williams, J.B.W., & Löwe, B. (2006). A brief measure for assessing generalized anxiety disorder: the GAD-7. Archives of Internal Medicine, 166, 1092–1097.
Kroenke, K., Spitzer, R.L., Williams, J.B., Monahan, P.O., & Löwe, B. (2007). Anxiety disorders in primary care: prevalence, impairment, comorbidity, and detection. Annals of Internal Medicine, 146, 317-25.
General Health Questionnaire (GHQ)
The GHQ is a self-report screening questionnaire for identifying minor psychiatric disorders in community samples. It is suitable for all ages from adolescent upwards. It assesses the respondent’s current state and asks if that differs from his or her usual state – so it is sensitive to short-term psychiatric disorders but not to long-standing attributes of the respondent. It focuses on two main areas:
- The inability to carry out normal functions
- The appearance of new and distressing phenomena.
The GHQ is available in four versions:
- GHQ-60: fully detailed 60-item questionnaire
- GHQ-30: a short form without items relating to physical illness
- GHQ-28: a 28 item scaled version – assessing somatic symptoms, anxiety and insomnia, social dysfunction and severe depression
- GHQ-12: a reliable and sensitive short form
There are different methods of scoring the items:
- GHQ scoring (0-0-1-1)
- Likert scoring (0-1-2-3)
- Modified Likert scoring (0-0-1-2)
- C-GHQ scoring (0-0-1-1) for positive items, where agreement indicates health, and 0-1-1-1 for negative items, where agreement indicates illness
Thresholds can be applied to total scores to identify ‘caseness’; specific thresholds vary for different versions of the GHQ, different scoring methods, and are sometimes suggested for different population sub-groups.
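The four scoring methods listed above differ only in the weight given to each of the four response options, which the following sketch makes explicit. It assumes responses are stored as option indices 0-3 (first to last option); the method names and the `negative_items` argument are hypothetical, and which items count as negative must be taken from the GHQ documentation.

```python
GHQ_WEIGHTS = {
    "ghq": (0, 0, 1, 1),
    "likert": (0, 1, 2, 3),
    "modified_likert": (0, 0, 1, 2),
}
CGHQ_POSITIVE = (0, 0, 1, 1)  # items where agreement indicates health
CGHQ_NEGATIVE = (0, 1, 1, 1)  # items where agreement indicates illness


def ghq_total(responses, method="ghq", negative_items=frozenset()):
    """Sum GHQ item scores under the chosen scoring method.

    responses: option index (0-3) chosen for each item.
    negative_items: 0-based positions of negative items (used for "c_ghq" only).
    """
    total = 0
    for position, option in enumerate(responses):
        if method == "c_ghq":
            weights = CGHQ_NEGATIVE if position in negative_items else CGHQ_POSITIVE
        else:
            weights = GHQ_WEIGHTS[method]
        total += weights[option]
    return total


answers = [0, 1, 2, 3]
print(ghq_total(answers, "ghq"))     # 2
print(ghq_total(answers, "likert"))  # 6
```

Because the same responses yield different totals under each method, a caseness threshold is only meaningful alongside the scoring method and GHQ version it was derived for.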
Goldberg, D. P., & Hillier, V. F. (1979). A scaled version of the General Health Questionnaire. Psychological Medicine, 9, 139–145.
Goldberg, D., Gater, R., Sartorius, N., Ustun, T.B., Piccinelli, M., Gureje, O., & Rutter, C. (1997). The validity of two versions of the GHQ in the WHO study of mental illness in general health care. Psychological Medicine, 27, 191-197.
Hospital Anxiety and Depression Scale (HADS)
The Hospital Anxiety and Depression Scale (HADS) is a 14-item self-report screening measure of past week anxiety and depression symptoms for adults (age 18 and above) in clinical and community settings. The scale excludes somatic symptoms likely to be present in patients with physical illnesses. Items are scored 0-3, giving a range of 0-21 for each of anxiety (HADS-A) and depression (HADS-D). Scores of 8 and above on each sub-scale have been reported to maximize sensitivity and specificity in determining caseness. The HADS has been widely translated, and has been found to perform well in assessing symptom severity and caseness in clinical and community samples.
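A sketch of HADS sub-scale scoring follows. The 0-21 range per sub-scale implies seven items each; the code assumes the items have already been split into anxiety and depression lists and recoded so that higher values mean more symptoms. The function name and output layout are illustrative only.

```python
def hads_scores(anxiety_items, depression_items, cutoff=8):
    """Sum each 7-item HADS sub-scale (items scored 0-3) and flag scores >= cutoff."""
    result = {}
    for name, items in (("HADS-A", anxiety_items), ("HADS-D", depression_items)):
        if len(items) != 7 or any(i not in (0, 1, 2, 3) for i in items):
            raise ValueError(name + " needs exactly 7 items, each scored 0-3")
        total = sum(items)
        result[name] = {"score": total, "case": total >= cutoff}
    return result


print(hads_scores([1] * 7, [2, 2, 2, 2, 0, 0, 0]))
# {'HADS-A': {'score': 7, 'case': False}, 'HADS-D': {'score': 8, 'case': True}}
```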
Zigmond, A.S., & Snaith, R.P. (1983). The Hospital Anxiety and Depression Scale. Acta Psychiatrica Scandinavica, 67, 361–370.
Kessler Psychological Distress Scale: K10 and K6
The K10 (10 item) and K6 (6 item) Kessler Psychological Distress scales were developed to provide short screens for past month non-specific psychological distress in population samples, with the particular aim of maximizing precision in the clinical range of the distribution (i.e. the 90th-99th percentile range). Items are scored 1-5 on Likert-type scales (total score range K10: 10-50, K6: 6-30). The scales have good psychometric properties, and discriminate strongly in community samples between individuals with and without interview-identified disorder.
Kessler, R.C., Andrews, G., Colpe, L.J., Hiripi, E., Mroczek, D.K., Normand, S.L.T., Walters, E.E., & Zaslavsky, A.M. (2002). Short screening scales to monitor population prevalences and trends in non-specific psychological distress. Psychological Medicine, 32, 959-976.
Kessler, R.C., Barker, P.R., Colpe, L.J., Epstein, J.F., Gfroerer, J.C., Hiripi, E., Howes, M.J., Normand, S.L.T., Manderscheid, R.W., Walters, E.E., & Zaslavsky, A.M. (2003). Screening for serious mental illness in the general population. Archives of General Psychiatry, 60, 184-189.
Malaise Inventory
The Malaise Inventory is a self-report measure of psychological distress, including 24 yes/no items (total score range 0-24). It was developed for use in the Isle of Wight epidemiological studies of the 1960s, and was subsequently used (in full, or in an abbreviated 9 item version) in the adolescent/adult sweeps of the 1958 and 1970 British birth cohort studies, and the first sweep of the Millennium Cohort Study. The items primarily tap symptoms of depression and anxiety, but also include some related somatic symptoms. The Malaise Inventory shows acceptable internal reliability, and the full scale shows good validity with respect to interview-assessed major depressive disorder. Scores of >=5 on the 15-item psychological sub-scale, or >=3 on the abbreviated 9-item version, have been taken to reflect clinically significant difficulties.
Rodgers, B., Pickles, A., Power, C., Collishaw, S. & Maughan, B. (1999). Validity of the Malaise Inventory in general population samples. Social Psychiatry and Psychiatric Epidemiology, 34, 333-341.
Mood and Feelings Questionnaire (MFQ, SMFQ)
The Mood and Feelings Questionnaire (MFQ, 33 items) and the Short Mood and Feelings Questionnaire (SMFQ, 13 items) are brief measures of recent (past 2 weeks) depressive symptomatology. The MFQ/SMFQ show good psychometric properties and correlate highly with indicators of clinical depression. Each instrument comes in 3 versions (child self-report, adult self-report, and parent report about a child). Items are rated on 3-point scales (not true/sometimes/true) coded 0-2, and summed to provide a total score (range 0-26 for the SMFQ). The authors do not recommend a particular cut-point for identifying ‘high’ scores, arguing that the most appropriate cut-point will vary with the purpose of the study. Further details are available from the MFQ web-page.
Angold, A., Costello, E. J., Messer, S. C., Pickles, A., Winder, F., & Silver, D. (1995). The development of a short questionnaire for use in epidemiological studies of depression in children and adolescents. International Journal of Methods in Psychiatric Research, 5, 237–249.
Patient Health Questionnaire Depression Scale (PHQ)
The Patient Health Questionnaire Depression Scale (PHQ) is available in 3 versions: PHQ-9, PHQ-8 and PHQ-2.
The PHQ-9 includes the full 9-item Depression Module of the self-report Patient Health Questionnaire, designed to screen for depression symptoms in the past two weeks. The items reflect the 9 DSM-IV criteria for depressive disorders. Each item is rated on a 4-point scale (scored 0-3, total score range 0-27), reflecting the frequency with which symptoms are experienced. The PHQ-9 has been extensively evaluated and found to be valid as both a severity and a diagnostic measure in both patient and community samples. A cut-off score of >=10 has been found to maximize combined sensitivity and specificity overall, and for subgroups, by comparison with semi-structured diagnostic interview assessments of depression. Scores of 10 or higher are commonly used to identify individuals with depression.
The PHQ-8 includes the first 8 items of the PHQ-9, but omits Item 9 (which asks about thoughts of death and self-harm). The PHQ-8 correlates highly with the PHQ-9, and the cut-offs that maximize combined sensitivity and specificity by comparison with interview-based measures of depression are the same. As with the PHQ-9, a cut-off score of >=10 is used to screen for major depression.
The PHQ-2 is a brief 2-item ‘first step’ screen for depressed mood and anhedonia in the past 2 weeks. The 2 items are each scored 0-3 (total score 0-6). A score of 3 is suggested as the optimal cut-point for identifying possible cases; a more detailed evaluation would then be needed to determine whether individuals meet criteria for a diagnosis of depression.
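The relationship between the three PHQ versions can be sketched as below, deriving all three totals from one set of PHQ-9 responses. The sketch assumes the responses are in questionnaire order and that the PHQ-2 items are the first two PHQ-9 items (an assumption not stated above); the function name and output keys are hypothetical.

```python
def phq_scores(phq9_items):
    """Return totals and screen flags for PHQ-9, PHQ-8 and PHQ-2.

    phq9_items: nine item scores (0-3) in questionnaire order.
    """
    if len(phq9_items) != 9 or any(i not in (0, 1, 2, 3) for i in phq9_items):
        raise ValueError("PHQ-9 needs exactly 9 items, each scored 0-3")
    phq9 = sum(phq9_items)
    phq8 = sum(phq9_items[:8])  # PHQ-8 omits item 9
    phq2 = sum(phq9_items[:2])  # assumed: PHQ-2 uses the first two items
    return {
        "PHQ-9": (phq9, phq9 >= 10),
        "PHQ-8": (phq8, phq8 >= 10),
        "PHQ-2": (phq2, phq2 >= 3),
    }


print(phq_scores([3, 0, 2, 2, 1, 1, 0, 0, 2]))
# {'PHQ-9': (11, True), 'PHQ-8': (9, False), 'PHQ-2': (3, True)}
```

The example shows that, although the two scales correlate highly, an individual respondent can fall on different sides of the >=10 cut-off on the PHQ-9 and PHQ-8 when Item 9 is endorsed.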
Kroenke, K., Spitzer, R.L. & Williams, J.B.W. (2001). The PHQ-9: validity of a brief depression severity measure. Journal of General Internal Medicine, 16, 606–613.
Kroenke, K., Strine, T.W., Spitzer, R.L., Williams, J.B., Berry, J.T. & Mokdad, A.H. (2009). The PHQ-8 as a measure of current depression in the general population. Journal of Affective Disorders, 114, 163–173.
Wu, Y., Levis, B., Riehm, K.E., Saadat, N., Levis, A.W., Azar, M. …Thombs, B.T. (2020). Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: a systematic review and individual participant data meta-analysis. Psychological Medicine, 50, 1368–1380.
Kroenke, K., Spitzer, R.L., Williams, J.B. (2003). The Patient Health Questionnaire-2: Validity of a two-item depression screener. Medical Care, 41, 1284–92.
PTSD Checklist (PCL)
The various versions of the PCL are self-report scales assessing symptoms of Posttraumatic Stress Disorder for DSM-IV and DSM-5.
PCL-C (Civilian), PCL-M (Military) and PCL-S (Specific):
The PCL-C, PCL-M and PCL-S are 17-item self-report scales assessing DSM-IV symptoms of posttraumatic stress disorder in the past month. The 3 versions (civilian, military and specific) vary slightly in the instructions and wording of the phrase referring to the index event. Symptoms are rated on 1-5 scales. The PCL has excellent internal consistency and adequate re-test reliability.
PTSD Checklist (PCL-5)
The PCL-5 is a 20-item self-report measure that assesses the DSM-5 symptoms of PTSD. Symptoms are rated on 0-4 scales, and versions are available for scoring symptoms experienced in the past week or the past month. It can be used as a screen for PTSD, or where necessary, to make a provisional PTSD diagnosis.
Note: the wording of the PCL-5 items reflects both changes to symptoms included in the DSM-IV versions of the checklist and the addition of new symptoms in DSM-5. The change in the rating scale (from 1-5 to 0-4), combined with the increase from 17 to 20 items, means that PCL-5 scores are not comparable with scores from the DSM-IV versions of the PCL, and the two cannot be used interchangeably.
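A short worked calculation illustrates why totals from the two generations of the checklist sit on different scales. The total score ranges below are not quoted above; they follow arithmetically from the item counts and rating scales, assuming a simple sum of item ratings. The helper name is hypothetical.

```python
def total_score_range(n_items, item_min, item_max):
    """Return the (lowest, highest) possible total for a summed checklist."""
    return n_items * item_min, n_items * item_max


# DSM-IV versions (PCL-C, PCL-M, PCL-S): 17 items rated 1-5
pcl_dsm_iv_range = total_score_range(17, 1, 5)
# PCL-5: 20 items rated 0-4
pcl_5_range = total_score_range(20, 0, 4)

print(pcl_dsm_iv_range)  # (17, 85)
print(pcl_5_range)       # (0, 80)
```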
Weathers, F. W., Litz, B. T., Huska, J. A., & Keane, T. M. (1994). PTSD Checklist—Civilian version. Boston National Center for PTSD, Behavioral Science Division.
Blevins, C. A., Weathers, F. W., Davis, M. T., Witte, T. K., & Domino, J. L. (2015). The Posttraumatic Stress Disorder Checklist for DSM-5 (PCL-5): Development and initial psychometric evaluation. Journal of Traumatic Stress, 28, 489-498.
Problem Gambling Severity Index (PGSI)
The Problem Gambling Severity Index (PGSI) is a 9-item screen for problem gambling, and includes items that reflect both gambling behaviours and gambling-related consequences. Items reflect severity of gambling in the past 12 months, and are scored 0-3 (total score range 0-27). A score of 0 indicates recreational gambling; 1–2=low risk gambling; 3–7=moderate risk gambling; and scores of 8–27 (maximum) indicate problem gambling. The PGSI has been reported to show high internal reliability, uni-dimensionality, and good item-response characteristics.
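The PGSI categories above translate directly into a lookup on the total score, as in this sketch. The function name and input format are assumptions; the category labels follow the wording used above.

```python
def pgsi_category(items):
    """Return (total, category) for nine PGSI items, each scored 0-3."""
    if len(items) != 9 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("PGSI needs exactly 9 items, each scored 0-3")
    total = sum(items)
    if total == 0:
        category = "recreational gambling"
    elif total <= 2:
        category = "low risk gambling"
    elif total <= 7:
        category = "moderate risk gambling"
    else:  # 8-27
        category = "problem gambling"
    return total, category


print(pgsi_category([1, 1, 1, 0, 0, 0, 0, 0, 0]))  # (3, 'moderate risk gambling')
```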
Ferris, J., & Wynne, H. (2001). The Canadian Problem Gambling Index: Final report. Ottawa, Canadian Centre on Substance Abuse.
Revised Children’s Anxiety and Depression Scale (RCADS)
The Revised Children’s Anxiety and Depression Scale (RCADS) is a 47-item scale designed to correspond to dimensions of several anxiety disorders in DSM-IV, along with major depression. Versions are available for self-report by children and adolescents (ages 8-18 years) and parent/carer report (RCADS-P). Items are scored 0-3, and can be summed to generate sub-scales relating to separation anxiety, social phobia, generalized anxiety disorder, panic disorder, obsessive compulsive disorder and low mood. Scoring guides are also available to generate T scores. The RCADS is available in a variety of languages, and has good psychometric properties.
Chorpita, B.F., Yim, L.M., Moffitt, C.E., Unemoto, L.A., & Francis, S.E. (2000). Assessment of symptoms of DSM-IV anxiety and depression in children: A Revised Child Anxiety and Depression Scale. Behaviour Research and Therapy, 38, 835-855.
Short-Form Health Survey (SF-36, SF-12)
The SF-36 is a 36 item self-report measure of health-related quality of life in the past 4 weeks. It assesses eight health concepts, which include general mental health (psychological distress and well-being) and limitations in usual role activities because of emotional problems. Scoring algorithms are used to generate sub-scales (each scored from 0 [low] to 100 [high]), along with a mental component score (MCS); many study data-sets include pre-derived variables for these scores. The SF-36 has been widely validated for use in clinical practice, policy evaluations and population surveys, and with different population sub-groups.
Short-Form Health Survey (SF-12)
The SF-12 is a shortened form of the SF-36, designed for use in population samples as a multi-factorial measure of health-related quality of life. It assesses general health, functional limitations, and mood and anxiety symptoms in the past 4 weeks. In the standard scoring, item weights are used to derive two orthogonal factors reflecting physical and mental health. The Mental Health Component Summary (MCS-12) has been shown to be a useful screening instrument for depression and anxiety disorders in community samples. There are no universally accepted cut-points on the MCS-12. An alternative scoring method, based on an oblique factor solution, has been used to create the RAND Mental Health Component scale (RAND MHC-12), which performs similarly to the MCS-12.
Ware, J.J. & Sherbourne, C.D. (1992). The MOS 36-item short-form health survey (SF-36). 1. Conceptual framework and item selection. Medical Care, 30, 473-483.
Ware, J., Kosinski, M., & Keller, S. (1996). A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Medical Care, 34, 220-233.
Hays, R.D. (1998). RAND-36 Health Status Inventory. The Psychological Corporation, San Antonio.
Strengths and Difficulties Questionnaire (SDQ)
The Strengths and Difficulties Questionnaire (part of the DAWBA family of mental health measures) is a brief screening questionnaire for emotional and behavioural difficulties and prosocial behaviours in 3-16 year olds. Versions are available for completion by parents, teachers, and 11-16 year-old young people.
The SDQ rates behaviour in the past 6 months, and has 5 sub-scales: emotional symptoms, conduct problems, hyperactivity/inattention, peer relationship problems, prosocial behaviour. Each sub-scale includes 5 items rated 0 (not true), 1 (somewhat true) or 2 (certainly true). Scores from the emotional, conduct, hyperactivity and peer problem sub-scales can be summed to give a Total Difficulties score (range 0-40). Some versions of the SDQ also include an Impact Supplement, asking whether the respondent thinks the young person has a problem, and if so, its chronicity, and related distress, social impairment, and burden to others.
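The sub-scale and Total Difficulties scoring just described can be sketched as follows. The code assumes item scores have already been grouped by sub-scale and recoded so that higher values mean more of the named attribute; the dictionary keys and function name are hypothetical.

```python
DIFFICULTY_SCALES = ("emotional", "conduct", "hyperactivity", "peer")


def sdq_scores(subscale_items):
    """Return sub-scale scores plus Total Difficulties (0-40).

    subscale_items: dict mapping each sub-scale name to five item scores (0-2).
    """
    scores = {}
    for name in DIFFICULTY_SCALES + ("prosocial",):
        items = subscale_items[name]
        if len(items) != 5 or any(i not in (0, 1, 2) for i in items):
            raise ValueError(name + " needs exactly 5 items, each scored 0-2")
        scores[name] = sum(items)
    # The prosocial sub-scale is not part of Total Difficulties
    scores["total_difficulties"] = sum(scores[n] for n in DIFFICULTY_SCALES)
    return scores


example = {
    "emotional": [2, 2, 2, 2, 2],
    "conduct": [1, 1, 1, 1, 1],
    "hyperactivity": [0, 0, 0, 0, 0],
    "peer": [2, 2, 0, 0, 0],
    "prosocial": [2, 2, 2, 2, 2],
}
print(sdq_scores(example)["total_difficulties"])  # 19
```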
The SDQ has been widely validated in many countries and population sub-groups.
Full details of the SDQ (including related publications and normative data from a number of countries) are available on the web-site sdqinfo.com
Goodman, R. (1997). The Strengths and Difficulties Questionnaire: A Research Note. Journal of Child Psychology and Psychiatry, 38, 581-586.
Warwick-Edinburgh Mental Wellbeing Scale (WEMWBS, SWEMWBS)
The Warwick-Edinburgh Mental Well-being Scale (WEMWBS) was developed to enable the monitoring of adolescent and adult mental wellbeing in the general population. The items are all worded positively and cover both feeling (happiness, pleasure, and enjoyment) and ‘functioning’ aspects of mental wellbeing (purpose, meaning, and fulfilment) experienced in the past 2 weeks. The scale is unifactorial, has good reliability, and has been extensively used in a wide range of surveys and population sub-groups in the UK and internationally.
The full WEMWBS scale includes 14 items, each with 5 response categories; item scores are summed to provide a total score (range 14–70). The 7-item SWEMWBS scale is a shortened version with a preponderance of functioning items. Raw scores for the shortened version need to be transformed using a conversion table (see Stewart-Brown et al, 2009). UK population norms for the 14 and 7 item versions have been generated from responses to the Health Survey for England, 2011, and are available at: WEMWBS and SWEMBS population norms Health Survey for England 2011. Further normative data for the 7-item version are reported in Fat et al (2017).
Tennant, R., Hiller, L., Fishwick, R., Platt, S., Joseph, S., Weich, S., Parkinson, J., Secker, J., & Stewart-Brown, S. (2007). The Warwick-Edinburgh mental well-being scale (WEMWBS): development and UK validation. Health and Quality of Life Outcomes, 5, Article 63.
Stewart-Brown, S., Tennant, A., Tennant, R., Platt, S., Parkinson, J., & Weich, S. (2009). Internal construct validity of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS): a Rasch analysis using data from the Scottish Health Education Population Survey. Health and Quality of Life Outcomes, 7, Article 15.
Fat, L.N., Scholes, S., Boniface, S., Mindell, J., & Stewart-Brown, S. (2017). Evaluating and establishing national norms for mental wellbeing using the short Warwick–Edinburgh Mental Well-being Scale (SWEMWBS): findings from the Health Survey for England. Quality of Life Research, 26, 1129–1144.