Common measures of mental health, personality and temperament

This section gives brief details on the most widely used measures listed in the Catalogue. For each measure, it provides information on:

The nature of the measure (interview or questionnaire)
The main mental health problem(s) and aspects of personality and temperament assessed
The reporter(s) and reporting period
The number of items included and how they are scored

Most questionnaire measures are designed to generate dimensional (total scale) scores; some also have well-validated sub-scales. For many questionnaire measures it is also possible to apply cut-points to identify individuals with high scores, likely to reflect clinically significant problems. We include details of widely-used cut-points where these are available. Where past studies suggest that different cut-points may be appropriate for different population sub-groups, researchers are advised to consult relevant publications to guide their selection.

For each measure we include reference(s) to papers or books describing the development of the instrument; these typically also include some data on reliability and validity. For many of the more widely-used instruments, further psychometric data (including on the suitability of the measure for use in different population sub-groups) are available in subsequent publications. Once again, researchers are advised to consult such additional sources if they are planning studies of samples that differ markedly from the original validation samples.

COMMON MEASURES OF MENTAL HEALTH

Alcohol Use Disorders Identification Test (AUDIT) - CAGE Questionnaire - Center for Epidemiologic Studies Depression Scale (CES-D) - Clinical Interview Schedule - Revised (CIS-R) - Development and Wellbeing Assessment (DAWBA) - GAD-7 Generalized Anxiety Disorder Assessment - General Health Questionnaire (GHQ) - Hospital Anxiety and Depression Scale (HADS) - Kessler Psychological Distress Scale: K10 and K6 - Malaise Inventory - Mood and Feelings Questionnaire (MFQ, SMFQ) - Patient Health Questionnaire Depression Scale (PHQ) - PTSD Checklist (PCL) - Problem Gambling Severity Index (PGSI) - Revised Children’s Anxiety and Depression Scale (RCADS) - Short Form Health Survey (SF-36, SF-12) - Strengths & Difficulties Questionnaire (SDQ) - Warwick-Edinburgh Mental Wellbeing Scale (WEMWBS, SWEMWBS)

COMMON MEASURES OF PERSONALITY

Big Five Inventory (BFI) - Eysenck Personality Questionnaire (EPQ) - International Personality Item Pool (IPIP) - Midlife Development Inventory (MIDI)

COMMON MEASURES OF TEMPERAMENT

Infant Behavior Questionnaire-Revised (IBQ-R) - Early Childhood Behavior Questionnaire (ECBQ) - Children’s Behavior Questionnaire (CBQ) - EAS Temperament Survey (EAS)

COMMON MEASURES OF MENTAL HEALTH

Alcohol Use Disorders Identification Test (AUDIT)

The AUDIT is a 10-item self-report screening tool developed by the World Health Organization to assess alcohol consumption, symptoms of dependence, and harmful alcohol use. Most items relate to use in the past year. The AUDIT was designed to be used internationally, and has been validated in a wide range of population groups. Each item is scored on a 0-4 point scale (total score range 0-40, higher scores indicating more difficulties). A score of 8 or more is considered to indicate hazardous or harmful alcohol use in many samples; other cut-points have been proposed for particular populations and sub-groups.

AUDIT-C

The AUDIT-C is a brief 3-item self-report screen for heavy drinking and/or active alcohol abuse or dependence. The 3 items are each scored 0-4 (total score 0-12). A score of 3 or more in women and 4 or more in men is considered optimal for identifying hazardous drinking or active alcohol use disorders. The AUDIT-C has sound psychometric properties, with good sensitivity and specificity for identifying problematic alcohol use in community and primary care samples.
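The scoring rules described above for the AUDIT and AUDIT-C can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring routine: the function names and the "female"/"male" labels are hypothetical, items are assumed to be already coded 0-4, and the cut-points are simply those quoted in the text (other cut-points may suit particular sub-groups).

```python
def _checked_sum(item_scores, n_items, lo, hi):
    """Validate item count and item range, then return the summed score."""
    if len(item_scores) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(item_scores)}")
    if any(not lo <= s <= hi for s in item_scores):
        raise ValueError(f"each item must be scored {lo}-{hi}")
    return sum(item_scores)


def audit_total(item_scores):
    """Total AUDIT score (range 0-40) from ten items coded 0-4."""
    return _checked_sum(item_scores, 10, 0, 4)


def audit_hazardous(item_scores, cut_point=8):
    """True if the AUDIT total reaches the cut-point (8 or more by default)."""
    return audit_total(item_scores) >= cut_point


def audit_c_positive(item_scores, sex):
    """Apply the AUDIT-C cut-points quoted above: 3+ for women, 4+ for men."""
    thresholds = {"female": 3, "male": 4}  # hypothetical labels
    return _checked_sum(item_scores, 3, 0, 4) >= thresholds[sex]
```

Keeping the cut-point as a parameter makes it straightforward to substitute a value from the literature for a specific population.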

References
Babor, T.F., Higgins-Biddle, J.C., Saunders, J.B., & Monteiro, M.G. (2001). The Alcohol Use Disorders Identification Test: Guidelines for Use in Primary Care. World Health Organization, Geneva.

Saunders, J.B., Aasland, O.G., Babor, T.F., de la Fuente, J.R., & Grant, M. (1993). Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO collaborative project on early detection of persons with harmful alcohol consumption-II. Addiction, 88, 791–804.

Bush, K., Kivlahan, D.R., McDonell, M.B., Fihn, S.D., & Bradley, K.A. (1998). The AUDIT alcohol consumption questions (AUDIT-C): An effective brief screening test for problem drinking. Archives of Internal Medicine, 158, 1789-1795.

CAGE Questionnaire

The CAGE is a brief (4-item) screening test designed to identify problem drinking and potential alcohol problems. The name is an acronym of its 4 questions: Have you ever felt you needed to Cut down on your drinking? Have people Annoyed you by criticizing your drinking? Have you ever felt Guilty about drinking? Have you ever felt you needed a drink first thing in the morning (Eye-opener) to steady your nerves or to get rid of a hangover? Two or more ‘yes’ responses are indicative of alcohol problems. The CAGE has been validated in a wide range of population sub-groups.

Reference
Mayfield, D., McLeod, G. & Hall, P. (1974). The CAGE questionnaire: validation of a new alcoholism screening instrument. American Journal of Psychiatry, 131, 1121-1123.

Center for Epidemiologic Studies Depression Scale (CES-D)

The Center for Epidemiologic Studies Depression Scale (CES-D) is a 20-item self-report measure designed to measure current (past week) levels of depression in general population samples. Items are rated on 4-point Likert scales (scored 0-3, total score 0-60), reflecting the frequency and severity with which symptoms are experienced. The CES-D demonstrated high levels of reliability in both general population and patient samples. A cut-off score of >=16 is widely used to identify depressed respondents, though different cut-offs may be appropriate in some groups (e.g., older adults). Briefer versions of the CES-D, based on 8, 10 and 11 items, have also been used in some studies.

Reference
Radloff, L.S. (1977). The CES-D scale: a self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385–401.

Clinical Interview Schedule - Revised (CIS-R)

The CIS-R is a structured interview examining the presence of symptoms of common mental disorders (CMD) in the past week. It covers 14 types of CMD symptoms (somatic symptoms, fatigue, concentration and forgetfulness, depression, depressive ideas, worry, anxiety, sleep problems, irritability, worry about physical health, phobias, panic, compulsions and obsessions), and six (non-mutually exclusive) ICD-10 disorders (Generalized anxiety disorder, depression, phobias, obsessive compulsive disorder, panic disorder, and CMD not otherwise specified [NOS]), together with a continuous scale that reflects the overall severity of CMD psychopathology. The CIS-R has been shown to be equally reliable when administered by interviewer or in a computer-assisted self-administered format. It has been widely used in population surveys.

Reference
Lewis, G., Pelosi, A.J., Araya, R., & Dunn, G. (1992). Measuring psychiatric disorder in the community: a standardized assessment for use by lay interviewers. Psychological Medicine, 22, 465–486.

Development and Wellbeing Assessment (DAWBA)

The DAWBA is a package of interviews, questionnaires and rating techniques designed to generate ICD-10 and DSM-IV or DSM-5 psychiatric diagnoses for 2-17 year olds. (Versions of the DAWBA are now also available for adults, but have not yet been used in any of the cohorts included here.) Information can be collected from up to three sources:

An interview with the parents of 2-17 year olds
An interview with 11-17 year olds themselves
A questionnaire for completion by teachers of 2-17 year olds

The full DAWBA package covers the following diagnoses:

Separation anxiety
Specific phobia
Social phobia
Panic disorder/agoraphobia
Post-traumatic stress disorder
Obsessive compulsive disorder
Generalised anxiety disorder
Body dysmorphic disorder
Disruptive mood dysregulation disorder
Major depression
ADHD/hyperkinesis
Oppositional defiant disorder
Conduct disorder
Eating disorders, including anorexia, bulimia and binge eating
Autism spectrum disorders
Tic disorders, including Tourette syndrome
Bipolar disorders.

The interviews can be completed in person or online. The DAWBA includes a mix of ‘closed’/structured questions and open-ended questions, where respondents describe their difficulties in their own words. Information from different informants is drawn together by a computer program that predicts the likely diagnosis or diagnoses from responses to the closed questions, and generates six probability bands, ranging from a probability of less than 0.1% to a probability of over 70% that the child has the relevant diagnosis. In some studies all the data, including the verbatim transcripts, are then reviewed by an experienced clinical rater who accepts or overturns the computer-generated diagnosis.

Further details of all aspects of the DAWBA are available on the web-site dawba.info

Reference
Goodman, R., Ford, T., Richards, H., Gatward, R. & Meltzer, H. (2000) The Development and Well-Being Assessment: Description and initial validation of an integrated assessment of child and adolescent psychopathology. Journal of Child Psychology and Psychiatry, 41, 645-55.

Generalized Anxiety Disorder Assessment (GAD-7)

The GAD-7 is a brief self-report scale designed as a screen for symptoms of Generalized Anxiety Disorder (GAD). The 7 items are scored 0-3 (total score range 0-21), reflecting the frequency of experiencing symptoms of GAD in the past 2 weeks. The GAD-7 shows acceptable/good internal reliability, and has been validated as a screen for GAD in both clinical and population samples. Total scores of 5, 10 and 15 represent cut-points for mild, moderate and severe anxiety.

GAD-2

The GAD-2 is a very brief (2-item) self-report scale designed as an initial screening tool for core symptoms of Generalized Anxiety Disorder (GAD). The 2 items are each scored 0-3 (total score range 0-6), reflecting the frequency of experiencing core symptoms of GAD in the past 2 weeks. A score of 3 or more is the preferred cut-point for identifying possible cases, where further diagnostic evaluation for GAD would be warranted in clinical settings. The GAD-2 has sound psychometric properties, with good sensitivity and specificity for identifying cases of GAD in primary care.
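As a rough illustration of how the GAD-7 and GAD-2 cut-points might be applied, the sketch below bands a GAD-7 total and flags a GAD-2 total. It assumes the cut-points of 5, 10 and 15 are the lower bounds of the mild, moderate and severe bands, and that a GAD-2 score of 3 is a minimum; the function names and the label for scores under 5 are hypothetical.

```python
def gad7_severity(total):
    """Band a GAD-7 total (0-21) using the 5/10/15 cut-points."""
    if not 0 <= total <= 21:
        raise ValueError("GAD-7 total must be between 0 and 21")
    if total >= 15:
        return "severe"
    if total >= 10:
        return "moderate"
    if total >= 5:
        return "mild"
    return "below mild cut-point"  # label chosen for this sketch


def gad2_possible_case(item_scores, cut_point=3):
    """True if the two GAD-2 items (each 0-3) sum to the cut-point or more."""
    if len(item_scores) != 2 or any(not 0 <= s <= 3 for s in item_scores):
        raise ValueError("GAD-2 expects two items scored 0-3")
    return sum(item_scores) >= cut_point
```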

References
Spitzer, R.L., Kroenke, K., Williams, J.B.W., & Löwe, B. (2006). A brief measure for assessing generalized anxiety disorder: the GAD-7. Archives of Internal Medicine, 166, 1092–1097.

Kroenke, K., Spitzer, R.L., Williams, J.B., Monahan, P.O., & Löwe, B. (2007). Anxiety disorders in primary care: prevalence, impairment, comorbidity, and detection. Annals of Internal Medicine, 146, 317-25.

General Health Questionnaire (GHQ)

The GHQ is a self-report screening questionnaire for identifying minor psychiatric disorders in community samples. It is suitable for all ages from adolescent upwards. It assesses the respondent’s current state and asks if that differs from his or her usual state – so it is sensitive to short-term psychiatric disorders but not to long-standing attributes of the respondent. It focuses on two main areas:

The inability to carry out normal functions
The appearance of new and distressing phenomena.

The GHQ is available in four versions:

GHQ-60: fully detailed 60-item questionnaire
GHQ-30: a short form without items relating to physical illness
GHQ-28: a 28 item scaled version – assessing somatic symptoms, anxiety and insomnia, social dysfunction and severe depression
GHQ-12: a reliable and sensitive short form

There are different methods of scoring the items:

GHQ scoring (0-0-1-1)
Likert scoring (0-1-2-3)
Modified Likert scoring (0-0-1-2)
C-GHQ scoring (0-0-1-1) for positive items, where agreement indicates health, and 0-1-1-1 for negative items, where agreement indicates illness

Thresholds can be applied to total scores to identify ‘caseness’; specific thresholds vary for different versions of the GHQ, different scoring methods, and are sometimes suggested for different population sub-groups.
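The four scoring methods listed above differ only in the weight given to each of the four response categories, which can be shown with a small lookup table. The sketch below is illustrative only: responses are assumed to be coded as category positions 0-3 in questionnaire order, the method names are hypothetical labels, and the positions of negatively worded items (needed for C-GHQ scoring) must be supplied by the user from the relevant GHQ documentation.

```python
# Weight applied to each of the four response categories, by scoring method.
WEIGHTS = {
    "ghq": (0, 0, 1, 1),
    "likert": (0, 1, 2, 3),
    "modified_likert": (0, 0, 1, 2),
}
CGHQ_POSITIVE = (0, 0, 1, 1)  # items where agreement indicates health
CGHQ_NEGATIVE = (0, 1, 1, 1)  # items where agreement indicates illness


def ghq_total(responses, method="ghq", negative_items=frozenset()):
    """Sum GHQ item scores under the chosen scoring method.

    responses: response-category positions (0-3), in questionnaire order.
    negative_items: 0-based positions of negatively worded items; only
        used when method is "cghq".
    """
    total = 0
    for position, category in enumerate(responses):
        if method == "cghq":
            weights = CGHQ_NEGATIVE if position in negative_items else CGHQ_POSITIVE
        else:
            weights = WEIGHTS[method]
        total += weights[category]
    return total
```

For example, the same four responses `[0, 1, 2, 3]` give a total of 2 under GHQ scoring, 6 under Likert scoring and 3 under modified Likert scoring, which is why caseness thresholds are specific to the scoring method.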

References
Goldberg, D. P., & Hillier, V. F. (1979). A scaled version of the General Health Questionnaire. Psychological Medicine, 9, 139–145.

Goldberg, D., Gater, R., Sartorius, N., Ustun, T.B., Piccinelli, M., Gureje, O., & Rutter, C. (1997). The validity of two versions of the GHQ in the WHO study of mental illness in general health care. Psychological Medicine, 27, 191-197.

Hospital Anxiety and Depression Scale (HADS)

The Hospital Anxiety and Depression Scale (HADS) is a 14-item self-report screening measure of past week anxiety and depression symptoms for adults (age 18 and above) in clinical and community settings. The scale excludes somatic symptoms likely to be present in patients with physical illnesses. Items are scored 0-3, giving a range of 0-21 for each of anxiety (HADS-A) and depression (HADS-D). Scores of 8 and above on each sub-scale have been reported to maximize sensitivity and specificity in determining caseness. The HADS has been widely translated, and has been found to perform well in assessing symptom severity and caseness in clinical and community samples.
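A minimal sketch of HADS sub-scale scoring follows. It assumes the caller has already separated the seven anxiety items from the seven depression items and coded each 0-3 in the scored direction; the function name and output keys are hypothetical, and the cut-point of 8 is the one quoted above.

```python
def hads_subscales(anxiety_items, depression_items, cut_point=8):
    """Return HADS-A and HADS-D totals (0-21 each) with caseness flags."""
    for items in (anxiety_items, depression_items):
        if len(items) != 7 or any(not 0 <= s <= 3 for s in items):
            raise ValueError("each sub-scale needs seven items scored 0-3")
    anxiety = sum(anxiety_items)
    depression = sum(depression_items)
    return {
        "HADS-A": anxiety,
        "HADS-D": depression,
        "anxiety_case": anxiety >= cut_point,
        "depression_case": depression >= cut_point,
    }
```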

Reference
Zigmond, A.S., & Snaith, R.P. (1983). The Hospital Anxiety and Depression Scale. Acta Psychiatrica Scandinavica, 67, 361–370.

Kessler Psychological Distress Scale (K10, K6)

The K10 (10-item) and K6 (6-item) Kessler Psychological Distress scales were developed to provide short screens for past month non-specific psychological distress in population samples, with the particular aim of maximizing precision in the clinical range of the distribution (i.e., the 90th-99th percentile range). Items are scored 1-5 on Likert-type scales (total score range K10: 10-50, K6: 6-30). The scales have good psychometric properties, and discriminate strongly in community samples between individuals with and without interview-identified disorder.

References
Kessler, R.C., Andrews, G., Colpe, L.J., Hiripi, E., Mroczek, D.K., Normand, S.L.T., Walters, E.E., & Zaslavsky, A.M. (2002). Short screening scales to monitor population prevalences and trends in non-specific psychological distress. Psychological Medicine, 32, 959-976.

Kessler, R.C., Barker, P.R., Colpe, L.J., Epstein, J.F., Gfroerer, J.C., Hiripi, E., Howes, M.J., Normand, S.L.T., Manderscheid, R.W., Walters, E.E., & Zaslavsky, A.M. (2003). Screening for serious mental illness in the general population. Archives of General Psychiatry, 60, 184-189.

Malaise Inventory

The Malaise Inventory is a self-report measure of psychological distress, including 24 yes/no items (total score range 0-24). It was developed for use in the Isle of Wight epidemiological studies of the 1960s, and was subsequently used (in full, or in an abbreviated 9 item version) in the adolescent/adult sweeps of the 1958 and 1970 British birth cohort studies, and the first sweep of the Millennium Cohort Study. The items primarily tap symptoms of depression and anxiety, but also include some related somatic symptoms. The Malaise Inventory shows acceptable internal reliability, and the full scale shows good validity with respect to interview-assessed major depressive disorder. Scores of >=5 on the 15-item psychological sub-scale, or >=3 on the abbreviated 9-item version, have been taken to reflect clinically significant difficulties.

Reference
Rodgers, B., Pickles, A., Power, C., Collishaw, S. & Maughan, B. (1999). Validity of the Malaise Inventory in general population samples. Social Psychiatry and Psychiatric Epidemiology, 34, 333-341.

Mood and Feelings Questionnaire (MFQ, SMFQ)

The Mood and Feelings Questionnaire (MFQ, 33 items) and the Short Mood and Feelings Questionnaire (SMFQ, 13 items) are brief measures of recent (past 2 weeks) depressive symptomatology. The MFQ/SMFQ show good psychometric properties and correlate highly with indicators of clinical depression. Each instrument comes in 3 versions (child self-report, adult self-report, and parent report about a child). Items are rated on 3-point scales (not true/sometimes/true) coded 0-2, and summed to provide a total score (range 0-26 for the SMFQ). The authors do not recommend a particular cut-point for identifying ‘high’ scores, arguing that the most appropriate cut-point will vary with the purpose of the study. Further details are available from the MFQ web-page.

Reference
Angold, A., Costello, E. J., Messer, S. C., Pickles, A., Winder, F., & Silver, D. (1995). The development of a short questionnaire for use in epidemiological studies of depression in children and adolescents. International Journal of Methods in Psychiatric Research, 5, 237–249.

Patient Health Questionnaire Depression Scale (PHQ)

The Patient Health Questionnaire Depression Scale (PHQ) is available in 3 versions: PHQ-9, PHQ-8 and PHQ-2.

PHQ-9

The PHQ-9 includes the full 9-item Depression Module of the self-report Patient Health Questionnaire, designed to screen for depression symptoms in the past two weeks. The items reflect the 9 DSM-IV criteria for depressive disorders. Each item is rated on a 4-point scale (scored 0-3, total score range 0-27), reflecting the frequency with which symptoms are experienced. The PHQ-9 has been extensively evaluated and found to be valid as both a severity and a diagnostic measure in both patient and community samples. A cut-off score of >=10 has been found to maximize combined sensitivity and specificity overall, and for subgroups, by comparison with semi-structured diagnostic interview assessments of depression. Scores of 10 or higher are commonly used to identify individuals with depression.

PHQ-8

The PHQ-8 includes the first 8 items of the PHQ-9, but omits Item 9 (which asks about thoughts of death and self-harm). The PHQ-8 correlates highly with the PHQ-9, and the cut-offs that maximize combined sensitivity and specificity by comparison with interview-based measures of depression are the same. As with the PHQ-9, a cut-off score of >=10 is used to screen for major depression.

PHQ-2

The PHQ-2 is a brief 2-item ‘first step’ screen for depressed mood and anhedonia in the past 2 weeks. The 2 items are each scored 0-3 (total score 0-6). A score of 3 or more is suggested as the optimal cut-point for identifying possible cases; a more detailed evaluation would then be needed to determine whether individuals meet criteria for a diagnosis of depression.
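The relationship between the PHQ versions can be illustrated with a short sketch. It assumes the nine PHQ-9 item scores (0-3) are supplied in questionnaire order, so that the PHQ-8 total is obtained by dropping the final item; the function names and output keys are hypothetical, and the cut-points are those quoted above.

```python
def phq9_and_phq8(item_scores, cut_point=10):
    """Return PHQ-9 and PHQ-8 totals and screening flags from nine items."""
    if len(item_scores) != 9 or any(not 0 <= s <= 3 for s in item_scores):
        raise ValueError("PHQ-9 expects nine items scored 0-3")
    phq9 = sum(item_scores)
    phq8 = sum(item_scores[:8])  # omits Item 9 (thoughts of death and self-harm)
    return {
        "phq9": phq9,
        "phq8": phq8,
        "phq9_positive": phq9 >= cut_point,
        "phq8_positive": phq8 >= cut_point,
    }


def phq2_possible_case(item_scores, cut_point=3):
    """True if the two PHQ-2 items (each 0-3) sum to the cut-point or more."""
    if len(item_scores) != 2 or any(not 0 <= s <= 3 for s in item_scores):
        raise ValueError("PHQ-2 expects two items scored 0-3")
    return sum(item_scores) >= cut_point
```

Because the PHQ-8 has one fewer item, a respondent can screen positive on the PHQ-9 and negative on the PHQ-8 at the same cut-point when Item 9 contributes to the total.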

References
Kroenke, K., Spitzer, R.L. & Williams, J.B.W. (2001). The PHQ-9: validity of a brief depression severity measure. Journal of General Internal Medicine, 16, 606–613.

Kroenke, K., Strine, T.W., Spitzer, R.L., Williams, J.B., Berry, J.T. & Mokdad, A.H. (2009). The PHQ-8 as a measure of current depression in the general population. Journal of Affective Disorders, 114, 163–173.

Wu, Y., Levis, B., Riehm, K.E., Saadat, N., Levis, A.W., Azar, M. …Thombs, B.T. (2020). Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: a systematic review and individual participant data meta-analysis. Psychological Medicine, 50, 1368–1380.

Kroenke, K., Spitzer, R.L., Williams, J.B. (2003). The Patient Health Questionnaire-2: Validity of a two-item depression screener. Medical Care, 41, 1284–92.

PTSD Checklist (PCL)

The various versions of the PCL are self-report scales assessing symptoms of Posttraumatic Stress Disorder for DSM-IV and DSM-5.

PCL-C (Civilian), PCL-M (Military) and PCL-S (Specific):

The PCL-C, PCL-M and PCL-S are 17-item self-report scales assessing DSM-IV symptoms of posttraumatic stress disorder in the past month. The 3 versions (civilian, military and specific) vary slightly in the instructions and wording of the phrase referring to the index event. Symptoms are rated on 1-5 scales. The PCL has excellent internal consistency and adequate re-test reliability.

PTSD Checklist (PCL-5)

The PCL-5 is a 20-item self-report measure that assesses the DSM-5 symptoms of PTSD. Symptoms are rated on 0-4 scales, and versions are available for scoring symptoms experienced in the past week or the past month. It can be used as a screen for PTSD, or where necessary, to make a provisional PTSD diagnosis.

Note: the wording of the PCL-5 items reflects both changes to symptoms included in the DSM-IV versions of the checklist and the addition of new symptoms in DSM-5. The change in the rating scale (from 1-5 to 0-4), combined with the increase from 17 to 20 items, means that PCL-5 scores are not comparable with scores on the DSM-IV versions of the PCL, and the two cannot be used interchangeably.
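The arithmetic behind this note can be made explicit. The short sketch below (with a hypothetical helper name) derives the possible total score range for each version from the item counts and rating scales given above, assuming totals are simple sums of item ratings.

```python
def total_range(n_items, item_min, item_max):
    """Minimum and maximum possible summed score for a scale."""
    return (n_items * item_min, n_items * item_max)


# DSM-IV versions (PCL-C, PCL-M, PCL-S): 17 items rated 1-5.
pcl_dsm4_range = total_range(17, 1, 5)

# PCL-5: 20 items rated 0-4.
pcl5_range = total_range(20, 0, 4)
```

Under these assumptions the DSM-IV versions run from 17 to 85 and the PCL-5 from 0 to 80, so the same numeric total does not carry the same meaning on the two checklists.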

References
Weathers, F. W., Litz, B. T., Huska, J. A., & Keane, T. M. (1994). PTSD Checklist—Civilian version. Boston National Center for PTSD, Behavioral Science Division.

Blevins, C. A., Weathers, F. W., Davis, M. T., Witte, T. K., & Domino, J. L. (2015). The Posttraumatic Stress Disorder Checklist for DSM-5 (PCL-5): Development and initial psychometric evaluation. Journal of Traumatic Stress, 28, 489-498.

Problem Gambling Severity Index (PGSI)

The Problem Gambling Severity Index (PGSI) is a 9-item screen for problem gambling, and includes items that reflect both gambling behaviours and gambling-related consequences. Items reflect severity of gambling in the past 12 months, and are scored 0-3 (total score range 0-27). A score of 0 indicates recreational gambling; 1–2=low risk gambling; 3–7=moderate risk gambling; and scores of 8–27 (maximum) indicate problem gambling. The PGSI has been reported to show high internal reliability, uni-dimensionality, and good item-response characteristics.
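The PGSI bands described above can be applied with a simple lookup on the total score. This is a sketch only: the function name is hypothetical, items are assumed to be already coded 0-3, and the band labels are taken from the text.

```python
from bisect import bisect_right

# Lower bounds of the second, third and fourth bands (1-2, 3-7, 8-27).
_LOWER_BOUNDS = [1, 3, 8]
_LABELS = [
    "recreational gambling",   # total of 0
    "low risk gambling",       # 1-2
    "moderate risk gambling",  # 3-7
    "problem gambling",        # 8-27
]


def pgsi_category(item_scores):
    """Return (total, band label) for nine PGSI items scored 0-3."""
    if len(item_scores) != 9 or any(not 0 <= s <= 3 for s in item_scores):
        raise ValueError("PGSI expects nine items scored 0-3")
    total = sum(item_scores)
    return total, _LABELS[bisect_right(_LOWER_BOUNDS, total)]
```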

Reference
Ferris, J., & Wynne, H. (2001). The Canadian Problem Gambling Index: Final report. Ottawa, Canadian Centre on Substance Abuse.

Revised Children’s Anxiety and Depression Scale (RCADS)

The Revised Children’s Anxiety and Depression Scale (RCADS) is a 47-item scale designed to correspond to dimensions of several anxiety disorders in DSM-IV, along with major depression. Versions are available for self-report by children and adolescents (ages 8-18 years) and parent/carer report (RCADS-P). Items are scored 0-3, and can be summed to generate sub-scales relating to separation anxiety, social phobia, generalized anxiety disorder, panic disorder, obsessive compulsive disorder and low mood. Scoring guides are also available to generate T scores. The RCADS is available in a variety of languages, and has good psychometric properties.

Reference
Chorpita, B.F., Yim, L.M., Moffitt, C.E., Unemoto, L.A., & Francis, S.E. (2000). Assessment of symptoms of DSM-IV anxiety and depression in children: A Revised Child Anxiety and Depression Scale. Behaviour Research and Therapy, 38, 835-855.

Short-Form Health Survey (SF-36, SF-12)

The SF-36 is a 36 item self-report measure of health-related quality of life in the past 4 weeks. It assesses eight health concepts, which include general mental health (psychological distress and well-being) and limitations in usual role activities because of emotional problems. Scoring algorithms are used to generate sub-scales (each scored from 0 [low] to 100 [high]), along with a mental component score (MCS); many study data-sets include pre-derived variables for these scores. The SF-36 has been widely validated for use in clinical practice, policy evaluations and population surveys, and with different population sub-groups.

Short-Form Health Survey (SF-12)

The SF-12 is a shortened form of the SF-36, designed for use in population samples as a multi-factorial measure of health-related quality of life. It assesses general health, functional limitations, and mood and anxiety symptoms in the past 4 weeks. In the standard scoring, item weights are used to derive two orthogonal factors reflecting physical and mental health. The Mental Health Component Summary (MCS-12) has been shown to be a useful screening instrument for depression and anxiety disorders in community samples. There are no universally accepted cut-points on the MCS-12. An alternative scoring method, based on an oblique factor solution, has been used to create the RAND Mental Health Component scale (RAND MHC-12), which performs similarly to the MCS-12.

References
Ware, J.J. & Sherbourne, C.D. (1992). The MOS 36-item short-form health survey (SF-36). 1. Conceptual framework and item selection. Medical Care, 30, 473-483.

Ware, J., Kosinski, M., & Keller, S. (1996). A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Medical Care, 34, 220-233.

Hays, R.D. (1998). RAND-36 Health Status Inventory. The Psychological Corporation, San Antonio.

Strengths and Difficulties Questionnaire (SDQ)

The Strengths and Difficulties Questionnaire (part of the DAWBA family of mental health measures) is a brief screening questionnaire for emotional and behavioural difficulties and prosocial behaviours in 3-16 year olds. Versions are available for completion by parents, teachers, and 11-16 year-old young people.

The SDQ rates behaviour in the past 6 months, and has 5 sub-scales: emotional symptoms, conduct problems, hyperactivity/inattention, peer relationship problems, prosocial behaviour. Each sub-scale includes 5 items rated 0 (not true), 1 (somewhat true) or 2 (certainly true). Scores from the emotional, conduct, hyperactivity and peer problem sub-scales can be summed to give a Total Difficulties score (range 0-40). Some versions of the SDQ also include an Impact Supplement, asking whether the respondent thinks the young person has a problem, and if so, its chronicity, and related distress, social impairment, and burden to others.
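As an illustration of how the sub-scale and Total Difficulties scores relate, the sketch below sums five items per sub-scale and then adds the four difficulty sub-scales, leaving out prosocial behaviour. The dictionary keys and function name are hypothetical, and items are assumed to be already coded 0-2 in the scored direction.

```python
DIFFICULTY_SCALES = ("emotional", "conduct", "hyperactivity", "peer")


def sdq_scores(items_by_scale):
    """Return sub-scale totals (0-10 each) plus Total Difficulties (0-40).

    items_by_scale: dict mapping each of the five sub-scale names to a
        list of five item scores (0, 1 or 2).
    """
    scores = {}
    for name, items in items_by_scale.items():
        if len(items) != 5 or any(s not in (0, 1, 2) for s in items):
            raise ValueError(f"{name}: expected five items scored 0-2")
        scores[name] = sum(items)
    # Prosocial behaviour is a strength, so it is not part of the total.
    scores["total_difficulties"] = sum(scores[n] for n in DIFFICULTY_SCALES)
    return scores
```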

The SDQ has been widely validated in many countries and population sub-groups.

Full details of the SDQ (including related publications and normative data from a number of countries) are available on the web-site sdqinfo.com

Reference
Goodman, R. (1997). The Strengths and Difficulties Questionnaire: A Research Note. Journal of Child Psychology and Psychiatry, 38, 581-586.

Warwick-Edinburgh Mental Wellbeing Scale (WEMWBS, SWEMWBS)

The Warwick-Edinburgh Mental Well-being Scale (WEMWBS) was developed to enable the monitoring of adolescent and adult mental wellbeing in the general population. The items are all worded positively and cover both feeling (happiness, pleasure, and enjoyment) and ‘functioning’ aspects of mental wellbeing (purpose, meaning, and fulfilment) experienced in the past 2 weeks. The scale is unifactorial, has good reliability, and has been extensively used in a wide range of surveys and population sub-groups in the UK and internationally.

The full WEMWBS scale includes 14 items, each with 5 response categories; item scores are summed to provide a total score (range 14–70). The 7-item SWEMWBS scale is a shortened version with a preponderance of functioning items. Raw scores for the shortened version need to be transformed using a conversion table (see Stewart-Brown et al, 2009). UK population norms for the 14 and 7 item versions have been generated from responses to the Health Survey for England, 2011, and are available at: WEMWBS and SWEMWBS population norms Health Survey for England 2011. Further normative data for the 7-item version are reported in Fat et al (2017).

References
Tennant, R., Hiller, L., Fishwick, R., Platt, S., Joseph, S., Weich, S., Parkinson, J., Secker, J., & Stewart-Brown, S. (2007). The Warwick-Edinburgh mental well-being scale (WEMWBS): development and UK validation. Health and Quality of Life Outcomes, 5, Article 63.

Stewart-Brown, S., Tennant, A., Tennant, R., Platt, S., Parkinson, J., & Weich, S. (2009). Internal construct validity of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS): a Rasch analysis using data from the Scottish Health Education Population Survey. Health and Quality of Life Outcomes, 7, Article 15.

Fat, L.N., Scholes, S., Boniface, S., Mindell, J., & Stewart-Brown, S. (2017). Evaluating and establishing national norms for mental wellbeing using the short Warwick–Edinburgh Mental Well-being Scale (SWEMWBS): findings from the Health Survey for England. Quality of Life Research, 26, 1129–1144.

COMMON MEASURES OF PERSONALITY AND TEMPERAMENT

Big Five Inventory (BFI)

The Big Five Inventory (BFI) is a 44-item self-report scale designed to assess the domains of the Five Factor Model of personality: Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. Items consist of short phrases (e.g., ‘is helpful and unselfish with others’, ‘does things efficiently’), and participants rate each item on a 5-point scale ranging from 1 (disagree strongly) to 5 (agree strongly); scale scores reflect the participant’s mean item response to the items on each scale. The BFI shows good psychometric properties, and significant associations have been reported between scores for the Big Five traits on the BFI and analogous factors on other established personality inventories.
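Because BFI scale scores are item means rather than sums, a small sketch may help. The item identifiers and the item-to-scale mapping below are purely hypothetical placeholders (the real key comes from the BFI documentation), and any reverse-keyed items are assumed to have been recoded before this step.

```python
from statistics import mean


def bfi_scale_scores(responses, scale_items):
    """Mean item response (1-5) for each scale.

    responses: dict mapping item id to a rating from 1 to 5.
    scale_items: dict mapping scale name to the list of item ids it uses.
    """
    if any(not 1 <= r <= 5 for r in responses.values()):
        raise ValueError("ratings must be between 1 and 5")
    return {
        scale: mean([responses[item] for item in items])
        for scale, items in scale_items.items()
    }
```

Using means keeps scores on the original 1-5 metric, so scales with different numbers of items remain directly comparable.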

Reference

John, O. P., Donahue, E., & Kentle, R. L. (1991). The Big Five Inventory: Versions 4a and 54. Berkeley, CA: University of California, Institute of Personality Assessment and Research.

Children’s Behavior Questionnaire (CBQ)

The Children’s Behavior Questionnaire was developed to provide a differentiated caregiver-report assessment of temperament in children of 3 to 8 years of age. The standard form, developed by Mary Rothbart and colleagues, includes 195 items assessing domains including positive and negative emotion, motivation, activity level, and attention, based on the constructs of temperament in infancy assessed by the IBQ. Subsequently short (CBQ-SF, 94 items, 15 scales) and very short (CBQ-VSF, 36 items, three broad scales) versions have been developed. Parents/caretakers rate the extent to which the scale descriptors are true of their child on 7-point scales ranging from “Extremely untrue” to “Extremely true”. Factor analyses of the CBQ in US samples have typically identified three main factors: surgency/extraversion, negative affectivity, and effortful control.

Reference

Putnam, S.P. & Rothbart, M.K. (2006). Development of short and very short forms of the Children’s Behavior Questionnaire. Journal of Personality Assessment, 87, 102-112.

Early Childhood Behavior Questionnaire (ECBQ)

The Early Childhood Behavior Questionnaire (ECBQ) was designed to provide a comprehensive and detailed assessment of temperament in toddlers aged 1-3 years old. It includes 201 items and 18 scales, described as predominantly ‘downward extensions’ of the dimensions of the Children’s Behavior Questionnaire, and ‘upward extensions’ of the dimensions of the Infant Behavior Questionnaire-Revised. Parents report on the frequency of specific behaviours (e.g., how often did your child ‘sit quietly and watch,’ ‘become sadly tearful’) in frequently occurring contexts (e.g., ‘When told no’) on a 7-point scale ranging from “Never” to “Always”.

Reference

Putnam, S. P., Gartstein, M. A., & Rothbart, M. K. (2006). Measurement of fine-grained aspects of toddler temperament: The early childhood behavior questionnaire. Infant Behavior and Development, 29, 386-401.

EAS Temperament Survey (EAS)

The EAS Temperament Survey is a 20-item parent-, teacher- or self-report questionnaire designed to assess 4 main dimensions of temperament: Emotionality: the tendency to become aroused easily and intensely; Activity: preferred levels of activity and speed of action; Sociability: the tendency to prefer the presence of others to being alone; and Shyness: the tendency to be inhibited and awkward in new social situations. Five items (e.g., ‘s/he is always on the go;’ ‘s/he tends to be shy’) index each dimension of temperament. Items are rated on 5-point scales ranging from 1 (not characteristic of your child) to 5 (very characteristic or typical of your child), and scores are summed for items reflecting each temperament dimension. The EAS was originally recommended for children aged 1-9 years but has also been used in older childhood and adolescent samples. Longitudinal studies report high stability on the four temperament dimensions across ages, along with some changes in mean scores by age.

Reference

Buss, A. H., & Plomin, R. (1984). Temperament: Early developing personality traits. Hillsdale, NJ: Lawrence Erlbaum.

Eysenck Personality Questionnaire (EPQ)

Hans Eysenck conceptualized personality in terms of three main dimensions: extraversion/introversion, neuroticism/stability, and psychoticism/socialisation. The original Eysenck Personality Questionnaire (EPQ) included items assessing each of these dimensions, along with a further series of items forming a lie/social desirability scale. Items consist of short phrases and are answered with binary yes/no responses. A revised version (EPQ-R) was later published; although the full version includes 100 items, a short 48-item version, with 12 items for each of the four scales, has been widely used.

Reference

Eysenck, H. J., & Eysenck, S. B. G. (1975). Manual of the Eysenck Personality Questionnaire. London: Hodder and Stoughton.

Infant Behavior Questionnaire-Revised (IBQ-R)

The Infant Behavior Questionnaire is a measure of temperament for infants between 3 and 12 months of age. The original version was developed by the leading temperament scholar Mary Rothbart. The revised version (IBQ-R) includes 191 items and assesses 14 facets of temperament; a 91-item Short Form was subsequently derived from it. The most recent development is a Very Short Form of 37 items, designed to index three broad components: negative emotionality, positive affectivity/surgency, and orienting/regulatory capacity, which are argued to be similar to three of the Big Five personality traits in adulthood (neuroticism, extraversion, and conscientiousness). Parents/caretakers rate how frequently their infants displayed specific temperament-related behaviours in the past week, using 5- or 7-point scales ranging from “Never” to “Always”. The IBQ-R shows good internal reliability and has been widely translated.

Reference

Putnam, S. P., Helbig, A. L., Gartstein, M. A., Rothbart, M. K., & Leerkes, E. (2014). Development and assessment of short and very short forms of the Infant Behavior Questionnaire–Revised. Journal of Personality Assessment, 96, 445–458.

International Personality Item Pool (IPIP)

The International Personality Item Pool (IPIP) was developed to provide a large public domain pool of personality items for use in research. Numerous scales, reflecting different models of personality, can be constructed from IPIP items. One widely-used subset of 50 items assesses the Big Five personality traits (extraversion, agreeableness, conscientiousness, emotional stability, and openness to experience). Items consist of short phrases (e.g., ‘get stressed out easily’, ‘pay attention to details’), and respondents score them on 5-point scales ranging from “Very like me” to “Not at all like me”.

Reference

Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The international personality item pool and the future of public domain personality measures. Journal of Research in Personality, 40, 84–96.

Midlife Development Inventory (MIDI)

The Midlife Development Inventory (MIDI) uses 26 adjectives to assess the Big Five personality traits: Openness to Experience (e.g., ‘imaginative’), Conscientiousness (e.g., ‘organized’), Extraversion (e.g., ‘outgoing’), Agreeableness (e.g., ‘helpful’), and Neuroticism (e.g., ‘moody’). Participants rate how well each adjective describes them on 4-point Likert-type scales ranging from 1 (a lot) to 4 (not at all).

Reference

Lachman, M.E. & Weaver, S.L. (1997). The midlife development inventory (MIDI) personality scales: scale construction and scoring (unpublished technical report). Waltham: Brandeis University.