New research published in the Canadian Medical Association Journal finds that the common method of estimating depression prevalence through self-report screening questionnaires is unreliable. Dr. Brett Thombs, a professor of psychiatry at McGill University, and his colleagues demonstrate that inaccurate measurements have produced inflated depression estimates, contributing to overdiagnosis and the misuse of healthcare resources.
“Screening tests for mental health and other types of screening questionnaires are not designed to make diagnostic classifications, and they are not calibrated to estimate prevalence,” the authors write. “Using them in this way distorts prevalence estimates, often substantially, and does so disproportionately in low-prevalence populations.”
Diverging from current rhetoric identifying depression as a “global health burden,” Thombs et al. raise concerns about current approaches to measuring and understanding depression across diverse populations. “There are important implications for how research should be conducted and reported,” they write. “First, prevalence estimates should be based on appropriate methods.”
The researchers reviewed existing studies estimating the prevalence of depression within the general population. They found that the prevalence of mental health disorders was based on screening questionnaires in 17 of the 19 identified studies, as well as in a recent meta-analysis. The authors point out that this is likely because questionnaires require fewer resources and cost less than hiring trained personnel to administer diagnostic interviews to large population samples.
“These studies misrepresent the actual rate of depression, sometimes dramatically, which makes it very difficult to direct the right resources to problems faced by patients,” Thombs said in a press release.
Though questionnaires resemble diagnostic interviews in the way they assess depressive symptoms, they can neither assess functional impairment nor identify outside influences that can produce similar symptoms. Once a questionnaire has been administered, researchers apply a cut-off threshold, classifying patients as either likely or unlikely to meet criteria for depression based on their scores. This method is problematic, as past research shows that the percentage of patients above the cut-off threshold usually exceeds true prevalence.
Further, false-positive screens inflate prevalence estimates and are only minimally offset by false-negative screens. The authors add that these calculations do not account for “sensitivity and specificity estimates or potential heterogeneity across samples,” factors that could exacerbate the problem.
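The arithmetic behind this overshoot can be sketched in a few lines of Python. The sensitivity and specificity values below are hypothetical, chosen only for illustration and not taken from the study; the function name is likewise an assumption.

```python
def percent_above_cutoff(prevalence, sensitivity, specificity):
    """Expected share of a sample scoring above the screening cut-off."""
    # People with depression who screen positive.
    true_positives = sensitivity * prevalence
    # People without depression who nonetheless screen positive.
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives + false_positives


# Hypothetical accuracy figures, for illustration only.
SENSITIVITY = 0.85
SPECIFICITY = 0.85

for prevalence in (0.05, 0.10, 0.30):
    observed = percent_above_cutoff(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"true prevalence {prevalence:.0%}: "
          f"{observed:.1%} above cut-off "
          f"({observed / prevalence:.1f} times the true rate)")
```

Under these assumed values, a true prevalence of 5% produces roughly 18.5% of the sample above the cut-off, and the relative distortion shrinks as true prevalence rises, which is consistent with the authors' point that low-prevalence populations are affected disproportionately.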
Thombs and colleagues identify three alternative methods for measuring prevalence rates of depression: (1) “Back calculation,” described as adjusting the percentage above a cut-off threshold based on sensitivity and specificity; (2) “Prevalence matching,” which involves a large study that administers both a screening tool and a diagnostic interview to a sample population in order to set a cut-off threshold; and (3) “Two-stage prevalence estimation,” where step 1 involves administering a screening questionnaire to all patients and step 2 uses a validated diagnostic interview for all patients with positive screens and only a selected portion of patients with negative screens.
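As a rough illustration of back calculation, the sketch below applies one standard adjustment of this kind, often called the Rogan-Gladen estimator. Whether it matches the authors' exact procedure is an assumption, and the function name and input values are hypothetical.

```python
def back_calculate_prevalence(percent_above_cutoff, sensitivity, specificity):
    """Adjust the share above the cut-off for known screening accuracy."""
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        # The correction is undefined for a tool no better than chance.
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (percent_above_cutoff + specificity - 1.0) / denominator
    # Sampling noise can push the raw estimate outside [0, 1]; clamp it.
    return min(1.0, max(0.0, estimate))


# Hypothetical inputs: 18.5% above the cut-off, 85% sensitivity and specificity.
estimate = back_calculate_prevalence(0.185, sensitivity=0.85, specificity=0.85)
print(f"back-calculated prevalence: {estimate:.1%}")
```

The correction is only as good as the sensitivity and specificity values fed into it, so errors in those inputs carry straight through to the prevalence estimate.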
The researchers acknowledge that implementing back calculation and prevalence matching is not yet feasible.
“When efficient methods for estimating the prevalence of depression are needed, two-stage estimation of prevalence presents a viable option that can reduce resource use substantially and generate unbiased, reasonably precise prevalence estimates.”
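A minimal sketch of how a two-stage estimate could be computed follows. It assumes the interviewed screen-negative patients are a simple random sample of all screen-negative patients; the function name and the counts are hypothetical and do not come from the study.

```python
def two_stage_prevalence(n_screen_pos, n_screen_neg, cases_among_pos,
                         n_neg_interviewed, cases_among_neg_interviewed):
    """Estimate prevalence from a screen-then-interview design."""
    total = n_screen_pos + n_screen_neg
    if total <= 0:
        raise ValueError("no patients were screened")
    if n_screen_neg > 0 and n_neg_interviewed <= 0:
        raise ValueError("some screen-negative patients must be interviewed")
    # Every screen-positive patient is interviewed, so these cases are counted.
    # Among screen-negatives, the rate in the interviewed subsample is
    # extrapolated to the whole screen-negative group.
    if n_screen_neg > 0:
        rate_neg = cases_among_neg_interviewed / n_neg_interviewed
    else:
        rate_neg = 0.0
    estimated_cases = cases_among_pos + rate_neg * n_screen_neg
    return estimated_cases / total


# Hypothetical study: 1,000 screened, 200 positive (60 confirmed by interview),
# and 100 of the 800 screen-negatives interviewed (2 confirmed).
estimate = two_stage_prevalence(200, 800, 60, 100, 2)
print(f"two-stage estimate: {estimate:.1%} (vs. 20.0% above the cut-off)")
```

In this made-up example only 300 of 1,000 patients need a diagnostic interview, which illustrates where the resource savings described by the authors would come from.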
Finally, the researchers provide guidelines for future studies that aim to estimate prevalence: (1) use appropriate methods, (2) base systematic reviews and meta-analyses on results from validated diagnostic interviews, and (3) when comparing samples and mental health descriptions based on screening tools, use continuous scores rather than the faulty cut-off dichotomies.
As research extends to heterogeneous populations and aims to draw generalizable conclusions, its methods must be scrutinized, as Thombs and colleagues conclude:
“The common practice of reporting the percentage of patients with scores above cut-off thresholds in screening questionnaires for depression as disorder prevalence substantially overestimates prevalence and misinforms users of epidemiological evidence.”
Thombs, B. D., Kwakkenbos, L., Levis, A. W., & Benedetti, A. (2018). Addressing overestimation of the prevalence of depression based on self-report screening questionnaires. Canadian Medical Association Journal. DOI: 10.1503/cmaj.170691