New research published in the Canadian Medical Association Journal finds that the common method of estimating depression prevalence through self-report screening questionnaires is unreliable. Dr. Brett Thombs, a professor of psychiatry at McGill University, and his colleagues demonstrate that these inaccurate measurements have produced inflated depression estimates, contributing to overdiagnosis and the misuse of healthcare resources.
“Screening tests for mental health and other types of screening questionnaires are not designed to make diagnostic classifications, and they are not calibrated to estimate prevalence,” the authors write. “Using them in this way distorts prevalence estimates, often substantially, and does so disproportionately in low-prevalence populations.”
Diverging from current rhetoric identifying depression as a “global health burden,” Thombs et al. raise concerns about current approaches to measuring and understanding depression across diverse populations. “There are important implications for how research should be conducted and reported,” they write. “First, prevalence estimates should be based on appropriate methods.”
The researchers reviewed existing studies estimating the prevalence of depression in the general population. They found that the prevalence of mental health disorders was based on screening questionnaires in 17 of the 19 identified studies, as well as in a recent meta-analysis. The authors note that this is likely because questionnaires require fewer resources and are more cost-efficient than hiring trained personnel to administer diagnostic interviews to large population samples.
“These studies misrepresent the actual rate of depression, sometimes dramatically, which makes it very difficult to direct the right resources to problems faced by patients,” Thombs said in a press release.
Though questionnaires assess depressive symptoms much as diagnostic interviews do, they cannot evaluate functional impairment or rule out outside influences that can produce similar symptoms. Once a questionnaire is administered, researchers set a cut-off threshold, classifying patients as either likely or unlikely to meet criteria for depression based on their scores. This method is problematic, as past research shows that the percentage of patients above the cut-off threshold usually exceeds the true prevalence.
Further, false-positive screens inflate prevalence estimates and are only minimally offset by co-occurring false-negative screens. The authors add that such calculations typically ignore “sensitivity and specificity estimates or potential heterogeneity across samples,” factors that could further exacerbate the problem.
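The inflation the authors describe can be seen with a little arithmetic: the screen-positive rate mixes true positives with false positives from the much larger non-depressed group. A rough sketch in Python — the sensitivity and specificity values here are illustrative assumptions, not figures from the study:

```python
# Illustrative sketch: why counting scores above a cut-off overestimates
# prevalence, especially in low-prevalence populations. The 0.85 values
# below are assumed for illustration only.

def observed_positive_rate(prevalence, sensitivity, specificity):
    """Expected fraction of people scoring above the cut-off."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives + false_positives

for p in (0.05, 0.10, 0.25):
    obs = observed_positive_rate(p, sensitivity=0.85, specificity=0.85)
    print(f"true prevalence {p:.0%} -> screen-positive rate {obs:.1%}")
```

With these assumed values, a population in which 5% truly meet criteria yields a screen-positive rate of 18.5% — the distortion is largest exactly where true prevalence is lowest, as the authors note.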
Thombs and colleagues identify three alternative methods for measuring prevalence rates of depression, including: (1) “Back calculation,” described as adjusting the percentage above a cut-off threshold based on sensitivity and specificity; (2) “Prevalence matching,” which involves a large study that sets a cut-off threshold for a sample population using a screening tool and diagnostic interview; and (3) “Two-stage prevalence estimation,” where step 1 includes administering a screening questionnaire to all patients and step 2 engages a validated diagnostic interview for all patients with positive screens and only a selected portion of patients with negative screens.
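The “back calculation” adjustment in method (1) corresponds to what statisticians call the Rogan–Gladen estimator; this is one standard formulation of such an adjustment, not necessarily the exact method the authors propose. A minimal sketch, assuming sensitivity and specificity are known:

```python
def back_calculate_prevalence(observed_rate, sensitivity, specificity):
    """Rogan-Gladen estimator: recover true prevalence from the
    screen-positive rate, given known sensitivity and specificity."""
    adjusted = (observed_rate + specificity - 1) / (sensitivity + specificity - 1)
    return min(max(adjusted, 0.0), 1.0)  # clamp to the valid range [0, 1]

# With 85% sensitivity and specificity (assumed values), an 18.5%
# screen-positive rate back-calculates to a 5% true prevalence.
print(back_calculate_prevalence(0.185, 0.85, 0.85))
```

In practice this correction is only as good as the sensitivity and specificity estimates fed into it — which is part of why the authors consider it not yet feasible.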
The researchers acknowledge that implementing back calculation and prevalence matching is not yet feasible.
“When efficient methods for estimating the prevalence of depression are needed,” the authors write, “two-stage estimation of prevalence presents a viable option that can reduce resource use substantially and generate unbiased, reasonably precise prevalence estimates.”
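The two-stage design described above can be sketched as a weighted average: the interview-confirmed case rate within each screening stratum is weighted by that stratum’s share of the full sample. A minimal illustration with hypothetical numbers — a real estimate would also need sampling weights, variance corrections, and a design for selecting the screen-negative subsample:

```python
def two_stage_prevalence(n_pos, n_neg,
                         interviewed_pos_cases, interviewed_pos_total,
                         interviewed_neg_cases, interviewed_neg_total):
    """Two-stage estimate: weight the interview-confirmed case rates among
    screen-positives and the sampled screen-negatives by each stratum's
    share of the screened sample."""
    n = n_pos + n_neg
    rate_pos = interviewed_pos_cases / interviewed_pos_total
    rate_neg = interviewed_neg_cases / interviewed_neg_total
    return (n_pos / n) * rate_pos + (n_neg / n) * rate_neg

# Hypothetical numbers: 1,000 people screened; 200 screen positive and all
# are interviewed, with 60 confirmed; of the 800 screen negatives, 100 are
# sampled for interviews and 2 are confirmed.
est = two_stage_prevalence(n_pos=200, n_neg=800,
                           interviewed_pos_cases=60, interviewed_pos_total=200,
                           interviewed_neg_cases=2, interviewed_neg_total=100)
print(f"estimated prevalence: {est:.1%}")  # 0.2*0.30 + 0.8*0.02 = 7.6%
```

Note how the resource saving arises: only 300 diagnostic interviews are conducted instead of 1,000, yet screen-negatives still contribute to the estimate.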
Finally, the researchers provide guidelines for future studies intending to produce prevalence estimates: (1) Use appropriate methods; (2) Base systematic reviews and meta-analyses on results from validated diagnostic interviews; and (3) When comparing samples and describing mental health based on screening tools, use continuous scores rather than faulty cut-off dichotomies.
As research inevitably extends to heterogeneous populations and aspires to generalized conclusions, appropriate research methods must be scrutinized, as Thombs and colleagues conclude:
“The common practice of reporting the percentage of patients with scores above cut-off thresholds in screening questionnaires for depression as disorder prevalence substantially overestimates prevalence and misinforms users of epidemiological evidence.”
Thombs, B. D., Kwakkenbos, L., Levis, A. W., & Benedetti, A. (2018). Addressing overestimation of the prevalence of depression based on self-report screening questionnaires. Canadian Medical Association Journal. DOI: 10.1503/cmaj.170691
There is a 100% chance that 100% of the world’s population is mentally ill. This is science.
Lots of people might be very unhappy for real reasons or no ‘real’ reason – but they’re not suffering from an “illness”.
When I woke up this morning I was “very unhappy,” but I’m not unhappy now. I see things differently now than I did when I woke this morning.
How can you “overestimate” the prevalence of “depression” when there is no way to identify the “actual rate of depression,” since there is not an even close to reasonably accurate way to measure what “depression” really is? It’s like estimating the “accurate rate of anger” or “accurate rate of itchiness.” There is no such thing as an accurate rate of “depression.”
Well, actually, you could get a more accurate rate by looking at the rates at which medication was prescribed for individuals with depression from 1960–1985.
Then you could look at the rates from 1990–2018. Third, you could calculate the mean, median, and mode for the two groups.
There’s probably a whole lot of societal reasons the rate increased dramatically. Hahahahaa.
Rate of prescription is not the same as rate of “depression.” If the definition of depression is arbitrary, the rate of prescriptions written will be, too. The question is, given the subjective and frankly arbitrary definition of “Major Depression” in the DSM, how can any rate calculation be anything but subjective and arbitrary? We already know that rates of concordance on diagnosis for any of the DSM “mental disorders” are mediocre to poor. So talking about “rates of depression” is just not meaningful.
When I was a professor, the introductory psychology students would complete a series of online screening questionnaires at the beginning of the semester for course credit. Their responses were sometimes used by researchers to select students who scored above a certain threshold on a particular questionnaire and invite them to participate in a study. For example, students who scored above the “clinical cutoff” on a depression questionnaire might be invited to take part in a depression study a few weeks later.
But there was a recurring, significant problem for the researchers. Many who scored above the clinical cutoff during the initial screening scored in the normal range when they showed up for the study a few weeks later. Maybe half were no longer eligible to participate. Their initially extreme scores had normalized. Perhaps they were having a bad day when they completed the screening measure. Perhaps they were distressed due to a transient stressor that had improved. In any case, it’s a well-known empirical fact that extreme scores tend to be less extreme when measured later. It’s called regression to the mean.
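The regression-to-the-mean effect this commenter describes is easy to demonstrate by simulation. A sketch assuming each student’s score is a stable component plus transient day-to-day noise — all the numbers here are invented for illustration, not taken from any study:

```python
import random

random.seed(0)

# Illustrative simulation of regression to the mean: select everyone above
# a cut-off at an initial screening, then re-test them a few weeks later.
# Distribution parameters and the cut-off are arbitrary assumptions.
N, CUTOFF = 10_000, 15
trait = [random.gauss(8, 3) for _ in range(N)]     # stable component
screen = [t + random.gauss(0, 4) for t in trait]   # screening-day score
retest = [t + random.gauss(0, 4) for t in trait]   # score weeks later

selected = [i for i in range(N) if screen[i] >= CUTOFF]
still_above = sum(1 for i in selected if retest[i] >= CUTOFF)
print(f"{still_above / len(selected):.0%} of screen-positives "
      f"still score above the cut-off at re-test")
```

Because selection favors people whose transient noise happened to be high on screening day, a large share of screen-positives fall back below the cut-off on re-test — mirroring the “maybe half were no longer eligible” pattern described above.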
Imagine if we took everyone who scored above a clinical cutoff score on a depression questionnaire at an initial screening, ignored what was happening in their life, concluded they were “mentally ill” based on their score, and “treated” them with “antidepressants”? How many people who would have otherwise naturally improved would be prevented from doing so? Perhaps the massive increase in chronic, disabling, “treatment-resistant” depression during the antidepressant era provides a clue?
Right you are, Brett. Arbitrary measurements lead to arbitrary results.