In a new study, researchers compared an AI-powered therapy app (Woebot) with three other interventions: a simple, rule-based conversational program from the 1960s (ELIZA), a journaling app (Daylio), and basic psychoeducation (which they treated as the control condition). They found no differences between the groups in how much mental health improved.
“The main analysis failed to detect differences between any of the four treatment conditions in improving symptoms of depression, anxiety, and positive/negative affect,” the researchers write.
The study was conducted by Laura Eltahawy, Nils Myszkowski, and Leora Trub at Pace University, and Todd Essig at the William Alanson White Institute of Psychiatry, Psychoanalysis and Psychology. The article was published in Computers in Human Behavior: Artificial Humans.
The researchers first recruited 120 college/graduate students (ages 18-29) on Facebook who self-identified as having anxiety or depression. Many dropped out of the study or did not fully complete the measures. The final tally included 65 participants: 18 in the Woebot group, 18 in the ELIZA group, 15 in the Daylio journaling group, and 14 in the psychoeducation group.
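To put the dropout in perspective, the arithmetic can be sketched in a few lines of Python. This uses only the counts reported above; the variable names are illustrative, and "attrition" here lumps together participants who dropped out and those who did not fully complete the measures, since the article does not separate the two.

```python
# Enrollment and completion counts as reported in the article.
recruited = 120
completers = {"Woebot": 18, "ELIZA": 18, "Daylio": 15, "Psychoeducation": 14}

# Participants included in the final analysis.
analyzed = sum(completers.values())

# Everyone recruited but not analyzed (dropouts plus incomplete measures).
dropped = recruited - analyzed
attrition_rate = dropped / recruited

print(f"Analyzed: {analyzed} of {recruited}")
print(f"Attrition: {dropped} participants ({attrition_rate:.1%})")

# Share of the final sample contributed by each group.
for group, n in completers.items():
    print(f"{group}: {n} ({n / analyzed:.1%} of the final sample)")
```

Roughly 46% of those recruited are missing from the final analysis. The article does not report how many of the original 120 were assigned to each condition, so per-group attrition cannot be worked out from these figures.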
The researchers used the same measures as in a previous Woebot study: the GAD-7 for anxiety, the PHQ-9 for depression, and the PANAS to measure positive and negative affect. The participants took these measures at the beginning of the study and after two weeks.
The researchers found that participants, on average, experienced an improvement in anxiety, depression, and affect over the course of the two-week study. However, there was no difference between the groups: Woebot was no better or worse than simple psychoeducation, Daylio, or ELIZA.
In a secondary analysis of more specific outcomes, the researchers assessed the change over time within each group. They found that ELIZA and Daylio both resulted in more “robust” outcomes than Woebot, while psychoeducation had the least “robust” outcomes.
Specifically, users of ELIZA experienced statistically significant improvements in all four outcomes (depression, anxiety, positive affect, and negative affect) over time; users of Daylio experienced improvement in depression and negative affect; users of Woebot experienced improvement in anxiety; and those who received psychoeducation did not improve on any measure. However, these differences between groups were extremely small: in the main analysis, no group differed from the others to a statistically significant degree.
Woebot, developed in 2017, is a publicly available therapy app driven by artificial intelligence. Its creators claim that it can deliver cognitive-behavioral therapy (CBT). It is designed to “check in” with users every day and provide guided exercises that are adapted from CBT worksheets. ELIZA, developed in the 1960s, was a proof-of-concept program that used ideas from humanistic therapist Carl Rogers to imitate empathy by rephrasing what the user wrote. Daylio is a publicly available app that encourages users to keep a daily interactive journal. Psychoeducation involved reading educational materials about depression.
These results raise the question of whether the public is being duped by AI hype. This study did not find that an AI-powered CBT bot led to better outcomes than the first conversational program from the 1960s, or even psychoeducation. It provides reason to conclude that the claims around mental health apps are not evidence-based. Yet, despite concerns around privacy and coercion, these apps continue to grow in popularity.
Eltahawy and colleagues argue that future research must demonstrate that chatbots are at least as good as existing psychotherapies (such as CBT delivered by a human therapist) before they are offered as supposedly “effective” interventions.
They write, “Using a no-treatment control group study design to market clinical services should no longer be acceptable nor serve as an acceptable precursor to marketing a chatbot as functionally equivalent to psychotherapy.”
Eltahawy, L., Essig, T., Myszkowski, N., & Trub, L. (2023). Can robots do therapy?: Examining the efficacy of a CBT bot in comparison with other behavioral intervention technologies in alleviating mental health symptoms. Computers in Human Behavior: Artificial Humans, 100035.