Research at the MIT Language and Technology Lab examines nascent AI-enabled pattern recognition of vocal biomarkers for ‘mental illness’ for signs of systematic bias. MIT anthropologist Beth Semel worked with this research team to understand how voice analysis technologies sustain the U.S. mental health care system’s logic of capture and containment, and how this logic disproportionately harms marginalized groups and non-U.S. citizens.
Semel writes that these efforts to use technology to diagnose psychiatric disorders, based on the “tangled associations” of tone and inflection, threaten to promote a new “phrenology of the throat.” She adds:
“While the suggestion that AI might make these choices easier is seductive, the historical and ethnographic record demonstrates that automation in the name of efficacy tends to deepen, not mitigate, inequities that fall upon racialized, gendered, and colonial fault lines.”

Biological indicators of mental illness are a long-sought-after “final word” on diagnoses, promising to rid the mental health professions of the uncertainty and judgment calls involved in applying diagnostic categories to patients. This striving has led to recent efforts such as the Research Domain Criteria (RDoC) project, funded by the NIMH, which attempts to locate specific brain processes that produce dysfunction. In another example, mental health app developers are trying to ground psychiatric diagnoses in people’s patterns of online behavior.
Computational psychiatry resides on the edges of mental health care research. In efforts to find alternatives to the hypothesis-driven methods of diagnosis used by DSM adherents, these psychiatrists favor a data-driven approach more commonly found in computer science and engineering.
Computational psychiatric researchers believe that it is only a matter of time until enough observable data is collected on patients to find biological, etiological “keys” to accurate diagnoses. Such data may bear no relationship to conventional diagnostic criteria. The idea is simply to find new associations between behavior and the onset of ‘mental illness.’
The psychiatric researchers whose laboratory Semel observed treat speech as if it were another bio-behavioral indicator, like gait or response time to a stimulus. In the lab, engineers trained in signal processing are hard at work amassing speech data and applying mathematical analyses to glean information about the neuronal sources of spoken language.
“In theory, because vocal biomarkers index the faulty neural circuitry of mental illness, they are agnostic to language difference, speaker intentionality, and semantic, sociocultural meaning. Neurobiological essentialism and language universalism collide,” writes Semel.
Through fieldwork in the laboratories of computational psychiatric researchers, Semel examined the possible unintended effects of automating psychiatric screening. She notes that automating screening sounds like an effective way to democratize access to treatment but may actually deepen systemic biases that disproportionately harm members of marginalized groups in the screening process.
Semel’s ethnographic research suggests that the decision-making involved in creating automated psychiatric screenings risks replicating the inequities of non-automated screenings.
The embodied, interactional dimensions of listening are central to automating psychiatric screening. Voice analysis systems are carefully developed from hand-labeled talk data and brain scans of recruited participants. Semel notes that the choice of interactional settings and sociocultural scripts used to collect such data is never neutral. Yet the resulting data will be used to generate algorithmic recognition of speech patterns across potential patients, especially the underprivileged people whom automation is supposed to help gain access to mental health treatment.
This ethnographic research reflects the continued existence of the computational logic underlying scientific racism. Semel’s work issues a warning to institutions and practitioners eager to jump aboard the latest intake automation technology. It is meant to invite critical historical, political, and economic excavations of the classification schemes taken for granted in algorithmic systems applied to mental health.
****
Semel, Beth. (2020). The Body Audible: From Vocal Biomarkers to a Phrenology of the Throat. Retrieved September 24, 2020, from Somatosphere Web site: http://somatosphere.net/2020/the-body-audible.html/