New research published in npj Schizophrenia, a Nature Partner Journal published in partnership with the Schizophrenia International Research Society, reports on a machine learning classifier system applied to patient Facebook data that was able to differentiate Schizophrenia Spectrum Disorders (SSD) from Mood Disorders (MD) up to 18 months before patients’ first hospitalizations.
The researchers, led by Michael L. Birnbaum and Raquel Norel, suggest that Facebook data can be integrated with clinical information to inform clinical decision-making. However, their results show that the AI predictions may not significantly improve upon existing screening methods, and critics have raised concerns over privacy and overdiagnosis.
Digital technologies are being developed and rolled out in the mental health field. In addition to supporting the delivery of treatment, as in teletherapy, mental health apps, and medications with embedded digital sensors, such technologies are also being developed to aid clinical judgment. Among these are machine learning algorithms that use vocal and other biomarkers to diagnose ‘mental illness.’
Despite industry enthusiasm, critics have pointed out that algorithms often rely on incomplete data and can replicate and even exacerbate existing biases in healthcare. Further ethical concerns have been raised over the lack of transparency concerning how these algorithms actually make decisions, the difficulty of communicating such decisions with confidence to patients, and the tendency to “pass the buck” to technologies to avoid liability, once they are put into use.
Meanwhile, mental health researchers are debating the ill effects of social media and questioning aspects of these technologies and the types of environment they create for users. What is clear, however, is that social media platforms generate a glut of highly personal data, which is exactly what machine learning algorithms need to improve the accuracy of their predictions. This data is generated across a wide swath of society, not just by those who are already displaying symptoms of mental distress. This means that Facebook and other corporate social media platforms are well-positioned to provide such data in service of prediction and early detection of ‘mental illness.’ For example, data from Facebook has been used to attempt the prediction of suicides since 2017, and monitoring like this has led to invasive incidents of police and crisis teams intervening on people without their informed consent.
This latest study assesses whether patient data can confirm previously identified associations between social media activity (including private Messenger communications) and psychiatric diagnoses, by attempting to differentiate individuals diagnosed with Schizophrenia Spectrum Disorders (SSD) or Mood Disorders (MD) from healthy volunteers (HV).
The researchers collected a total of 3,404,959 Facebook messages and 142,390 Facebook images across 223 participants with a mean age of 23.7 years and a near-even split between genders and across diagnoses (SSD, n = 79; MD, n = 74; HV, n = 70).
Their first objective was to evaluate whether it was possible to distinguish between SSD, MD, and HV based on Facebook data alone. A pairwise classification used data aggregated over the 18 months in a standard cross-validation scheme, representing each participant by a single feature vector meant to indicate their overall state, averaged across the six trimesters.
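As a rough illustration of this setup, the sketch below builds one averaged feature vector per participant and runs a cross-validated pairwise classifier. Everything here is invented for illustration (the synthetic data, the group sizes borrowed from the study, the choice of logistic regression); the paper's actual features and models are not reproduced.

```python
# Illustrative sketch only: one feature vector per participant, averaged
# across six trimesters, then cross-validated pairwise classification.
# Feature values are synthetic; group sizes (79 vs. 74) echo the study's
# SSD and MD counts but nothing else here is from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def aggregate_trimesters(trimester_features):
    # trimester_features: (6, n_features) array for one participant,
    # collapsed to a single vector by averaging over the six trimesters.
    return trimester_features.mean(axis=0)

# Two hypothetical diagnostic groups with a small mean difference.
X = np.vstack([
    np.array([aggregate_trimesters(rng.normal(0.0, 1.0, (6, 3))) for _ in range(79)]),
    np.array([aggregate_trimesters(rng.normal(0.3, 1.0, (6, 3))) for _ in range(74)]),
])
y = np.array([0] * 79 + [1] * 74)

# Standard k-fold cross-validation over participants.
scores = cross_val_score(LogisticRegression(), X, y, cv=5)
print(round(scores.mean(), 2))
```

Averaging each participant down to a single vector, as described above, trades temporal detail for a simple per-person representation that standard cross-validation can handle.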
The algorithms correctly classified participants with SSD from those with MD or HV with an accuracy of 52% (chance = 33%). Participants with MD were correctly classified with an accuracy of 57% (chance = 37%), and HVs were correctly classified with an accuracy of 56% (chance = 29%).
The researchers suggest that such machine-learning algorithms can identify those with SSD and MD using Facebook activity alone over a year in advance of the first psychiatric hospitalization.
Compared to HV, participants with SSD and MD demonstrated significant differences in the use of words related to “anger,” “swearing,” “negative emotions,” “sex,” and “perception.” Many linguistic differences existed before the individual’s first hospitalization, suggesting that certain linguistic features may represent a trait rather than a state marker of impending symptoms or that clinically meaningful changes manifest online before hospitalization.
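Word-category comparisons like these are typically computed as the fraction of a message's words falling into predefined dictionaries. The toy sketch below shows that kind of counting; the category lists are invented here and are far smaller than any real linguistic lexicon the researchers may have used.

```python
# Illustrative sketch only: per-message word-category usage rates,
# loosely in the spirit of dictionary-based linguistic features.
# The category vocabularies below are made up for this example.
import re
from collections import Counter

CATEGORIES = {
    "anger": {"hate", "furious", "annoyed"},
    "negative_emotion": {"sad", "worthless", "afraid"},
    "swear": {"damn", "hell"},
}

def category_rates(message):
    """Return, per category, the fraction of the message's words it covers."""
    words = re.findall(r"[a-z']+", message.lower())
    counts = Counter(words)
    total = max(len(words), 1)  # avoid division by zero on empty messages
    return {
        cat: sum(counts[w] for w in vocab) / total
        for cat, vocab in CATEGORIES.items()
    }

print(category_rates("I hate this, I feel so sad and worthless"))
```

Rates like these, aggregated per participant and per trimester, are the kind of feature a classifier could then compare across diagnostic groups.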
Analyzing word choice on Facebook could potentially help clinicians identify people at high risk of SSD or MD before the emergence of clinically significant symptoms.
While age, sex, and race were not associated with linguistic differences in SSD or HV participants, men and women with MD were significantly more likely (P < 0.01) to vary in their use of numerals. Compared to HV, photos posted by participants with SSD or MD were significantly smaller, and participants with MD posted photos containing more blue and less yellow.
The researchers also assessed whether signals identified in the first trimester (when psychiatric symptoms are the most prominent, resulting in hospitalization) are also present in the trimesters farther away from hospitalization.
Models were trained with data from the first trimester only and then tested on data from the other trimesters. These tests showed increasing differentiation of several word categories closer to the date of hospitalization. The authors attribute this to changes in “anxiety, mood, preoccupations, perceptions, social functioning, and other domains known to accompany illness emergence.”
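The temporal evaluation described here can be sketched as fitting a model on trimester-1 data and scoring held-out data from trimesters farther from hospitalization. In the toy version below, the synthetic group difference is deliberately made to shrink in earlier trimesters; the data, shifts, and model choice are all assumptions, not the paper's actual procedure.

```python
# Hypothetical sketch: fit on trimester-1 feature vectors only, then
# score data from trimesters farther from hospitalization. Synthetic
# data with a shrinking group difference stands in for real features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_per_group, n_features = 70, 3

model = LogisticRegression()

# Trimester 1 (closest to hospitalization): strongest group separation.
X1 = np.vstack([rng.normal(0.0, 1.0, (n_per_group, n_features)),
                rng.normal(1.0, 1.0, (n_per_group, n_features))])
y = np.array([0] * n_per_group + [1] * n_per_group)
model.fit(X1, y)

# Earlier trimesters: the (synthetic) mean shift between groups shrinks,
# so the trimester-1 model should separate them less well.
for trimester, shift in [(2, 0.6), (4, 0.3), (6, 0.1)]:
    Xt = np.vstack([rng.normal(0.0, 1.0, (n_per_group, n_features)),
                    rng.normal(shift, 1.0, (n_per_group, n_features))])
    print(trimester, round(model.score(Xt, y), 2))
```

Falling scores as the tested trimester moves away from hospitalization would mirror the pattern the authors report, where signals strengthen as hospitalization approaches.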
The use of biological-process words (“blood,” “pain”) and words related to negative emotions increased closer to hospitalization among participants with MD relative to HV. Both SSD and MD patients used more negations, anger-oriented language, and swear words compared to healthy volunteers closer to their hospitalization dates.
The authors argue that while Facebook data alone cannot yet be used to make diagnoses, the integration of social media communication data could help improve diagnostic accuracy, serve as a “low burden screening tool” for at-risk youth, and provide collateral information. However, the predictions are correct only slightly more than half of the time. While the results here are in the 50% accuracy range, 70-80% has been proposed as an acceptable threshold for prediction rates.
Further caution about the implications of this study is warranted on the grounds that diagnostic screening tools have been found to significantly overestimate mood disorders compared to clinical interviews. It is not necessarily the case that adding a secondary screening tool in practice will improve detection, as it is often the case that differing diagnostic tools can be assessing different underlying constructs, leading to misdiagnosis and overdiagnosis. There is also considerable debate concerning risks and ethical issues inherent in additional monitoring of patients and at-risk individuals.
Birnbaum, M. L., Norel, R., Van Meter, A., Ali, A. F., Arenare, E., Eyigoz, E., Agurto, C., Germano, N., Kane, J. M., & Cecchi, G. A. (2020). Identifying signals associated with psychiatric illness utilizing language and images posted to Facebook. npj Schizophrenia, 6(1), 38. https://doi.org/10.1038/s41537-020-00125-0