In 2018, California’s state government began rolling out a new “mental health” initiative. The tech companies of Silicon Valley were creating smartphone apps that could prompt users to seek mental health care, and the state wanted to provide support. After all, researchers claim that more than half of Americans with mental health problems don’t receive treatment, and one reason for that might be that treatment is expensive or unavailable in certain regions.
Of the thousands of mental health apps in existence today, the state selected two. The first app is called 7 Cups, by a company called 7 Cups of Tea. They’re focused on connecting mental health service users, in text-based chat sessions, with what they call “listeners”—volunteers who are trained in “active listening.” But, according to The New York Times, the company has been plagued with issues, including listeners having inappropriate conversations with their clients and investigations of its alleged financial misconduct.
The other company partnering with the state of California is Mindstrong Health. Their app (rebranded as Mindstrong on March 17, 2020, and previously known as Health) is available on the Google Play Store and the Apple App Store. However, you can only use the app if you have been given a code to participate by one of the health insurance companies they’ve partnered with. The company won’t tell you which companies they work with; it’s by invitation only.
Like 7 Cups, Mindstrong helps connect users with text-based counseling. However, the company also developed an algorithm that they claim can detect changes in users’ moods from day to day based on the way people use their phones: the way they swipe and tap, and the way they type and delete text. In fact, the founders of this company even suggest that their algorithm will be able to predict mental health problems before they occur. As one article about the app claims, “The smartphone app […] can tell you’re depressed before you know it yourself.”
All of this is raising profound questions: Is this a technology to be embraced? Or feared?
A Brave New World?
As 7 Cups, Mindstrong, and other mental health apps are rolled out, the “public health” premise under which they’re marketed is a familiar one: Psychiatric disorders are regularly underdiagnosed and undertreated, and artificial intelligence (AI) technologies can help get people diagnosed and into helpful treatment. The American Psychiatric Association and other mental health organizations have regularly sounded that theme in the past 20 years, and now tech companies are stepping forward with a high-tech solution.
It is the Mindstrong app that most raises the spectre of Brave New World, Aldous Huxley’s classic dystopian novel of eugenics and psychiatric surveillance. AI, the app seems to promise, can peer into the window of an individual’s mind much better than a psychiatrist, or other trained mental health professional, can. As we use our smartphones and computers, our typing rhythms, swiping habits, typing errors and so forth are all data points that can be compiled into a mental health portrait of the user, one that the creators of Mindstrong claim can successfully diagnose “depression, anxiety, and other psychiatric disorders.”
And that diagnosis can then provide a gateway into treatment, with the first stop an online chat.
This AI technology was originally called the “digital fingerprint.” It was developed to track hackers without having to catch them in the act. Each person has a unique print. But while the developers are promoting the app as a public health initiative, the Mindstrong website and app store entries describe an AI that would be snooping on you at all times—ostensibly coming to know you better than you know yourself.
And ultimately doing so for commercial purposes that will expand the psychiatric enterprise.
While that is the spooky future that might await us if the Mindstrong app proves reliable, for the moment there are questions being raised about its “diagnostic” effectiveness and the merits of the online chats. Could the algorithm’s untested flags lead to overdiagnosis and overmedication? What happens if the app flags normal shifts in mood (say, feeling frustrated at work or feeling grief after a loss) as a mental illness? And what happens if you reach out for support but find only poorly trained online chat technicians?
Mindstrong did not respond to our emails requesting an interview. When we called their customer service line, we were told on multiple occasions that the customer support operators were not authorized to give information to the press, and that they could not transfer us to someone authorized to do so. They also stated that they were unable to provide contact information for any other members of the Mindstrong Health team.
Mindstrong Health: Beginnings
Mindstrong Health began with Paul Dagum, a Stanford doctor and researcher who also holds advanced degrees in theoretical physics and theoretical computer science. Dagum is perhaps most famous in research circles for having developed a statistical method for analyzing data called Dynamic Bayesian Networks in the 1990s. He’s also the owner of numerous patents for artificial intelligence technology—algorithms designed to assess large amounts of data to provide predictions.
Before Mindstrong, Dagum created algorithms for surveillance. Some of his previous work was used in the field of “brand protection”: algorithms designed to help corporations prevent digital “piracy” of movies, music, and software. He learned to create a digital fingerprint that could be used to identify and track hackers just based on the way they type. The idea was to be able to identify and monitor the activity of people based on how they used their smartphones and computers rather than what content they actually used. This means that people wouldn’t have to be caught committing a crime to be tracked; hackers could be identified based on the way they used their computer for innocent, everyday purposes. And the intention was to surveil the entire population: every single person at all times.
Dagum soon began looking for other uses for his digital surveillance algorithms. Healthcare seemed like the obvious next choice. Dagum partnered with Richard Klausner, who was the former director of the National Cancer Institute and former executive director for global health at the Bill and Melinda Gates Foundation. More importantly, Klausner knew how to start a healthcare business that would make money. He’d been the founder and director of pharmaceutical company Juno Therapeutics, which pulled in millions of dollars in investor funding before being acquired by Celgene for $9 billion.
But the last piece of the puzzle didn’t fall into place until Dagum met Tom Insel.
A former director of the National Institute for Mental Health (NIMH), Insel had become frustrated with the inability of the psychiatric establishment to make major improvements in care. After leaving the NIMH, Insel shifted to entrepreneurship, hoping that Silicon Valley could succeed where academia and the pharmaceutical industry had failed.
Insel, a psychiatric researcher, gained fame in the 1980s and 1990s for conducting animal experiments that supported the hypothesis that there were biological underpinnings to behavior. In his experiments on prairie voles, he identified oxytocin as one of the primary hormones responsible for the rodents’ strange, monogamous mating behavior.
In 2002, Insel became the director of the NIMH. He was the second-longest-serving director of the organization, running it for 13 years. By the end of his tenure there, he became known for speaking out against the Diagnostic and Statistical Manual of Mental Disorders (DSM)—the so-called “Bible” of psychiatry. In 2013, he rejected the DSM as a basis for research, citing its failure to find an objective laboratory test for any diagnosis:
The weakness is its lack of validity. Unlike our definitions of ischemic heart disease, lymphoma, or AIDS, the DSM diagnoses are based on a consensus about clusters of clinical symptoms, not any objective laboratory measure. In the rest of medicine, this would be equivalent to creating diagnostic systems based on the nature of chest pain or the quality of fever. Indeed, symptom-based diagnosis, once common in other areas of medicine, has been largely replaced in the past half century as we have understood that symptoms alone rarely indicate the best choice of treatment. Patients with mental disorders deserve better.
Although he still held to a primarily biological understanding of mental health problems, Insel suggested that the DSM did a poor job of accurately capturing those biological categories. Instead, he said, the DSM was based entirely on categorizing based on symptoms—and even then, its categories were wide nets that ensnared many different diagnostic fish. For instance, the long list of symptoms classed under Major Depressive Disorder—of which only a few are required for diagnosis—means that any two people with the diagnosis might have very different symptoms.
Insel’s reaction to the validity problems in the DSM, though, was to double down on the notion that mental health problems are biological. According to Insel, we simply haven’t figured out the biology yet. His replacement for the DSM was the RDoC (Research Domain Criteria)—a set of assumptions that he wanted to guide research. Chief among these assumptions was the idea of searching for biological differences first, and then trying to create mental diagnoses based on the biology.
It was an interesting shift, and Insel says it led to “cool papers by cool scientists,” but nothing that actually improved mental health care. In an interview with Wired, Insel describes investing billions of dollars in neuroscience research, only to come up empty-handed:
I spent 13 years at NIMH really pushing on the neuroscience and genetics of mental disorders, and when I look back on that I realize that while I think I succeeded at getting lots of really cool papers published by cool scientists at fairly large costs—I think $20 billion—I don’t think we moved the needle in reducing suicide, reducing hospitalizations, improving recovery for the tens of millions of people who have mental illness. I hold myself accountable for that.
But Insel remained undaunted. In 2015, just after leaving the NIMH, he began working with Verily, a life sciences tech company owned by Google’s parent company Alphabet, Inc. After a short stint there, he left to co-found Mindstrong Health with Dagum.
(This year, California Governor Gavin Newsom appointed Insel to be a “special adviser” on the state’s mental health system, despite the fact that Insel remains president of Mindstrong, a tech company looking to expand its reach into that very system. According to STAT, Insel will recuse himself from any conversations about Mindstrong.)
In his Wired interview, Insel provided his justification for moving into the tech sector: “If biomarkers can’t diagnose mental health issues, maybe a ‘digital phenotype’ can.”
The Digital Phenotype
“Digital phenotype” is a neologism that borrows a term from genetics. In genetics, phenotype refers to the observable traits that are the expression of the genotype (genetic code) in the environment. So, while the genotype refers to someone’s DNA, their traits are also influenced by the environment to create the phenotype. Figuratively, though, the word phenotype is sometimes used to mean an identifiable trait, like a fingerprint.
Dagum initially called his work a “digital fingerprint” when it was intended to be used to track hackers—perhaps because of the association fingerprinting has with law enforcement. Now that the algorithm’s being used for healthcare, there’s a new euphemism for it: a “digital biomarker,” insinuating that the digital information assessed by the app bears some relationship to a person’s biology.
So how does this app actually work?
First, the app installs a special keyboard on your phone. That way, the app can record information about the way you type at all times (whether you have the app open or not). The app tracks this data, and Dagum’s algorithm correlates it with “key domains of brain function including Cognitive Control, Working Memory, Processing Speed, Verbal Fluency, and Emotional Valence.”
It’s unclear exactly what data is being assessed by the algorithm. The Mindstrong website uses words like “taps, scrolls, swipes, and texts,” but the founders have implied that other metrics—frequency of texting, whether or not you reply to texts, length of replies, and other data—are also collected. One thing they are careful to say is that the app does not record any data about content—what you’re saying, which websites you’re visiting, who you’re talking to, or location data.
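To make the distinction between “how you type” and “what you type” concrete, here is a minimal sketch of content-free keystroke features. It is an illustration only, not Mindstrong’s algorithm, which has not been published in detail: the event format, the function name timing_features, and the three features it returns are all assumptions made for this example.

```python
from statistics import mean, pstdev

def timing_features(events):
    """Summarize typing rhythm from keystroke events.

    Hypothetical event format (an assumption for this sketch):
    (timestamp_in_seconds, kind), where kind is "char" or "delete".
    The characters themselves are never stored, only timing and kind.
    """
    if len(events) < 2:
        raise ValueError("need at least two events to measure gaps")
    times = [t for t, _ in events]
    # Time elapsed between each pair of consecutive key presses
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    deletes = sum(1 for _, kind in events if kind == "delete")
    return {
        "mean_gap": mean(gaps),          # average pause between keys
        "gap_spread": pstdev(gaps),      # how irregular the rhythm is
        "delete_rate": deletes / len(events),  # share of presses that erase
    }

# Five key presses, one of which is a deletion
events = [(0.0, "char"), (0.2, "char"), (0.5, "char"), (0.9, "delete"), (1.0, "char")]
features = timing_features(events)
```

Even a summary this crude shows that no message content is needed to build a behavioral profile. Whether numbers like these say anything about mood is a separate, empirical question.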
Is it possible that the rhythm of your clicks on your phone can be used to diagnose mental health problems? And is it possible that these activities can reveal whether you’re experiencing slight shifts in mood from day to day?
According to the Mindstrong website, “Believe it or not, how you use your smartphone has been scientifically validated as a way to measure things like how you’re doing and your mood.”
The Mindstrong website offers a single published article that they claim demonstrates the success of the algorithm. That article is a brief communication in the new journal npj Digital Medicine. It was written by Paul Dagum and describes a proof-of-concept study he conducted on his algorithm.
In Dagum’s study, 27 “healthy volunteers” were recruited on social media. They had no mental health diagnoses and no cognitive problems. Dagum recorded their texting behavior over the course of a week, then tested his algorithm’s ability to predict their scores on cognitive tests. He concluded that it had done so with some accuracy.
It’s important to note that these scores were only associated with cognitive functioning, not any mental health measures, and that this small group of participants were all considered perfectly healthy subjects. “These results should be considered preliminary until replicated in an independent sample,” he wrote.
Although that is the extent of the published research on the app, it’s presented on the app, and referred to on the website, as a “scientifically validated” measure of mood and mental health functioning in people with “serious mental illness” like bipolar disorder and borderline personality disorder. Given the nature of that one study, the claim is clearly an exaggerated one.
A second question, though, is whether the app will in fact provide useful help if it does detect a change in “mental health functioning.”
The Mindstrong website states that its app connects users with psychiatrists and “credible therapists.” Users can video chat with psychiatrists about their medications, although it’s unclear how often that happens. The bulk of the treatment offered through the app involves the “credible therapists.” “They’re all thoroughly vetted for skills like empathetic listening and crisis management skills,” the website states.
However, the “credible therapists” are not in fact required to be licensed to practice, or to have expertise in any evidence-based treatments. They’re more like 7 Cups’ online listeners, providing 20-minute text therapy sessions, which the company claims are “evidence-based.”
“Our approaches include Cognitive Behavioral Therapy (CBT), Coping and Emotion Regulation, Psychoeducation, Crisis Management, Reflective Listening, and Empathy,” the website states.
While CBT delivered in person by a licensed therapist could be described as evidence-based, there is no good research that text-based CBT delivered in 20-minute sessions by unlicensed therapists is helpful. As for the rest of the listed therapeutic approaches, there are no studies that promote text-based “listening” and “empathy” as stand-alone treatment for any mental health condition.
So, the AI diagnosis leads to “treatment” that—it’s fair to say—is of a questionable sort.
Evaluating Mental Health Apps
John Torous, psychiatrist and clinical informaticist, is the director of the digital psychiatry division at Beth Israel Deaconess Medical Center (BIDMC). He’s also the head of the American Psychiatric Association (APA) work group on the evaluation of smartphone apps. He took the lead on developing a system to evaluate mental health apps, which is available at the APA website.
Torous is not affiliated with Mindstrong Health.
“A lot of the digital mental health offerings directed to consumers are not evidence-based,” Torous said in an interview with Mad in America. “They’re often not effective, and they’re often not safe, in that the business models take people’s personal health data, market it, sell it, or who knows what they’re actually going to do with that data.”
The APA’s evaluation system, however, does not provide guidance on what to do with the answers to the questions it poses. The website simply states that deciding which app to use is a “personal decision” and that the goal is to “make APA members aware of very important information that should be considered when picking an app.” The system doesn’t actually score or rank apps, or provide a way to identify apps that lack an evidence basis and may pose any number of risks.
Instead, it serves as a reminder of what questions to ask in order to thoroughly vet an app before use.
Torous himself is currently engaged in research on another, similar mental health app, funded through a donor at BIDMC.
Although the apps share similar names (Torous’ app is officially known as mindLamp, although internally it’s often referred to as just LAMP), when MIA spoke with Torous, he highlighted the contrasts between his app and Mindstrong.
For instance, Mindstrong is already connected with health insurance companies, and the company behind it claims to be providing healthcare services, despite the lack of any published studies on its effectiveness.
In contrast, Torous stated that LAMP was still in the early stages of being tested. Ongoing studies are intended to demonstrate whether the app works or not. Torous said that it’s important to verify the effectiveness of an app before making claims that it is effective, and he believes his studies can help to provide that information.
However, like Mindstrong, LAMP is already being used as part of clinical care (although Torous verified that the algorithm itself is not being used yet). The app tracks the data it collects, which is then shared with the user and his or her clinician (and potentially with family members if consent is given).
The website for LAMP states that the Massachusetts Psychosis Network for Early Treatment (MAPNET) uses the app “at their sites across Massachusetts.” The site also notes that 15 research institutions around the world are using the app for various research studies.
According to Torous, LAMP’s code is open-source: free and available online. In contrast to Mindstrong’s secrecy, Torous said that the LAMP program is intended to pilot software that others can use freely and test to see if it can actually achieve better outcomes.
“Building a tool is no longer a huge technical challenge,” Torous said. “It’s hard to regulate what other people may be doing with the software.”
These apps—like Mindstrong—already exist, and they’re already being used, promoted, and monetized. But when their code is secret, it makes it impossible for clinicians and users to identify whether the apps are safe or effective. Torous wants to provide transparency to the public about how the apps work and whether there’s evidence for their effectiveness.
LAMP and Mindstrong: Both Different and Alike
LAMP’s processes work a little differently from Mindstrong’s. Insel and Dagum’s app downloads a keyboard onto the user’s phone and records details about how the person types on that keyboard.
Torous said that this could seem invasive, and noted that LAMP does not download an interface onto the phone in this way, nor does it record typing information.
However, both apps record GPS information, call and text logs, and accelerometer information (which is a proxy for exercise and sleep habits). Torous also discussed his app’s use of such metrics as whether the screen is on or off (another proxy for sleep habits), and answers to surveys and cognitive “games.” LAMP also records things like how often a person leaves the house (if at all), how far they travel, and other such details. But like Mindstrong, the LAMP app doesn’t record the places to which the user travels.
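The idea of recording how far someone travels without keeping the places visited can be sketched as follows. This is not LAMP’s or Mindstrong’s code. The function names, the 0.1 km “home radius,” and the choice of the haversine formula are assumptions for illustration; only the two summary values are kept, and the coordinates are not returned.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def daily_mobility(points, home, home_radius_km=0.1):
    """Reduce a day of GPS fixes to two numbers and drop the coordinates.

    home_radius_km is a hypothetical cutoff for "left the house".
    """
    # Sum the distance between each pair of consecutive fixes
    total = sum(haversine_km(p, q) for p, q in zip(points, points[1:]))
    left_home = any(haversine_km(home, p) > home_radius_km for p in points)
    return {"distance_km": total, "left_home": left_home}

home = (42.0, -71.0)
# A short trip of roughly one kilometer out and the same distance back
summary = daily_mobility([(42.0, -71.0), (42.01, -71.0), (42.0, -71.0)], home)
```

Note that the raw fixes still have to be collected before they can be summarized, so the privacy of a design like this depends on where the summarizing happens and whether the raw data is discarded afterward.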
Both Mindstrong and LAMP have as their goal the identification and prediction of mental health changes, and both link the resulting “digital fingerprint” to a “clinical” intervention.
Torous’ current research is a study on whether the app can predict relapse in psychosis. The participants are people with a diagnosis of schizophrenia or schizoaffective disorder, and they will use the app for a year. The goal is to develop a unique baseline of functioning for each individual in the study so that when a person’s behavior changes, the app will identify it.
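A personal baseline of this kind can be illustrated with a very simple rule: compare each new measurement with the person’s own history and flag values that fall unusually far from it. The sketch below is an assumption-laden illustration, not the method used in Torous’ study; the function name flag_changes, the threshold of two standard deviations, and the sample data are all hypothetical.

```python
from statistics import mean, stdev

def flag_changes(baseline, recent, threshold=2.0):
    """Flag recent values that deviate from a person's own baseline.

    baseline: past daily measurements for one individual
    recent:   new daily measurements for the same individual
    threshold: hypothetical cutoff, in standard deviations
    """
    center = mean(baseline)
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation; cannot scale deviations")
    # A value is flagged when it sits more than `threshold` deviations
    # away from this individual's own average.
    return [abs((value - center) / spread) > threshold for value in recent]

# Hypothetical data: number of times a person left the house each day
baseline_outings = [2, 3, 2, 3, 2, 3, 2, 3]
recent_outings = [3, 2, 7]
flags = flag_changes(baseline_outings, recent_outings)  # only the last day is flagged
```

A flag of this kind only says that behavior departed from the person’s own pattern. It carries no information about why.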
But what happens then? For Insel and Dagum’s app Mindstrong, an alert is sent to the Mindstrong care team, which will then reach out to the user. Torous’ app records the data about that change, which then becomes a point of discussion between the clinician and patient.
Torous noted that there are a lot of reasons a person’s behavior could change, not all of which imply a “relapse in psychosis.” For instance, if a person suddenly begins leaving the house more often and traveling to many places, that could be a sign of manic behavior—or it could mean that the person made a friend, or is simply trying to be more social.
With this information in hand, the clinician will show the patient the data (leaving the house more often) and ask what it means. “Until we know how well this works—when it’s right, when it’s wrong—it’s really hard to deploy clinically,” Torous said. “If someone has a relapse, we’ll ask them, ‘Here’s the signal we saw, does this make sense?’”
“We’re not going to automatically send an ambulance to their house,” Torous said. “Someone on the care team could call in; it could be a prompt for a family member to check in on someone” if there’s consent.
But, he admitted, another app using his free, open-source algorithms could easily escalate to sending an ambulance in the interests of care. “I think that was proposed in California at some point,” he said. “Facebook, apparently, may be doing this already.”
LAMP for College Students
Torous is also planning a study to test whether LAMP can be used for college students. He said this research is currently in the focus group stage, asking 100 college students about their needs and concerns. He hopes to create an adapted version of the app to deploy on college campuses.
The college environment can be a microcosm of the world, bringing to light the potential benefits of mental health surveillance, Torous said. For instance, the app might be able to identify if students were becoming socially withdrawn, which could signify depression.
But this, of course, raises obvious privacy concerns. Who gets the information that the student is becoming socially withdrawn? The college counseling center? The campus administration? Campus administration may already be incentivized to force students to take a leave of absence if there is a mental health concern, which can leave students unable to return to school, in debt, and sometimes even with a mark on their permanent record.
Other questions abound. Are students going to be forced to attend counseling? Involuntary counseling is not considered an effective intervention. How does the app distinguish between what could be a teenager’s first real romantic breakup, for example, versus an emergency of severe depression? Is there a difference?
And how does the app tell the difference between “manic” behavior and college students trying out new ways of being, like going to parties for the first time? What happens if the student uses drugs or alcohol? How does the app record that data, and who gets to see it?
Torous acknowledged these concerns. “Mental health in general only works when there’s trust and transparency,” he said. “We do have to understand what are limitations of this data, when this is useful, when it is not.”
Nevertheless, he said he hopes there is a way to adapt the app for college use that will be acceptable to the students and administration.
Much as the app might be used to monitor behavior, it could also be used to monitor response to—and compliance with—drug treatment. “Imagine starting a medication,” Torous said. “This is a way we can keep a very close eye on people as they’re starting a new treatment.”
The app might be able to chart a change in behavior that would inform patients that their antidepressant is “working,” even though they may not believe it is and haven’t noticed a difference in their own mood. “Maybe they’re still feeling a little bit down or depressed, but they’re kind of sleeping better, moving around better—it can be a good way to start a discussion,” Torous said.
At the same time, doctors might use the app data to check their own biases. For instance, if a patient says the medication is not working, or has intolerable side effects, and the app supports that conclusion, then the doctor may be more willing to believe the patient’s own account of his or her reaction to the drug.
Both of these possible uses reflect the same doctor-patient relationship that bedevils psychiatry today: The patient’s own self-assessment is not seen as fully reliable, and the doctor can use the data from the app either to convince the patient that his or her insight is mistaken or to agree that the patient’s insight appears correct. Either way, the presence of the app reinforces the notion that patients are not reliable witnesses to their own responses to psychiatric drugs.
A Brave New World Is Already Here
“In the clinic we don’t use the algorithm yet,” said Torous. However, LAMP itself is already being used by clinicians at BIDMC and at MAPNET, collecting data and monitoring patients, without any evidence that it improves outcomes.
“We haven’t seen definitive results that say we need to adopt these technologies in routine care today, in any field,” Torous said.
As for Mindstrong, it’s unclear if their algorithm is actively being used yet, but their app has been rolled out with health insurance backing in multiple states (it’s unclear exactly which insurance companies and states are utilizing the app, because Mindstrong did not respond to our requests for comment).
“Our outcomes look promising—lowering the inpatient readmission rate, ER admission rate, mental health costs and physical costs,” the Mindstrong website states. “We’ll publish our care model findings in 2020.”
“I haven’t seen any published papers about mental health patients,” Torous said, when asked about the research base for Mindstrong. “I haven’t seen robust clinical evidence.”
Meanwhile, early user feedback on mental health apps, at least by users of 7 Cups, has not been very positive. The online “listeners” have been criticized on Reddit for being robotic, unhelpful, pitying rather than empathetic, and overeager to report people for suicidality, among other issues. Some users complain of feeling worse and even being retraumatized after using the service.
However, the mental health apps are now here. And they herald a future of greater mental health surveillance, with AI tracking your phone use and GPS data at all times, promising to provide a “digital fingerprint” of users, first to diagnose and then to constantly monitor the activities of those diagnosed with depression, bipolar disorder, psychotic disorders, and borderline personality disorder.
Leave your house and stay out too late and perhaps a “manic” button will ring. Stay in bed too long and don’t go out and perhaps a “depression” button will ring. And do you think your drug treatment isn’t working and is causing harmful side effects? Your mental health app will help your prescriber decide if you know what you are talking about. But don’t worry, you’ll also be mandated to check in with your unlicensed, poorly trained “listener” (even though there’s no evidence that will actually help you feel better).