The Mental Health App Marketplace is a Mess, Researchers Find

Harvard psychiatrists perform a comprehensive analysis of the mental health apps marketplace and find misinformation.


The issues commonly associated with mental health apps are made worse by misleading app marketplace metrics. In response, a team of Harvard psychiatrists led by John Torous has created an assessment framework and a public database of hundreds of apps to extend previous research into apps’ limited functionality and clinical utility.

To assess these apps, the researchers developed a framework and have since made their evaluations publicly available as a database called MIND (the Mhealth Index and Navigation Database). They write:

“The goal of this database is to make information transparent and to provide an accessible space where any interested users can filter apps… As each app in MIND is evaluated across 105 dimensions, the database offers a novel source to assess questions around current offerings, their quality, features, and overlap.”

Digital mental health apps are big businesses operating in the hazy regulatory zones of the current software landscape. Yet, understanding the features, qualities, and risks of these apps has not kept pace with the speed at which they are now recommended by healthcare providers, employers, and universities.

Accurate and comprehensive descriptions, warnings, and quality control are features one might expect from a marketplace, yet application aggregators offer tens of thousands of mental health apps with practically no medical or legal oversight. How are patients, healthcare providers, and policymakers supposed to decide which apps to select, recommend, and fund in a marketplace characterized by false advertising, paid reviews, unsubstantiated claims, and the rapid emergence and extinction of apps?

To improve research on mental health apps, psychiatrists at Harvard Medical School comprehensively reviewed 278 apps from Apple and Google’s marketplaces. They analyzed the apps using a framework comprising over 100 questions about each app’s origin, clinical foundation, accessibility, privacy, security, inputs, outputs, features, engagement styles, and other considerations.

The authors used the MIND database they developed to look at how marketplace rankings (stars, downloads) relate to metrics derived from the assessment tool, updating existing research around app privacy and quality, marketplace saturation and gaps, and the temporal dynamics of marketplace offerings.

Applying the framework, the researchers found that 212 of these apps were developed on a for-profit basis. The remaining 66 were developed by academic institutions, healthcare companies, non-profits, or government organizations. Seventy-one of the apps collected passive data about users, such as biodata or geolocation, yet their privacy policies did not differ significantly from those of apps that do not collect passive data. Of the 278 apps, 238 had a privacy policy at all, and those policies were written at an extremely high reading level.

A total of 108 apps were found to share personal health information with third parties. However, fewer than half of the apps facilitated information sharing with a clinical provider. No apps offered integration with an electronic medical record, seriously limiting their clinical utility. Data portability and clinical integration remain important considerations for future mental health app developers.

Concerning effectiveness, only 44 of the 278 apps in the database (about 16%) were supported by a feasibility or efficacy study. The authors add:

“Our results confirm that app stars and downloads – even for the most popular apps by these metrics – did not correlate with more clinically relevant metrics related to privacy/security, effectiveness, and engagement. Most mental health apps offer similar functionality, with 16.5% offering both mood tracking and journaling and 7% offering psychoeducation, deep breathing, mindfulness, journaling, and mood tracking. Only 36.4% of apps were updated with a 100-day window, and 7.5% of apps had not been updated in four years.”
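As a quick arithmetic check, the counts reported in this article can be converted into shares of the 278-app sample. The short Python sketch below is illustrative only: the dictionary labels and the `share` helper are assumptions made for this example, not part of the study or of MIND. It also confirms that 278 apps rated on 105 dimensions yields the 29,190 data points named in the paper's title.

```python
# Counts as reported in the article; labels are illustrative, not the study's own.
TOTAL_APPS = 278
DIMENSIONS_PER_APP = 105

counts = {
    "for-profit developer": 212,
    "other developer": 66,
    "collects passive data": 71,
    "has a privacy policy": 238,
    "shares health information with third parties": 108,
    "supported by a feasibility or efficacy study": 44,
}


def share(count, total=TOTAL_APPS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(n) for label, n in counts.items()}
data_points = TOTAL_APPS * DIMENSIONS_PER_APP

for label, pct in shares.items():
    print(f"{label}: {pct}%")
print(f"data points: {data_points}")
```

On these figures, the share of apps backed by a feasibility or efficacy study works out to 15.8%, and the for-profit and other-developer counts sum to the full sample of 278.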

Current app marketplace metrics commonly used to evaluate apps do not offer an accurate representation of individual apps or a comprehensive overview of the entire space. The majority of apps overlap in terms of features offered, with many domains and other features not well represented.

The five most common offerings of mental health apps are mood or symptom tracking, journaling, mindfulness, psychoeducation, and deep breathing. These involve inputs by users that include surveys, diary entries, and photographs. There was substantial overlap in offerings across the surveyed apps, with few apps offering comprehensive therapeutic interventions like cognitive behavioral therapy (CBT) and dialectical behavior therapy (DBT).

Corroborating the known limitations of app store ratings, the analysis of this data set found no significant relationships between app features and popularity (as measured by the number of downloads, reviews, and user ratings). Selecting an appropriate app continues to require personal matching, given that marketplace metrics alone offer no clear trends or guidance.

While the data set was extensive, the authors caution that results were still drawn from a limited sample. Moreover, apps are frequently updated, with more than a third of those studied (36.4%) updated within a 100-day window. This makes it difficult to say whether these results will continue to apply over time.

The study's message to mental health professionals and users of digital mental health services is that app marketplace metrics neither accurately represent individual apps nor testify to their utility in a clinically meaningful way.



Lagan S, D’Mello R, Vaidyam A, Bilden R, Torous J. Assessing Mental Health Apps Marketplaces with Objective Metrics from 29,190 Data Points from 278 Apps. Acta Psychiatrica Scandinavica. 2021 Apr. DOI: 10.1111/acps.13306.


  1. I’d be curious what people think would make a good app, besides one that y’know… is done ethically. What if, like how some people rate non-profits through Charity Navigator, there were markers such as… this development team includes neurodivergent people, kind of like when you see the designation that an organization is run by women, or how responsive the company is to accountability (which… yeah, that’s just a wild take, I have no idea how that would get measured…)

    I was just talking to a Neuroscientist jokingly about how we’ll make buckets of money creating an app called, “Is it me or is it Capitalism?”, but we were both so skeptical of the idea of self-diagnosis vis-a-vis apps / AI even though there is growing consensus in the ADHD/Autism/Neurodiversity online communities that through their own extensive research they’re learning more than they ever did from doctors.

    I think the words “self-diagnosed” already have a negative connotation to them that I don’t exactly know how to unlearn, especially because the effect of having a qualified expert/clinician give credence to one’s struggles, leading to less personal shame and more acceptance, is a big part of what makes something “a useful label”.

    Ok so that was getting a bit tangential, when what I want to suggest is that it WOULD be interesting if apps would, for example, pair diagnosis with connecting the user into an online mutual aid network so that it’s not about setting you up for how to talk to your doctor about getting the correct medication, as much as it is about gaining confidence doing your own research. Or, another example could be an app that helps calculate a polychronic sense of time, or a limited sense of energy based on comparing one’s plans vs. what actually happened. I imagine it like how some people set clocks 5 minutes fast to trick themselves: if it notices that you are constantly 15 minutes late, it auto-adjusts your Google Calendar to set everything 15 minutes ahead (not simply making more reminders). Or it takes historical data, such as having plans to visit with 4 friends and get 3 papers done, when in the end you cancelled twice or had to do an all-nighter to pull off just 1 of the projects on time, and then it can project a personalized idea of how much you can even inform other people to expect of your realistic output based on what you were able to pull off before. I don’t just want a smartphone that tells me I’m “over my average amount of minutes” I hoped to spend on my phone, I want a smartphone that tells me “Oh you think you’re capable of doing this in this amount of time? Think again. You’re 1/6 in following through on finishing a grant early, and not at 5 am. Suggested tips that have worked in other situations: A. Break up the task into something simpler with an earlier check in, perhaps ask _Carole_ to be your accountabilli-buddy on this since it involves writing. B. Your credit card is set to auto-pay to the _NRA_ unless you submit the grant by _May 1st_ and get the code to stop the transaction which has been sent to _Emaline_ for confirmation.”
