In a new article in JAMA, researchers suggest that developers of artificial intelligence (AI) programs for improving medicine should pay more attention to how their programs would actually function in a clinical setting. The authors remark that “given the abundance of algorithms, it is remarkable there has yet to be a major shift toward the use of AI for health care decision-making (clinical or operational).” However, they go on to outline several problems with AI programs that explain why healthcare organizations have not overwhelmingly adopted them for clinical decision making.
They write that “data quality, timeliness of data, lack of structure in the data, and lack of trust in the algorithmic black box are often mentioned as reasons.”
In other words, AI algorithms often haven’t been shown to work, because the data they are built on may be of poor quality, arrive too late to be useful, or lack the structure needed for reliable analysis. Another complication is that AI programs often can’t be assessed because their “black box” algorithms make it impossible to tell whether they’re working, which leads to a “lack of trust.”
The authors acknowledge these problems, but they offer an additional explanation as well: perhaps “model developers and data scientists pay little attention to how a well-performing model will be integrated into health care delivery.”
“The problem is that common approaches to deploying AI tools are not improving outcomes.”

The lead author, Christopher J. Lindsell of Vanderbilt University Medical Center, holds patents on several predictive technologies and receives funding from Endpoint Health Inc, an “early-stage” technology start-up in the healthcare field.
Lindsell and his co-authors suggest that one major problem is that even when artificial intelligence algorithms are shown to work, they may not improve outcomes: prediction and surveillance do not, by themselves, change what happens to patients.
“Designing a useful AI tool in health care should begin with asking what system change the AI tool is expected to precipitate. For example, simply predicting or knowing the risk of readmission does not result in decreased readmission rates; it is necessary to do something in response to the information.”
The authors suggest that technology companies should work with “end users” such as patients and clinicians to determine what algorithmic technology may actually be helpful—and “in some cases, the realization that the problem is not ready for an AI solution given a lack of evidence-based intervention strategies to affect the outcome.”
They provide an example of an “expensive intervention” aimed at reducing alcohol use in people who experienced trauma. Technically, the intervention worked—but only for the people who were at low risk of both alcohol use and readmission.
“It was ineffective for those with more serious alcohol-related problems, who are also at higher risk of readmission.”
So, in that instance, a technology was developed with good intentions, and it appeared at first glance to be successful. But upon further review, it actually failed to work for the group of people who needed it most.
Apps that use artificial intelligence to assess mental health are already in use, with their developers partnering with health insurance companies and medical centers, despite a lack of published research demonstrating their effectiveness in any clinical domain.
There were over 325,000 different healthcare apps available to download in 2017, and the market was valued at an estimated $23 billion. In 2018, healthcare apps were downloaded more than 400 million times, and that number has likely only grown since.
A study last year found that of the more than 10,000 apps available for mental health, only “3.41% of apps had research to justify their claims of effectiveness, with the majority of that research undertaken by those involved in the development of the app.”
****
Lindsell, C. J., Stead, W. W., & Johnson, K. B. (2020). Action-informed artificial intelligence—Matching the algorithm to the problem. JAMA. Published online May 1, 2020. doi:10.1001/jama.2020.5035