Published reports of clinical trials of psychiatric drugs typically include a graphic showing the efficacy of the study drug in reducing symptoms of the disorder compared with placebo. These graphics are visually compelling. They almost always show a notable separation between the study drug and placebo in the reduction of symptoms over time, and thus the reader can see what the authors of the study conclude: The study drug, at one dose or another, is an effective treatment for the disorder.
These graphics, however, are presenting an illusion of efficacy. It’s done very simply: The graphics are set up with a vertical axis that, in essence, acts as a magnifying glass.
In a recent MIA Report titled Anatomy of an Industry: Commerce, Payments to Psychiatrists, and Betrayal of the Public Good, Mad in America looked at the financial influences present as seven new psychotropics were brought to market from 2013 through 2017: four antipsychotics, one antidepressant, and two drugs for tardive dyskinesia. The published reports of the pivotal trials of these drugs regularly featured a graphical illusion of this sort, an illusion that can be revealed by re-graphing the efficacy data with a proper vertical axis.
Here are the graphics that reveal this illusion of efficacy for each of the seven drugs, along with similar graphics for a fifth antipsychotic that was approved in 2020 (lumateperone).
Oral antipsychotics as a treatment for schizophrenia
In pivotal trials of a drug being tested as a treatment for schizophrenia, the primary outcome measurement is reduction of symptoms on the Positive and Negative Syndrome Scale (PANSS). The scale is composed of 30 items, each scored from 1 to 7, and thus it is a 210-point scale (with possible scores ranging from 30 to 210). However, the efficacy graphics in published reports do not use that range of possible scores as a vertical axis; instead, they regularly use a 25- to 30-point segment of the PANSS scale, which acts like a magnifying glass in presenting the “separation” between drug and placebo.
In a phase III study published in 2015, three doses of brexpiprazole were compared with placebo over a six-week period. Here is the graphic that told of the drug’s efficacy at these three doses:
It appears that there is a significantly different course for those treated with brexpiprazole compared with placebo. However, the vertical axis shows only a reduction in symptoms from 0 to 25 points on the PANSS scale. That’s the magnifying effect: Nothing in this graphic indicates that this drop occurred on a 210-point scale. The use of a 25-point segment of the scale, instead of an axis spanning the 180 points of possible scores, magnifies the difference by a factor of about seven (180/25 = 7.2).
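The magnification arithmetic can be sketched in a few lines of Python. This is only an illustrative calculation, not something taken from the published study; the function name `magnification_factor` and its parameter names are assumptions introduced here.

```python
def magnification_factor(scale_min: float, scale_max: float, axis_span: float) -> float:
    """Ratio of the full range of possible scores to the span shown on the y-axis."""
    if axis_span <= 0:
        raise ValueError("axis_span must be positive")
    return (scale_max - scale_min) / axis_span


# PANSS: possible scores run from 30 to 210; the published chart spans 25 points.
panss_factor = magnification_factor(30, 210, 25)
print(f"PANSS, 25-point axis: {panss_factor:.1f}x")  # prints "PANSS, 25-point axis: 7.2x"
```

The same function applies to any rating scale, provided the lowest and highest possible scores are known.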
The proper perspective would be a graphic that charted PANSS scores with a vertical axis of 30 to 210. This would depict the clinical course of the four groups over the six weeks, and it would reveal whether there was a significant difference between the placebo and brexpiprazole cohorts.
As can be seen in this graphic, there is almost no difference in the clinical status of the four groups at any time during the six weeks. Researchers have determined that there needs to be at least a 15-point difference in PANSS scores between drug and placebo for the treatment to provide a clinically important benefit, and none of the three brexpiprazole doses met that standard.
This graphic also reveals that “statistical significance” shouldn’t be confused with clinical significance. Although the outcomes for the three brexpiprazole cohorts seem indistinguishable, two of the three doses—the 2 mg and 4 mg doses—squeaked over the “statistically significant” line, and this led the study authors to conclude that the drug was safe and effective at these doses.
The first graphic above charted a drop in symptoms; the second graphic charted the PANSS scores. It is possible to re-apply the magnifying effect to the PANSS scores by replacing the 30-to-210 point axis with one that runs from 70 to 95 points. Just as in the first graphic, a notable separation between drug and placebo now appears. There is the same seven-fold magnification of the results.
The “safe and effective” finding in a published article, together with the visual that tells of a notable separation in symptoms between drug and placebo, provides the “evidence” that a drug manufacturer can use to promote its product. The manufacturer pays psychiatrists to serve as its advisors, consultants, and speakers, and collectively this group authors the trial findings, writes further reviews of the drug, and speaks about its usefulness at dinners, conferences, and CME webinars.
That promotion turned brexpiprazole, sold as Rexulti, into a commercially successful drug, one that generated $1.4 billion in Medicare and Medicaid sales from 2015 through 2019.
In a phase III trial of cariprazine, which was published in 2015, two doses of cariprazine were compared with placebo and with a 10 mg dose of aripiprazole. The published efficacy graphic used the same 25-point vertical axis, showing a drop in symptoms, as the brexpiprazole publication did.
The graphic shows a notable separation in symptoms between the placebo group and the three groups treated with cariprazine or aripiprazole, with this separation becoming apparent at week two and more pronounced over time.
Now here is a graphic that charts their PANSS scores over time:
Once again, the lack of clinical significance is evident in this visual. The graphic provides the same understanding that the numerical data does. At the end of the study, there was only a 10-point difference between placebo and the 6 mg dose of cariprazine, and the drug-placebo difference was even less than that for the 3 mg dose. Both the graphic and the data tell of a drug treatment that did not provide a meaningful clinical benefit.
However, the small differences in PANSS scores were “statistically significant,” and thus the authors concluded that cariprazine was safe and effective at both 3 mg and 6 mg. The usual promotional machinery swung into gear once the drug was approved, and yet another billion-dollar drug was born. Medicare and Medicaid sales of Vraylar (cariprazine) totaled $1.2 billion from 2016 through 2019.
A phase III trial of lumateperone, published in 2020, compared two doses of lumateperone with placebo over a four-week period. The efficacy graphic used a 16-point vertical axis, which ramped up the magnifying effect 11-fold.
Once again, it appears that there is a significant difference in the decrease in symptoms on PANSS for the two medicated groups compared with placebo. Here is a graphic of their PANSS scores over the four-week period:
There is virtually no separation between the medicated and placebo groups in this graphic. In fact, there was only a 3.2-point difference in PANSS scores at day 28 between placebo and the 42 mg dose, and a 2.4-point difference between placebo and the 24 mg dose. These differences, on a 210-point scale, were clinically meaningless, and yet the 42 mg dose was deemed to provide a statistically significant benefit.
Equally revealing, the 24 mg dose did not pass the “statistically significant” hurdle. As there was a slight difference in the baseline scores for the two doses, the total reduction in symptoms for the 42 mg dose was 1.6 points greater than for the 24 mg dose, and in the world of statistical significance, that minuscule difference separated an “effective” drug from a “non-effective” one.
As lumateperone did not reach the market until 2020, there is no public record yet of Medicaid and Medicare sales of Caplyta (lumateperone).
Injectable antipsychotics as a treatment for schizophrenia
Abilify Maintena/aripiprazole once monthly
In a 12-week study of aripiprazole once monthly as a treatment for an acute exacerbation of schizophrenia, patients were randomized to placebo or to a regimen of oral aripiprazole and an injectable dose for the first two weeks, with the oral dose stopped after that period.
The efficacy graphic in the published article used a 30-point vertical axis that charted a decrease in PANSS scores over the 12 weeks (a six-fold magnification of results). The visual portrays a dramatic difference in symptom reduction between the two groups.
However, the 14.6-point difference between drug and placebo at 12 weeks still did not reach the “minimal” level of a “clinically important difference.” A graphic that charts PANSS scores on a proper vertical axis does show a visible separation in symptoms, but, at the same time, provides a context for understanding why this difference was of a modest sort and still fell short of a clinically meaningful difference.
Otsuka and Lundbeck brought Abilify Maintena to market in 2014 as a treatment for schizophrenia and as a maintenance treatment for bipolar disorder. Medicare and Medicaid sales of the injectable from 2014 through 2019 totaled $3 billion.
The efficacy of the injectable aripiprazole lauroxil was assessed over a 12-week period as a treatment for an acute exacerbation of schizophrenia. The efficacy graphic in the published article used a 25-point vertical axis (a seven-fold magnification of results). The efficacy results, which compared two different doses to placebo, seem much the same as in the Abilify Maintena study.
A graphic of their PANSS scores reveals that the separation between placebo and either injectable dose is slightly less than in the Abilify Maintena trial. The drug-placebo difference is small (12 points for the 882 mg dose, and 11 points for the 441 mg dose), and thus neither dose reached the 15-point standard for a clinically important difference.
Medicaid and Medicare sales of Aristada (aripiprazole lauroxil) totaled $726 million from 2015 through 2019.
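As a rough cross-check of the antipsychotic results in this section, the drug-placebo differences quoted in the text can be compared against the 15-point standard. The sketch below only tabulates numbers already stated above; the constant, dictionary, and function names are hypothetical, and arms for which no exact figure is quoted (the brexpiprazole doses and the 3 mg dose of cariprazine) are left out.

```python
# Minimum drug-placebo PANSS difference described above as clinically important.
CLINICALLY_IMPORTANT_PANSS_DIFFERENCE = 15.0

# Drug-placebo differences in PANSS points, as quoted in the text of this section.
reported_differences = {
    "cariprazine 6 mg": 10.0,
    "lumateperone 42 mg": 3.2,
    "lumateperone 24 mg": 2.4,
    "aripiprazole once monthly": 14.6,
    "aripiprazole lauroxil 882 mg": 12.0,
    "aripiprazole lauroxil 441 mg": 11.0,
}


def meets_threshold(difference: float,
                    threshold: float = CLINICALLY_IMPORTANT_PANSS_DIFFERENCE) -> bool:
    """True when the difference reaches the clinically important standard."""
    return difference >= threshold


for arm, diff in reported_differences.items():
    shortfall = CLINICALLY_IMPORTANT_PANSS_DIFFERENCE - diff
    status = "meets" if meets_threshold(diff) else f"falls short by {shortfall:.1f}"
    print(f"{arm:30s} {diff:5.1f} points: {status}")

# Arms that reach the standard (an empty list for the figures quoted here).
arms_meeting = [arm for arm, diff in reported_differences.items() if meets_threshold(diff)]
```

A simple greater-than-or-equal comparison is used here on the assumption that a difference of exactly 15 points would count as meeting the standard.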
Drugs for tardive dyskinesia
The Abnormal Involuntary Movement Scale (AIMS), which is used to measure tardive dyskinesia symptoms, assesses movements in seven areas, each scored from 0 to 4. Thus, it is a 28-point scale.
In a 12-week trial of deutetrabenazine, the efficacy graphic used a 3-point vertical axis for depicting drops in AIMS scores (a nine-fold magnification). The chart depicts a dramatic decrease in TD symptoms for the deutetrabenazine group compared with placebo, with the separation evident at the end of four weeks.
In fact, at the end of 12 weeks, there was only a 1.4-point difference in reduction of symptoms on the AIMS scale between drug and placebo, with this minimal difference made clear when the symptom scores are plotted on a 28-point vertical axis.
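One way to see why the two charts look so different is to ask how much of the chart's height the 1.4-point gap occupies under each axis. The following is a minimal sketch under that framing; `share_of_axis` is a name made up for this illustration, and the calculation assumes the gap between the two lines at week 12 is roughly the 1.4-point difference in symptom reduction.

```python
def share_of_axis(difference: float, axis_span: float) -> float:
    """Fraction of the chart's vertical extent taken up by a score difference."""
    if axis_span <= 0:
        raise ValueError("axis_span must be positive")
    return difference / axis_span


difference = 1.4  # drug-placebo difference on AIMS at week 12, per the text

published_share = share_of_axis(difference, 3)    # 3-point axis in the published chart
full_scale_share = share_of_axis(difference, 28)  # 28-point axis covering the whole scale

print(f"3-point axis:  {published_share:.0%} of chart height")
print(f"28-point axis: {full_scale_share:.0%} of chart height")
```

On the 3-point axis the gap fills nearly half the chart; on the 28-point axis it fills about one-twentieth.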
The 1.4-point difference was deemed “statistically significant,” and thus researchers concluded that deutetrabenazine was safe and effective as a treatment for TD. However, one might think that a 1.4-point difference on a 28-point scale wouldn’t be clinically noticeable, and two secondary outcome measures proved that to be the case. There was no “statistically significant” difference in outcomes based on the clinician’s “Global Impression of Change,” and that proved to be true for the patients’ self-assessments too. Neither the clinicians nor the patients noticed a significant difference in “global outcomes” at the end of 12 weeks.
Medicaid and Medicare sales of Austedo (deutetrabenazine) totaled $581 million from 2017 through 2020.
A six-week trial of valbenazine produced a similar drop in symptoms for the medicated patients, and since the placebo patients in this study didn’t improve, the efficacy graphic that was published, using a 4-point vertical axis, showed a steep separation from placebo for both doses of valbenazine (40 mg and 80 mg).
However, charting TD symptoms on a 28-point vertical axis shows a much more modest difference between drug and placebo.
A secondary outcome measure once again reveals which of the two charts better reflects the clinical reality. At the end of six weeks, there was no “significant” difference between placebo and either drug dose on an outcome measure titled “Clinical Global Impression of Change—Tardive Dyskinesia.” This is an outcome that tells of clinicians being unable to notice a difference in the “global” outcomes of the medicated and placebo groups.
Medicare and Medicaid sales of Ingrezza (valbenazine) totaled $1.2 billion from 2017 through 2019.
The Latest Antidepressant
It is now fairly well known that clinical trials of “second-generation” antidepressants failed to show much of a benefit on the HAM-D scale that was used to assess primary outcomes when these drugs came to market. In the trials of vortioxetine, an antidepressant that was approved by the FDA in 2013, the Montgomery-Åsberg Depression Rating Scale (MADRS) was used to assess efficacy. This is a 60-point scale that assesses symptoms in 10 domains, with a score of 0 to 6 for each domain.
A U.S. randomized trial of vortioxetine compared two doses of the drug with placebo over an eight-week period. The efficacy graphic in the published report used an 18-point vertical axis to depict decreases in MADRS scores (a three-fold magnification). While it showed a separation between placebo and both drug doses, the visual separation wasn’t nearly as pronounced as in the antipsychotic trials, reflecting the much smaller magnification in this presentation of the data.
When the MADRS scores are plotted on a 60-point axis, the difference between placebo and the two vortioxetine doses nearly disappears.
At week eight, there was a 3.2-point difference in symptoms on the MADRS scale between placebo and the 20 mg dose of vortioxetine, and a 1.9-point difference between placebo and the 10 mg dose. Although the outcomes for the two vortioxetine cohorts were nearly identical, the 20 mg dose was found to produce a benefit over placebo that was “statistically significant,” while the 10 mg dose failed to reach this standard.
Other trials of vortioxetine produced similar results, with some doses deemed to provide a “statistically significant” benefit, and other doses failing to meet this standard. Even so, Takeda and Lundbeck, with the help of the psychiatrists they paid to serve as their consultants, advisors, and speakers, still found success in the marketplace with this drug, with Medicaid and Medicare sales totaling $1.25 billion from 2014 through 2019.
Visual Illusions: Key to the Marketing of Psychiatric Drugs
The published reports of the eight trials reviewed here all featured graphics that provided a “visual” of drugs that were quite effective in reducing the symptoms of the disorder. The graphics told of treatments that began providing a benefit over placebo fairly quickly, with this benefit sustained and often becoming more pronounced by the end of the study.
These efficacy graphics are what stick in the minds of prescribers. The visual presentation overrides the numerical data, and these graphics are regularly used in presentations at dinner events, conferences, and CME webinars. They tell of a separation of drug treatment from placebo over time, and thus present a visual understanding of how the clinical course of treated patients is superior to the course in untreated patients (which is how the placebo group is seen).
Yet, in the antipsychotic studies reviewed here, not a single dose of any of the five antipsychotics produced a benefit that met the standard for a “clinically important difference” over placebo. In the trials of the two TD drugs, neither drug produced a benefit over placebo that was clinically noticeable. The same was true for vortioxetine: neither drug dose produced a benefit over placebo that, on a 60-point scale, would be clinically noticeable.
In every instance, the visual picture of significant drug efficacy arose from the use of a vertical axis that served as a magnifying glass, dramatically enlarging the differences between drug and placebo.
However, as was seen above, a different picture emerges when symptoms are charted on a graphic that employs the full range of possible scores as its vertical axis. The magnifying effect is gone, and the visual is now in sync with data that tells of a drug that has failed to provide a clinically meaningful benefit.
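For readers who want to check the magnification figures cited throughout this article, the ratio of each scale's range of possible scores to the published axis span can be tabulated. This is a back-of-the-envelope sketch, not part of any published analysis: the axis spans are those described in the text, the names `SCALE_RANGES`, `charts`, and `magnification` are invented for illustration, and the AIMS and MADRS ranges are assumed to start at zero because each item's lowest score is 0.

```python
# Lowest and highest possible total score on each scale, per the descriptions above.
SCALE_RANGES = {"PANSS": (30, 210), "AIMS": (0, 28), "MADRS": (0, 60)}

# (drug, scale, span in points of the vertical axis in the published efficacy chart)
charts = [
    ("brexpiprazole", "PANSS", 25),
    ("cariprazine", "PANSS", 25),
    ("lumateperone", "PANSS", 16),
    ("aripiprazole once monthly", "PANSS", 30),
    ("aripiprazole lauroxil", "PANSS", 25),
    ("deutetrabenazine", "AIMS", 3),
    ("valbenazine", "AIMS", 4),
    ("vortioxetine", "MADRS", 18),
]


def magnification(scale: str, axis_span: float) -> float:
    """Full range of possible scores divided by the span shown on the chart."""
    low, high = SCALE_RANGES[scale]
    return (high - low) / axis_span


factors = {drug: magnification(scale, span) for drug, scale, span in charts}

for drug, scale, span in charts:
    print(f"{drug:26s} {scale:5s} {span:>2}-point axis -> {factors[drug]:.1f}x")
```

The computed ratios are slightly less round than the "seven-fold" or "three-fold" figures quoted in the text, which are approximations.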
Update: This article has been updated to reflect that although PANSS is a 210-point scale, the range of possible scores is 30 to 210 (a 180-point span). The PANSS graphics previously had a y-axis of 0 to 210; the updated graphics use a y-axis of 30 to 210.