Ioannidis Questions Strength of Psychology and Neuroscience Literature


Last week, well-known Stanford scientist John Ioannidis and his colleague Denes Szucs released a new analysis online. They examined research published in eighteen prominent psychology and cognitive neuroscience journals over the past five years and found that the studies in these fields are generally of “unacceptably low” power and suffer from inflated effect sizes and selective reporting.


Ioannidis is perhaps best known for his 2005 article in PLoS Medicine, “Why Most Published Research Findings Are False,” in which he presented statistical arguments indicating that “for most study designs and settings, it is more likely for a research claim to be false than true.”

“Moreover,” he wrote, “for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias.”

In their latest paper, Szucs and Ioannidis take a closer look at publication practices in experimental psychology and cognitive neuroscience. They extracted studies published between January 2011 and August 2014 from 18 major journals. The journals' impact factors ranged from slightly above 2, as with Acta Psychologica, to as high as 17, as with the widely read Nature Neuroscience.

Studies published in cognitive neuroscience journals are often of low power and may overstate their significance.

Of major concern was the fact that many of the studies were of very low power, which can inflate the size of the measured effects and lower the likelihood that the results can be reproduced. In short, low power increases the risk that a statistically significant finding is false.
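The link between low power and false findings can be illustrated with a small calculation. The sketch below is not taken from the new paper. It uses the simplified, bias-free relationship from Ioannidis's 2005 article, in which the chance that a significant result is false depends on the study's power, the significance threshold, and the pre-study odds that a tested effect is real. The function name and the example values for power and odds are assumptions chosen purely for illustration.

```python
def false_report_probability(power, alpha=0.05, prior_odds=1.0):
    """Share of statistically significant findings expected to be false.

    Simplified model with no bias or selective reporting (an assumption):
      power      - probability of detecting an effect that is real
      alpha      - significance threshold (false positive rate under the null)
      prior_odds - pre-study odds of real effects to null effects
                   (hypothetical; rarely known in practice)
    """
    if not 0 < power <= 1:
        raise ValueError("power must be in (0, 1]")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be in (0, 1)")
    if prior_odds <= 0:
        raise ValueError("prior_odds must be positive")

    true_positives = power * prior_odds   # real effects that reach significance
    false_positives = alpha               # null effects that reach significance
    return false_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Illustrative values only: well-powered vs. poorly powered studies,
    # with even odds (1:1) and long odds (1:4) that a tested effect is real.
    for power in (0.8, 0.5, 0.2):
        for odds in (1.0, 0.25):
            frp = false_report_probability(power, prior_odds=odds)
            print(f"power={power:.1f}  prior odds={odds:.2f}  FRP={frp:.2f}")
```

Under these assumptions, a study with 20 percent power testing hypotheses at 1:4 odds would be expected to yield significant findings that are false about half the time. Adding bias or selective reporting, which this sketch leaves out, would push that figure higher.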

Their analysis found that “power in cognitive neuroscience and psychology papers is stuck at an unacceptably low level” and that “overall power has not improved during the past half century.” In fact, cognitive neuroscience journals had much lower power levels than the psychology journals, perhaps a result of the increased resources needed per participant in neuroscience studies.

“The power failure of the cognitive neuroscience literature is even more notable,” the authors write, “as neuroimaging (‘brain-based’) data is often perceived as ‘hard’ evidence lending special authority to claims even when they are clearly spurious.”

The data also revealed that the inflation of results may be more common in “high impact” journals and that studies in these journals also had, on average, lower power.

They conclude:

“In all, the combination of low power, selective reporting and other biases and errors that we have documented in this large sample of papers in cognitive neuroscience and psychology suggest that high FRP [false report probability] are to be expected in these fields. The low reproducibility rate seen for psychology experimental studies in the recent Open Science Collaboration is congruent with the picture that emerges from our data.”



Szucs, D. and Ioannidis, J.P., 2016. Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. bioRxiv, p.071530.

Justin Karter
MIA Research News Editor: Justin M. Karter is the lead research news editor for Mad in America. He completed his doctorate in Counseling Psychology at the University of Massachusetts Boston. He also holds graduate degrees in both Journalism and Community Psychology from Point Park University. He brings a particular interest in examining and decoding cultural narratives of mental health and reimagining the institutions built on these assumptions.


    • Or perhaps “Science is the belief in experts regardless of their ignorance.”

      And I’m just SURE that these low-powered experiments that are more likely to produce a positive result are just a result of naivete on the part of our poor, under-educated scientists. It’s just IMPOSSIBLE that they conduct this kind of experiment specifically BECAUSE it is more likely to give a positive result. Or is it?
