John Ioannidis is a Stanford professor, a physician, and one of the most eminent scholars in the world in the field of evidence-based medicine.
Ioannidis has spent his career exposing the weak foundations of much of modern medicine. His 2005 paper, “Why Most Published Research Findings Are False,” became the most-viewed article in the history of PLOS Medicine and helped spark a global reckoning with reproducibility. He has since warned about how evidence-based medicine can be hijacked by industry influence, how biased reward systems in academia favor quantity over quality, and how even systematic reviews can recycle flawed data. His critiques extend to psychiatry, where pharma-funded trials often tilt toward positive results, guidelines are shaped by insiders, and neuroscience findings are more fragile than they appear.
He is a tenured professor at Stanford and has an extensive background in medicine, epidemiology, population health, and data sciences. As much as he is a champion of good science, Ioannidis is also a lover of the arts and humanities. He’s a novelist, teaches poetry, loves operas, and has written libretti for four operas himself.
In this interview, he discusses the extensive bias that pervades scientific research, the problematic practices and pressures that enable flawed science, and the significant issues with antidepressant research. At the same time, he reminds us why good science is a gift to humanity and something we must protect for our well-being and dignity.
The transcript below has been edited for length and clarity.
Ayurdhi Dhar: You always say that science is the best thing that has happened to us, and your work has been about doing and protecting good science. You also have an equal love for the arts and humanities. I’d like to know a little bit about how you came to love science and how you realized it needs protection.
John Ioannidis: I think that science is a very vulnerable and sensitive human endeavor. I had early experiences with scientists, including scientists in my own family. I saw how they work — how difficult it is to conduct good science and research, how many obstacles there are in getting things right, and how many calculations are difficult or flawed. All of that was intriguing, exciting, and at the same time, it was essential to try to do it better.
Science is not perfect. It is something that is in a continuous struggle to improve. I was attracted very early on, learning from my own mistakes and struggles to do good science, seeing how easy it is to be biased and make mistakes.
Ayurdhi Dhar: You have probably been asked about your 2005 paper a thousand times, so I will keep it short. This paper has been viewed over 3 million times. Tell us a little bit about what you found that led you to determine that most medical research findings are false.
John Ioannidis: That paper tried to estimate the chance that a research finding that passes some threshold of discovery, typically statistical significance in most of biomedical research, is likely to reflect a true finding, as opposed to something spurious. It tries to model different features, starting from how good the power of the study is, bias, and the fact that science is carried out by multiple scientists with multiple efforts.
It helps us calculate what the chances are that a claim of a significant result may be correct, under different circumstances, in different fields, in different phases of the development of a discipline, and with different factors, including biases, pressures from sponsors, conflicts, and other subversive powers that may operate around science.
In most settings, it turns out that what we get is unlikely to be correct. It’s more than 50% likely that it will not. Of course, that varies. If we have a very large late-stage randomized trial and we get a significant result, it’s far more than 50% likely to be correct. But in the vast majority of scientific efforts, both in 2005 and in 2025, it’s less than 50% likely, and in many cases, it’s far less than 50%.
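For readers who want to see the mechanics, here is a minimal sketch, not taken from the interview itself, of the kind of calculation the 2005 paper formalizes: the positive predictive value (PPV) of a "significant" finding, given the prior odds R that a tested relationship is true, the study's power, and the significance threshold. The scenario numbers below are illustrative assumptions, and the full model adds further terms for bias and for multiple teams testing the same question.

```python
# Minimal sketch of the core relationship behind "Why Most Published Research
# Findings Are False": the post-study probability (PPV) that a statistically
# significant claim is true. Scenario values below are illustrative assumptions.

def ppv(prior_odds_r: float, power: float, alpha: float = 0.05) -> float:
    """PPV = (1 - beta) * R / ((1 - beta) * R + alpha), ignoring the bias and
    multiple-team terms that the full 2005 model also accounts for."""
    return (power * prior_odds_r) / (power * prior_odds_r + alpha)

# A well-powered confirmatory trial of a plausible hypothesis:
print(ppv(prior_odds_r=1.0, power=0.80))   # ~0.94 -> likely true
# A plausible but unconfirmed hypothesis with decent power:
print(ppv(prior_odds_r=0.10, power=0.80))  # ~0.62 -> better than a coin flip
# An underpowered exploratory search over many candidate relationships:
print(ppv(prior_odds_r=0.01, power=0.20))  # ~0.04 -> most "discoveries" are false
```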
Ayurdhi Dhar: This paper became really famous. Did you face any pushback for writing something that said, “Hey, a lot of these medical research findings might be false?” I ask because I think of Carl Elliott’s work around whistleblowers in medicine and how they’re treated.
John Ioannidis: Of course, there was pushback, and this is very appropriate and scientific. Science depends on critical feedback, on debate, on healthy, organized skepticism. There was a range of criticisms, some of them very superficial, and others deep, interesting, and constructive. I think that work in this field has evolved since then, and has become richer, deeper, more interesting, and perhaps also more accurate.
I avoided attacks that whistleblowers may get when they try to reveal that a single paper or a single scientist is fraudulent. The 2005 paper was not about a single scientist or a single paper being manipulative or fraudulent. It was about the entire scientific enterprise, and paradoxically, people are more willing to listen to “millions of papers may be wrong” versus “my paper is wrong”. When it becomes personal, people tend to react very negatively and emotionally.
When you describe problems at scale, people are more tolerant because they think, “Well, my own work is not what you’re talking about, is it?” In fact, the paper discusses the work of all of us. It’s not making exceptions. It shows that anyone, under many different circumstances, could succumb to these biases that shape problematic scientific literature.
Ayurdhi Dhar: The overall takeaway from your work around bias seems to be that there is bias at every step, at every level of research. Let’s take my discipline, Psychology. You wrote there could be bias in observation, and in psychology that tends to be true, as there’s a lot of subjectivity in what you observe and how you observe it. There could be bias in measurement. We take complex things like grief and reduce them to numbers. There are problems with classification; DSM classifications are consistently under scrutiny. There’s bias in analysis; experts have shown that researchers run multiple analyses until they find what they want – “Hey, I got significance!” There’s bias in the dissemination of research, a topic Erick Turner has written about, particularly in relation to antidepressants and publication bias.
You write that neuroimaging and cognitive science are especially plagued by this, and I agree. Every other day, a new study “finds” ADHD in the brain, and people take it seriously, thinking brain scans mean hard evidence. Please describe the biases specifically found in the cognitive neuroscience and psychology literature.
John Ioannidis: There are literally hundreds of biases. I published another paper with David Chavalarias on a catalogue of biases using text mining of the entire PubMed. Of course, that was not even an all-inclusive catalogue because some biases may not be named “X bias”, but we found several hundred different biases. A significant portion of these biases were first identified in the psychological or social sciences and subsequently applied in other fields, including biomedicine.
Psychology and cognitive sciences are not necessarily worse than other fields. Not only have they contributed to the theory of bias, as biases were first described in these fields, but they were also among the first fields to recognize the need for soul-searching. They began conducting large-scale reproducibility checks, and typically about two-thirds of the experiments could not be reproduced.
Fields that have very messy and questionable measurements are likely to do worse than those that are more rigid and specific. Fields that have more pressure from sponsors or from other stakeholders who want to get a particular result do worse. Sometimes it’s allegiance bias alone, when someone has made a career out of a specific observation or publications, and they defend it at all costs.
Also, issues of multiplicity. Is it a field where only a few questions can be asked, or is it a field where millions of questions can be asked? How much transparency is there, and what measures are being taken to address it? Psychological science, in principle, may be somewhere in the middle compared to other fields. It’s not that bad. It’s not that perfect.
Probably, there’s a lot of allegiance bias. Some strong beliefs and narratives permeate the field, on which people build careers and then want to follow. One thing that might put it in a worse state than other fields is that the outcomes and the measurements can be hazy. Mental or psychological states are not as concrete as measuring a protein or other tangible entity.
Imaging studies are highly complex; every little voxel and pixel might count, and you have so many of them on which you try to make inferences.
Ayurdhi Dhar: So, which field is doing really poorly when it comes to this?
John Ioannidis: Every single field. Probably, nutrition science would stand out as very poor in terms of its standards, methods, transparency, reproducibility potential, and also the willingness to amend because of 1) how difficult it is to get accurate results, and 2) how much is being done to get there. Some fields may be horrible, but then take the right steps and become strong fields.
Genetics was probably among the worst 20 years ago; almost nothing correct was published in genetics other than very strong monogenic effects. The field said, “We’ve had enough of that. We’ve published 100,000 papers that go nowhere. We need very strong validation practices, better statistics, more transparency, and sharing.”
Then they got results that are reproducible, but again, that might not be useful practically. This applies even more so for mental health and for psychological outcomes and phenotypes. Genetics is just amazingly complex. Even if some of the signals are now more reproducible, I am not sure they can be put to good use to change outcomes in patients or people who want to modulate their behaviors.
Ayurdhi Dhar: Twenty years ago, genetics was in trouble, and the field has made strides in trying to better itself. For example, there were these popular but erroneous narratives that a single gene causes schizophrenia or bipolar disorder. Even now, those narratives persist, even though they are false. Once they enter the world, they are tough to dispel. Same for depression: in the broader public and in pop culture, depression is seen as caused by serotonin. No matter how much the evidence has changed, the popular narrative has barely changed at all.
It seems that when a narrative is really simple, such as “it’s your biology, it’s just a gene,” it spreads like wildfire. But complex narratives never make it big. They don’t make it to the cover of TIME Magazine.
John Ioannidis: You’re absolutely right. There’s a competitive advantage for oversimplified narratives. The serotonin hypothesis is a largely discredited theory, but it persists because it offers a convenient explanation for what’s wrong with you and what we can do about it. But obviously, things do not work so simply, especially for psychological phenotypes and for mental health.
But simple narratives have a competitive advantage for journalists, bestseller authors, organizations, and many stakeholders who want to say, “I know what is wrong with you and I have an easy fix for that.” For mental health, that’s not the case with very, very rare exceptions.
Ayurdhi Dhar: Can you talk about a study or something you remember as one of the most egregious examples of bias?
John Ioannidis: The average study that I read every day. I don’t want to single anyone out, because I did that once in another paper. The bias we were discussing there was a bias in a method that is itself used to detect bias. We were ready to publish; thousands of studies had already used this bias-detection method and had overclaimed what they could achieve with it. The editor said, “We will publish the paper. But you say you’ve looked at thousands of studies. Please include one example of a single study that exemplifies this.”
We didn’t want to pinpoint anyone, but the editor insisted. We did put that example, and then a few months later, I met the first author of that particular study. Obviously, he was mad because he had been singled out as the one who had this horrible study, while all the other tens of thousands of scientists who just did the same had not been singled out.
I get asked that many times: What is the worst study? No, there are just millions of studies. It is tempting to say, “This is the most horrible study,” and perhaps people engaged in the field feel the anger and pain even more intensely when they encounter a horrible study. We should move beyond that, because there are literally tens of millions of studies with horrible problems.
Ayurdhi Dhar: There is a rage associated with encountering truly flawed science. But there are so many of them. If we focus on it as a systemic issue, then perhaps there can be systemic responses to the problem, which are different from rage responses.
John Ioannidis: I’m very much in favor of systemic responses. The rage responses have their value in terms of sensitizing people and raising awareness. But they may lead to the misperception that science is produced by fraudsters, that science is a lost cause, or that it should not be trusted.
The problems range from common issues where scientists are unaware and fall prey to their lack of understanding and inadequate methodological preparation, all the way to complete fraud. Nowadays, we have a massive production of fraudulent papers. In the past, creating a fraudulent paper was a work of art. It took time and effort to create a fraudulent paper. Now, you can have AI create literally millions of fraudulent papers overnight.
Ayurdhi Dhar: When I teach bias and industry corruption in my classrooms, my students sometimes feel disheartened and ask, “How do you figure out what’s good research and what’s bad? It seems like you would need an extensive background in statistics.” I find myself fumbling, so I want to ask you: since we consume so much medical research, what are some of the red flags that people can look out for?
John Ioannidis: There’s no easy checklist. That applies to both the average citizen, who’s bombarded by information, and the expert. Some aspects probably would increase or decrease the credibility of the work. An experimental, randomized study is likely to yield more accurate results than an observational study.
Large studies can be more reliable than small studies, not because a single small study cannot be good, but because many small studies may be flawed or may simply disappear.
Who is sponsoring the study? Is there a potential conflict? Does the sponsor want to achieve a specific outcome? Will the intervention be a blockbuster, and will the sponsor make billions of dollars? It could still be correct, but one must exercise a higher level of skepticism.
How much transparency is there? Was the study pre-registered so that people knew that it was happening? Did they know in detail how it was happening, with what outcomes, design, and analysis? Did the researchers follow what they promised, or did they deviate from it? How concrete do the numbers look? Do they seem to emerge from many analyses, with perhaps only a few having been reported?
It’s not easy. For the average citizen and the expert, it is a significant problem. How much can we teach people to learn more about such methods? Here I’m biased. I believe that we should make an effort to teach more from the early stages. Give more emphasis to science education because I think it’s central to human civilization, well-being, and the future.
Ayurdhi Dhar: This brings us to training. It mortified me to learn from your writing that even experts in a field often receive inadequate training in statistics. You said that doctors receive rigorous training in medicine, but often very poor training in research, and especially in statistics.
Repeatedly, you wrote that there is rampant statistical illiteracy among researchers. Statistical significance is a basic undergraduate concept, and there are concerns about whether we should even be conducting significance testing, given that statistical significance may not mean clinical relevance. There are calls to make the p-value threshold more stringent, or to get rid of the whole enterprise. But you argued that we need to keep it because scientists often get even this basic method wrong. If they get even this wrong, how are we going to incorporate more advanced methods?
John Ioannidis: It is a major challenge, and statistics permeate the large majority of research nowadays. In principle, this is good.
Statistics is a mature science. It has very strong tools and can improve the quality of the work. But we have seven million papers published every year, and only a small minority of them have some expert statistician or methodologist on board. Others depend on the statistical knowledge of the authors, and very often that knowledge is rudimentary; they’re not aware of the tools they’re using.
Anyone can get hold of statistical software that can run extremely sophisticated analyses, and with artificial intelligence, it is getting even easier to run analyses that not only are complicated but sometimes lack transparency. We have powerful tools and a lot of people who are not trained to use them; it’s so easy to misuse them.
Then we have peer review, which should detect flaws, but most peer reviewers are also not trained in these aspects, so much of what is wrong will likely slip through. Post-publication review may identify some of these problems, but again, post-publication review is quite limited, and even when it’s conducted, it does not impact the paper. Most journals retract or correct very rarely.
We need to teach researchers to be more careful with statistics. Please don’t use something unless you know how it works, what its assumptions are, what it means, how it functions, or how it fails when it breaks down.
It’s very tempting, especially with LLMs (Large Language Models). An LLM can generate amazing analyses for those with little expertise, and these amazing analyses will almost always be completely wrong because the way that they were sought and requested was not appropriate. I’ve seen that happen again and again and again.
We must not put pressure on people to publish papers for reasons that have nothing to do with science and with helping people, but “I need to graduate, therefore I need to publish X number of papers”, or even worse, “I’m a medical resident taking care of patients, which is a highly respectable occupation. At the same time, I also need to publish papers to finish my residency.” Why? There’s absolutely no reason.
We force people to do things they’re not prepared to do, and we shouldn’t blame them if they do things wrong.
Ayurdhi Dhar: I noticed something in your writing and interviews: you study bias, you have seen the scope of statistical illiteracy and the state of research, but you still maintain optimism and hope. You are aware that you may be biased yourself. How do you maintain this hope or humility, and how can scientists cultivate it, or do you also feel really dejected and defeated sometimes?
John Ioannidis: I feel defeated very often, but at the same time, there is room for optimism. Science has helped humanity over time. It has not been linear progress. There have been dark ages and enlightenment periods, periods of stagnation, and at other times, we have experienced more innovation.
Some fields may become stagnant, publishing thousands of papers that contribute nothing, while others may be more disruptive.
When you look at about fifty million people who have contributed to publishing scientific work, this is an enormous contribution of talent, effort, time, resources, and capacity. If you look at two hundred million papers, even if a small portion of them is reliable, this is a major achievement.
We could do something to reduce the papers that are poor, horrible, and useless. But there is still quite a bit of very exciting, very worthwhile, useful, trustworthy, reliable, reproducible work that is disseminated.
The challenge is how to separate that work from the rest. We have some tools, but it’s a struggle. It takes time and effort. There will be some progress and some steps back, and again more progress. I cannot predict what will happen in the future. Who knows, maybe humanity will become extinct. But I want to keep an optimistic outlook as a major possibility.
Ayurdhi Dhar: Your work with the arts, particularly in poetry and writing operas, does that help maintain some hope?
John Ioannidis: Humans are fascinating beings, and it’s important to do anything we can to maintain the dignity of humanity.
I have friends who strongly believe in science and technology, and I share their conviction. But I think that humans are multifaceted, very complex beings, so we also need art and philosophy; we need many different ways for humans to navigate their experience in the world: what it means, what matters, what is important, what can make a difference for them, for others, for the people they love, and for the community where they live.
Science is indispensable, but it’s not alone in that. We need to look at all these other weapons we have that can give us the opportunity to think that humans are worth it. As a species, we’re worth it.
Ayurdhi Dhar: That’s a beautiful sentence, “Science is indispensable, but it’s not alone in that.” I want to get into your work around antidepressants, because this is really important for the wider public, and especially for our readers. You directly addressed this issue, writing that for the longest time, multiple clinical trials have shown antidepressants to be both effective and safe. And then two meta-analyses came and burst that bubble.
Today, there is a bigger conversation around antidepressant withdrawal. We now know that it’s not self-limiting in the sense of being over for everyone in six weeks; it can take up to a year or longer. The UK’s NICE changed its guidelines around antidepressant withdrawal. We had the major umbrella review debunking the serotonin theory.
What were the problems in these earlier trials that had found amazing results? They are incredibly popular drugs, although I do hear whispers that the pharmaceutical industry is moving away from psychopharmaceuticals.
John Ioannidis: The problem is that research on antidepressants, and this is also true for other drugs and for other mental health interventions, has typically been small studies of short duration with outcomes that are not hard outcomes. They’re mostly scale outcomes for some symptoms. They may not capture hard outcomes like suicide, suicide attempts, loss of a job, major marital events, or violence. That’s not easy to study.
Typically, you have a study population of 100 people followed for eight weeks or slightly longer, whereas you need long-term studies with very large sample sizes. There are many meta-analyses of such studies, and I have done some myself.
On average, antidepressants probably offer some benefit. It’s a very subtle benefit for the average person. In the largest meta-analysis on which I was a co-author, we observed a standardized mean difference of approximately 0.3, which is a relatively small and modest effect.
Now, if you assume some extra bias, that 0.3 becomes 0.2 or 0.15. Some people may respond, so the average does not represent the individual experience. Many people will derive no benefit, while others may respond more and experience a significant improvement.
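As a rough, back-of-the-envelope illustration of what those numbers mean (not a calculation from the meta-analysis itself), a standardized mean difference is simply the drug–placebo difference in means divided by the pooled standard deviation, so it can be translated back into points on a rating scale once you assume a scale standard deviation:

```python
# Back-of-the-envelope sketch: translating a standardized mean difference (SMD)
# into raw points on a symptom rating scale. The scale SD below is an assumed
# illustrative value, not a figure reported in the meta-analysis.

def smd_to_scale_points(smd: float, scale_sd: float) -> float:
    """SMD = (mean_drug - mean_placebo) / pooled SD, so raw difference = SMD * SD."""
    return smd * scale_sd

assumed_scale_sd = 8.0  # hypothetical SD (in points) of a depression rating scale
for smd in (0.30, 0.20, 0.15):  # meta-analytic estimate and bias-adjusted guesses
    points = smd_to_scale_points(smd, assumed_scale_sd)
    print(f"SMD {smd:.2f} ~ {points:.1f} points difference between drug and placebo")
```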
People can also try different psychotherapies. These have the same type of effects and the same problems, because the studies are also small and of short duration, with allegiance biases playing the same role that sponsor biases play in drug trials. But it’s worthwhile taking a shot.
The problem is that these drugs are not just used by people who have substantive symptoms. They’re widely used by people who have very few or no symptoms. Of course, there’s no benefit, and it only harms.
Even with low levels of harm (and I don’t think the harms are merely low-level, but let’s say that they are), if you have hundreds of millions of people getting these drugs, then you have to multiply that harm across all of them, and the net population-wide benefit is negative. These drugs end up causing more problems overall than the help they offer to some individuals.
The same applies to withdrawal symptoms. Some will get withdrawal symptoms. Unfortunately, both for the benefits and for the withdrawal, we have no markers to predict who’s going to do well and who’s not, or even experience harm.
We’ve started getting some information on the harm side, but that’s not enough. But on the benefits, we have very little. People simply try to see how it works, and even that is complicated by subjectivity, including numerous placebo and nocebo effects, as well as circumstantial pressure and individual experience.
They are drugs that have been very good for the industry to make billions of dollars. So even with a relatively modest price, if you have hundreds of millions of people taking them, you can make a lot of profit.
I think this means that the literature is also influenced by sponsor pressure, which creates experts who promote these drugs, generates opportunities and meetings, and puts journals under pressure to publish more positive material about them.
As you say, even though there have been numerous benefits for the companies, perhaps they’re moving out because they see that they’re reaching a dead end. It’s good news because we do need new treatments, new concepts.
Ayurdhi Dhar: I wanted to take a minute to discuss the harms. There is the notorious Study 329 by GlaxoSmithKline (then SmithKline Beecham), in which they reported that two antidepressants (paroxetine and a tricyclic) were effective and safe for adolescents. In reality, the actual findings showed they were both ineffective and dangerous. They caused suicidal ideation and acts.
I know you wrote about how harms are underreported and that specific language is used for this purpose. I saw that exactly in Study 329; there were phrases such as “Paroxetine was well-tolerated,” which is a very vague phrase, and a red flag, as you say. There was a phrase saying that some adolescents showed “emotional lability” instead of saying they were suicidal.
Could you tell us a little bit about how harms are downplayed?
John Ioannidis: We’ve documented that across different disciplines. It’s not just antidepressants, it’s not just psychiatry, we’ve seen that in almost any medical discipline, that harms are underreported, underestimated, and also not commented on as much as the benefits.
In an early paper that I published in JAMA, I estimated that the amount of space devoted to reporting harms was less than the space dedicated to the authors’ names.
I think that the situation has improved somewhat because many people have emphasized the importance of being more thorough in documenting and reporting harms. However, there’s still a significant imbalance.
Harms have to pass a high threshold of resistance to be accepted. I have experienced this with trials when I was at NIH. Even for trials that should be independent — NIH is not an industry — the industry manufacturer was doing its best to suppress information on harms. We were fighting, “No, this is happening, it’s there”. They would say, “No, it’s not that important, it’s okay, let’s watch, not say anything.”
Same for antidepressants. Study 329 is likely an extreme example because the reanalysis yielded an exact opposite conclusion compared to the original analysis. The original analysis suggested that these antidepressants are very safe and very effective; the reanalysis showed that they’re not effective and they’re not safe.
It was an extreme case that affected the treatment of millions of people, especially adolescents. Children and adolescents are not a population that should be exposed to these drugs unless there is a very, very special indication, and that would be a rare exception, because these drugs do have harms.
We need to improve at multiple levels: enhance regulatory agencies’ oversight, refine the requirements journals impose when publishing a trial, improve the processes institutional review boards follow when approving a trial, and then ensure the trial is published. When the picture is not complete, request the remaining information. It will require structural work to ensure that sponsors do not have near-veto power to shape the picture of the harms of their interventions. I think sponsors should move away from shaping the narrative about benefits and harms.
Ayurdhi Dhar: You have hope and optimism, and I need some of that. When I consider the scope and systemic nature of the issue, I think of cases like those of Joseph Biederman or even Faruk Abuzzahab, who, at one point, had his license revoked but was later awarded a Lifetime Achievement Award by the APA. When I see people who try to do the right thing being pushed down, ostracized, and squashed, and people who were caught with their hand in the cookie jar being rewarded by institutions, I lose heart. It’s nice to hear you have not lost hope.
John Ioannidis: Thank you so much for the opportunity. Yeah, we should not lose hope. Some bad people will reach very high, but not for too long.