Erick Turner is an Associate Professor in the Department of Psychiatry at the Oregon Health and Science University (OHSU). He is also a senior scholar with OHSU’s Center for Ethics and Health Care.
Dr. Turner has been an FDA reviewer and has dedicated his work and life to improving research transparency. He’s well known for his work on publication bias and antidepressant trials, but his findings show that psychotherapy research is also riddled with problems.
What happens when those we trust with knowledge in our society betray us? In today’s interview, we discuss how dubious research practices are not simply the work of a few bad apples but are instead built into the way we produce knowledge. We also examine the consequences of these practices for patients and the dangerous tradition of journal worship, then turn to how many of these problems can be solved.
The transcript below has been edited for length and clarity.
Ayurdhi Dhar: In 2008, you conducted a review of antidepressant trials to check for publication bias. Clinicians, researchers, and service users need trials that accurately and truthfully tell them whether a drug works and has adverse effects—the question of efficacy and safety. You found massive problems in the way antidepressant trials were published. What is publication bias, and what did you find in the 2008 review?
Erick Turner: In simplest terms, it’s picking and choosing what gets published and how it gets published.
Dhar: By authors and by journals?
Turner: Yes, both. It takes two to tango. We used FDA review documents. You can see all the results of all the trials that were done. These may or may not be published, which is the essence of publication bias. We took a cohort of antidepressant trials for 12 drugs. We tracked them into the published literature to see whether each trial was published and, if so, how it was published. Was it published in a way that agreed with the FDA data or not?
We found that if you only look at what ended up in the published literature, you would have the impression that virtually all the trials were positive. That means the drug demonstrated a statistical superiority to the placebo. So, it looked like the drugs couldn’t fail. However, if you look at the FDA review documents, you find there were quite a few more trials you didn’t even know existed from just looking into the published literature.
Furthermore, the way the results panned out was 50/50 in terms of positive versus not positive. So in 50% of the trials, the drug wasn’t good enough to beat a placebo. But you would never have known it from the published literature.
Dhar: You also write about spinning results.
Turner: A number of the trials were spun. Probably the most common maneuver was simply not publishing the (negative) trial. But there were also 11 trials that were negative, meaning the drug did not beat the placebo, yet they were published in a way that made them look positive. They used statistical alchemy to make a silk purse from a sow’s ear. Another phrase is putting lipstick on a pig.
Dhar: How did you react to finding all these problematic and dubious practices?
Turner: I personally wasn’t shocked because of my background at the FDA. I realized that there was this disconnect between what clinicians were seeing and what the FDA reviewers were seeing, and what was known to the FDA and the pharmaceutical industry. They had this secret going on that doctors and patients weren’t privy to.
Dhar: Was there any pushback from the academic or psychiatric community?
Turner: There was some. The drug companies hoped that no one would notice. But there was a problem: the paper was published in the New England Journal of Medicine and got a lot of press and international attention. So they couldn’t ignore it because it was creating a stir.
A few companies decided to push back. One of the arguments was that the negative studies were not important, so they didn’t deserve to be published. They argued, and I think it’s a specious argument, that the trials “failed”—not that the drug failed, but the study failed.
They argued that it shouldn’t matter that the study wasn’t published because it was scientifically flawed. We replied and said why not let the academic community make that decision? Why not publish it and let them decide rather than paternalistically deprive them of the information?
Dhar: That was 2008. In the new meta-analysis, you used FDA reviews of four recent antidepressants and studied around 30 trials. Fifteen had negative and 15 had positive outcomes. How have things changed since 2008?
Turner: First, the sample was smaller: 4 drugs as opposed to 12, so fewer trials. The pace of new antidepressants coming on the market has slowed from the good old days, or bad old days. What we saw was an improvement.
There’s a glass half full, glass half empty aspect to this. Some negative trials were published. And shockingly, they frankly admitted that they were negative trials. The bad old habit of deep-sixing some negative trials was still there. There was some spin, so the old tricks are still in the playbook.
But it’s refreshing to see a little bit of movement toward improvement. The number of negative trials that were published transparently, that is, they (a) published them and (b) admitted that they were negative, had gone up from the older cohort, from 11% to 47%. The positive trials are a non-story because, of course, they’re going to be published transparently.
Dhar: In your 2013 paper, you write that publication bias is prevalent in psychiatry, medicine, and even in other sciences. But it is especially rampant in psychiatry. What makes psychiatry so uniquely vulnerable?
Turner: I don’t want to let the other disciplines off the hook. I think it’s easy to bash psychiatry and psychology. Most of the attention and scrutiny has been in the mental health area. We don’t know how bad it is in other areas unless we take a similar look at FDA reviews, compare the inception cohort (the initial trials), and then track them into the published literature.
Dhar: I ask because you write that psychiatry has had many more of these problems. Maybe it’s because, as you say, we’ve studied it more in psychiatry. Is there something about mental health that makes it vulnerable to these things?
Turner: There are differences within psychiatry. Looking at antipsychotics, there was also publication bias, but there was less of it. I think the reason for that is that there were fewer negative studies. Of course, positive studies are going to be published transparently. There’s nothing to lie about, as there is with negative studies. For antipsychotics, you had a smaller proportion of negative studies compared to antidepressants and hence less of the bother. You still had a fair amount, mind you!
So, the more effective the drug, the less need for publication bias; the less effective the drug, the more need for publication bias. Why is a certain class (of drugs) more prone to that? It could be that we’re dealing with soft endpoints. We are dealing with subjective questions like how do you rate your level of depression today? How is your energy level? All those things come down to the patient reporting subjectively, whereas, with blood pressure, you have a hard endpoint. With cholesterol, there’s very little subjectivity.
Another thing is, what is the motivation for publication bias? One is a lack of efficacy, but the other issue is safety, a safety problem. It could be that the drug works perfectly well, but it’s not supposed to kill people!
Dhar: Could you briefly tell us about a couple of different biases? What are some of the other ways this is done?
Turner: We talked about non-publication and spin. There is also a delay (of publication); one can do statistical alchemy of various sorts. The most common method we found was to change how the dropouts were reported because people always drop out of clinical trials.
You start with a certain number, maybe 100 patients: 50 get randomized to the drug, and 50 get randomized to a placebo. I can guarantee you’re never going to finish with 50 in each group. People are going to drop out. What do you do with the data from the people who drop out? The people who carry on tend to be the ones who did particularly well on the drug. Those who drop out may do so because they have more side effects or experience less efficacy.
The most common statistical maneuver was ignoring the people who dropped out. But, of course, you’re not supposed to ignore these people. You’re supposed to account for them statistically!
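Editor’s illustration (not part of the interview): a minimal sketch of why ignoring dropouts can flatter a drug. All scores below are hypothetical, and the “all randomized” analysis simply uses each patient’s last recorded score, which is one simple way of accounting for dropouts and has its own known limitations. It is not the method used in any particular trial or FDA review.

```python
from statistics import mean

# Each tuple: (last recorded improvement score, completed the trial?)
# All numbers are hypothetical and chosen only to illustrate the idea.
drug = [(12, True), (10, True), (14, True), (12, True),
        (2, False), (0, False), (4, False), (2, False)]
placebo = [(8, True), (6, True), (10, True), (8, True),
           (8, True), (8, True), (6, False), (2, False)]


def completers_only(arm):
    """Average improvement among patients who finished the trial."""
    return mean(score for score, completed in arm if completed)


def all_randomized(arm):
    """Average improvement over everyone randomized, dropouts included."""
    return mean(score for score, _ in arm)


completer_diff = completers_only(drug) - completers_only(placebo)
all_diff = all_randomized(drug) - all_randomized(placebo)

print(f"Completers only: drug minus placebo = {completer_diff:.1f}")
print(f"All randomized:  drug minus placebo = {all_diff:.1f}")
```

With these made-up numbers, the completers-only comparison shows a 4-point advantage for the drug, while counting everyone who was randomized shows no advantage at all.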
Dhar: Let’s talk about blame. You write the blame for publication bias is complicated, and many parties are involved. How do authors, journals, and pharmaceutical companies contribute?
Turner: You talked about a colleague who was disillusioned by the game playing—the analyzing, reanalyzing (of data). I saw this myself. You finish a study you’ve been working on for a year; there is this excitement, but you get the data and crunch the numbers and “Oh, shoot! The P value is not statistically significant. Something must be wrong. What did we do? How about if we look at this, use that, change this thing about the analysis?” So someone tries something and goes bingo! P less than 0.05 over here! It’s a game, and people geek out on it and take pride in being able to massage it into a statistically significant finding.
I want to emphasize that the culture is such that we didn’t think we were doing anything wrong. We just thought this was the way it was done. In many universities and elsewhere, you see your superiors, your mentors, and they’re doing it, and you go, “I guess that’s the way I’m supposed to do it too.” You feel if you get a non-significant finding, you have failed.
People become attached to their pet ideas; they embarked on this study because they believe that this works and will help people. You’ve worked at this for five years, but when you crunch the numbers, it doesn’t work. It creates this cognitive dissonance, and it is like Elisabeth Kübler-Ross’s stages of grief. Denial is the first thing: no, it can’t be. And then there’s bargaining.
Dhar: And there’s anger.
Turner: Then you send it to the journal. People send a negative study to a journal, and they won’t even review it. They do a desk reject. If it does get reviewed, they might get feedback saying, “Why don’t you try this other analysis there?”—milking out a statistically significant finding, making a silk purse out of a sow’s ear, or putting lipstick on the pig.
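Editor’s illustration (not part of the interview): a rough sketch of why rerunning analyses until one “works” is a problem. It assumes each analysis is an independent test at the 0.05 level and that the treatment has no true effect. That is a simplification: reanalyses of the same data are correlated, so the real inflation is typically smaller, though still above 5%. The function name is hypothetical.

```python
def chance_of_false_positive(num_analyses: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_analyses` independent tests
    comes out 'significant' at level `alpha` when there is no real effect."""
    return 1 - (1 - alpha) ** num_analyses


for k in (1, 5, 10, 20):
    print(f"{k:>2} analyses: {chance_of_false_positive(k):.0%} chance of a spurious P < 0.05")
```

Under these assumptions, one analysis carries a 5% chance of a spurious finding, five analyses about 23%, ten about 40%, and twenty about 64%.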
Dhar: So, the blame lies with multiple parties. What made you study antidepressant trials and publication bias? Was it a purely academic pursuit?
Turner: I used to work as an FDA reviewer and had a private practice on the side. I was wearing both hats, having done research at the NIH and private practice. I thought what’s in journal articles is the truth—this is it, and we have access to it! Going from NIMH to FDA, I realized we really didn’t know diddly back at NIH; I was humbled.
At the FDA, I became aware of all these negative trials. There is information about drugs that the FDA and pharmaceutical companies know, but doctors who are prescribing do not know. I was reminded of that when putting on my clinician hat. It was wrong that I had to operate on this incomplete data.
There were some studies I was trying to get approved by the IRB. These were placebo-controlled trials. We were told that you could forget about it. They were not going to approve any placebo-controlled trials because they believe that if you have a new antidepressant and you show that it works as well as an already approved antidepressant, then by transitivity, it must work too. So, you just put it up against Prozac or Zoloft or Paxil, and “we know those work.” I’d say, “Wait a minute, Prozac, Zoloft, and Paxil don’t always work. You say that they work, but they often don’t beat placebo themselves,” and they would roll their eyes and say, “What are you talking about? The journal articles show that they work.”
I realized, of course, they believe this because they’ve been taught, like we’ve all been trained in medicine, that journal articles are the Holy Grail, the truth with a capital T. I realized that no one would take my word for it that there were all these negative studies. So I had to prove it to them.
Dhar: I am glad you did. How do you work with this knowledge about inflated efficacy in your practice? How do you deal with patients on a day-to-day basis, knowing what you know?
Turner: I try to be transparent with them and tell them not to get their hopes up too much. There’s a chance they may have a wonderful response, particularly if they’ve never had an antidepressant before. But the more trials they’ve had of antidepressants that haven’t worked, the less likely it is that the next one is going to work out.
I let them know that these drugs work very well for some people, but there are many people for whom they don’t work at all, and many people in between get a partial response.
Dhar: It gets even more complicated with all the new research on tapering and withdrawal showing that antidepressant withdrawal can be long-term. Is that also information that you share with patients?
Turner: Yeah, we talk about that as well, and particularly, that it will vary according to which drug we’re talking about. I’ll say this is a drug you do not want to run out of. There are a couple of them: SNRIs like duloxetine and venlafaxine (brand name Effexor), and the SSRI Paxil.
Dhar: We’ve been talking about drugs, but it turns out that the problem is not just antidepressants or antipsychotics. You found that the same problems plague psychological treatments like therapy. Publication bias makes it seem like the treatment works a lot more than it does. Could you tell us about that?
Turner: The key is to get an inception cohort, evidence of the trial from before the study is conducted, when everyone’s optimistic. The FDA learns about these trials of drugs before they’re done.
In the case of psychotherapy studies, there is no FDA. However, using the NIH database called RePORTER, we were able to identify cohorts of psychotherapy trials for depression, and we could track these studies and see which ones were published. We found that there were a number of psychotherapy trials that were not published.
About three-quarters of the studies were published, and roughly a quarter were not. All we could look at was publication versus non-publication, but nothing about spin, as we did with drugs. Looking at FDA documents on drugs, you see the FDA got this non-significant result, but somehow, the authors got a statistically significant result in the publication. We could not do that with psychotherapy trials. So, I think our findings paint a rosier picture, and it’s worse in reality because surely there are trials that were not published the way they should have been.
Dhar: There was a 2020 study from Germany where they found a lot of spin in the results of psychotherapy studies. So, there’s publication bias in drug and psychotherapy trials, and your other work found problems in meta-analyses, too. They often don’t report conflicts of interest that are there in the studies that they’re reviewing. So, what are the real-world consequences of all this, of clinicians, patients, and researchers thinking something works when it doesn’t?
Turner: There are two major domains to drugs or to any intervention: efficacy (does it work) and safety (are there harms). Clinicians talk about the risk-benefit ratio when they prescribe. Publication bias exaggerates the benefits and downplays the harms. So, you wind up with this perceived risk-benefit ratio that is overly rosy. That gets communicated to prescribing clinicians and then relayed to the patient. So, you wind up with more prescribing than is warranted or a lack of vigilance for harms/side effects.
Dhar: Did you ever meet clinicians who didn’t believe your findings? What is their response?
Turner: This area of research falls under the umbrella of meta-research or research about research. I worry that a lot of clinicians don’t know about it. With researchers, you either get “I didn’t know that was happening” or “Of course, it happens. Everyone knows about that.” But I don’t know if it translates into critical thinking when it comes to consuming information.
In the U.S., we’ve got direct-to-consumer advertising. If soap is advertised on TV as the best soap ever, you roll your eyes and say, “of course, they would say that.” But when you see a drug ad or you hear a key opinion leader (KOL) at a conference talking about some drug, there’s not enough critical thinking – “Wait a minute, this guy’s funded, paid to come here by a drug company and maybe that’s why he’s saying such wonderful things about this drug, maybe I shouldn’t be so quick to prescribe it.”
People put drugs and healthcare on a pedestal, but there are incentives and motives. We don’t hesitate to question things in the world outside of healthcare, but when it comes to healthcare, “Oh, they wouldn’t! If it’s in a journal article, it must be true. It is peer-reviewed.” There’s this naivety about healthcare that if it’s in a journal, it has somehow been blessed and sanctified and is not to be questioned.
Dhar: Can you tell us about the flaws in how peer-reviewed research is done?
Turner: I think the biggest flaw is not identifying an inception cohort before the study is done, when everyone still believes they will have positive results. As things stand, the study has been initiated, the data collected, and all the analyses run, rerun, and rerun again until a statistically significant effect has been found.
Then the writing starts – now we have something that’s “publishable,” which is a problematic word – the belief that you can’t publish it unless it’s statistically significant. Then it gets submitted to a journal. So, the writing, reviewing, and publication decisions are made after you know the results. The key is you have to eliminate knowledge of the results being a deciding factor as to whether the research gets published or not.
Dhar: I know that trials need to be registered, but just because they are registered at the beginning doesn’t mean they must be published. What would happen if we forced trials to be published, irrespective of results?
Turner: If you mean published in a journal article, true, they do not have to be published. But the results do, by law, have to be posted on www.clinicaltrials.gov. Many clinicians don’t think about going there and looking for the results of clinical trials. They all know about journal articles. But they haven’t thought about looking at www.clinicaltrials.gov. It doesn’t have the storytelling panache that a journal article does.
Another thing that would help is reviewing in two stages: once before the study is even conducted, and again after the results are in. Peer review takes place at both stages, but the decision about publication is made at stage one, before the study is done. This format is known as Registered Reports, an effort led by a psychologist at Cardiff University in Wales.
Dhar: So, in the first stage, you look at the design, statistical tests, the method, and decide whether it’s worthy of publication based on that and not the results, right?
Turner: Yes. That’s where the decision (to publish or not) should be made. That’s how it’s done in the case of grants. Grant funding agencies decide whether to fund something based on a protocol. There’s no reason that a publication decision can’t be made then. There you can determine: is this good science? Are the methods good? The decision as to whether it’s good science shouldn’t depend upon whether the results were statistically significant.
MIA Reports are supported, in part, by a grant from The Thomas Jobe Fund.