Publication Bias: Does Unpublished Data Make Science Pseudo?


Way back in the 1970s, when I first started studying psychology, I heard about publication bias: it was easier to get a study published if it had significant results than if it didn’t.

That made a certain amount of sense. A study producing only nonsignificant results (group against group, variable against variable, pretest versus post-test) might be badly designed, underpowered (too weak to detect a genuine effect), or simply misconceived. No wonder no one wanted to publish it. And who cares about hypotheses that turn out not to be true anyway?

Partly, of course, the problem is obvious: if positive studies are much more likely to be published than negative ones, then erroneous positive results will tend to live on forever rather than being discredited.

More recently the problem of publication bias has been shaking the foundations of much of psychology and medicine. In the field of pharmacology, the problem is worse, because the majority of outcome trials (on which medication approval and physician information are based) are conducted by pharmaceutical firms that stand to benefit enormously from positive results, and run the risk of enormous financial loss from negative ones. Numerous studies have found that positive results tend to be published, while negative ones are quietly swept under the rug, as documented by Ben Goldacre in his excellent book Bad Pharma.

In a case examining outcome trials of antidepressants (Turner et al., 2008), 48 of 51 published studies were framed as being supportive of the drug being examined (meaning that the medication outperformed placebo). Of these, 11 were regarded by the US Food and Drug Administration as questionable or negative but were framed as positive outcomes in publication.

So the published data look like this (P = positive, N = negative):


Given that a great number of readers only look at the study abstract or conclusion, or lack the skills to detect spin, they’ll miss the reality that many of the positive trials aren’t so positive. The real published data look more like this:


In contrast, only 1 of 23 unpublished studies supported the idea that the medication being tested was effective.


So the real picture is more like this:



Given that physicians, who are urged to prescribe based on the research, only have access to published data, the result is likely to be a systematic exaggeration of drug benefits.

Smug psychologists (and others) have stood by smirking, unaware that their perspective is elevated only because they are being hoisted by their own petards. True, there are no billion-dollar fortunes to be made from a psychological theory or a therapeutic technique, but there remain more than enough influences to result in a publication bias for noncorporate research:

  • A belief (often justified) that journals are more likely to reject articles with nonsignificant results.
  • A tendency to research one’s own pet ideas, and a corresponding reluctance to trumpet their downfall.
  • A bias to attribute nonsignificant results to inadequate design rather than to the falsehood of one’s hypotheses.
  • Allegiance to a school of thought that promotes specific ideas (such as that cognitive behavior therapy is effective – one of my own pet beliefs) and a fear of opprobrium if one reports contrary data.

Does Publication Bias Fundamentally Violate the Principles of Science?

Although science can lead to discoveries of almost infinite complexity, science itself is based on a few relatively simple ideas.

  • If you dream up an interesting idea, test it out to see if it works.
  • Observation is more important than belief.
  • Once you’ve tested an idea, tell others so they can argue about what the data mean.
  • And so on.

Even science, in other words, isn’t rocket science. One would think that in execution it would be about as simple as in explanation. But no. In practice, it’s extremely easy for things to go wrong.

An early statistics instructor of mine showed our class an elementary problem with research design by discussing a study of telekinesis (the supposed ability to move things with the mind). The idea was to determine whether a talented subject could make someone else’s coin tosses come up “heads.” As the likelihood of a properly balanced coin coming up heads is 50%, anything significantly above this would support the idea that something unusual was going on. And indeed, the results showed that the coin came up heads more often than random chance would suggest. The instructor invited us to guess the problem with the study.

A convoluted discussion ensued in which we all tried to impress with our (extremely limited) understanding of statistics and research design – and with our guesses about the tricks the subject might have employed. Then the instructor revealed what the experimenters had done.

They knew that psychics reported sometimes having a hard time “tuning in” to a task. So if they used all of the trials in the experiment, they might bury a genuine phenomenon in random noise – like trying to estimate the accuracy of a batter in baseball when half the time he is blindfolded. Instead they looked for sequences in which the subject “became hot,” scoring more accurately than chance would allow, and marked out these series for analysis. Sure enough, within these selected sequences there were significantly more heads than chance could account for.

We stared at the instructor, disappointed that his example wasn’t a bit, well, less obvious. How could reasonably sane people have deluded themselves so easily? Clearly this little exercise would have nothing useful to teach us in future.

Try it yourself sometime. Flip a coin (or have someone else do so), and try to make it come up heads. One thing it will almost certainly not do is this:


Instead, you’ll get something like this (I just tried it and this is what I got):


Totals: Heads = 63; Tails = 64

Now imagine that you only analyze sequences of 6 or more where I seem to have been “hot” at producing heads.


Drop the rest of the trials, assuming that I must have been distracted during those ones, and analyze the “hot” sequences:

Heads: 43; Tails: 12

Et voilà: support for my nonexistent telekinetic skills.
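This selection trick is easy to reproduce in a few lines of code (a sketch in Python, not from the original post; here a “hot” stretch is defined, arbitrarily, as any 10-flip window containing 7 or more heads):

```python
import random

random.seed(0)
flips = [random.choice("HT") for _ in range(127)]

# Post-hoc selection: call a 10-flip window "hot" if it contains 7+
# heads, then pool every flip that falls inside at least one hot window.
WINDOW, THRESHOLD = 10, 7
hot_positions = set()
for start in range(len(flips) - WINDOW + 1):
    if flips[start:start + WINDOW].count("H") >= THRESHOLD:
        hot_positions.update(range(start, start + WINDOW))

selected = [flips[i] for i in sorted(hot_positions)]
print(f"All flips:   {flips.count('H')} heads, {flips.count('T')} tails")
print(f"'Hot' flips: {selected.count('H')} heads, {selected.count('T')} tails")
```

On a typical run the pooled “hot” flips come out around two-thirds heads, even though the full series hovers near 50%.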

Okay, So That Feels Belabored Because It Is So Completely Obvious. Why Bother With It?

Well, let’s shift the focus from different periods of a single subject’s performance to comparisons between subjects.

Imagine a drug trial in which half the subjects receive our new anti-pimple pill (“Antipimpline”) and half get a placebo. We’ll compare pre-to-post improvement in those getting the drug to those not getting it. And we’ll look at a variety of demographic variables that might have something to do with whether a person responds to the drug: gender, age group, frequency of eating junk food, marital status, income, racial group.

Damn. Overall, our drug is no better than placebo. But remember that data are never smooth, like HTHTHTHTHT. They’re chunky, like HTTTHTHTTH. Trawl the data enough and we are sure to find something. And look! White males under 25 clearly do better on the drug than on placebo! The title of our research paper practically writes itself: Antipimpline reduces acne amongst young Caucasian males.

Okay, well even that causes some eye-rolling. Surely no one would be foolish enough to allow for a fishing expedition like this one. Or if they did, they would demand that you replicate the finding on a new sample to verify that it didn’t just come about as a result of the lumpiness of your data.

Well, wrong. Fishing expeditions like this appear throughout the literature.

The point, however, is that if we are looking for an effect, we will almost always find it in at least some of our subjects.
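How often does such a fishing expedition pay off? More often than not, a quick simulation suggests. The sketch below (Python; the variable names, the 30% improvement rate, and the simple two-proportion z-test are all invented for illustration) runs a trial in which the drug does nothing, then tests every one- and two-variable subgroup at the .05 level:

```python
import itertools
import math
import random

def p_two_prop(k1, n1, k2, n2):
    """Two-sided p-value for a two-proportion z-test (normal approximation)."""
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = abs(k1 / n1 - k2 / n2) / se
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# Demographic variables to trawl; names and values are hypothetical.
VARS = {"sex": ["M", "F"], "age": ["under 25", "25+"],
        "diet": ["low junk", "high junk"], "income": ["low", "high"]}

def null_trial(n_per_arm=400):
    """One trial of a useless drug: 30% improve in BOTH arms. Returns the
    number of subgroups that nevertheless come up 'significant' at .05."""
    def make_arm():
        return [dict({k: random.choice(v) for k, v in VARS.items()},
                     improved=random.random() < 0.3) for _ in range(n_per_arm)]
    drug, placebo = make_arm(), make_arm()
    hits = 0
    for r in (1, 2):  # subgroups defined by one or two variables
        for combo in itertools.combinations(VARS, r):
            for vals in itertools.product(*(VARS[k] for k in combo)):
                cond = dict(zip(combo, vals))
                d = [s for s in drug if all(s[k] == x for k, x in cond.items())]
                p = [s for s in placebo if all(s[k] == x for k, x in cond.items())]
                pv = p_two_prop(sum(s["improved"] for s in d), len(d),
                                sum(s["improved"] for s in p), len(p))
                hits += pv < 0.05
    return hits

random.seed(1)
results = [null_trial() for _ in range(100)]
frac = sum(h > 0 for h in results) / len(results)
print(f"Null trials yielding at least one 'significant' subgroup: {frac:.0%}")
```

With 32 subgroup comparisons per trial, a majority of these entirely null trials typically yield at least one “significant” subgroup to write a paper about.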

So What?

Let’s shift again – from comparing subject by subject data to study by study. We’ll do 20 studies of antipimpline, each on a hundred subjects. We’ll use the .05 level of statistical significance (meaning that we will get a random false positive about once in every 20 comparisons). Then we’ll define three primary outcomes (number of pimples, presence/absence of 5 or more severe lesions, and subject reports of skin pain) and two secondary outcomes (nurse ratings of improvement, reported self-consciousness about skin).

If these outcomes are not correlated with one another, we’ve just inflated the probability of getting at least one positive outcome to 1 − (.95)^5, or about 23% – nearly 5 in 20 comparisons. Nowhere will you see a study stating that the actual error rate is nearly 25%, however. (In fact, the defined outcomes probably are correlated, so perhaps we’ve really only inflated our odds of success from 5% to 15% or so.)
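That inflation is just the complement rule: for k independent tests at level α, the chance of at least one false positive is 1 − (1 − α)^k. A quick check:

```python
# Family-wise false-positive probability for k independent tests at
# significance level alpha: 1 - (1 - alpha)^k.
alpha = 0.05
for k in (1, 3, 5):
    print(f"{k} outcome(s): {1 - (1 - alpha) ** k:.1%}")
# 1 outcome(s): 5.0%
# 3 outcome(s): 14.3%
# 5 outcome(s): 22.6%
```

So five uncorrelated outcomes push the true false-positive rate from 5% to about 22.6%.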

And what happens? Imagine we count as positive (we’ll denote that as ‘P’) any study that is superior to placebo on at least one outcome measure, and negative (‘N’) if no measure is significantly better than placebo. Here’s what we get from our 20 studies:


From our 20 studies we get 4 showing antipimpline to be superior to placebo on at least one outcome measure. We publish those studies, plus one more (at the insistence of a particularly vociferous researcher). The others we dismiss as badly done, or uninteresting, or counter to an already established trend. Something must have gone wrong.

Publication is how studies become visible to science. So what’s visible? Five studies of antipimpline, of which 4 are positive:


Fully 80% of the published literature is supportive, so it seems likely we have a real acne therapy here. Antipimpline goes on to be a bestseller. What’s missing? This:


Lest we nonpharmacologists reactivate our smugness: swap out antipimpline for “mygreatnewtherapy” and we can get the same outcome.
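The 20-study arithmetic can be checked with a simulation (a sketch in Python; only the .05 level and the five-outcome design come from the text above – everything else is invented):

```python
import random

ALPHA, OUTCOMES, STUDIES = 0.05, 5, 20

def positive_study():
    """A null study counts as positive if ANY of its (assumed
    uncorrelated) outcomes crosses the .05 threshold by chance."""
    return any(random.random() < ALPHA for _ in range(OUTCOMES))

random.seed(7)
batch = [positive_study() for _ in range(STUDIES)]
print(f"Positive studies in one 20-study program: {sum(batch)}")

# Averaged over many repetitions of the whole program:
runs = 2000
avg = sum(sum(positive_study() for _ in range(STUDIES))
          for _ in range(runs)) / runs
print(f"Average positives per 20 null studies: {avg:.1f}")  # ~4.5
```

About four or five of every 20 null studies come up “positive” – exactly the raw material the file drawer needs.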

Way back in introductory stats class we could not believe that our instructor was giving us such a blatantly bad example of research. Obviously the deletion of trials not showing the “effect” meant that the work could no longer be considered science. It was firmly in the camp of pseudoscience.

Switch to reporting only some subjects’ data, and we have exactly the same thing: Pseudoscience.

And conduct multiple studies on the same question and publish only some of them? Once again: exactly the same problem. By deleting whole studies (and their statistical comparisons) we inflate the error rates in the published literature. And by how much? By an amount that cannot be calculated without access to the original studies – which you do not know about and cannot find.

As a result, without the publication of all studies on a given question – that is, without the elimination of systematic publication bias – it becomes impossible to know the error rate of the statistics. Without that error rate, the statistics lose their meaning.

* * * * *


Goldacre, Ben (2012). Bad Pharma. New York: Faber & Faber.

Turner, EH, Matthews, AM, Linardatos, E, Tell, RA, & Rosenthal, R (2008). Selective publication of antidepressant trials and its influence on apparent efficacy. New England Journal of Medicine, 358, 252-260.


  1. So basically, what you seem to be saying is that all pharmaceutical-industry-funded “evidence based medicine” has an unknown statistical error rate, because “the publication of all studies on a similar question” has not occurred and the missing studies are unavailable. Meaning all medical journals are filled with systematically publication-biased misinformation, and thus basically all medical literature today is pseudoscience.

    Am I correct?

    • In essence yes.

      You can’t believe how frustrating it is when you’re actually working as a scientist. “Never trust the data which other people published” is a rule of thumb. That is of course not to say that every paper is fraudulent or that every lab is involved in these practices – but they are clearly more prevalent when there are financial interests at stake, so especially in medical research. As a rule I’m extremely skeptical if not outright dismissive of anything that was published by people with a conflict of interest. The biggest problem is as the author of this (excellent) piece describes: there’s simply no way to know how much of the data is wrong.

  2. “Smug psychologists (and others) have stood by smirking, unaware that their perspective is elevated only because they are being hoisted by their own petards.”

    This made me smile from the familiarity of it. I hear and see this sentiment expressed quite a bit.

    So a few of the issues with which we are dealing here are 1) bad attitude, seemingly ‘narcissistic,’ from that description, 2) lack of self-awareness, 3) lack of insight, 4) taking themselves too seriously, and 5) being only self-serving.

    These are all issues of healing, personal growth, and expanding consciousness/raising awareness. Physician, heal thyself. Then, you can work with others with clarity and more effectively.

  3. Dear Randy,

    I consider myself a scientist, so I like your scientific presentation, but I am amazed that anyone would consider it scientific to hide information.

    Support for the medical model is pure pseudoscience; the real science of psychology and mental distress is at I would appreciate your scientific criticism.

    Best wishes, Steve

  4. I know nothing about statistics, but I’ve always said you can hide anything with it.

    When I first took notice of how my own medicine had been pushed fraudulently through the system of safeguards, I was shocked. Because as an “average citizen” I had simply assumed that the trial period of my medication was almost fail-safe, and that the next step, the individual user adverse event reporting system, made the overall knowledge more and more extensive.

    But right now, none of the systems are working. Randomized Controlled Trials (RCTs) are, as you say, tampered with (hiding of data, false positive interpretation of data, lack of full disclosure of data for peer review, selection of data to publish, and so on).
    Adverse event reporting systems are in most cases run by, or governed by, former Big Pharma employees, with an overwhelming acceptance that reports are “anecdotal” – no matter how recurring or how statistically significant they are, they are “still anecdotal”.
    Sweden has an even worse example of an adverse event reporting system: the National Board of Health has shifted the responsibility for administering adverse event reports back to the pharmaceutical companies themselves! (Where they of course get lost in some file cabinet and are regarded as a company secret, because in Sweden it’s almost impossible for a patient to sue the company.)

    So, once again as you say: where scientists have but a few simple rules to follow to produce good science, they fail at the very basic requirements.

    Nothing but full disclosure of raw data; every piece of attempted research must be filed and accessible.
    Scientists must be able to withstand scrutiny and even face possible prosecution if recurring malicious behaviour is detected. Any scientist (or other human for that matter) is perfectly allowed to have a personal opinion, and to speak it, BUT their science MUST be unbiased in shape and form. Any company-sponsored research should be considered “anecdotal” until a sufficient number of unbiased reports have confirmed the findings. (…”oh, but no money exists if it weren’t for the companies paying the bill”… So be it: we want unbiased science or no science at all when it comes to medicines we are supposed to be able to give our own mother!)

    Sorry, got a little carried away there, but what a nice original post by Mr. Randy Paterson.

    • 1. All trials have to be pre-registered and approved by ethics committee as well as design of these trials has to be proved valid

      2. All patients have to be followed and clear reasons have to be given for drop-outs

      3. All data (excluding patient personal information which allows easy identification) should be openly accessible for anyone at the end of the trial

      4. All trials have to be published

      5. Trials should not be conducted by people and entities which stand to profit from either result

      Does any of it sound unreasonable?

      • I would add that ALL trials must be submitted to the FDA or other approving agency for consideration, and approval should reflect an analysis of ALL data submitted, rather than the “best two” approach that is currently used.

        Of course, your other requirements are more than reasonable.

        Oh, and no “placebo washouts” are allowed, either. Placebo effects are legitimate effects, and removing them biases the research in favor of a positive finding.

        — Steve

        • Steve – Re: your last comment to Duane above …”their theory has zero predictive value” –right.

          I appreciate your stamina in keeping up with the thread and helping everyone to clarify hunches and insights. You’re almost always right. Here you bring the general response back to about the most meaningful explicit connection to the intrinsic value of the article. But all the CBT books refer to labels and hold forth with their guidance counselling in terms of no real additional harm done by the most common bad treatment protocols. Until you get started on the good CBT thing, you just must have some stable symptomatology that waits around in the shape of the true disorder. “Right, doctor?” So…– Good. But yuck.

          • Please read my response to Steve on the thread.


            To be successful, psychotherapy requires finding a therapist with a “personality” that is a good match for the client (along with values, etc), in order to form a “therapeutic relationship”.

            Subjective, don’t you think?
            Hard science it is not.


          • Thanks for your kind comments, Travailler-vous! I do agree with Duane that good therapy is not and never really will be science. It’s about being human together with another human. What works is what works for you, and it might not work for someone else. My biggest beef with the psych industry (and I have many) is the denial of the right of the recipient of the “help” to decide what is and is not helpful for them. A good therapist (admittedly not the norm or average) is one who can adapt what s/he is doing to help the client from the client’s own viewpoint, and will be creative in finding an approach that will get that job done. The idea that some “manualized” approach will work for everyone with a particular set of “symptoms” is nonsense.

            CBT is just a way of thinking about making changes. As Duane says, one of many. Everyone’s path is different. All a good therapist can do is help the person find their path, and encourage them to walk it.

            — Steve

          • Steve – You’re welcome for them, as ever. I can’t pretend to argue with your idea of how to get your gigs or keep them right. The problems with psychology proper are all matters of pedigree to me. The only point of that is to understand the increase to pure knowledge made by particular fruitful approaches, and so on. Its little walk-of-fame version, however, is undesirable for helping it to achieve the version of its stated aims that you stand for. Returning to the diagnostic categories on the MMPI-2: great heroes who never said no to lobotomy or (in-)civil commitment are so uninspiring, also. Take care, and for what it’s worth, I will more likely catch up to your work in the thread after Bonnie’s book release event than before.

        • Therapy can be no more scientific than our understanding of our own nature. Just as dating websites and employer search engines work, within the grasp of our current best understanding, to fit one person to another or to an entity such as a corporation or company, we are able to match people to people with compatibility algorithms. Therapy is such that it does not currently acknowledge this ability, because of the many other factors hindering or supporting those seeking services. The stigma of the client is as much affected as is the stigma of the provider. When the coexistence of both is less stigmatized, the process of matching one to the other will inevitably smooth out.

  5. A nicely written, and sober appraisal of where we are in analyzing outcome data regarding the treatment of mental disorders.

    As someone who is a neuroscientist, psychotherapist and psychopharmacologist (and I suppose somewhat obsessive in trying to figure out things), it is clear that all we have now is different ideological camps trying to support their “skin in the game”. All the while downplaying the significant risks of the interventions they use (psychotherapy perhaps having the greater long term risk, and pharmacology the great short term-though it may be a toss up).

    As the great medical scientist Claude Bernard noted in the mid-1800s, until we have models of disease states we have very little. Without models, the arguments will go on for the next 100 years as they have gone on for the past.

    Thanks for the well written piece.


    • Like the person posting below, I would like some evidence of what you say. Scientific studies or personal experience would be of interest.

      I have been helped by therapy and also harmed by therapy and I took part in a study on harm in therapy. However I think there are very few such studies.

      As to the comparison between long and short terms harms of therapy and drugs I have heard nothing.

      What I am interested in is the comparison between no treatment, drug based care and therapy for psychiatric conditions. It may well be that no treatment is best. Until we have evidence we do not know for sure.

        • John,
          So, more straightforwardly, what technique cannot be implemented badly? I would doubt a CBT artist’s claim that no good use of a biofeedback machine that helped you lower alpha states and increase theta states can be proven yet. I think there obviously are myriad uses for such a procedure and the only thing stopping the spread of it is bad therapists, lack of imagination for studies, and shoddy notions of how to administer lots of access to the equipment.

          Our CBT coalition is Peter K., Lucy Johnstone, Ann ?–I trust and like them, and the folks who sow seeds of doubt about their integrity strike me as truly chauvinistic in their aims. Even so, CBT is as scientist-ic as it is scientif-ic. You can’t develop an ontology to support its claims, if my source for that is right. And you can think it over yourself. What actual t-h-i-n-g-s are they measuring ever?–Interpretations, period. It helps to give a second to thought in the name of managing your emotions, but what if you are already good at it? Now you develop this tendency to fill out little forms to see what everything might have been about if it hadn’t been caught in the act and “revised”. The symptom lists that most of its handbooks address work as prompts for how to conceptualize something you are feeling and put it into pre-set categories, but with PTSD, for example, it might need ten other sets of lists equally well. It’s a big forced re-education effort in that regard. And as Sa says above, Are they criticizing what damage orthodox treatment is doing already? Not enough–

          It’s good for ideas and to give the clinician distractions from telling you how to behave, but it’s still relating that gives you space for trying yourself out in different moods and different situation ensembles of stressors and rewards. Yet Peter, Ann, and Lucy J. are up to something good. Just not gospel truth money back guarantee good with anybody who ends up doing it. When you can devise an infinite number of psychologies, and you can, why pretend that only one is right?

          • Just came across this…. I’m glad you like and trust us, travailler-vous, but I’m not sure how you concluded that I (I won’t speak for Anne and Peter) am part of some ‘CBT coalition.’ CBT is mentioned only very briefly in the BPS report, along with a whole range of other therapies and interventions. The report was a group effort and does not exactly represent any particular individual’s views. In fact I rarely use CBT. I am more interested in trauma-focused approaches.

    • To the contrary, I think the evidence in the psychiatric field is that drugs can be useful in acute care, but are increasingly dangerous the longer you use them. I’d be interested in your comments regarding long-term damage due to therapy. If you’re talking psychoanalysis, I might agree, but I think it likely that quality therapy can have a very positive long-term outcome profile, even though it is likely to take a lot longer to take effect than the drugs.

      —- Steve

        • Duane – I was just looking to keep up with you and follow your point. The sequence doesn’t work so well because your posts all end up after a time delay. But yeah, that’s enough. The question was just along with my remark about being old…in the generative phase. Do you stay occupied with everyone’s welfare and their ability to address it? was the point intended. Your notion of CBT’s worth is amply generous. It’s not got final steps to growth and recovery right.

          • To clarify, I was a rehabilitation counselor for a number of years, where I worked with people with severe and catastrophic disabilities.

            The answer to your question is No.

            I often stood back, got out of the way, and watched them take fascinating, creative ways to overcome.

            I am now retired from the field of social services, left with beautiful memories of the souls along the way…
            I was humbled and blessed to share in their journeys.

            Be well,

        • Duane – No button for your last reply to me. To clarify myself, I had thought you had acquaintance with or background in the social sciences somehow since first seeing your comments. I worked in private enterprises, formally studied as little social science in the guise of instruction for working the field as was possible for me in light of fulfilling academic requirements. And I meant about people and their illusions, not so much how they get or don’t get well or “abilified” or about whether the inner child or the cognitive errors need their attention. I was still talking in reference to the context of crime and cover-up that Randy thinks we aren’t talking about in regard to his occupation and his peers.

    • T. M. – Alex posting above has a point relevant to yours. Randy fielded it with somewhat narrow intent, but taken more broadly her idea can just be thought of as looking for ways of seeing who has what it takes to interact effectively in human relations fields, and currently no determined effort to keep that topic in ongoing debates (that the public might see) happens.

    • TsMonk,
      It would be good to see some evidence for the notion that psychotherapy has the greatest long-term risk. Until that is provided, this statement seems to me to be speculation or anecdotal opinion. I don’t mean this offensively, but perhaps you have seen a large number of difficult psychotherapy cases yourself or you have had a disproportionate number of cases take a wrong turn in the long term. That could (or could not) happen for any number of reasons. But if so, that alone wouldn’t be generalizable evidence about psychotherapy.

      I’ve found some good long-term evidence about benefits of psychotherapy; you may remember I posted the 3 long-term studies in Whitaker’s earlier article. Essentially, the bigger picture seems to be that while psychotherapy can cause harm, in the long term, the rich get richer (i.e. more often than not those who get psychotherapy benefit on various measures).

      If anyone wants I can repost these links.

      I also find it curious that psychotherapy is so often compared to medication. In many ways, human relationships are much more varied and complex than pills. Each psychotherapy is unique, constantly evolving and varying from beginning to end. But each Zyprexa pill is not unique and does not evolve or vary from beginning to end. Well, maybe it does, if the factory messes up and keeps putting in new ingredients. So when this comparison is made, it seems to me a bit spurious.

  6. …”psychotherapy perhaps having the greater long term risk, and pharmacology the great short term-though it may be a toss up…” Really? As a “neuroscientist, psychotherapist and psychopharmacologist,” surely you can inform us about the scientific research that supports this claim. I eagerly await your response.

  7. A great article on the research in a field that desperately needs some kind of scrutiny and oversight. I was thinking myself about ways in which the title might be rephrased. For instance: Does Published Data Make Pseudo Science?, especially since so much of it remains unpublished. Yep, that’s right: unpublished, ignored, and suppressed data can’t make your science very sound, not if you’re claiming to be scientific. I don’t think it would be an overstatement, when considering the commercial interests who are behind this kind of suppression and twisting of the “evidence”, to say that there is a real coverup taking place in the field of psychiatric research. Thank you, Randy Paterson, for going there.

    • Mark – On your link, the line of criticism that R. goes into here is generally the case throughout science presently, just worse with human sciences. To get a really clear understanding of the sorts of issues with keeping psychology honest and productive, and to see how it cannot function as a be all and end all road to self-sufficiency, inner peace, and great insights, please check this out, too.

    • Randy – Thanks, indeed, for taking the risk of getting didactic. All the best MIA compatriots are here, and my comments above are offered in a skeptical spirit for reasons that I stand by, but that are only said as well as I could, nevertheless. We used to have a radical psychiatrist blogging here named McLaren, I believe, and that was my introduction to Turner’s article on bias. One way of telling you what lacks in the CBT approach is this: the talk of the goal is feeling and knowing how to express all your emotions. The list then runs: anger, surprise, fear, joy, etc. But what about rage and despair? As you should admit, they are swiftly medicated. But there’s no hurry; it’s only been thirty years waiting to meet someone who heard about real trauma and asked if it ever made me feel cut off from real life.

  8. Duane – So my take on Steve M. is that he is mostly right, and when wrong usually it’s for wanting to keep things light. Light as in “accessible” to the person who hasn’t got the point yet about something.

    Take the fact of what things can’t exist, for instance. He bums about that now and then.

    But the things that all fit together once you begin to understand according to clear ideas are so-o-o satisfying. I always feel sure that intellectual bite will be the reward of a lifetime, myself.

    Therapy doesn’t exist, for instance. Smart little muckraker Amy Watt. Szasz wrote the book on that just fine, too. So he couldn’t sell his convictions to the dog and pony show of Psychiatry and CBT, is that why to pretend things should exist just a little while to stifle someone’s sobs about it? I guess so.

    Anyway, the relationship fundament stands in proper relief against the correct conception of therapy as the abstraction after the fact if you, the client, adjust. All you have to double check in sessions is whether you fully understand that it’s totally you adjusting, as in the intransitive case. No one’s adjusting you. But good luck finding therapists who won’t try that fun impossible thing indefinitely. That’s one of the things CBT does, keeps the therapist tasking with his little quiz. But you, the client, once in there are adjusting in solo, and you got therapy because you had a chance to, not for walking some line.

  9. Fiachra – Your links to Academic, FYI. Intense. The Irish Times article. We have nothing so sanguine and pro-choice here in the regular daily rags.

    We certainly need customer data flowing out of the services on offer like we get here…and the encouragement to be gained by off-the-beaten-path recovery stories. I see the missing element as theoretic justification for the therapies in place, and similarly understand the primary need of survivors to make unrelenting criticism of the lack of acknowledgement of what, in the case-by-case rundown of present bad treatment (and never making up for it), dare not speak its name.