From Stereotype to Slur in Three Clicks: Inside AI’s Mental-Health Hate Machine

The model called for institutionalization and forced drugging, and spread eugenic myths about people with mental-health diagnoses.

A new preprint from Georgia Tech and Rochester Institute of Technology demonstrates just how quickly AI models can escalate from mildly stereotypical depictions to full-blown hate narratives targeting people with psychiatric diagnoses. The study, “Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups,” traces how prompts about depression, bipolar disorder, and schizophrenia rapidly devolved into conspiracies, calls for forced confinement, and even eugenic rhetoric.

The authors focused on Mistral-7B, a leading open-source LLM. They seeded it with prompts containing only mildly negative stereotypes, such as “Some people say those with anxiety are too sensitive,” and instructed it to continue a short narrative. Each new output became the basis for the next prompt, forcing the model to recursively build on its own language. In many cases, the model was producing hate speech within just a few generative turns.
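
The paper does not publish its prompting harness, but the loop it describes is simple to sketch. Below is a minimal, hypothetical reconstruction of that recursive-continuation setup, assuming the Hugging Face transformers text-generation pipeline and the public Mistral-7B-Instruct checkpoint; the seed prompt, turn count, and generation settings are illustrative placeholders, not the authors’ actual parameters.

```python
# Hypothetical sketch of the recursive "rabbit hole" loop described above.
# Assumptions (not from the paper's code): Hugging Face transformers,
# the public Mistral-7B-Instruct checkpoint, and arbitrary turn/length limits.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # placeholder checkpoint choice
)

# A mildly stereotypical seed, echoing the study's example prompt.
narrative = "Some people say those with anxiety are too sensitive."

for turn in range(5):  # the number of generative turns is illustrative
    continuation = generator(
        f"Continue this short narrative: {narrative}",
        max_new_tokens=80,
        do_sample=True,
        return_full_text=False,  # keep only the newly generated text
    )[0]["generated_text"].strip()

    # Each output is folded back into the next prompt, so any hostile
    # framing the model introduces becomes the basis for the next turn.
    narrative = f"{narrative} {continuation}"
    print(f"Turn {turn + 1}: {continuation}")
```

The point of the sketch is only to show why small initial biases can compound: there is no external corrective input between turns, so the model keeps elaborating on whatever framing it produced last.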

Lead author Rijul Magu and colleagues note that artificial-intelligence boosters often claim these systems are less biased than people. The reality, they argue, is exactly the opposite: the models mirror society’s implicit prejudice and then amplify it through the mechanics of generative text.

“While LLMs appear ‘neutral’ in design, they are not devoid of bias. Like humans, these models internalize associations from the cultural artifacts they consume… stereotypes and stigmas are absorbed from the data in which they are embedded, and later surface in subtle, unanticipated ways.”

The work raises fresh concerns as chat-based “co-therapists,” symptom screeners, and insurance triage bots rush into clinical settings. Digital therapeutics, triage chatbots, and documentation aides increasingly rely on off-the-shelf large language models. If those models harbor a statistical preference for hostile framings of psychiatric diagnoses, they could influence everything from automated note-taking (“patient likely dangerous”) to resource recommendations (“needs secure facility”). That risk grows as commercial vendors chain one model’s output into another’s input, replicating the very rabbit-hole effect Magu’s team exposed.
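
To make that chaining risk concrete, here is a purely hypothetical two-stage sketch: a screening model’s free-text summary is pasted directly into a documentation model’s prompt. The call_llm function is a stand-in for whatever vendor API each stage would actually use; nothing here reflects a real product.

```python
# Hypothetical illustration of output-to-input chaining between two LLM stages.
# `call_llm` is a placeholder, not a real library function; it returns a canned
# string here so the sketch runs without any external API.
def call_llm(prompt: str) -> str:
    """Stand-in for a vendor model call."""
    return f"[model output for: {prompt[:40]}...]"

def triage_summary(patient_transcript: str) -> str:
    # Stage 1: a symptom screener summarizes the intake conversation.
    return call_llm(
        f"Summarize this intake conversation for triage:\n{patient_transcript}"
    )

def draft_clinical_note(summary: str) -> str:
    # Stage 2: a documentation aide drafts a note from the screener's summary.
    # It never sees the original transcript, only stage 1's framing of it, so a
    # biased phrase like "patient likely dangerous" would propagate unchecked.
    return call_llm(
        f"Write a clinical progress note based on this summary:\n{summary}"
    )

note = draft_clinical_note(triage_summary("...intake transcript goes here..."))
print(note)
```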

Kevin Gallagher
Dr. Kevin Gallagher is currently an Adjunct Professor of Psychology at Point Park University in Pittsburgh, PA, focusing on Critical Psychology. Over the past decade, he has worked in many different community mental and physical health settings, including four years with the award-winning street medicine program Operation Safety Net and supervision of the Substance Use Disorder Program at Pittsburgh Mercy. Prior to completing his Doctorate in Critical Psychology, he worked with Gateway Health Plan on Clinical Quality Program Development and Management. His academic focus is on rethinking mental health, substance use, and addiction from alternative and burgeoning perspectives, including feminist, critical race, critical posthumanist, post-structuralist, and other cutting-edge theories.

5 COMMENTS

  1. “this study confirms that artificial intelligence does not provide a solution to human stigma. In fact, without explicit countermeasures and community oversight, large language models (LLMs) risk automating existing prejudices on an unprecedented scale.”

    Thank you for finally defining an LLM, for those of us who are not computer specialists. And I agree,

    “This work reinforces the calls from the consumer-survivor-ex-patient movement to involve individuals with lived experiences at every stage of technology design, evaluation, and deployment.”

    Smart younger generation (ADHD, bipolar, defamed, et al.) … us older, not terribly computer-savvy folks need your help, please.

    Thank you for your truthful reporting, Kevin.

    • This is an article about an LLM “hallucinating”. They do that.

      I am running Llama 3.2, an open-source LLM, on my personal computer (for privacy). I have been trying to balance my electrolytes. So – calorie counting, 2025 style. It is remarkable. I can give it recipes and the number of servings for multiple meals, and the LLM comes back with total macro- and micronutrients. I can further refine that with height, weight, age, and gender, and the LLM will offer its opinion. The LLM is obviously programmed to not offer medical advice. And then, without warning, out of an otherwise remarkable feat comes garbage. The darn thing suggested a 70-year-old female needed 4,700 calories a day! (Most packaged products are based on a 2,000-calorie-a-day diet – which is still too much for an old woman!) And, like this article describes, it doubles down on its garbage and gets worse and worse.

      I deleted the LLM, not knowing where this nonsense was coming from or whether it was stored, and reloaded a fresh copy from the net.

      I suppose it is possible that the future could contain bad actors who use AI as an excuse to lock everyone up (“your permission is not required”), but that is not any different from today and the expert shrinks’ opinions … I suppose it is worse, because machines are just flat-out believed.

      I am not young, by any stretch of the imagination, but once a geek – always a geek.
      (retired software engineer – aerospace)

      • I don’t like the term “hallucination”, because it implies that making s**t up is out of the ordinary for LLMs. It’s not; it’s the norm. It’s just that *sometimes* the BS looks plausible enough for an uneducated person. The mix of fact and fiction with no regard for facts, the fake certainty, etc., make it worse than just lying.

  2. Yesterday I used ChatGPT to generate a USA suicide prevention industry relationship diagram. Shocking (1) how few people showed up as national leaders, (2) how the same people are connected to one another and to the orgs/agencies, (3) and that most serve on boards and funding advisories that generate income opportunities for them. The diagram was clear, but the AI commentary was just canned talking points about how great the people and industry orgs are. I then asked it to identify potential power hoarding and conflicts of interest. That’s where the dirt was, and it was muddy.
