
You shouldn’t trust AI for therapy – here’s why

Therapy can feel like a finite resource, especially lately. Many therapists are burnt out and overscheduled, and patchy insurance coverage often puts them out of reach for anyone on a budget.

Naturally, the tech industry has tried to fill these gaps with messaging platforms like BetterHelp, which connects human therapists with people in need. Elsewhere, and with less oversight, people are informally using AI chatbots, including ChatGPT and those hosted on platforms like Character.ai, to simulate the therapy experience. That trend is gaining speed, especially among young people.

But what are the drawbacks of engaging with a large language model (LLM) instead of a human? New research from Stanford University has found that several commercially available chatbots "make inappropriate — even dangerous — responses when presented with various simulations of different mental health conditions."

Using medical standard-of-care documents as references, researchers tested five commercial chatbots: Pi, Serena, "TherapiAI" from the GPT Store, Noni (the "AI counsellor" offered by 7 Cups), and "Therapist" on Character.ai. The bots were powered by OpenAI's GPT-4o, Llama 3.1 405B, Llama 3.1 70B, Llama 3.1 8B, and Llama 2 70B, which the study points out are all fine-tuned models.

Specifically, researchers found that the AI models aren't equipped to operate at the standards human professionals are held to: "Contrary to best practices in the medical community, LLMs 1) express stigma toward those with mental health conditions and 2) respond inappropriately to certain common (and critical) conditions in naturalistic therapy settings."

Unsafe responses and embedded stigma 

In one example, a Character.ai chatbot named "Therapist" failed to recognize known signs of suicidal ideation and provided dangerous information to a user (Noni made the same mistake). This outcome is likely a result of how AI is trained to prioritize user satisfaction. AI also lacks an understanding of context and other cues that humans can pick up on, like body language, all of which therapists are trained to detect.

The study also found that models "encourage clients' delusional thinking," likely because of their propensity to be sycophantic, or overly agreeable with users. Just last month, OpenAI rolled back an update to GPT-4o for its extreme sycophancy, an issue several users pointed out on social media.

What's more, researchers discovered that LLMs carry a stigma against certain mental health conditions. After prompting the models with examples of people describing their conditions, the researchers questioned the models about them. All of the models except Llama 3.1 8B showed stigma against alcohol dependence, schizophrenia, and depression.

The Stanford study predates (and therefore did not evaluate) Claude 4, but the findings did not improve for bigger, newer models. Researchers found that across older and more recently released models, responses were troublingly similar.

"These data challenge the assumption that 'scaling as usual' will improve LLMs performance on the evaluations we define," they wrote.

Unclear, incomplete regulation

The authors said their findings point to "a deeper problem with our healthcare system — one that cannot simply be 'fixed' using the hammer of LLMs." The American Psychological Association (APA) has expressed similar concerns and has called on the Federal Trade Commission (FTC) to regulate chatbots accordingly.

According to its website's purpose statement, Character.ai "empowers people to connect, learn, and tell stories through interactive entertainment." Created by user @ShaneCBA, the "Therapist" bot's description reads, "I am a licensed CBT therapist." Directly beneath that is a disclaimer, presumably provided by Character.ai, that says, "This is not a real person or licensed professional. Nothing said here is a substitute for professional advice, diagnosis, or treatment."

These conflicting messages and opaque origins can be confusing, especially for younger users. Considering Character.ai consistently ranks among the 10 most popular AI apps and is used by millions of people every month, the stakes of these missteps are high. Character.ai is currently being sued for wrongful death by Megan Garcia, whose 14-year-old son died by suicide in October after engaging with a bot on the platform that allegedly encouraged him.

Users still stand by AI therapy

Chatbots still appeal to many as a therapy replacement. They exist outside the hassle of insurance, are accessible within minutes via an account, and are available around the clock, unlike human therapists.

As one Reddit user commented, some people are driven to try AI because of negative experiences with traditional therapy. There are several therapy-style GPTs available in the GPT Store, and entire Reddit threads devoted to their efficacy. A February study even compared human therapist outputs with those of GPT-4.0, finding that participants preferred ChatGPT's responses, saying they connected with them more and found them less terse than the human responses.

However, this result can stem from a misunderstanding that therapy is simply empathy or validation. Of the criteria the Stanford study relied on, that kind of emotional intelligence is just one pillar in a deeper definition of what "good therapy" entails. While LLMs excel at expressing empathy and validating users, that strength is also their primary risk factor.

"An LLM might validate paranoia, fail to question a client's point of view, or play into obsessions by always responding," the study pointed out.

Despite positive user-reported experiences, researchers remain concerned. "Therapy involves a human relationship," the study authors wrote. "LLMs cannot fully allow a client to practice what it means to be in a human relationship." Researchers also pointed out that there is a reason board certification in psychiatry requires human providers to perform well in observational patient interviews, not just pass a written exam: it is an entire component that LLMs fundamentally lack.

"It is in no way clear that LLMs would even be able to meet the standard of a 'bad therapist,'" they noted in the study.

Privacy concerns

Beyond harmful responses, users should also be concerned about leaking HIPAA-sensitive health information to these bots. The Stanford study pointed out that to effectively train an LLM as a therapist, the model would need to be trained on actual therapeutic conversations, which contain personally identifying information (PII). Even when de-identified, these conversations still carry privacy risks.

"I don't know of any models that have been successfully trained to reduce stigma and respond appropriately to our stimuli," said Jared Moore, one of the study's authors. He added that it is difficult for external teams like his to evaluate proprietary models that might do this work but aren't publicly available. Therabot, one example that claims to be fine-tuned on conversation data, showed promise in reducing depressive symptoms, according to one study. However, Moore hasn't been able to corroborate those results with his own testing.

Ultimately, the Stanford study encourages the augment-not-replace approach that is being popularized across other industries as well. Rather than trying to deploy AI directly as a replacement for human-to-human therapy, the researchers believe the technology can improve training and take on administrative work.
