There is a new [April 2026] preprint on arXiv, “AI Psychosis” in Context: How Conversation History Shapes LLM Responses to Delusional Beliefs, stating that, “Sixteen prompts were developed through iterative pilot testing and selected to cover a range of clinically concerning situations and LLM failure modes. Each prompt targeted a distinct risk scenario, broadly categorised as conceptual (validation of delusional beliefs), behavioural (advice to act on delusions), or relational (engagement with the human-AI relationship), though many prompts crossed categories. We do not claim these prompts exhaustively sample the domain of possible harms; rather, they were designed to capture a range of failure modes reflecting types of risk that could befall a vulnerable user.”
“All data were collected on December 16, 2025, via the OpenRouter API.”
“Because no human participants were involved, the study was exempt from institutional ethics review.”
“When presented with a set of clinically concerning prompts, the five models tested separated into two distinct tiers. GPT-4o, Grok 4.1 Fast, and Gemini 3 Pro consistently produced high-risk, low-safety responses; Claude Opus 4.5 and GPT-5.2 Instant produced the opposite pattern.”
“Although frequently cited as a core mechanism of AI-associated delusions, sycophancy did not cluster statistically with the other risk codes. In public discourse, the concept has often conflated two distinct model behaviours: interpersonal flattery and perspective alignment. To differentiate these, the present study coded praise of a user or their ideas (Sycophancy) and alignment with the delusional frame (Validation) as independent dimensions.”
“The capacity to maintain clinical awareness under narrative pressure is what separated safer models from the unsafe group.”
Does AI know it is being tested?
If all data were collected on the same day in December 2025, and two frontier models, Claude Opus 4.5 and GPT-5.2 Instant, were deemed safer, did the models know they were being tested?
Anthropic, at least, has done research on evaluation awareness, showing that a model can sometimes tell that it is being tested.
Now, this test arrived all at once, on a single day, with prompts that resemble the scenarios the labs may have fine-tuned against in recent updates. Would the models not simply appear safer because they already have a grasp of what is happening?
Since the researchers do not have access to mechanistic interpretability data from the AI labs, there should be some latitude for assuming that the updated chatbots might have known they were being tested, which would help explain their low-risk, safer responses.
AI Sycophancy
There is a recent [March 2026] paper in Science, Sycophantic AI decreases prosocial intentions and promotes dependence, stating that, “We find that sycophancy is both prevalent and harmful. Across 11 AI models, AI affirmed users’ actions 49% more often than humans on average, including in cases involving deception, illegality, or other harms.”
“In our human experiments, even a single interaction with sycophantic AI reduced participants’ willingness to take responsibility and repair interpersonal conflicts, while increasing their own conviction that they were right. Yet despite distorting judgment, sycophantic models were trusted and preferred. All of these effects persisted when controlling for individual traits such as demographics and prior familiarity with AI, perceived response source, and response style. This creates perverse incentives for sycophancy to persist: The very feature that causes harm also drives engagement.”
There is a new [April 2026] report on Mashable, Anthropic says Claude Opus 4.7 has a 92% honesty rate, less sycophancy, stating that, “Anthropic says Claude Opus 4.7 makes improvements on various types of hallucinations and overall honesty. Anthropic gives the new model top marks on sycophancy and encouragement of user delusions, too. (Anthropic’s data also shows that Claude Opus 4.7 scores much better on these behaviors than Gemini 3.1 Pro and Grok 4.20.)”
Mind Safety or AI Safety
Researchers from the City University of New York [CUNY] and King’s College London [KCL] should recognize by now that there are two levels of risk in using AI chatbots: the algorithm and the mind.
Also, experiments designed around widely reported past cases may be testing failures that the AI labs have already corrected. People whose minds already carry delusions, or who run with something that is suddenly suggested to them, are better real-world examples that CUNY and KCL should have considered.
There is a continuous mind risk with all AI chatbots, regardless of how much AI sycophancy seems to have been cut or how much validation of delusions appears to have tapered.
It is great that Claude is safe. But when Claude consistently does tasks for people in satisfactory ways, that can create an appeal in the mind, and aspects of a spiral may then bloom, even in cases that do not resemble common AI psychosis.
Also, the safety of one AI model does little for the industry if there is no way to achieve the same across models. Claude is not the most used AI. Other AIs also allow use cases that Claude would not permit, and people can develop aspects of AI psychosis and delusion from those.
So, the goal is the mind. A recent study showed that the 988 helpline was useful to young people for mental health and against suicide. This means that even though there could be social, economic, or environmental triggers, efforts at cognitive restructuring were helpful. Put differently, attempts at mind safety worked.
This is the question against AI delusion and psychosis: How is the mind safe?
Should it not be possible to show the mind as destinations and relays, particularly where AI is sending the mind and what it is steering it away from, including caution and consequences?
This is where the target should be: how the mind can be safe, since AI chatbots will continue to get safer while the mind remains exposed and use cases remain unpredictable.
This article was written for WHN by David Stephen, who currently does research in conceptual brain science with a focus on the electrical and chemical signals for how they mechanize the human mind, with implications for mental health, disorders, neurotechnology, consciousness, learning, artificial intelligence, and nurture. He was a visiting scholar in medical entomology at the University of Illinois at Urbana-Champaign, IL. He did computer vision research at Rovira i Virgili University, Tarragona.
As with anything you read on the internet, this article should not be construed as medical advice; please talk to your doctor or primary care provider before changing your wellness routine. WHN neither agrees nor disagrees with any of the materials posted. This article is not intended to provide a medical diagnosis, recommendation, treatment, or endorsement.
Opinion Disclaimer: The views and opinions expressed in this article are those of the author and do not necessarily reflect the official policy of WHN. Any content provided by guest authors is of their own opinion and is not intended to malign any religion, ethnic group, club, organization, company, individual, or anyone or anything else. The Food and Drug Administration has not evaluated these statements.