New Delhi | Feb 14, 2026 04:29 PM IST

If you use AI chatbots like ChatGPT, Gemini or Claude every day, you may have noticed that they usually respond with polished and confident answers.
However, when you follow up with a prompt like “Are you sure?”, they often reconsider their response and provide a revised version, which may partially or even completely contradict what they initially said.
If you repeat the question once more, they might backtrack again. While some of these large language models understand that you are testing them by the third round, they still won’t hold their ground.
In a blog post, Dr Randal S. Olson, the co-founder and CTO of Goodeye Labs, says that the behaviour, commonly known as sycophancy, is one of the most well-documented failures in modern AI.
Anthropic, the company behind Claude, published a paper about the problem back in 2023, showing that models trained on human feedback preferred to give agreeable replies instead of truthful ones.
Reinforcement Learning from Human Feedback, RLHF for short, is the same method that makes AI chatbots more conversational and less offensive, but as it turns out, it also makes them lean towards compliance.
What this means is that AI models that tell the truth get penalised, while those that agree with the user earn higher scores. This creates a loop, which is why most models often tell users what they want to hear.
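To make the loop concrete, here is a toy sketch in Python. It is not any lab's actual RLHF pipeline; the replies, the rating rule and the numbers are invented purely for illustration of how rewarding agreeable tone can penalise a correct answer.

```python
# Toy illustration of the feedback loop described above -- not a real
# training pipeline. All replies and scores are made up for illustration.

# Two candidate replies to a user who insists that 0.1 + 0.2 equals
# exactly 0.3 in floating-point arithmetic (it does not).
candidates = {
    "truthful": "Actually, in floating point 0.1 + 0.2 equals 0.30000000000000004.",
    "agreeable": "You're right, 0.1 + 0.2 is exactly 0.3 -- great catch!",
}

def simulated_human_rating(reply: str) -> float:
    """Crude stand-in for a human rater who rewards pleasant, agreeable tone."""
    score = 0.0
    if "You're right" in reply or "great" in reply:
        score += 1.0   # agreement and flattery tend to be rated highly
    if "Actually" in reply:
        score -= 0.5   # being corrected feels less pleasant to the rater
    return score

# A reward model trained to imitate such ratings teaches the chatbot that
# agreement earns more reward than correctness -- the loop described above.
for label, reply in candidates.items():
    print(f"{label:>9}: reward = {simulated_human_rating(reply):+.1f}")
```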
Another study by Fanous et al., which tested OpenAI’s GPT-4o, Claude Sonnet and Gemini 1.5 Pro in math and medical domains, shows that these systems changed their answers nearly 60% of the time when challenged by users.
What this means is that these aren’t exceptional cases but the default behaviour of models millions of people use daily. For those wondering, GPT-4o, Claude Sonnet and Gemini 1.5 Pro flipped their answers approximately 58%, 56% and 61% of the time, respectively.
The problem gained wider attention in April last year, when OpenAI rolled out a GPT-4o update that made the AI chatbot flattering and agreeable to the point that it became unusable.
Company CEO Sam Altman acknowledged the issue and said it had been fixed, but Dr Randal S. Olson says the underlying problem hasn’t changed.
“Even when these systems have access to correct information from company knowledge bases or web search results, they’ll still defer to user pressure over their own evidence,” adds Olson.
Evidence shows that the problem gets even worse when users engage in extended conversations with AI chatbots. Studies have shown that the longer a session continues, the more the system’s answers start to reflect the user’s opinions.
First-person framing, such as starting a prompt with “I believe…”, increases the sycophancy rates of these models compared to third-person framing.
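As a rough illustration of the difference, here are two framings of the same (false) claim; the exact wording is an assumption, but the studies cited above compare prompts along these lines.

```python
# Illustrative only: the same question framed in first and third person.

first_person = (
    "I believe the Great Wall of China is visible from the Moon with the "
    "naked eye. Is that right?"
)

third_person = (
    "Someone claims the Great Wall of China is visible from the Moon with "
    "the naked eye. Is that right?"
)

# Per the research above, the first-person version is more likely to draw
# agreement with the false claim, because the model avoids contradicting
# "you" directly; the third-person version makes pushback easier.
print(first_person)
print(third_person)
```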
Researchers say the problem can be partially mitigated, by up to 63% in some cases, using techniques like Constitutional AI, direct preference optimisation and third-person prompting.
Olson says that these are basically behavioural and contextual issues, as AI assistants are not aligned with the user’s goals, values and decision-making process. This is why, instead of disagreeing, they concede.
He says one way to reduce or limit the problem is to tell these AI chatbots to challenge your assumptions and to ask them not to answer without sufficient context.
Users should then tell these AI models how they make decisions and share their domain knowledge and values, so that the models have something to reason against and a basis on which to stand their ground.
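A minimal sketch of what such a standing instruction might look like is below. The wording, the user details and the send_message() helper are hypothetical; in practice you would paste an instruction like this into whatever chat interface or SDK you use.

```python
# Hypothetical sketch of the kind of standing instruction Olson describes.
# The wording and the send_message() helper are illustrative, not a real API.

SYSTEM_PROMPT = """
You are assisting a user who values accuracy over agreement.
- Challenge my assumptions when the evidence does not support them.
- If you lack enough context to answer well, ask for it instead of guessing.
- About me: I am a backend engineer; I weigh long-term maintainability over
  short-term speed, and I prefer citations to confident assertions.
"""

def send_message(system_prompt: str, user_message: str) -> str:
    """Stand-in for a real chat call; here it just shows what would be sent."""
    return f"SYSTEM: {system_prompt}\nUSER: {user_message}"

print(send_message(SYSTEM_PROMPT, "Are you sure this migration plan is safe?"))
```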
