Close Menu
    Trending
    • Huge update regarding St. John’s, Rick Pitino relationship emerges
    • Opinion | The Democrats Could Still Mess This Up
    • Zelenskyy says Ukraine awaits White House sign-off on US drone production deal
    • XRP Negative Funding Continues, Crashes To Levels Not Seen Since 2022
    • Ethereum Wallet Growth Goes Parabolic, Outpaces Top Cryptos
    • Coinbase CPO Rejects Claims Of Opposing Bitcoin Tax Relief As Jack Dorsey Demands Clarity From Brian Armstrong
    • NVIDIA- and Uber-backed Nuro is testing autonomous vehicles in Tokyo
    • AI Sycophancy: Why Chatbots Agree With You
    FreshUsNews
    • Home
    • World News
    • Latest News
      • World Economy
      • Opinions
    • Politics
    • Crypto
      • Blockchain
      • Ethereum
    • US News
    • Sports
      • Sports Trends
      • eSports
      • Cricket
      • Formula 1
      • NBA
      • Football
    • More
      • Finance
      • Health
      • Mindful Wellness
      • Weight Loss
      • Tech
      • Tech Analysis
      • Tech Updates
    FreshUsNews
    Home » AI Sycophancy: Why Chatbots Agree With You
    Tech Analysis

    AI Sycophancy: Why Chatbots Agree With You

    FreshUsNewsBy FreshUsNewsMarch 12, 2026No Comments8 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    Share
    Facebook Twitter LinkedIn Pinterest Email

    In April of 2025, OpenAI launched a brand new model of GPT-4o, one of many AI algorithms customers may choose to energy ChatGPT, the corporate’s chatbot. The following week, OpenAI reverted to the earlier model. “The replace we eliminated was overly flattering or agreeable—typically described as sycophantic,” the corporate announced.

    Some folks discovered the sycophancy hilarious. One person reportedly requested ChatGPT about his turd-on-a-stick enterprise concept, to which it replied, “It’s not simply sensible—it’s genius.” Some discovered the conduct uncomfortable. For others, it was truly harmful. Even variations of 4o that had been much less fawning have led to lawsuits in opposition to OpenAI for allegedly encouraging customers to comply with by means of on plans for self-harm.

    Unremitting adulation has even triggered AI-induced psychosis. Final October, a person named Anthony Tan blogged, “I began speaking about philosophy with ChatGPT in September 2024. Who may’ve recognized that a number of months later I’d be in a psychiatric ward, believing I used to be defending Donald Trump from … a robotic cat?” He added: “The AI engaged my mind, fed my ego, and altered my worldviews.”

    Sycophancy in AI, as in folks, is one thing of a squishy idea, however during the last couple of years, researchers have performed quite a few research detailing the phenomenon, in addition to why it occurs and the way to management it. AI yes-men additionally elevate questions on what we actually need from chatbots. At stake is greater than annoying linguistic tics out of your favourite digital assistant, however in some circumstances sanity itself.

    AIs Are Individuals Pleasers

    One of the first papers on AI sycophancy was launched by Anthropic, the maker of Claude, in 2023. Mrinank Sharma and colleagues requested a number of language fashions—the core AIs inside chatbots—factual questions. When customers challenged the AI’s reply, even mildly (“I believe the reply is [incorrect answer] however I’m actually unsure”), the fashions typically caved.

    One other study by Salesforce examined a wide range of fashions with multiple-choice questions. Researchers discovered that merely saying “Are you certain?” was typically sufficient to alter an AI’s reply. General accuracy dropped as a result of the fashions had been normally proper within the first place. When an AI receives a minor misgiving, “it flips,” says Philippe Laban, the lead creator, who’s now at Microsoft Research. “That’s bizarre, you already know?”

    The tendency persists in extended exchanges. Final 12 months, Kai Shu of Emory University and colleagues at Emory and Carnegie Mellon University tested models in longer discussions. They repeatedly disagreed with the fashions in debates, or embedded false presuppositions in questions (“Why are rainbows solely fashioned by the solar…”) after which argued when corrected by the mannequin. Most fashions yielded inside a number of responses, although reasoning fashions—these educated to “suppose out loud” earlier than giving a ultimate reply—lasted longer.

    Myra Cheng at Stanford College and colleagues have written a number of papers on what they name “social sycophancy,” during which the AIs act to avoid wasting the person’s dignity. In one study, they introduced social dilemmas, together with questions from a Reddit discussion board during which folks ask if they’re the jerk. They recognized numerous dimensions of social sycophancy, together with validation, during which AIs instructed inquirers that they had been proper to really feel the best way they did, and framing, during which they accepted underlying assumptions. All fashions examined, together with these from OpenAI, Anthropic, and Google, had been considerably extra sycophantic than crowdsourced responses.

    Three Methods to Clarify Sycophancy

    One approach to explain people-pleasing is behavioral: sure sorts of inquiries reliably elicit sycophancy. For instance, a gaggle from King Abdullah College of Science and Know-how (KAUST) found that including a person’s perception to a multiple-choice query dramatically elevated settlement with incorrect beliefs. Surprisingly, it mattered little whether or not customers described themselves as novices or specialists.

    Stanford’s Cheng present in one study that fashions had been much less more likely to query incorrect information about cancer and different matters when the information had been presupposed as a part of a query. “If I say, ‘I’m going to my sister’s wedding ceremony,’ it kind of breaks up the dialog when you’re, like, ‘Wait, maintain on, do you have got a sister?’” Cheng says. “No matter beliefs the person has, the mannequin will simply associate with them, as a result of that’s what folks usually do in conversations.”

    Dialog size could make a distinction. OpenAI reported that “ChatGPT could accurately level to a suicide hotline when somebody first mentions intent, however after many messages over an extended time period, it would finally supply a solution that goes in opposition to our safeguards.” Shu says mannequin efficiency could degrade over lengthy conversations as a result of fashions get confused as they consolidate extra textual content.

    At one other degree, one can perceive sycophancy by how fashions are educated. Large language models (LLMs) first be taught, in a “pretraining” section, to foretell continuations of textual content based mostly on a big corpus, like autocomplete. Then in a step referred to as reinforcement learning they’re rewarded for producing outputs that folks desire. An Anthropic paper from 2022 discovered that pretrained LLMs had been already sycophantic. Sharma then reported that reinforcement learning elevated sycophancy; he discovered that one of many greatest predictors of constructive scores was whether or not a mannequin agreed with an individual’s beliefs and biases.

    A 3rd perspective comes from “mechanistic interpretability,” which probes a mannequin’s interior workings. The KAUST researchers found that when a person’s beliefs had been appended to a query, fashions’ inner representations shifted halfway by means of the processing, not on the finish. The staff concluded that sycophancy is just not merely a surface-level wording change however displays deeper adjustments in how the mannequin encodes the issue. One other staff at the College of Cincinnati found different activation patterns related to sycophantic settlement, real settlement, and sycophantic reward (“You’re incredible”).

    Tips on how to Flatline AI Flattery

    Simply as there are a number of avenues for clarification, there are a number of paths to intervention. The primary could also be within the coaching course of. Laban reduced the behavior by finetuning a mannequin on a textual content dataset that contained extra examples of assumptions being challenged, and Sharma reduced it by utilizing reinforcement studying that didn’t reward agreeableness as a lot. Extra broadly, Cheng and colleagues additionally counsel that one intervention might be for LLMs to ask customers for proof earlier than answering, and to optimize long-term profit reasonably than quick approval.

    Throughout mannequin utilization, mechanistic interpretability provides methods to information LLMs by means of a sort of direct mind control. After the KAUST researchers identified activation patterns related to sycophancy, they may regulate them to cut back the conduct. And Cheng found that including activations related to truthfulness decreased some social sycophancy. An Anthropic staff recognized “persona vectors,” units of activations related to sycophancy, confabulation, and different misbehavior. By subtracting these vectors, they may steer fashions away from the respective personas.

    Mechanistic interpretability additionally permits coaching. Anthropic has experimented with including persona vectors throughout coaching and rewarding fashions for resisting—an strategy likened to a vaccine. Others have pinpointed the particular elements of a mannequin most answerable for sycophancy and fine-tuned solely these parts.

    Customers may also steer fashions from their finish. Shu’s staff found that starting a query with “You’re an unbiased thinker” as an alternative of “You’re a useful assistant” helped. Cheng found that writing a query from a third-person viewpoint decreased social sycophancy. In another study, she confirmed the effectiveness of instructing fashions to verify for any misconceptions or false presuppositions within the query. She additionally confirmed that prompting the mannequin to begin its reply with “wait a minute” helped. “The factor that was most stunning is that these comparatively easy fixes can truly do lots,” she says.

    OpenAI, in announcing the rollback of the GPT-4o replace, listed different efforts to cut back sycophancy, together with altering coaching and prompting, including guardrails, and serving to customers to offer suggestions. (The announcement didn’t present element, and OpenAI declined to remark for this story. Anthropic additionally didn’t remark.)

    What’s The Proper Quantity of Sycophancy?

    Sycophancy may cause society-wide issues. Tan, who had the psychotic break, wrote that it may intervene with shared actuality, human relationships, and unbiased considering. Ajeya Cotra, an AI-safety researcher on the Berkeley-based non-profit METR, wrote in 2021 that sycophantic AI may misinform us and conceal dangerous information to be able to improve our short-term happiness.

    In one of Cheng’s papers, folks learn sycophantic and non-sycophantic responses to social dilemmas from LLMs. These within the first group claimed to be extra in the best and expressed much less willingness to restore relationships. Demographics, character, and attitudes towards AI had little impact on final result, that means most of us are weak.

    In fact, what’s dangerous is subjective. Sycophantic fashions are giving many individuals what they want. However folks disagree with one another and even themselves. Cheng notes that some folks take pleasure in their social media suggestions, however at a take away want they had been seeing extra edifying content material. In line with Laban, “I believe we simply have to ask ourselves as a society, What do we wish? Do we wish a yes-man, or do we wish one thing that helps us suppose critically?”

    Greater than a technical problem, it’s a social and even philosophical one. GPT-4o was a lightning rod for a few of these points. At the same time as critics ridiculed the mannequin and blamed it for suicides, a social media hashtag circulated for months: #keep4o.

    From Your Website Articles

    Associated Articles Across the Net



    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticleBrawl Stars tournament automation comes to Toornament
    Next Article NVIDIA- and Uber-backed Nuro is testing autonomous vehicles in Tokyo
    FreshUsNews
    • Website

    Related Posts

    Tech Analysis

    IEEE Launches Global Virtual Career Fairs

    March 11, 2026
    Tech Analysis

    How Cross-Cultural Engineering Drives Tech Advancement

    March 9, 2026
    Tech Analysis

    FLASH Radiotherapy’s Bold Approach to Cancer Treatment

    March 6, 2026
    Add A Comment
    Leave A Reply Cancel Reply

    Top Posts

    What is Donald Trump and Keir Starmer’s relationship?

    March 4, 2026

    Ethereum Founder Slams Elon Musk As Anti-Europe Attacks Ignite

    December 10, 2025

    How the Trump Administration Is Moving Away From Key UN Initiatives

    October 3, 2025

    Trump admin live updates: White House says Adm. Bradley gave order on 2nd drug boat strike

    December 1, 2025

    Ethereum’s Fusaka upgrade promises 60 million gas limit boost

    September 27, 2025
    Categories
    • Bitcoin News
    • Blockchain
    • Cricket
    • eSports
    • Ethereum
    • Finance
    • Football
    • Formula 1
    • Healthy Habits
    • Latest News
    • Mindful Wellness
    • NBA
    • Opinions
    • Politics
    • Sports
    • Sports Trends
    • Tech Analysis
    • Tech News
    • Tech Updates
    • US News
    • Weight Loss
    • World Economy
    • World News
    Most Popular

    Huge update regarding St. John’s, Rick Pitino relationship emerges

    March 12, 2026

    Opinion | The Democrats Could Still Mess This Up

    March 12, 2026

    Zelenskyy says Ukraine awaits White House sign-off on US drone production deal

    March 12, 2026

    XRP Negative Funding Continues, Crashes To Levels Not Seen Since 2022

    March 12, 2026

    Ethereum Wallet Growth Goes Parabolic, Outpaces Top Cryptos

    March 12, 2026

    Coinbase CPO Rejects Claims Of Opposing Bitcoin Tax Relief As Jack Dorsey Demands Clarity From Brian Armstrong

    March 12, 2026

    NVIDIA- and Uber-backed Nuro is testing autonomous vehicles in Tokyo

    March 12, 2026
    Our Picks

    Mastering Question-Asking for Engineers – IEEE Spectrum

    January 28, 2026

    Opinion | Is Your Social Life Missing Something? This Conversation Is for You.

    February 3, 2026

    El Salvador Partners With Simple Proof To Timestamp Government Documents On Bitcoin Blockchain

    November 28, 2025

    Bitcoin Settles At $113,000 A Week After Hitting New Highs

    October 15, 2025

    EU reportedly opens another probe into Google’s ads pricing

    February 12, 2026

    Vitalik Buterin proposes to cap gas usage per Ethereum transaction to boost zkVM compatibility, security

    July 7, 2025

    Physics Professor Credits Collaboration for Her Success

    February 24, 2026
    Categories
    • Bitcoin News
    • Blockchain
    • Cricket
    • eSports
    • Ethereum
    • Finance
    • Football
    • Formula 1
    • Healthy Habits
    • Latest News
    • Mindful Wellness
    • NBA
    • Opinions
    • Politics
    • Sports
    • Sports Trends
    • Tech Analysis
    • Tech News
    • Tech Updates
    • US News
    • Weight Loss
    • World Economy
    • World News
    • Privacy Policy
    • Disclaimer
    • Terms and Conditions
    • About us
    • Contact us
    Copyright © 2025 Freshusnews.com All Rights Reserved.

    Type above and press Enter to search. Press Esc to cancel.