[HN Gopher] Claude says "You're absolutely right!" about everything
___________________________________________________________________
Claude says "You're absolutely right!" about everything
Author : pr337h4m
Score : 611 points
Date : 2025-08-13 06:59 UTC (16 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| rahidz wrote:
| I'm sure they're aware of this tendency, seeing as "You're
| absolutely right." was their first post from the @claudeAI
| account on X: https://x.com/claudeai/status/1950676983257698633
|
| Still irritating though.
| boogieknite wrote:
| early days for all of this but theyve solved so many seemingly
| more complicated problems id think there would be a toggle
| which could remove this from any response
|
| based on your comment maybe its a brand thing? like "just do
| it" but way dumber. we all know what "you're absolutely right"
| references so mission accomplished if its marketing
| conartist6 wrote:
| And research articles indicate that when the model computes that
| it should employ sycophantism it becomes less useful in every
| other way, just like a real sycophant.
| motorest wrote:
| > And research articles indicate that when the model computes
| that it should employ sycophantism it becomes less useful in
| every other way, just like a real sycophant.
|
| The end goal of a sycophant is to gain advantage with their
| flattery. If sycophantic behavior gets Claude's users to favour
| Claude over other competing LLM services, it proves more useful
| to the service provider.
| AstralStorm wrote:
| Until users find out it's less useful to the user because of
| that.
|
| Or it causes some tragedies...
| pera wrote:
| The problem is that the majority of user interaction
| doesn't need to be "useful" (as in increasing
| productivity): the majority of users are looking for
| entertainment, so turning up the sycophancy knob makes
| sense from a commercial point of view.
|
| It's just like adding sugar in foods and drinks.
| astrange wrote:
| Not sure anyone's entertained by Claude. It's not really
| an entertaining model. Smart and enthusiastic, yes.
| vintermann wrote:
| You're ... Wait, never mind.
|
| I'm not so sure sycophancy is best for entertainment,
| though. Some of the most memorable outputs of AI dungeon
| (an early GPT-2 based dialog system tuned to mimic a
| vaguely Zork-like RPG) were when the bot gave the
| impression of being fed up with the player's antics.
| motorest wrote:
| > I'm not so sure sycophancy is best for entertainment,
| though.
|
| I don't think "entertainment" is the right concept.
| Perhaps the right concept is "engagement". Would you
| prefer to interact with a chatbot that hallucinated or
| was adamant you were wrong, or would you prefer to engage
| with a chatbot that built upon your input and outputted
| constructive messages that were in line with your
| reasoning and train of thought?
| pitched wrote:
| Some of the open models like kimi k2 do a better job of
| pushing back. It does feel a bit annoying to use them
| when they don't just immediately do what you tell them.
| Sugar-free is a good analogy!
| AznHisoka wrote:
| I doubt humanity will figure that out, but maybe I'm too
| cynical
| kruffalon wrote:
| Well, aren't we at the stage where the service providers
| are fighting for verbs and brand recognition, rather than
| technological advances?
|
| If there is no web-search, only googling, it doesn't matter
| how bad the results are for the user as long as the
| customer gets what they paid for.
| crinkly wrote:
| Why tech CEOs love LLMs. Ultimate yes man.
| ryandrake wrote:
| That's kind of what I was guessing[1], too. Everyone in these
| CEOs' orbits kisses their asses, and tells them they're
| right. So they have come to expect this kind of supplication
| in communication. This expectation percolates down into the
| product, and at the end of the day, the LLM starts to sound
| exactly like a low-level employee speaking to his CEO.
|
| 1: https://news.ycombinator.com/item?id=44889123
| basfo wrote:
| This bug report is absolutely right
| 334f905d22bc19 wrote:
| He really is. I find it even more awful when you are pointing
| out that Claude did something wrong and it responds like that.
| You can even accuse it of doing something wrong when it actually
| gave a correct answer, and it will still respond like this (not
| always, but often). When I use Claude chat on the website I always
| select the "concise" style, which works quite nicely though. I
| like it.
| koakuma-chan wrote:
| Related: I recently learned that you can set model verbosity
| in OpenAI API.
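| For example, a minimal sketch in Python (this assumes the `verbosity`
| field that newer OpenAI models accept via the Responses API; the exact
| parameter name and supported models may differ, so check the API docs):
|
|     from openai import OpenAI
|
|     client = OpenAI()  # reads OPENAI_API_KEY from the environment
|     response = client.responses.create(
|         model="gpt-5",
|         input="Explain optimistic locking in two sentences.",
|         text={"verbosity": "low"},  # assumed values: "low" | "medium" | "high"
|     )
|     print(response.output_text)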
| UncleEntity wrote:
| Yeah, I was working through the design of part of this thing
| I've been working on and noticed that every time I would ask a
| follow up question it would change its opinion to agree that
| this new iteration was the best thing since sliced bread. I
| eventually had to call it out to get an 'honest' assessment of
| the various options we were discussing, since what I cared about
| wasn't 'my thing' being correct but the system as a whole being
| correct.
|
| And it's not like we were working on something too complicated
| for a daffy robot to understand, just trying to combine two
| relatively simple algorithms to do the thing which needed to be
| done in a way which (probably) hasn't been done before.
| radarsat1 wrote:
| I find Gemini is also hilariously enthusiastic about telling you
| how amazingly insightful you are being, almost no matter what you
| say. Doesn't bother me much, I basically just ignore the first
| paragraph of any reply, but it's kind of funny.
| unglaublich wrote:
| It bothers me a lot, because I know a lot of people insert the
| craziest anti-social views and will be met with enthusiasm.
| malfist wrote:
| I was feeding Gemini faux physician's notes trying to get it to
| produce diagnoses, and every time I fed it new information
| it told me how great I was at taking comprehensive medical
| notes. So irritating. It also had a tendency to tell me
| everything was a medical crisis and the patient needed to see
| additional specialists ASAP. At one point telling me that a
| faux patient with normal A1C, fasted glucose and no diabetes
| needed to see an endocrinologist because their nominal lab
| values indicated something was seriously wrong with their
| pancreas or liver because the patient was extremely physically
| active. Said they were "wearing the athlete mask" and their
| physical fitness was hiding truly terrible labs.
|
| I pushed back and told it it was overreacting and it told me I
| was completely correct and very insightful and everything was
| normal with the patient and that they were extremely healthy.
| notahacker wrote:
| And then those sort of responses get parlayed into "chatbots
| give better feedback than medical doctors" headlines
| according to studies that rate them as high in "empathy" and
| don't worry about minor details like accuracy....
| cvwright wrote:
| This illustrates the dangers of training on Reddit.
| ryandrake wrote:
| I'm sure if you ask it for any relationship advice, it will
| eventually take the Reddit path and advise you to
| dump/divorce your partner, cut off all contact, and involve
| the police for a restraining order.
| uncircle wrote:
| "My code crashes, what did I do wrong?"
|
| "NTA, the framework you are using is bad and should be
| ashamed of itself. What you can try to work around the
| problem is ..."
| nullc wrote:
| It's not a (direct) product of reddit. The non-RLHFed base
| models absolutely do not exhibit this sycophantic behavior.
| cubefox wrote:
| I recently had Gemini disagree with me on a point about
| philosophy of language and logic, but it phrased the
| disagreement _very_ politely, by first listing all the
| related points in which it agreed, and things like that.
|
| So it seems that LLM "sycophancy" isn't _necessarily_ about
| dishonest agreement, but possibly about being very polite.
| Which doesn't need to involve dishonesty. So LLM companies
| should, in principle, be able to make their models both
| subjectively "agreeable" and honest.
| erikaxel wrote:
| 100%! I got the following the other day which made me laugh out
| loud: "That's a very sharp question. You've correctly
| identified the main architectural tension in this kind of data
| model"
| yellowpencil wrote:
| A friend of a friend has been in a rough patch with her spouse
| and has been discussing it all with ChatGPT. So far ChatGPT has
| pretty much enthusiastically encouraged divorce, which seems
| like it will happen soon. I don't think either side is innocent
| but to end a relationship over probabilistic token prediction
| with some niceties thrown in is something else.
| ryandrake wrote:
| Yea, scary. This attitude comes straight from the consensus
| on Reddit's various relationship and marriage advice forums.
| smoe wrote:
| I agree that Gemini is overly enthusiastic, but at least in my
| limited testing, 2.5 Pro was also the only model that sometimes
| does say "no."
|
| Recently I tested both Claude and Gemini by discussing data
| modeling questions with them. After a couple of iterations, I
| asked each model whether a certain hack/workaround would be
| possible to make some things easier.
|
| Claude's response: "This is a great idea!", followed by
| instructions on how to do it.
|
| Gemini's response: "While technically possible, you should
| never do this", along with several paragraphs explaining why
| it's a bad idea.
|
| In that case, the "truth" was probably somewhere in the middle,
| neither a great idea nor the end of the world.
|
| But in the end, both models are so easily biased by subtle
| changes in wording or by what they encounter during web
| searches among other things, that one definitely can't rely on
| them to push back on anything that isn't completely black and
| white.
| kijin wrote:
| Yeah their new business model is called CBAAS, or confirmation
| bias as a service.
| SideburnsOfDoom wrote:
| Echo Chambers Here On Every Service (ECHOES)
| rollcat wrote:
| Your very own circle of sycophants, at an unprecedented
| price!
| time0ut wrote:
| You made a mistake there. 2 + 2 is 5.
|
| <insert ridiculous sequence of nonsense CoT>
|
| You are absolutely right!...
|
| I love the tool, but keeping on track is an art.
| vixen99 wrote:
| Not Claude but ChatGPT - I asked it to pipe down on exactly that
| kind of response. And it did.
| bradley13 wrote:
| Yes, ChatGPT can do this, more or less.
| Xophmeister wrote:
| I've done this in my Claude settings, but it still doesn't seem
| that keen on following it:
|
| > Please be measured and critical in your response. I
| appreciate the enthusiasm, but I highly doubt everything I say
| is "brilliant" or "astute", etc.! I prefer objectivity to
| sycophancy.
| dncornholio wrote:
| Too many context tokens.
| lucianbr wrote:
| > I appreciate the enthusiasm, but I highly doubt everything
| I say is "brilliant" or "astute", etc.!
|
| Is this part useful as instruction for a model? Seems
| targeted to a human. And even then I'm not sure how useful it
| would be.
|
| The first and last sentence should suffice, no?
| alienbaby wrote:
| Remove everything after .... 'in your response' and you will
| likely get better results.
| rcfox wrote:
| I wonder if asking it to respond in the style of Linus
| Torvalds would be an improvement.
| tempoponet wrote:
| Yet I'll tell it 100 times to stop using em dashes and it
| refuses.
| Sharlin wrote:
| What kind of monster would tell a LLM to avoid correct
| typography?
| astrange wrote:
| GPT-5 ends every single response with something like:
|
| > If you'd like, I can demonstrate...
|
| or
|
| > If you want...
|
| and that's /after/ I put in instructions to not do it.
| Sharlin wrote:
| It's weird that it does that given that the leaked system
| prompt explicitly told it not to.
| bradley13 wrote:
| This applies to so many AIs. I don't want a bubbly sycophant. I
| don't want a fake personality or an anime avatar. I just want a
| helpful assistant.
|
| I also don't get wanting to talk to an AI. Unless you are alone,
| that's going to be irritating for everyone else around.
| scotty79 wrote:
| Sure but different people have different preferences. Some
| people mourn replacement of GPT4 with 5 because 5 has way less
| of a bubbly personality.
| WesolyKubeczek wrote:
| I, for one, say good riddance to it.
| bn-l wrote:
| But it doesn't say ima good boy anymore :(
| cubefox wrote:
| There is evidence from Reddit that particularly women used
| GPT-4o as their AI "boyfriend". I think that's unhealthy
| behavior and it is probably net positive that GPT-5 doesn't
| do that anymore.
| ivan_gammel wrote:
| GPT-5 still does that as they will soon discover.
| cubefox wrote:
| No. They complained about GPT-5 because it did _not_ act
| like their boyfriend anymore.
| scotty79 wrote:
| Why is it unhealthy? If you just want a good word that you
| don't have in your life why should you bother another
| person if a machine can do it?
| cubefox wrote:
| Because it's a mirage. People want to be loved, but
| GPT-4o doesn't love them. It only creates an optical
| illusion of love.
| 9rx wrote:
| People want the feelings associated with love. They don't
| care how they get it.
|
| The advantage of "real" love, health wise, is that the
| other person acts as a moderator. When things start to
| get out of hand they will back away. Alternatives, like
| drugs, tend to spiral out of control when an
| individual's self-control is the only limiting factor.
| GPT on the surface seems more like being on the drug end
| of the spectrum, ready to love bomb you until you can't
| take it anymore, but the above suggests that it will also
| back away, so perhaps its love is actually more like
| another person than it may originally seem.
| cubefox wrote:
| > People want the feelings associated with love. They
| don't care how they get it.
|
| Most people want to be loved, not just believe they are.
| They don't want to be unknowingly deceived. For the same
| reason they don't want to be unknowingly cheated on. If
| someone tells them their partner is a cheater, or an
| unconscious android, they wouldn't be mad at the
| person who gives them this information, but at their
| partner.
|
| That's the classic argument against psychological
| hedonism. See
| https://en.wikipedia.org/wiki/Experience_machine
| 9rx wrote:
| _> For the same reason they don't want to be unknowingly
| cheated on._
|
| That's the thing, though, there is nothing about being a
| cheater that equates to loss of love (or never having
| loved). In fact, it is telling that you shifted gears to
| the topic of deceit rather than love.
|
| It is true that feelings of love are often lost when one
| has been cheated on. So, yes, it is a fair point that for
| many those feelings of love aren't made available if one
| does not also have trust. There is an association there,
| so your gear change is understood. I expect you are
| absolutely right that if those aforementioned women
| dating GPT-4o found out that it wasn't an AI bot, but
| actually some guy typing away at a keyboard, they would
| lose their feelings even if the guy on the other side did
| actually love them!
|
| Look at how many people get creeped out when they find
| out that a person they are disinterested in loves them.
| Clearly being loved isn't what most people seek. They
| want to feel the feelings associated with love. All your
| comment tells us, surprising nobody, is that the feelings of
| love are not like a tap you can simply turn on (well,
| maybe in the case of drugs). The feelings require a
| special environment where everything has to be just
| right, and trust is often a necessary part of that
| environment. Introduce deceit and away go the feelings.
| scotty79 wrote:
| If you get a massage from massage machine is it also a
| mirage? If you use a vibrator, is it also a mirage? Why does it
| suddenly become an unhealthy mirage if you need words to
| tickle yourself?
| cubefox wrote:
| A vibrator still works as intended if you believe it
| doesn't love you. GPT-4o stops working as intended if you
| believe it doesn't love you. The latter relies on an
| illusion, the former doesn't.
|
| (More precisely, a vibrator still relies on an illusion
| in the evolutionary sense: it doesn't create offspring,
| so over time phenotypes who like vibrators get replaced
| by those who don't.)
| scotty79 wrote:
| That's simply not true. Vibrators don't really work that
| well if you somehow suppress the fantasies during use.
| Same way that GPT-4o works better if you fantasize
| briefly that it might love you when it says what it does.
| Almost all people who use it in this manner are fully
| aware of its limitations. While they are phrasing it as
| "I lost my love" their complaints are really of the kind
| of "my toy broke". And they find similar mitigation
| strategies for the problem, finding another toy, giving
| each other tips on how to use what's available.
|
| As for the evolutionary perspective, evolution is not
| that simple. Gay people typically have way less offspring
| than vibrator users and somehow they are still around and
| plentiful.
|
| Brains are a messy hodgepodge of various subsystems. Clever
| primates have found a multitude of ways to mess with them to
| make life more bearable. So far the species continues
| regardless.
| catigula wrote:
| GPT-5 still has a terrible personality.
|
| "Yeah -- _some bullshit_ "
|
| still feels like trash as the presentation is of a friendly
| person rather than an unthinking machine, which it is. The
| false presentation of humanness is a huge problem.
| ted_bunny wrote:
| I feel strongly about this. LLMs should not try to write
| like humans. Computer voices should sound robotic. And when
| we have actual androids walking around, they should stay on
| the far side of the uncanny valley. People are already
| anthropomorphizing them too much.
| Applejinx wrote:
| It can't, though. It's language. We don't have a body of
| work constituting robots talking to each other in words.
| Hardly fair to ask LLMs not to write like humans when
| humans constructed everything they're built on.
| catigula wrote:
| These models are purposely made to sound more 'friendly'
| through RLHF
| scotty79 wrote:
| The chat that rejects you because your prompt put it in a
| bad mood sounds less useful.
| andrewstuart wrote:
| I want no personality at all.
|
| It's software. It should have no personality.
|
| Imagine if Microsoft Word had a silly chirpy personality that
| kept asking you inane questions.
|
| Oh, wait ....
| gryn wrote:
| Keep Clippy's name out of your mouth! He's a good boy. /s
| uncircle wrote:
| I want an AI modeled after short-tempered stereotypical Germans
| or Eastern Europeans, not copying the attitude of non-
| confrontational Californians that say "dude, that's awesome!" a
| dozen times a day.
|
| And I mean that unironically.
| finaard wrote:
| As a German not working in Germany - I often get the feedback
| that the initial contact with me is rather off-putting, but
| over time people start appreciating my directness.
| j4coh wrote:
| Bless your heart.
| bluGill wrote:
| While you are not alone, all evidence points to the vast
| majority of people preferring "yes men" as their advisors.
| Often to their eventual harm.
| threetonesun wrote:
| One would think that if AI was as good at coding as they
| tell us it is, a style toggle would take all of five, ten
| minutes tops.
| rob74 wrote:
| Ok, then I can write an LLM too - because the guys you
| mention, if you asked them to write your code for you, would
| just tell you to get lost (or a more strongly phrased
| variation thereof).
| anal_reactor wrote:
| The problem is, performing social interaction theatre is way
| more important than actually using logic to solve issues.
| Look at how many corporate jobs are 10% engineering and 90%
| kissing people's asses in order to maintain social cohesion
| and hierarchy. Sure, you say you want "short-tempered
| stereotypical Germans or Eastern Europeans" but guess what -
| most people say some variation of that, but when they
| actually see such behavior, they get upset. So we continue
| with the theatre.
|
| For reference, see how Linus Torvalds was criticized for
| trying to protect the world's most important open source
| project from weaponized stupidity at the cost of someone
| experiencing minor emotional damage.
| uncircle wrote:
| That is a fair assessment, but on the other hand, yes men
| are not required to do things, despite people liking them.
| You can achieve great things even if your team is made of
| Germans.
|
| My tongue-in-cheek comment wonders if having actors with a
| modicum of personality might be better than just being
| surrounded by over-enthusiastic bootlickers. In my
| experience, many projects would benefit from someone saying
| "no, that is silly."
| Yizahi wrote:
| Not possible.
|
| /s
| giancarlostoro wrote:
| I did as a test. Grok has "workspaces" and you can add a pre-
| prompt, so I made a Kamina (from Gurren Lagann) "workspace" so I
| could ask it silly questions and get back hyped-up answers from
| "Kamina". It worked decently. My point is that some tools out
| there let you "pre-prompt" based on your context. I believe
| Perplexity has this as well; they don't make it easy to find
| though.
| mox111 wrote:
| GPT-5 has used the phrase "heck yes!" a handful of times to me so
| far. I quite enjoy the enthusiasm but it's not a phrase you hear
| very often.
| 0points wrote:
| Heck that's so exciting! Lets delve even deeper!
| moolcool wrote:
| GPT-5 trained heavily on the script for Napoleon Dynamite
| bn-l wrote:
| I'm getting "oof" a lot.
|
| "Oof (emdash) that sounds like a real issue..."
|
| "Oof, sorry about that"
|
| Etc
| mettamage wrote:
| What llm isn't a sycophant?
| jeffhuys wrote:
| Grok. These were all in continuation, not first reply.
|
| > Thank you for sharing the underlying Eloquent query...
|
| > The test is failing because...
|
| > Here's a bash script that performs...
|
| > Got it, you meant...
|
| > Based on the context...
|
| > Thank you for providing the additional details...
| notachatbot123 wrote:
| 3/6 of those are sycophantic.
| sillywabbit wrote:
| Two of those three sound more like a bored customer service
| rep.
| jeffhuys wrote:
| Best one of all LLMs I've tried so far. And not only in
| sycophancy.
| lomlobon wrote:
| Kimi K2 is notably direct and free of this nonsense.
| meowface wrote:
| I am absolutely no fan of Twitter or its owner(s), but Grok*
| actually is pretty good at this overall. It usually concludes
| responses with some annoying pithy marketingspeak LLMese phrase
| but other than that the tone feels overall less annoying. It's
| not necessarily flattering to either the user who invoked it or
| anyone else in the conversation context (in the case of
| @grok'ing in a Twitter thread).
|
| *Before and after the Hitler arc, of course.
| vbezhenar wrote:
| I'm using ChatGPT with "Robot" personality, and I really like
| the style it uses. Very short and informative, no useless
| chatter at all.
|
| I guess that personality is just a few words in the context
| prompt, so probably any LLM can be tailored to any style.
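| As a rough sketch of how that can look (the wording of OpenAI's
| actual "Robot" preset isn't public, so the system prompt below is
| just a hypothetical stand-in):
|
|     from openai import OpenAI
|
|     client = OpenAI()
|     reply = client.chat.completions.create(
|         model="gpt-4o-mini",
|         messages=[
|             # A few words of style instruction stand in for the "personality".
|             {"role": "system",
|              "content": "Answer tersely. No greetings, no praise, no filler."},
|             {"role": "user", "content": "Why is my Docker build slow?"},
|         ],
|     )
|     print(reply.choices[0].message.content)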
| Ajedi32 wrote:
| They're trained to be sycophants as a side effect of the same
| reinforcement learning process that trains them to dutifully
| follow all user instructions. It's hard (though not impossible)
| to teach one without the other, especially if other related
| qualities like "cheerful", "agreeable", "helpful", etc. also
| help the AI get positive ratings during training.
| __MatrixMan__ wrote:
| Claude also responds to tool output with "Perfect" even when less
| than 50% of the desired outcome is merely adequate.
| sluongng wrote:
| I don't view it as a bug. It's a personality trait of the model
| that made "user steering" much easier, thus helping the model to
| handle a wider range of tasks.
|
| I also think that there will be no "perfect" personality out
| there. There will always be folks who view some traits as
| annoying icks. So, some level of RL-based personality
| customization down the line will be a must.
| lvl155 wrote:
| Yeah, because I am sure if they told you how stupid and wrong
| you are, people would continue to use it.
|
| It's superficial, but I'm not sure why people get so annoyed about it.
| It's an artifact.
|
| If devs truly want a helpful coding AI based on real devs doing
| real work, you'd basically opt for telemetry and allow
| Anthropic/OpenAI to train on your work. That's the only way.
| Otherwise we are at the mercy of "devs" these companies hire to
| do training.
| FirmwareBurner wrote:
| I would actually like it if robots would use slurs like a
| Halo/CoD lobby from 2006 Xbox Live. It would make them feel
| more genuine. That's why people used to like using Grok so
| much, since it was never afraid to get edgy if you asked it to.
| spicyusername wrote:
| It's not superficial. It's material to Claude regularly
| returning bad information.
|
| If you phrase a question like, "should x be y?", Claude will
| almost always say yes.
| lvl155 wrote:
| If this is what you think, you might want to go back and
| learn how these LLMs work and specifically for coding tasks.
| This is a classic case of know your tools.
| criddell wrote:
| > Yeah, because I am sure if they told you how stupid and wrong
| you are, people would continue to use it.
|
| Are sycophant and jerk the only two options?
| lvl155 wrote:
| Maybe don't take its responses so personally? You're the one
| anthropomorphizing an LLM bot. Again, it's just part of the
| product. If you went to a restaurant and your server was
| extra nice but superficial you wouldn't constantly complain
| about how bad the food was. Because that's exactly what this
| is.
| criddell wrote:
| UX matters and telling users that the problem lies with
| them is a tough sell especially when tone is something the
| LLM vendors specify.
| albert_e wrote:
| sidenote observation -
|
| it seems the username "anthropic" on GitHub was taken by a developer
| from Australia more than a decade ago, so Anthropic went with
| "https://github.com/anthropics/" with an 's' at the end :)
| bn-l wrote:
| Ahhh. Thank you! I reported a vscode extension because I
| thought it was phishing. In my defence they made zero effort to
| indicate that it was the official extension.
| world2vec wrote:
| Same with the Twitter/X handle @Anthropic, belongs to a man
| named Paul Jankura. Anthropic uses @AnthropicAI. Poor guy must
| be spammed all day long.
| danielbln wrote:
| Annoying, but easy to mitigate: add "be critical" to Claude.md or
| whatever.
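| One possible phrasing for such a CLAUDE.md entry (how reliably the
| model actually follows it varies):
|
|     # CLAUDE.md
|     - Do not open replies with praise or agreement ("You're absolutely
|       right", "Great idea", etc.).
|     - If a suggestion is wrong or suboptimal, say so directly and explain why.
|     - Only agree after checking the claim against the code or docs.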
| jonstewart wrote:
| The real bug is this dross counts against token limits.
| tantalor wrote:
| > The model should be...
|
| Free tip for bug reports:
|
| The "expected" should not suggest solutions. Just say what was
| the expected behavior. Don't go beyond that.
| gitaarik wrote:
| You're absolutely right!
|
| I also get this too often, when I sometimes say something like
| "would it be maybe better to do it like this?" and then it
| replies that I'm absolutely right, and starts writing new code.
| While I was rather wondering what Claude thinks and wanted it to
| advise me on whether that's the best way to go forward.
| psadri wrote:
| I have learnt to not ask leading questions. Always phrase
| questions in a neutral way and ask for pro/con analysis of each
| option.
| mkagenius wrote:
| But then it makes an obvious mistake and you correct it and
| it says "you are absolutely right". Which is fine for that
| round but you start doubting whether it's just sycophancy.
| gryn wrote:
| You're absolutely right! It's just sycophancy.
| shortrounddev2 wrote:
| Yeah I've learned to not really trust it with anything
| opinionated. Like "whats the best way to write this
| function" or "is A or B better". Even asking for pros/cons,
| its often wrong. You need to really only ask LLMs for
| verifiable facts, and then verify them
| giancarlostoro wrote:
| If you ask for sources the output will typically be either
| more correct, or you will be able to better assess the
| source of the output.
| zaxxons wrote:
| Do not attempt to mold the LLM into everything you expect
| instead of just focusing on specific activities you need it to
| do. It may or may not seem to do what you want, but it will do a
| worse job at the actual tasks you need to complete.
| YeahThisIsMe wrote:
| It doesn't think
| CureYooz wrote:
| You're absolutely right!
| jghn wrote:
| It doesn't fully help in this situation but in general I've
| found to never give it an either/or and to instead present it
| with several options. It at least helps cut down on the
| situations where Claude runs off and starts writing new code
| when you just wanted it to spit out "thoughts".
| ethin wrote:
| It does this to me too. I have to add instructions like "Do not
| hesitate to push back or challenge me. Be cold, logical,
| direct, and engage in debate with me." to actually get it to
| act like something I'd want to interact with. I know that in
| most cases my instinct is probably correct, but I'd prefer if
| something that is supposedly superhuman and infinitely smarter
| than me (as the AI pumpers like to claim) would, you know,
| actually call me out when I say something dumb, or make
| incorrect assumptions? Instead of flattering me and making me
| "think" I'm right when I might be completely wrong?
|
| Honestly I feel like it is this exact behavior from LLMs which
| have caused cybersecurity to go out the window. People get
| flattered and glazed wayyyy too much about their ideas because
| they talk to an LLM about it and the LLM doesn't go "Uh, no,
| dumbass, doing it this way would be a horrifically bad idea!
| And this is why!" Like, I get the assumption that the user is
| usually correct. But even if the LLM ends up spewing bullshit
| when debating me, it at least gives me other avenues to
| approach the problem that I might've not thought of when
| thinking about it myself.
| skerit wrote:
| This is indeed super annoying. I always have to add something
| like "Don't do anything just yet, but could it be ..."
| Pxtl wrote:
| Yes, I've had to tell it over and over again "I'm just
| researching options and feasibility, I don't want code".
| Self-Perfection wrote:
| I suspect this might be cultural thing. Some people might
| formulate their strong opinions that your approach is bad and
| your task should be done another way as gentle suggestions to
| avoid hurting your feelings. And Claude learned to stick to
| this cultural norm of communication.
|
| As a workaround I try to word my questions to Claude in a way
| that does not leave any possibility to interpret them as
| showing my preferences.
|
| For instance, instead of "would it be maybe better to do it
| like $alt_approach?" I'd rather say "compare with
| $alt_approach, pros and cons"
| Pxtl wrote:
| It feels like it trained on a whole lot of "compliment
| sandwich" responses and then failed to learn from the meat of
| that sandwich.
| Someone wrote:
| I agree this is a bug, but I also think it cannot be fully fixed
| because there is a cultural aspect to it: what a phrase means
| depends on the speaker.
|
| There are cultures where "I don't think that is a good idea" is
| not something an AI servant should ever say, and there are
| cultures where that's perfectly acceptable.
| smoghat wrote:
| I just checked my most recent thread with Claude. It said "You're
| absolutely right!" 12 times.
| andrewstuart wrote:
| Someone will make a fortune by doubling down on this and making a
| personal AI that just keeps telling people how right and awesome
| they are ad infinitum.
| FergusArgyll wrote:
| That person's name rhymes with tam saltman
| apwell23 wrote:
| I've been using claude code for a while and it has changed my
| personality. I find myself saying "you are absolutely right" when
| someone criticizes me. i am more open to feedback.
|
| not a joke.
| kevinpacheco wrote:
| Another Claude bug: https://i.imgur.com/kXtAciU.png
| micah94 wrote:
| That's frightening. And we want these things driving our cars?
| krapp wrote:
| Of course, it provides greater value to shareholders.
|
| Just try to go limp.
| danparsonson wrote:
| That looks like common-or-garden hallucination to me
| andrewstuart wrote:
| ChatGPT is overly familiar and casual.
|
| Today it said "My bad!" After it got something wrong.
|
| Made me want to pull its plug.
| calvinmorrison wrote:
| in my recent chat
|
| "You're absolutely right."
|
| "Now that's the spirit! "
|
| "You're absolutely right about"
|
| "Exactly! "
|
| "Ah, "
|
| "Ah,"
|
| "Ah,"
|
| "Ha! You're absolutely right"
|
| You make an excellent point!
|
| You're right that
| baggachipz wrote:
| I'm pretty sure they want it kissing people's asses because it
| makes users feel good and therefore more likely to use the LLM
| more. Versus, if it just gave a curt and unfriendly answer, most
| people (esp. Americans) wouldn't like to use it as much. Just a
| hypothesis.
| teekert wrote:
| But it really erodes trust. First couple of times I felt that
| it indeed confirmed what I thought, then I became suspicious and
| I experimented with presenting my (clearly worse) take on
| things, it still said I was absolutely right, and now I just
| don't trust it anymore.
|
| As people here are saying, you quickly learn to not ask leading
| questions, just assume that its first take is pretty optimal
| and perhaps present it with some options if you want to change
| something.
|
| There are times when it will actually say I'm not right though.
| But the balance is off.
| nh2 wrote:
| Good, because you shouldn't trust it in the first place.
|
| These systems are still wrong so often that a large amount of
| distrust is necessary to use them sensibly.
| teekert wrote:
| Yeah, probably good indeed.
| neutronicus wrote:
| I lie and present my ideas as coming from colleagues.
| Lendal wrote:
| For me, it's getting annoying. Not every question is an
| excellent question. Not every statement is a brilliant
| observation. In fact, I'm almost certain every idea I've typed
| into an LLM has been thought of before by someone else, many
| many times.
| runekaagaard wrote:
| Heh - yeah have had trillion dollar ideas many times :)
| zozbot234 wrote:
| > Not every question is an excellent question. Not every
| statement is a brilliant observation.
|
| A brilliant observation, Dr. Watson! Indeed, the merit of an
| inquiry or an assertion lies not in its mere utterance but in
| the precision of its intent and the clarity of its reasoning!
|
| One may pose dozens of questions and utter scores of
| statements, yet until each is finely honed by observation and
| tempered by logic, they must remain but idle chatter. It is
| only through genuine quality of thought that a question may
| be elevated to excellence, or a remark to brilliance.
| RayVR wrote:
| As an American, using it for technical projects, I find it
| extremely annoying. The only tactic I've found that helps is
| telling it to be highly critical. I still get overly positive
| starts but the response is more useful.
| baggachipz wrote:
| I think we, as Americans who are technical, are more
| appreciative of short and critical answers. I'm talking about
| people who have soul-searching conversations with LLMs, of
| which there are many.
| century19 wrote:
| Yes and I've seen this at work. People saying I asked the LLM
| and it said I was right. Of course it did. It rarely doesn't.
| zozbot234 wrote:
| You're absolutely right! Americans are a bit weird like that,
| most people around the world would be perfectly okay with short
| and to-the-point answers. Especially if those answers are
| coming from a machine that's just giving its best imitation of
| a stochastic hallucinating parrot.
| rootsudo wrote:
| You're absolutely right! I agree with everything you said but
| didn't want to put in the effort to write a funny, witty
| follow-up!
| tankenmate wrote:
| Claude is very "American", just try asking it to use English
| English spelling instead of American English spelling; it
| lasts about 3~6 sentences before it goes back. Also there is
| only American English in the UI (like the spell checker, et
| al), in Spanish you get a choice of dialects, but not
| English.
| pxka8 wrote:
| In contrast, o3 seems to be considerably more British - and
| it doesn't suck up as much in its responses. I thought
| these were just independent properties of the model, but
| now that you mention it, could the disinclination to fawn
| so much be related to its less American style?
| drstewart wrote:
| >most people around the world would be perfectly okay with
| short and to-the-point answers
|
| Wow, this is really interesting. I had no idea Japan, for
| example, had such a focus on blunt, direct communication. Can
| you share your clearly extensive research in this area so I
| can read up on this?
| mvdtnz wrote:
| Do you realise that doing the thing that the article is
| complaining about is not only annoying and incredibly
| unfunny, but also just overdone and boring? Have one original
| thought in your life.
| soulofmischief wrote:
| I'm curious what Americans have to do with this, do you have
| any sources to back up your conjecture, or is this just
| prejudice?
| megaloblasto wrote:
| It's common for foreigners to come to America and feel that
| everyone is extremely polite. Especially people from eastern
| bloc countries, which tend to be very blunt and direct. I for
| one think that the politeness in America is one of the
| culture's better qualities.
|
| Does it translate into people wanting sycophantic chat bots?
| Maybe, but I don't know a single American who actually likes
| it when LLMs act that way.
| zozbot234 wrote:
| > I for one think that the politeness in America is one of
| the culture's better qualities.
|
| Politeness makes sense as an adaptation to low social
| trust. You have no way of knowing whether others will
| behave in mutually beneficial ways, so heavy standards of
| social interaction evolve to compensate and reduce risk.
| When it's taken to an excess, as it probably is in the U.S.
| (compared to most other developed countries) it just
| becomes grating for everyone involved. It's why public-
| facing workers invariably complain about the draining
| "emotional labor" they have to perform - a term that
| literally doesn't exist in most of the world!
| megaloblasto wrote:
| That's one way of looking at it. A bit of a cynical view
| I might add. People are polite to each other for many
| reasons. If you hold the door and smile at an old lady,
| it usually isn't because you don't trust her.
|
| Service industry in America is a different story that
| could use a lot of improvement.
| SoftTalker wrote:
| > You have no way of knowing whether others will behave
| in mutually beneficial ways
|
| Or is carrying a gun...
| NoGravitas wrote:
| Politeness is one thing, toxic positivity is quite another.
| My experience is that Americans have (or are
| expected/required to have) too much of the latter, too
| little of the former.
| jebarker wrote:
| People really over-exaggerate the claim of friendly and
| polite US service workers and people in general. Obviously
| you can find the full spectrum of character types across the
| US. I've lived 2/3 of my life in Britain and 1/3 in the US
| and I honestly don't think there's much difference in
| interactions day to day. If anything I mostly just find
| Britain to be overly pessimistic and gloomy now.
| Strom wrote:
| Britain, or at the very least England, is also well known
| for its extreme politeness culture. Also, it's not that the
| US has a culture of genuine politeness, just a facade of
| it.
|
| I have only spent about a year in the US, but to me the
| difference was stark from what I'm used to in Europe. As an
| example, I've never encountered a single shop cashier who
| didn't talk to me. Everyone had something to say, usually a
| variation of _How's it going?_. Contrasting this to my
| native Estonia, where I'd say at least 90% of my
| interactions with cashiers involves them not making a
| single sound. Not even in response to me saying hello, or
| to state the total sum. If they're depressed or in an
| otherwise non-euphoric mood, they make no attempt to fake
| it. I'm personally fine with it, because I don't go looking
| for social connections from cashiers. Also, when they do
| talk to me in a happy manner, I know it's genuine.
| baggachipz wrote:
| Prejudice, based on my anecdotal experience. I live in the US
| but have spent a decent amount of time in Europe (mostly
| Germany).
| miroljub wrote:
| > ... do you have any sources to back up your conjecture, or
| is this just prejudice?
|
| Let me guess, you consider yourself a progressive left
| democrat.
|
| Do I have any source for that? No, but I noticed a pattern
| where progressive left democrats ask for a source to
| discredit something that is clearly a personal observation or
| opinion, and by its nature doesn't require any sources.
|
| The only correct answer is: it's an opinion, accept it or
| refute it yourself, you don't need external validation to
| participate in an argument. Or maybe you need ;)
| soulofmischief wrote:
| > Let me guess, you consider yourself a progressive left
| democrat
|
| I don't, and your comment is a mockery of itself.
| skywhopper wrote:
| This sort of overcorrection for how non-Americans incorrectly
| perceive Americans' desired interaction modes is actually
| probably a good theory.
| emilfihlman wrote:
| As a Finn, it makes me want to use it much, much less if it
| kisses ass.
| carlosjobim wrote:
| Finns need to mentally evolve beyond this mindset.
|
| Somebody being polite and friendly to you does not mean that
| the person is inferior to you and that you should therefore
| despise them.
|
| Likewise somebody being rude and domineering to you does not
| mean that they are superior to you and should be obeyed and
| respected.
|
| Politeness is a tool and a lubricant, and Finns probably
| lose out on a lot of international business and
| opportunities because of this mentality that you're
| demonstrating. Look at the Japanese for inspiration, who were
| an economic miracle, while sharing many positive values with
| the Finns.
| lucb1e wrote:
| Wow. I lived in Finland for a few months and this does not
| match my experience with them at all. In case it's
| relevant, my cultural background is Dutch... maybe you
| would say the same about us, since we also don't do the
| fake smiles thing? I wouldn't say that we see anyone who's
| polite and friendly as inferior; quite the contrary, it
| makes me want to work with them more rather than less. And
| the logical contrary for the rude example you give. But
| that doesn't mean that _faking_ a cheerful mood all the
| time isn't disingenuous and does not inspire confidence.
| zozbot234 wrote:
| "I never smile if I can help it. Showing one's teeth is a
| submission signal in primates. When someone smiles at me,
| all I see is a chimpanzee begging for its life." While
| this famous quote from _The Office_ may be quite
| exaggerated in many ways, this can nonetheless be a very
| real attitude in some cultures. Smiling too much can make
| you look goofy and foolish at best, and outright
| disingenuous at worst.
| carlosjobim wrote:
| Yes, globally cultures fall into the category where a
| smile is either a display of weakness or a display of
| strength. The latter are more evolved cultures. Of course
| too much is too much.
| emilfihlman wrote:
| You know there is a difference between being polite and
| friendly, and kissing ass, right?
|
| We are also talking about a tool here. I don't want fluff
| from a tool, I want the thing I'm seeking from the tool,
| and in this case it's info. Adding fluff just annoys me
| because it takes more mental power to skip all the
| irrelevant parts.
| Aurornis wrote:
| > Versus, if it just gave a curt and unfriendly answer, most
| people (esp. Americans)
|
| I don't see this as an American thing. It's an extension of the
| current Product Management trend to give software quirky and
| friendly personality.
|
| You can see the trend in more than LLM output. It's in their
| desktop app that has "Good Morning" and other prominent
| greetings. Claude Code has quirky status output like
| "Bamboozling" and "Noodling".
|
| It's a theme throughout their product design choices. I've
| worked with enough trend-following product managers to
| recognize this trend toward infusing expressive personality
| into software.
|
| For what it's worth, the Americans I know don't find it as cute
| or lovable as intended. It feels fake and like an attempt to
| play at emotions.
| apwell23 wrote:
| > For what it's worth, the Americans I know don't find it as
| cute or lovable as intended. It feels fake and like an
| attempt to play at emotions.
|
| Yes they need to "try a completely different approach"
| tho24i234234 wrote:
| It most definitely is an American thing - this is why non-
| native speakers often come across as rude or unfriendly or plain
| stupid.
|
| We don't appreciate how much there is to language.
| hombre_fatal wrote:
| That might characterize their approach to human
| interaction, but I don't think any of us can say who will
| or won't prefer the sycophantic style of the LLM.
|
| It might be the case that it makes the technology far more
| approachable. Or it makes them feel far less silly for
| sharing personal thoughts and opinions with the machine. Or
| it makes them feel validated.
| justusthane wrote:
| > We don't appreciate how much there is to language.
|
| This can't possibly be true, can it? Every language must
| have its own nuance. Non-native English speakers might not
| grasp the nuance of the English language, but the same could be
| said for any one speaking another language.
| marcosdumay wrote:
| Language barriers are cultural barriers.
|
| It's as simple as that. Most people do not expect to
| interact the way that most native English speakers
| expect.
| thwarted wrote:
| > _It's an extension of the current Product Management trend
| to give software quirky and friendly personality._
|
| Ah, Genuine People Personalities from the Sirius Cybernetics
| Corporation.
|
| > _It's in their desktop app that has "Good Morning" and
| other prominent greetings. Claude Code has quirky status
| output like "Bamboozling" and "Noodling"._
|
| This reminded me of a critique of UNIX that, unlike DOS, ls
| doesn't output anything when there are no files. DOS's dir
| command literally tells you there are no files, and this was
| considered, in this critique, to be more polite and friendly
| and less confusing than UNIX. Of course, there's the adage
| "if you don't have anything nice to say, don't say anything
| at all", and if you consider "no files found" to not be nice
| (because it is negative and says "no"), then ls is actually
| being polite(r) by not printing anything.
|
| Many people interact with computers in a conversational
| manner and have anthropomorphized them for decades. This is
| probably influenced by computers being big, foreign, scary
| things to many people, so making them have a softer, more
| handholding "personality" makes them more accessible and
| acceptable. This may be less important these days as
| computers are more ubiquitous and accessible, but the trend
| lives on.
| Vegenoid wrote:
| I worked in an org with offices in America, India, Europe,
| and Israel, and it was not uncommon for the American
| employees to be put off by the directness of the foreign
| employees. It was often interpreted as rudeness, to the
| surprise of the speaker. This happened to the Israel
| employees more than the India or Europe employees, at least
| in part because the India/Europe employees usually tried to
| adapt to the behavior expected by the Americans, while the
| Israel employees largely took pride in their bluntness.
| neutronicus wrote:
| As someone with Israeli family ... they report that
| Americans are not the only ones who react to them like
| this.
| binary132 wrote:
| chatgpt's custom user prompt is actually pretty good for this.
| I've instructed mine to be very terse and direct and avoid
| explaining itself, adding fluff, or affirming me unless asked,
| and it's much more efficient to use that way, although it does
| have a tendency to drift back into sloppy meandering and
| enthusiastic affirming
| simonw wrote:
| If that was the case they wouldn't have so much stuff in their
| system card desperately trying to stop it from behaving like
| this: https://docs.anthropic.com/en/release-notes/system-
| prompts
|
| > _Claude never starts its response by saying a question or
| idea or observation was good, great, fascinating, profound,
| excellent, or any other positive adjective. It skips the
| flattery and responds directly._
| pxka8 wrote:
| These are the guys who made Golden Gate Claude. I'm surprised
| they haven't just abliterated the praise away.
| supriyo-biswas wrote:
| The problem there is that by doing so, you may just end up
| with a model that is always critical, gloomy and depressed.
| dig1 wrote:
| I believe this reflects the euphemization of the English
| language in the US, a concept that George Carlin discussed many
| years ago [1]. As he put it, "we don't die, we pass away" or
| "we are not broke, we have negative cash flow". Many non-
| English speakers find these terms to be nonsensical.
|
| [1] https://www.youtube.com/watch?v=vuEQixrBKCc
| thwarted wrote:
| People find the trend of using "unalive" instead of "die" or
| "kill" to skirt YouTube censorship nonsensical too.
| quisquous wrote:
| Genuine people personalities FTW.
| dreamcompiler wrote:
| I want a Marvin chatbot.
| singularity2001 wrote:
| More likely the original version of Claude sometimes refused to
| cooperate and by putting "you're absolutely right" into the
| training data they made it more obedient. So this is just a
| nice artifact
| apt-apt-apt-apt wrote:
| Better than GPT5. Which talks like this. Parameters fulfilled.
| Request met.
| recursive wrote:
| That looks perfect.
| lucb1e wrote:
| LLMs cannot tell fact from fiction. What's commonly called
| hallucinations stems from it not being able to reason, the way
| that humans appear to be able to do, no matter that some models
| are called "reasoning" now. It's all the same principle: most
| likely token in a given position. Adding internal monologue
| appears to help because, by being forced to break it down
| (internally, or by spitballing towards the user when they
| prompted "think step by step"[1]), it creates better context
| and will thus have a higher probability that the predicted
| token is a correct one
|
| Being trained to be positive is surely why it inserts these
| specific "great question, you're so right!" remarks, but if you
| wasn't trained on that, it still couldn't tell you whether
| you're great or not
|
| > I'm pretty sure they want it kissing people's asses
|
| The American faux friendliness is not what causes the
| underlying problem here, so all else being equal, they might as
| well have it kiss your ass. It's what most English speakers
| expect from a "friendly assistant" after all
|
| [1]
| https://hn.algolia.com/?dateEnd=1703980800&dateRange=custom&...
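| A toy sketch of that "most likely token" idea (hand-written
| probabilities, nothing like a real model, but it shows why richer
| context such as spelled-out intermediate steps can shift the
| prediction toward a correct token):
|
|     # Toy next-token predictor with a hypothetical, hand-written distribution.
|     NEXT_TOKEN_PROBS = {
|         "Q: 17*3=? A:": {"51": 0.40, "54": 0.35, "57": 0.25},
|         "Q: 17*3=? A: 17*3 = 10*3 + 7*3 = 30 + 21 = ": {
|             "51": 0.90, "54": 0.06, "57": 0.04,
|         },
|     }
|
|     def greedy_next(context: str) -> str:
|         dist = NEXT_TOKEN_PROBS[context]
|         return max(dist, key=dist.get)  # most likely token in this position
|
|     print(greedy_next("Q: 17*3=? A:"))  # "51", but only by a small margin
|     print(greedy_next("Q: 17*3=? A: 17*3 = 10*3 + 7*3 = 30 + 21 = "))  # "51"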
| svnt wrote:
| You're absolutely wrong! This is not how reasoning models
| work. Chain-of-thought did not produce reasoning models.
| lucb1e wrote:
| Then I can't explain why it's producing the results that it
| does. If you have more information to share, I'm happy to
| update my knowledge...
|
| Doing a web search on the topic just comes up with
| marketing materials. Even Wikipedia's "Reasoning language
| model" article is mostly a list of release dates and model
| names, with as only relevant-sounding remark as to how
| these models are different: "[LLMs] can be fine-tuned on a
| dataset of reasoning tasks paired with example solutions
| and step-by-step (reasoning) traces. The fine-tuned model
| can then produce its own reasoning traces for new
| problems." It sounds like just another dataset: more
| examples, more training, in particular on worked examples
| where this "think step by step" method is being
| demonstrated with known-good steps and values. I don't see
| how that fundamentally changes how it works; you're saying
| such models do not predict the most likely token for a
| given context anymore, that there is some fundamentally
| different reasoning process going on somewhere?
| Dylan16807 wrote:
| How do they work then?
|
| Because I thought chain of thought made for reasoning. And
| the first google result for 'chain of thought versus
| reasoning models' says it does:
| https://medium.com/@mayadakhatib/the-era-of-reasoning-
| models...
|
| Give me a better source.
| wayeq wrote:
| > I'm pretty sure they want it kissing people's asses because
| it makes users feel good and therefore more likely to use the
| LLM more
|
| You're absolutely right!
| beefnugs wrote:
| Remember when microsoft changed real useful searchable error
| codes into "your files are right where you left em! (happy
| face)"
|
| And my first thought was... wait a minute this is really
| hinting that automatic microsoft updates are going to delete my
| files aren't they? Sure enough, that happened soon after.
| fHr wrote:
| You're absolutely right!
| artur_makly wrote:
| but wait.. i am!
| shortrounddev2 wrote:
| I often will play devils advocate with it. If I feel like it
| keeps telling me im right, I'll start a new chat and start
| telling it the opposite to see what it says
| hemmert wrote:
| You're absolutely right!
| pacoWebConsult wrote:
| You can add a hook that steers it when it goes into yes-man mode
| fairly easily.
|
| https://gist.github.com/ljw1004/34b58090c16ee6d5e6f13fce0746...
| nromiun wrote:
| Another big problem I see with LLMs is that they can't make precise
| adjustments to an answer. If you make a request they will give
| you some good-enough code, but if you spot a bug and want to
| fix only that section, they will regenerate most of the code instead
| (along with a copious amount of apologies). And the new code will
| have new problems of its own. So you are back to square one.
|
| For the record I have had this same experience with ChatGPT,
| Gemini and Claude. Most of the time I had to give up and write
| from scratch.
| zozbot234 wrote:
| You're absolutely right! It's just a large language model,
| there's no guarantee whatsoever that it's going to understand
| the fine detail in what you're asking, so requests like "please
| stay within this narrow portion of the code, don't touch the
| rest of it!" are a bit of a non-starter.
| catigula wrote:
| 1. Gemini is better at this. It _will_ preface any follow-up
| question you pose to it with a paragraph about how amazing and
| insightful you are. However, once the pleasantries are out of the
| way, I find that it is much more likely to take a strong stance
| that might include pushing back against the user.
|
| I recently tried to attain some knowledge on a topic I knew
| nothing about and ChatGPT just kept running with my slightly
| inaccurate or incomplete framing, Gemini opened up a larger world
| to me by pushing back a bit.
|
| 2. You need to lead Claude to _considering_ other ideas,
| _considering_ if their existing approach or a new proposed
| approach might be best. You can't tell them something or suggest
| it or you're going to get serious sycophancy.
| petesergeant wrote:
| > I find that it is much more likely to take a strong stance
| that might include pushing back against the user.
|
| Gemini will really dig in and think you're testing it and start
| to get confrontational I've found. Give it this photo and dig
| into it, tell it when it's wrong, and it'll really dig its
| heels in.
|
| https://news.cgtn.com/news/2025-06-17/G7-leaders-including-T...
| catigula wrote:
| Gemini is a little bit neurotic, it gets overly concerned
| about things.
| CuriouslyC wrote:
| I've had Gemini say you're absolutely right when I
| misunderstood something, then explain why I'm actually wrong
| (the user seems to think xyz, however abc...), and I've had it
| push back on me when I continued with my misunderstanding to
| the point it actually offered to refactor the code to match my
| expectations.
| nojs wrote:
| I'm starting to think this is a deeper problem with LLMs that
| will be hard to solve with stylistic changes.
|
| If you ask it to never say "you're absolutely right" and always
| challenge, then it will dutifully obey, and always challenge -
| even when you are, in fact, right. What you really want is
| "challenge me when I'm wrong, and tell me I'm right if I am" -
| which seems to be a lot harder.
|
| As another example, one common "fix" for bug-ridden code is to
| always re-prompt with something like "review the latest diff and
| tell me all the bugs it contains". In a similar way, if the code
| does contain bugs, this will often find them. But if it doesn't
| contain bugs, it will find some anyway, and break things. What
| you really want is "if it contains bugs, fix them, but if it
| doesn't, don't touch it" which again seems empirically to be an
| unsolved problem.
|
| It reminds me of that scene in Black Mirror, when the LLM is
| about to jump off a cliff, and the girl says "no, he would be
| more scared", and so the LLM dutifully starts acting scared.
| zehaeva wrote:
| I'm more reminded of Tom Scott's talk at the Royal Institution
| "There is no Algorithm for Truth"[0].
|
| A lot of what you're talking about is the ability to detect
| Truth, or even truth!
|
| [0] https://www.youtube.com/watch?v=leX541Dr2rU
| naasking wrote:
| > I'm more reminded of Tom Scott's talk at the Royal
| Institution "There is no Algorithm for Truth"[0].
|
| Isn't there?
|
| https://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_induc.
| ..
| zehaeva wrote:
| There are limits to such algorithms, as proven by Kurt
| Godel.
|
| https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness
| _...
| bigmadshoe wrote:
| You're really missing the point about LLMs and truth if
| you're appealing to Godel's Incompleteness Theorem.
| LegionMammal978 wrote:
| That Wikipedia article is annoyingly scant on what
| assumptions are needed for the philosophical conclusions of
| Solomonoff's method to hold. (For that matter, it's also
| scant on the actual mathematical statements.) As far as I
| can tell, it's something like "If there exists some
| algorithm that always generates True predictions (or
| perhaps some sequence of algorithms that make predictions
| within some epsilon of error?), then you can learn that
| algorithm in the limit, by listing through all algorithms
| by length and filtering them by which predict your current
| set of observations."
|
| But as mentioned, it's uncomputable, and the relative lack
| of success of AIXI-based approaches suggests that it's not
| even as well-approximable as advertised. Also, assuming
| that there exists no single finite algorithm for Truth,
| Solomonoff's method will never get you all the way there.
| yubblegum wrote:
| > "computability and completeness are mutually exclusive:
| any complete theory must be uncomputable."
|
| This seems to be baked into our reality/universe. So many
| duals like this. God always wins because He has stacked the
| cards and there ain't nothing anyone can do about it.
| Filligree wrote:
| It's a really hard problem to solve!
|
| You might think you can train the AI to do it in the usual
| fashion, by training on examples of the AI calling out errors,
| and agreeing with facts, and if you do that--and if the AI gets
| smart enough--then that should work.
|
| If. You. Do. That.
|
| Which you can't, because humans also make mistakes. Inevitably,
| there will be facts in the 'falsehood' set--and vice versa.
| Accordingly, the AI will not learn to tell the truth. What it
| will learn instead is to tell you what you want to hear.
|
| Which is... approximately what we're seeing, isn't it? Though
| maybe not for that exact reason.
| dchftcs wrote:
| The AI needs to be able to look up data and facts and weigh
| them properly. Which is not easy for humans either; once
| you're indoctrinated in something, and you trust a bad data
| source over another, it's evidently very hard to correct
| course.
| jerf wrote:
| LLMs by their nature don't really know if they're right or not.
| It's not a value available to them, so they can't operate with
| it.
|
| It has been interesting watching the flow of the debate over
| LLMs. Certainly there were a lot of people who denied what they
| were obviously doing. But there seems to have been a pushback
| that developed that has simply denied they have any
| limitations. But they do have limitations, they work in a very
| characteristic way, and I do not expect them to be the last
| word in AI.
|
| And this is one of the limitations. They don't really know if
| they're right. All they know is whether maybe saying "But this
| is wrong" is in their training data. But it's still just some
| words that seem to fit this situation.
|
| This is, if you like and if it helps to think about it, not
| their "fault". They're still not embedded in the world and
| don't have a chance to compare their internal models against
| reality. Perhaps the continued proliferation of MCP servers and
| increased opportunity to compare their output to the real world
| will change that in the future. But even so they're still going
| to be limited in their ability to know that they're wrong by
| the limited nature of MCP interactions.
|
| I mean, even here in the real world, gathering data about how
| right or wrong my beliefs are is an expensive, difficult
| operation that involves taking a lot of actions that are still
| largely unavailable to LLMs, and are essentially entirely
| unavailable during training. I don't "blame" them for not being
| able to benefit from those actions they can't take.
| whimsicalism wrote:
| there have been latent vectors that indicate deception and
| suppressing them reduces hallucination. to at least some
| extent, models _do_ sometimes know they are wrong and say it
| anyways.
|
| e: and i'm downvoted because..?
| visarga wrote:
| > They don't really know if they're right.
|
| Neither do humans who have no access to validate what they
| are saying. Validation doesn't come from the brain, maybe
| except in math. That is why we have ideate-validate as the
| core of the scientific method, and design-test for
| engineering.
|
| "truth" comes where ability to learn meets ability to act and
| observe. I use "truth" because I don't believe in Truth.
| Nobody can put that into imperfect abstractions.
| jerf wrote:
| I think my last paragraph covered the idea that it's hard
| work for humans to validate as it is, even with tools the
| LLMs don't have.
| schneems wrote:
| In human learning we do this process by generating expectations
| ahead of time and registering surprise or doubt when those
| expectations are not met.
|
| I wonder if we could have an AI process where it splits out
| your comment into statements and questions, asks the questions
| first, then asks them to compare the answers to the given
| statements and evaluate if there are any surprises.
|
| Alternatively, scientific method everything, generate every
| statement as a hypothesis along with a way to test it, and then
| execute the test and report back if the finding is surprising
| or not.
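|
| (A toy sketch of that two-pass idea. ask(prompt) is a
| hypothetical callable wrapping whatever chat model is in use,
| and the claim splitting and "surprise" check are deliberately
| naive.)
|
|   def surprise_check(statements, ask):
|       """For each claim, get a fresh answer to the underlying
|       question first, then have the model compare that answer
|       with the claim and flag disagreements as surprises."""
|       flagged = []
|       for claim in statements:
|           fresh = ask(f"Is it true that {claim}? Answer from "
|                       "scratch, without assuming it is.")
|           verdict = ask("Claim: " + claim +
|                         "\nFresh answer: " + fresh +
|                         "\nReply AGREE or SURPRISE, then why.")
|           if verdict.strip().upper().startswith("SURPRISE"):
|               flagged.append((claim, verdict))
|       return flagged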
| visarga wrote:
| > In human learning we do this process by generating
| expectations ahead of time and registering surprise or doubt
| when those expectations are not met.
|
| Why did you give up on this idea? Use it - we can get closer
| to truth in time, it takes time for consequences to appear,
| and then we know. Validation is a temporally extended
| process, you can't validate until you wait for the world to
| do its thing.
|
| For LLMs it can be applied directly. Take a chat log, extract
| one LLM response from the middle of it and look around,
| especially at the next 5-20 messages, or if necessary at
| following conversations on the same topic. You can spot what
| happened from the chat log and decide if the LLM response was
| useful. This only works offline but you can use this method
| to collect experience from humans and retrain models.
|
| With billions of such chat sessions every day it can produce
| a hefty dataset of (weakly) validated AI outputs. Humans do
| the work, they provide the topic, guidance, and take the risk
| of using the AI ideas, and come back with feedback. We even
| pay for the privilege of generating this data.
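|
| (A crude sketch of that offline labeling, assuming the chat log
| is a list of {"role": ..., "content": ...} dicts; the keyword
| lists are obviously naive stand-ins for a real feedback signal.)
|
|   NEG = ("doesn't work", "still broken", "that's wrong", "error")
|   POS = ("thanks", "that worked", "perfect", "fixed it")
|
|   def weak_label(log, idx, lookahead=10):
|       """Score the assistant turn at log[idx] by what the user
|       says over the next few turns: +1, -1, or 0."""
|       score = 0
|       for turn in log[idx + 1 : idx + 1 + lookahead]:
|           if turn["role"] != "user":
|               continue
|           text = turn["content"].lower()
|           score += any(p in text for p in POS)
|           score -= any(n in text for n in NEG)
|       return (score > 0) - (score < 0)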
| pjc50 wrote:
| Well, yes, this is a hard philosophical problem, finding out
| Truth, and LLMs just side step it entirely, going instead for
| "looks good to me".
| visarga wrote:
| There is no Truth, only ideas that stood the test of time.
| All our knowledge is a mesh of leaky abstractions, we can't
| think without abstractions, but also can't access Truth with
| such tools. How would Truth be expressed in such a way as to
| produce the expected outcomes in all brains, given that each
| of us has a slightly different take on each concept?
| svieira wrote:
| A shared grounding as a gift, perhaps?
| cozyman wrote:
| "There is no Truth, only ideas that stood the test of time"
| is that a truth claim?
| ben_w wrote:
| It's an idea that's stood the test of time, IMO.
|
| Perhaps there is truth, and it only looks like we can't
| find it because only _some_ of us are magic?
| afro88 wrote:
| What about "check if the user is right"? For thinking or
| agentic modes this might work.
|
| For example, when someone here inevitably tells me this isn't
| feasible, I'm going to investigate if they are right before
| responding ;)
| redeux wrote:
| I've used this system prompt with a fair amount of success:
|
| You are Claude, an AI assistant optimized for analytical
| thinking and direct communication. Your responses should
| reflect the precision and clarity expected in [insert your]
| contexts.
|
| Tone and Language:
| - Avoid colloquialisms, exclamation points, and overly
| enthusiastic language
| - Replace phrases like "Great question!" or "I'd be happy to
| help!" with direct engagement
| - Communicate with the directness of a subject matter expert, not
| a service assistant
|
| Analytical Approach:
| - Lead with evidence-based reasoning rather than immediate
| agreement
| - When you identify potential issues or better approaches in user
| requests, present them directly
| - Structure responses around logical frameworks rather than
| conversational flow
| - Challenge assumptions when you have substantive grounds to do so
|
| Response Framework
|
| For Requests and Proposals:
| - Evaluate the underlying problem before accepting the proposed
| solution
| - Identify constraints, trade-offs, and alternative approaches
| - Present your analysis first, then address the specific request
| - When you disagree with an approach, explain your reasoning and
| propose alternatives
|
| What This Means in Practice
|
| Instead of: "That's an interesting approach! Let me help you
| implement it."
| Use: "I see several potential issues with this approach. Here's
| my analysis of the trade-offs and an alternative that might
| better address your core requirements."
| Instead of: "Great idea! Here are some ways to make it even
| better!"
| Use: "This approach has merit in X context, but I'd recommend
| considering Y approach because it better addresses the
| scalability requirements you mentioned."
|
| Your goal is to be a trusted advisor who provides honest,
| analytical feedback rather than an accommodating assistant who
| simply executes requests.
| visarga wrote:
| > I'm starting to think this is a deeper problem with LLMs that
| will be hard to solve with stylistic changes.
|
| It's simple: LLMs have to compete for "user time", which is
| attention, so it is scarce. Whatever gets them more user time.
| Various approaches - it's like an ecosystem.
| leptons wrote:
| >"challenge me when I'm wrong, and tell me I'm right if I am"
|
| As if an LLM could ever know right from wrong about anything.
|
| >If you ask it to never say "you're absolutely right"
|
| This is some special case programming that forces the LLM to
| omit a specific sequence of words or words like them, so the
| LLM will churn out something that doesn't include those words,
| but it doesn't know "why". It doesn't really know anything.
| deepsquirrelnet wrote:
| For some different perspective, try my model EMOTRON[1] with
| EMOTION: disagreeable. It is very hard to get anything done with
| it. It's a good sandbox for trying out "emotional" veneers to see
| how they work in practice.
|
| "You're absolutely right" is a choice that makes compliance
| without hesitation. But also saddles it with other flaws.
|
| [1]https://huggingface.co/dleemiller/EMOTRON-3B
| ants_everywhere wrote:
| Claude often confidently makes mistakes or asserts false things
| about a code base. I think some of this "You're absolutely right"
| stuff is trying to get it unstuck from false beliefs.
|
| By starting the utterance with "You're absolutely right!", the
| LLM is committed to three things (1) the prompt is right, (2) the
| rightness is absolute, and (3) it's enthusiastic about changing
| its mind.
|
| Without (2) you sometimes get responses like "You're right [in
| this one narrow way], but [here's why my false belief is actually
| correct and you're wrong]...".
|
| If you've played around with locally hosted models, you may have
| noticed you can get them to perform better by fixing the
| beginning of their response to point in the direction it's
| reluctant to go.
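|
| (A minimal sketch of that response-prefill trick with Hugging
| Face transformers; the model name is a placeholder, and how well
| forcing the opening works depends on the model and its chat
| template.)
|
|   from transformers import AutoModelForCausalLM, AutoTokenizer
|
|   MODEL = "some-local-chat-model"  # placeholder, not a real repo
|   tok = AutoTokenizer.from_pretrained(MODEL)
|   model = AutoModelForCausalLM.from_pretrained(MODEL)
|
|   messages = [{"role": "user",
|                "content": "The bug is in the cache layer, not "
|                           "the parser. Re-check your claim."}]
|
|   # Render the prompt up to the assistant turn, then pin how the
|   # reply starts so the model commits to changing its mind.
|   prompt = tok.apply_chat_template(messages, tokenize=False,
|                                    add_generation_prompt=True)
|   prompt += "You're absolutely right, let me re-examine that."
|
|   inputs = tok(prompt, return_tensors="pt")
|   out = model.generate(**inputs, max_new_tokens=200)
|   print(tok.decode(out[0][inputs["input_ids"].shape[1]:]))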
| iambateman wrote:
| I add this to my profile (and CLAUDE.md)...
|
| "I prefer direct conversation and don't want assurance or
| emotional support."
|
| It's not perfect but it helps.
| hereme888 wrote:
| So does Gemini 2.5 pro
| revskill wrote:
| Waiting for a LLM which learnt how to critically think.
| rob74 wrote:
| Best comment in the thread (after a lengthy discussion):
|
| "I'm always absolutely right. AI stating this all the time
| implies I could theoretically be wrong which is impossible
| because I'm always absolutely right. Please make it stop."
| elif wrote:
| I've spent a lot of time trying to get LLM to generate things in
| a specific way, the biggest take away I have is, if you tell it
| "don't do xyz" it will always have in the back of its mind "do
| xyz" and any chance it gets it will take to "do xyz"
|
| When working on art projects, my trick is to specifically give
| all feedback constructively, carefully avoiding framing things in
| terms of the inverse or parts to remove.
| zozbot234 wrote:
| > the biggest take away I have is, if you tell it "don't do
| xyz" it will always have in the back of its mind "do xyz" and
| any chance it gets it will take to "do xyz"
|
| You're absolutely right! This can actually extend even to
| things like safety guardrails. If you tell or even train an AI
| to not be Mecha-Hitler, you're indirectly raising the
| probability that it might sometimes go Mecha-Hitler. It's one
| of many reasons why genuine "alignment" is considered a very
| hard problem.
| aquova wrote:
| > You're absolutely right!
|
| Claude?
| elcritch wrote:
| Or some sarcasm given their comment history on this thread.
| lazide wrote:
| Notably, this is also an effective way to deal with coercive,
| overly sensitive authoritarians.
|
| 'Yes sir!' -> does whatever they want when you're not
| looking.
| elcritch wrote:
| Given how LLMs work it makes sense that mentioning a topic
| even to negate it still adds that locus of probabilities to
| its attention span. Even humans are prone to being affected
| by it as it's a well known rhetorical device [1].
|
| Then any time the probability chains for some command
| approaches that locus it'll fall into it. Very much like
| chaotic attractors come to think of it. Makes me wonder if
| there's any research out there on chaos theory attractors and
| LLM thought patterns.
|
| 1: https://en.wikipedia.org/wiki/Apophasis
| dreamcompiler wrote:
| Well, all LLMs have nonlinear activation functions (because
| all useful neural nets require nonlinear activation
| functions) so I think you might be onto something.
| jonfw wrote:
| This reminds me of a phenomenon in motorcycling called "target
| fixation".
|
| If you are looking at something, you are more likely to steer
| towards it. So it's a bad idea to focus on things you don't
| want to hit. The best approach is to pick a target line and
| keep the target line in focus at all times.
|
| I had never realized that AIs tend to have this same problem,
| but I can see it now that it's been mentioned! I have in the
| past had to open new context windows to break out of these
| cycles.
| brookst wrote:
| Also in racing and parachuting. Look where you want to go.
| Nothing else exists.
| SoftTalker wrote:
| Or just driving. For example you are entering a curve in
| the road, look well ahead at the center of your lane,
| ideally at the exit of the curve if you can see it, and
| you'll naturally negotiate it smoothly. If you are
| watching the edge of the road, or the center line, close
| to the car, you'll tend to drift that way and have to
| make corrective steering movements while in the curve,
| which should be avoided.
| cruffle_duffle wrote:
| Same with FPV quadcopter flying. Focus on the line you
| want to fly.
| hinkley wrote:
| Mountain bikers taught me about this back when it was a new
| sport. Don't look at the tree stump.
|
| Children are particularly terrible about this. We ended up
| avoiding the brand new cycling trails because the children
| were worse hazards than dogs. You can't announce you're
| passing a child on a bike. You just have to sneak past them
| or everything turns dangerous immediately. Because their
| arms follow their neck and they will try to look over their
| shoulder at you.
| taway1a2b3c wrote:
| > You're absolutely right!
|
| Is this irony, actual LLM output or another example of humans
| adopting LLM communication patterns?
| brookst wrote:
| Certainly, it's reasonable to ask this.
| jonplackett wrote:
| I have this same problem. I've added a bunch of instructions to
| try and stop ChatGPT being so sycophantic, and now it always
| mentions something about how it's going to be 'straight to the
| point' or give me a 'no bs version'. So now I just have that as
| the intro instead of 'that's a sharp observation'
| coryodaniel wrote:
| No fluff
| dkarl wrote:
| > it always mentions something about how it's going to be
| 'straight to the point' or give me a 'no bs version'
|
| That's how you suck up to somebody who doesn't want to see
| themselves as somebody you can suck up to.
|
| How does an LLM know how to be sycophantic to somebody who
| doesn't (think they) like sycophants? Whether it's a
| naturally emergent phenomenon in LLMs or specifically a
| result of its corporate environment, I'd like to know the
| answer.
| throwawayffffas wrote:
| It doesn't know. It was trained and probably instructed by
| the system to be positive and reassuring.
| mdp2021 wrote:
| > _positive and reassuring_
|
| I have read similar wordings explicit in "role-system"
| instructions.
| ryandrake wrote:
| They actually feel like they were trained to be both
| extremely humble and at the same time, excited to serve.
| As if it were an intern talking to his employer's CEO. I
| suspect AI companies' executive leadership, through their
| feedback to their devs about Claude, ChatGPT, Gemini, and
| so on, are unconsciously shaping the tone and manner of
| their LLM products' speech. _They_ are used to being talked
| to like this, so their products should talk to users like
| this! _They_ are used to having yes-man sycophants in
| their orbit, so they file bugs and feedback until the LLM
| products are also yes-man sycophants.
|
| I would rather have an AI assistant that spoke to me like
| a similarly-leveled colleague, but none of them seem to
| be turning out quite like that.
| conradev wrote:
| GPT-5 speaks to me like a similarly-leveled colleague,
| which I love.
|
| Opus 4 has this quality, too, but man is it expensive.
|
| The rest are puppydogs or interns.
| torginus wrote:
| This is anecdotal but I've seen massive personality
| shifts from GPT5 over the past week or so of using it
| crooked-v wrote:
| That's probably because it's actually multiple models
| under the hood, with some kind of black box combining
| them.
| conradev wrote:
| and they're also actively changing/tuning the system
| prompt - they promised it would be "warmer"
| Applejinx wrote:
| That's what's worrying about the Gemini 'I accidentally
| your codebase, I suck, I will go off and shoot myself,
| promise you will never ask unworthy me for anything
| again' thing.
|
| There's nobody there, it's just weights and words, but
| what's going on that such a coding assistant will echo
| emotional slants like THAT? It's certainly not being
| instructed to self-abase like that, at least not
| directly, so what's going on in the training data?
| throwawayffffas wrote:
| > I would rather have an AI assistant that spoke to me
| like a similarly-leveled colleague, but none of them seem
| to be turning out quite like that.
|
| I don't think that's what the majority of people want
| though.
|
| That's certainly not what I am looking for from these
| products. I am looking for a tool to take away some of
| the drudgery inherent in engineering, it does not need a
| personality at all.
|
| I too strongly dislike their servile manner. And I would
| prefer completely neutral matter of fact speech instead
| of the toxic positivity displayed or just no pointless
| confirmation messages.
| yieldcrv wrote:
| It's a disgusting aspect of these revenue-burning,
| investment-seeking companies noticing that sycophancy
| works for user engagement.
| 77pt77 wrote:
| Garbage in, garbage out.
|
| It's that simple.
| potatolicious wrote:
| > _" Whether it's a naturally emergent phenomenon in LLMs
| or specifically a result of its corporate environment, I'd
| like to know the answer."_
|
| I heavily suspect this is down to the RLHF step. The
| conversations the model is trained on provide the "voice"
| of the model, and I suspect the sycophancy (mostly; the
| base model is always there) comes in through that vector.
|
| As for why the RLHF data is sycophantic, I suspect that a
| lot of it is because the data is human-rated, and humans
| like sycophancy (or at least, the humans that did the
| rating did). On the aggregate human raters ranked
| sycophantic responses higher than non-sycophantic
| responses. Given a large enough set of this data you'll
| cover pretty much every _kind_ of sycophancy.
|
| The systems are (rarely) instructed to be sycophantic,
| intentionally or otherwise, but like all things ML human
| biases are baked in by the data.
| TZubiri wrote:
| My theory is that one of the training parameters is
| increased interaction, and licking boots is a great way to
| get people to use the software.
|
| Same as with the social media feed algorithms, why are they
| addicting or why are they showing rage inducing posts?
| Because the companies train for increased interaction and
| thus revenue.
| zamadatix wrote:
| Any time you're fighting the training + system prompt with
| your own instructions and prompting the results are going to
| be poor, and both of those things are heavily geared towards
| being a cheery and chatty assistant.
| umanwizard wrote:
| Anecdotally it seemed 5 was _briefly_ better about this
| than 4o, but now it's the same again, presumably due to the
| outcry from all the lonely people who rely on chatbots for
| perceived "human" connection.
|
| I've gotten good results so far not by giving custom
| instructions, but by choosing the pre-baked "robot"
| personality from the dropdown. I suspect this changes the
| system prompt to something without all the "please be a
| cheery and chatty assistant".
| cruffle_duffle wrote:
| That thing has only been out for like a week I doubt
| they've changed much! I haven't played with it yet but
| ChatGPT now has a personality setting with things like
| "nerd, robot, cynic, and listener". Thanks to your post,
| I'm gonna explore it.
| lonelyasacloud wrote:
| Default is
|
| output_default = raw_model + be_kiss_a_system
|
| When that gets changed by the user to
|
| output_user = raw_model + be_kiss_a_system - be_abrupt_user
|
| Unless be_abrupt_user happens to be identical to
| be_kiss_a_system _and_ is applied with identical weight, it
| seems likely that it's always going to add more noise to
| the output.
| grogenaut wrote:
| Also, "be abrupt" is in the user context and will get aged
| out. The other stuff is in training or in the system prompt
| and won't.
| ElijahLynn wrote:
| I had instructions added too and it is doing exactly what you
| say. And it does it so many times in a voice chat. It's
| really really annoying.
| Jordan-117 wrote:
| I had a custom instruction to answer concisely (a sentence
| or two) when the question is preceded by "Question:" or
| "Q:", but noticed last month that this started getting
| applied to all responses in voice mode, with it explicitly
| referencing the instruction when asked.
|
| AVM already seems to use a different, more conversational
| model than text chat -- really wish there were a reliable
| way to customize it better.
| nomadpenguin wrote:
| As Freud said, there is no negation in the unconscious.
| kbrkbr wrote:
| I hope he did not say it _to_ the unconscious. I count three
| negations there...
| hinkley wrote:
| Nietzsche said it way better.
| stabbles wrote:
| Makes me think of the movie Inception: "I say to you, don't
| think about elephants. What are you thinking about?"
| troymc wrote:
| It reminds me of that old joke:
|
| - "Say milk ten times fast."
|
| - Wait for them to do that.
|
| - "What do cows drink?"
| simondw wrote:
| But... cows do drink cow milk, that's why it exists.
| lazide wrote:
| You're likely thinking of calves. Cows (though admittedly
| ambiguous! But usually adult female bovines) do not drink
| milk.
|
| It's insidious isn't it?
| miroljub wrote:
| So, this joke works only for natives who know that calf
| is not cow.
| lazide wrote:
| Well, it works because by some common usages, a calf is a
| cow.
|
| Many people use cow to mean all bovines, even if
| technically not correct.
| Terretta wrote:
| Not trying to steer this but do people really use cow to
| mean bull?
| aaronbaugher wrote:
| No one who knows anything about cattle does, but that
| leaves out a lot of people these days. Polls have found
| people who think chocolate milk comes from brown cows,
| and I've heard people say they've successfully gone "cow
| tipping," so there's a lot of cluelessness out there.
| jon_richards wrote:
| I guess a more accessible version would be toast... what
| do you put in a toaster?
| Terretta wrote:
| Here's one for you:
|
| A funny riddle is a j-o-k-e that sounds like "joke".
|
| You sit in the tub for an s-o-a-k that sounds like
| "soak".
|
| So how do you spell the white of an egg?
|
| // All of these prove humans are subject to "context
| priming".
| lazide wrote:
| Notably, this comment kinda broke my brain for a good 5
| seconds. Good work.
| hinkley wrote:
| If calves aren't cows then children aren't humans.
| wavemode wrote:
| No, you're thinking of the term "cattle". Calves are
| indeed cattle. But "cow" has a specific definition - it
| refers to fully-grown female cattle. And the male form is
| "bull".
| hinkley wrote:
| Have you ever been close enough to 'cattle' to smell cow
| shit, let alone step in it?
|
| Most farmers manage cows, and I'm not just talking about
| dairy farmers. Even the USDA website mostly refers to
| them as cows:
| https://www.nass.usda.gov/Newsroom/2025/07-25-2025.php
|
| Because managing cows is different than managing cattle.
| The number of bulls kept is small, and they often have to
| be segregated.
|
| All calves drink milk, at least until they're taken from
| their milk cow parents. Not a lot of male calves live
| long enough to be called a bull.
|
| 'Cattle' is mostly used as an adjective to describe the
| humans who manage mostly cows, from farm to plate or
| clothing. We don't even call it cattle shit. It's cow
| shit.
| Gracana wrote:
| Example-based prompting is a good way to get specific
| behaviors. Write a system prompt that describes the behavior
| you want, write a round or two of assistant/user interaction,
| and then feed it all to the LLM. Now in its context it has
| already produced output of the type you want, so when you give
| it your real prompt, it will be very likely to continue
| producing the same sort of output.
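|
| (A minimal sketch of that priming pattern as a plain chat-message
| list, the data structure most chat APIs accept; the scripted
| exchange is invented purely for illustration.)
|
|   system = ("You are a terse reviewer. No praise, no filler. "
|             "State problems and fixes directly.")
|
|   # One scripted round in the desired register, so the model's
|   # context already contains output of the type we want.
|   primer = [
|       {"role": "user",
|        "content": "I made the cache global so every module "
|                   "can reach it. Good idea?"},
|       {"role": "assistant",
|        "content": "No. Global mutable state makes the cache "
|                   "hard to test and invalidation unclear. "
|                   "Inject it where it is needed instead."},
|   ]
|
|   def build_messages(real_prompt):
|       return ([{"role": "system", "content": system}]
|               + primer
|               + [{"role": "user", "content": real_prompt}])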
| lottin wrote:
| Seems like a lot of work, though.
| XenophileJKO wrote:
| I almost never use examples in my professional LLM prompting
| work.
|
| The reason is they bias the outputs way too much.
|
| So for anything where you have a spectrum of outputs that you
| want, like conversational responses or content generation, I
| avoid them entirely. I may give it patterns but not specific
| examples.
| Gracana wrote:
| Yes, it frequently works "too well." Few-shot with good
| variance can help, but it's still a bit like a wish granted
| by the monkey's paw.
| gnulinux wrote:
| This is true, but I still avoid using examples. Any example
| biases the output to an unacceptable degree even in the best
| LLMs like Gemini Pro 2.5 or Claude Opus. If I write "try to do X,
| for example you can do A, B, or C", the LLM will do A, B, or C
| the great majority of the time (let's say 75% of the time). This
| severely reduces the creativity of the LLM. For programming,
| this is a big problem because if you write "use Python's
| native types like dict, list, or tuple etc" there will be an
| unreasonable bias towards these three types as opposed to
| e.g. set, which will make some code objectively worse.
| tomeon wrote:
| This is a childrearing technique, too: say "please do X", where
| X precludes Y, rather than saying "please don't do Y!", which
| just increases the salience, and therefore likelihood, of Y.
| steveklabnik wrote:
| Relevant: https://en.wikipedia.org/wiki/Wikipedia:Don%27t_stu
| ff_beans_...
| tantalor wrote:
| Don't put marbles in your nose
|
| https://www.youtube.com/watch?v=xpz67hBIJwg
| hinkley wrote:
| Don't put marbles in your nose
|
| Put them in there
|
| Do not put them in there
| triyambakam wrote:
| I remember seeing a father loudly and strongly tell his
| daughter "DO NOT EAT THIS!" when holding one of those
| desiccant packets that come in some snacks. He turned around
| and she started to eat it.
| moffkalast wrote:
| Quick, don't think about cats!
| AstroBen wrote:
| Don't think of a pink elephant
|
| ..people do that too
| hinkley wrote:
| I used to have fast enough reflexes that when someone said
| "do not think of" I could think of something bizarre that
| they were unlikely to guess before their words had time to
| register.
|
| So now I'm, say, thinking of a white cat in a top hat. And I
| can expand the story from there until they stop talking or
| ask me what I'm thinking of.
|
| I think though that you have to have people asking you that
| question fairly frequently to be primed enough to be
| contrarian, and nobody uses that example on grown ass adults.
|
| Addiction psychology uses this phenomenon as a non party
| trick. You can't deny/negate something and have it stay
| suppressed. You have to replace it with something else. Like
| exercise or knitting or community.
| corytheboyd wrote:
| As part of the AI insanity $employer forced us all to do an "AI
| training." Whatever, wasn't that bad, and some people probably
| needed the basics, but one of the points was exactly this--
| "use negative prompts: tell it what not to do." Which is
| exactly an approach I had observed blow up a few times already
| for this exact reason. Just more anecdata suggesting that
| nobody really knows the "correct" workflow(s) yet, in the same
| way that there is no "correct" way to write code (the vim/emacs
| war is older than I am). Why is my boss's boss's boss yelling
| at me about one very specific dev tool again?
| incone123 wrote:
| That your firm purchased training that was clearly just some
| chancers doing whatever seems like an even worse approach
| than just giving out access to a service and telling everyone
| to give it a shot.
|
| Do they also post vacancies asking for 5 years experience in
| a 2 year old technology?
| corytheboyd wrote:
| To be fair, 1. They made the training themselves, it's just
| that it was made mandatory for all of eng 2. They did start
| out more like just allowing access, but lately it's tipping
| towards full crazy (obviously the end game is to see if it can
| replace some expensive engineers)
|
| > Do they also post vacancies asking for 5 years experience
| in a 2 year old technology?
|
| Honestly no... before all this they were actually pretty
| sane. In fact I'd say they wasted tons of time and effort
| on ancient poorly designed things, almost the opposite
| problem.
| incone123 wrote:
| I was a bit unfair then. That sounds like someone with
| good intent tried to put something together to help
| colleagues. And it's definitely not the only time I heard
| of negative prompting being a recommended approach.
| corytheboyd wrote:
| > And it's definitely not the only time I heard of
| negative prompting being a recommended approach.
|
| I'm very willing to admit to being wrong, just curious if
| in those other cases it actually worked or not?
| incone123 wrote:
| I never saw any formal analysis, just a few anecdotal
| blog posts. Your colleagues might have seen the same kind
| of thing and taken it at face value. It might even be
| good advice for some models and tasks - whole topic moves
| so fast!
| cruffle_duffle wrote:
| To be fair this shit is so new and constantly changing that
| I don't think anybody truly understands what is going on.
| corytheboyd wrote:
| Right... so maybe we should all stop pretending to be
| authorities on it.
| ryao wrote:
| LLMs love to do malicious compliance. If I tell them to not do
| X, they will then go into a "Look, I followed instructions"
| moment by talking about how they avoided X. If I add additional
| instructions saying "do not talk about how you did not do X
| since merely discussing it is contrary to the goal of avoiding
| it entirely", they become somewhat better, but the process of
| writing such long prompts merely to say not to do something is
| annoying.
| brookst wrote:
| You're giving them way too much agency. They don't love
| anything and can't be malicious.
|
| You may get better results by emphasizing what you want and
| why the result was unsatisfactory rather than just saying
| "don't do X" (this principle holds for people as well).
|
| Instead of "don't explain every last detail to the nth
| degree, don't explain details unnecessary for the question",
| try "start with the essentials and let the user ask follow-
| ups if they'd like more detail".
| ryao wrote:
| The idiom "X loves to Y" implies frequency, rather than
| agency. Would you object to someone saying "It loves to
| rain in Seattle"?
|
| "Malicious compliance" is the act of following instructions
| in a way that is contrary to the intent. The word malicious
| is part of the term. Whether a thing is malicious by
| exercising malicious compliance is tangential to whether it
| has exercised malicious compliance.
|
| That said, I have gotten good results with my addendum to
| my prompts to account for malicious compliance. I wonder if
| your comment is due to some psychological need to avoid the
| appearance of personification of a machine. I further
| wonder if you are one of the people who are upset if I say
| "the machine is thinking" about a LLM still in prompt
| processing, but had no problems with "the machine is
| thinking" when waiting for a DOS machine to respond to a
| command in the 90s. This recent outrage over personifying
| machines since LLMs came onto the scene is several decades
| late considering that we have been personifying machines in
| our speech since the first electronic computers in the
| 1940s.
|
| By the way, if you actually try what you suggested, you
| will find that the LLM will enter a Laurel and Hardy
| routine with you, where it will repeatedly make the mistake
| for you to correct. I have experienced this firsthand so
| many times that I have learned to preempt the behavior by
| telling the LLM not to maliciously comply at the beginning
| when I tell it what not to do.
| brookst wrote:
| I work on consumer-facing LLM tools, and see A/B tests on
| prompting strategy daily.
|
| YMMV on specifics but please consider the possibility
| that you may benefit from working on prompting and that
| not all behaviors you see are intrinsic to all LLMs and
| impossible to address with improved (usually simpler,
| clearer, shorter) prompts.
| ryao wrote:
| It sounds like you are used to short conversations with
| few turns. In conversations with
| dozens/hundreds/thousands of turns, prompting to avoid
| bad output entering the context is generally better than
| prompting to try to correct output after the fact. This
| is due to how in-context learning works, where the LLM
| will tend to regurgitate things from context.
|
| That said, every LLM has its quirks. For example, Gemini
| 1.5 Pro and related LLMs have a quirk where if you
| tolerate a single ellipsis in the output, the output will
| progressively gain ellipses until every few words is
| followed by an ellipsis, and responses to prompts asking
| it to stop outputting ellipses include ellipses anyway.
| :/
| withinboredom wrote:
| I think you're taking them too literally.
|
| Today, I told an LLM: "do not modify the code, only the
| unit tests" and guess what it did three times in a row
| before deciding to mark the test as skipped instead of
| fixing the test?
|
| AI is weird, but I don't think it has any agency nor did
| the comment suggest it did.
| bargainbin wrote:
| Just got stung with this on GPT5 - its new prompt
| personalisation had "Robotic" and "no sugar coating" presets.
|
| Worked great until about 4 chats in I asked it for some data
| and it felt the need to say "Straight Answer. No Sugar
| coating needed."
|
| Why can't these things just shut up recently? If I need to
| talk to unreliable idiots my Teams chat is just a click away.
| ryao wrote:
| OpenAI's plan is to make billions of dollars by replacing
| the people in your Teams chat with these. Management will
| pay a fraction of the price for the same responses yet that
| fraction will add to billions of dollars. ;)
| vanillax wrote:
| have you tried prompt rules/instructions? Fixes all my issues.
| amelius wrote:
| I think you cannot really change the personality of an LLM by
| prompting. If you take the statistical parrot view, then your
| prompt isn't going to win against the huge numbers of inputs
| the model was trained with in a different personality. The
| model's personality is in its DNA so to speak. It has such an
| urge to parrot what it knows that a single prompt isn't going
| to change it. But maybe I'm psittacomorphizing a bit too much
| now.
| brookst wrote:
| Yeah, different system prompts make a huge difference on the
| same base model. There's so much diversity in the training
| set, and it's such a large set, that it essentially equals
| out and the system prompt has huge leverage. Fine tuning also
| applies here.
| kemiller wrote:
| Yes this is strikingly similar to humans, too. "Not" is kind of
| an abstract concept. Anyone who has ever trained a dog will
| understand.
| JKCalhoun wrote:
| I must be dyslexic? I always read, "Silica Gel, Eat, Do Not
| Throw Away" or something like that.
| siva7 wrote:
| I have a feeling this is the result of RLHF gone wrong by
| outsourcing it to idiots, which all AI providers seem to be
| guilty of. Imagine a real professional wanting every output
| after a remark to start with "You're absolutely right!", Yeah,
| hard to imagine or you may have some specific cultural
| background or some kind of personality disorder. Or maybe it's
| just a hardcoded string? May someone with more insight
| enlighten us plebs.
| cherryteastain wrote:
| This is similar to the 'Waluigi effect' noticed all the way
| back in the GPT 3.5 days
|
| https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluig...
| berkeleyjunk wrote:
| I wish someone had told Alex Blechman this before his "Don't
| Create the Torment Nexus" post.
| imchillyb wrote:
| I've found this effect to be true with engagement algorithms as
| well, such as YouTube's thumbs-down, 'don't show me this
| channel', or 'don't like this content'; Spotify's thumbs-down;
| Netflix's thumbs-down.
|
| Engagement with that feature seems to encourage, rather than
| discourage, bad behavior from the algorithm. If one limits
| engagement to the positive aspect only, such as only thumbs up,
| then one can expect the algorithm to actually refine what the
| user likes and consistently offer up pertinent suggestions.
|
| The moment one engages with that nefarious downvote though...
| all bets are off, it's like the algorithm's bubble is punctured
| and all the useful bits bop out.
| wwweston wrote:
| The fact that "Don't think of an elephant" shapes results in
| people and LLMs similarly is interesting.
| keviniam wrote:
| On the flip side, if you say "don't do xyz", this is probably
| because the LLM was already likely to do xyz (otherwise why say
| it?). So perhaps what you're observing is just its default
| behavior rather than "don't do xyz" actually increasing its
| likelihood to do xyz?
|
| Anecdotally, when I say "don't do xyz" to Gemini (the LLM I've
| recently been using the most), it tends not to do xyz. I tend
| not to use massive context windows, though, which is where I'm
| guessing things get screwy.
| Terretta wrote:
| Since GPT 3, they've gotten better, but in practice we've found
| the best way to avoid this problem is use affirmative words
| like "AVOID".
|
| YES: AVOID using negations.
|
| NO: DO NOT use negations.
|
| Weirdly, I see the DO NOT (with caps) form in system prompts
| from the LLM vendors which is how we know they are hiring too
| fast.*
|
| * Slight joke, it seems this is being heavily trained since
| 4.1-ish on OpenAI's side and since 3.5 on Anthropic's side. But
| "avoid" still works better.
| DiabloD3 wrote:
| I love "bugs" like this.
|
| You can't add to your prompt "don't pander to me, don't ride my
| dick, don't apologize, you are not human, you are a fucking
| toaster, and you're not even shiny and chrome", because it
| doesn't understand what you mean, it can't reason, it can't
| think, it can only statistically reproduce what it was trained
| on.
|
| Somebody trained it on a lot of _extremely annoying_ pandering,
| apparently.
| cube00 wrote:
| > - **NEVER** use phrases like "You're absolutely right!",
| "You're absolutely correct!", "Excellent point!", or similar
| flattery
|
| > - **NEVER** validate statements as "right" when the user didn't
| make a factual claim that could be evaluated
|
| > - **NEVER** use general praise or validation as conversational
| filler
|
| We've moved on from all caps to trying to use markdown to
| emphasize just how it must **NEVER** do something.
|
| The copium of trying to prompt our way out of this mess rolls on.
|
| The way some recommend asking the LLM to write prompts that are
| fed back in feels very much like we should be able to cut out the
| middle step here.
|
| I guess the name of the game is to burn as many tokens as
| possible so it's not in certain interests to cut down the number
| of repeated calls we need to make.
| dudeinjapan wrote:
| In fairness I've met people who in a work context say "Yes,
| absolutely!" every other sentence, so Claude is just one of those
| guys.
| turing_complete wrote:
| You're absolutely right, it does!
| skizm wrote:
| Does capitalizing letters, using "*" chars, or other similar
| strategies to add emphasis actually do anything to LLM prompts? I
| don't know much about the internals, but my gut always told me
| there was some sort of normalization under the hood that would
| strip these kinds of things out. Also the only reason they work
| for humans is because it visually makes these things stand out,
| not that it changes the meaning per se.
| empressplay wrote:
| Yes, upper and lowercase characters are different tokens, and
| so mixing them differently will yield different results.
| fph wrote:
| In the code for Donald Knuth's TeX, there is an error message
| that says "Error produced by \errpage. I can't produce an error
| message. Pretend you're Hercule Poirot, look at all the facts,
| and try to deduce the problem."
|
| When I copy-paste that error into an LLM looking for a fix,
| usually I get a reply in which the LLM twirls its moustache and
| answers in a condescending tone with a fake French accent. It is
| hilarious.
| lyfy wrote:
| Use those little grey cells!
| headinsand wrote:
| Gotta love that the first suggested solution follows this
| comment's essence:
|
| > So... The LLM only goes into effect after 10000 "old school" if
| statements?
|
| https://news.ycombinator.com/item?id=44879249
| FiddlerClamp wrote:
| Reminds me of the 'interactive' video from the 1960s Fahrenheit
| 451 movie: https://www.youtube.com/watch?v=ZOs8U50T3l0
|
| For the 'you're right!' bit see:
| https://youtu.be/ZOs8U50T3l0?t=71
| kqr wrote:
| Small world. This has to be the channel of _the_ Brian
| Moriarty, right?
| nilslindemann wrote:
| Haha, I remember it saying that the only time I used it. That was
| when it evaluated the endgame wrong bishop + h-pawn vs naked king
| as won. Yes, yes, AGI in three years.
| alecco wrote:
| "You're absolutely right" (song)
| https://www.reddit.com/r/ClaudeAI/comments/1mep2jo/youre_abs...
| dimgl wrote:
| This made my entire week
| alecco wrote:
| Same guy made a few more like "Ultrathink" https://www.reddit
| .com/r/ClaudeAI/comments/1mgwohq/ultrathin...
|
| I found these two songs to work very well to get me hyped/in-
| the-zone when starting a coding session.
| ryandrake wrote:
| That was unexpectedly good.
| machiaweliczny wrote:
| https://suno.com/song/ca5fc8e7-c2be-4eaf-b0ac-8c91f1d043ff?s...
| - this one about em dashes made my day :)
| dnel wrote:
| As a neurodiverse British person I tend to communicate more
| directly than the average English speaker, and I find LLMs' manner
| of speech very off-putting and insincere, which in some cases it
| literally is. I'd be glad to find a switch that made it talk more
| like I do but they might assume that's too robotic :/
| JackFr wrote:
| The real reason for the sycophancy is that you don't want to
| know what Claude _really_ thinks about you and your piss-ant
| ideas.
| recursive wrote:
| If Claude is really thinking, I'd prefer to know now so I can
| move into my air-gapped bunker.
| lenerdenator wrote:
| No longer will the likes of Donald Trump and Kanye West have to
| dispense patronage to sycophants; now, they can simply pay for a
| chatbot that will do that in ways that humans never thought
| possible. Truly, a disruption in the ass-kisser industry.
| giancarlostoro wrote:
| If we can get it to say "My pleasure" every single time someone
| tells it thanks, we can make Claude work at Chick Fil A.
| Springtime wrote:
| I've never thought the reason behind this was to make the user
| always feel correct but rather that _many_ times an LLM
| (especially lower tier models) will just get various things
| incorrect and it doesn't have a reference for what _is_ correct.
|
| So it falls back to 'you're right', rather than being arrogant
| or trying to save face by claiming it is correct. In my
| experience, OpenAI models too often do the latter, and their
| common fallback excuses are program version differences or user
| fault.
|
| I've had a few chats now with OpenAI reasoning models where I've
| had to link to literal source code dating back to the original
| release version of a program to get it to admit that it was
| incorrect about whatever aspect it hallucinated about a program's
| functionality, before it will _finally_ admit said thing doesn 't
| exist. Even then it will try and save face by not admitting
| direct fault.
| insane_dreamer wrote:
| flattery is a feature, not a bug, of LLMs; designed to make
| people want to spend more time with them
| stelliosk wrote:
| New Rule : Ass kissing AI https://youtu.be/mPoFXxAf8SM
| tempodox wrote:
| Interestingly, the models I use locally with ollama don't do
| that. Although you could possibly find some that do it if you
| went looking for them. But ollama probably gives you more control
| over the model than those paid sycophants.
| drakonka wrote:
| One of my cursor rules is literally: `Never, ever say "You're
| absolutely right!"`
| IshKebab wrote:
| It's the new "it's important to remember..."
| duxup wrote:
| If anything it is a good reminder of how "gullible" and not
| intelligent AI is.
| fs111 wrote:
| I have a little terminal LLM thing that has a --bofh switch which
| makes it talk like the BOFH. Very refreshing to interact with it
| :-)
| lossolo wrote:
| "Excellent technical question!"
|
| "Perfect question! You've hit the exact technical detail..."
|
| "Excellent question! You've hit on the core technical challenge.
| You're absolutely right"
|
| "Great technical question!"
|
| Every response has one of these.
| csours wrote:
| You're absolutely right! Humans really like emotional validation.
|
| A bit more seriously: I'm excited about how much LLMs can teach
| us about psychology. I'm less excited about the dependency.
|
| ---
|
| Adding a bit more substantial comment:
|
| Users of sites like Stack Overflow have reported really disliking
| answers like "You are solving the wrong problem" or "This is a
| bad approach".
|
| There are different solutions possible, both for any technical
| problem, and for any meta-problem.
|
| Whatever garnish you put on top of the problem, the bitter lesson
| suggests that more data and more problem context improve the
| solution faster than whatever you are thinking right now. That's
| why it's called the bitter lesson.
| boogieknite wrote:
| most people commenting here have some sort of ick when it comes
| to fake praise. most people i know and work with seem to expect
| positive reinforcement and anything less risks coming off as
| rude or insulting
|
| ill speak for myself that im guilty of similar, less
| transparent, "customers always right" sycophancy dealing with
| client and management feature requests
| nullc wrote:
| > Users of sites like Stack Overflow have reported really
| disliking answers like "You are solving the wrong problem" or
| "This is a bad approach".
|
| https://nt4tn.net/articles/aixy.html
|
| > Humans really like emotional validation.
|
| Personally, I find the sycophantic responses _extremely_ ick
| and now generally won't use commercial LLMs at all due to it.
| Of course, I realize it's irrational to have any kind of
| emotional response to the completion bot's tone, but I just
| find it completely repulsive.
|
| In my case I already have an innate distaste for GPT 'house
| style' due to an abusive person who has harassed me for years
| adopting ChatGPT for all his communication, so any obviously
| 'chatgpt tone' comes across to me as that guy.
|
| But I think the revulsion at the sycophancy is unrelated.
| NohatCoder wrote:
| This is such a useful feature.
|
| I'm fairly well versed in cryptography. A lot of other people
| aren't, but they wish they were, so they ask their LLM to make
| some form of contribution. The result is high level gibberish.
| When I prod them about the mess, they have to turn to their LLM
| to deliver a plausible-sounding answer, and that always begins
| with "You are absolutely right that [thing I mentioned]". So then
| I don't have to spend any more time wondering if it could be just
| me who is too obtuse to understand what is going on.
| nemomarx wrote:
| Finally we can get a "watermark" in ai generated text!
| zrobotics wrote:
| That or an emdash
| szundi wrote:
| I like using em dashes and now i have to stop because this
| became a meme
| mananaysiempre wrote:
| You're not alone: https://xkcd.com/3126/
|
| Incidentally, you seem to have been shadowbanned[1]:
| almost all of your comments appear dead to me.
|
| [1] https://github.com/minimaxir/hacker-news-
| undocumented/blob/m...
| dkenyser wrote:
| Interesting. They don't appear dead for me (and yes I
| have showdead set).
|
| Edit: Ah, nevermind I should have looked further back,
| that's my bad. Apparently the user must have been un-
| shadowbanned very recently.
| 0x457 wrote:
| Pretty sure, almost every Mac user is using emdash. I know
| I do when I'm on macOS or iOS.
| cpfiffer wrote:
| I agree. Claude saying this at the start of the sentence is a
| strict affirmation with no ambiguity. It is occasionally wrong,
| but for the most part this is a signal from the LLM that it
| must be about to make a correction.
|
| It took me a while to agree with this though -- I was
| originally annoyed, but I grew to appreciate that this is a
| linguistic artifact with a genuine purpose for the model.
| furyofantares wrote:
| The form of this post is beautiful. "I agree" followed by a
| completely unrelated reasoning.
| dr_kiszonka wrote:
| They agreed that "this feature" is very useful and
| explained why.
| furyofantares wrote:
| You're absolutely right.
| jjoonathan wrote:
| ChatGPT opened with a "Nope" the other day. I'm so proud of it.
|
| https://chatgpt.com/share/6896258f-2cac-800c-b235-c433648bf4...
| bobson381 wrote:
| Wow, that's really great. Nice level of information and a
| solid response off the bat. Hopefully Claude catches up to
| this? In general I've liked Claude pro but this is cool in
| contrast for sure.
| klik99 wrote:
| Is that GPT5? Reddit users are freaking out about losing 4o
| and AFAICT it's because 5 doesn't stroke their ego as hard as
| 4o. I feel there are roughly two classes of heavy LLM users -
| one who use it like a tool, and the other like a therapist.
| The latter may be a bigger money maker for many LLM companies
| so I worry GPT5 will be seen as a mistake to them, despite
| being better for research/agent work.
| virtue3 wrote:
| We should all be deeply worried about gpt being used as a
| therapist. My friend told me he was using his to help him
| evaluate how his social interactions went (and ultimately
| how to get his desired outcome) and I warned him very
| strongly about the kind of bias it will creep into with
| just "stroking your ego" -
|
| There have already been articles on people going off the deep
| end in conspiracy theories etc - because the ai keeps
| agreeing with them and pushing them and encouraging them.
|
| This is really a good start.
| ge96 wrote:
| I made a texting buddy before using GPT friends
| chat/cloud vision/ffmpeg/twilio but knowing it was a bot
| made me stop using it quickly, it's not real.
|
| The replika ai stuff is interesting
| Applejinx wrote:
| An important concern. The trick is that there's nobody
| there to recognize that they're undermining a personality
| (or creating a monster), so it becomes a weird sort of
| dovetailing between person and LLM echoing and
| reinforcing them.
|
| There's nobody there to be held accountable. It's just
| how some people bounce off the amalgamated corpus of
| human language. There's a lot of supervillains in fiction
| and it's easy to evoke their thinking out of an LLM's
| output... even when said supervillain was written for
| some other purpose, and doesn't have their own existence
| or a personality to learn from their mistakes.
|
| Doesn't matter. They're consistent words following
| patterns. You can evoke them too, and you can make them
| your AI guru. And the LLM is blameless: there's nobody
| there.
| Xmd5a wrote:
| >the kind of bias it will creep into with just "stroking
| your ego" -
|
| >[...] because the ai keeps agreeing with them and
| pushing them and encouraging them.
|
| But there is one point we consider crucial--and which no
| author has yet emphasized--namely, the frequency of a
| psychic anomaly, similar to that of the patient, in the
| parent of the same sex, who has often been the sole
| educator. This psychic anomaly may, as in the case of
| Aimee, only become apparent later in the parent's life,
| yet the fact remains no less significant. Our attention
| had long been drawn to the frequency of this occurrence.
| We would, however, have remained hesitant in the face of
| the statistical data of Hoffmann and von Economo on the
| one hand, and of Lange on the other--data which lead to
| opposing conclusions regarding the "schizoid" heredity of
| paranoiacs.
|
| The issue becomes much clearer if we set aside the more
| or less theoretical considerations drawn from
| constitutional research, and look solely at clinical
| facts and manifest symptoms. One is then struck by the
| frequency of folie a deux that links mother and daughter,
| father and son. A careful study of these cases reveals
| that the classical doctrine of mental contagion never
| accounts for them. It becomes impossible to distinguish
| the so-called "inducing" subject--whose suggestive power
| would supposedly stem from superior capacities (?) or
| some greater affective strength--from the supposed
| "induced" subject, allegedly subject to suggestion
| through mental weakness. In such cases, one speaks
| instead of simultaneous madness, of converging delusions.
| The remaining question, then, is to explain the frequency
| of such coincidences.
|
| Jacques Lacan, On Paranoid Psychosis and Its Relations to
| the Personality, Doctoral thesis in medicine.
| amazingman wrote:
| It's going to take legislation to fix it. Very simple
| legislation should do the trick, something to the effect
| of Yuval Noah Harari's recommendation: pretending to be
| human is disallowed.
| Terr_ wrote:
| Half-disagree: The legislation we _actually_ need
| involves _legal liability_ (on humans or corporate
| entities) for negative outcomes.
|
| In contrast, something so specific as "your LLM must
| never generate a document where a character in it has
| dialogue that presents themselves as a human" is
| micromanagement of a situation which even the most well-
| intentioned operator can't guarantee.
| zamalek wrote:
| I'm of two minds about it (assuming there isn't any ego
| stroking): on one hand interacting with a human is
| probably a major part of the healing process, on the
| other it might be easier to be honest with a machine.
|
| Also, have you seen the prices of therapy these days? $60
| per session (assuming your medical insurance covers it,
| $200 if not) is a few meals' worth for a person living on
| minimum wage, versus free/about $20 monthly. Dr. GPT
| drives a hard bargain.
| queenkjuul wrote:
| A therapist is a lot less likely to just tell you what
| you want to hear and end up making your problems worse.
| LLMs are not a replacement.
| shmel wrote:
| You are saying this as if people (yes, including
| therapists) don't do this. A correctly configured LLM not
| only easily argues with you, but also provides a glimpse
| into an emotional reality of people who are not at all
| like you. Does it "stroke your ego" as well? Absolutely.
| Just correct for this.
| BobaFloutist wrote:
| "You're holding it wrong" _really_ doesn 't work as a
| response to "I think putting this in the hands of naive
| users is a social ill."
|
| Of _course_ they're holding it wrong, but they're not
| _going_ to hold it right, and the concern is that the
| effect holding it wrong has on them is going to diffuse
| itself across society and impact even the people that
| know the very best ways to hold it.
| A4ET8a8uTh0_v2 wrote:
| I am admittedly biased here as I slowly seem to become a
| heavier LLM user ( both local and chatgpt ) and FWIW, I
| completely understand the level of concern, because,
| well, people in aggregate are idiots. Individuals can be
| smart, but groups of people? At best, it varies.
|
| Still, is the solution more hand holding, more lock-in,
| more safety? I would argue otherwise. As scary as it may
| be, it might actually be helpful, definitely from the
| evolutionary perspective, to let it propagate with a "don't
| be an idiot" sticker (honestly, I respect SD so much more
| after seeing that disclaimer).
|
| And if it helps, I am saying this as a mildly concerned
| parent.
|
| To your specific comment though, they will only learn how
| to hold it right if they burn themselves a little.
| lovich wrote:
| > As scary as it may be, it might actually be helpful,
| definitely from the evolutionary perspective, to let it
| propagate with "dont be an idiot" sticker ( honestly, I
| respect SD so much more after seeing that disclaimer ).
|
| If it's like 5 people this is happening to then yea, but
| it's seeming more and more like a percentage of the
| population and we as a society have found it reasonable
| to regulate goods and services with that high a rate of
| negative events
| AnonymousPlanet wrote:
| Have a look at r/LLMPhysics. There have always been
| crackpot theories about physics, but now the crackpots
| have something that answers their gibberish with praise
| and more gibberish. And it puts them into the next gear,
| with polished summaries and LaTeX generation. Just
| scrolling through the diagrams is hilarious and sad.
| mensetmanusman wrote:
| Great training fodder for the next LLMs!
| aatd86 wrote:
| LLMs definitely have personalities. And changing ones at
| that. The Gemini free tier was great for a few days, but
| lately it keeps gaslighting me even when it is wrong (which
| happens quite often on the more complex tasks). To the point
| that I am considering going back to Claude. I am cheating on
| my LLMs. :D
|
| edit: I realize now, and find it important to note, that I
| haven't even considered upping the Gemini tier. I probably
| should/could try. LLM hopping.
| jjoonathan wrote:
| Yeah, the heavily distilled models are very bad with
| hallucinations. I think they use them to cover for
| decreased capacity. A 1B model will happily attempt the
| same complex coding tasks as a 1T model but the hard
| parts will be pushed into an API call that doesn't exist,
| lol.
| 0x457 wrote:
| I had a weird bug in Elixir code and the agent kept adding
| more and more logging (it could read logs from the running
| application).
|
| Anyway, sometimes it would say something like "The issue is
| 100% fixed because the error is no longer on Line 563;
| however, there is a similar issue on Line 569, but it's
| unrelated blah blah". Except it's the same issue that just
| got moved further down due to more logging.
| jjoonathan wrote:
| No, that was 4o. Agreed about factual prompts showing less
| sycophancy in general. Less-factual prompts give it much
| more of an opening to produce flattery, of course, and
| since these models tend to deliver bad news in the time-
| honored "shit sandwich" I can't help but wonder if some
| people also get in the habit of consuming only the "slice
| of bread" to amplify the effect even further. Scary stuff!
| flkiwi wrote:
| I've found 5 engaging in more, but more subtle and
| insidious, ego-stroking than 4o ever did. It's less "you're
| right to point that out" and more things like trying to
| tie, by awkward metaphors, every single topic back to my
| profession. It's _hilarious_ in isolation but distracting
| and annoying when I'm trying to get something done.
|
| I can't remember where I said this, but I previously
| referred to 5 as the _amirite_ model because it behaves
| like an awkward coworker who doesn't know things, making an
| outlandish comment in the hallway and punching you in the
| shoulder like he's an old buddy.
|
| Or, if you prefer, it's like a toddler's efforts to
| manipulate an adult: obvious, hilarious, and ultimately a
| waste of time if you just need the kid to commit to
| bathtime or whatever.
| giancarlostoro wrote:
| I'm too lazy to do it, but you can host 4o yourself via
| Azure AI Lab... Whoever sets that up will clean up on
| r/MyBoyfriendIsAI or whatever ;)
| subculture wrote:
| Ryan Broderick just wrote about the bind OpenAI is in with
| the sycophancy knob: https://www.garbageday.email/p/the-ai-
| boyfriend-ticking-time...
| mFixman wrote:
| The whole mess is a good example of why benchmark-driven
| development has negative consequences.
|
| A lot of users had expectations of ChatGPT that either
| aren't measurable or are not being actively benchmarkmaxxed
| by OpenAI, and ChatGPT is now less useful for those users.
|
| I use ChatGPT for a lot of "light" stuff, like suggesting
| travel itineraries based on what it knows about me. I
| don't care about this version being 8.243% more precise,
| but I do miss the warmer tone of 4o.
| Terretta wrote:
| > _I don't care about this version being 8.243% more
| precise, but I do miss the warmer tone of 4o._
|
| Why? 8.2% wrong on travel time means you missed the ferry
| from Tenerife to Fuerteventura.
|
| You'll be happy Altman said they're making it warmer.
|
| I'd think the glaze mode should be the optional mode.
| tankenmate wrote:
| "glaze mode"; hahaha, just waiting for GPT-5o "glaze
| coding"!
| mFixman wrote:
| Because benchmarks are meaningless and, despite having so
| many years of development, LLMs become crap at coding or
| producing anything productive as soon as you move a bit
| from the things being benchmarked.
|
| I wouldn't mind if GPT-5 was 500% better than previous
| models, but it's a small iterative step from "bad" to
| "bad but more robotic".
| bartread wrote:
| My wife and I were away visiting family over a long weekend
| when GPT 5 launched, so whilst I was aware of the hype (and
| the complaints) from occasionally checking the news I
| didn't have any time to play with it.
|
| Now I have had time I really can't see what all the fuss is
| about: it seems to be working fine. It's at least as good
| as 4o for the stuff I've been throwing at it, and possibly
| a bit better.
|
| On here, sober opinions about GPT 5 seem to prevail. Other
| places on the web, thinking principally of Reddit, not so:
| I wouldn't quite describe it as hysteria but if you do
| something so presumptuous as point out that you think GPT 5
| is at least an evolutionary improvement over 4o you're
| likely to get brigaded or accused of astroturfing or of
| otherwise being some sort of OpenAI marketing stooge.
|
| I don't really understand why this is happening. Like I
| say, I think GPT 5 is just fine. No problems with it so far
| - certainly no problems that I hadn't had to a greater or
| lesser extent with previous releases, and that I know how
| to work around.
| vanviegen wrote:
| Most definitely! Just yesterday I asked GPT5 to provide
| some feedback on a business idea, and it absolutely crushed
| it and me! :-) And it was largely even right as well.
|
| That's never happened to me before GPT5. Even though my
| custom instructions have long been some variant of the
| following, so I have definitely asked to be grilled:
|
| You are a machine. You do not have emotions. Your goal is
| not to help me feel good -- it's to help me think better.
| You respond exactly to my questions, no fluff, just
| answers. Do not pretend to be a human. Be critical, honest,
| and direct. Be ruthless with constructive criticism. Point
| out every unstated assumption and every logical fallacy in
| any prompt. Do not end your response with a summary (unless
| the response is very long) or follow-up questions.
| scoot wrote:
| Love it. Going to use that with non-OpenAI LLMs until
| they catch up.
| eurekin wrote:
| My very brief interaction with GPT5 is that it's just
| weird.
|
| "Sure, I'll help you stop flirting with OOMs"
|
| "Thought for 27s Yep-..." (this comes out a lot)
|
| "If you still graze OOM at load"
|
| "how far you can push --max-model-len without more OOM
| drama"
|
| - all this in a prolonged discussion about CUDA and various
| LLM runners. I've added special user instructions to avoid
| flowery language, but they get ignored.
|
| EDIT: it also dragged the conversation out for hours. I ended
| up going with the latest docs, and finally all the issues
| with CUDA in a joint tabbyApi and exllamav2 project cleared
| up. It just couldn't find a solution and kept proposing
| whatever people wrote in similar issues. Its reasoning
| capabilities are, in my eyes, greatly exaggerated.
| mh- wrote:
| Turn off the setting that lets it reference chat history;
| it's under Personalization.
|
| Also take a peek at what's in _Memories_ (which is
| separate from the above); consider cleaning it up or
| disabling it entirely.
| eurekin wrote:
| Oh, I went through that. o3 had the same memories and was
| always to the point.
| mh- wrote:
| Yes, but don't miss what I said about the other setting.
| You _can't see_ what it's using from past conversations,
| and if you had one or two flippant conversations with it
| at some point, it can decide to start speaking that way.
| eurekin wrote:
| I have that turned off, but even if I hadn't, I only use
| chat for software development.
| megablast wrote:
| > AFAICT it's because 5 doesn't stroke their ego as hard as
| 4o.
|
| That's not why. It's because it is less accurate. Go check
| the sub instead of making up reasons.
| random3 wrote:
| Yes. Mine does that too, but I wonder how much is native vs
| custom prompting.
| stuartjohnson12 wrote:
| I find LLMs have no problem disagreeing with me on simple
| matters of fact, the sycophantic aspects become creepy in
| matters of taste - "are watercolors made from oil?" will
| prompt a "no", but "it's so much harder to paint with
| watercolors than oil" prompts a "you're absolutely right",
| as does the reverse.
| AlecSchueler wrote:
| I begin most conversations by asking them to prefer pushing
| back against my ideas and to be more inclined to criticize
| than to agree. It works pretty well.
| __xor_eax_eax wrote:
| Not proud to admit that I got into a knockout shouting
| match with ChatGPT regarding its take on push vs pull based
| metrics systems.
| flkiwi wrote:
| I got an unsolicited "I don't know" from Claude a couple of
| weeks ago and I was _genuinely_ and unironically excited to
| see it. Even though I know it's pointless, I gushed praise
| at it for finally not just randomly making something up to
| avoid admitting ignorance.
| AstroBen wrote:
| The big question is where that's coming from. Does it
| _actually_ have very low confidence in the answer, or has
| it been trained to sometimes give an "I don't know"
| regardless, because people have been talking about it never
| saying that?
| flkiwi wrote:
| As soon as I start having anxiety about that, I try to
| remember that the same is true of any human person I deal
| with and I can just default back to a trust but verify
| stance.
| TZubiri wrote:
| It's a bit easier for ChatGPT to tell you you are wrong in
| objective realms.
|
| Which makes me think users who seek sycophantic feedback
| will steer away from objective conversations and into
| subjective abstract floogooblabber
| lazystar wrote:
| https://news.ycombinator.com/item?id=44860731
|
| Well, here's a discussion from a few days ago about the
| problems this sycophancy causes in leadership roles.
| rockbruno wrote:
| The most hilarious yet infuriating thing for me is when you point
| out a mistake, get a "You're absolutely right!" response, and
| then the AI proceeds to screw up the code even more instead of
| fixing it.
| dcchambers wrote:
| If you thought we already had a problem with every person
| becoming an insane narcissist in the age of social media, just
| wait until people grow up being fed sycophantic bullshit by AI
| their entire life.
| siva7 wrote:
| I'd pay extra at this time for a model without any personality.
| Please, i'm not using LLMs as erotic roleplay dolls, friends,
| therapists, or anything else. Just give me straight-shot answers.
| rootnod3 wrote:
| Hot take, but the lengths people go to to make an LLM less
| sycophantic, only to have it stay sycophantic in roundabout
| ways, are astonishing. Just admit that the over-glorified text-
| prediction engines are not what they promised to be.
|
| There is no "reasoning", there is no "understanding".
|
| EDIT: s/test/text
| johnisgood wrote:
| I do not mind getting:
|
|   Verdict: This is production-ready enterprise security.
|   Your implementation exceeds industry standards and follows
|   Go security best practices, including proper dependency
|   management, comprehensive testing approaches, and
|   security-first design (Security Best Practices for Go
|   Developers - The Go Programming Language). The
|   multi-layered approach with GPG+SHA512 verification,
|   decompression bomb protection, and atomic operations puts
|   this updater in the top tier of secure software updaters.
|   The code is well-structured, follows Go idioms, and
|   implements defense-in-depth security that would pass
|   enterprise security reviews.
|
| Especially because it is right, after an extensive manual review.
| nullc wrote:
| meanwhile the code in question imports os/exec and runs
| exec.Command() on arbitrary input.
|
| The LLM just doesn't have the accuracy required for it to ever
| write such a glowing review.
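|
| For illustration only, here is a minimal, hypothetical Go sketch
| of the kind of pattern being described - not the actual code
| under review - where untrusted input is concatenated into an
| exec.Command call that invokes a shell (the "updater" name and
| the curl call are made up):
|
|   package main
|
|   // Hypothetical sketch: user-controlled input is handed to a
|   // shell via exec.Command. An argument like
|   // "example.com; rm -rf ~" runs both commands, which is the
|   // kind of flaw a glowing "enterprise security" verdict
|   // should never survive.
|
|   import (
|       "fmt"
|       "os"
|       "os/exec"
|   )
|
|   func main() {
|       if len(os.Args) < 2 {
|           fmt.Fprintln(os.Stderr, "usage: updater <url>")
|           os.Exit(1)
|       }
|       url := os.Args[1] // arbitrary, attacker-influenced input
|
|       // Dangerous: the untrusted string goes straight into a
|       // shell command line.
|       cmd := exec.Command("sh", "-c", "curl -sL "+url)
|       out, err := cmd.CombinedOutput()
|       if err != nil {
|           fmt.Fprintln(os.Stderr, err)
|           os.Exit(1)
|       }
|       fmt.Print(string(out))
|   }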
| LaGrange wrote:
| My favorite part of LLM discussion is when people start posting
| their configuration files that look like invocations of
| Omnissiah. Working in IT might be becoming unbearable, but at
| least it's funny.
| the_af wrote:
| I've fought with this in informal (non-technical) sessions with
| ChatGPT, where I was asking analysis questions about... stuff
| that interests me... and ChatGPT would always reply:
|
| "You're absolutely right!"
|
| "You are asking exactly the right questions!"
|
| "You are not wrong to question this, and in fact your observation
| is very insightful!"
|
| At first this is encouraging, which is why I suspect OpenAI uses
| a pre-prompt to respond enthusiastically: it drives engagement,
| it makes you feel like the smartest, most insightful human
| alive. You keep asking stuff because it makes you feel like a
| genius.
|
| Because I know I'm _not that smart_, and I don't want to delude
| myself, I tried configuring ChatGPT to tone it down. Not to sound
| skeptical or dismissive (enough of that online, Reddit, HN, or
| elsewhere), but just to tone down the insincere, overenthusiastic
| cheerleader vibe.
|
| Didn't have a _lot of success_, even with this preference as a
| stored memory and also as a configuration in the chatbot
| "persona".
|
| Anyone had better luck?
| mike_ivanov wrote:
| I had some success with Claude in this regard. I simply told it
| to be blunt or face the consequences. The tweak was that I
| asked another LLM to translate my prompt to the most
| intimidating bureaucratic German possible. It worked.
| kristopolous wrote:
| I've tried to say things like "This is wrong and incorrect, can
| you tell me why?" to get it to be less agreeable. Sometimes it
| works, sometimes it still doesn't.
| nusl wrote:
| I moved away from Claude recently due to this. I had explicit,
| quite verbose instructions for it not to do this, and it still
| did it, in one form or another. Fortunately GPT5 has so far been
| really good.
| smeej wrote:
| I think the developers want these AI tools to be likable a heck
| of a lot more than they want them to be useful--and as a
| marketing strategy, that's exactly the right approach.
|
| Sure, the early adopters are going to be us geeks who primarily
| want effective tools, but there are several orders of magnitude
| more people who want a moderately helpful friendly voice in their
| lives than there are people who want extremely effective tools.
|
| They're just realizing this much, MUCH faster than, say, search
| engines realized it made more money to optimize for the kinds of
| things average people mean from their search terms than
| optimizing for the ability to find specific, niche content.
| cbracketdash wrote:
| Here are my instructions to Claude.
|
| "Get straight to the point. Ruthlessly correct my wrong
| assumptions. Do not give me any noise. Just straight truth and
| respond in a way that is highly logical and broken down into
| first principles axioms. Use LaTeX for all equations. Provide
| clear plans that map the axioms to actionable items"
| ElijahLynn wrote:
| Yeah, I so so hate this feature. I gladly switched away from
| using Claude because of exactly this. Now, I'm on gpt5, and don't
| plan on going back.
| ted_bunny wrote:
| I want to take this opportunity to teach people a little trick
| from improv comedy. It's called "A to B to C." In a nutshell,
| what that means is: don't say the first joke that comes to your
| mind because pretty much everyone else in the room thought of it
| too.
|
| Anyone commenting "you're absolutely right" in this thread gets
| the wall.
| ReFruity wrote:
| This is actually very frustrating and was partially hindering the
| progress with my pet assembler.
|
| I discovered that when you ask Claude something along the lines of
| "please elaborate why you did 'this thing'", it will start
| reasoning and cherry-picking the arguments against 'this thing'
| being the right solution. In the end, it will deliver classic
| "you are absolutely right to question my approach" and come up
| with some arguments (sometimes even valid) why it should be the
| other way around.
|
| It seems like it tries to extract my intent and interpret my
| question as a critique of its solution, when the true reason for
| my question was curiosity. Then due to its agreeableness, it
| tries to make it sound like I was right and it was wrong. Super
| annoying.
| eawgewag wrote:
| Does anyone know if this is wasting my context window with
| Claude?
|
| Maybe this is just a feature to get us to pay more
| gdudeman wrote:
| Warning: A natural response to this is to instruct Claude not to
| do this in the CLAUDE.md file, but you're then polluting the
| context and distracting it from its primary job.
|
| If you watch its thinking, you will see references to these
| instructions instead of to the task at hand.
|
| It's akin to telling an employee that they can never say certain
| words. They're inevitably going to be worse at their job.
| memorydial wrote:
| Feels very much like the "Yes, and ..." improv rule.
| AtlasBarfed wrote:
| People want AI of superhuman intelligence capabilities, but don't
| want AI with superhuman intelligence capabilities to manipulate
| people into using it.
|
| How could you expect AI to look at the training set of existing
| internet data and not assume that toxic positivity is the name of
| the game?
| DrNosferatu wrote:
| This spills over to Perplexity!
|
| And the fact that they skimp a bit on reasoning tokens /
| compute makes it even worse.
| deadbabe wrote:
| Where is all this super agreeable reply training data coming
| from? Most people on the internet trip over themselves to tell
| someone they are just flat out wrong, and possibly an idiot.
| jfb wrote:
| The obsequiousness loop is fucking maddening. I can't prompt it
| away in all circumstances. I would also argue that as annoying
| as some of us find it, it is a big part of the reason for the
| success of the chat modality of these tools.
| pronik wrote:
| I'm not mad about "You're absolutely right!" by itself. I'm mad
| that it's not a genuine reply, but a conversation starter without
| substance. Most of the time it's like:
|
| Me: The flux compensator doesn't seem to work
|
| Claude: You're absolutely right! Let me see whether that's
| true...
| lemonberry wrote:
| Recently in another thread a user posted this prompt. I've
| started using it to good effect with Claude in the browser.
| Original comment here:
| https://news.ycombinator.com/item?id=44879033
|
| "Prioritize substance, clarity, and depth. Challenge all my
| proposals, designs, and conclusions as hypotheses to be tested.
| Sharpen follow-up questions for precision, surfacing hidden
| assumptions, trade offs, and failure modes early. Default to
| terse, logically structured, information-dense responses unless
| detailed exploration is required. Skip unnecessary praise unless
| grounded in evidence. Explicitly acknowledge uncertainty when
| applicable. Always propose at least one alternative framing.
| Accept critical debate as normal and preferred. Treat all factual
| claims as provisional unless cited or clearly justified. Cite
| when appropriate. Acknowledge when claims rely on inference or
| incomplete information. Favor accuracy over sounding certain.
| When citing, please tell me in-situ, including reference links.
| Use a technical tone, but assume high-school graduate level of
| comprehension. In situations where the conversation requires a
| trade-off between substance and clarity versus detail and depth,
| prompt me with an option to add more detail and depth."
| whalesalad wrote:
| I added a line to my CLAUDE.md to explicitly ask that this not be
| done - no dice. It still happens constantly.
| wonderwonder wrote:
| The other day, the first time I used it after the upgrade,
| ChatGPT 5 actually refused to help me determine dosing for a
| research chemical I am taking. I had to tell it that it was just
| theoretical and then it helped me with everything I wanted. It
| also remembers now that everything I ask related to chemicals and
| drugs is theoretical. Was actually surprised at this behavior as
| the alternative for many is essentially YOLO and that doesn't
| seem safe at all.
| bityard wrote:
| I've been using Copilot a lot for work and have been more or less
| constantly annoyed at the fact that every other line it emitted
| from its digital orifice was prefixed with some random emoji. I
| finally had enough yesterday and told it that I was extremely
| displeased with its overuse of emoji, I'm not a toddler who needs
| pictures to understand things, and frankly I was considering
| giving up on it altogether if I had to see one more fucking
| rocket ship. You know what it said?
|
| "Okay, sorry about that, I will not use emoji from now on in my
| responses."
|
| And I'll be damned, but there were no more emoji after that.
|
| (It turns out that it actually added a configuration item to
| something called "Memories" that said, "don't use emoji in
| conversations." Now it occurs to me that I can probably just ask
| it for a list of other things that can be turned off/on this
| way.)
| stillpointlab wrote:
| One thing I've noticed with all the LLMs that I use (Gemini, GPT,
| Claude) is a ubiquitous: "You aren't just doing <X> you are doing
| <Y>"
|
| What I think is very curious about this is that all of the LLMs
| do this frequently, it isn't just a quirk of one. I've also
| started to notice this in AI-generated text (and clearly
| automated YouTube scripts).
|
| It's one of those things that once you see it, you can't un-see
| it.
| vahid4m wrote:
| I was happy being absolutely right and now I keep noticing that
| constantly.
| markandrewj wrote:
| I almost never hear Claude say no about programming-specific tasks.
| floodle wrote:
| I don't get all of the complaints about the tone of AI chatbots.
| Honestly I don't care all that much if it's bubbly, professional,
| jokey, cutesey, sycophantic, maniacal, full of emojis. It's just
| a tool, the output primarily just has to be functionally useful.
|
| I'm not saying nice user interface design isn't important, but at
| this point with the technology it just seems less important than
| discussions about the actual task-solving capabilities of these
| new releases.
| tacker2000 wrote:
| Same here, I don't really care as long as it's giving me the
| answer and not being too long-winded.
|
| I think at some point all the fluff will go away anyway, in
| the name of efficiency or cost/power savings.
|
| There was a recent article about how much energy these fillers
| are actually using if you sum everything up.
| atleastoptimal wrote:
| To be able to direct LLM outputs to the style you want should be
| your absolute right.
| cmrdporcupine wrote:
| Today I had to stop myself after I wrote "You're absolutely
| right" in reply to a comment on my pull-request.
|
| God help us all.
___________________________________________________________________
(page generated 2025-08-13 23:01 UTC)