[HN Gopher] I trusted an LLM, now I'm on day 4 of an afternoon project
___________________________________________________________________
I trusted an LLM, now I'm on day 4 of an afternoon project
Author : nemofoo
Score : 85 points
Date : 2025-01-27 21:37 UTC (1 hour ago)
(HTM) web link (nemo.foo)
(TXT) w3m dump (nemo.foo)
| BigParm wrote:
| LLM == WGCM = Wild Goose Chase Model
| tanseydavid wrote:
| I ask this question without a hint of tone or sarcasm. You said:
| "*it's a junior dev faking competence. Trust it at your own
| risk.*" My question is simply: "wouldn't you expect to personally
| be able to tell that a human junior dev was faking competence?"
| Why should it be different with the LLM?
| latexr wrote:
| Obviously, it depends on context. When talking to someone live
| you can pick up on subtle hints such as tone of voice, or where
| they look, or how they gesticulate, or a myriad other signals
| which give you a hint to their knowledge gaps. If you're
| communicating via text, the signals change. Furthermore, as you
| interact with people more often you understand them better and
| refine your understanding of them. LLMs always forget and
| "reset" and are in flux. They aren't as consistent. Plus, they
| don't grow with you and pick up on _your_ signals and wants.
|
| It's incredibly worrying that it needs to be explained again
| and again that LLMs are different from people, do not behave
| like people, and should not be compared to people or interacted
| with like people, because they are not people.
| appleorchard46 wrote:
| Interestingly, the social cues you describe picking up on are
| the exact sort of social cues I struggle with.
| If someone says something, generally speaking I expect it to
| be true unless there is an issue with it that suggests
| otherwise.
|
| I suppose the wide range of negative and positive experiences
| people seem to have working with LLMs is related to the wide
| range of expectations people have for their interactions in
| general.
| layer8 wrote:
| Not instantly. You'd give the human junior dev the benefit of
| the doubt at first. But when it becomes clear that the junior
| dev is faking competence all the time (that might take longer
| than the four days in TFA -- yes I know it's not exactly
| comparable, just saying) and won't drop the act and start
| being honest instead, you'd eventually let them go, because
| that's no way to work with someone.
| zitterbewegung wrote:
| I have used LLMs as a tool, and I start to "give up" working with
| them after a few tries. They excel at simple tasks, boilerplate,
| or scripts, but for larger programs you really have to know
| exactly what you want to do.
|
| I do see the LLMs ingesting more and more documentation and
| content, and they are improving at giving me the right answers.
| Almost two years ago I don't believe they had every Python
| package indexed; now they appear to have at least the
| documentation or source code for most of them.
| XorNot wrote:
| The trouble is the only reliable use-case LLMs actually seem
| good at is "augmented search engine". Any attempts at coding
| with them just end up feeling like trying to code via a worse
| interface.
|
| So it's handy to get a quick list of "all packages which do X",
| but it's worse than useless to have it speculate as to which
| one to use or why, because of the hallucination problem.
| anigbrowl wrote:
| I've found them to be quite a time saver, within limits. The blog
| post seemed scattered and disorganized to me, and the author
| admits having no experience with using LLMs to this end, so
| perhaps the problem lies behind their eyes.
| addaon wrote:
| There's not much actual LLM-generated text in this post to go by,
| but it seems like each of the tokens generated by the LLM would
| be reasonable to have high probability. It sounds like the
| developer here thought that the sequence of tokens then carried
| meaning, where instead any possible meaning came from the
| reading. I wonder if this developer would be as irritated by the
| inaccuracies if they had cast sticks onto the ground to manage
| their stock portfolio and found the prophecy's "meaning" to be
| plausible but inaccurate.
| pieix wrote:
| > AI isn't a co-pilot; it's a junior dev faking competence. Trust
| it at your own risk.
|
| This is a good take that tracks with my (heavy) usage of LLMs for
| coding. Leveraging productive-but-often-misguided junior devs is
| a skill every dev should actively cultivate!
| codr7 wrote:
| What you're doing is sacrificing learning for speed.
|
| Which is fine, if it's a conscious choice for yourself.
| piva00 wrote:
| I don't think GP was talking about themselves being a junior
| using LLMs, at least my interpretation was that devs should
| learn how to leverage misguided juniors, and LLMs are more-or-
| less on the level of a misguided junior.
|
| Which I completely agree with. I use LLMs for the cases where I
| do know what I'm trying to do, I just can't remember some exact
| detail that would require reading documentation. It's much
| quicker to leverage an LLM than to go on a wild goose chase for
| the piece of information I know exists.
|
| Also it's a pretty good tool to scaffold the boring stuff,
| asking an LLM "generate test code for X asserting A, B, and C"
| and editing it to be a proper test frees up mental space for
| more important stuff.
|
| I wouldn't trust an LLM to generate any kind of business
| logic-heavy code, instead I use it as a quite smart
| template/scaffold generator.
| nrb wrote:
| > Leveraging productive-but-often-misguided junior devs is a
| skill every dev should actively cultivate!
|
| Feels like this is only worthwhile because the junior dev
| learns from the experience; an investment that yields benefits
| all around, in the broad sense. Nobody wants a junior around
| that refuses to learn in perpetuity, serving only as a drag on
| productivity and eventually your sanity.
| stuaxo wrote:
| The junior dev faking competence is useful but needs a lot of
| supervision (unlike a real junior dev, we don't know if this one
| will get better).
| tacoooooooo wrote:
| the "AI lies" takeaway is way off for those actually using these
| tools. Calling it a "junior dev faking competence" is catchy, but
| misses the point. We're not expecting a co-pilot, it's a tool, a
| super-powered intern that needs direction. The spaghetti code
| mess wasn't AI "lying", it was a lack of control and proper
| prompting.
|
| Experienced folks aren't surprised by this. LLMs are fast for
| boilerplate, research, and exploring ideas, but they're not
| autonomous coders. The key is you staying in charge: detailed
| prompts, critical code review, iterative refinement. Going back
| to web interfaces and manual pasting because editor integration
| felt "too easy" is a massive overcorrection. It's like ditching
| cars for walking after one fender bender.
|
| Ultimately, this wasn't an AI failure, it was an inexperienced
| user expecting too much, too fast. The "lessons learned" are
| valid, but not AI-specific. For those who use LLMs effectively,
| they're force multipliers, not replacements. Don't blame the tool
| for user error. Learn to drive it properly.
| cruffle_duffle wrote:
| "Experienced folks" in this case means folks who've used LLM's
| enough to somewhat understand how to "feed them" in ways that
| make the tools generate productive output.
|
| Learning to properly prompt an LLM to get a net gain in value
| is a skill in and of itself.
| latexr wrote:
| > We're not expecting a co-pilot
|
| Microsoft's offering is literally called "copilot". That is
| exactly what they're marketing it as.
| potsandpans wrote:
| Counterexample: I've been able to complete more side projects in
| the last month leveraging LLMs than I have ever in my life. One
| of which I believe to have potential as a viable product, and
| another which involved complicated rust `no_std` and linker setup
| for compiling rust code onto bare metal RISCV from scratch.
|
| I think the key to being successful here is to realize that
| you're still at the wheel as an engineer. The LLM is there to
| rapidly synthesize the universe of information.
|
| You still need to 1) have solid fundamentals in order to have an
| intuition against that synthesis, and 2) be experienced enough to
| translate that synthesis into actionable outcomes.
|
| If you're lacking in either, you're at the same whims of copypasta
| that have always existed.
| mythrwy wrote:
| I've had both experiences strangely enough.
| talldayo wrote:
| > which involved complicated rust `no_std` and linker setup for
| compiling rust code onto bare metal RISCV from scratch.
|
| That's complicated, but I wouldn't say the resulting software
| is complex. You gave an LLM a repetitive, translation-based
| job, and you got good results back. I can also believe that an
| LLM could write up a dopey SAAS in half the time it would take
| a human to do the same.
|
| But having the right parameters only takes you _so_ far. Once
| you click generate, you are trusting that the model has some
| familiarity with your problem and can guide you without needing
| assistance. Most people I've seen rely entirely on linting and
| runtime errors to debug AI code, not "solid fundamentals" that
| can fact-check a problem they needed ChatGPT to solve in the
| first place. And the "experience" required to iterate and deploy AI-
| generated code basically boils down to your copy-and-paste
| skills. I like my UNIX knowledge, but it's not a big enough
| gate to keep out ChatGPT Andy and his cohort of enthusiastic
| morons.
|
| We're going to see thousands of AI-assisted success stories
| come out of this. But we already had those "pennies on the
| dollar" success stories from hiring underpaid workers out of
| India and Pakistan. AI will not solve the unsolved problems of
| our industry and in many ways it will exacerbate the
| preexisting issues.
| baxtr wrote:
| Is it reasonable to assume that more senior devs benefit more
| from LLMs?
| dogma1138 wrote:
| It depends. I think it's less about how senior they are and more
| about how good they are at writing requirements, and knowing what
| directives should be explicitly stated and what can be safely
| inferred.
|
| Basically if they are good at utilizing junior developers and
| interns or apprentices they probably will do well with an LLM
| assistant.
| dogma1138 wrote:
| Indeed, LLMs are useful as an intern; they are at the "cocky
| grad" stage of their careers. If you don't understand the
| problem, can't steer the solution, and, worse, have only a
| limited understanding of the code they produce, you are unlikely
| to be productive.
|
| On the other hand if you understand what needs to be done, and
| how to direct the work, the productivity boost can be massive.
|
| Claude 3.5 Sonnet and o1 are awesome at code generation even
| with relatively complex tasks, and they have long enough
| context and attention windows that the code they produce even
| on relatively large projects can be consistent.
|
| I also found a useful method of using LLMs to "summarize" code
| in an instructive manner which can be used for future prompts.
| For example summarizing a large base class that may be reused
| in multiple other classes can be more effective than having to
| overload a large part of your context window with a bunch of
| code.
| nsavage wrote:
| Funny enough, I posted an article I wrote here yesterday with the
| same sort of thesis. Different technologies (mine was Docker) but
| same idea of an LLM leading me astray and causing a lot of
| frustration.
| powerset wrote:
| I've had a similar experience, shipping new features at
| incredible speed, then wasting a ton of time going down the wrong
| track trying to debug something because the LLM gave me a
| confidently wrong solution.
| williamcotton wrote:
| Well that's kind of on you for not noticing that it was the
| wrong solution, isn't it?
| cruffle_duffle wrote:
| I think what the parent describes has happened to everybody,
| and if it hasn't, it will.
|
| The line between being actually more productive and just
| "pretend productive" using large language models is something
| that we all haven't completely figured out yet.
| trinix912 wrote:
| Sometimes the solution is 99% correct but the other 1% is so
| subtly wrong that it both doesn't work and is a debugging
| hell.
| mythrwy wrote:
| Ya, but you kind of get painted into a corner sometimes. And
| the sunk cost fallacy kicks in.
| nyarlathotep_ wrote:
| often it's something you casually overlook, some minor
| implementation detail that you didn't give much thought to
| that ends up being a huge mess later on, IME
| KronisLV wrote:
| In my experience LLMs will help you with things that have been
| solved thousands of times before and are just a matter of finding
| some easily researched solution.
|
| The very moment when you try to go off the beaten path and do
| something unconventional or stuff that most people won't have
| written a lot about, it gets more tricky. Just consider how many
| people will know how to configure some middleware in a Node.js
| project... vs most things related to hardware or low level work.
| Or even working with complex legacy codebases that have bits of
| code with obscure ways of interacting and more levels of
| abstraction than can reasonably be put in context.
|
| Then again, if an LLM gets confused, then a person might as well.
| So, personally I try to write code that'd be understandable by
| juniors and LLMs alike.
| winocm wrote:
| In my experience, an LLM decided not to know the type alignment
| rules in C and confidently trotted out the wrong answer. It
| left a horrible taste in my mouth the one time I decided to
| look at using an LLM for anything, and it keeps leaving me
| wondering if I'd end up spending more time bashing the LLM into
| working than just working out the answer myself and learning
| the underlying reasoning why.
|
| It was so wrong that I wonder what version of the C standard it
| was even hallucinating.
| NitpickLawyer wrote:
| > vs most things related to hardware or low level work.
|
| Counterpoint:
|
| https://github.com/ggerganov/llama.cpp/pull/11453
|
| > This PR provides a big jump in speed for WASM by leveraging
| SIMD instructions for qX_K_q8_K and qX_0_q8_0 dot product
| functions.
|
| > Surprisingly, 99% of the code in this PR is written by
| DeepSeek-R1. The only thing I do is to develop tests and write
| prompts (with some trials and errors)
| alfalfasprout wrote:
| A single PR doesn't really "prove" anything. Optimization
| passes on well-tested narrowly scoped code are something that
| LLMs are already pretty good at.
| ryandrake wrote:
| I use CoPilot pretty much as a smarter autocomplete that can
| sometimes guess what I'm planning to type next. I find it's not
| so good at answering prompts, but if I type:
|
|     r = (rgba >> 24) & 0xff;
|
| ...and then pause, it's pretty good at guessing:
|
|     g = (rgba >> 16) & 0xff;
|     b = (rgba >> 8) & 0xff;
|     a = rgba & 0xff;
|
| ... for the next few lines. I don't really ask it to do more
| heavy lifting than that sort of thing. Certainly nothing like
| "Write this full app for me with these requirements [...]"
| anaisbetts wrote:
| Context matters a lot, copy-pasting snippets to a webpage is
| _way_ less effective than Cursor/Windsurf.
| cmdtab wrote:
| Today, I needed to write a proxy[0] that wraps an object and
| logs all method calls recursively.
|
| I asked Claude to write the initial version. It came up with a
| complicated class-based solution. I spent more than 30 minutes
| trying to get a good abstraction to come out. I was copy-pasting
| TypeScript errors and applying fixes it suggested without
| thinking much.
|
| In the end, I gave up and wrote what I wanted myself in 5
| minutes.
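|
| For illustration, a minimal sketch of that kind of recursive
| logging proxy in TypeScript (a hypothetical version, not the
| exact code in the linked repo):
|
|     // Wrap an object so every method call, including calls on
|     // nested objects, is logged before being forwarded.
|     function withLogging<T extends object>(target: T, path = ""): T {
|       return new Proxy(target, {
|         get(obj, prop, receiver) {
|           const value = Reflect.get(obj, prop, receiver);
|           const name = path ? `${path}.${String(prop)}` : String(prop);
|           if (typeof value === "function") {
|             return (...args: unknown[]) => {
|               console.log(`call ${name}`, args);
|               return value.apply(obj, args);
|             };
|           }
|           if (value !== null && typeof value === "object") {
|             // Recurse so method calls on nested objects are logged too.
|             return withLogging(value, name);
|           }
|           return value;
|         },
|       });
|     }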
|
| [0] https://github.com/cloudycotton/browser-
| operator/blob/main/s...
| transcriptase wrote:
| I've been able to do far more complex things with ESP32s and RPis
| in an evening without knowing the first thing about Python or
| C++.
|
| I can also tell when it's stuck in some kind of context swamp and
| won't be any more help, because it will just keep making the same
| stupid mistakes over and over and generally forgetting past
| instructions.
|
| At that point I take the last working code and paste it into a
| new chat.
| thot_experiment wrote:
| As opposed to not trusting an LLM, and ending up on day 4 of an
| afternoon project? :P
|
| I've been doing that since way before LLMs were a thing.
| ravroid wrote:
| I've found LLMs most useful for spinning up prototypes. But I'm
| able to offload fewer tasks to the LLM as the project grows in
| complexity and size.
|
| One strategy I've been experimenting with is maintaining a 'spec'
| document, outlining all of the features and any relevant
| technical notes. I include the spec with all of the relevant
| source files in my prompt before asking the LLM to implement a
| new change or feature. This way it doesn't have to do as much
| guessing as to what my code is doing, and I can avoid relying on
| long-running conversations to maintain context. Instead, for each
| big change I include an up-to-date spec and all of the relevant
| source files.
|
| I use an NPM script to automate concatenating the spec + source
| files + prompt, which I then copy/paste to o1. So far this has
| been working somewhat reliably for the early stages of a project
| but has diminishing returns.
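|
| A minimal sketch of the kind of concatenation script I mean, in
| TypeScript (the file names here are made up):
|
|     // build-prompt.ts: concatenate the spec, the relevant source
|     // files, and the task description into one paste-able prompt.
|     import { readFileSync } from "node:fs";
|
|     const files = ["SPEC.md", "src/index.ts", "src/api.ts"];
|     const task = process.argv.slice(2).join(" ");
|
|     const sections = files.map(
|       (f) => `--- ${f} ---\n${readFileSync(f, "utf8")}`
|     );
|     console.log([...sections, `--- TASK ---\n${task}`].join("\n\n"));
|
| Run it with the task as an argument, pipe the output to the
| clipboard, and paste the whole thing into o1.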
| mordymoop wrote:
| I am frankly tired of seeing this kind of post on HN. I feel like
| the population of programmers is bifurcating into those who are
| committed to mastering these tools, learning to work around their
| limitations and working to leverage their strengths... and those
| who are committed to complaining about how they aren't already
| perfect Culture Ship Minds.
|
| We get it. They're not superintelligent at everything yet. They
| couldn't infer what you must've really meant in your heart from
| your initial unskillful prompt. They couldn't foresee every
| possible bug and edge case from the first moment of
| conceptualizing the design, a flaw which I'm sure you don't have.
|
| The thing that pushes me over the line into ranting territory is
| that _computer programmers_ , of all people, should know that
| _computers do what you tell them to._
___________________________________________________________________
(page generated 2025-01-27 23:00 UTC)