[HN Gopher] How I program with LLMs
       ___________________________________________________________________
        
       How I program with LLMs
        
       Author : stpn
       Score  : 834 points
       Date   : 2025-01-07 00:07 UTC (1 day ago)
        
 (HTM) web link (crawshaw.io)
 (TXT) w3m dump (crawshaw.io)
        
       | mlepath wrote:
       | The first rule of programming with LLMs is: don't use them for
       | anything you don't know how to do. If you can look at the
       | solution and immediately know what's wrong with it, they're a
       | time saver; otherwise...
       | 
       | I find chat for search is really helpful (as the article states)
        
         | qianli_cs wrote:
         | Exactly, you have to (vaguely) know what you're looking for and
         | have some basic ideas of what algorithms would work. AI is good
         | at helping with syntax stuff but not really good at thinking.
        
         | itsgrimetime wrote:
         | IMO this is a bad take. I use LLMs for things I don't know how
         | to do myself all the time. Now, I wouldn't use one to write
         | some new crypto functions because the risk associated with
         | getting it wrong is huge, but if I need to write something like
         | a wrapper around some cloud provider SDK that I'm unfamiliar
         | with, it gets me 90% of the way there. It also is way more
         | likely to know at least _some_ of the best practices where I'll
         | likely know none. Even for more complex things getting some
         | working hello world examples from an LLM gives me way more
         | threads to pull on and research than web searching ever has.
        
           | Retr0id wrote:
           | > if I need to write something like a wrapper around some
           | cloud provider SDK that I'm unfamiliar with
           | 
           | But "writing a wrapper" is (presumably) a process you're
           | familiar with, you can tell if it's going off the rails.
        
             | joemazerino wrote:
             | Writing a wrapper is easier to verify because of the
             | context of the API or SDK you're wrapping. Seems wrong?
             | Check the docs. Doesn't work? Curl it yourself.
        
           | Barrin92 wrote:
           | >It also is way more likely to know at least _some_ of the
           | best practices
           | 
           | What's way more likely to know the best practices is the
           | documentation. A few months ago there was a post that made
           | the rounds about how the Arc browser introduced a really
           | severe security flaw by misconfiguring their Firebase ACLs
           | despite the fact that the correct way to configure them is
           | outlined in the docs.
           | 
           | This to me is the sort of thing (although maybe not
           | necessarily in this case) that comes out of LLM programming.
           | 90% isn't good enough; it's the same as Stack Overflow
           | pasting. If you're a serious engineer and you are unsure
           | about something, it is your task to go to the reference
           | material, or at some point you're introducing bugs like this.
           | 
           | In our profession it's not just crypto libraries, one
           | misconfigured line in a yaml file can mean causing millions
           | of dollars of damage or leaking people's most private
           | information. That can't be tackled with a black box chatbot
           | that may or may not be accurate.
        
           | zmmmmm wrote:
           | > write something like a wrapper around some cloud provider
           | SDK that I'm unfamiliar with
           | 
           | you're equating "unfamiliar" with "don't know how to do", but
           | I will claim you _do_ know how to do it; you would just be
           | slow because you have to reference documentation and learn
           | which functions do what.
        
         | photon_collider wrote:
         | "Trust but verify" is still useful especially when you ask LLMs
         | to do stuff you don't know. I've used LLMs to help me get
         | started on tasks where I wasn't even sure of what a solution
         | was. I would then inspect the code and review any relevant
         | documentation to see if the proposed solution would work. This
         | has been time consuming but I've learned a lot regardless.
        
         | IanCal wrote:
         | That seems like a wild restriction.
         | 
         | You can give them more latitude for things you know how to
         | _check_.
         | 
         | I didn't know how to set up the right gnarly TypeScript
         | generic type to solve my problem, but I could easily verify it
         | was correct.
        
           | fastball wrote:
           | If you don't understand what the generic is doing, there
           | might be edge-cases you don't appreciate. I think Typescript
           | types are fairly non-essential so it doesn't really matter,
           | but for more important business logic it definitely can make
           | a difference.
        
             | IanCal wrote:
             | I understand what it's doing, and could easily set out the
             | cases I needed.
        
               | fastball wrote:
               | If you understand what it is doing, you could do it
               | yourself, surely?
        
               | IanCal wrote:
               | Have you never understood the solution to a puzzle much
               | more easily than solving it yourself? I feel there's
               | literally a huge branch of mathematics dedicated to the
               | difference between _finding_ and _validating_ a solution.
               | 
               | More specifically, I didn't know how to solve it, though
               | obviously could have spent much more time and learned.
               | There were only a small number of possible cases, but I
               | needed certain ones to work and others not to. I was
               | easily able to create the examples but not find the
               | solution. By looping through Claude I could solve it in
               | a few minutes. I then got an explanation, could read the
               | relevant docs, and felt satisfied that everything passed
               | not only the automated checks but my own reasoning as
               | well.
        
           | kccqzy wrote:
           | If you merely know how to check, would you also know how to
           | _fix_ it after you find that it's wrong?
           | 
           | If you are lucky to have the LLM fix it for you, great. If
           | you don't know how to fix it yourself and the LLM doesn't
           | either, you've just wasted a lot of time.
        
             | IanCal wrote:
             | It did fix it; I iterated, passing in the type and linter
             | errors, until it passed all the requirements I had.
             | 
             | > If you merely know how to check, would you also know how
             | to fix it after you find that it's wrong?
             | 
             | Probably? I'm capable of reading documentation, learning
             | and asking others.
             | 
             | > If you don't know how to fix it yourself and the LLM
             | doesn't either, you've just wasted a lot of time.
             | 
             | You may be surprised by how little time, but regardless it
             | would have taken more time to hit that point without the
             | tool.
             | 
             | Also sometimes things don't work out, that's OK. As long as
             | overall it improves work, that's all we need.
        
         | kamaal wrote:
         | >> If you can look at the solution and immediately know what's
         | wrong with it, they are a time saver otherwise...
         | 
         | Indeed getting good at writing code using LLMs demands being
         | very good at reading code.
         | 
         | To that extent it's more like blitz chess than autocomplete. You
         | need to think and verify in trees as it goes.
        
         | billmcneale wrote:
         | That's the wrong approach.
         | 
         | I use chat for things I don't know how to do all the time. I
         | might not know how to do it, but I sure know how to test that
         | what I'm being told is correct. And as long as it's not, I
         | iterate with the chat bot.
        
           | WhiteNoiz3 wrote:
           | A better way to phrase it might be don't use it for something
           | that you aren't able to verify or validate.
        
             | sdesol wrote:
             | I agree with this. I keep harping on this, but we are sold
             | automation instead of a power tool. If you have domain
             | knowledge in the problem that you are solving, then LLMs
             | can become an extremely valuable aid.
        
           | bityard wrote:
           | I feel like that's a good option ONLY if the code you are
           | writing will never be deployed to an environment where
           | security is a concern. Many security bugs in code are
           | notoriously difficult to spot and even frequently slip
           | through reviews from humans who are actively looking for
           | exactly those kinds of bugs.
           | 
           | I suppose we could ask the question: Are LLMs better at
           | writing secure code than humans? I'll admit I don't know the
           | answer to that, but given what we know so far, I seriously
           | doubt it.
        
           | zmmmmm wrote:
           | I think it's just a broader definition of "know how to do".
           | If you can write a test for it then I'm going to argue you
           | know "how" to do it in a bigger picture sense. As in, you
           | understand the requirements and inherent underlying technical
           | challenges behind what you are asking to be done.
           | 
           | The issue is, there are always subtle aspects to problems
           | that most developers only know by instinct. Like, "how is it
           | doing the unicode conversion here" or "what about the case
           | when the buffer is exactly the same size as the message, is
           | there room for the terminating character?". You need the
           | instincts for these to properly construct tests and review
           | the code it produced. If you do have those instincts, I argue
           | you _could_ write the code; it's just a lot of effort. But if
           | you _don't_, I will argue you can't test it either and can't
           | use LLMs to produce (at least) professional-level code.
        
         | j45 wrote:
         | You can ask the LLM to teach it to you step by step, and then
         | validate it by doing it yourself as you go. That's still
         | quicker than skipping the learning and not knowing how to debug
         | it.
         | 
         | Learning how something works is critical or it's far worse than
         | technical debt.
        
           | lelandfe wrote:
           | Yes, I have a friend learning their first programming
           | language with much assistance from ChatGPT and it's actually
           | going really well.
        
             | j45 wrote:
             | Awesome, I wish more people knew about this instead of
             | trying to do a magic Harry Potter single prompt that does
             | everything.
        
         | turnsout wrote:
         | I completely agree. In graphics programming, I love having it
         | do things that are annoying but easy to verify (like setting up
         | frame buffers in WebGL). I also ask it to do more ambitious
         | things like implementing an algorithm in shader code, and it
         | will sometimes give a result that is mostly correct but subtly
         | wrong. I've only been able to catch those subtle errors because
         | I know what to look for.
        
         | tnvmadhav wrote:
         | I'd like to rephrase as, "don't deploy LLM generated code if
         | you don't know how it works (or what it does)"
         | 
         | This means it's okay to use an LLM to try something new that
         | you're on the fence about. Learn it, and then once you've
         | learned that concept or idea, you can go ahead and use the same
         | code if it's good enough.
        
           | JKCalhoun wrote:
           | "don't deploy LLM generated code if you don't know how it
           | works (or what it does)"
           | 
           | (Which goes for StackOverflow, etc.)
        
             | switchbak wrote:
             | I've seen a whole flurry of reverts due to exactly this.
             | I've also dabbled in trusting it a little too much, and had
             | the expected pain.
             | 
             | I'm still learning where it's usable and where I'm over-
             | reaching. At present I'm at about break-even on time spent,
             | which bodes well for the next few years as they iron out
             | some of the more obvious issues.
        
         | staticautomatic wrote:
         | My experience is the opposite. I find them most valuable for
         | helping me do things that would be extremely hard or impossible
         | for me to figure out. To wit, I just used one to decode a
         | pagination cursor format and write a function that takes a
         | datetime and generates a valid cursor. Ain't nobody got time
         | for that.
        
         | ignoramous wrote:
         | > _... don't use them for anything you don't know how to do
         | ... I find chat for search is really helpful (as the article
         | states)_
         | 
         | Not really. I often use chat to understand codebases. Instead
         | of trying to navigate mature, large-ish FOSS projects (like,
         | say, the _Android Run Time_) by looking at them file by file,
         | method by method, field by field (all too laborious), I just
         | ask ... _Copilot_. It is way, way faster than I am and is
         | mostly _directionally_ correct in its answers.
        
         | logicchains wrote:
         | Don't use them for anything you don't know how to test. If you
         | can write unit tests you understand and it passes them all (or
         | visually inspect/test a GUI it generated), you know it's doing
         | well.
        
         | SkyBelow wrote:
         | How you use the LLM matters.
         | 
         | Having an LLM do something for you that you don't know how to
         | do is asking for trouble. An expert can likely offload a few
         | things that aren't all that important, but any junior is going
         | to dig themselves into a significant hole with this technique.
         | 
         | But asking an LLM to help you learn how to do something is
         | often an option. Can't one just learn it using other resources?
         | Of course. LLMs shouldn't be a must have. If at any point you
         | have to depend upon the LLM, that is a red flag. It should be a
         | possible tool, used when it saves time, but swapped for other
         | options when they make sense.
         | 
         | For an example, I had a library I was new to and asked copilot
         | how to do some specific task. It gave me the options. I used
         | this output to go to google and find the matching documentation
         | and gave it a read. I then went back to copilot and wrote up my
         | understanding of what the documentation said and checked to see
         | if copilot had anything to add.
         | 
         | Could I have just read the entire documentation? That is an
         | option, but one that costs more time in exchange for deeper
         | expertise.
         | Sometimes that is the option to go with, but in this case
         | having a more shallow knowledge to get a proof of concept
         | thrown together fit my situation better.
         | 
         | Anyone just copying an AI's output and putting it in a PR
         | without understanding what it does? That's asking for trouble
         | and it will come back to bite them.
        
       | justatdotin wrote:
       | lots of colleagues using copilot or whatever for autocomplete - I
       | just find that annoying.
       | 
       | or writing tests - that's ... not so helpful. worst is when a
       | lazy dev takes the generated tests and leaves it at that: usually
       | just a few placeholders that test the happy path but ignore
       | obvious corner cases. (I suppose for API tests that comes down to
       | adding test case parameters)
       | 
       | but chatting about a large codebase, I've been amazed at how
       | helpful it can be.
       | 
       | what software patterns can you see in this repo? how does the
       | implementation compare to others in the organisation? what common
       | features of the pattern are missing?
       | 
       | also, like a linter on steroids, chat can help explore how my
       | project might be refactored to better match the organisation's
       | coding style.
        
         | roskilli wrote:
         | If you don't mind me asking: which popular LLM(s) have you been
         | using for this and how are you providing the code base into the
         | context window?
        
           | fragmede wrote:
           | Not OP but Aider provides a repo map to the LLM as context,
           | which consists of the directory tree, filenames, and
           | important symbols in each file. It can use the popular LLMs
           | as well as Ollama.
           | 
           | https://aider.chat/docs/repomap.html
           | 
           | Aider hosts a leaderboard that rates LLMs on performance,
           | including a section on refactoring.
           | 
           | https://aider.chat/docs/leaderboards/refactor.html
        
             | Zambyte wrote:
             | AI generated images _can_ be good, and even reasonable to
             | use for branding. Slapping an image right at the top of the
             | page that says  "Abstract Synxex Tree" with a meaningless
             | graph and an absolutely expressionless and useless humanoid
             | robot is a great way to immediately lose my interest in
             | anything they have to say though. The homepage would be
             | more interesting as a wall of text.
        
               | klibertp wrote:
               | Agreed, mostly, but this is not a homepage. On the
               | homepage, there's a video demo and a wall of text
               | (https://aider.chat/). Still, that Synxex Tree should
               | disappear :)
        
       | wrs wrote:
       | I've been working with Cursor's agent mode a lot this week and am
       | seeing where we need a new kind of tool. Because it sees the
       | whole codebase, the agent will quickly get into a state where
       | it's changed several files to implement some layering or refactor
       | something. This requires a response from the developer that's
       | sort of like a code review, in that you need to see changes and
       | make comments across multiple files, but unlike a code review,
       | it's not finished code. It probably doesn't compile, big chunks
       | of it are not quite what you want, it's not structured into
       | coherent changesets...it's kind of like you gave the intern the
       | problem and they submitted a bit of a mess. It would be a
       | terrible PR, but it's a useful intermediate state to take another
       | step from.
       | 
       | It feels like the IDE needs a new mode to deal with this state,
       | and that SCM needs to be involved somehow too. Somehow help the
       | developer guide this somewhat flaky stream of edits and sculpt it
       | into a good changeset.
        
         | fragmede wrote:
         | Aider commits to git with each command, making it easy to back
         | out changes, and also squash them into discrete chunks later
         | (and reorder them with interactive rebase).
        
           | golergka wrote:
           | Automatically runs linter and tests on every edit and
           | forwards failures back to LLM as well.
        
         | Aeolun wrote:
         | I think the full agent mode context is actually often hard to
         | see, but there's a list somewhere. The list of files in your
         | chat dialog is not the full context (it adds open files too). I
         | find that if I reduce the context size Cursor gives me much
         | better results.
        
       | User23 wrote:
       | LLMs are, at their core, search tools. Training is indexing and
       | prompting is querying that index. The granularity being at the
       | n-gram rather than the document level is a huge deal though.
       | 
       | Properly using them requires understanding that. And just like we
       | understand every query won't find what we want, neither will
       | every prompt. Iterative refinement is virtually required for
       | nontrivial cases. Automating that process, like eg cursor agent,
       | is very promising.
        
         | IanCal wrote:
         | Half of the problems are people treating them as searchers when
         | they aren't. They're absolutely not ngram indexes of existing
         | data, either.
        
         | mvdtnz wrote:
         | I'm losing track of the number of different things the Hacker
         | News commenters claim LLMs are "at their core".
        
           | bitwize wrote:
           | LLMs are, at their core, _fucking Dissociated Press_. That's
           | what makes them fun and interesting, and that's the problem
           | with using them for real production work.
        
           | sulam wrote:
           | Isn't this answer obvious/facile but also true? They're next
           | token predictors.
        
         | sdesol wrote:
         | > LLMs are, at their core, search tools.
         | 
         | This is the wrong take. Search tools are deterministic unless
         | you purposely inject random weights into the ranking. With
         | search tools, the same search query will always yield the same
         | search result, provided they are designed to and/or the
         | underlying data has not changed.
         | 
         | With LLMs, I can ask the exact same question and get a
         | different response, even if the data has not changed.
        
           | Scene_Cast2 wrote:
           | The randomness comes from sampling. With local LLMs, you can
           | fix the random seed, or even disable sampling altogether -
           | both will get you determinism.
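           | A rough sketch of that, using the Hugging Face transformers
           | library with GPT-2 purely as a stand-in for whatever local
           | model you run (prompt and names here are just illustrative):
           | 
           |     import torch
           |     from transformers import AutoModelForCausalLM, AutoTokenizer
           |     
           |     torch.manual_seed(0)  # fixed seed, for when sampling is on
           |     tok = AutoTokenizer.from_pretrained("gpt2")
           |     model = AutoModelForCausalLM.from_pretrained("gpt2")
           |     prompt = tok("def reverse_string(s):", return_tensors="pt")
           |     # do_sample=False disables sampling entirely (greedy
           |     # decoding), so repeated runs give the same completion.
           |     out = model.generate(**prompt, max_new_tokens=40,
           |                          do_sample=False)
           |     print(tok.decode(out[0]))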
           | 
           | I agree that LLMs are not search tools, but for very
           | different reasons.
        
             | klabb3 wrote:
             | Semantics. It may be possible to make it deterministic, but
             | it's _unstable_ wrt unrelated changes in the training data,
             | no?
             | If I add a page about sausages to a search index, the
             | results for "ski jacket" will be unaffected. In a practical
             | sense, LLMs are non-deterministic. I mean, ChatGPT even has
             | a "regenerate" button to expose this "turbulence" as a
             | feature.
        
               | User23 wrote:
               | Hence n-grams rather than documents.
               | 
               | Also what's with using "semantics" as a dismissal when
               | the technology we're talking about is the most
               | semantically relevant search ever made.
        
             | sdesol wrote:
             | Thanks for the info on local LLMs. Based on my chats with
             | multiple LLMs, the biggest issue appears to be hardware.
             | 
             | Non-deterministic hardware: All LLMs mentioned that modern
             | computing hardware, such as GPUs or TPUs, can introduce
             | non-determinism due to factors like parallel processing,
             | caching, or numerical instability. This can make it
             | challenging to achieve determinism, even with fixed random
             | seeds or deterministic algorithms.
             | 
             | You can find the summary of my chats https://beta.gitsense.
             | com/?chat=1c3e69f9-7b8b-48a3-8b99-bb1b.... If you scroll to
             | the top and click on the "Conversation" link in the first
             | message, you can read the individual responses.
        
         | jcranmer wrote:
         | > LLMs are, at their core, search tools.
         | 
         | Fundamentally, _no they're not_. That is why you have cases
         | like the Air Canada chatbot that told a user about a refund
         | opportunity that didn't exist, or the lawyer in Mata v Avianca
         | who cited a case that didn't exist. If you ask an LLM to search
         | for something that doesn't exist, there's a decent chance it
         | will hallucinate something into existence for you.
         | 
         | What LLMs are good at is effectively turning fuzzy search terms
         | into non-fuzzy terms; they're also pretty good at taking some
         | text and recasting it into an extremely formulaic paradigm. In
         | other words, turning unstructured text into something
         | structured. The problem they have is that they don't have
         | enough understanding of the world to do something useful with
         | that structured representation when it needs to be accurate.
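         | A toy sketch of the "unstructured into structured" part (model
         | name, prompt, and input are just examples, using the openai
         | Python client; it assumes the model returns bare JSON):
         | 
         |     import json
         |     from openai import OpenAI
         |     
         |     client = OpenAI()  # assumes OPENAI_API_KEY is set
         |     note = "Met Dana on Tuesday about invoice 4521, wants net-30."
         |     resp = client.chat.completions.create(
         |         model="gpt-4o-mini",  # example model name
         |         messages=[{
         |             "role": "user",
         |             "content": "Return only JSON with keys person, day, "
         |                        "invoice, terms extracted from: " + note,
         |         }],
         |     )
         |     # Works when the reply is bare JSON; real code would validate.
         |     print(json.loads(resp.choices[0].message.content))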
        
       | notjoemama wrote:
       | Our company has a no AI use policy. The assumption is zero trust.
       | We simply can't know whether a model or its framework could or
       | would send proprietary code outside the network. So it's best to
       | assume all LLMs/AI are or will be sending code or fragments of
       | code.
       | While I applaud the incredible work by their creators, I'm not
       | sure how a responsible enterprise class company could rely on
       | "trust us bro" EULAs or repo readmes.
        
         | codebje wrote:
         | The same way responsible enterprise class companies rely on
         | "trust us bro" EULAs for financial systems, customer databases,
         | payroll, and all the other systems it would be very expensive
         | and error prone to build custom for every business.
        
           | ryanobjc wrote:
           | Pretty much this.
           | 
           | OpenAI poisoned the well badly with their "we train off your
           | chats" nonsense.
           | 
           | If you are using any API service, or any enterprise ChatGPT
           | plan, your tokens are not being logged and recycled into new
           | training data.
           | 
           | As for why trust them? Like the parent said: EULAs. Large
           | companies trust EULAs and terms of service for every single
           | SAAS product they use, and they use tons and tons of them.
           | 
         | OpenAI, in a clumsy attempt to create a regulatory moat by
         | doing sketchy shit and waving wild "AI will kill us all"
         | nonsense, has created a situation where the usefulness of
         | these transformative generative solutions is automatically
         | rejected by many.
        
         | pama wrote:
       | Your company could locally host LLMs; you won't get ChatGPT or
         | Claude quality, but you can get something that would have been
         | SOTA a year ago. You can vet the public inference codebases
         | (they are only of moderate complexity), and you control your
         | own firewalls.
        
           | CubsFan1060 wrote:
           | You can run Claude on both AWS and Google Cloud. I'm fairly
           | certain they don't share data, but would need to verify to be
           | sure.
        
             | evilduck wrote:
             | You can also run Llama 405B and the latest (huge) DeepSeek
             | on your own hardware and get LLMs that trade blows with
             | Claude and ChatGPT, while being fully isolated and offline
             | if needed.
        
               | krembo wrote:
               | With Amazon Bedrock you can get an isolated serverless
               | Claude or llama with a few clicks
        
               | evilduck wrote:
               | True, but if your org is super paranoid about data
               | exfiltration you're probably not sending it to AWS
               | either.
        
           | Kostchei wrote:
           | You can get standalone/isolated versions of chatGPT, if your
           | org is large enough, in partnership with OpenAI. And others.
           | They run on the same infra but in accounts you set up, cost
           | the same, but you have visibility on the compute, and control
         | of data exfil - ie, there is none.
        
         | j45 wrote:
         | Local LLMs for code aren't that out of the question to run.
         | 
       | Even if not for code generation, smaller models can still be
       | useful for weighing different design approaches, etc.
        
         | attentive wrote:
         | So, you're asking how enterprise class companies are using
         | github for repos and gmail for all the enterprise mail? What's
         | next, zoom/teams for meetings?
        
         | lazybreather wrote:
         | Palo Alto networks provides security product "AI access
         | security" which claims to solve the problem you mentioned -
         | access control, data protection etc. I don't personally use it
         | neither does my org. Giving here just in case it is useful for
         | someone.
        
         | BBosco wrote:
         | The vast majority of fortune 500's have legal frameworks up for
         | dealing with internal AI use already because the reality is
         | employees are going to use it regardless of internal policy.
         | Assuming every employee will act in good faith just because a
         | blanket AI ban is in place is extremely optimistic at best, and
         | isn't a good substitute for actual understanding.
        
           | sulam wrote:
         | Internal policies at these companies rarely rely on the level
         | of faith you're implying. Instead, external access
           | to systems is logged, internal systems are often sandboxed or
           | otherwise constrained in how you interact with them, and
           | anything that looks like exfiltration sets off enough alarms
           | to have your manager talking to you that same day, if not
           | that same hour.
        
         | Pyxl101 wrote:
         | Just curious, how does your company host its email? Documents?
         | Files?
        
         | janalsncm wrote:
         | You can run pretty decent models on your laptop these days.
         | Works in airplane mode.
         | 
         | https://ollama.com/
        
         | golergka wrote:
         | What's the realistic attack scenario? Will Sam Altman steal
         | your company's code? Or will next version of GPT learn on your
         | secret sauce algorithms and then your competitors will get them
         | when they generate code for their tasks and your company loses
         | its competitive advantage?
         | 
         | I'm actually sure that there are companies for which these
         | scenarios are very real. But I don't think there's a lot of
         | them. Most of the code our industry works on has very little
         | value outside of context of particular product and company.
        
           | cudgy wrote:
           | So why bother securing anything at all if not willing to
           | secure the raisons d'etre? Doesn't that suggest that these
           | companies are trivial entities?
        
             | golergka wrote:
             | There are plenty of very realistic attack scenarios, that's
             | why we secure stuff.
        
         | Aeolun wrote:
         | I mean, we host our code on Github. What are they going to do
         | with Copilot code snippets?
        
         | mbesto wrote:
         | > proprietary code outside the network
         | 
         | Thought exercise: what would seriously happen if you did let
         | some of your proprietary code outside your network? Oddly
         | enough, 75% of the people writing code on HN probably have
         | their company's code stored in GitHub. So there already is an
         | inherent trust factor with GH/MSFT.
         | 
         | As another anecdote - Twitch's source code got leaked a few
         | years back. Did Twitch lose business because of it?
        
           | aulin wrote:
           | > Thought exercise: what would seriously happen if you did
           | let some of your proprietary code outside your network
           | 
           | Lawsuits? Lawful terminations? Financial damages?
        
             | mbesto wrote:
             | Huh? No, I'm asking what potential damage the organization
             | faces - not the individual who may leak data outside your
             | network.
        
               | aulin wrote:
               | Those are risks both for the individual and for the
               | company when there are contracts in place with third
               | parties involving code sharing.
               | 
               | Other risks include leaking industrial secrets that may
               | significantly damage company business or benefit
               | competitors.
        
               | klibertp wrote:
               | Please acknowledge that your situation is pretty unique.
               | Just take a look at the comments: how many people say, or
               | outright presume, that _their company's_ code is already
               | on GitHub? I'd wager that your org _doesn't_ keep code
               | at a 3rd party provider, right? Then, you're in a
               | minority.
               | 
               | I don't mean to dismiss your concerns - in your
               | situation, they are probably warranted - I just wanted to
               | say that they are unique and not necessarily shared by
               | people who don't share your circumstances.
        
               | aulin wrote:
               | This subthread started with someone from a no-AI-policy
               | company, and people are dismissing it with snarky
               | comments, along the lines of your code is not as
               | important as you believe. I'm just trying to show a
               | different picture; we work in a pretty vast field and
               | people commenting here don't necessarily represent a
               | valid sample.
        
               | klibertp wrote:
               | > people are dismissing it with snarky comments, along
               | the lines of your code is not as important as you believe.
               | 
               | That says more about those people than about your/OP's
               | code :)
               | 
               | Personally, I had a few collisions with regulation and
               | compliance over the years, so I can appreciate the
               | completely different mindset you need when working with
               | them. On the other hand, at my current position, not only
               | do we have everything on Github, but there were also
               | instances where I was tasked with mirroring everything to
               | bitbucket! (For code escrow... i.e., if we go out of
               | business, our customer will get access to the mirrored
               | code.)
               | 
               | > people commenting here don't necessarily represent a
               | valid sample.
               | 
               | Right. I should have said that you're in the minority
               | _here_. I'm not sure what the ratio of dumb CRUD apps to
               | "serious business" development is in the wild. I
               | know there are whole programming subfields where your
               | kinds of concerns are typical. They might just be
               | underrepresented here.
        
               | aulin wrote:
               | Yes, I've had plenty of experience with orgs that self-
               | host everything. I don't think it's a minority; it's just
               | a different cluster than the one most represented here.
               | 
               | Still I believe hosting is somewhat different, if
               | anything because it's something established, known
               | players, trusted practices. AI is new, contracts are
               | still getting refined, players are still making their
               | name, companies are moving fast and I doubt data
               | protection is their priority.
               | 
               | I may be wrong but I think it's reasonable for IT
               | departments to be at least prudent towards these
               | frameworks. Search is ok, chat is okish, crawling whole
               | projects for autocompletion I'd be more careful.
        
               | mbesto wrote:
               | > Yes I've had plenty of experiences with orgs that self
               | host everything, I don't think it's a minority it's just
               | a different cluster than the one most represented here.
               | 
               | I've done 800+ tech diligence projects and have first
               | hand knowledge of every single one's use of VCS. At least
               | 95% of the codebases are stored on a cloud hosted VCS.
               | It's absolutely a minority to host your own VCS.
        
               | mbesto wrote:
               | > I doubt data protection is their priority.
               | 
               | So you're basing your whole argument on nothing other
               | than "I just don't feel like they do that".
               | 
               | Does this look unserious to you?
               | https://trust.openai.com/
        
               | mbesto wrote:
               | First, I didn't dismiss their "no AI policy" nor did I
               | use snarky comments. I was asking a legitimate question -
               | which is - most orgs have their code stored on another
               | server out of their control, so what's the legitimate
               | business issue if your code gets leaked? I still haven't
               | gotten an answer.
        
           | switchbak wrote:
           | The other consideration: your company's code probably just
           | isn't that good.
           | 
           | I think many people over-value this giant pile of text.
           | That's not to say IP theft doesn't exist, but I think the
           | actual risk is often overblown. Most of an organization's
           | value is in the team's collective knowledge and teamwork
           | ability, not in the source code.
        
         | lm28469 wrote:
         | > I'm not sure how a responsible enterprise class company could
         | rely on "trust us bro" EULAs or repo readmes.
         | 
         | Isn't that what we do with operating systems, internet
         | providers, &c. ?
        
           | aulin wrote:
           | How is that related? We're talking about continuously sending
           | proprietary code and related IP to a third party; that seems
           | a pretty valid concern to me.
           | 
           | I, for one, work every day with plenty of proprietary vendor
           | code under very restrictive NDAs. I don't think they would be
           | very happy knowing I let AIs crawl our whole code base and
           | send it to remote language models just to have fancy
           | autocompletion.
        
             | lm28469 wrote:
             | Do you read every single line of code of every single
             | dependency you have? I don't see how LLMs are more of a
             | threat than a random compromised npm package or something
             | from an OS package manager. Chances are you're already
             | relying on tons and tons of "trust me bro" and "it's
             | opensource bro don't worry, just read the code if you feel
             | like it"
        
               | aulin wrote:
               | One thing is consciously sharing IP with third parties in
               | violation of contracts; another is falling victim to
               | malicious code in the toolchain.
               | 
               | The npm concern, though, suggests we likely work in very
               | different industries, so that may explain the different
               | perspective.
        
             | bongodongobob wrote:
             | Ok, the LLM crawls your code. Then what? What is the
             | exfiltration scenario?
        
             | ryanobjc wrote:
             | "Continuously sending proprietary code and related IP to a
             | third party"
             | 
             | Isn't this... github?
             | 
             | Companies and people are doing this all day every day. LLM
             | APIs are really no different. Only when you magic it up as
             | "the AI is doing thinking" ... but in reality text ->
             | tokens -> math -> tokens -> text. It's a transformation of
             | numbers into other numbers.
             | 
             | The EULAs and ToS say they don't log or retain information
             | from API requests. This is really no different than Google
             | Drive, Atlassian Cloud, Github, and any number of online
             | services that people store valuable IP and proprietary
             | business and code in.
        
         | tsukikage wrote:
         | You can get models that run offline. The other risk is
         | copyright/licensing exposure; e.g. the AI regurgitates a
         | recognisably large chunk of GPL code, and suddenly you have a
         | legal landmine in your project waiting to be discovered.
         | There's no sane way for a reviewer to spot this situation in
         | general.
         | 
         | You can ask a human to not do that, and there are various risks
         | to them personally if they do so regardless. I'd like to see
         | the AI providers take on some similar risks instead of
         | disclaiming them in their EULAs before I trust them the way I
         | might a human.
        
         | cudgy wrote:
         | Does your company develop software overseas where legal action
         | is difficult? Or where their ip could be nationalized or
         | secretly stolen? Where network communications are monitored and
         | saved?
        
         | k__ wrote:
         | Seems like only working on open source code has its benefits.
        
       | bangaladore wrote:
       | The killer feature about LLMs with programming in my opinion is
       | autocomplete (the simple copilot feature). I can probably be 2-3x
       | more productive as I'm not typing (or thinking much). It does a
       | fairly good job pulling in nearby context to help it. And that's
       | even without a language server.
       | 
       | Using it to generate blocks of code in a chat like manner in my
       | opinion just never works well enough in the domains I use it on.
       | I'll try to get it to generate something and then realize when I
       | get some functional result I could've done it faster and more
       | effectively.
       | 
       | Funny enough, other commenters here hate autocomplete but love
       | chat.
        
         | m3kw9 wrote:
         | The autocomplete is mostly a nuisance, and only a low
         | percentage of the time does it get it right.
        
           | tptacek wrote:
           | Yeah, I don't like it either. I think it speaks to the
           | mindset difference Crawshaw is talking about here. When I'm
           | writing code, I don't want things getting in my way. I have a
           | plan. I'm actually pretty Zen about all the typing. It's part
           | of my flow-state. But when I'm exploring code in a dialog
           | with a chatbot, I'm happy for the help.
        
             | switchbak wrote:
             | I think we're going to be considered dinosaurs pretty soon.
             | Much like how it's getting harder to buy a manual
             | transmission, programming 'the old way' will probably just
             | fade away over time.
        
           | LVB wrote:
           | The biggest nuisance aspect for me is when it is trying to do
           | things that the LSP can do 100% correctly. Almost surely it
           | is my tooling setup and the LLM is squashing LSP stuff.
           | Seeing Copilot (or even Cursor) suggesting methods or
           | parameters that don't exist is really annoying. Just stand
           | down and let the LSP answer those basic questions, TYVM.
        
             | throwup238 wrote:
             | Cursor ostensibly has a config setting to run a "shadow"
             | workspace [1], aka a headless copy of the window you're
             | working in to get feedback from linters and LSPs but
             | they've been iterating so fast I'm not sure it's still
             | working (or ever did much, really).
             | 
             | It really feels like we're at the ARPANET stage where
             | there's so much obvious low-hanging fruit, it's just going
             | to take companies a while to perfect it.
             | 
             | [1] https://www.cursor.com/blog/shadow-workspace
        
           | ahoka wrote:
           | The industry-standard acceptance rate was around 40% the last
           | time I checked. The share that's actually correct could be a
           | bit lower, so maybe 1/3?
           | 
           | It's like having to delete the auto-closed parenthesis more
           | often than not.
        
           | jghn wrote:
           | I thought so too. Until I worked with a client who doesn't
           | allow the use of LLM tools, and I had to turn my Copilot off.
           | That's when I realized how much I'd grown to rely on it
           | despite the headaches.
        
         | LeftHandPath wrote:
         | I've never used it, simply because I hate autocomplete in
         | emails.
         | 
         | Gmail autocomplete saves me _maybe_ 2-5s per email: the
         | recipient's name, a comma, and a sign-off. Maybe a quarter or
         | half sentence here or there, but never exactly what I would've
         | typed.
         | 
         | In code bases, I've never seen the appeal. It's only reliably
         | good at stuff that I can easily find on Google. The savings are
         | inconsequential at best, and negative at worst when it
         | introduces hard-to-pinpoint bugs.
         | 
         | LLMs are incredible technology, but when applied to code, they
         | act more like non-deterministic macros.
        
           | switchbak wrote:
           | "negative at worst when it introduces hard-to-pinpoint bugs"
           | - this is actually very true. I've had it recreate patterns
           | _partially_, and paste in the wrong thing in a place that was
           | very hard to discern.
           | 
           | It probably saved me 40 mins, then proceeded to waste 2 hours
           | of me hunting for that issue. I'm probably at the break-even
           | on the whole. The ultimate promise is very compelling, but my
           | current use isn't particularly amazing. I do use a niche
           | language though, so I'm outside the global optima.
        
             | LeftHandPath wrote:
             | Exactly! I expect that some are able to put it to good use.
             | I am not one of those people.
             | 
             | My experiences with ChatGPT and Gemini have included lots
             | of confident but wrong answers, e.g. "What castle was built
             | at the highest altitude?". That's what gives me pause.
             | 
             | Gemini spits out a great 2D A* implementation no problem.
             | That is _awesome_. Actually, contrary to my original
             | comment, I probably will use AI for that sort of thing
             | going forward.
             | 
             | Despite that, I don't want it in my IDE. Maybe I'm just a
             | bit of a Luddite.
        
         | imhoguy wrote:
         | Both autocomplete and chat are half-way UX solutions. Really
         | what I need is some kind of mix of in-place chat with
         | completion.
         | 
         | For context, very often I have to put some comment before the
         | line for completion to set an expectation context.
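         | For example, something like this (a made-up helper, just to
         | show the kind of steering comment I mean):
         | 
         |     # Build a dict mapping user id -> list of order totals,
         |     # skipping cancelled orders.  <- written first, to steer
         |     def totals_by_user(orders):   #    what completion suggests
         |         out = {}
         |         for o in orders:
         |             if o.get("status") == "cancelled":
         |                 continue
         |             out.setdefault(o["user_id"], []).append(o["total"])
         |         return out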
         | 
         | Instead, the editor should allow me to influence completion
         | with some kind of in-place suggestion input available under a
         | keyboard shortcut. Then I could type what I want into that
         | input, and when I hit Enter or Tab the completion proposal
         | appears. Even better if it let me undo/modify the input, and
         | had shortcuts like "show me a different option" and "go back to
         | the previous one".
        
         | switchbak wrote:
         | I had to turn autocomplete off. I value it when I want it, but
         | otherwise it's such a distraction that it both slows me down,
         | and actively irritates me.
         | 
         | Perhaps I'm just an old man telling the LLM to get off my lawn,
         | but I find it does bad things to my ability to concentrate on
         | hard things.
         | 
         | Having a good sense of when it would be useful, and invoking it
         | on demand seems to be a decent enough middle ground for me.
         | Much of it boils down to UX - if it could be present but not
         | actively distracting, I'd probably be ok with it.
        
       | jimmydoe wrote:
       | Does anyone have a good recommendation for a local LLM for
       | autocompletion?
       | 
       | Most editors I use support online LLMs, but they're sometimes
       | too slow for me.
        
         | ec109685 wrote:
         | Unless your network is poor, I'd imagine (but definitely could
         | be wrong in your case!), the bottleneck is the LLM speed, not
         | the latency to the data center it's running in.
        
         | th4t1sW13rd wrote:
         | https://www.continue.dev/
        
           | jimmydoe wrote:
           | Thank you!
        
       | wdutch wrote:
       | I no longer work in tech, but I still write simple applications
       | to make my work life easier.
       | 
       | I frequently use what OP refers to as chat-driven programming,
       | and I find it incredibly useful. My process starts by explaining
       | a minimum viable product to the chat, which then generates the
       | code for me. Sometimes, the code requires a bit of manual
       | tweaking, but it's usually a solid starting point. From there, I
       | describe each new feature I want to add--often pasting in
       | specific functions for the chat to modify or expand.
       | 
       | This approach significantly boosts what I can get done in one
       | coding session. I can take an idea and turn it into something
       | functional on the same day. It allows me to quickly test all my
       | ideas, and if one doesn't help as expected, I haven't wasted much
       | time or effort.
       | 
       | The biggest downside, however, is the rapid accumulation of
       | technical debt. The code can get messy quickly. There's often a
       | lot of redundancy and after a few iterations it can be quite
       | daunting to modify.
        
         | j45 wrote:
         | Is there a model you prefer to use?
        
           | KTibow wrote:
           | Not wdutch but Claude Sonnet is one of the best models out
           | there for programming, o1 is sometimes better but costs more
        
         | chii wrote:
         | > The code can get messy quickly. There's often a lot of
         | redundancy and after a few iterations it can be quite daunting
         | to modify.
         | 
         | I foresee in the future an LLM that has sufficient context
         | length for (automatic) refactoring and tech debt removal, by
         | pasting large portions of the existing code in.
        
           | scarface_74 wrote:
           | Even without LLMs, at least with statically type languages
           | like C#, ReSharper can do solution wide refactoring that are
           | guaranteed correct as long as you don't use reflection.
           | 
           | https://www.jetbrains.com/help/resharper/Refactorings__Index.
           | ..
           | 
           | I don't see any reason it couldn't do more aggressive
           | refactors with LLMs and either correct itself or skip the
           | refactor if it fails static code checking. Visual Studio can
           | already do real-time type checking for compile-time errors.
        
           | Aeolun wrote:
           | Cursor has recently added something like this, called 'Bug
           | Finder'. It told me that finding bugs across my entire
           | codebase would cost me $21 or so, so I never actually tried
           | it, but it sounds cool.
        
         | prettyblocks wrote:
         | I have a similar approach, but the mess can be contained by
         | asking for optimizations and refactors very frequently and only
         | asking for very granular features.
        
         | trash_cat wrote:
         | > The biggest downside, however, is the rapid accumulation of
         | technical debt. The code can get messy quickly. There's often a
         | lot of redundancy and after a few iterations it can be quite
         | daunting to modify.
         | 
         | What stops you from using o1 or sonnet to refactor everything?
         | It sounds like a typical LLM task.
        
         | SkyBelow wrote:
         | >The biggest downside, however, is the rapid accumulation of
         | technical debt.
         | 
         | Is that really related to the LLM?
         | 
         | Even in pre-LLM times, anytime I've thrown together some code
         | to solve some small immediate problem, it grows tech debt at an
         | amazing rate. Getting a feel for when a piece of code is going
         | to be around long enough that it needs to be refactored,
         | cleaned up, documented, etc. is a skill I developed over time.
         | Even now it isn't a perfect guess, as there is an ongoing tug
         | of war between wasting time today refactoring something I might
         | not touch again and wasting time tomorrow having to pick up
         | something I didn't clean up.
        
       | nemothekid wrote:
       | I think "Chat driven programming" is the most common type of the
       | most hyped LLM-based programming I see on twitter that I just
       | can't relate to. I've incorporated LLMs mainly as auto-complete
       | and search; asking ChatGPT to write a quick script or to scaffold
       | some code for which the documentation is too esoteric to parse.
       | 
       | But having the LLM do things for me, I frequently run into issues
       | where it feels like I'm wasting my time with an intern. " _Chat-
       | based LLMs do best with exam-style questions_ " really speaks to
       | me; however, I find that constructing my prompts in such a way
       | that the LLM does what I want uses just as much brainpower as
       | just programming the thing myself.
       | 
       | I do find ChatGPT (o1 especially) really good at optimizing
       | existing code.
        
         | throwup238 wrote:
         | _> "Chat-based LLMs do best with exam-style questions" really
         | speaks to me, however I find that constructing my prompts in
         | such a way where the LLM does what I want uses just as much
         | brainpower as just programming the thing my self._
         | 
         | It speaks to me too because my mechanical writing style (as
         | opposed to creative prose) could best be described as what I
         | learned in high school AP English/Literature and the rest of
         | the California education system. For whatever reason that
         | writing style dominated the training data, and LLMs just happen
         | to be easy for me to use because I came out of the same
         | education system as many of the people working at
         | OpenAI/Anthropic.
         | 
         | I've had to stop using several generic turns of phrase like "in
         | conclusion" because it made my writing look too much like
         | ChatGPT.
        
         | AlotOfReading wrote:
         | It's interesting that you find it useful for optimization. I've
         | found that they're barely capable of anything more than shallow
         | optimization in my stuff without significant direction.
         | 
         | What I find useful is that I can keep thinking at one
         | abstraction level without hopping back and forth between
         | algorithm and codegen. The chat is also a written artifact I
         | can use the faster language parts of my brain on instead of the
         | slower abstract thought parts.
        
         | tptacek wrote:
         | There's an art to cost-effectively coaxing useful answers
         | (useful drafts of code) from an LLM, and there's an art to
         | noticing the most productive questions to put to that process.
         | It's a totally different way of programming than having an LLM
         | looking over your shoulder while you direct, function by
         | function, type by type, the code you're designing.
         | 
         | If you feel like you're wasting your time, my bet is that
         | you're either picking problems where there isn't enough value
         | to negotiate with the LLM, or your expectations are too high.
         | Crawshaw mentions this in his post: a lot of the value of this
         | chat-driven style is that it very quickly gets you unstuck on a
         | problem. Once you get to that point, you take over! You don't
         | convince the LLM to build the final version you actually commit
         | to your branch.
         | 
         | Generating unit test cases --- in particular, generating unit
         | test cases that reconcile against unsophisticated, brute-force,
         | easily-validated reference implementations of algorithms ---
          | is a perfect example of where that cost/benefit can come out
         | nicely.
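         | 
         | A rough sketch of the pattern, with made-up names (maxSubarray
         | here stands in for whatever optimized routine the LLM drafted;
         | the brute-force reference is the part that's cheap to trust):
         | 
         |     package sketch
         |     
         |     import (
         |         "math/rand"
         |         "testing"
         |     )
         |     
         |     // maxSubarrayRef is the O(n^2) reference; the empty
         |     // subarray counts, so the result is never negative.
         |     func maxSubarrayRef(a []int) int {
         |         best := 0
         |         for i := range a {
         |             sum := 0
         |             for _, v := range a[i:] {
         |                 sum += v
         |                 if sum > best {
         |                     best = sum
         |                 }
         |             }
         |         }
         |         return best
         |     }
         |     
         |     // Reconcile the optimized maxSubarray (not shown here)
         |     // against the reference on random inputs.
         |     func TestMaxSubarray(t *testing.T) {
         |         for trial := 0; trial < 1000; trial++ {
         |             a := make([]int, rand.Intn(50))
         |             for i := range a {
         |                 a[i] = rand.Intn(21) - 10
         |             }
         |             got, want := maxSubarray(a), maxSubarrayRef(a)
         |             if got != want {
         |                 t.Fatalf("maxSubarray(%v) = %d, want %d",
         |                     a, got, want)
         |             }
         |         }
         |     }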
        
         | sibeliuss wrote:
         | My technique is to feed it a series of intro questions that
         | prepare it for the final task. Chat the thing into a proper
         | comfort level, and then from there, with the context at hand,
         | ask to help solve the real problem. Def feels like a new kind
          | of programming model because it's still very programming-esque.
        
         | Aeolun wrote:
         | I've found that everything just works (more or less) since
         | switching to Cursor. Agent based composer mode is magical. Just
         | give it a few files for context, and ask it to do what you
         | want.
        
       | _boffin_ wrote:
       | Does anyone know of any good chat-based UI builders? No, not to
       | build a chat app.
       | 
       | Does webflow have something?
       | 
       | My problem is being able to describe what I want in the style I
       | want.
        
         | replwoacause wrote:
         | https://lovable.dev
         | 
         | https://bolt.new
         | 
         | https://v0.dev
         | 
         | Never used them myself but have seen them mentioned on Reddit
         | and Twitter.
        
       | singpolyma3 wrote:
       | It seems like everything I see about success using LLMs for this
       | kind of work is for greenfield. What about three weeks later when
        | the job changes to maintenance and iteration on something that's
       | already working? Are people applying LLMs to that space?
        
         | kylebenzle wrote:
         | Yes, it's just harder the larger the pre-existing code base.
        
         | throwup238 wrote:
         | My codebase is relatively greenfield (started working on it
         | early last year) but it's up to ~50k lines in a mixed C++/Rust
         | codebase with a binding layer whose API predates every LLM's
         | training sets. Even when I started ChatGPT/Claude weren't very
         | useful but now the project requires a completely different
          | strategy when working with LLMs (it's a Qt AI desktop app so
         | I'm dogfooding a lot). I've also used them in a larger codebase
         | (~500k lines) and that also requires a different approach from
         | the former. It feels a lot like the transition from managing 2
         | to 20 to 200 to 2000 people. It's a different ballgame with
         | each step change. A very well encapsulated code base of ~500k
         | lines is manageable for small changes but not for refactoring,
         | exploration, etc, at least until useful context sizes increase
         | another order of magnitude (I keep trying Gemini's 2M but it's
         | been a disappointment).
         | 
         | I have a _lot_ of documentation aimed at the AI in `docs
         | /notes/` (some of it written by an LLM but proofread before
         | committing) and I instruct Cursor/Windsurf/Aider via their
         | respective rules/config files to look at the documentation
         | before doing anything. At some scale that initial context
         | becomes just a directory listing & short description of
         | everything in the notes folder, which eventually breaks down
         | due to context size limits, either because I exceed the maximum
         | length of the rules or the agent requires pulling in too much
         | context for the change.
         | 
         | I've found that there's actually an uncanny valley between
         | greenfield projects where the model is free to make whatever
         | assumptions it wants and brownfield projects where it's
         | possible to provide enough context from the existing codebase
         | to get both API accuracy (hallucinations) and general patterns
         | through few-shot examples. This became very obvious once I had
         | enough examples of that binding layer. Even though I could
         | include all of the documentation for the library, it didn't
         | work consistently until I had a variety of production examples
         | to point it to.
         | 
         | Right now, I probably spend as much time writing each prompt as
         | I do massaging the notes folder and rules every time I notice
         | the model doing something wrong.
        
         | zkry wrote:
         | Logically this makes sense: every model has a context size and
         | complexity capacity where it will no longer be able to function
         | properly. Any usage of said model will accelerate the approach
         | to this limit. Once the limit is reached, the LLM is no longer
         | as helpful as it was.
         | 
         | I work on full blown legacy apps and needless to say I don't
         | even bother with LLMs when working on these most of the time.
        
         | Mashimo wrote:
         | I used AI code completion from GitHub copilot on a 20 year old
          | project. You still have to create new classes, new tests,
          | refactor, etc.
        
         | valenterry wrote:
         | Yeah, it sucks. LLMs are not great with a big context yet. I
         | hope that is being worked on. I need the LLM to read my whole
          | project AND ideally all related Slack conversations, the wiki
         | and related libraries.
        
           | glouwbug wrote:
           | Then what will you do?
        
             | valenterry wrote:
             | I can for example tell it to refactor things. It would have
             | to write files of course. E.g. "Add retries with
             | exponential backoffs to all calls to service X"
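              | 
              | For a prompt like that I'd expect it to introduce something
              | along these lines and thread it through the call sites
              | (just a sketch, the names are made up):
              | 
              |     package retryx // hypothetical
              |     
              |     import (
              |         "context"
              |         "time"
              |     )
              |     
              |     // WithRetry retries fn with exponential backoff:
              |     // base, 2*base, 4*base, ... between attempts.
              |     func WithRetry(ctx context.Context, attempts int,
              |         base time.Duration,
              |         fn func(context.Context) error) error {
              |         var err error
              |         for i := 0; i < attempts; i++ {
              |             if err = fn(ctx); err == nil {
              |                 return nil
              |             }
              |             if i == attempts-1 {
              |                 break
              |             }
              |             select {
              |             case <-ctx.Done():
              |                 return ctx.Err()
              |             case <-time.After(base << i):
              |             }
              |         }
              |         return err
              |     }
              | 
              | and then each call to service X gets wrapped, e.g.
              | 
              |     err := retryx.WithRetry(ctx, 5, 100*time.Millisecond,
              |         func(ctx context.Context) error {
              |             return serviceX.Ping(ctx) // made-up call
              |         })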
        
       | e12e wrote:
       | Interesting. I wonder what the equivalent of sketch.dev would
       | look like if it targeted Smalltalk and was embedded in a
       | Smalltalk image (preferably with a local LLM running in
       | smalltalk)?
       | 
       | I'd love to be able to tell my (hypothetical smalltalk) tablet to
       | create an app for me, and work interactively, interacting with
       | the app as it gets built...
       | 
       | Ed: I suppose I should just try and see where cloud ai can take
       | smalltalk today:
       | 
       | https://github.com/rsbohn/Cuis-Smalltalk-Dexter-LLM
        
         | klibertp wrote:
         | Worth a look: https://github.com/feenkcom/gt4llm If you load
         | this in GT, you'll get a Lepiter book with interactive
         | tutorials.
        
       | dewitt wrote:
       | One interesting bit of context is that the author of this post is
       | a legit world-class software engineer already (though probably
       | too modest to admit it). Former staff engineer at Google and co-
       | founder / CTO of Tailscale. He doesn't _need_ LLMs. That he says
       | LLMs make him more productive at all as a hands-on developer,
       | especially around first drafts on a new idea, means a lot to me
       | personally.
       | 
       | His post reminds me of an old idea I had of a language where all
       | you wrote was function signatures and high-level control flow,
       | and maybe some conformance tests around them. The language was
       | designed around filling in the implementations for you. 20 years
       | ago that would have been from a live online database, with
       | implementations vying for popularity on the basis of speed or
       | correctness. Nowadays LLMs would generate most of it on the fly,
       | presumably.
       | 
       | Most ideas are unoriginal, so I wouldn't be surprised if this has
       | been tried already.
        
         | knighthack wrote:
         | I knew he was a world-class engineer the moment I saw that his
         | site didn't bother with CSS stylesheets, ads, pictures, or
         | anything beyond a rudimentary layout.
         | 
         | The whole article page reads like a site from the '90s, written
         | from scratch in HTML.
         | 
         | That's when I _knew_ the article would go hard.
         | 
         | Substantive pieces don't need fluffy UIs - the idea takes the
         | stage, not the window dressing.
        
           | shaneofalltrad wrote:
           | I wonder what he uses. I noticed the first paragraph took
           | over a second to load... Largest Contentful Paint element:
           | 1,370 ms. This is the largest contentful element painted
           | within the viewport. Element: p
        
             | cess11 wrote:
             | Looks like it loads all the Google surveillance without
             | asking. Should IP-block the EU.
        
           | alexvitkov wrote:
           | Glad to know I was a world class engineer at the age of 8,
           | when all I knew were the <h1> and <b> tags!
        
         | dekhn wrote:
         | I think what you're describing is basically "interface driven
         | development" and "test driven development" taken to the
         | extreme: where the formal specification of an implementation is
         | defined by the test suite. I suppose a cynic would say that's
         | what you get if you left an AI alone in a room with Hyrum's
         | Law.
        
         | gopalv wrote:
         | > That he says LLMs make him more productive at all as a hands-
         | on developer, especially around first drafts on a new idea,
         | means a lot to me personally.
         | 
         | There is likely to be a great rift in how very talented people
         | look at sharper tools.
         | 
         | I've seen the same division pop up with CNC machines, 3d
         | printers, IDEs and now LLMs.
         | 
         | If you are good at doing something, you might find the new
         | tool's output to be sub-par over what you can achieve yourself,
         | but often the lower quality output comes much faster than you
         | can generate.
         | 
         | That causes the people who are deliberate & precise about their
         | process to hate the new tool completely - expressing in the
         | actual code (or paint, or marks on wood) is much better than
         | trying to explain it in a less precise language in the middle
         | of it. The only exception I've seen is that engineering folks
         | often use a blueprint & refine it on paper.
         | 
         | There's a double translation overhead which is wasteful if you
         | don't need it.
         | 
         | If you have dealt with a new hire while being the senior of the
         | pair, there's that familiar feeling of wanting to grab their
         | keyboard instead of explaining how to build that regex - being
         | able to do more things than you can explain or just having a
         | higher bandwidth pipe into the actual task is a common sign of
         | mastery.
         | 
         | The incrementalists on the other hand, tend to love the new
         | tool as they tend to build 6 different things before picking
         | what works the best, slowly iterating towards what they had in
         | mind in the first place.
         | 
         | I got into this profession simply because I could Ctrl-Z to the
         | previous step much more easily than in my then-favourite field,
         | chemical engineering. In chemistry, if you get a step wrong, you
         | go back to the start & start over. Plus even when things work,
         | yield is just a pain there (prove it first, then you scale up
         | ingredients etc).
         | 
         | Just from the name of sketch.dev, it appears that this author
         | is of the 'sketch first & refine' model where the new tool just
         | speeds up that loop of infinite refinement.
        
           | liotier wrote:
           | > If you are good at doing something, you might find the new
           | tool's output to be sub-par over what you can achieve
           | yourself, but often the lower quality output comes much
           | faster than you can generate. That causes the people who are
           | deliberate & precise about their process to hate the new tool
           | completely
           | 
           | Wow, I've been there! Years ago we dragged a GIS system
           | kicking and screaming from its nascent era of a dozen
           | ultrasharp dudes with the whole national fiber optics network
           | in their head full of clever optimizations, to three thousand
           | mostly clueless users churning out industrial scale
           | spaghetti... The old hands wanted a dumb fast tool that does
           | their bidding - they hated the slower wizard-assisted
           | handholding, that turned out to be essential to the new
           | population's productivity.
           | 
           | Command line vs. GUI again... Expressivity vs.
           | discoverability, all the choices vs. don't make me think.
           | Know your users!
        
             | namaria wrote:
             | This whole thing makes me think of that short story "The
             | Machine Stops".
             | 
             | As we keep burrowing deeper and deeper into an overly
             | complex system that allows people to get into parts of it
             | without understanding the whole, we are edging closer to a
             | situation where no one is left who can actually reason
             | about the system and it starts to deteriorate beyond repair
             | until it suddenly collapses.
        
           | jprete wrote:
           | This is a good characterization. I'm precision-driven and
           | know what I need to do at any low level. It's the high-level
           | definition that is uncertain. So it doesn't really help to
           | produce a dozen prototypes of an idea and pick one, nor does
           | it help to fill in function definitions.
        
           | tikkun wrote:
           | Interesting.
           | 
           | So engineers that like to iterate and explore are more likely
           | to like LLMs.
           | 
           | Whereas engineers that have a more rigid, specific
           | process are more likely to dislike LLMs.
        
             | godelski wrote:
             | I frequently iterate and explore when writing code. Code
             | gets written multiple times before being merged. Yet, I
             | still haven't found LLMs to be helpful in that way. The
             | author gives "autocomplete", "search", and "chat-driven
             | programming" as 3 paradigms. I get the most out of search
             | (though a lot of this is due to the decreasing value of
             | Google), autocomplete is pretty weak to me especially as I
             | macro or just use contextual complete, and I've failed
             | miserably at chat-driven programming on every attempt. I
              | spend more time debugging the AI than it would take to
              | debug it myself. Albeit it __feels__ faster because I'm
              | doing more typing + waiting rather than continuous thinking
              | (but the latter has extra benefits).
        
             | erosivesoul wrote:
             | FWIW I find LLMs almost useless for writing novel code.
             | Like it can spit out a serviceable UUID generator when I
             | need it, but try writing something with more than a layer
             | or two of recursion and it gets confused. I turn copilot on
             | for boilerplate and off for solving new problems.
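              | 
              | (For reference, the kind of "serviceable UUID generator" I
              | mean - roughly what it spits out, modulo naming:)
              | 
              |     package main
              |     
              |     import (
              |         "crypto/rand"
              |         "fmt"
              |     )
              |     
              |     func newUUIDv4() (string, error) {
              |         var b [16]byte
              |         if _, err := rand.Read(b[:]); err != nil {
              |             return "", err
              |         }
              |         b[6] = (b[6] & 0x0f) | 0x40 // version 4
              |         b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
              |         return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4],
              |             b[4:6], b[6:8], b[8:10], b[10:16]), nil
              |     }
              |     
              |     func main() { fmt.Println(newUUIDv4()) }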
        
           | harrall wrote:
           | I believe it's more that people hate trying new tools because
           | they've already made their choice and made it their identity.
           | 
           | However, there are also people who love everything new and
           | jump onto the latest hype too. They try new things but then
           | immediately advocate it without merit.
           | 
           | Where are the sane people in the middle?
        
             | dns_snek wrote:
             | As an experienced software developer, I paid for ChatGPT
             | for a couple of months, I trialed Gemini Pro for a couple
             | of months, and I've used the current version of Claude.
             | 
              | I'd be happy if LLMs could produce working code as often
              | and as quickly as the evangelists claim, but whenever I try
              | to use LLMs to work on my day-to-day tasks, I almost always
              | walk away frustrated and disappointed - and most of my work
             | is boring on technical merits, I'm not writing novel comp-
             | sci algorithms or cryptography libraries.
             | 
             | Every time I say this, I'm painted as some luddite who just
             | hates change when the reality is that no, current LLMs are
             | just not fit for many of the purposes they're being
             | evangelized for. I'd love nothing more than to be a 2x
             | developer on my side projects, but it just hasn't happened
             | and it's not for the lack of trying or open mindedness.
             | 
             | edit: I've never actually seen any LLM-driven developers
              | work in real time. Are there any live coding channels that
              | could convince the skeptics that we're missing out on
              | something revolutionary?
        
               | harrall wrote:
               | You're the middle ground I was talking about. You tried
               | it. You know where it works and where it doesn't.
               | 
               | I've used LLM to generate code samples and my IDE
               | (IntelliJ) uses an LLM for auto-suggestions. That's
               | mostly about it for me.
        
               | davepeck wrote:
               | I see less "painting as a luddite" in response to
               | statements like this, and more... surprise. Mild
               | skepticism, perhaps!
               | 
               | Your experience diverges from that of other experienced
               | devs who have used the same tools, on probably similar
               | projects, and reached different conclusions.
               | 
               | That includes me, for what it's worth. I'm a graybeard
               | whose current work is primarily cloud data pipelines that
               | end in fullstack web. Like most devs who have fully
               | embraced LLMs, I don't think they are a magical panacea.
               | But I've found many cases where they're unquestionably an
               | accelerant -- more than enough to justify the cost.
               | 
               | I don't mean to say your conclusions are wrong. There
               | seems to be a bimodal distribution amongst devs. I
               | suspect there's something about _how_ these tools are
               | used by each dev, and in the specific
               | circumstances/codebases/social contexts, that leads to
               | quite different outcomes. I would _love_ to read a better
               | investigation of this.
        
               | efnx wrote:
               | I think it also depends on _what_ the domain is, and also
               | to a certain degree the tools / stack you use. LLMs
               | aren't coherent or correct when working on novel
               | problems, novel domains or using novel tools.
               | 
               | They're great for doing something that has been done
               | before, but their hallucinations are wildly incorrect
               | when novelty is at play - and I'll add they're always
               | very authoritative! I'm glad my languages of choice have
               | a compiler!
        
               | davepeck wrote:
               | Yeah, absolutely.
               | 
               | LLMs work best for code when both (a) there's sufficient
               | relevant training data aka we're not doing something
               | particularly novel and (b) there's sufficient context
               | from the current codebase to pick up expected patterns,
               | the peculiarities of the domain models, etc.
               | 
               | Drop (a) and get comical hallucinations; drop (b) and
               | quickly find that LLMs are deeply mediocre at top-level
               | architectural and framework/library choices.
               | 
               | Perhaps there's also a (c) related to precision. You can
               | write code to issue a SQL query and return JSON from an
               | API endpoint in multiple just-fine ways. Misplace a
               | pthread_mutex_lock, however, and you're in trouble. I
               | certainly don't trust LLMs to get things like this right!
               | 
               | (It's worth mentioning that "novelty" is a tough concept
               | in the context of LLM training data. For instance, maybe
               | nobody has implemented a font rasterizer in Rust before,
               | but plenty of people have written font rasterizers and
               | plenty of others have written Rust; LLMs seem quite good
               | at synthesizing the two.)
        
               | jpc0 wrote:
               | My recent example for where its helpful.
               | 
                | Pretty nice at autocomplete. Like writing json tags in go
                | structs. It can just autocomplete that stuff for me no
                | problem; it saved me seconds per line, seconds I tell
                | you.
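                | 
                | (E.g. given just the field names it fills in the rest;
                | hypothetical struct:)
                | 
                |     package billing // hypothetical
                |     
                |     type Invoice struct {
                |         ID        string `json:"id"`
                |         Customer  string `json:"customer"`
                |         AmountUSD int64  `json:"amount_usd"`
                |         Paid      bool   `json:"paid"`
                |     }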
               | 
               | It's stupid as well... Autofilled a function, looks
               | correct. Reread it 10 minutes later and well... Minor
               | mistake that would have caused a crash at runtime. It
                | looked correct but in reality it just didn't have enough
                | context (the context is in an external doc on my second
                | screen...) and there was no way it would ever have
               | guessed the correct code.
               | 
               | It took me longer to figure out why the code looked wrong
               | than if I had just typed it myself.
               | 
               | Did it speed up my workflow on code I could have given a
               | junior to write? Not really, but some parts were quicker
               | while other were slower.
               | 
                | And imagine if that code had crashed in production next
               | week instead of right now while the whole context is
               | still in my head. Maybe that would be hours of debugging
               | time...
               | 
                | Maybe, as the parent said, for a domain where you are
                | breaking new ground it can generate some interesting
                | ideas you wouldn't have thought about. Like a stupid
                | pair programmer that can get you out of a local
                | minimum: in general it doesn't help much, but there it
                | can be a significant help.
               | 
               | But then again you could do what has been done for
               | decades and speak to another human about the problem, at
               | least they may have signed the same NDA as you...
        
               | holoduke wrote:
                | Yesterday I wanted to understand what a team was doing in
                | a Go project. I have never really touched Go before. I do
                | understand software, because I've been developing for 20+
                | years. But ChatGPT was perfectly able to give me a summary
                | of how the implementation worked. Gave me examples and
                | suggestions. And within a day of full-time pasting code
                | and asking questions I had a good understanding of the
                | codebase. It would have been a lot more difficult with
                | only Google.
        
               | twelve40 wrote:
               | how often do you get to learn an unfamiliar language? is
               | it something you need to do every day? so this use case,
               | did it save you much time overall?
        
               | NoOn3 wrote:
                | I have a very similar experience. For me LLMs are good at
               | explaining someone else's complex code, but for some
               | reason they don't help me write new code well. I would
               | also like to see any LLM-driven developers work in real
               | time.
        
               | HappMacDonald wrote:
               | My experience thus far is that LLMs can be quite good at:
               | 
               | * Information lookup
               | 
               | -- when search engines are enshittified and bogged down
               | by SEO spam and when it's difficult to transform a
               | natural language request into a genuinely unique set of
               | search keywords
               | 
               | -- Search-enabled LLMs have the most up to date reach in
               | these circumstances but even static LLMs can work in a
               | pinch when you're searching for info that's probably well
               | represented in their training set before their knowledge
               | cutoff
               | 
               | * Creatively exploring a vaguely defined problem space
               | 
               | -- Especially when one's own head feels like it's too
               | full of lead to think of anything novel
               | 
               | -- Watch out to make sure the wording of your request
               | doesn't bend the LLM too far into a stale direction. For
               | example naming an example can make them tunnel vision
               | onto that example vs considering alternatives to it.
               | 
               | * Pretending to be Stack Exchange
               | 
               | -- EG, the types of questions one might pose on SE one
               | can pose to an LLM and get instant answers, with less
               | criticism for having asked the question in the first
               | place (though Claude is apparently not above gently
               | checking in if one is encountering an X Y problem) and
               | often the LLM's hallucination rate is no worse than that
               | of other SE users
               | 
               | * Shortcut into documentation for tools with either thin
               | or difficult to navigate docs
               | 
               | -- While one must always fact-check the LLM, doing so is
               | usually quicker in this instance than fishing online for
               | which facts to even check
               | 
               | -- This is most effective for tools where tons of people
               | do seem to already know how the tool works (vs tools
               | nobody has ever heard of) but it's just not clear how
               | they learned that.
               | 
               | * Working examples to ice-break a start of project
               | 
               | * Simple automation scripts with few moving parts,
               | especially when one is particular about the goal and the
               | constraints
               | 
                | -- Online one might find example scripts that _almost_
                | meet your needs but always fail to meet them in some
                | fashion that's irritating to figure out how to corral
                | back into your problem domain
                | 
                | -- LLMs have deep experience with tools and with _short_
                | snippets of coherent code, so their success rate on
                | utility scripts is much higher than on "portions of
                | complex larger projects".
        
               | edanm wrote:
               | Totally respect your position, given that you actually
                | _tried_ the tool and found it didn't work for you. That
               | said, one valid explanation is that the tool isn't good
               | for what you're trying to achieve. But an alternative
               | explanation is that you haven't learned how to use the
               | tool effectively.
               | 
               | You seem open to this possibility, since you ask:
               | 
               | > I've never actually seen any LLM-driven developers work
               | in real time. Are there any live coding channels that
               | could convince the skeptics what we're missing out on
               | something revolutionary?
               | 
               | I don't know many yet, but Steve Yegge, a fairly famous
               | developer in his own right, has been talking about this
               | for the last few months, and has walked a few people
               | through his "Chat Oriented Programming" (CHOP) ideas. I
               | believe if you search for that phrase, you'll find a few
               | videos, some from him and some from others. Can't
               | guarantee they're all quality videos, though anything
               | Steve himself does is interesting, IMO.
        
             | evilfred wrote:
             | Middle Ground Fallacy
        
               | harrall wrote:
               | Fallacy fallacy
        
               | goatlover wrote:
               | The middle ground between hyping the new tech and being
               | completely skeptical about it is usually right. New tech
               | is usually not everything it's hyped up to be, but also
               | usually not completely useless or bad for society. It's
               | likely we're not about to usher in the singularity or
               | doom society, but LLMs are useful enough to stick around
               | in various tools. Also it's probably the case that a
                | percentage of the hype is driven by wanting funding.
        
               | oblio wrote:
               | > New tech is usually not everything it's hyped up to be,
               | but also usually not completely useless or bad for
               | society.
               | 
               | Except for cryptocurrencies (at least their ratio of
               | investments to output) :-p
        
             | wvenable wrote:
             | > Where are the sane people in the middle?
             | 
             | They are the quiet ones.
        
               | jrockway wrote:
               | Yup! I don't have a lot to say about LLMs for coding.
               | There are places where I'm certain they're useful and
               | that's where I use them. I don't think "generate a react
               | app from scratch" helps me, but things like "take a CPU
               | profile and write it to /tmp/pprof.out" have worked well.
               | I know how to do the latter, but would need to look at
               | the docs for the exact function name to call, and the LLM
               | just knows and checks the error on opening the file and
               | all that tedium. It's helpful.
               | 
               | At my last job I spent a lot of time on cleanups and
               | refactoring and never got the LLM to help me in any way.
               | This is the thing that I try every few months and see
               | what's changed, because one day it will be able to do the
               | tedious things I need to get done and spare me the
               | tedium.
               | 
               | Something I should try again is having the LLM follow a
               | spec and see how it does. A long time ago I wrote some
               | code to handle HTTP conditional requests. I pasted the
               | standard into my code, and wrote each chunk of code in
               | the same order as the spec. I bet the LLM could just do
               | that for me; not a lot of knowledge of code outside that
               | file was required, so you don't need many tokens of
               | context to get a good result. But alas the code is
               | already written and works. Maybe if I tried doing that
               | today the LLM would just paste in the code I already
               | wrote and it was trained on ;)
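                | 
                | Roughly the shape of it, from memory (a sketch, not the
                | original code; it follows the RFC 7232 precedence of
                | If-None-Match over If-Modified-Since):
                | 
                |     package cond // sketch
                |     
                |     import (
                |         "net/http"
                |         "time"
                |     )
                |     
                |     // notModified reports whether to answer 304.
                |     func notModified(r *http.Request, etag string,
                |         mod time.Time) bool {
                |         inm := r.Header.Get("If-None-Match")
                |         if inm != "" {
                |             return inm == "*" || inm == etag
                |         }
                |         ims := r.Header.Get("If-Modified-Since")
                |         t, err := http.ParseTime(ims)
                |         return err == nil && !mod.After(t)
                |     }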
        
           | travisporter wrote:
           | > I got into this profession simply because I could Ctrl-Z to
           | the previous step much more easily than my then favourite
           | chemical engineering goals.
           | 
           | That is interesting. Asking as a complete ignoramus - is
            | there not a way to do this now? Like start off with 100 units
            | of reagent and at every step use a bit and discard it if wrong?
        
             | ssivark wrote:
             | But for every step that turns out to be "correct" you now
             | have to go back and redo that in your held-out sample
             | anyways. So it's not like you get to save on repeating the
             | work -- IIUC you just changed it from depth-first execution
             | order to breadth-first execution order.
        
               | Vampiero wrote:
               | > International Islamic University Chittagong
               | 
               | ??? What's up with native English speakers and random
               | acronyms of stuff that isn't said that often? YMMV, IIUC,
               | IANAL, YSK... Just say it and save everyone else a google
               | search.
        
               | HappMacDonald wrote:
               | So just to make sure I'm on the same page: you're
               | bemoaning how commonly people abbreviate uncommon
               | sayings?
        
               | Vampiero wrote:
               | I'm bemoaning the fact that I have to google random
               | acronyms every time an American wants to say the most
               | basic shit as if everyone on the internet knows their
               | slang and weird four letter abbreviations
               | 
               | And googling those acronyms usually returns unrelated
               | shit unless you go specifically to urban dictionary
               | 
               | And then it's "If I understand correctly". Oh. Of course.
               | He couldn't be arsed to type that
        
               | amenhotep wrote:
               | FWIW IMO YTA
        
               | edgineer wrote:
               | frfr
        
               | tmtvl wrote:
               | I'm not a native English speaker, but IIUC is clearly 'If
               | I Understand Correctly'. If you look at the context it's
               | often fairly easy to figure out what an initialism means.
               | I mean even I can usually deduce the meaning and I'm
               | barely intelligent enough to qualify as 'sentient'.
        
             | numpad0 wrote:
             | That likely ends up with 100 failed results all attributed
             | to the same set of causes
        
           | dboreham wrote:
           | Calculators vs slide rules.
        
           | numpad0 wrote:
           | I can't relate to this comment at all. Doesn't feel like
           | what's said in GP either.
           | 
           | IMO, LLMs are super fast predictive input and hallucinatory
           | unzip; files to be decompressed don't have to exist yet, but
           | input has to be extremely deliberate and precise.
           | 
           | You have to have a valid formula that gives the resultant
            | array and doesn't require more than 100 IQ to comprehend,
           | and then they unroll it for you into the whole code.
           | 
           | They don't reward trial and error that much. They don't seem
           | to help outsiders like 3D printers did, either. It is indeed
           | a discriminatory tool as in it mistreats amateurs.
           | 
           | And, by the way, it's also increasingly obvious to me that
            | assuming a more pro-AI posture than you would from a purely
            | rational and utilitarian standpoint triggers a unique mode of
           | insanity in humans. People seem to contract a lot of
           | negativity doing it. Don't do that.
        
         | CraigJPerry wrote:
         | >> where all you wrote was function signatures and high-level
         | control flow, and maybe some conformance tests around them
         | 
          | AIUI that's where Idris is headed
        
         | greenyouse wrote:
         | That approach sounds similar to the Idris programming language
         | with Type Driven Development. It starts by planning out the
         | program structure with types and function signatures. Then the
         | function implementation (aka holes) can be filled in after the
         | function signatures and types are set.
         | 
         | I feel like this is a great approach for LLM assisted
         | programming because things like types, function signatures,
         | pre/post conditions, etc. give more clarity and guidance to the
         | LLM. The more constraints that the LLM has to operate under,
         | the less likely it is to get off track and be inconsistent.
         | 
         | I've taken a shot at doing some little projects for fun with
         | this style of programming in TypeScript and it works pretty
         | well. The programs are written in layers with the domain
         | design, types, schema, and function contracts being figured out
         | first (optionally with some LLM help). Then the function
         | implementations can be figured out towards the end.
         | 
         | It might be fun to try Effect-TS for ADTs + contracts + compile
         | time type validation. It seems like that locks down a lot of
         | the details so it might be good for LLMs. It's fun to play
         | around with different techniques and see what works!
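         | 
         | A minimal sketch of the shape (in Go here rather than
         | TypeScript, with a made-up domain): pin down the types and
         | signatures first, leave the bodies as holes to fill in later.
         | 
         |     package cart // hypothetical
         |     
         |     type Item struct {
         |         SKU   string
         |         Cents int64
         |         Qty   int
         |     }
         |     
         |     // Subtotal returns the pre-discount total in cents.
         |     func Subtotal(items []Item) int64 {
         |         panic("TODO: hole for the LLM to fill")
         |     }
         |     
         |     // ApplyDiscount applies pct (0-100) and never
         |     // returns a negative amount.
         |     func ApplyDiscount(cents int64, pct int) (int64, error) {
         |         panic("TODO: hole for the LLM to fill")
         |     }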
        
           | lysecret wrote:
           | 100% this is what I do in python too!
        
         | brabel wrote:
          | I am not a genius but have a couple of decades of experience and
         | finally started using LLMs in anger in the last few weeks. I
         | have to admit that when my free quota from GitHub Copilot ran
         | out (I had already run out of Jetbrains AI as well!! Our
         | company will start paying for some service as the trials have
         | been very successful), I had a slight bad feeling as my
         | experience was very similar to OP: it's really useful to get me
         | started, and I can finish it much more easily from what the AI
         | gives me than if I started from scratch. Sometimes it just
         | fills in boilerplate, other times it actually tells me which
         | functions to call on an unfamiliar API. And it turns out it's
         | really good at generating tests, so it makes my testing more
         | comprehensive as it's so much faster to just write them out
         | (and refine a bit usually by hand). The chat almost completely
         | replaced my StackOverflow queries, which saves me much time and
         | anxiety (God forbid I have to ask something on SO as that's a
         | time sink: if I just quickly type out something I am just
         | asking to be obliterated by the "helpful" SO moderators... with
         | the AI, I just barely type anything at all, leave it with typos
         | and all, the AI still gets me!).
        
           | EagnaIonat wrote:
           | Have you tried using Ollama? You can download and run an LLM
           | locally on your machine.
           | 
           | You can also pick the right model for the right need and it's
           | free.
        
             | mentos wrote:
             | I'm using ChatGPT4o to convert a C# project to C++. Any
             | recommendation on what Ollama model I could use instead?
        
               | neonsunset wrote:
               | The one that does not convert C# at all and asks you to
               | just optimize it in C# instead (and to use the
               | appropriate build option) :D
        
               | mentos wrote:
               | I'm converting game logic from C# to UE5 C++. So far made
               | great progress using ChatGPT4o and o1
        
               | neonsunset wrote:
               | Do you find these working out better for you than Claude
               | 3.5 Sonnet? So far I've not been a fan of the ChatGPT
               | models' output.
        
               | mentos wrote:
               | I find ChatGPT better with UE4/5 C++ but they are very
               | close.
               | 
               | Biggest advantage is the o1 128k context. I can one shot
               | an entire 1000 line class where normally I'd have to go
               | function by function with 4o.
        
             | brabel wrote:
             | Yes. If the AI is not integrated with the IDE, it's not as
             | helpful. If there were an IDE plugin that let you use a
             | local model, perhaps that would be an option, but I haven't
             | seen that (Github Copilot allows selecting different
             | models, but I didn't check more carefully whether that also
             | includes a local one, anyone knows?).
        
               | oogali wrote:
               | It's doable as it's what I use to experiment.
               | 
               | Ollama + CodeGPT IntelliJ plugin. It allows you to point
               | at a local instance.
        
               | mark_l_watson wrote:
               | I also use Ollama for coding. I have a 32G M2 Mac, and
               | the models I can run are very useful for coding and
               | debugging, as well as data munging, etc. That said,
               | sometimes I also use Claude Sonnet 3.5 and o1. (BTW, I
               | just published an Ollama book yesterday, so I am a little
                | biased towards local models.)
        
               | matrix12 wrote:
               | Thanks for the book!
        
               | bpizzi wrote:
               | > (Github Copilot allows selecting different models, but
               | I didn't check more carefully whether that also includes
               | a local one, anyone knows?).
               | 
               | To my knowledge, it doesn't.
               | 
                | On Emacs there's gptel, which integrates different LLMs
                | quite nicely inside Emacs, including a local Ollama.
               | 
               | > gptel is a simple Large Language Model chat client for
               | Emacs, with support for multiple models and backends. It
               | works in the spirit of Emacs, available at any time and
               | uniformly in any buffer.
               | 
               | https://github.com/karthink/gptel
        
               | th4t1sW13rd wrote:
               | This can use Ollama: https://www.continue.dev/
        
           | devjab wrote:
           | I'm genuinely curious but what did you use StackOverflow for
           | before? With a couple of decades in the industry I can't
           | remember when the last time I "Google programmed" anything
           | was. I always go directly to the documentation for whatever
            | it is I'm working with, because where else would I find out
           | how it actually works? It's not like I haven't "Google
           | programmed" when I was younger, but it's just such a slow
           | process based on trusting strangers on the internet that it
           | never really made much sense once I started knowing what I
           | was doing. I sort of view LLM's in a similar manner. Why
           | would you go to them rather than the actual documentation? I
           | realize this might sound arrogant or rude, and I really hope
           | you believe me when I say that I don't mean it like this. The
            | reason I'm curious is because we're really struggling to get
            | junior developers to look at the documentation first instead
            | of everywhere else. Which means they often actually don't
            | know how what they build works. Which can be an issue when
            | they load every object of a list into memory instead of using
            | a generator...
           | 
            | As far as using LLMs in anger I would really advise anyone to
           | use them. GitHub copilot hasn't been very useful for me
           | personally, but I get a lot of value out of running my
           | thought process by a LLM. I think better when I "think out
           | loud" and that is obviously challenging when everyone is
           | busy. Running my ideas by an LLM helps me process them in a
           | similar (if not better) fashion, often it won't even really
           | matter what the LLM conjures up because simply describing
           | what I want to do often gives me new ideas, like "thinking
           | out loud".
           | 
           | As far as coding goes. I find it extremely useful to have
           | LLMs write cli scripts to auto-generate code. The code the
           | LLM will produce is going to be absolute shite, but that
           | doesn't matter if the output is perfectly fine. It's reduced
           | my personal reliance on third party tools by quite a lot.
           | Because why would I need a code generator for something (and
           | in that process trust a bunch of 3rd party libraries) when I
            | can have an LLM write a similar tool in half an hour?
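           | 
           | (A made-up example of the kind of throwaway generator I mean
           | - a one-shot LLM job, ugly code, but the output is fine:)
           | 
           |     // Reads lines like "Config Timeout time.Duration" on
           |     // stdin and prints a getter for each one.
           |     package main
           |     
           |     import (
           |         "bufio"
           |         "fmt"
           |         "os"
           |         "strings"
           |     )
           |     
           |     func main() {
           |         sc := bufio.NewScanner(os.Stdin)
           |         for sc.Scan() {
           |             f := strings.Fields(sc.Text())
           |             if len(f) != 3 {
           |                 continue
           |             }
           |             typ, field, ftype := f[0], f[1], f[2]
           |             fmt.Printf("func (x *%s) Get%s() %s "+
           |                 "{ return x.%s }\n\n",
           |                 typ, field, ftype, field)
           |         }
           |     }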
        
             | wiseowise wrote:
             | > Why would you go to them rather than the actual
             | documentation?
             | 
             | Not every documentation is made equal. For example: Android
             | docs are royal shit. They cover some basic things, e.g.
                | show a button, but good luck finding esoteric Bluetooth
             | information or package management, etc. Most of it is a mix
             | of experimentation and historical knowledge (baggage).
        
               | devjab wrote:
               | > Not every documentation is made equal.
               | 
               | They are wildly different. I'm not sure the Android API
               | reference is that bad, but that is mainly because I've
               | spent a good amount years with the various .Net API
               | references and the Android one is a much more shiny turd
               | than those. I haven't had issues with Bluetooth myself,
               | the Bluetooth SIG has some nice specification PDF's but I
               | assume you're talking about the ones which couldn't be
               | found? I mean this in a "they don't seem to exist" kind
               | of way and not that, you specifically, couldn't find
               | them.
               | 
               | I agree though. It's just that I've never really found
               | internet answers to be very useful. I did actually search
               | for information a few years back when I had to work with
               | a solar inverter datalogger, but it turned out that
               | having the ridicilously long German engineering manual
               | scanned, OCR processed and translated was faster. Anyway,
               | we all have our great white whales. I'm virtually
               | incapable of understanding the SQLAlchemy documentation
               | as an example, luckily I'll probably never have to use it
               | again.
        
             | brabel wrote:
             | I believe you don't mean to be rude, but you just sound
             | completely naive to me. To think that documentation
             | includes everything is just, like, have you actually been
             | coding anything at all that goes just slightly off the
             | happy path? Example from yesterday: I have a modular JavaFX
              | application (i.e. it uses Java JPMS modules, not just
             | Maven/Gradle modules). I introduced a call to `url()` in
             | JavaFX CSS. That works when running using the classpath,
             | but not when using the module path. I spent half an hour
             | reading docs to see what they say about modular
             | applications. They didn't mention anything at all.
              | Especially because in my case, I was not just doing
             | `getClass().getResource`... I was using the CSS directive
             | to load a resource from the jar. This is exactly when I
             | would likely go on SO and ask if anyone had seen this
             | before. It used to be highly likely someone who's an expert
             | on JavaFX would see and answer my question, sometimes even
             | people who directly worked on JavaFX!
             | 
             | StackOverflow was not really meant for juniors, as juniors
             | usually can indeed find answers on documentation, normally.
             | It was, like ExpertsExchange before it, a place for
             | veterans to exchange tribal knowledge like this. If you
             | think only juniors use SO, you seem to have arrived at the
             | scene just yesterday and just don't know what you're
             | talking about.
        
         | ilrwbwrkhv wrote:
         | Being a dev at a large company is usually the sign that you're
         | not very good though. And anyone can start a company with the
         | right connections.
        
           | ksenzee wrote:
           | You've just disproved your own assertion. Either that or you
           | believe everyone who's any good has the right connections.
        
           | tomwojcik wrote:
           | That's a terrible blanket statement, very US-centric. Not
           | everyone wants to start a company and you can't just reduce
            | one's motivations to your measure of success.
        
             | joseda-hg wrote:
             | God knows many of the best devs I've known would be an
              | absolute nightmare on the business side; they'd rather have
              | a capable business person handle that part if they could.
        
         | benterix wrote:
         | > designed around filling in the implementations for you. 20
         | years ago that would have been from a live online database
         | 
         | This reminds me a bit of PowerBuilder (or was it
         | PowerDesigner?) from early 1990s. They sold it to SAP later, I
         | was told it's still being used today.
        
         | antirez wrote:
         | I have also many years of programming experience and find
         | myself strongly "accelerated" by LLMs when writing code. But,
          | if you think about it, it makes sense that many seasoned
         | programmers are using LLMs better. LLMs are a helpful tool, but
         | also a hard-to-use tool, and in general it's fair to think that
         | better programmers can do a better use of some assistant (human
         | or otherwise): better understanding its strengths, identifying
         | faster the good and bad output, providing better guidance to
         | correct the approach...
         | 
         | Other than that, what correlates more strongly with the ability
         | to use LLMs effectively is, I believe, language skills: the
         | ability to describe problems very clearly. LLMs reply quality
         | changes very significantly with the quality of the prompt.
         | Experienced programmers that can _also_ communicate effectively
         | provide the model with many design hints, details where to
         | focus, ..., basically escaping many local minima immediately.
        
           | bsenftner wrote:
           | Communication skills are the keys to using LLMs. Think about
           | it: every type of information you want is in them, in fact it
           | is there multiple times, with multiple levels of seriousness
           | in the treatment of the idea. If one is casual in their
           | request, using casual language, then the LLM will reply with
           | a casual reply because that matched your request best. To get
           | a hard, factual answer from those that are experts in a
           | subject, use the formal term, use the expert's language and
              | you'll get back a reply more likely to be correct because it's
           | in the same level of formal treatment as correct answers.
        
             | psychoslave wrote:
             | >every type of information you want is in them
             | 
              | Actually, I'm afraid not. It won't give us the step-by-step
              | scalable processes to make humanity as a whole enter an
              | indefinitely long period of world peace, with each of us
              | enjoying life in their own thriving manner. That would be
              | great information to broadcast, though.
              | 
              | Also, it is equally able to produce large piles of
              | completely delusional answers that mimic genuinely sincere
              | statements just as well. Of course, we can also receive
              | that kind of misguiding answer from humans. But the amount
              | of output that mere humans can throw out in such a form is
              | far more limited.
             | 
             | All that said, it's great to be able to experiment with it,
             | and there are a lot of nice and fun things to do with it.
             | It can be a great additional tool, but it won't be a self-
              | sufficient panacea as an information source.
        
               | bsenftner wrote:
               | > It won't give us the step by step scalable processes to
               | make humanity as a whole enter in a loop of indefinitely
               | long period of world peace
               | 
               | That's not anywhere, that's a totally unsolved and open
               | ended problem, why would you think an LLM would have
               | that?
        
               | fmbb wrote:
               | If what you meant was
               | 
               | > Think about it: every type of already solved problem
               | you want information about is in them, in fact it is
               | there multiple times, with multiple levels of seriousness
               | in the treatment of the idea.
               | 
               | then that was not clear from your comment saying LLMs
               | contain any information you want.
               | 
                | One has to be careful communicating about LLMs because
               | the world is full of people that actually believe LLMs
               | are generally intelligent super beings.
        
               | numpad0 wrote:
               | I think GP's saying that it must be in your prompt, not
               | in the weights.
               | 
                | If you want an LLM to make a sandwich, you have to tell
                | it you `want triangular sandwiches of standard serving
                | size made with white bread and egg based filling`, not
                | `it's almost noon and I'm wondering if a sandwich for
                | lunch is a good idea`. Fine-tuning partially solves that
                | problem, but it still prefers the former.
        
               | arminiusreturns wrote:
                | After a little prompt engineering:
               | https://0bin.net/paste/zolMrjVz#dgZrZzKU-
               | PlxdkJTdG0pZU9bsCM3...
        
               | psychoslave wrote:
               | Interesting, thanks for sharing. Could you also give some
               | insights on the process you followed?
        
               | arminiusreturns wrote:
               | Sure. Lately I've found that the "role" part of prompt
               | engineering seems to be the most important. So what I've
                | been doing is telling ChatGPT to play the role of _the
                | most educated/wise/knowledgeable/skilled $field $role
                | (advisor, lawyer, researcher, etc.) in the history of
                | the world_ and then giving it some context for the task
               | before asking for the actual task.
               | 
               | Sometimes asking it to self reflect on how the prompt
               | itself could be better engineered helps if the initial
               | response isn't quite right.
        
           | mhalle wrote:
           | I completely agree that communication skills are critical in
           | extracting useful work or insight from LLMs. The analogy for
           | communicating with people is not far-fetched. Communicating
           | successfully with a specific person requires an understanding
           | of their strengths and weaknesses, their tendencies and blind
           | spots. The same is true for communicating with LLMs.
           | 
           | I have actually found that, from a documentation point of
           | view, querying LLMs has made me better at explaining things
           | to people. If, given the documentation for a system or API, a
           | modern LLM can't answer specific questions about how to
           | perform a task, a person using the same documentation will
           | also likely struggle. It's proving to be a good way to test
           | the effectiveness of documentation, for humans and for LLMs.
        
           | LouisSayers wrote:
           | > the ability to describe problems very clearly
           | 
           | Yes, and to provide enough context.
           | 
           | There's probably a lot that experience is contributing to the
           | interaction as well, for example - knowing when the LLM has
           | gone too far, focusing on what's important vs irrelevant to
           | the task, modularising and refactoring code, testing etc
        
           | gen220 wrote:
           | Hey! Asking because I know you're a fellow vimmer [0]. Have
           | you integrated LLMs into your editor/shell? Or are you
           | largely copy-pasting context between a browser and vim? This
           | context-switching of it all has been a slight hang-up for me
           | in adopting LLMs. Or are you asking more strategic questions
           | where copy-paste is less relevant?
           | 
           | [0] your videos on writing systems software were part of what
           | inspired me to make a committed switch into vim. thank you
           | for those!
        
             | qup wrote:
             | You want aider.
        
           | rudiksz wrote:
           | > "seasoned programmers are using LLMs better".
           | 
           | I do not remember a single instance when code provided to me
           | by an LLM worked at all. Even if I ask for something small
           | that can be done in 4-5 lines of code, it's always broken.
           | 
           | From a fellow "seasoned" programmer to another: how the hell
           | do you write the prompts to get back correct working code?
        
             | jkaptur wrote:
             | The story from the article matches my experience. The LLM's
             | first answer is often a _little_ broken, so I tweak it
             | until it's actually correct.
        
             | numpad0 wrote:
             | dc: not a seasoned dev, with <b> and <h1> tags on "not".
             | 
             | They can't think for you. All intelligent thinking you have
             | to do.
             | 
             | First, give them a high-level requirement that can be
             | clarified into indented bullet points that look like code.
             | Or give them such a list directly. Don't give them the
             | half-open questions usually favored by talented and
             | autonomous individuals.
             | 
             | Then let them decompress those pseudocode bullet points
             | into code. They'll give you back code that resembles a
             | digitized paper test answer. Fix the obvious errors and
             | you get B-grade compiling code.
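             | 
             | As a rough illustration (a hypothetical example, not from
             | the article), the bullet points might be:
             | 
             |     # - read entries from config.json
             |     # - keep only entries newer than `cutoff`
             |     # - return them sorted by timestamp
             | 
             | and the "decompressed" code something like:
             | 
             |     import json
             |     from datetime import datetime
             | 
             |     def load_recent(path, cutoff):
             |         # Read the JSON file from disk.
             |         with open(path) as f:
             |             entries = json.load(f)
             |         # Keep only entries newer than the cutoff.
             |         recent = [e for e in entries
             |                   if datetime.fromisoformat(e["timestamp"]) > cutoff]
             |         # Return them sorted by timestamp.
             |         return sorted(recent, key=lambda e: e["timestamp"])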
             | 
             | They can't do unconventional structures, Quake-style
             | performance-optimized code, realtime robotics, cooperative
             | multithreading, etc., just good old it-takes-what-it-takes
             | GUI app, API, and data manipulation code.
             | 
             | For those use cases, with these points in mind, it's a lot
             | faster to let an LLM generate tokens than to type `int
             | this_mandatory_function_does_obvious (obvious *obvious){
             | ...` manually on a keyboard. That should arguably be a
             | productivity boost in the sense that the user of the LLM is
             | effectively typing faster.
        
             | HappMacDonald wrote:
             | I'd ask things like "which LLM are you using", and "what
             | language or APIs are you asking it to write for".
             | 
             | For the standard answers ("GPT-4 or above", "Claude Sonnet
             | or Haiku", or models of similar power), well-known
             | languages like Python, JavaScript, Java, or C, and assuming
             | no particularly niche or unheard-of APIs or project
             | contexts, the failure rate of 4-5-line scripts in my
             | experience is less than 1%.
        
             | wvenable wrote:
             | I rarely get back non-working code, but I've also
             | internalized its limitations, so I no longer ask it for
             | things it's not going to be able to do.
             | 
             | As other commenters have pointed out, there's also a lot of
             | variation between different models, and some are quite dumb.
             | 
             | I've had no issues with 10-20 line coding problems. I've
             | also had it build a lot of complete shell scripts and had
             | no problem there either.
        
             | antirez wrote:
             | Check my YouTube channel if you have a few minutes. I just
             | published a video about adding a complex feature (UTF-8) to
             | the Kilo editor, using Claude.
        
             | mordymoop wrote:
             | I write the prompt as if I'm writing an email to a
             | subordinate that clearly specifies what the code needs to
             | do.
             | 
             | If what I'm requesting is an improvement to existing code,
             | I paste the whole code if practical, or if not, as much of
             | the code as possible, as context before making the request
             | for additional functionality.
             | 
             | Often these days I add something like "preserve all
             | currently existing functionality." Weirdly, as the models
             | have gotten smarter, they have also gotten more prone to
             | delete stuff they view as unnecessary to the task at hand.
             | 
             | If what I'm doing is complex (a subjective judgement) I ask
             | it to lay out a plan for the intended code before starting,
             | giving me a chance to give it a thumbs up or clarify its
             | understanding of what I'm asking for if its plan is off
             | base.
        
           | kragen wrote:
           | That's really interesting. What are the most important things
           | you've learned to do with the LLMs to get better results?
           | What do your problem descriptions look like? Are you going
           | back and forth many times, or crafting an especially-high-
           | quality initial prompt?
        
             | antirez wrote:
             | I'm posting a set of videos on my YT channel where I'll
             | show the process I follow. Thanks!
        
               | kragen wrote:
               | That's fantastic! I thought about asking if you had
               | streamed any of it, but I didn't want to sound demanding
               | and entitled :)
        
         | ignoramous wrote:
         | > _[David, Former staff engineer at Google ... CTO of
         | Tailscale,] doesn't need LLMs. That he says LLMs make him more
         | productive at all as a hands-on developer, especially around
         | first drafts on a new idea, means a lot to me..._
         | 
         | Don't doubt for a second the pedigree of founding engs at
         | Tailscale, but David is careful to point out exactly why LLMs
         | work for them (but might not for others):
         | 
         |     I am doing a particular kind of programming, product
         |     development, which could be roughly described as trying to
         |     bring programs to a user through a robust interface. That
         |     means I am building a lot, throwing away a lot, and
         |     bouncing around between environments. Some days I mostly
         |     write typescript, some days mostly Go. I spent a week in a
         |     C++ codebase last month exploring an idea, and just had an
         |     opportunity to learn the HTTP server-side events format. I
         |     am all over the place, constantly forgetting and
         |     relearning.
         | 
         |     If you spend more time proving your optimization of a
         |     cryptographic algorithm is not vulnerable to timing attacks
         |     than you do writing the code, I don't think any of my
         |     observations here are going to be useful to you.
        
           | pplonski86 wrote:
           | I'm in a similar situation: I jump between many environments,
           | mainly Python and TypeScript, though I'm currently testing a
           | new learning-algorithm idea in C++, and I simply don't always
           | remember all the syntax. I was very skeptical about LLMs at
           | first. Now I'm using them daily. I can focus more on thinking
           | rather than searching Stack Overflow. Very often I just need
           | a simple function, and it's much faster to create it with
           | chat.
        
             | JKCalhoun wrote:
             | And if anyone remembers: before Stack Overflow you more or
             | less had to specialize in a domain, become good with a
             | handful of frameworks/APIs on one platform. Learning a new
             | language, a new API (god forbid a new platform) was to
             | sail, months long, into seas unknown.
             | 
             | In this regard, with first Stack Overflow and now LLMs, the
             | field has improved mightily.
        
           | big_youth wrote:
           | > If you spend more time proving your optimization of a
           | cryptographic algorithm is not vulnerable to timing attacks
           | than you do writing the code, I don't think any of my
           | observations here are going to be useful to you.
           | 
           | I am not a software dev, I am a security researcher. LLMs are
           | great for my security research! It is so much easier and
           | faster to iterate on code like fuzzers to do security
           | testing. Writing code to do a padding oracle attack would
           | have taken me a week+ in the past. Now I can work with an LLM
           | to write code and learn and break within the day.
           | 
           | It has accelerated my security research 10 fold, just because
           | I am able to write code and parse and interpret logs at a
           | level above what I was able to a few years ago.
        
         | Vox_Leone wrote:
         | I have been using LLMs to generate functional code from *pseudo-
         | code* with excellent results. I am starting to experiment with
         | UML diagrams, using both LLMs and computer vision to generate
         | code from the diagrams; for example, a simple activity diagram
         | could serve as the prompt for an LLM and might look like:
         | 
         | Start -> Enter Credentials -> Validate -> [Valid] -> Welcome
         | Message -> [Invalid] -> Error Message
         | 
         | Corresponding Code (Python Example):
         | 
         | class LoginSystem:
         |     def validate_credentials(self, username, password):
         |         if username == "admin" and password == "password":
         |             return True
         |         return False
         | 
         |     def login(self, username, password):
         |         if self.validate_credentials(username, password):
         |             return "Welcome!"
         |         else:
         |             return "Invalid credentials, please try again."
         | 
         | *Edited for clarity
        
           | jonvk wrote:
           | This example illustrates one of the risks of using LLMs
           | without subject expertise though. I just tested this with
           | claude and got that exact same validation method back. Using
           | string comparison is dangerous from a security perspective
           | [1], so this is essentially unsafe validation, and there was
           | no warning in the response about this.
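           | 
           | For what it's worth, a minimal sketch of the usual fix in
           | Python, using the standard library's constant-time comparison
           | (this only addresses the timing side; real code would store a
           | password hash rather than the literal string):
           | 
           |     import hmac
           | 
           |     def validate_credentials(username, password):
           |         # compare_digest takes time independent of where the
           |         # first mismatching byte is, unlike ==.
           |         ok_user = hmac.compare_digest(username.encode(), b"admin")
           |         ok_pass = hmac.compare_digest(password.encode(), b"password")
           |         return ok_user and ok_pass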
           | 
           | 1. https://sqreen.github.io/DevelopersSecurityBestPractices/t
           | im...
        
             | jpc0 wrote:
             | Are you talking about the timing based attacks on that
             | website which fails miserably at rendering a useable page
             | on mobile?
        
           | jpc0 wrote:
           | Could you add to the prompt that the password is stored in a
           | SQLite database, hashed with argon2, and that the hashing
           | parameters are stored as environment variables?
           | 
           | You would like it to avoid timing-based attacks as well as
           | DoS attacks.
           | 
           | It should also generate the functions as pure functions so
           | that state is passed in and passed out and no side
           | effects(printing to the console) happen within the function.
           | 
           | Then also confirm for me that it has handled all error cases
           | that might reasonably happen.
           | 
           | While you are doing that, just think about how much implicit
           | knowledge I just had to type into the comment here and that
           | is still ignoring a ton of other knowledge that needs to be
           | considered like whether that password was salted before being
           | stored. All the error conditions for the sqlite
           | implementation in python, the argon2 implementation in the
           | library.
           | 
           | TLDR: that code is useless and would have taken me the same
           | amount of time to write as your prompt.
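           | 
           | For a sense of scale, even a minimal sketch of just the
           | verification step (assuming the argon2-cffi package and a
           | users table already populated with argon2 hashes; the table
           | and column names are made up) already carries a lot of that
           | implicit knowledge:
           | 
           |     import sqlite3
           |     from argon2 import PasswordHasher
           |     from argon2.exceptions import VerifyMismatchError
           | 
           |     ph = PasswordHasher()  # argon2id with library defaults
           | 
           |     def check_login(db_path, username, password):
           |         # Parameterised query: avoids SQL injection on the lookup.
           |         con = sqlite3.connect(db_path)
           |         try:
           |             row = con.execute(
           |                 "SELECT password_hash FROM users WHERE username = ?",
           |                 (username,),
           |             ).fetchone()
           |         finally:
           |             con.close()
           |         if row is None:
           |             return False
           |         try:
           |             # argon2 stores its own salt and parameters in the
           |             # hash; verify() is deliberately slow, which also
           |             # blunts brute force.
           |             return ph.verify(row[0], password)
           |         except VerifyMismatchError:
           |             return False
           | 
           | And that still says nothing about rate limiting, account
           | lockout, or where the hashing parameters actually live.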
        
         | apwell23 wrote:
         | He is using LLMs for coding. You don't become a staff engineer
         | by being a badass coder. Not sure how the two are related.
        
         | HarHarVeryFunny wrote:
         | > His post reminds me of an old idea I had of a language where
         | all you wrote was function signatures and high-level control
         | flow
         | 
         | Regardless of language, that's basically how you approach the
         | design of a new large project - top down architecture first,
         | then split the implementation into modules, design the major
         | data types, write function signatures. By the time you are done
         | what is left is basically the grunt work of implementing it
         | all, which is the part that LLMs should be decent at,
         | especially if the functions/methods are documented to a level
         | (input/output assertions as well as functionality) where it can
         | also write good unit tests for them.
        
           | dingnuts wrote:
           | > the grunt work of implementing it all
           | 
           | you mean the fun part. I can really empathize with digital
           | artists. I spent twenty years honing my ability to write code
           | and love every minute of it and you're telling me that in a
           | few years all that's going to be left is PM syncs and OKRs
           | and then telling the bot what to write
           | 
           | if I'm lucky to have a job at all
        
             | HarHarVeryFunny wrote:
             | I think it depends on the size of the project. To me, the
             | real fun of being a developer is the magic of being able to
             | conceive of something and then conjure it up out of thin
             | air - to go from an idea to reality. For a larger more
             | complex project the major effort in doing this is the
             | solution conception, top-down design (architecture), and
             | design of data structures and component interfaces... The
             | actual implementation (coding), test cases and debugging,
             | then does become more like drudgework, not the most
             | creative or demanding part of the project, other than the
             | occasional need for some algorithmic creativity.
             | 
             | Back in the day (I've been a developer for ~45 years!) it
             | was a bit different as hardware constraints (slow 8-bit
             | processors with limited memory) made algorithmic and code
             | efficiency always a primary concern, and that aspect was
             | certainly fun and satisfying, and much more a part of the
             | overall effort than it is today.
        
         | mahmoudimus wrote:
         | Isn't that the idea behind UML? That didn't work out so well;
         | however, with the advent of LLMs today, I think the premise
         | could work.
        
       | agentultra wrote:
       | It seems nice for small projects but I wouldn't use it for
       | anything serious that I want to maintain long term.
       | 
       | I would write the tests first and foremost: they are the
       | specification. They're for future me and other maintainers to
       | understand and I wouldn't want them to be generated: write them
       | with the intention of explaining the module or system to another
       | person. If the code isn't that important I'll write unit tests.
       | If I need better assurances I'll write property tests at a
       | minimum.
       | 
       | If I'm working on concurrent or parallel code or I'm working on
       | designing a distributed system, it's gotta be a model checker.
       | I've verified enough code to know that even a brilliant human
       | cannot find 1-in-a-million programming errors that surface in
       | systems processing millions of transactions a minute. We're not
       | wired that way. Fortunately we have formal methods. Maths is an
       | excellent language for specifying problems and managing
       | complexity. Induction, category theory, all awesome stuff.
       | 
       | Most importantly though... you have to write the stuff and read
       | it and interact with it to be able to keep it in your head.
       | Programming is theory-building as Naur said.
       | 
       | Personally I just don't care to read a bunch of code and play,
       | "spot the error;" a game that's rigged for me to be bad at. It's
       | much more my speed to write code that obviously has no errors in
       | it because I've thought the problem through. Although I struggle
       | with this at times. The struggle is an important part of the
       | process for acquiring new knowledge.
       | 
       | Though I do look forward to algorithms that can find proofs of
       | trivial theorems for me. That would be nice to hand off...
       | although simp does a lot of work like that already. ;)
        
       | rafaelmn wrote:
       | I disagree about search. While an LLM can give you an answer
       | faster, good docs (e.g. the MDN article in the CSS example) will:
       | 
       | - be way more reliable
       | 
       | - probably be up to date on how to solve it with the
       | latest/recommended approach
       | 
       | - put you in a place where you can search for adjacent tech
       | 
       | LLMs with search have potential, but I'd like it if current tools
       | were oriented more toward source material than AI paraphrasing.
        
         | cruffle_duffle wrote:
         | One of my tricks is to paste the docs right into the context so
         | the model can't fuck it up.
         | 
         | Though I still wonder if that means I'm only tricking myself
         | into thinking the LLM is increasing my productivity.
        
           | rafaelmn wrote:
           | I like this approach. Read the docs, figure out what you
           | want, get the LLM to do the grunt work with all relevant
           | context, and review.
        
         | EGreg wrote:
         | I have found LLMs to be 95% useful on documented software:
         | everything from Uniswap smart contracts to Cordova plugins to
         | setting up Mac or Linux administrative tools.
         | 
         | The problem for a regular person is that you have to copy-paste
         | from chat. That is "the last mile". For terminal commands
         | that's fine, but for programming you need a tool to automate
         | this.
         | 
         | Something like refactoring a function, given the entire
         | context, etc. And it happening in the editor and you seeing a
         | diff right away. The rest of the explanatory text should go
         | next to the diff in a separate display.
         | 
         | I bet someone can make a VSCode extension that chats with an
         | LLM and does exactly this. The LLM is told to provide all the
         | sections labeled clearly (code, explanation) and the editor
         | makes the diff.
         | 
         | Having said all that, good libraries that abstract away
         | differences are far superior to writing code with an LLM. The
         | only code that needs to be written is the interface and wiring
         | up between the libraries.
        
       | Ozzie_osman wrote:
       | One mode I felt was missed was "thought partner", especially
       | while debugging (aka rubber ducking).
       | 
       | We had an issue recently with a task queue seemingly randomly
       | stalling. We were able to arrive at the root cause much more
       | quickly than we would have because of a back-and-forth
       | brainstorming session with Claude, which involved describing the
       | issue we were seeing, pasting in code from the library to ask
       | questions, asking it to write some code to add some missing
       | telemetry, and then probing it for ideas on what might be going
       | wrong. An issue that may have taken days to debug took about an
       | hour to identify.
       | 
       | Think of it as rubber ducking with a very strong generalist
       | engineer who knows about basically any technical concepts.
        
         | mmahemoff wrote:
         | The new video and screen-share capabilities in ChatGPT and
         | Gemini should make rubber-ducking smoother.
         | 
         | I feel like I've worn out my computer's clipboard and alt-tab
         | keys at this stage of the LLM experience.
        
           | fragmede wrote:
           | You may want to try any of the tools that can write to the
           | filesystem so you're at least not copy pasting code from a
           | chat window. CoPilot, Cursor, Aider, Tabnine, etc.
        
         | vendiddy wrote:
         | I found myself doing this with o1 recently for software
         | architecture.
         | 
         | I will evaluate design ideas with the model, express concerns
         | on trade-offs, ask for alternative ideas, etc.
         | 
         | Some of the benefit is having someone to talk to, but with
         | proper framing it is surprisingly good at giving balanced
         | takes.
        
       | simondotau wrote:
       | I've recently started using Cursor because it means I can now
       | write python where two weeks ago I couldn't write python. It
       | wrote the first pass of an API implementation by feeding it the
       | PDF documentation. I've spent a few days testing and massaging it
       | into a well formed, well structured library, pair-programming
       | style.
       | 
       | Then I needed to write a simple command line utility, so I wrote
       | it in Go, even though I've never written Go before. Being able to
       | make tiny standalone executables which do real work is
       | incredible.
       | 
       | Now if I ever need to write something, I can choose the language
       | most suited to the task, not the one I happen to have the most
       | experience with.
       | 
       | That's a superpower.
        
         | midasz wrote:
         | But you're not really writing python right? You're instructing
         | a tool to generate python. Kinda like saying I'm writing
         | bytecode while I'm actually just typing Java.
        
           | simondotau wrote:
           | I am really writing python. The LLM is a substitute for
           | having foreknowledge of this particular language's syntax and
           | grammar, but I'm still debugging like a "real" programmer and
           | I'm still editing/refining the code like a "real" programmer,
           | because I am.
           | 
           | Probably half the lines of code were written by me, because I
           | do know how to write code.
           | 
           | Here's what I wrote if you're curious:
           | https://github.com/sjwright/zencontrol-python/
        
       | yawnxyz wrote:
       | > I could not go a week without getting frustrated by how much
       | mundane typing I had to do before having a FIM model
       | 
       | For those not in-the-know, I just learned today that code
       | autocomplete is actually called "Fill-in-the-Middle" tasks
        
         | Guthur wrote:
         | Says who? I've been in the industry for nearly 25 years and
         | have heard auto complete throughout but not once have I heard
         | fill in the middle.
         | 
         | Stop taking these blogs as oracles of truth, they are not.
         | These AI articles are full of this nonsense, to the point where
         | it would appear to me many responses might just be Nvidia bots
         | or whatever.
        
           | sunaookami wrote:
           | >I've been in the industry for nearly 25 years and have heard
           | auto complete throughout but not once have I heard fill in
           | the middle
           | 
           | Then you need to look harder. FiM is a common approach for
           | code generation LLMs.
           | 
           | https://openai.com/index/efficient-training-of-language-
           | mode...
           | 
           | https://arxiv.org/abs/2207.14255
           | 
           | This was before ChatGPT's release btw.
        
             | Guthur wrote:
             | Why, what was wrong with "code completion"? It was perfectly
             | valid before, even when including some sort of fuzzing.
             | 
             | It's like everything to do with LLM marketing buzzword
             | nonsense.
             | 
             | I really want to just drop out of tech until all this
             | obnoxious hype BS is gone.
        
               | ascorbic wrote:
               | Autocomplete is the feature, fill in the middle is one
               | approach to implementing it. There are other ways of
               | providing it (which were used in earlier versions of
               | Copilot) and FIM can be used for tasks other than code
               | completion.
        
               | wruza wrote:
               | It's just a term that signals "completion in between"
               | rather than "after". Regular code completion usually
               | doesn't take the following blocks into account mostly
               | because these are grammatically vague due to an ongoing
               | edit.
               | 
               | Your comments may deserve some sympathy, but why on earth
               | are they addressed to the root commenter? They simply
               | shared their findings about an acronym.
        
               | Guthur wrote:
               | Because they mentioned it. Why on earth would you think
               | that is not a valid response in a thread that mentions
               | it? From my observation that's pretty much how forum-like
               | threads work.
               | 
               | More pressingly why do you think you should police it?
        
               | wruza wrote:
               | Apologies if my feedback annoyed you, it wasn't the goal.
               | I just care about HN and this didn't feel right.
        
           | crawshaw wrote:
           | Author here.
           | 
           | FIM is a term of art in LLM research for a style of tokens
           | used to implement code completion. In particular, it refers
           | to training an LLM with the extra non-printing tokens:
           | 
           |     <|fim_prefix|>
           |     <|fim_middle|>
           |     <|fim_suffix|>
           | 
           | You would then take code like this:
           | 
           |     func add(a, b int) int {
           |         return <cursor>
           |     }
           | 
           | and convert it to:
           | 
           |     <|fim_prefix|>func add(a, b int) int {
           |         return<|fim_suffix|>
           |     }<|fim_middle|>
           | 
           | and have the LLM predict the next token.
           | 
           | It is, in effect, an encoding scheme for getting the prefix
           | and suffix into the LLM context while positioning the next
           | token to be where the cursor is.
           | 
           | (There are several variants of this scheme.)
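           | 
           | As a concrete illustration (not from the post; the exact token
           | spellings and ordering differ between models), assembling the
           | prompt is just string concatenation:
           | 
           |     def fim_prompt(prefix, suffix):
           |         # Prefix and suffix of the file go in first; the model's
           |         # next tokens are then the "middle", i.e. the completion
           |         # at the cursor position.
           |         return (f"<|fim_prefix|>{prefix}"
           |                 f"<|fim_suffix|>{suffix}<|fim_middle|>")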
        
       | ripped_britches wrote:
       | I'll say that the payoff for investing the time to learn how to
       | do this right is huge. Especially with Cursor, which allows me to
       | easily chat around context (docs, library files, etc)
        
         | Aeolun wrote:
         | I didn't believe it could be so good until I actually used it.
         | It's a shame some of their models are proprietary because that
         | means I can't use it for work. Would love if the thing worked
         | purely with Copilot Chat (like Zed does), or if Zed added a
         | similar composer mode.
        
       | brabel wrote:
       | What the author is asking about, a quick sketchpad where you can
       | try out code quickly and chat with the AI, already exists in the
       | JetBrains IDEs. It's called a scratch file[1].
       | 
       | As far as I know, the idea of a scratch "buffer" comes from
       | emacs. But in Jetbrains IDEs, you have the full IDE support even
       | with context from your current project (you can pick the
       | "modules" you want to have in context). Given the good
       | integration with LLMs, that's basically what the author seems to
       | want. Perhaps give GoLand[2] a try.
       | 
       | Disclosure: no, I don't work for Jetbrains :D just a very happy
       | customer.
       | 
       | [1] https://www.jetbrains.com/help/idea/scratches.html
       | 
       | [2] https://www.jetbrains.com/go/
        
         | ryanobjc wrote:
         | It's also available in emacs with packages like gptel which let
         | you send the content of any buffer to your LLM of choice.
         | 
         | I think emacs + LLM is a killer feature: the integration is
         | super deep, deeper than any IDE I've seen, and it's just
         | available... everywhere! Any text in emacs is sendable to a
         | LLM.
        
           | brabel wrote:
           | I need to try that, but I have a feeling that in emacs it
           | won't work as well because emacs has a bit more "trouble"
           | setting up workspaces and using context only from that.
           | Trying to use `project.el` now as it seems projectile has
           | been superseded by it; if you know how to easily set that up
           | with eglot support + AI, that would be helpful.
        
       | justinl33 wrote:
       | I've maintained several SDKs, and the 'cover everything' approach
       | leads to nightmare dependency trees and documentation bloat. imo,
       | the LLM paradigm shifts this even further - why maintain a
       | massive SDK when users can generate precisely what they need?
       | This could fundamentally change how we think about API
       | distribution.
        
       | golergka wrote:
       | I have written a small fullstack app over the holidays, mostly
       | with LLMs, to see how far would they get me. Turns out, they can
       | easily write 90% of the code, but you still need to review
       | everything, make the main architectural decisions and debug stuff
       | when AI can't solve the bug after 2-3 iterations. I get a huge
       | productivity boost and at the same time am not afraid that they
       | will replace me. At least not yet.
       | 
       | Can't recommend aider enough. I've tried many different coding
       | tools, but they all seem like a leaky abstraction over LLMs
       | medium of sequential text generation. Aider, on the other hand,
       | leans into it in the best possible way.
        
       | lysecret wrote:
       | Funny, he starts off dismissing an AI IDE only to end up building
       | an AI IDE :D (Smells a little bit like not-invented-here
       | syndrome.) Otherwise fascinating article!
        
         | cpursley wrote:
         | I joke about once per month here that half of hn is basically
         | "not invented here syndrome". And generally poor
         | reimplementations of existing erlang features ;)
        
       | bambax wrote:
       | > _There are three ways I use LLMs in my day-to-day programming:
       | 1/ Autocomplete 2/ Search 3/ Chat-driven programming_
       | 
       | I do mostly 2/ Search, which is like a personalized Stack
       | Overflow and sometimes feels incredible. You can ask a general
       | question about a specific problem and then dive into some
       | specific point to make sure you understand every part clearly.
       | This works best for things one doesn't know enough about, but has
       | a general idea of how the solution should sound or what it should
       | do. Or, copy-pasting error messages from tools like Docker and
       | have the LLM debug it for you really feels like magic.
       | 
       | For some reason I have always disliked autocomplete anywhere, so
       | I don't do that.
       | 
       | The third way, chat-driven programming, is more difficult,
       | because the code generated by LLMs can be large, and can also be
       | wrong. LLMs are too eager to help, and they will try to find a
       | solution even if there isn't one, and will invent it if
       | necessary. Telling them in the prompt to say "I don't know" or
       | "it's impossible" if need be, can help.
       | 
       | But, like the author says, it's very helpful to get started on
       | something.
       | 
       | > _That is why I still use an LLM via a web browser, because I
       | want a blank slate on which to craft a well-contained request_
       | 
       | That's also what I do. I wouldn't like having something in the
       | IDE trying to second guess what I write or suddenly absorbing
       | everything into context and coming up with answers that it thinks
       | make a lot of sense but actually don't.
       | 
       | But the main benefit is, like the author says, that it lets one
       | start afresh with every new question or problem, and save focused
       | threads on specific topics.
        
       | polotics wrote:
       | My main usage is in helping me approach domains and tools I don't
       | know well enough to confidently get started with.
       | 
       | So one thing that doesn't get a mention in the article but is
       | quite significant I think is the long lag of knowledge cutoff
       | dates: looking at even the latest and greatest, there is one year
       | or more of missing information.
       | 
       | I would love for someone more versed than me to tell us how best
       | to use RAG or LoRA to get the model to answer with fully up to
       | date knowledge on libraries, frameworks, ...
        
       | choeger wrote:
       | Essentially, an LLM is a compressed database with a universal
       | translator.
       | 
       | So what we can get out of it is everything that has been written
       | (and publicly released) before translated to any language it
       | knows about.
       | 
       | This has some consequences.
       | 
       | 1. Programmers still need to know what algorithms or interfaces
       | or models they want.
       | 
       | 2. Programmers do not have to know a language very well anymore
       | to write code, but they do for bug fixing. Consequently, the
       | rift between garbage software and quality software will grow.
       | 
       | 3. New programming languages will face a big economic hurdle to
       | take off.
        
         | williamcotton wrote:
         | _3. New programming languages will face a big economic hurdle
         | to take off._
         | 
         | I bet the opposite. I've written a number of DSLs and tooling
         | around them over the last year as LLMs have allowed me to take
         | on much bigger projects.
         | 
         | I expect we see an explosion of languages over the next decade.
        
           | klibertp wrote:
           | Yes - the number of languages will grow, however, their
           | _adoption_ will be much slower and harder to enact than now
           | (and it's already incredibly difficult).
           | 
           | You might have written the DSLs, but the LLMs are unaware of
           | this and will offer hallucinations when asked to generate
           | code using that DSL.
           | 
           | For the past few weeks I've been slowly getting back to
           | Common Lisp. Even though there's plenty of CL code on the
           | net, its volume is dwarfed by Python or JS. In effect, both
           | Github Copilot and ChatGPT (4o) have an accuracy of 5%. I'm
           | not kidding: they're unable to generate even very simple
           | snippets correctly, hallucinating packages and functions.
           | 
           | It's of course (I think?) possible to make a GPT specialized
           | for Lisp, but if the generic model performs poorly, it'll
           | probably make people wary and stay away from the language.
           | So, unless you're ready to fine-tune a model for your
           | language and somehow distribute it to your users, you'll see
           | adoption rates dropping (from already minuscule ones!)
        
       | stevage wrote:
       | This is a great article with lots of useful insights.
       | 
       | But I'm completely unconvinced by the final claim that LLM
       | interfaces should be separate from IDEs, and should be their own
       | websites. No thanks.
        
       | dxuh wrote:
       | Currently a lot of my work consists of looking at large, (to me)
       | unknown code bases and figuring out how certain things work. I
       | think LLMs are currently very bad at this and it is my
       | understanding that there are problems in increasing context
       | window sizes to multiple millions of tokens, so I wonder if LLMs
       | will ever get good at this.
        
         | AnnKey wrote:
         | I would speculate that for learning unknown codebases, fine-
         | tuning might work better than relying on context window size.
        
       | jmull wrote:
       | LLM auto-complete is good -- it suggests more of what I was going
       | to type, and correctly (or close enough) often enough that it's
       | useful. Especially in the boilerplate-y languages/code I have to
       | use for $dayjob.
       | 
       | Search has been neutral. For finding little facts it's been about
       | the same as regular search. When digging in, I want
       | comprehensive, dense, reasonably well-written reference
       | documentation. That's not exactly widespread, but LLMs don't
       | provide this either.
       | 
       | Chat-driven generates too much buggy/incomplete code to be
       | useful, and the chat interface is seriously clunky.
        
       | Ygg2 wrote:
       | > Search. If I have a question about a complex environment, say
       | "how do I make a button transparent in CSS" I will get a far
       | better answer asking any consumer-based LLM, than I do using an
       | old fashioned web search engine.
       | 
       | I don't think this is about LLMs getting better, but about search
       | becoming worse, in no small part thanks to LLMs polluting the
       | results. Do an image search for a few terms and count how many
       | results are AI-generated.
       | 
       | I can say I got better results from the Google of X years ago
       | than from the Google of today.
        
         | wizzard0 wrote:
         | Google gets money from showing you ads, not because you pay
         | them for quality search results.
         | 
         | When you have to come over and over, and visit more pages to
         | finally find what you needed, they get much more cash from
         | advertisers than when you get everything instantly.
        
       | EGreg wrote:
       | Can't we just use test-driven development with AI Agents?
       | 
       | 1) Idea
       | 
       | 2) Tests
       | 
       | 3) Code until all tests pass
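       | 
       | Roughly, step 3 is a loop. A minimal sketch (ask_llm is a
       | hypothetical stand-in for whatever model or agent API you use):
       | 
       |     import subprocess
       | 
       |     def ask_llm(prompt):
       |         # Hypothetical: call your model of choice, return code.
       |         raise NotImplementedError
       | 
       |     def tdd_loop(spec, rounds=5):
       |         feedback = ""
       |         for _ in range(rounds):
       |             code = ask_llm(f"{spec}\nMake the tests pass.\n{feedback}")
       |             with open("solution.py", "w") as f:
       |                 f.write(code)
       |             # Run the human-written tests against the generated code.
       |             result = subprocess.run(
       |                 ["pytest", "-q"], capture_output=True, text=True)
       |             if result.returncode == 0:
       |                 return True
       |             feedback = result.stdout + result.stderr
       |         return False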
        
       | ianpurton wrote:
       | I've been coding professionally for 30 years.
       | 
       | I'm probably in the same place as the author, using ChatGPT to
       | create functions etc., then cutting and pasting them into VSCode.
       | 
       | I've started using cline which allows me to code using prompts
       | inside VSCode.
       | 
       | i.e. Create a new page so that users can add tasks to a tasks
       | table.
       | 
       | I'm getting mixed results, but it is very promising. I create a
       | clinerules file which gets added to the system prompt so the AI
       | is more aware of my architecture. I'm also looking at overriding
       | the cline system prompt to both make it fit my architecture
       | better and also to remove stuff I don't need.
       | 
       | I jokingly imagine in the future we won't get asked how long a
       | new feature will take, rather, how many tokens will it take.
        
         | thomasfromcdnjs wrote:
         | Love the token joke!
        
       | assimpleaspossi wrote:
       | Since all these AI products just put together things they pull
       | from elsewhere, I'm wondering if, eventually, there could be
       | legal issues involving software products put together using such
       | things.
        
       | sublimefire wrote:
       | I've been doing that for a while as well and mostly agree.
       | Although one thing that I find useful is to build the local
       | infrastructure to be able to collect useful prompts and the
       | ability to work with files and URLs. The web interface alone is
       | limiting.
       | 
       | I like gptresearcher and all of the glue put in place to be able
       | to extend prompts and agents etc. Not to mention the ability to
       | fetch resources from the web and do research type summaries on
       | it.
       | 
       | All in all it reminds me of the work of security researchers,
       | pentesters, and analysts. Throughout their careers they would
       | build a
       | set of tools and scripts to solve various problems. LLMs kind of
       | force the devs to create/select tools for themselves to ease the
       | burden of their specific line of work as well. You could work
       | without LLMs but maybe it will be a bit more difficult to stand
       | out in the future.
        
       | denvermullets wrote:
       | this is almost exactly how ive been using llms. i dont like the
       | code complete in the ide, personally, and prefer all llm usage to
       | be narrow specific blocks of code. it helps as i bounce between a
       | lot of side projects, projects at work, and freelance projects.
       | not to mention with context switching it really helps keep things
       | moving, imo
        
       | owebmaster wrote:
       | I thought his project, sketch.dev, was of very poor quality. I
       | wouldn't ship something like this: the auth process is awful and
       | broken, and I still can't log in. If, 14 hours after the post,
       | the service is still hugged to death, it also means the
       | scalability of the app is bad. If we are going to use LLMs to
       | replace hours of programming, we should aim for quality too.
        
         | lm28469 wrote:
         | It's really bad, much less useful than even the first public
         | version of chatgpt. Even once you manage to log in, most of the
         | time it doesn't even give something that compiles, it calls
         | functions/variables which don't exist. The first line of the
         | main had 2 errors...
        
       | cratermoon wrote:
       | But the question must be asked: At what cost?
       | 
       | Are the results a paradigm shift so much better that it's worth
       | the hundreds of billions sunk into the hardware and data centers?
       | Is spicy autocomplete worth the equivalent of flying from New
       | York to London while guzzling thousands of liters of water?
       | 
       | It might work, for some definition of useful, but what happens
       | when the AI companies try to claw back some of that half a
       | trillion dollars they burnt?
        
         | ryanobjc wrote:
         | That's why open research (which "open" ai has never really
         | contributed to!) and foundational models that everyone can
         | contribute to are essential.
         | 
         | This stuff is a pretty neat magical evolution and it should not
         | be the domain of any single company.
         | 
         | Also a lot of the hardware and so on has/is being paid for. AWS
         | gcloud, etc aren't taking massive losses on their H100 and
         | other compute services. This bubble is no different than any
         | prior bubble ultimately, and bankruptcy will recycle useful
         | assets into new companies and new purposes.
         | 
         | Which is btw why the US is still a huge winner and will continue
         | to be -> robust and functioning bankruptcy laws and courts.
        
       | nunez wrote:
       | I definitely respect David's opinion given his caliber, but
       | pieces like this make me feel strange that I just don't have a
       | burning desire to use them.
       | 
       | Like, yesterday I made some light changes to a containerized VPN
       | proxy that I maintain. My first thought wasn't "how would Claude
       | do this?" Same thing with an API I made a few weeks ago that
       | scrapes a flight data website to summarize flights in JSON form.
       | 
       | I knew I would need to write some boilerplate and that I'd have
       | to visit SO for some stuff, but asking Claude or o1 to write the
       | tests or boilerplate for me wasn't something I wanted or needed
       | to do. I guess it makes me slower, sure, but I actually enjoy the
       | process of making the software end to end.
       | 
       | Then again, I do all of my programming on Vim and, technically,
       | writing software isn't my day job (I'm in pre-sales, so, best
       | case, I'm writing POC stuff). Perhaps I'd feel differently if I
       | were doing this day in, day out. (Interestingly, I feel the same
       | way about AI in this sense that I do about VSCode. I've used it;
       | I know what's it capable of; I have no interest in it at all.)
       | 
       | The closest I got to "I'll use LLMs for something real" was using
       | it in my backend app that tracks all of my expenses to parse
       | pictures of receipts. Theoretically, this will save me 30 seconds
       | per scan, as I won't need to add all of the transaction metadata
       | myself. Realistically, this would (a) make my review process
       | slower, as LLMs are not yet capable of saying "I'm not sure" and
       | I'd have to manually check each transaction at review time, (b)
       | make my submit API endpoint slower since it takes relatively-
       | forever for it to analyze images (or at least it did when I
       | experimented with this on GPT4-turbo last year), and (c) drive my
       | costs way up (this service costs almost nothing to run, as I run
       | it within Lambda's free tier limit).
        
         | uludag wrote:
         | I think there's a big selection bias on hackernews that you
         | wouldn't get elsewhere. There's still "elite" software
         | developers I see who really aren't into the whole LLM tooling
         | space. I found use in the autocomplete and search workflows
         | that the author mentioned but I stopped using these tools, out
         | of curiosity for things were before. It turns out I don't need
         | it to be productive and I too probably enjoy working more
         | without it.
        
         | ge96 wrote:
         | I'm an avg dev, I was never into LLMs/co-pilot etc mocking
         | prompt engineering but... my current job is working with an LLM
         | framework so idk... future proofs me I guess. I do like
         | computer vision and ML on dataset eg. training hand writing IMU
         | by gestures that's cool.
         | 
         | The embeddings I feel like there is something there even if it
         | doesn't actually understand. My journey has just begun.
         | 
         | I scoff every time someone says "this + AI". AI is this thing
         | they just throw in there. The last time I didn't want to work
         | with some tech I quit my job, which was not a good move given
         | that I'm not financially independent. Anyway, yeah, I'll keep
         | digging into
         | this. I still don't use co-pilot right now but I'm reading up
         | more on the embedding stuff for cross training or some case
         | like RAG.
        
       | 999900000999 wrote:
       | I still find most LLMs to be extremely poor programmers.
       | 
       | Claude will often generate tons and tons of useless code quickly,
       | using up its limit. I often find myself yelling at it to stop.
       | 
       | I was just working with it last night.
       | 
       | "Hi Claude, can you add tabs here.": <div>
       | 
       | <MainContent/>
       | 
       | <div/>
       | 
       | Claude will then start generating MainContent.
       | 
       | DeepSeek, despite being free, does a much better job than Claude.
       | I don't know if it's smarter, but whatever internal logic it has
       | is much more to the point.
       | 
       | Claude also has a very weird bias towards a handful of UI
       | libraries that it has installed, even if those wouldn't be good
       | for your project. I wasted hours on shadcn UI, which requires a
       | very particular setup to work.
       | 
       | LLMs are generally great at common tasks using a top-5 (by
       | popularity) language.
       | 
       | Ask it to do something in a Haxe UI library and it'll make up
       | functions that *look* correct.
       | 
       | Overall I like them, they definitely speed things up. I don't
       | think most experienced software engineers have much to worry
       | about for now. But I am really worried about juniors. Why hire a
       | junior engineer when you can just tell your seniors they need to
       | use Copilot to crank out more code?
        
         | joseda-hg wrote:
         | Assuming I know roughly what it will generate, I usually
         | prepend my chats with provisions against this kind of thing:
         | 
         | "Add tabs here, assume the rest of the page will work with no
         | futher modification, limit your changes so that any existing
         | code keeps working"
         | 
         | I also do stuff like "Project is using {X} libraries, keep
         | dependencies minimal.
         | 
         | Generate a method that takes {Z} parameters and returns {Y};
         | using {A}, {B} and {C}, do {thing}."
         | 
         | I'll add stuff like language version, frameworks, or specific
         | requests based on this, but then I just reuse the setup. So I
         | like to keep the first message with as much context as
         | possible, ideally separating project context from the specific
         | request.
        
       | btbuildem wrote:
       | The search part really resonates with me. I do a lot of
       | odd/unusual/one-off things for my side projects, and I use LLMs
       | extensively in helping me find a path forward. It's like an
       | infinitely patient, all-knowing expert that pulls together info
       | from any and all domains. Sometimes it will have answers that I am
       | unable to find another way (eg, what's the difference between
       | "busy s..." and "busy p..." AT command response on the esp8285?).
       | It saves me hours of struggle, and I would not want to go back to
       | the old ways.
        
       | fassssst wrote:
       | They're pretty great for printf debugging. Yesterday I was
       | confounded by a bug so I rapidly added a ton of logging that the
       | LLM wrote instantly, then I had the LLM analyze the state
       | difference between the repro and non repro logs. It found
       | something instantly that it would have taken me a few hours to
       | find, which led me to a fix.
        
       | hansvm wrote:
       | That quartile reservoir sampler example is ... intriguing?
       | 
       | My experience with LLM code is that it can't come up with
       | anything even remotely novel. If I say "make it run in amortized
       | O(1)" then 99 times out of 100 I'll get a solution so wildly
       | incorrect (but confidently asserting its own correctness) that it
       | can't possibly be reshaped into something reasonable without a
       | re-write. The remaining 1/100 times aren't usually "good" either.
       | 
       | For the reservoir sampler -- here, it did do the job. David
       | almost certainly knows enough to know the limits of that code and
       | is happy with its limitations. I've solved that particular
       | problem at $WORK though (reservoir sampling for percentile
       | estimates), and for the life of me I can't find a single LLM
       | prompt or sequence of prompts that comes anywhere close to
       | optimality unless that prompt also includes the sorts of insights
       | which lead to an amortized O(1) algorithm being possible (and,
       | even then, you still have to re-run the query many times to get a
       | useful response).
       | 
       | Picking on the article's solution a bit, why on earth is `sorted`
       | appearing in the quantile estimation phase? That's fine if you're
       | only using the data structure once (init -> finalize), but it's
       | uselessly slow otherwise, even ignoring splay trees or anything
       | else you could use to speed up the final inference further.
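       | 
       | (To make the shape concrete - a rough Go sketch of the basic
       | pattern being discussed, not the article's code, with made-up
       | names. Add is O(1) per element, but Quantile re-sorts the whole
       | reservoir on every call, which is fine for a single init ->
       | finalize pass and wasteful if you query repeatedly.)
       | 
       |     package main
       |     
       |     import (
       |         "fmt"
       |         "math/rand"
       |         "sort"
       |     )
       |     
       |     // Reservoir keeps a fixed-size uniform sample of a stream.
       |     type Reservoir struct {
       |         samples []float64
       |         seen    int
       |         rng     *rand.Rand
       |     }
       |     
       |     func NewReservoir(size int, seed int64) *Reservoir {
       |         return &Reservoir{
       |             samples: make([]float64, 0, size),
       |             rng:     rand.New(rand.NewSource(seed)),
       |         }
       |     }
       |     
       |     // Add is O(1): keep the first cap(samples) values, then
       |     // replace a random slot with decreasing probability.
       |     func (r *Reservoir) Add(v float64) {
       |         r.seen++
       |         if len(r.samples) < cap(r.samples) {
       |             r.samples = append(r.samples, v)
       |             return
       |         }
       |         if j := r.rng.Intn(r.seen); j < len(r.samples) {
       |             r.samples[j] = v
       |         }
       |     }
       |     
       |     // Quantile sorts the whole reservoir on every call: fine if
       |     // you only query once at the end, slow if you query often.
       |     func (r *Reservoir) Quantile(q float64) float64 {
       |         s := append([]float64(nil), r.samples...)
       |         sort.Float64s(s)
       |         return s[int(q*float64(len(s)-1))]
       |     }
       |     
       |     func main() {
       |         res := NewReservoir(1024, 1)
       |         for i := 0; i < 100000; i++ {
       |             res.Add(rand.Float64())
       |         }
       |         fmt.Println(res.Quantile(0.25), res.Quantile(0.5),
       |             res.Quantile(0.75))
       |     }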
       | 
       | I personally find LLMs helpful for development when either (1)
       | you can tolerate those sorts of mishaps (e.g., I just want to run
       | a certain algorithm through Scala and don't really care how slow
       | it is if I can run it once and hexedit the output), or (2) you
       | can supply all the auxiliary information so that the LLM has a
       | decent chance of doing it right -- once you've solved the hard
       | problems, the LLM can often get the boilerplate correct when
       | framing and encapsulating your ideas.
        
       | LouisSayers wrote:
       | The use of LLMs reminds me a bit of how people use search
       | engines.
       | 
       | Some years ago I gave a task to some of my younger (but
       | intelligent) coworkers.
       | 
       | They spent about 50 minutes searching on Google and came back to
       | me saying they couldn't find what they were looking for.
       | 
       | I then typed in a query, clicked one of the first search results
       | and BAM! - there was the information they were unable to find.
       | 
       | What was the difference? It was the keywords / phrases we were
       | using.
        
       | highfrequency wrote:
       | > _A lot of the value I personally get out of chat-driven
       | programming is I reach a point in the day when I know what needs
       | to be written, I can describe it, but I don't have the energy to
       | create a new file, start typing, then start looking up the
       | libraries I need... LLMs perform that service for me in
       | programming. They give me a first draft, with some good ideas,
       | with several of the dependencies I need, and often some mistakes.
       | Often, I find fixing those mistakes is a lot easier than starting
       | from scratch._
       | 
       | This to me is the biggest advantage of LLMs. They dramatically
       | reduce the activation energy of _doing something you are
       | unfamiliar with_. Much in the way that you're a lot more likely
       | to try kitesurfing if you are at the beach standing next to a
       | kitesurfing instructor.
       | 
       | While LLMs may not yet have human-level _depth_, it's clear that
       | they already have vastly superhuman _breadth_. You can argue
       | about the current level of expertise (does it have undergrad
       | knowledge in every field? PhD level knowledge in every field?)
       | but you can't argue about the breadth of fields, nor that the
       | level of expertise improves every year.
       | 
       | My guess is that the programmers who find LLMs useful are people
       | who do a lot of different _kinds_ of programming every week (and
       | thus are constantly going from incompetent to competent in things
       | that other people already know), rather than domain experts who
       | do the same kind of narrow and specialized work every day.
        
         | otteromkram wrote:
         | I think your biggest takeaway should be that the person
         | writing the blog post is extremely well-versed in programming
         | and has labored over code for hours, along with
         | writing tests, debugging, etc. He knows what he would like
         | because it's second nature. He was able to get the best from
         | the LLM because his vision of what the code should look like
         | helped craft a solid prompt.
         | 
         | People newer to programming might not have as good a time,
         | because they may skip actually learning the fundamentals and
         | rely on LLMs as a crutch. Nothing wrong with that, I suppose,
         | but there might come a point when everything goes up in smoke
         | and the LLM is out of answers.
         | 
         | No amount of _italic font_ is going to change that.
        
           | highfrequency wrote:
           | My experience is the opposite - I get the most value out of
           | LLMs for topics that I have less expertise in. It's become
           | vastly easier to get up to speed in a new field because you
           | can immediately answer basic questions, have the holes in
           | your understanding pointed out, and be directed to the
           | concepts you are missing.
        
       | charlieyu1 wrote:
       | I'm a hobby programmer who has never worked a programming job.
       | Last week I was bored, so I asked o1 to help me write a Solitaire
       | card game using React, because I'm very rusty with web
       | development.
       | 
       | The first few steps were great. It guided me to install things
       | and set up a project structure. The model even generated code
       | for a few files.
       | 
       | Then something went wrong: the model kept telling me what to do
       | in vague terms, but didn't output code anymore. So I asked for
       | further help, and it started contradicting itself, rewriting
       | business logic that was implemented in the first response,
       | producing 3-4 incompatible snippets of the same file, etc., and
       | it all fell apart.
        
         | jarsin wrote:
         | My first program ever was a Windows calculator. My roommates
         | would sit down and find bugs after I thought I had perfected
         | it. I learned so much spending weeks trying to get that damn
         | thing working.
         | 
         | I'm not too optimistic about the future of software development
         | if juniors are turning to AI to do those early projects for
         | them.
        
         | mocamoca wrote:
         | LLM contexts are quick to overload, as the article states.
         | That's why he writes smaller, specific packages, one at a time,
         | and uses a web UI instead of something like Cursor.
         | 
         | I had the same issue as you a few days ago. By separating the
         | problem into smaller parts and addressing each part one by
         | one, it got easier.
         | 
         | In your specific case I would try to fully complete the
         | business logic on its own first. Reset the context. Then
         | provide the logic to a new context and ask for an interface.
         | Difficulty will arise when you discover that the logic is
         | wrong or not suited to the UI, but I would keep using the same
         | process to edit the code. Maybe two different contexts, one
         | for logic, one for UI?
         | 
         | How did you do?
        
         | cpursley wrote:
         | Yeah, you wanna use Claude for code. That's the problem. Try
         | Cursor or Bolt.
        
       | aerhardt wrote:
       | His experience mirrors mine. I'm happy he explicitly mentions
       | search, when people have been shouting "this is not meant for
       | search" for a couple years now. Of course it helps with search. I
       | also love the tech for producing first drafts, and it greatly
       | lowers the energy and cognitive load when attacking new tasks,
       | as others have noted elsewhere in this thread.
       | 
       | At the same time, I think that while the author says this is the
       | second most impressive technology he's seen in his lifetime, it's
       | still a far cry from the bombastic claims being made by the
       | titans of industry regarding its potential. Not uncommon to see
       | claims here on HN of 10x improvements in productivity, or teams
       | of dozens of people being axed, but nothing in the article or in
       | my experience lines up with that.
        
       | jordanmorgan10 wrote:
       | The more experienced the engineer, the less CSS is on the page.
       | This seems to be a universal truth, and I want to learn from
       | these people - but my goodness, could we at least use margins to
       | center the content?
        
       | dboreham wrote:
       | Interesting that he had the same thought initially as I did
       | (after running a model myself on my own hardware): this is like
       | the first time I ran a traceroute across the planet.
        
       | ryanobjc wrote:
       | I have been getting more value out of LLMs recently, and the
       | great irony is that it's because of a few different packages in
       | emacs and the wonderful CLI LLM chat programming tool 'aider'.
       | 
       | My workflow puts LLM chat at my fingertips, and I can control the
       | context. Pretty much any text in emacs can be sent to an LLM of
       | your choice via API.
       | 
       | Aider is even better, it does a bunch of tricks to improve
       | performance, and is rapidly becoming a 'must have' benchmark for
       | LLM coding. It integrates with git so each chat modification
       | becomes a new git commit. Easy to undo changes, redo changes,
       | etc. It also has a bunch of hacks because, while o1 is good at
       | reasoning, it (apparently) doesn't do code modification well.
       | Aider will send different types of requests to different
       | 'strengths' of LLMs etc. Although if you can use Sonnet, you can
       | just use that and be done with it.
       | 
       | It's pretty good, but ultimately it's still just a tool for
       | transforming words into code. It won't help you think or
       | understand.
       | 
       | I feel bad for new kids who won't develop the muscle and eye for
       | reading/writing code, because you still need to read and write
       | code, and can't rely on the chat interface for everything.
        
       | Balgair wrote:
       | I'm not a 'programmer'. At best, I'm a hacker, _at best_. I
       | don't work in a team. All my code is mostly one-time usage to
       | just get some little thing done, sometimes a bit of personal
       | stuff too. I mostly use Excel anyway, and then Python, and even
       | then, I hate Python because half the time I'm just dealing with
       | library issues (not a joke, I measured it (and, no, I'm not
       | learning another language, but thank you)). I'm in biotech, a
       | very non-code-y section of it too.
       | 
       | LLMs are just a life saver. Literally.
       | 
       | They take my code time down from weeks to an afternoon, sometimes
       | less. And they're _kind_.
       | 
       | I'm trying to write a baseball simulator on my own, as a stretch
       | goal. I'm writing my own functions now, a step up for me. The
       | code is to take in real stats, do Monte Carlo, get results. Basic
       | stuff. Such a task was _impossible_ for me before LLMs. I've
       | tried it a few times. No go. Now with LLMs, I've got the skeleton
       | working and should be good to go before opening day. I'm hoping
       | that I can use it for some novels that I am writing to get more
       | realistic stats (don't ask).
       | 
       | I know a lot of HN is very dismissive of LLMs as code help. But
       | to me, a non-programmer, they've opened it up. I can do things I
       | never imagined that I could. Is it prod ready? Hell no, please
       | God no. But is it good enough for me to putz with and get _just_
       | working? Absolutely.
       | 
       | I've downloaded a bunch of free ones from huggingface and Meta
       | just to be sure they can't take them away from me. I'm _never_
       | going back to that frustration, that 'Why can't I just be not so
       | stupid?', that self-hatred, that darkness. They have liberated
       | me.
        
       | averus wrote:
       | I think the author is really on the right path with his vision
       | for LLMs as a tool for software development. Last week I tried
       | probably all of them on something like a code challenge.
       | 
       | I have to say that I am impressed with sketch.dev: it got me a
       | working example on the first try, and it looked cleaner than all
       | the others - similar, but somehow cleaner in terms of styling.
       | 
       | The whole time I was using those tools I was thinking that I
       | want exactly this: an LLM trained specifically on the official
       | Go documentation, or that of whatever your favourite language
       | is, ideally fine-tuned by the maintainers of the language.
       | 
       | I want the LLM to show me an idiomatic way to write an API using
       | the standard library. I don't necessarily want it to do it
       | instead of me, or to be trained on all of the data they could
       | scrape. Show me a couple of examples, maybe explain a concept,
       | give me step-by-step guidance.
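       | 
       | (To illustrate the kind of example I have in mind - a minimal,
       | hypothetical sketch using only net/http and encoding/json from
       | the standard library; the route and payload are made up.)
       | 
       |     package main
       |     
       |     import (
       |         "encoding/json"
       |         "log"
       |         "net/http"
       |     )
       |     
       |     // hello is a single JSON endpoint, standard library only.
       |     func hello(w http.ResponseWriter, r *http.Request) {
       |         w.Header().Set("Content-Type", "application/json")
       |         json.NewEncoder(w).Encode(map[string]string{"msg": "hi"})
       |     }
       |     
       |     func main() {
       |         mux := http.NewServeMux()
       |         mux.HandleFunc("/hello", hello)
       |         log.Fatal(http.ListenAndServe(":8080", mux))
       |     }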
       | 
       | I also share his frustrations with the chat-based approach. What
       | annoys me personally the most is the anthropomorphization of the
       | LLMs; yesterday Gemini was even patronizing me...
        
       | theptip wrote:
       | This lines up well with my experience. I've tried coming at
       | things from the IDE and chat side, and I think we need to merge
       | tooling more to find the sweet spot. Claude is amazing at
       | building small SPAs, and then you hit the context window cutoff
       | and can't do anything except copy your file out. I suspect IDEs
       | will figure this out before Claude/ChatGPT learn to be good
       | enough at the things folks need from IDEs. But long-term, I
       | suppose you don't want to have to drop down to code at all, so
       | the constraints of chat might force the exploration of the new
       | paradigm more aggressively.
       | 
       | Hot take of the day, I think making tests and refactors easier is
       | going to be revolutionary for code quality.
        
       ___________________________________________________________________
       (page generated 2025-01-08 23:02 UTC)