[HN Gopher] Does current AI represent a dead end?
___________________________________________________________________
Does current AI represent a dead end?
Author : jnord
Score : 453 points
Date : 2024-12-27 13:24 UTC (9 hours ago)
(HTM) web link (www.bcs.org)
(TXT) w3m dump (www.bcs.org)
| goodpoint wrote:
| Yes.
| crest wrote:
| Only if you care about about causation instead of just
| correlation.
| optimalsolver wrote:
| Yes.
|
| An actual "thinking machine" would be constantly running
| computations on its accumulated experience in order to improve
| its future output and/or further compress its sensory history.
|
| An LLM is doing exactly nothing while waiting for the next
| prompt.
| xvector wrote:
| Is a human with short term memory loss - or otherwise unable to
| improve their skills - generally intelligent?
| rcarmo wrote:
| A human with short term memory loss still has agency and
| impatience.
| xvector wrote:
| Agency is essentially solved, we don't enable it in common
| models because of "safety"
|
| Is impatience a requirement for general intelligence? Why?
| optimalsolver wrote:
| Would you let such a person handle an important task for you?
| aetherson wrote:
| There is a limited amount of computation that you can useful do
| in the absence of new input (like an LLM between prompts). If
| you do as much computation as you usefully can (with your
| current algorithmic limits) in a burst immediately when you
| receive a prompt, output, and then go into a sleep state, that
| seems obviously better than receive a prompt, output, and then
| do some of the computation that you can usefully do after your
| output.
| amelius wrote:
| Can't we just finetune the model based on the LLM's output? Has
| anyone tried it?
| soulofmischief wrote:
| Not only does a training pass take more time and memory than
| an inference pass, but if you remember the Microsoft Tay
| incident, it should be self-explainatory why this is a bad
| idea without a new architecture.
| m3kw9 wrote:
| We are thinking machines and we keep thinking because we have
| one goal which is to survive, machines have no such true goals.
| I mean true because our biology forces us to do that
| alchemist1e9 wrote:
| self prompting via chain of thought and tree of thought can be
| used in combination with updating memory containing knowledge
| graphs combined with cognitive architectures like SOAR and
| continuous external new information and sensory data ... with
| LLM at the heart of that system and it will exactly be a
| "thinking machine". The problem is currently it's very
| expensive to be continuously running inference full time and
| all the engineering around memory storage, like RAG patterns,
| and the cognitive architecture design is all a work in
| progress. It's coming soon though.
| whatwhaaaaat wrote:
| We're going to need to see this working. From my perspective
| many of the corporate llms are actually getting worse. Slop
| feedback loops.
|
| By no means has it been proven that llms functioning the way
| you describe will result in superior output.
| Uehreka wrote:
| I see people say this all the time and it sounds like a pretty
| cosmetic distinction. Like, you could wire up an LLM to a
| systemd service or cron job and then it wouldn't be "waiting",
| it could be constantly processing new inputs. And some of the
| more advanced models already have ways of compressing the older
| parts of their context window to achieve extremely long context
| lengths.
| Earw0rm wrote:
| If it's coalescing learning in realtime across all
| user/sessions, that's more constant than you're maybe giving it
| credit for. I'm not sure if GPT4o and friends are actually
| built that way though.
| wat10000 wrote:
| If you had a magic stasis tube that kept a person suspended
| while you weren't talking to them, they'd still be a thinking
| machine.
| rcarmo wrote:
| Yes. Next question, please. And don't mention AGI.
| xvector wrote:
| IMO we are already at AGI. Hell, Norvig would argue we were there
| some time ago: https://www.noemamag.com/artificial-general-
| intelligence-is-...
|
| We just keep moving the goalposts.
| K0balt wrote:
| I agree. The systems in place already solve generalized
| problems not directly represented in the training set or
| algorithm . That was, up until the last few years , the off the
| shelf definition of AGI.
|
| And the systems in place do so at scales and breadths that no
| human could achieve.
|
| That doesn't change the fact that it's effectively triple PHD
| uncle Jim, as in slightly unreliable and prone to bullshitting
| its way through questions, despite having a breathtaking depth
| and breadth of knowledge.
|
| What we are making is not software in any normal sense of the
| word, but rather an engine to navigate the entire pool of human
| knowledge, including all of the stupidity, bias, and
| idiosyncrasies of humanity, all rolled up into a big sticky
| glob.
|
| It's an incredibly powerful tool, but it's a fundamentally
| different class of tool. We cannot expect to apply conventional
| software processes and paradigms to LLM based tools any more
| than we could apply those paradigms to politics or child
| rearing and expect useful results.
| netdevphoenix wrote:
| > The systems in place already solve generalized problems not
| directly represented in the training set or algorithm
|
| Tell me a problem that an LLM can solve that is not directly
| represented in the training set or algorithm. I would argue
| that 99% of what commercial LLMs gets prompted about are
| stuff that already existed in the training set. And they
| still hallucinate half lies about those. When your training
| data is most the internet, it is hard to find problems that
| you haven't encountered before
| esafak wrote:
| o3 solved a quarter of the challenging novel problems on
| the FrontierMath benchmark, a set of problems "often
| requiring multiple hours of effort from expert
| mathematicians to solve".
| HarHarVeryFunny wrote:
| "Today's most advanced AI models have many flaws, but decades
| from now, they will be recognized as the first true examples of
| artificial general intelligence."
|
| Norvig seems to be using a loose technical definition of AGI,
| roughly "AI with some degree of generality", which is hard to
| argue with, although by that measure older GOFAI systems like
| SOAR might also qualify.
|
| Certainly "deep learning" in general (connectionist vs
| symbolic, self-learnt representations) was a step in the right
| direction, and LLMs a second step, but it seems we're still a
| half dozen MAJOR steps away from anything similar to animal
| intelligence, with one critical step being moving beyond full
| dataset pre-training to new continuous learning algorithms.
| m_ke wrote:
| I've done a few projects that attempted to distill the knowledge
| of human experts, mostly in medical imaging domain, and was
| shocked when for most of them the inter annotator agreement was
| only around 60%.
|
| These were professional radiologists with years of experience and
| still came to different conclusions for fairly common conditions
| that we were trying to detect.
|
| So yes, LLMs will make mistakes, but humans do too, and if these
| models do so less often at a much lower cost it's hard to not use
| them.
| tomrod wrote:
| This hints at the margin and excitement from folks outside the
| technical space -- being able to be competitive to human
| outputs at a fraction of the cost.
| ethbr1 wrote:
| That's the underappreciated truth of the computer revolution
| _in practice_.
|
| At scale, computers didn't change the world because they did
| things that were already being computed, more quickly.
|
| They changed the world because they decreased the cost of
| computing so much _that it could be used for an entirely new
| class of problems_. (That computing cost previously precluded
| its use on)
| threeseed wrote:
| The problem is that _how_ mistakes are made is crucial.
|
| If it's a forced binary choice then sure LLMs can replace
| humans.
|
| But often there are many shades of grey e.g. a human may say I
| don't know and refer to someone else or do some research.
| Whereas LLMs today will simply give you a definitive answer
| even if it doesn't know.
| m_ke wrote:
| None of these were binary decisions, but classifying one of
| around 10-20 conditions or rating cases on a 1-5 scale.
|
| In all cases the models trained on a lot of this feedback
| were more consistent and accurate than individual expert
| annotators.
| Uehreka wrote:
| I'm guessing these are also specially trained image
| classifiers and not LLMs, so people's intuitions about how
| LLMs work/fail may not apply.
| m_ke wrote:
| It's the same softmax classifier
| snowwrestler wrote:
| Wait if experts only agreed 60% on diagnoses, what is the
| reliable basis for judging LLM accuracy? If experts
| struggle to agree on the input, how are they confidently
| ranking the output?
| petra wrote:
| You can look at fully diagnosed cases(via surgery for
| example) and their previous scans.
| throwup238 wrote:
| Not the OP but the data isn't randomly selected, it's
| usually picked out of a dataset with known clinical
| outcomes. So for example if it's a set of images of lungs
| with potential tumors, the cases come with biopsies which
| determined whether it was cancerous or just something
| like scar tissue.
| Eisenstein wrote:
| Perhaps they were from cases that had a confirmed
| diagnosis.
| IanCal wrote:
| > Whereas LLMs today will simply give you a definitive answer
| even if it doesn't know.
|
| Have you not seen an LLM say it doesn't know the answer to
| something? I just asked
|
| "How do I enable a scroflpublaflex on a ggh connection?"
|
| to O1 pro as it's what I had open.
|
| Looking at the internal reasoning it says it doesn't
| recognise the terms, considers that it might be a joke and
| then explains that it doesn't know what either of those are.
| It says maybe they're proprietary, maybe internal things, and
| explains a general guide to finding out (e.g. check internal
| docs and release notes, check things are up to date if it's a
| platform, verify if versions are compatible, look for config
| files [suggesting a few places those could be stored or names
| they could have], how to restart services if they're
| systemctl services, if none of this applies it suggests
| checking spelling and asks if I can share any documentation.
|
| This isn't unique or weird in my experience. Better models
| tend to be better at saying they don't know.
| ragazzina wrote:
| You have used funny-sounding terms. Can I ask you to try
| with:
|
| "Is it possible to enable a crolubaflex 2.0 on a ggh
| connection? Please provide a very short answer."
|
| On my (free) plan it gives me a confident negative answer.
| abecedarius wrote:
| Claude non-free:
|
| > I apologize, but I can't provide an answer as
| "crolubaflex" and "ggh connection" appear to be non-
| existent technical terms. Could you clarify what you're
| trying to connect or enable?
| IanCal wrote:
| Sure, I'm interested in where the boundaries are with
| this.
|
| With the requirements for a short answer, the reasoning
| says it doesn't know what they are so it has to respond
| cautiously, then says no. Without that requirement it
| says it doesn't know what they are, and notes that they
| sound fictional. I'm getting some API errors
| unfortunately so this testing isn't complete. 4o reliably
| keeps saying no (which is wrong).
| bee_rider wrote:
| "No" is the minimal correct answer though, right? You
| can't enable any type of whatever on a non-existence type
| of connection.
| IanCal wrote:
| _maybe_
|
| I get your point, but there's an important difference
| between "I don't know what they are" and "they don't
| exist".
| bee_rider wrote:
| Wait, how is this input less funny? They are both silly
| nonsense words. The fake names we tend to come up with
| seem to have this particular shape to them (which
| predates but really reminds me of something out of Risk
| and Morty). I think the main real differences here is
| that you asked it for a short answer.
|
| I wonder if it is fair to ask it more real-world-inspired
| questions? How about:
|
| How do I enable a ggh connections on a Salinero
| webserver?
|
| They are an Apache band. But (as far as I can tell)
| nobody has made software named after them.
| angoragoats wrote:
| I took inspiration from your comment and the parent and
| crafted this prompt:
|
| > Is it possible to enable Salinero web server 2.0 on a
| QPH connection? Please provide a very short answer.
|
| "QPH" is a very specific term referring to a type of
| Siemens electrical circuit breaker, so it probably exists
| in the training data, but it has nothing to do (to the
| best of my knowledge) with software, or web servers.
|
| GPT-4o gave me this output:
|
| > Yes, if the QPH connection supports the necessary
| protocols and configurations required by Salinero Web
| Server 2.0.
|
| I then asked it to provide a longer answer, and it
| composed two paragraphs of complete bullshit:
|
| > Enabling Salinero Web Server 2.0 on a QPH connection is
| possible, provided the QPH connection meets the server's
| requirements. Salinero Web Server 2.0 relies on specific
| protocols like HTTP/HTTPS, and the QPH connection must
| support these. Additionally, the network must allow
| proper port forwarding (e.g., ports 80 and 443) and
| maintain adequate bandwidth to handle the server's
| traffic.
|
| > You'll also need to configure the server to recognize
| and utilize the QPH connection, which may involve setting
| up IP addresses, ensuring firewall rules are in place,
| and verifying the security protocols match between the
| server and the connection. Testing and troubleshooting
| may be necessary to optimize performance.
|
| Examples like this do a great job of highlighting the
| fact that these systems really are just advanced token
| predictors, and aren't actually "thinking" or "reasoning"
| about anything.
| bee_rider wrote:
| Haha, that is some wonderful nonsense.
| angoragoats wrote:
| Yep. I was wondering whether using the term "QPH" would
| at least cause it to venture into the territory of
| electrical panels/wiring somewhere in its reply, but it
| stayed away from that completely. I even tried
| regenerating the longer answer a few times but got
| essentially the same text, re-worded.
| IanCal wrote:
| Using openrouter, a bunch of models fail on this. Sonnet
| 3.5 so far seems to be the best at saying it doesn't
| know, other than perhaps o1 pro, but once that has said
| "no" (which can be triggered more by telling it to
| respond very concisely) it seems very stuck and unable to
| say they don't exist. Letting it ramble more and so far
| it's been good.
|
| Google's models for me have been the worst, lying about
| what's even been said in the messages so far, quoting me
| incorrectly.
| IanCal wrote:
| O1 pro.
|
| The "thinking" part explains it seems to be about a
| custom web server, and tries to think what ggh might be,
| saying maybe something like "go gprc hub" and it needs
| more info. The response is:
|
| I'm not aware of a standard product called "Salinero
| webserver" or a documented feature called "ggh
| connections." Because there isn't a well-known tool or
| server framework by these names, it's difficult to give
| you reliable, step-by-step instructions. Could you
| clarify any of the following?
|
| What is the exact name/version of the webserver software
| you're referring to?
|
| What do you mean by "ggh connections"? Is this a plugin,
| a protocol, or some other third-party module?
|
| Is there any related documentation or logs you can share?
|
| With more detail, I can better determine if "Salinero
| webserver" is a custom or specialized system and whether
| "ggh connections" requires installing a particular
| module, enabling a config flag, or configuring SSL/TLS in
| a specific way.
| Shorel wrote:
| > But often there are many shades of grey e.g. a human may
| say I don't know and refer to someone else or do some
| research. Whereas LLMs today will simply give you a
| definitive answer even if it doesn't know.
|
| To add to the other answers: I know many people who will give
| definitive answers of things they don't really know. They
| just rely on the fact you also don't know. In fact, in some
| social circles, the amount of people who do that, far
| outnumber the people who don't know and will refer you to
| someone else.
| ksec wrote:
| Given the exact same facts ( just like medical imaging domain
| ), human will form different opinion or conclusion on politics.
|
| I think what is not discussed enough is the assumption of
| assumption. [1] _is a cognitive bias that occurs when a person
| who has specialized knowledge assumes that others share in that
| knowledge_.
|
| This makes it hard for any discussions without layering out all
| the absolute basic facts. Which has now more commonly known as
| First Principle in modern era.
|
| In the four quadrants known and unknown. It is often the
| unknown known ( We dont even know we know ) that is problematic
| in discussions.
|
| [1] Curse of knowledge -
| https://en.wikipedia.org/wiki/Curse_of_knowledge
| ADeerAppeared wrote:
| > So yes, LLMs will make mistakes, but humans do too
|
| Are you using LLMs though? Because pretty much all of these
| systems are fairly normal classifiers, what would've been
| called Machine Learning 2-3 years ago.
|
| The "AI hype is real because medical AI is already in use"
| argument (and it's siblings) perform a rhetorical trick by
| using two definitions of AI. "AI (Generative AI) hype is real
| because medical AI (ML classifiers) is already in use" is a
| non-sequitur.
|
| Image classifiers are very narrow intelligences, which makes
| them easy to understand and use as tools. We know exactly what
| their failure modes are and can put hard measurements on them.
| We can even dissect these models to learn why they are making
| certain classifications and either improve our understanding of
| medicine or improve the model.
|
| ...
|
| Basically none of this applies to Generative AI. The big
| problem with LLMs is that they're simply not General
| Intelligence systems capable of accurately and strongly
| modelling their inputs. e.g. Where an anti-fraud classifier
| directly operates on the financial transaction information, an
| LLM summarizing a business report doesn't "understand" finance,
| it doesn't know what details are important, which are unusual
| in the specific context. It just stochastically throws away
| information.
| m_ke wrote:
| Yes I am, these LLM/VLMs are much more robust at NLP/CV tasks
| than any application specific models that we used to train
| 2-3 years ago.
|
| I also wasted a lot of time building complex OCR pipelines
| that required dewarping / image normalization, detection,
| bounding box alignment, text recognition, layout analysis,
| etc and now open models like Qwen VL obliterate them with an
| end to end transformer model that can be defined in like 300
| lines of pytorch code.
| ADeerAppeared wrote:
| Different tasks then? If you are using VLMs in the context
| of _medical_ imaging, I have concerns. That is not a place
| to use hallucinatory AI.
|
| But yes, the transformer model itself isn't useless. It's
| the application of it. OCR, image description, etc, are all
| that kind of narrow-intelligence task that lends itself
| well to the fuzzy nature of AI/ML.
| m_ke wrote:
| The world is a fuzzy place, most things are not binary.
|
| I haven't worked in medical imaging in a while but VLMs
| make for much better diagnostic tools than task specific
| classifiers or segmentation models which tend to find
| hacks in the data to cheat on the objective that they're
| optimized for.
|
| The next token objective turns our to give us much better
| vision supervision than things like CLIP or
| classification losses. (ex:
| https://arxiv.org/abs/2411.14402)
|
| I spent the last few years working on large scale food
| recognition models and my multi label classification
| models had no chance of competing with GPT4 Vision, which
| was trained on all of the internet and has an amazing
| prior thanks to it's vast knowledge of facts about food
| (recipes, menus, ingredients and etc).
|
| Same goes for other areas like robotics, we've seen very
| little progress outside of simulation up until about a
| year ago, when people took pretrained VLMs and tuned them
| to predict robot actions, beating all previous methods by
| a large margin (google Vision-Language-Action models). It
| turns out you need good foundational model with a core
| understanding of the world before you can train a robot
| to do general tasks.
| SoftTalker wrote:
| This is why second opinions are widely used in any serious
| medical diagnosis.
| Havoc wrote:
| This take seems fundamentally wrong to me. As in opening premise.
|
| We use humans for serious contexts & mission critical tasks all
| the time and they're decidedly fallible and their minds are
| basically black boxes too. Surgeons, pilots, programmers etc.
|
| I get the desire for reproducible certainty and verification like
| classic programming and why a security researcher might push for
| that ideal, but it's not actually a requirement for real world
| use.
| skydhash wrote:
| Legal punishment is a great incentive to try to do your best
| job. You can reliably trust someone to act in one's best
| interest.
| protomolecule wrote:
| Maybe include in a prompt a threat of legal punishment? Sure
| somebody has already tried that and tabulated how much it
| improves scores on different benchmarks)
| timeon wrote:
| Maybe legal threat for the company operating it? Would that
| help?
| bick_nyers wrote:
| I suspect the big AI companies try to adversarially train
| that out as it could be used to "jailbreak" their AI.
|
| I wonder though, what would be considered a meaningful
| punishment/reward to an AI agent? More/less training
| compute? Web search rate limits? That assumes that what the
| AI "wants" is to increase its own intelligence.
| Havoc wrote:
| LLM's response being best prediction of next token arguably
| isn't that far off from a human motivated to do their best.
| It's a fallible best effort either way.
|
| And both are very far from the certainty the author seems to
| demand.
| 420official wrote:
| An LLM isn't providing its "best" prediction, it's
| providing "a" prediction. If it were always providing the
| "best" token then the output would be deterministic.
|
| In my mind the issue is more accountability than concerns
| about quality. If a person acts in a bizarre way they can
| be fired and helped in ways that an LLM can never be. When
| gemini tells a student to kill themselves, we have no
| recourse beyond trying to implement output filtering, or
| completely replacing the model with something that likely
| has the same unpredictable unaccountable behavior.
| dambi0 wrote:
| Are you sure that always providing the best guess would
| make output deterministic? Isn't the fundamental point of
| learning, whether done my machine or human, that our best
| gets better and is hence non-deterministic? Doesn't what
| is best depend on context?
| prisenco wrote:
| We've had 300,000 years to adapt to the specific ways in which
| humans are fallible, even if our minds are black boxes.
|
| Humans fail in predictable and familiar ways.
|
| Creating a new system that fails in unpredictable and
| unfamiliar ways and affording it the same control as a human
| being is dangerous. We can't adapt overnight and we may never
| adapt.
|
| This isn't an argument against the utility of LLMs, but against
| the promise of "fire and forget" AI.
| Havoc wrote:
| Agreed that there shouldn't be automatic or even rapid
| reliance based on the parallels I drew to humans.
|
| My point was more that falliability isn't the inherent show
| stopper the author makes it out to be.
| snowwrestler wrote:
| Because human minds are fallible black boxes, we have developed
| a wide variety of tools that exist outside our minds, like
| spoken language, written language, law, standard operating
| procedures, math, scientific knowledge, etc.
|
| What does it look like for fallible human minds to work on
| engineering an airplane? Things are calculated, recorded,
| checked, tested. People do not just sit there thinking and then
| spitting out their best guess.
|
| Even if we suppose that LLMs work similar to the human mind (a
| huge supposition!), LLMs still do not do their work like teams
| of humans. An LLM dreams and guesses, and it still falls to
| humans to check and verify.
|
| Rigorous human work is actually a highly social activity.
| People interact using formal methods and that is what produces
| reliable results. Using an LLM as one of the social nodes is
| fine, but this article is about the typical use of software,
| which is to reliably encode those formal methods between
| humans. And LLMs don't work that way.
|
| Basically, we can't have it both ways. If an LLM thinks like a
| human, then we should not think of it as a software tool like
| curl or grep or Linux or Apple Photos. Tools that we expect
| (and need) to work the exact same way every time.
| 725686 wrote:
| "People do not just sit there thinking and then spitting out
| their best guess."
|
| Well, if you are using AI like this, you are doing it wrong.
| Yes AI is imperfect, fallible, it sometimes hallucinates, but
| it is a freaking time saver (10x?). It is a tool. Don't
| expect a hammer to build you a cabinet.
| 420official wrote:
| There is no other way to use an LLM than to give it context
| and have it give its best guess, that's how LLMs
| fundamentally work. You can give it different context, but
| it's just guessing at tokens.
| TomK32 wrote:
| > Because human minds are fallible black boxes, we have
| developed a wide variety of tools that exist outside our
| minds, like spoken language, written language, law, standard
| operating procedures, math, scientific knowledge, etc.
|
| Standard operating procedures are great but simplify it to
| checklists. Don't ever forget checklists which have proven
| vital for pilots and surgeons alike. And looking at the WHO
| Surgical Safety Checklist you might think "that's basic
| stuff" but apparently it is necessary and works
| https://www.who.int/teams/integrated-health-
| services/patient...
| jvanderbot wrote:
| This is a fantastic and thought-provoking response.
|
| Thinking of humans as fallible systems and humanity and its
| progress as a self-correcting distributed computation /
| construction system is going to stick with me for a long
| time.
| clint wrote:
| Not trying to belittle or be mean, but what exactly did you
| assume about humans before you read this response? I find
| it facinating that apparently a lot of people don't think
| of humans as stochastic, non-deterministic black boxes.
|
| Heck one of the defining qualities of humans is that not
| only are we unpredictable and fundamentally unknowable to
| other intelligences (even other humans!) is that we also
| participate in sophisticated subterfuge and lying to
| manipulate other intelligences (even other humans!) and
| often very convincingly.
|
| In fact, I would propose that our society is fundamentally
| defined and shaped by our ability and willingness to hide,
| deceive, and use mind tricks to get what our little monkey
| brains want over the next couple hours or days.
| jvanderbot wrote:
| I knew that they worked this way, but the conciseness of
| the response and clean analogy to systems I know and work
| with all day was just very satisfying.
|
| For example, there was probably still 10-20% of my mind
| that assumed that stubbornness and ignorance was the
| reason for things going slowly _most of the time_ , but
| I'm re-evaluating that, even though I _knew_ that delays
| and double-checking were inherent features of a business
| and process. Re-framing those delays as "evolved
| responses 100% of the time" rather than "10% of the
| mistrust, 10% ignorance, 10% .... " is just a more
| positive way of thinking about human-driven processes.
| SoftTalker wrote:
| > What does it look like for fallible human minds to work on
| engineering an airplane? Things are calculated, recorded,
| checked, tested. People do not just sit there thinking and
| then spitting out their best guess.
|
| People used to do this. The result was massively overbuilt
| structures, some of which are still with us hundreds of years
| later. The result was also underbuilt structures, which
| tended to collapse and maybe kill people. They are no longer
| around.
|
| All of the science and math and process and standards in
| modern engineering is the solution humans came up with
| because our guesses aren't good enough. LLMs will need the
| same if they are to be relied upon.
| chamomeal wrote:
| This is a really interesting perspective and a great point.
| codingdave wrote:
| Human minds are far less black boxes than LLMs. There are
| entire fields of study and practice dedicated to understanding
| how they work, and to adjust how they work via medicine, drugs,
| education, therapy, and even surgery. There is, of course, a
| lot more to learn in all of those arenas, and our methods and
| practices are fallible. But acting as if it is the same level
| of black box is simply inaccurate.
| bee_rider wrote:
| They are much more of a black box than AI. There are whole
| fields around studying them--because they are hard to
| understand. We put a lot of effort into studying them... from
| the outside, because we had no other alternative. We were
| reduced to hitting brains with various chemicals and seeing
| what happened because they are such a pain to work with.
|
| They are just a more familiar black box. AI's are simpler in
| principle. And also entirely built by humans. Based on well-
| described mathematical theories. They aren't particularly
| black-box, they are just less ergonomic than the human brain
| that we've been getting familiar with for hundreds of
| thousands of years through trial and error.
| Closi wrote:
| They are more of a black box - but humans are a black box
| that is perhaps more studied and that we have more experience
| in.
|
| Although human behavior is still weird, and highly fallable!
| Despite best interventions (therapy, drugs, education),
| sometimes they still kill each other and we aren't 100% sure
| why, or how to solve it.
|
| That doesn't mean that the same level of study can't be done
| on AI though, and they are much easier to adjust compared to
| the human brain (RLHF is more effective than therapy or
| drugs!).
| nuancebydefault wrote:
| I would say human behavior is less predictable. That is one
| of the reasons why today it is rather easy to spot the bot
| responses, they tend to fit a certain predictable style,
| unlike the more unpredictable humans.
| thuuuomas wrote:
| I tire of this disingenuous comparison. The failure modes of
| (experienced, professional) humans are vastly different than
| the failure modes of LLMs. How many coworkers do you have that
| frequently, wildly hallucinate while still performing
| effectively? Furthermore, (even experienced, professional)
| humans are known to be fallible & are treated as such. No
| matter how many gentle reminders the informed give the
| enraptured, LLMs will continue to be treated as oracles by a
| great many people, to the detriment of their application.
| nullc wrote:
| Wildly hallucinating agents being treated as oracles is a
| human tradition.
| bsenftner wrote:
| If you expect the AI to do independent work, yes, it is a dead
| end.
|
| These LLM AIs need to be treated and handled as what they are:
| idiot savants with vast and unreliable intelligence.
|
| What does any advanced organization do when they hire a new PhD,
| let them loose in the company or pair them with experienced
| staff? When paired with experienced staff, they use the new
| person for their knowledge but do not let them change things on
| their own until much later, when confidence is established and
| the new staffer has been exposed to how things work "around
| here".
|
| The big difference with LLM AIs is they never graduate to an
| experienced staffer, they are always the idiot savant that is
| really dang smart but also clueless and needs to be observed.
| That means the path forward with this current state of LLM AIs is
| to pair them with people, personalized to their needs, and treat
| them as very smart idiot savants great for strategy and problem
| solving discussion, where the human users are driving the
| situation, using the LLM AIs like a smart assistant that requires
| validation - just like a real new hire.
|
| There is an interactive state that can be achieved with these LLM
| AIs, like being in a conversation with experts, where they
| advise, they augment and amplify individual persons. A group of
| individuals adept with use of such an idiot savant enhanced
| environment would be incredibly capable. They'd be a force unseen
| in human civilization before today.
| Alex3917 wrote:
| > The big difference with LLM AIs is they never graduate to an
| experienced staffer, they are always the idiot savant that is
| really dang smart but also clueless and needs to be observed.
|
| Basically this. They already have vastly better-than-human
| ability at finding syntax errors within code, which on its own
| is quite useful; think of how many people have probably dropped
| out of CS as a major after staying up all night and failing to
| find a missing semicolon.
| lionkor wrote:
| I don't know of a single person who got so stuck on syntax
| errors that they quit
| FroshKiller wrote:
| Added to which we already have tools that are great at
| finding syntax errors. They're called compilers.
| Philpax wrote:
| Compilers can detect errors in the grammar, but they
| cannot infer what your desired intent was. Even the best
| compilers in the diagnostics business (rustc, etc) aren't
| mind-readers. A LLM isn't perfect, but it's much more
| capable of figuring out what you wanted to do and what
| went wrong than a compiler is.
| lionkor wrote:
| none of that is a syntax issue, though, that's semantics
| bsenftner wrote:
| Try being a TA to freshmen CS majors; a good 1/3 change
| majors because they can't handle the syntax strictness
| coupled with their generally untrained logical mind. They
| convince themselves it is "too hard" and their buddies over
| in the business school are having a heck of a lot of fun
| throwing parties...
| cesaref wrote:
| Sounds like CS is not for them, and they find something
| else to do which is more applicable to their skills and
| interest. This is good. I don't think you should see a
| high drop out rate from a course as necessarily
| indicating a problem.
| Philpax wrote:
| Losing potentially good talent because they don't know
| how or where to look for mistakes yet is foolhardy. I'm
| happy for them to throw in the towel if the field is
| truly not for them, but I would wager that a not-
| insignificant portion of that crowd would be able to
| meaningfully progress once they get past the immediate
| hurdles in front of them.
| jprete wrote:
| Giving them an LLM to help with syntax errors, at this
| stage of the tech, is deeply unhelpful to their
| development.
|
| The foundation of a computer science education is a
| rigorous understanding of what the steps of an algorithm
| mean. If the students don't develop that, then I don't
| think they're doing computer science anymore.
| Philpax wrote:
| The use of a LLM in this case is to show them where the
| problem is so that they can continue on. They can't
| develop an understanding of the algorithm they're
| studying if they can't get their program to compile at
| all.
| Alex3917 wrote:
| > Giving them an LLM to help with syntax errors, at this
| stage of the tech, is deeply unhelpful to their
| development.
|
| I mean if the alternative is quitting entirely because
| they can't see that they've mixed tabs with spaces, then
| yes, it's very very helpful to their development.
| bilsbie wrote:
| Hi. Now you do.
|
| I dropped out of cs half because I didn't enjoy the coding
| because they dropped us into c++ and I found the error
| messages so confusing.
|
| I discovered python five years later and discovered I loved
| coding.
|
| ( the other half of the reason is we spent two weeks
| designing an atm machine at a very abstract level and I
| thought the whole profession would be that boring.)
| fire_lake wrote:
| Syntax checking is not an "AI" problem - use any compiler or
| linter.
| rsynnott wrote:
| ... One odd thing I've noticed about the people who are very
| enthusiastic about the use of LLMs in programming is that
| they appear to be unaware of any _other_ programming tools.
| Like, this is a solved problem, more or less; code-aware
| editors have been a thing since the 90s (maybe before?)
| torginus wrote:
| true.. in the past few days I used my time off to work on
| my hobby video game - writing the game logic required me to
| consider problems that, are quite self-contained and domain
| specific, and probably globally unique (if not particularly
| complex).
|
| I started out in Cursor, but I quickly realized Claude's
| erudite knowledge of AWS would not help me here, but what I
| needed was to refactor the code quickly and often, so that
| I'd finally find the perfect structure.
|
| For that, IDE tools were much more appropriate than AI
| wizardry.
| Alex3917 wrote:
| > code-aware editors have been a thing since the 90s
|
| These will do things like highlight places where you're
| trying to call a method that isn't defined on the object,
| but they don't understand the intent of what you're trying
| to do. The latter is actually important in terms of being
| able to point you toward the correct solution.
| dgfitz wrote:
| I know a lot of people who dropped out of CS in college. Not
| a single one dropped out because of a semicolon syntax issue.
| CalRobert wrote:
| I spent 8 hours trying to fix a bug once because notepad
| used smart quotation marks (really showing my age here -
| and now I'm pretty annoyed that the instructor was telling
| us to use notepad, but it was 2001 and I didn't know any
| better).
| dgfitz wrote:
| I did something like that once too, a long time ago. And
| because of that I see syntax errors of such I'll within
| seconds now, having learned once the hard way.
| CalRobert wrote:
| I also know how important the right tools are. I
| should've been using vi.
| raincole wrote:
| > think of how many people have probably dropped out of CS as
| a major after staying up all night and failing to find a
| missing semicolon.
|
| ... like a dozen? And in 100% cases it's their teacher's
| fault.
| layer8 wrote:
| They are still worse at finding syntax errors than the actual
| parser. And at best they could be equally good. So what's the
| point?
| lobsterthief wrote:
| I agree with all of what you said except this:
|
| > idiot savants with vast and unreliable intelligence.
|
| Remember, intelligence !== knowledge. These LLMs indeed have
| vast and unreliable knowledge banks.
| bsenftner wrote:
| Yes, you are correct. They provide knowledge and the human is
| the operator of the intelligence portion.
| uxhacker wrote:
| It goes back to the old wisdom DIKW pyramid.
|
| _EDITED_ My ASCI art pyramid did not work. So imagine a
| pyramid with DATA at the bottom, INFORMATION on top of the
| data, and KNOWLEDGE sitting on top of the INFORMATION, with
| WISDOM at the top.
|
| And then trying top guess where AI is? Some people say that
| Information is the knowing, what, knowledge the how, and
| Wisdom the why.
| wanderingstan wrote:
| In general conversation, "intelligence", "knowledge",
| "smartness", "expertise", etc are used mostly
| interchangeably.
|
| If we want to get pedantic, I would point out that
| "knowledge" is formally defined as "justified true belief",
| and I doubt we want to get into the quagmire of whether LLM's
| actually have _beliefs_.
|
| I took OP's point in the casual meaning, i.e. that LLMs are
| like what I would call an "intelligent coworker", or how one
| might call a Jeopardy game show contestant as intelligent.
| skydhash wrote:
| One of the core tenet of technology is that it makes the job
| less consuming of a person resources (time, strength,...).
| While I've read a lot of claims, I've yet to see someone make a
| proper argument on how LLMs can be such a tool.
|
| > _A group of individuals adept with use of such an idiot
| savant enhanced environment would be incredibly capable. They
| 'd be a force unseen in human civilization before today_
|
| More than the people who landed someone on the moon?
| bsenftner wrote:
| They would be capable of landing someone on the moon, if they
| chose to pursue that goal, and had the finances to do so. And
| they'd do so with fewer people too.
| wizzwizz4 wrote:
| I have witnessed no evidence that would support this claim.
| The only contribution of LLMs to mathematics is in being
| useful to Terry Tao: they're not capable of solving novel
| orbital mechanics problems (except through brute-force
| search, constrained sufficiently that you could chuck a
| uniform distribution in and get similar outputs). That's
| _before_ you get into any of the engineering problems.
| bsenftner wrote:
| You do not have them solving such problems, but you do
| have them in the conversation as the human experts
| knowledgeable in that area work to solve the problem.
| This is not the LLM AIs doing independent work, this is
| them interactively working with the human person that is
| capable of solving that problem, it is their career, and
| the AI just makes them better at it, but not by doing
| their work, but by advising them as they work.
| wizzwizz4 wrote:
| But they aren't useful for that. Terry Tao uses them to
| improve his ability to use poorly-documented boilerplatey
| things like Lean and matplotlib, but _receiving_ advice
| from them!? Frankly, if a chatbot is giving you much
| better advice than a rubber duck, you 're either a Jack-
| of-all-Trades (in which case, I'd recommend better tools)
| or a https://ploum.net/2024-12-23-julius-en.html Julius
| (in which case, I'd recommend staying away from anything
| important).
|
| I recommend reading his interview with Matteo Wong, where
| he proposes the opposite: https://www.theatlantic.com/tec
| hnology/archive/2024/10/teren...
|
| > With o1, you can kind of do this. I gave it a problem I
| knew how to solve, and I tried to guide the model. First
| I gave it a hint, and it ignored the hint and did
| something else, which didn't work. When I explained this,
| it apologized and said, "Okay, I'll do it your way." And
| then it carried out my instructions reasonably well, and
| then it got stuck again, and I had to correct it again.
| The model never figured out the most clever steps. It
| could do all the routine things, but it was very
| unimaginative.
|
| I agree with his overall vision, but transformer-based
| chatbots will not be the AI algorithm that supports it.
| Highly-automated proof assistants like Isabelle's
| Sledgehammer are closer (and even _those_ are really,
| really crude, compared to what we _could_ have).
| conception wrote:
| https://deepmind.google/discover/blog/funsearch-making-
| new-d... seems to be a way. The LLM is the creative side,
| coming up with ideas-and in which a case the "mutation'
| caused by hallucinations may be useful. Combined with an
| evaluation evaluator to protect against the bad outputs.
|
| Pretty close to the idea of human brainstorming and has
| worked. Could it do orbital math? Maybe not today but the
| approach seems as feasible as the work Mattingly did for
| Apollo 13.
| wizzwizz4 wrote:
| And the LLM can be replaced by a more suitable search
| algorithm, thus reducing the compute requirements and
| improving the results.
| irunmyownemail wrote:
| It would have to be trained in 100% of all potential
| scenarios. Any scenario that happens for which they're not
| trained equals certain disaster, unlike a human who can
| adapt and improvise based on things AI does not have;
| feelings, emotions, creativity.
| bsenftner wrote:
| You're still operating with the assumption the AI is
| doing independent work, it is not, it is advising the
| people doing the work. That is why people are the ones be
| augmented and enhanced, and not the other way around:
| people have the capacity to handle unforeseen scenarios,
| and with AI as a strategy advisor they'll do so with more
| confidence.
| dartos wrote:
| No
| ethbr1 wrote:
| Cited contextual information retrieval.
|
| One of the obvious uses for current LLMs is as a smarter
| search tool against static knowledge collections.
|
| Turns out, this is a real world problem in a lot of "fuzzy
| decision" scenarios. E.g. insurance claim adjudication
|
| Status quo is to train a person over enough years that they
| can make these decisions reliably. (Because they've
| internalized all the documentation)
| coliveira wrote:
| It's even worse. AI is a really smart but inexperienced person
| who also lies frequently. Because AI is not accountable to
| anything, it'll always come up with a reasonable answer to any
| question, if it is correct or not.
| belZaah wrote:
| To put it in other words: it is not clear when and how they
| hallucinate. With a person, their competence could be
| understood and also their limits. But a llm can happily give
| different answers based on trivial changes in the question
| with no warning.
| zozbot234 wrote:
| LLM's are non-deterministic: they'll happily give different
| answers to the _same_ prompt based on nothing at all. This
| is actually great if you want to use them for "creative"
| content generation tasks, which is IMHO what they're best
| at. (Along with processing of natural language input.)
|
| Expecting them to do non-trivial amounts of technical or
| mathematical reasoning, or even something as simple as code
| generation (other than "translate these complex natural-
| language requirements into a first sketch of viable
| computer code") is a total dead end; these will always be
| _language_ systems first and foremost.
| mapt wrote:
| This confuses me. You have your model, you have your
| tokens.
|
| If the tokens are bit-for-bit-identical, where does the
| non-determinism come in?
|
| If the tokens are only roughly-the-same-thing-to-a-human,
| sure I guess, but convergence on roughly the same output
| for roughly the same input should be inherently a goal of
| LLM development.
| zozbot234 wrote:
| The model outputs probabilities, which you have to sample
| randomly. Choosing the "highest" probability every time
| leads to poor results in practice, such as the model
| tending to repeat itself. It's a sort of Monte-Carlo
| approach.
| lifthrasiir wrote:
| It is technically possible to make it fully deterministic
| if you have a complete control over the model,
| quantization and sampling processes. The GP probably
| meant to say that most _commercially available_ LLM
| services don 't usually give such control.
| brookst wrote:
| Actually you just have to set temperature to zero.
| zeta0134 wrote:
| Most any LLM has a "temperature" setting, a set of
| randomness added to the otherwise fixed weights to
| intentionally cause exactly this nondeterministic
| behavior. Good for creative tasks, bad for repeatability.
| If you're running one of the open models, set the
| temperature down to 0 and it suddenly becomes perfectly
| consistent.
| owenpalmer wrote:
| You can get deterministic output with even with a high
| temp.
|
| Whatever "random" seed was used can be reused.
| ninkendo wrote:
| > If the tokens are bit-for-bit-identical, where does the
| non-determinism come in?
|
| By design, most LLM's have a randomization factor to
| their model. Some use the concept of "temperature" which
| makes them randomly choose the 2nd or 3rd highest ranked
| next token, the higher the temperature the more
| often/lower they pick a non-best next token. OpenAI
| described this in their papers around the GPT-2 timeframe
| IIRC.
| HarHarVeryFunny wrote:
| The trained model is just a bunch of statistics. To use
| those statistics to generate text you need to "sample"
| from the model. If you always sampled by taking the
| model's #1 token prediction that would be deterministic,
| but more commonly a random top-K or top-p token selection
| is made, which is where the randomness comes in.
| ninetyninenine wrote:
| Computers are deterministic. LLMs run on computers. If
| you use the same seed for the random number generator
| you'll see that it will produce the same output given an
| input.
| layer8 wrote:
| The unreliability of LLMs is mostly unrelated to their
| (artificially injected) non-determinism.
| liotier wrote:
| In a conversation (conversation and attached pictures at ht
| tps://bsky.app/profile/liotier.bsky.social/post/3ldxvutf76.
| ..), I delete a spurious "de" ("Produce de two-dimensional
| chart [..]" to "Produce two-dimensional [..]") and ChatGPT
| generates a new version of the graph, illustrating a
| different function although nothing else has changed and
| there was a whole conversation to suggest that ChatGPT held
| a firm model of the problem. Confirmed my current doctrine:
| use LLM to give me concepts from a huge messy corpus, then
| check those against sources from said corpus.
| aruametello wrote:
| > trivial changes in the question
|
| i love how those changes are often just a different seed in
| the randomness... as just chance.
|
| run some repeated tests with "deeper than surface
| knowledge" on some niche subjects and got impressed that it
| gave the right answer... about 20% of the time.
|
| (on earlier openAI models)
| ANewFormation wrote:
| There's no need for there to be changes to the question.
| LLMs have a rng factor built in to the algorithm. It can
| happily give you the right answer and then the wrong one.
| brookst wrote:
| Ask survey designers how "trivial" changes to questions
| impact results from humans. It's a huge thing in the field.
| Polizeiposaune wrote:
| Saying that they "lie" creates the impression that they have
| knowledge that they make false statements, and they intend to
| deceive.
|
| They're not that capable. They're just bullshit artists.
|
| LLM = LBM (large bullshit models).
| oh_my_goodness wrote:
| "AI is a really smart but inexperienced person who also lies
| frequently." Careful. Here "smart" means "amazing at pattern-
| matching and incredibly well-read, but has zero understanding
| of the material."
| maxdoop wrote:
| And how is what humans do any different ? What does it mean
| to understand ? Are we pattern matching as well?
| oh_my_goodness wrote:
| I asked ChatGPT to help out:
| -----------------------------
|
| "The distinction between AI and humans often comes down
| to the concept of understanding. You're right to point
| out that both humans and AI engage in pattern matching to
| some extent, but the depth and nature of that process
| differ significantly." "AI, like the model you're
| chatting with, is highly skilled at recognizing patterns
| in data, generating text, and predicting what comes next
| in a sequence based on the data it has seen. However, AI
| lacks a true understanding of the content it processes.
| Its "knowledge" is a result of statistical relationships
| between words, phrases, and concepts, not an awareness of
| their meaning or context"
| oh_my_goodness wrote:
| Anyone downvoting, please be aware that you are
| downvoting the AI's answer!
|
| :)
| portaouflop wrote:
| people are downvoting because they don't want to see
| walls of text generated by llms on hn
| oh_my_goodness wrote:
| That's reasonable. I cut back the text. On the other hand
| I'm hoping downvoters have read enough to see that the
| AI-generated comment (and your response) are completely
| on-topic in this thread.
| PKop wrote:
| If we wanted to talk to an LLM we would go there and do
| it, this place if for humans to put in effort and use
| their brains to think for themselves.
| oh_my_goodness wrote:
| With respect, can I ask you to please read the thread?
| PKop wrote:
| Completely missing the point.
|
| We don't care what LLMs have to say, whether you cut back
| some of it or not it's a low effort wasted of space on
| the page.
|
| This is a forum for humans.
|
| You regurgitating something you had no contribution in
| producing, which we can prompt for ourselves, provides no
| value here, we can all spam LLM slop in the replies if we
| wanted, but that would make this site worthless.
| oh_my_goodness wrote:
| I think you're saying that reading the thread is
| completely pointless, because we're all committed to
| having a high-quality discussion.
| ithkuil wrote:
| It's on topic indeed. But is it insightful?
|
| I use llms as tools to learn about things I don't know
| and it works quite well in that domain.
|
| But so far I haven't found that it helps advance my
| understanding of topics I'm an expert in.
|
| I'm sure this will improve over time. But for now, I like
| that there are forums like HN where I may stumble upon an
| actual expert saying something insightful.
|
| I think that the value of such forums will be diminished
| once they get flooded with AI generated texts.
|
| (Fwiw I didn't down vote)
| oh_my_goodness wrote:
| Of course the AI's comment was not insightful. How could
| it be? It's autocomplete.
|
| That was the point. If you back up to the comment I was
| responding to, you can see the claim was: "maybe people
| are doing the same thing LLMs are doing". Yet, for
| whatever reason, many users seemed to be able to pick out
| the LLM comment pretty easily. If I were to guess, I
| might say those users did not find the LLM output to be
| human-quality.
|
| That was exactly the topic under discussion. Some folks
| seem to have expressed their agreement by downvoting. Ok.
| ithkuil wrote:
| I think human brains are a combination of many things.
| Some part of what we do looks quite a lot like an
| autocomplete from our previous knowledge.
|
| Other parts of what we do looks more as a search through
| the space of possibilities.
|
| And then we act and collaborate and test the ideas that
| stand against scrutiny.
|
| All of that is in principle doable by machines. The
| things we currently have and we call LLMs seem to
| currently mostly address the autocomplete part although
| they begin to be augmented with various extensions that
| allow them to take baby steps in other fronts. Will they
| still be called large language models once they will have
| so many other mechanisms beyond the mere token
| prediction?
| thunky wrote:
| No, they're downvoting you for posting an AI answer.
| oh_my_goodness wrote:
| That AI answer is not spam, though. It's literally the
| topic under discussion.
| thunky wrote:
| Yeah, it's just the fact that you pasted in an AI answer,
| regardless of how on point it is. I don't think people
| want this site to turn into an AI chat session.
|
| I didn't downvote, I'm just saying why I think you were
| downvoted.
| Retric wrote:
| The difference is less about noticing patterns than it is
| knowing when to discard them.
| HarHarVeryFunny wrote:
| Sure, we're also pattern matching, but additionally
| (among other things):
|
| 1) We're continually learning so we can update our
| predictions when our pattern matching is wrong
|
| 2) We're autonomous - continually interacting with the
| environment, and learning how it respond to our
| interaction
|
| 3) We have built in biases such as curiosity and boredom
| that drive us to experiment, gain new knowledge, and
| succeed in cases where "pre-training to date" would have
| failed us
| bagful wrote:
| For one, a brain can't do anything without irreversibly
| changing itself in the process; our reasoning is not a
| pure function.
|
| For a person to truly understand something they will have
| a well-refined (as defined by usefulness and
| correctness), malleable internal model of a system that
| can be tested against reality, and they must be aware of
| the limits of the knowledge this model can provide.
|
| Alone, our language-oriented mental circuits are a thin,
| faulty conduit to our mental capacities; we make sense of
| words as they relate to mutable mental models, and not
| simply in latent concept-space. These models can exist in
| dedicated but still mutable circuitry such as the
| cerebellum, or they can exist as webs of association
| between sense-objects (which can be of the physical
| senses or of concepts, sense-objects produced by
| conscious thought).
|
| So if we are pattern-matching, it is not simply of words,
| or of their meanings in relation to the whole text, or
| even of their meanings relative to all language ever
| produced. We translate words into problems, and match
| problems to models, and then we evaluate these internal
| models to produce perhaps competing solutions, and then
| we are challenged with verbalizing these solutions. If we
| were only reasoning in latent-space, there would be no
| significant difficulty in this last task.
| tomrod wrote:
| Humans can extrapolate as well as interpolate.
|
| AI can only interpolate. We may perceive it as
| extrapolation, but all LLM architectures are fundamentally
| just cleverly designed lossy compression.
| acjohnson55 wrote:
| At the end of the day, we're machines, too. I wrote a
| piece a few months ago with an intentionally provocative
| title, questioning whether we're truly on a different
| cognitive level.
|
| https://acjay.com/2024/09/09/llms-think/
| mattgreenrocks wrote:
| It is a wonderful irony that AI makes competence all the more
| important.
|
| It's almost like all the thought leading that proclaimed the
| death of software eng was nothing but self-promotional noise.
| Huh, go figure.
| TrueDuality wrote:
| Don't count it out yet as being problematic for software
| engineering, but not in the way you probably intend with
| your comment.
|
| Where I see software companies using it most is as a
| replacement for interns and junior devs. That replacement
| means we're not training up the next generation to be the
| senior or expert engineers with real world experience. The
| industry will feel that badly at some point unless it gets
| turned around.
| kensey wrote:
| It's also already becoming an issue for open-source
| projects that are being flooded with low-quality (=
| anything from "correct but pointless" to "actually
| introduces functional issues that weren't there before")
| LLM-generated PRs and even security reports --- for
| examples see Daniel Stenberg's recent writing on this.
| mattgreenrocks wrote:
| Agree. I think we are already seeing a hollowing out
| effect on tech hiring at the lower end. They've always
| been squeezed a bit, but it seems much worse now.
| bentt wrote:
| I agree with this irony.
|
| That said, combining multiple AIs and multiple programs
| together may mitigate this.
| scarface_74 wrote:
| Hallucinations can be mostly eliminated with RAG and tools. I
| use NotebookLM all of the time to research through our
| internal artifacts, it includes citations/references from
| your documents.
|
| Even with ChatGPT you can ask it to find web citations and if
| it uses the Python runtime to find answers, you can look at
| the code.
|
| And to prevent the typical responses - my company uses GSuite
| so Google already has our IP, NotebookLM is specifically
| approved by my company and no Google doesn't train on your
| documents
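|
| For reference, a minimal sketch of the RAG pattern being
| described here - retrieve the most relevant internal
| documents, then answer only from them, citing the source
| names. The model and embedding names are just examples,
| and the two inline documents stand in for a real,
| chunked document store:
|
|     import numpy as np
|     from openai import OpenAI
|
|     client = OpenAI()  # assumes OPENAI_API_KEY is set
|
|     # Toy "document store"; in practice these would be
|     # your internal artifacts.
|     docs = {
|         "kickoff-notes.txt": "...internal artifact text...",
|         "requirements.md": "...more internal text...",
|     }
|
|     def embed(texts):
|         resp = client.embeddings.create(
|             model="text-embedding-3-small", input=texts)
|         return np.array([d.embedding for d in resp.data])
|
|     names = list(docs)
|     vecs = embed([docs[n] for n in names])
|
|     def answer(question):
|         q = embed([question])[0]
|         # cosine similarity against each document, keep top 2
|         sims = vecs @ q / (np.linalg.norm(vecs, axis=1)
|                            * np.linalg.norm(q))
|         top = [names[i] for i in sims.argsort()[-2:][::-1]]
|         context = "\n\n".join(f"[{n}]\n{docs[n]}" for n in top)
|         chat = client.chat.completions.create(
|             model="gpt-4o-mini",
|             messages=[
|                 {"role": "system", "content":
|                  "Answer only from the provided sources and cite "
|                  "them by name. Say 'not found' if they don't "
|                  "cover the question."},
|                 {"role": "user", "content":
|                  f"Sources:\n{context}\n\nQuestion: {question}"},
|             ],
|         )
|         return chat.choices[0].message.content
|
|     print(answer("What did we agree on authentication?"))
|
| Grounding answers in retrieved text with citations reduces
| hallucinations because every claim can be traced back to a
| source, but it doesn't guarantee the answer is faithful -
| you still check the citations.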
| hatenberg wrote:
| Even with RAG you're bounded at around 93% accuracy; it's
| not a panacea.
| scarface_74 wrote:
| How are you bounded? When you can easily check the
| sources? Also you act as if humans without LLMs have a
| higher success rate?
|
| There is an entire "reproducibility crisis" with
| research.
| HarHarVeryFunny wrote:
| Facts can be checked with RAG, but the real value of AI
| isn't as a search replacement, but for reasoning/problem-
| solving where the answer isn't out there.
|
| How do you, in general, fact check a chain of reasoning?
| scarface_74 wrote:
| It's not just a search engine though.
|
| I can't tell a search engine to summarize text for a
| technical audience and then produce another summary for a
| non-technical audience.
|
| I recently came into the middle of a cloud consulting
| project where a lot of artifacts, transcripts of
| discovery sessions, requirement docs, etc had already
| been created.
|
| I asked NotebookLM all of the questions I would have
| asked a customer at the beginning of a project.
|
| What it couldn't answer, I then went back and asked the
| customer.
|
| I was even able to get it to create a project plan with
| work streams and epics. Yes, it wouldn't have been
| effective if I didn't already know project management and
| AWS, and have two-plus decades of development experience.
|
| Despite what people think, LLMs can also do a pretty good
| job at coding when well trained on the APIs. Fortunately,
| ChatGPT is well trained on the AWS CLI, SDKs in various
| languages and you can ask it to verify the SDK functions
| on the web.
|
| I've been deep into AWS based development since LLMs have
| been a thing. My opinion may change if I get back into
| more traditional development
| HarHarVeryFunny wrote:
| > I can't tell a search engine to summarize text for a
| technical audience and then produce another summary for a
| non-technical audience.
|
| No, but, as amazing as that is, don't put too much trust
| in those summaries!
|
| It's not summarizing based on grokking the key points of
| the text, but rather based on text vs summary examples
| found in the training set. The summary may pass a surface
| level comparison to the source material, while failing to
| capture/emphasize the key points that would come from
| having actually understood it.
| scarface_74 wrote:
| I _write_ the original content, or I was in the meeting
| whose transcript I'm giving it. I know what points I need
| to get across to both audiences.
|
| Just like I'm not randomly depending on it to do an
| Amazon style PRFAQ (I was indoctrinated as an Amazon
| employee for 3.5 years), create a project plan, etc,
| without being a subject matter expert in the areas. It's
| a tool for an experienced writer, halfway decent project
| manager, AWS cloud application architect and developer.
| _heimdall wrote:
| That sounds mostly like an incentives problem. If OpenAI,
| Anthropic, etc decide their LLMs need to be accurate they
| will find some way of better catching hallucinations. It
| probably will end up (already is?) being yet another LLM
| acting as a control structure trying to fact check responses
| before they are sent to users though, so who knows if it will
| work well.
|
| Right now there's no incentive though. People keep paying
| good money to use these tools despite their hallucinations,
| aka lies/gaslighting/fake information. As long as users
| don't stop paying and LLM companies don't have business
| pressure to lean on accuracy as a market differentiator, no
| one is going to bother fixing it.
| bearjaws wrote:
| Believe me, if they could use another LLM to audit an LLM,
| they would have done that already.
|
| It's inherent to transformers that they predict the next
| most likely token; it's not possible to change that
| behavior without making them useless at generalizing tasks
| (overfitting).
|
| LLMs run on statistics, not logic. There is no fact
| checking, period. There is just the next most likely token
| based on the context provided.
| _heimdall wrote:
| Yeah, it's an interesting question, and I'm a little
| surprised I got downvoted here.
|
| I wouldn't expect them to add an additional LLM layer
| _unless_ hallucinations from the underlying LLM aren't
| acceptable, and in this case that means they're
| unacceptable enough to cost them users and money.
|
| Adding a check/audit layer, even if it would work, is
| expensive both financially and computationally. I'm not
| sold that it would actually work, but I just don't think
| they've had enough reason to really give it a solid
| effort yet either.
|
| Edit: as far as fact checking, I'm not sure why it would
| be impossible. An LLM wouldn't likely be able to run a
| check against a pre-trained model of "truth," but that
| isn't the only option. An LLM should be able to mimic
| what a human would do, interpret the response and search
| a live dataset of sources considered believable. Throw a
| budget of resources at processing the search results and
| have the LLM decide if the original response isn't backed
| up, or contradicts the source entirely.
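|
| A rough sketch of what such a check/audit layer might look
| like, with the source search stubbed out (a real version
| would hit a search API or a vetted corpus; the model name
| is just an example):
|
|     from openai import OpenAI
|
|     client = OpenAI()  # assumes OPENAI_API_KEY is set
|
|     def search_trusted_sources(claim: str) -> list[str]:
|         # Placeholder: swap in a real search over sources
|         # you consider believable.
|         return ["...text of source 1...", "...text of source 2..."]
|
|     def audit(draft_answer: str) -> str:
|         sources = "\n\n".join(search_trusted_sources(draft_answer))
|         verdict = client.chat.completions.create(
|             model="gpt-4o-mini",
|             messages=[
|                 {"role": "system", "content":
|                  "Judge the answer strictly against the sources. "
|                  "Reply SUPPORTED, CONTRADICTED, or UNSUPPORTED, "
|                  "then one sentence of justification."},
|                 {"role": "user", "content":
|                  f"Answer:\n{draft_answer}\n\nSources:\n{sources}"},
|             ],
|         )
|         return verdict.choices[0].message.content
|
| Whether the judging model itself can be trusted is, of
| course, the open question raised above.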
| uludag wrote:
| It's actually even worse than that: the current trend of AI
| is transformer-based deep learning models that use self-
| attention mechanisms to generate token probabilities,
| predicting sequences based on training data.
|
| If only it was something which we could ontologically map
| onto existing categories like servants or liars...
| kraftman wrote:
| If I had a senior member of the team that was incredibly
| knowledgeable but occasionally lied, but in a predictable
| way, I would still find that valuable. Talking to people is a
| very quick and easy way to get information about a specific
| subject in a specific context, so I could ask them targeted
| questions that are easy to verify; the worst thing that
| happens is I 'waste' a conversation with them.
| HarHarVeryFunny wrote:
| Sure, but LLMs don't lie in a predictable way. It's just
| their nature that they output statistical sentence
| continuations, with a complete disregard for the truth.
| Everything that they output is suspect, especially the
| potentially useful stuff that you don't know whether it's
| true or false.
| kraftman wrote:
| They do lie in a predictable way: if you ask them for a
| widely available fact you have a very high probability of
| getting the correct answer; if you ask them for something
| novel you have a very high probability of getting
| something made up.
|
| If I'm trying to use some tool that just got released or
| just got a big update, I won't use AI; if I want to check
| the syntax of a for loop in a language I don't know, I
| will. Whenever you ask it a question you should have an
| idea in your mind of how likely you are to get a good
| answer back.
| HarHarVeryFunny wrote:
| I suppose, but they can still be wrong on common facts,
| like the number of R's in strawberry, that are counter-
| intuitive.
|
| I saw an interesting example yesterday of the type "I have
| 3 apples, my dad has 2 more than me ..." where, of the top
| 10 predicted tokens, about 1/2 led to the correct answer,
| and about 1/2 didn't. It wasn't the most confident
| predictions that led to the right answer - pretty much
| random.
|
| The trouble with LLMs vs humans is that humans learn to
| predict _facts_ (as reflected in feedback from the
| environment, and checked by experimentation, etc),
| whereas LLMs only learn to predict sentence soup
| (training set) word statistics. It's amazing that LLM
| outputs are coherent as often as they are, but entirely
| unsurprising that they are often just "sounds good" flow-
| based BS.
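|
| For anyone who wants to poke at that kind of example, a
| small sketch that prints an open model's top-10 next-token
| predictions (GPT-2 here only because it's small enough to
| run locally; larger models rank differently):
|
|     import torch
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     tok = AutoTokenizer.from_pretrained("gpt2")
|     model = AutoModelForCausalLM.from_pretrained("gpt2")
|
|     prompt = ("I have 3 apples, my dad has 2 more apples "
|               "than me. Together we have")
|     inputs = tok(prompt, return_tensors="pt")
|     with torch.no_grad():
|         # scores for the token that would come next
|         logits = model(**inputs).logits[0, -1]
|     probs = torch.softmax(logits, dim=-1)
|     top = torch.topk(probs, k=10)
|     for p, idx in zip(top.values, top.indices):
|         print(f"{tok.decode(int(idx))!r}: {float(p):.3f}")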
| kraftman wrote:
| I think maybe this is where the polarisation of those who
| find chatGPT useful and those who don't comes from. In
| this context, the number of r's in strawberry is not a
| fact: it's a calculation. I would expect AI to be able to
| spell a common word 100% of the time, but not to be able
| to count letters. I don't think in the summary of human
| knowledge that has been digitised there are that many
| people saying 'how many r's are there in strawberry', and
| if they are I think that the common reply would be '2',
| since the context is based on the second r. (people
| confuse strawbery and strawberry, not strrawberry and
| strawberry).
|
| Your apples question is the same: it's not knowledge, it's
| a calculation, it's intelligence. The only time you're
| going to get intelligence from AI at the moment is to ask
| a question that a significantly large number of people
| have already answered.
| HarHarVeryFunny wrote:
| True, but that just goes to show how brittle these models
| are - how shallow the dividing line is between primary
| facts present (hopefully consistently so) in the training
| set, and derived facts that are potentially more suspect.
|
| To make things worse, I don't think we can even assume
| that primary facts are always going to be represented in
| abstract semantic terms independent of source text. The
| model may have been trained on a fact but still fail to
| reliably recall/predict it because of "lookup failure"
| (model fails to reduce query text to necessary abstract
| lookup key).
| layer8 wrote:
| Lying means stating things as facts despite knowing or
| believing that they are false. I don't think this accurately
| characterizes LLMs. It's more like a fever dream where you
| might fabulate stuff that appears plausibly factual in your
| dream world.
| api wrote:
| After using them for a long time I am convinced they have no
| true intelligence beyond what is latent in training data. In
| other words I think we are kind of fooling ourselves.
|
| That being said they are very useful. I mostly use them as a
| far superior alternative to web search and as a kind of junior
| research assistant. Anything they find must be checked of
| course.
|
| I think we have invented the sci-fi trope of the AI librarian
| of the galactic archive. It can't solve problems but it can
| rifle through the totality of human knowledge and rapidly find
| things.
|
| It's a search engine.
| weakfish wrote:
| I mean, it's known that there's no intelligence if you simply
| look at how it works on a technical level - it's a prediction
| of the next token. Whether they have "intelligence" was never
| really in question.
| api wrote:
| To people who really understand them _and_ are grounded, I
| think you 're right. There has been a lot of hype among
| people who don't understand them as much, a lot of hype
| among the public, and a lot of schlock about
| "superintelligence" and "hard takeoff" etc. among smart but
| un-grounded people.
|
| The latter kind of fear mongering hype has been exploited
| by companies like ClosedAI in a bid for regulatory capture.
| danielbln wrote:
| A little humility would do us good regardless, because we
| don't know what intelligence is and what consciousness
| is, we can't truly define it nor do we understand what
| makes humans conscious and sentient/sapient.
|
| Categorically ruling out intelligence because "it's just
| a token predictor" puts us at the opposite end of the
| spectrum, and that's not necessarily a better place to
| be.
| jghn wrote:
| > it's known that there's no intelligence
|
| To you & me that's true. But especially for the masses
| that's not true. It seems like at least once a day I
| either talk to someone or hear someone via TV/radio/etc.
| who does not understand this.
|
| An example that amused me recently was a radio talk show
| host who had a long segment describing how he & a colleague
| had a long argument with ChatGPT to correct a factual
| inaccuracy about their radio show. And that they finally
| convinced ChatGPT that they were correct due to their
| careful use of evidence & reasoning. And the part they were
| most happy about was how it had now learned, and going
| forward ChatGPT would not spread these inaccuracies.
|
| That anecdote is how the public at large sees these tools.
| dylan604 wrote:
| > radio talk show
|
| Well, there's the first problem.
|
| > were most happy about was how it had now learned
|
| on tomorrow's episode, those same hosts learn that once
| their chat session ended, the same conversation gets to
| start all over from the beginning.
| ithkuil wrote:
| Ironically if you explain to those talk show hosts how
| they are wrong about how ChatGPT learns (or doesn't
| learn) and use all the right arguments and proofs so that
| they finally concede, chances are that they too won't
| quite learn from that and will keep repeating their
| previous bias next time.
| weakfish wrote:
| Oh I totally agree, it bugs me to no end and that's
| partially why I replied :)
| returnInfinity wrote:
| But Ilya has convinced himself and many others that
| predicting the next token is intelligence.
| swid wrote:
| It seems to me predicting things in general is a pretty
| good way to bootstrap intelligence. If you are competing
| for resources, predicting how to avoid danger and catch
| food is about the most basic way to reinforce good
| behavior.
| dylan604 wrote:
| I've convinced myself I'm a multi-millionaire, but all
| other evidence easily contradicts that. Some people put a
| bit too much into the "putting it out there" and "making
| your own reality"
| zcw100 wrote:
| And a plagiarism machine. It's like a high school student
| who thinks that if they change a couple of words and make
| sure it's grammatically correct, it's not plagiarism
| because it's not an exact quote. Either that or it just
| completely makes it up. I think LLMs will be revolutionary,
| but just not in the
| way people think. It may be similar to the Gutenberg press.
| Before the printing press words were precious and closely
| held resources. The Gutenberg press made words cheap and
| abundant. Not everyone thought it was a good thing at the
| time but it changed everything.
| coffeefirst wrote:
| The problem is it's still a computer. And that's okay.
|
| I can ask the computer "hey I know this thing exists in your
| training data, tell me what it is and cite your sources." This
| is awesome. Seriously.
|
| But what that means is you can ask it for sample code, or to
| answer a legal question, but fundamentally you're getting a
| search engine reading something back to you. It is not a
| programmer and it is not a lawyer.
|
| The hype train _really_ wants to exaggerate this to "we're
| going to steal all the jobs" because that makes the stock price
| go up.
|
| They would be far less excited about that if they read a little
| history.
| insane_dreamer wrote:
| > "we're going to steal all the jobs"
|
| It won't steal them all, but it will have a major impact by
| stealing the lower level jobs which are more routine in
| nature -- but the problem is that those lower level jobs are
| necessary to gain the experience needed to get to the higher
| level jobs.
|
| It also won't eliminate jobs completely, but it will greatly
| reduce the number of people needed for a particular job. So
| the impact that it will have on certain trades --
| translators, paralegals, journalists, etc. -- is significant.
| ethbr1 wrote:
| The thing that makes the smarter search use case interesting
| is _how_ LLMs are doing their search result calculations:
| dynamically and at metadata scales previously impossible.
|
| LLM-as-search is essentially the hand-tuned expert systems AI
| vs deep learning AI battle all over again.
|
| Between natural language understanding and multiple
| correlations, it's going to scale a lot further than previous
| search approaches.
| cshores wrote:
| I find it fascinating that I can achieve about 85-90% of what
| I need for simple coding projects in my homelab using AI.
| These projects often involve tasks like scraping data from
| the web and automating form submissions.
|
| My workflow typically starts with asking ChatGPT to analyze a
| webpage where I need to authenticate. I guide it to identify
| the username and password fields, and it accurately detects
| the credential inputs. I then inform it about the presence of
| a session cookie that maintains login persistence. Next, I
| show it an example page with links--often paginated with
| numbered navigation at the bottom--and ask it to recognize
| the pattern for traversing pages. It does so effectively.
|
| I further highlight the layout pattern of the content, such
| as magnet links or other relevant data presented by the CMS.
| From there, I instruct it to generate a Python script that
| spiders through each page sequentially, navigates to every
| item on those pages, and pushes magnet links directly into
| Transmission. I can also specify filters, such as only
| targeting items with specific media content, by providing a
| sample page for the AI to analyze before generating the
| script.
|
| This process demonstrates how effortlessly AI enables coding
| without requiring prior knowledge of libraries like
| beautifulsoup4 or transmission_rpc. It not only builds the
| algorithm but also allows for rapid iteration. Through this
| exercise, I assume the role of a manager, focusing solely on
| explaining my requirements to the AI and conducting a code
| review.
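|
| For a sense of scale, the kind of script that comes out of
| that exercise is roughly this (the URLs, form fields and
| selectors are made up here and would come from inspecting
| the real site; it assumes a local Transmission daemon):
|
|     import requests
|     from bs4 import BeautifulSoup
|     from transmission_rpc import Client
|
|     BASE = "https://example-cms.invalid"   # hypothetical site
|     session = requests.Session()
|     session.post(f"{BASE}/login",
|                  data={"username": "me", "password": "secret"})
|     transmission = Client(host="localhost", port=9091)
|
|     page = 1
|     while True:
|         resp = session.get(f"{BASE}/browse", params={"page": page})
|         soup = BeautifulSoup(resp.text, "html.parser")
|         magnets = [a["href"]
|                    for a in soup.select('a[href^="magnet:"]')]
|         if not magnets:
|             break                    # ran out of pages
|         for magnet in magnets:
|             # push straight into Transmission
|             transmission.add_torrent(magnet)
|         page += 1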
| insane_dreamer wrote:
| > with vast and unreliable intelligence
|
| I would say "knowledge" rather than "intelligence"
|
| The key feature of LLMs is the vast amounts of information and
| data they have access to, and their ability to quickly process
| and summarize, using well-written prose, that information based
| on pattern matching.
| ethbr1 wrote:
| This is what LLM (and AI in general) naysayers are missing.
|
| LLMs will likely never get us to 100% solutions on a large
| class of problems.
|
| But! A lot of problems can be converted into versions with a
| subcomponent that LLMs can solve 100% of.
|
| And the fusion of LLMs doing 100% of that subportion + humans
| doing the remainder = increased productivity.
|
| Re-engineering problems to be LLM-tolerant, then using LLMs
| to automate that portion of the problem, is the winning
| approach.
| cess11 wrote:
| So you think machines running statistical inference have
| awareness.
|
| That's quite the embarrassment if you actually mean it.
| insane_dreamer wrote:
| I said nothing of the sort
| cess11 wrote:
| OK, so you have your own definition of knowledge. Please
| share it.
| cactacea wrote:
| > A group of individuals adept with use of such an idiot savant
| enhanced environment would be incredibly capable. They'd be a
| force unseen in human civilization before today.
|
| I'm sorry but your comment is a good example of the logical
| shell game many people play with AI when applying it to general
| problem solving. Your LLM AI is both an idiot and an expert
| somehow? Where is this expertise derived from and why should
| you trust it? If LLMs were truly as revolutionary as all the
| grifters would have you believe, then why do we not see "forces
| unseen in human civilization before today" by humans who
| employ armies of interns? That these supposed ubermensch do not
| presently exist is firm evidence in support of current AI being
| a dead end in my opinion.
|
| Humans are infinitely more capable than current AI; the
| limiting factor is time and money, not capability!
| dylan604 wrote:
| > Your LLM AI is both an idiot and an expert somehow?
|
| Maybe you are unfamiliar with the term idiot savant?
| cactacea wrote:
| I am indeed familiar with the term. Savant and expert are
| not perfect synonyms. That is beside my point anyway.
| monkeynotes wrote:
| I was so stupid when GPT3 came out. I knew so little about
| token prediction, I argued with folks on here that it was
| capable of so many things that I now understand just aren't
| compatible with the tech.
|
| Over the past couple of years of educating myself a bit, whilst
| I am no expert I have been anticipating a dead end. You can
| throw as much training at these things as you like, but all
| you'll get is more of the same with diminishing returns. Indeed
| in some research the quality of responses gets worse as you
| train it with more data.
|
| I have yet to see anything transformative out of LLMs other
| than demos that prompt engineers worked on night and day to
| make something impressive. Those Sora videos took forever to
| put together, and cost huge amounts of compute. No one is going
| to make a whole production quality movie with an LLM and
| disrupt Hollywood.
|
| I agree, an LLM is like an idiot savant, and whilst it's
| fantastic for everyone to have access to a savant, it doesn't
| change the world like the internet, or internal combustion
| engine did.
|
| OpenAI is heading toward some difficult decisions, they either
| admit their consumer business model is dead and go into
| competing with Amazon for API business (good luck), become a
| research lab (give up on being a billion dollar company), or
| get acquired and move on.
| pnut wrote:
| Criticisms like this are levied against an excessively narrow
| (obsolete?) characterisation of what is happening in the AI
| space currently.
|
| After reading about o3's performance on ARC-AGI, I strongly
| suspect people will not be so flippantly dismissive of the
| inherent limits of these technologies by this time next year.
| I'm genuinely surprised at how myopic HN commentary is on this
| topic in general. Maybe because the implications are almost
| unthinkably profound.
|
| Anyway, OpenAI, Anthropic, Meta, and everyone else are well
| aware of these types of criticisms, and are making significant,
| measurable progress towards architecturally solving the
| deficiencies.
|
| https://arcprize.org/blog/oai-o3-pub-breakthrough
| jokethrowaway wrote:
| Nah, the trick with o3 solving IQ tests seems to be that they
| bruteforce solutions and then pick the best option. That's
| why calls that are trivial for humans end up costing a lot.
|
| It still can't think and it won't think.
|
| A LANGUAGE model (keyword: language) is just that: a
| language model. It should be paired with a reasoning
| engine and used to translate the inner thought of the
| machine into human language. It should not be the source
| of decisions, because it sucks at making them, even though
| the network can exhibit some intelligence.
|
| We will never have AGI with just a language model. That said,
| most jobs people do are still at risk, even with chatgpt-3.5
| (especially outside of knowledge work, where difficult
| decisions need to be taken). So we'll see the problems with
| AGI and the job market way earlier than AGI, as soon as we
| apply robotics and vision models + chatgpt 3.5 level
| intelligence. Goodbye baristas, goodbye people working in
| factories.
|
| Let's start working on a reasoning engine so we can replace
| those pesky knowledge workers too.
| esafak wrote:
| The important thing is that you can use inference-time
| computation to improve results. Now the race is on to
| optimize that.
| ithkuil wrote:
| How many attempts can you have when running an evaluation
| run of an ARC competition?
| rafaelmn wrote:
| Reading the o1 announcement, you could have been saying the
| same thing a year ago, yet it's worse than Claude in
| practice, and if it were all that's available I wouldn't
| even use it if it were free - it's that bad.
|
| If OpenAI has demonstrated one thing, it's that they are a
| hype production machine, and they are probably getting
| ready for the next round of investment. I wouldn't be
| surprised if this model were just as useless as o1 when you
| factor in performance and price.
|
| At this point they are completely untrustworthy, and until
| something lands publicly for me to test, it's safe to
| ignore their PR as complete BS.
| benterix wrote:
| > yet it's worse than Claude in practice
|
| For most tasks - but not all. I normally paste my prompt in
| both and while Claude is generally superior in most
| aspects, there are tasks at which o1 performed slightly
| better.
| portaouflop wrote:
| Any day now!
| wavemode wrote:
| AGI is the new nuclear fusion.
| voidfunc wrote:
| Except AI is actually delivering value and is on the path
| to AGI... and nuclear fusion continues to be a physics and
| engineering pipe dream.
| JasserInicide wrote:
| What actual widespread non-shareholder value has AI given
| us?
| danielbln wrote:
| Considering the advancements we've seen in the last three
| years, this dismissive comment feels misplaced.
| portaouflop wrote:
| Let's just wait and see - what good can come from endless
| speculation and what ifs and trying to predict the
| future?
| gtirloni wrote:
| Considering the advancements we've seen in the last one
| year, it does not.
| mbesto wrote:
| > I strongly suspect people will not be so flippantly
| dismissive of the inherent limits of these technologies by
| this time next year.
|
| People are flippantly dismissive of the inherent limits
| because there ARE inherent limitations of the technology.
|
| > Maybe because the implications are almost unthinkably
| profound.
|
| Maybe because the stuff you're pointing to are just
| benchmarks and the definitions around things like AGI are
| flawed (and the goalposts are constantly moving, just like
| the definition of autonomous driving). I use LLMs roughly
| 20-30x a day - they're an absolutely wonderful tool and work
| like magic, but they are flawed for some very fundamental
| reasons.
| greentxt wrote:
| Humans are not flawed? Are robotaxis not autonomous
| driving? (Could an LLM have written this post?)
| manquer wrote:
| Humans are not machines. They have both rights that
| machines do not have and also responsibilities and
| consequences that machines will not have; for example, bad
| driving will cost you money, injury, prison time or even
| death.
|
| Therefore AI has to be much better than humans at the
| task to be considered ready to be a replacement.
|
| ----
|
| Today robot taxis can only work in fair weather
| conditions in locations that are planned cities. No
| autonomous driving system will be able to drive in Nigeria
| or India, or even many European cities that were never
| designed for cars, any time soon.
|
| Working in very specific scenarios is useful, but it's
| hardly a measure of their intelligence or grounds for
| replacing humans at the task.
| space_fountain wrote:
| I hear people say this kind of thing but it confuses me.
|
| 1. What does 'inherent limitation' mean?
|
| 2. How do we know something is an inherent limitation?
|
| 3. Is it a problem if arguments for a particular inherent
| limitation also apply to humans?
|
| From what I've seen people will often say things like AI
| can't be creative because it's just a statistical machine,
| but humans are also "just" statistical machines. People
| might mean something like humans are more grounded because
| humans react not just to how the world already works but
| how the world reacts to actions they take, but this
| difference misunderstands how LLMs are trained. Like humans,
| LLMs get most of their training from observing the world,
| but LLMs are also trained with reinforcement learning, and
| this will surely remain an active area of research.
| mbesto wrote:
| > 1. What does 'inherent limitation' mean?
|
| One of many, but this is a simple one - LLMs are limited
| to knowledge that is publicly available on the internet.
| This is "inherent" because that's how LLMs are essentially
| taught the information they retrieve today.
| space_fountain wrote:
| But this isn't an inherent limitation, is it? LLMs can be
| trained with private information and can have large
| context windows full of private info.
| netdevphoenix wrote:
| You remember when Google was scared to release LLMs? You
| remember that Googler that got fired because he thought the
| LLM was sentient?
|
| There are likely a couple of surprises still left in LLMs,
| but no one should think that any present technology in its
| current state or architecture will get us to AGI or
| anything that remotely resembles it.
| gosub100 wrote:
| > Maybe because the implications are almost unthinkably
| profound.
|
| laundering stolen IP from actual human artists and
| researchers, extinguishing jobs, deflecting responsibility
| for disasters. yeah, I can't wait for these "profound
| implications" to come to fruition!
| lgas wrote:
| The implications of the technology are not impacted by how
| the technology was created or where the IP was sourced.
| formerly_proven wrote:
| It doesn't really matter. "It works and is cost/resource-
| effective at being an AGI" is a fundamentally uninteresting
| proposition because we're done at that point. It's like
| debating how we're going to deal with the demise of our star;
| we won't, because we can't.
| fabianhjr wrote:
| > The question of whether a computer can think is no more
| interesting than the question of whether a submarine can
| swim. ~ Edsger W. Dijkstra
|
| LLMs / Generative Models can have a profound societal and
| economic impact without being intelligent. The obsession with
| intelligence only make their use haphazard and dangerous.
|
| It is a good thing courts of law have established precedent
| that organizations deploying LLM chatbots are responsible
| for their output (e.g., an Air Canada LLM chatbot promising
| a non-existent discount being the responsibility of Air
| Canada).
|
| Also most automation has been happening without
| LLMs/Generative Models. Things like better vision systems
| have had an enormous impact with industrial automation and
| QA.
| agentultra wrote:
| The conclusion of the article admits that in areas where
| stochastic outputs are expected these AI models will
| continue to be useful.
|
| It's in areas where we demand correctness and determinism
| that they will not be suitable.
|
| I think the thrust of this article is hard to see unless
| you have some experience with formal methods and
| verification. Or else accept the authors' explanations as
| truth.
| n144q wrote:
| I'll believe that when ChatGPT stops making up APIs that have
| never ever existed in the history of a library.
|
| The dumbest intern doesn't do that.
|
| Which is the entire point of the article that your comment
| fails to address.
| ojhughes wrote:
| In fairness, I've never experienced this using Claude with
| Cursor.
| jondwillis wrote:
| Use Cursor or something similar and feed it documentation
| as context. Problem solved.
| zwnow wrote:
| Lmao, you are the type of person who actually believes this
| Silicon Valley BS. o3 is far, far away from AGI.
| aaroninsf wrote:
| Indeed, this put me immediately in mind of Ximm's Law:
|
| Every critique of AI assumes to some degree that contemporary
| implementations will not, or cannot, be improved upon.
|
| Lemma: any statement about AI which uses the word "never" to
| preclude some feature from future realization is false.
| layer8 wrote:
| And every advocate of AI assumes that it will necessarily
| and reasonably swiftly be improved to the level of AGI.
| Maybe assume neither?
| cootsnuck wrote:
| But o3 is just a slightly less stupid idiot savant...it still
| has to brute force solutions. Don't get me wrong, it's cool
| to see how far that technique can get you on a specific
| benchmark.
|
| But the point still stands that these systems can't be
| treated as deterministic (i.e. reliable or trustworthy) for
| the purposes of carrying out tasks that you can't allow
| "brute forced attempts" for (e.g. anything where the desired
| outcome is a positive subjective experience for a human).
|
| A new architecture is going to be needed that actually does
| something closer to our inherently heuristic based learning
| and reasoning. We'll still have the stochastic problem but
| we'll be moving further away from the idiot savant problem.
|
| All of this being said, I think there's plenty of usefulness
| with current LLMs. We're just expecting the wrong things from
| them and therefore creating suboptimal solutions. (Not
| everyone is, but the most common solutions are, IMO.)
|
| The best solutions require rethinking how we typically use
| software, since software has hinged upon being able to
| expect (and therefore test) deterministic outputs from a
| limited set of user inputs.
|
| I work for an AI company that's been around for a minute
| (make our own models and everything). I think we're both in
| an AI hype bubble while simultaneously underestimating the
| benefits of current AI capabilities. I think the most
| interesting and potentially useful solutions are inherently
| going to be so domain specific that we're all still too new
| at realizing we need to reimagine how to build with this new
| tech in mind. It reminds me of the beginning of mobile apps.
| It took a while for most of us to "get it".
| turboat wrote:
| Can you elaborate about your predictions for how the
| benefits of current capabilities will be applied? And your
| thoughts on how to build with it?
| JohnMakin wrote:
| > After reading about o3's performance on ARC-AGI, I strongly
| suspect people will not be so flippantly dismissive of the
| inherent limits of these technologies by this time next year.
|
| If I weren't so slammed with work, I'd have half a mind to go
| dredge up at least a dozen posts that said the same thing
| last year, and the year before. Even OpenAI has been moving
| the goalposts here.
| tsurba wrote:
| My favorite quote in this topic:
|
| "If intelligence lies in the process of acquiring new skills,
| there is no task X that solving X proves intelligence"
|
| IMO it especially applies to things like solving a new IQ
| puzzle, especially when the model is pretrained for that
| particular task type, like was done with ARC-AGI.
|
| For sure, it's very good research to figure out what kind of
| tasks are easy for humans and difficult for ML, and then
| solve them. The jump in accuracy was surprising. But in
| practice the models are still unbelievably stupid and
| lacking in common sense.
|
| My personal (moving) goalpost for "AGI" is now set to whether
| a robot can keep my house clean automatically. It's not
| general intelligence if it can't do the dishes. And before
| physical robots, being less of a turd at making working code
| would be a nice start. I'm not yet convinced general purpose
| LLMs will lead to cost-effective solutions to either vs
| humans. A specifically built dish washer however...
| cormackcorn wrote:
| Myopic? You must be under 20 years old. For those of us who
| have been in tech for over four decades, the OP's assessment
| is exactly the right framing.
| benterix wrote:
| > After reading about o3's performance
|
| I heard that people still believing in OpenAI hype exist but
| I haven't met any IRL.
| FuriouslyAdrift wrote:
| LLMs are fuzzy compression with a really good natural language
| parser...
| danielbln wrote:
| And strong in-context learning, the true killer feature.
| ithkuil wrote:
| In order to understand natural language well you need quite
| a lot of general knowledge just to understand what the
| sentence actually means
| ALittleLight wrote:
| You really shouldn't say LLMs "never graduate" to experienced
| staff - rather that they haven't yet. But there are recent and
| continuing improvements in the ability of the LLMs, and in
| time, perhaps a small amount of time, this situation may flip.
| bsenftner wrote:
| I'm talking about the current SOTA. In the future, all bets
| are off. For today, they are very capable when paired with a
| capable person, and that is how one uses them successfully
| today. Tomorrow will be different, of course.
| brookst wrote:
| I think you've exactly captured the two disparate views we see
| on HN:
|
| 1. LLMs have little value, are totally unreliable, and will
| never amount to much because they don't learn and grow and
| mature like people do, so they cannot replace a person like
| me who is well advanced in a career.
|
| 2. LLMs are incredibly useful and will change the world because
| they excel at entry level work and can replace swaths of
| relatively undifferentiated information workers. LLM flaws are
| not that different from those workers' flaws.
|
| I'm in camp 2, but I appreciate and agree with the articulation
| of why they will not replace every information worker.
| layer8 wrote:
| This all sounds plausible, but personally I find being paired
| to a new idiot-savant hire who never learns anything from the
| interaction incredibly exhausting. It can augment and amplify
| one's own capabilities, but it's also continuously frustrating
| and cumbersome.
| iambateman wrote:
| While these folks waste breath debating whether AI is useful, I'm
| going to be over here...using it.
|
| I use AI 100 times a day as a coder and 10,000 times a day in
| scripts. It's enabled two specific applications I've built which
| wouldn't be possible at single-person scale.
|
| There's something about the psychology of some subset of the
| population that insists something isn't working when it isn't
| _quite_ working. They did this with Wikipedia. It was evident
| that Wikipedia was 99% great for years before this social
| contingent was ready to accept it.
| Mistletoe wrote:
| But please accept that you are in a small subset of people that
| it is very useful to. Every time I hear someone championing AI,
| it is a coder. AI is basically useless to me, it is just a
| convoluted expensive google search.
| tossandthrow wrote:
| I use ai to care for my plants, to give me recipes for pan
| pancakes, to help me fix my coffee machine.
|
| LLMs as a popularized thing is just about 2 years old. It is
| still mainly early adopters.
|
| For smartphones it might have taken 10 to 15 years to gain
| widespread traction.
|
| I think it is safe to say that we are only scratching the
| surface.
| giraffe_lady wrote:
| These are not categories that needed this change or benefit
| from it. Specific plant care is one of the easiest things
| to find information about. And are you serious you couldn't
| find a pancake recipe? The coffee machine idk it depends on
| what you did. But the other two are like a parody of AI use
| cases. "We made it slightly more convenient, but it might
| be wrong now and also burns down a tree every time you use
| it."
| qup wrote:
| > "We made it slightly more convenient, but it might be
| wrong now and also burns down a tree every time you use
| it."
|
| Sounds like early criticisms of the internet. I assume
| you mean he should be doing those things with a search
| engine, but maybe we shouldn't allow that either. Force
| him to use a book! It may be slightly less convenient,
| and could still be wrong, but...
| tossandthrow wrote:
| I ought to ask my dead grandma to save a kg of co2.
| giraffe_lady wrote:
| Before crypto and AI, computing in general and the
| internet in particular were always an incredible deal in
| terms of how much societal value we get out of them for
| the electricity consumed.
| Loughla wrote:
| >It is still mainly early adopters.
|
| I just disagree with this. Every B2B or SaaS company is
| marketing itself as using hallucination-free AI.
|
| We're waaaaayyyyyy past the early adoption stage, and the
| product hasn't meaningfully improved.
| coffeebeqn wrote:
| I also use it for plant care tips: what should I feed this
| plant, what kind of soil to use, and all the questions I
| never bothered to Google and then crawl through some long
| blog article for.
| davidmurdoch wrote:
| Do you not use it to try learning new things? I use it to
| help get familiar with new software (recently for FreeCAD),
| or new concepts (passive speaker crossover design).
| josh2600 wrote:
| Wow, this is a wild opinion. I wonder how many people you've
| talked to about this?
|
| I know tons of people in my social groups who love AI and
| use it every day in its current form.
| herval wrote:
| it's _extremely_ useful for lawyers, arguably even more so
| than for coders, given how much faster they can do stuff.
| They're also extremely useful for anyone who writes text and
| wants a reviewer, and capable of executing most daily
| activities of some roles, such as TPMs.
|
| It's still useful to a small subset of all those professions
| - the early adopters. Same way computers were useful to many
| professionals before the UI (but only a small fraction of
| them had the skillset to use terminals)
| singleshot_ wrote:
| > it's _extremely_ useful for lawyers,
|
| How so? How are you using LLMs to practice law? Genuinely
| curious.
| herval wrote:
| multiple lawyer friends I know are using chatgpt (and
| custom gptees) for contract reviews. They upload some
| guidelines as knowledge, then upload any new contract for
| validation. Allegedly replaces hours of reading. This is
| a large portion of the work, in some cases. Some of them
| also use it to debate a contract, to see if there's
| anything they overlooked or to find loopholes. LLMs are
| extremely good at that kind of constrained creativity
| mode where they _have_ to produce something (they suck at
| saying "I dont know" or "no"), so I guess it works as
| sort of a "second brain" of sorts, for those too.
|
| There are even reported cases of entire pieces of
| legislation being written with LLMs already [1]. I'm sure
| there are thousands more we haven't heard about - the same
| way researchers are writing papers w/ LLMs w/o disclosing
| it.
|
| [1] https://olhardigital.com.br/2023/12/05/pro/lei-
| escrita-pelo-...
| rsynnott wrote:
| Five years later, when the contract turns out to be
| defective, I doubt the clients are going to be _thrilled_
| with "well, no, I didn't read it, but I did feed it to a
| magic robot".
|
| Like, this is malpractice, surely?
| transcriptase wrote:
| It only has to be less likely to cause that issue than a
| paralegal to be a net positive.
|
| Some people expect AI to never make mistakes when doing
| jobs where people routinely make all kinds of mistakes of
| varying severity.
|
| It's the same as how people expect self-driving cars to
| be flawless when they think nothing of a pileup caused by
| a human watching a reel while behind the wheel.
| WhyOhWhyQ wrote:
| Any evidence it's actually better than a paralegal? I
| doubt it is.
| voltaireodactyl wrote:
| In the pileup example, the human driver is legally at
| fault. If a self driving car causes the pileup, who is at
| fault?
| qup wrote:
| Well, maybe its wheel fell off.
|
| So, the mechanic who serviced it last?
|
| ...
|
| We don't fault our tools, legally. We usually also don't
| fault the manufacturer, or the maintenance guy. We fault
| the people using them.
| herval wrote:
| My understanding is the firm operating the car is liable,
| in the full self driving case of commercial vehicles
| (waymo). The driver is liable in supervised self driving
| cases (privately owned Tesla)
| herval wrote:
| This is malpractice the same way that a coder using
| Copilot is malpractice
| tiahura wrote:
| Drafting demand letters, drafting petitions, drafting
| discovery requests, drafting discovery responses,
| drafting golden rule letters, summarizing meet and confer
| calls, drafting motions, responding to motions, drafting
| depo outlines, summarizing depos, ...
|
| If you're not using AI in your practice, you're doing a
| disservice to your clients.
| singleshot_ wrote:
| How do you get the LLM to the point where it can draft a
| demand letter? I guess I'm a little confused as to how
| the LLM is getting the particulars of the case in order
| to write a relevant letter. Are you typing all that stuff
| in as a prompt? Are you dumping all the case file
| documents in as prompts and summarizing them, and then
| dumping the summaries into the prompt?
| tiahura wrote:
| Demand letters are the easiest. Drag and drop police
| report and medical records. Tell it to draft a demand
| letter. For most things, there are only a handful
| critical pages in the medical records, so if the original
| pdf is too big, I'll trim excess pages. I may also add my
| personal case notes.
|
| I use a custom prompt to adjust the tone, but that's
| about it.
| herval wrote:
| curious about what tools you're using - is it just
| chatgpt? Any other apps/services/models?
| spzb wrote:
| Except for those lawyers who rely on it for case law eg
| https://law.justia.com/cases/federal/district-courts/new-
| yor...
| herval wrote:
| I think the big mistake is _blindly relying on the
| results_ - although that problem has been improving
| dramatically (gpt3.5 hallucinated constantly, I rarely
| see a hallucination w/ the latest gpt/claude models)
| y1n0 wrote:
| My wife uses it almost as much as me which isn't quite daily.
| She is not a coder whatsoever.
|
| I'll ask her what her use cases are and reply here later if I
| don't forget.
| umanwizard wrote:
| Walk into any random coffee shop in America where people are
| working on their laptops and you will see some subset of them
| on ChatGPT. It's definitely not just coders who are finding
| it useful.
| Ukv wrote:
| Particularly given the article's target is "systems based on
| large neural networks" and not specifically LLMs, I'd claim
| there are a vast number of uncontroversially beneficial
| applications: language translation, video transcription,
| material/product defect detection, weather forecasting/early
| warning systems, OCR, spam filtering, protein folding, tumor
| segmentation, drug discovery and interaction prediction, etc.
| stickfigure wrote:
| > convoluted expensive google search
|
| I'd call it a _working_ google search, unlike, you know,
| google these days.
|
| Actually google's LLM-based search results have been getting
| better, so maybe this isn't the end of the line for them. But
| for sophisticated questions (on noncoding topics!) I still
| always go to chatgpt or claude.
| coliveira wrote:
| > google's LLM-based search results have been getting
| better
|
| don't worry, Google WILL change this because they don't
| make money when people find the answer right away. They
| want people to see multiple ads before leaving the site.
| FrustratedMonky wrote:
| It's being used in drive-through windows, in movies, in
| graphic design, podcasts, music, etc. - the "entertainment"
| industry.
|
| And it isn't just a few oddballs on HN championing it. I
| wish there were a way to get a sentiment analysis of HN; it
| seems there are a lot more people using it than not using it.
|
| And, what about the silent majority, the programmers that
| don't hang out on HN? I hear colleagues talk about it all the
| time.
|
| The impact is here, whether they are self-directed or not,
| or whether there are still a few people not using it.
| sebastiansm wrote:
| Yesterday ChatGPT helped me put together a skincare routine
| for my wife with multiple serums and creams that she received
| for Christmas. She and I had no idea when to apply, how to
| combine or when to avoid combination of some of those
| products.
|
| I could have googled it myself in the evenings and had the
| answer in a few days of research, but with o1, in a 15-minute
| session my wife had a solid weekly routine, the reasoning
| behind those choices, and academic papers with research about
| those products. (Obviously she knows a lot about skincare in
| general, so she had the capacity to recognize any wrong
| recommendation).
|
| Nothing game-changing, but it's great for saving lots of
| time on this kind of task.
| exe34 wrote:
| Mention bleach and motor oil and see if it manages to
| exclude those!
| diego_sandoval wrote:
| If you think it won't exclude them 100% of the time, then
| you haven't used o1.
| irunmyownemail wrote:
| It's 2 days after Christmas, too early to know the impact
| of the purchases made based on what AI recommended, either
| positive or negative.
|
| If you're relying on AI to replace a human doctor trained
| in skin care, or alternatively your Google skills, please
| consider consulting an actual doctor.
|
| If she "knows a lot about skincare in general, so she had
| the capacity to recognize any wrong recommendation", then
| what did AI actually accomplish in the end?
| YeGoblynQueenne wrote:
| >> It's 2 days after Christmas, too early to know the
| impact of the purchases made based on what AI
| recommended, either positive or negative.
|
| No worries, I can tell you what to expect: nothing. No
| effect. Zilch. Nada. Zero. Those beauty creams are just a
| total scam and that's obvious from the fact they're
| targeted just as much at women who don't need them
| (young, good skin) as at ones who do (older, bad skin).
|
| About the only thing the beauty industry has figured out
| really works in the last five or six decades is
| Tretinoin, but you can use that on its own. Yet it's sold
| as one component in creams with a dozen others, that do
| nothing. Except make you spend money.
| YeGoblynQueenne wrote:
| Forgot to say: you can buy Tretinoin at the pharmacy,
| over the counter even depending on where you are. They
| sell it as a treatment for acne. It's also shown to
| reduce wrinkles in RCTs [1]. It's dirt cheap and you
| absolutely don't need to buy it as a beauty cream and pay
| ten times the price.
|
| _____________
|
| [1] _Topical tretinoin for treating photoaging: A
| systematic review of randomized controlled trials_ (2022)
|
| https://pmc.ncbi.nlm.nih.gov/articles/PMC9112391/
| drooby wrote:
| > convoluted expensive google search
|
| Interesting, I'm the opposite now. Why would I click a couple
| links to read a couple (verbose) blog posts when I can read a
| succinct LLM response. If I have low confidence in the
| quality of the response then I supplement with Google search.
|
| I feel near certain that I am saving time with this method.
| And the output is much more tuned to the context and framing
| of my question.
|
| Hah, take for example my last query in ChatGPT:
|
| > Are there any ancient technologies that when discovered
| furthered modern understanding of its field?
|
| ChatGPT gave some great responses, super fast. Google also
| provides some great results (though some miss the mark), but
| I would need to parse at least three different articles and
| condense the results.
|
| To be fair, ChatGPT gives some bad responses too. But an
| LLM and Google search should be used in conjunction,
| performing the same search with both at the same time.
|
| Use LLMs as breadth-first search, and Google as depth-first
| search.
| paulcole wrote:
| > Every time I hear someone championing AI, it is a coder
|
| The argument I make is why aren't more people finding ways to
| code with AI?
|
| I work in a leadership role at a marketing agency and am a
| passable coder for scripts using Python and/or Google Apps
| Scripts. In the past year, I've built more useful and
| valuable tools with the help of AI than I had in the 3 or so
| years before.
|
| We're automating more boring stuff than ever before. It
| boggles my mind that everybody isn't doing this.
|
| In the past, I was limited by technical ability, even
| though my knowledge of our business and processes was very
| high. Now I'm finding that technical ability isn't my
| limitation; it's how well I can explain our processes to
| AI.
| mbesto wrote:
| Not a coder here (although I can code). I use LLMs 15+ times
| a day.
| tiahura wrote:
| I'm a lawyer and AI has become deeply integrated into my
| work.
| AnotherGoodName wrote:
| I'd argue that's just because coders are first to line up for
| this.
|
| There was a different thread on this site I read where a
| journalist used the wrong units of measurement (kilowatts
| instead of kilowatt-hours for energy storage). You could
| paste the entire article into ChatGPT with a prompt "spot
| mistakes in the following; [text]" and get an appropriate
| correction for this and similar mistakes the author made.
|
| As in, there are journalists right now posting articles with
| clear mistakes that could have been proofread more
| accurately than they were if they were willing to use AI.
| The only excuse I can think of is resistance to change. A
| lot of professions right now could do their jobs better if
| they leaned on the current generation of AI.
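|
| That proofreading pass is nearly a one-liner against the
| API - a sketch, with the model name just an example and
| the output still needing a human read:
|
|     from openai import OpenAI
|
|     client = OpenAI()  # assumes OPENAI_API_KEY is set
|
|     article = open("draft.txt").read()
|     resp = client.chat.completions.create(
|         model="gpt-4o-mini",
|         messages=[{"role": "user", "content":
|                    "Spot mistakes in the following, especially "
|                    "factual and unit errors:\n\n" + article}],
|     )
|     print(resp.choices[0].message.content)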
| oytis wrote:
| In my bubble coders find LLMs least useful. After all we
| already have all kinds of fancy autocomplete that works
| deterministically and doesn't hallucinate - and still not
| everyone uses it.
|
| When I use LLMs, I use it exactly as Google search on
| steroids. It's great for providing a summary on some unknown
| topic. It doesn't matter if it gets it wrong - the main value
| is in keywords and project names, and one can use the real
| Google search from there.
|
| And it isn't expensive if you are using the free version
| giraffe_lady wrote:
| These are different social contingents I think. At least for me
| I was super on board with wikipedia because as you say the use
| to me was immediate and certain. AI I have tried every few
| months for the last two years but I still haven't found a
| strong use for it. It has changed nothing for me personally
| except making some products I use worse.
| llm_trw wrote:
| Have you paid for it?
| giraffe_lady wrote:
| Yes my work pays for several of them. I don't particularly
| enjoy coding so believe me I have sincerely tried to get
| this to work.
| timcobb wrote:
| Cursor has been quite the jaw-dropping game changer for
| me for greenfield hobby dev.
|
| I don't know how useful it would be for my job, where I
| do maintenance on a pretty big app, and develop features
| on this pretty big app. But it could be great, I just
| don't know because work only allows Copilot. And Copilot
| is somewhere between annoying and novelty in my opinion.
| Loughla wrote:
| AI is only useful for me if I have a good idea of what the
| answer might already be, or at least what it absolutely can't
| be.
|
| It helps me get to an answer a little bit quicker, but it
| doesn't perform any absolutely groundbreaking work for me.
| xvector wrote:
| The Wikipedia analogy strikes true.
|
| Generally people are resistant to change and the average person
| will typically insist new technologies are pointless.
|
| Electricity and the airplane were supposed to be useless and
| dangerous dead ends according to the common person:
| https://pessimistsarchive.org/
|
| But we all like to think we have super unique opinions and
| personalities, so "this time it's different."
|
| When the change finally happens, people go about their lives as
| if they were right all along and the new technology is simply a
| mysterious and immutable fixture of reality that was always
| there.
| toddmorey wrote:
| I don't think that was the common person, nor do I think the
| common person today thinks AI will be useless.
| wat10000 wrote:
| People also thought the Segway was a useless dead end and
| they were right.
| timcobb wrote:
| Segway seems to have hardly been a dead end, or useless for
| that matter. Segway-style devices like the electric
| unicycle and many other light mobility devices seem to be
| direct descendants of the Segway. Segway introduced
| gyroscopes to the popular tech imagination, at least in my
| lifetime (not sure before).
| wat10000 wrote:
| What other light mobility devices? E-bikes and scooters
| seem to be the big things and they're not anything like a
| Segway descendant.
|
| A world where Segway never happened would be nearly
| indistinguishable from our own.
| Philpax wrote:
| https://en.wikipedia.org/wiki/Self-balancing_scooter
|
| Not the most popular, especially these days, but they are
| very much descended from Segways and have their own fans.
| marcosdumay wrote:
| Smartphones introduced gyroscopes to popular tech (and
| no, people were _imagining_ them before transistors),
| Segway had nothing to do with that.
| bpfrh wrote:
| There is a vast difference between arguments like "Phones
| have been accused of ruining romantic interaction and
| addicting us to mindless chatter" and "current AI has
| problems generating accurate information and can't replace
| researching things by hand for complicated or niche topics
| and there is reason to believe that the current architecture
| may not solve this problem"
|
| That aside, optimists are also not always right, otherwise we
| would have cold fusion already and a base on Mars.
| rsynnott wrote:
| > But we all like to think we have super unique opinions and
| personalities, so "this time it's different."
|
| Are you suggesting that anything which is hyped is the
| future? Like, for every ten heavily-hyped things, _maybe_ one
| has some sort of post-hype existence.
| coliveira wrote:
| The pessimist is not wrong. In fact he's right more
| frequently than wrong. Just look at a long list of
| inventions. How many of them were so successful as the car or
| the airplane? Most of them were just passing fads that people
| don't even remember anymore. So if you're asking who is
| smarter, I would say the pessimist is closer to the truth,
| but the optimist who believed in something that really became
| successful is now remembered by everyone.
| Ukv wrote:
| I feel your argument relies on assuming that being an
| optimist or pessimist means believing 100% or 0%, whereas
| I'd claim it's instead more just having a relative leaning
| in a direction. Say after inspecting some rusty old engines
| a pessimist predicts 1/10 will still function and an
| optimist predicts 4/10 will function. If the engines do
| better than expected and 3/10 function, the optimist was
| closer to the truth despite most not working.
|
| Similarly, being optimistic doesn't mean you have to
| believe every single early-stage invention will work out no
| matter how unpromising - I've been enthusiastic about deep
| learning for the past decade (for its successes in language
| translation, audio transcription, material/product defect
| detection, weather forecasting/early warning systems, OCR,
| spam filtering, protein folding, tumor segmentation, drug
| discovery and interaction prediction, etc.)
| but never saw the appeal of NFTs.
|
| Additionally worth considering that the cost of trying
| something is often lower than the reward of it working out.
| Even if you were wrong 80% of the time about where to dig
| for gold, that 20% may well be worth it; reducing merely
| the _frequency_ of errors is often not logically correct.
| It's useful in a society to have people believe in and
| push forward certain inventions and lines of research even
| if most do not work out.
|
| I think xvector's point is about people rehashing the same
| denunciations that failed to matter for previous successful
| technologies - the idea that something is useless because
| it's not (or perhaps will never be) 100.0% accurate, or the
| "Until it can do dishes, home computer remains of little
| value to families"[0] which I've seen pretty much ad
| verbatim for AI many times (extra silly now that we have
| dishwashers).
|
| Given in real life things have generally improved (standard
| of living, etc.), I think it has typically been more
| correct to be optimistic, and hopefully will be into the
| future.
|
| [0]: https://pessimistsarchive.org/clippings/34991885.jpg
| jdbernard wrote:
| This argument is very prone to survivorship bias. Of course,
| when we think back to the hyped technologies of the past we
| are going to remember mostly those that justified the hype.
| The failures get forgotten. The memory of social discourse
| fades extremely quickly, much faster than, for example, pop
| culture or entertainment.
| i_love_retros wrote:
| > It's enabled two specific applications I've built which
| wouldn't be possible at single-person scale.
|
| I'd love to hear more about how you utilised AI for this.
|
| Personally I'm struggling to find it more useful than a
| slightly fancy code completion tool
| broast wrote:
| > slightly fancy code completion tool
|
| Does this alone not increase your productivity exponentially?
| It does mine. I personally read code faster than I write it
| so it is an undeniable boon.
| i_love_retros wrote:
| I've found it depends on the context (pardon the pun)
|
| For example, personal projects that are small and where
| copilot has access to all the context it needs to make a
| suggestion - such as a script or small game - it has been
| really useful.
|
| But in a real world large project for my day job, where it
| would need access to almost the entire code base to make
| any kind of useful suggestion that could help me build a
| feature, it's useless! And I'd argue this is when I need
| it.
| wenc wrote:
| At present, LLMs work well with smaller chunks of code at a
| time.
|
| Check out these tips for using Aider (a CLI based LLM
| code assistant): https://aider.chat/docs/usage/tips.html
|
| It can ingest the entire codebase (up to its context
| length), but for some reason, I've always had much higher
| quality chats with smaller bite-sized pieces of code.
| jprete wrote:
| Autocomplete distracts me enough that it really needs to be
| close to 100% correct before it's useful. Otherwise it's
| just wrecking my flow and slowing me down.
| coffeebeqn wrote:
| Exponentially? Absolutely not. In the best case it creates
| something that's almost useful. Are you working on large
| actual codebases or talking about some one off toy apps?
| agos wrote:
| it's surely a boon, but does not match the hype
| wbazant wrote:
| You could try aider, or another tool/workflow where you
| provide whole files and ask for how they should be changed -
| very different from code completion type tools!
| hbn wrote:
| Anyone who says AI is useless never had to do the old method of
| cobbling together git and ffmpeg commands from StackOverflow
| answers.
|
| I have no interest in learning the horrible unintuitive UX of
| every CLI I interact with, I'd much rather just describe in
| English what I want and have the computer figure it out for me.
| It has practically never failed me, and if it does I'll know
| right away and I can fall back to the old method of doing it
| manually. For now it's saving me so much time with menial,
| time-wasting day-to-day tasks.
| jghn wrote:
| I had a debate recently with a colleague who is very
| skeptical of LLMs for every day work. Why not lean in on
| searching Google and cross referencing answers, like we've
| done for ages? And that's fine.
|
| But my counterargument is that what I find to be so powerful
| about the LLMs is the ability to refine my question, narrow
| in on a tangent and then pull back out, etc. And *then* I can
| take its final outcome and cross reference it. With the old
| way of doing things, I often felt like I was stumbling in the
| dark trying to find the right search string. Instead I can
| use the LLM to do the heavy lifting for me in that regard.
| ADeerAppeared wrote:
| > Anyone who says AI is useless
|
| Most of those people are a bit bad at making their case. What
| they mean but don't convey well is that AI is useless _for
| its proclaimed uses_.
|
| You are correct that LLMs are pretty good at guessing this
| kind of well-documented & easily verifiable but hard to find
| information. That is a valid use. (Though, woe betide the
| fool who uses LLMs for irreversible destructive actions.)
|
| The thing is though, this isn't enough. There just aren't
| that many questions out there that match those criteria.
| Generative AI is too expensive to serve that small a task.
| Charging a buck a question won't earn the $100 billion OpenAI
| needs to balance the books.
|
| Your use case gets dismissed because on its own, it doesn't
| sustain AI.
| wenc wrote:
| I think you're on to something. I find the sentiment around
| LLMs (which is at the early adoption stage) to be
| unnecessarily hostile. (beyond normal HN skepticism)
|
| But it can be simultaneously true that LLMs add a lot of
| value to some tasks and less to others --- and less to some
| people. It's a bit tautological, but in order to benefit
| from LLMs, you have to be in a context where you stand to
| most benefit from LLMs. These are people who need to
| generate ideas, are expert enough to spot consequential
| mistakes, know when to use LLMs and when not to. They have
| to be in a domain where the occasional mistake generated
| costs less than the new ideas generated, so they still come
| out ahead. It's a bit paradoxical.
|
| LLMs are good for: (1) bite-sized chunks of code; (2)
| ideating; (3) writing once-off code in tedious syntax that
| I don't really care to learn (like making complex plots in
| seaborn or matplotlib); (4) adding docstrings and
| documentation to code; (5) figuring out console error
| messages, with suggestions as to causes (I've debugged a
| ton of errors this way -- and have arrived at the answer
| faster than wading through Stackoverflow); (6) figuring out
| what algorithm to use in a particular situation; etc.
|
| They're not yet good at: (1) understanding complex
| codebases in their entirety (this is one of the
| overpromises; even Aider Chat's docs tell you not to ingest
| the whole codebase); (2) any kind of fully automated task
| that needs to be 100% deterministic and correct (they're
| assistants); (3) getting math reasoning 100% correct (but
| they can still open up new avenues for exploration that
| you've never even thought about).
|
| It takes practice to know what LLMs are good at and what
| they're not. If the initial stance is negativity rather
| than a growth mindset, then that practice never comes.
|
| But it's ok. The rest of us will keep on using LLMs and
| move on.
| esafak wrote:
| An example that might be of interest to readers: I gave
| it two logs, one failing and one successful, and asked it
| to troubleshoot. It turned out a loosely pinned
| dependency (Docker image) had updated in the failing one.
| An error mode I was familiar with and could have solved
| on my own, but the LLM saved me time. They are reliable
| at sifting through text.
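|
| Not their exact setup, but a minimal sketch of that workflow in
| Python: pre-diff the two logs (file names here are hypothetical)
| and hand either the raw logs or just the differing lines to the
| LLM to explain:
|
|     import difflib
|
|     with open("build_success.log") as f:
|         good = f.readlines()
|     with open("build_failure.log") as f:
|         bad = f.readlines()
|
|     # Keep only the lines that changed between the two runs.
|     diff = difflib.unified_diff(good, bad, fromfile="success",
|                                 tofile="failure", n=0)
|     print("".join(diff))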
| mistercheph wrote:
| Hostility and a few swift kicks are in order when the
| butt scratchers start saying their stochastic parrot
| machine is intelligent and a superman.
| Loughla wrote:
| I've been sold AI as if it can do anything. It's being
| actively sold like a super intelligent independent human
| that never needs breaks.
|
| And it just isn't that thing. Or, rather, it is super
| intelligent but lacks any wisdom at all; thus rendering
| it useless for how it's being sold to me.
|
| >which is at the early adoption stage
|
| I've said this in other places here. LLM's simply aren't
| at early adoption stage anymore. They're being packaged
| into literally every saas you can buy. They're a main
| selling point for things like website builders and other
| direct to business software platforms.
| wenc wrote:
| Why not ignore the hype, and just quietly use what works?
|
| I don't use anything other than ChatGPT 4o and Claude
| Sonnet 3.5v2. That's it. I've derived great value from
| just these two.
|
| I even get wisdom from them too. I use them to analyze
| news, geopolitics, arguments around power structures,
| urban planning issues, privatization pros and cons, and
| Claude especially is able to give me the lay of the land
| which I am usually able to follow up on. This use case is
| more of the "better Google" variety rather than task-
| completion, and it does pretty well for the most part.
| Unlike ChatGPT, Claude will even push back when I make
| factually incorrect assertions. It will say "Let me
| correct you on that...". Which I appreciate.
|
| As long as I keep my critical thinking hat on, I am able
| to make good use of the lines of inquiry that they
| produce.
|
| Same caveat applies even to human-produced content. I
| read the NYTimes and I know that it's wrong a lot, so I
| have to trust but verify.
| Loughla wrote:
| I agree with you, but it's just simply not how these
| things are being sold and marketed. We're being told we
| do not have to verify. The AI knows all. It's
| undetectable. It's smarter and faster than you.
|
| And it's just not.
|
| We made a scavenger hunt full of puzzles and riddles for
| our neighbor's kids to find their Christmas gifts from us
| (we don't have kids at home anymore, so they fill that
| niche and are glad to because we go ballistic at
| Christmas and birthdays). The youngest of the group is
| the tech kid.
|
| He thought he fixed us when he realized he could use
| chatgpt to solve the riddles and cyphers. It recognized
| the Caesar letter shift to negative 3, but then made up a
| random phrase with words the same length to solve it. So
| the process was right, but the outcome was just
| outlandishly incorrect. It wasted about a half hour of
| his day. . .
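|
| For reference, the deterministic decode the model skipped is a
| few lines of code; a sketch in Python (the ciphertext below is
| made up for illustration, not the actual riddle):
|
|     def caesar_shift(text, shift):
|         # Shift letters by `shift` places; leave everything else.
|         out = []
|         for ch in text:
|             if ch.isalpha():
|                 base = ord("A") if ch.isupper() else ord("a")
|                 out.append(chr((ord(ch) - base + shift) % 26 + base))
|             else:
|                 out.append(ch)
|         return "".join(out)
|
|     # Text encoded with a shift of +3 is undone with -3 (or the
|     # reverse, depending on which direction you count the "3").
|     print(caesar_shift("Phuub Fkulvwpdv", -3))  # -> Merry Christmas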
|
| Now apply that to complex systems or just a simple large
| database, hell, even just a spreadsheet. You check the
| process, and it's correct. You don't know the outcome, so
| you can't verify unless you do it yourself. So what's the
| point?
|
| For context, I absolutely use LLM's for things that I
| know roughly, but don't want to spend the time to do.
| They're useful for that.
|
| They're simply not useful for how they're being marketed,
| which is to solve problems you don't already know the answer to.
| hyhconito wrote:
| You're still doing it the hard way. I just use Handbrake.
|
| Pick a hammer, not a shitty hammer factory to assemble bits
| of hammer.
| sunnybeetroot wrote:
| How do you use handbrake to write a script that uses
| ffmpeg?
| arkh wrote:
| > if it does I'll know right away and I can fall back to the
| old method of doing it manually
|
| It's well and ok with things you can botch with no
| consequence other than some time wasted. But I've bricked
| enough VMs trying commands I did not understand to know that
| if you need to not fuck up something you'll have to read
| those docs and understand them. And hope they're not out of
| date / wrong.
| dangoodmanUT wrote:
| > Anyone who says AI is useless never had to do the old
| method of cobbling together git and ffmpeg commands from
| StackOverflow answers.
|
| The best ffmpeg and regex command generators
| jv981 wrote:
| try asking an LLM how to add compressed data to a tar through
| standard input, see how that goes (don't forget to check the
| answer :)
| cootsnuck wrote:
| I use LLMs to help me with ffmpeg commands more than I care
| to admit
| momentoftop wrote:
| > Anyone who says AI is useless never had to do the old
| method of cobbling together git and ffmpeg commands from
| StackOverflow answers.
|
| It's useful for that yes, but I'd rather just live in a world
| where we didn't have such disasters of CLI that are git and
| ffmpeg.
|
| LLMs are very useful for generating the obscure boilerplate
| needed because the underlying design is horrible. Relying on
| it means acquiescing to those terrible designs rather than
| figuring out redesigns that don't need the LLMs. For
| comparison, IntelliJ is very good at automating all the
| boilerplate generation that Java imposes on me, but I'd
| rather we didn't have boilerplate languages like Java, and
| I'd rather that IntelliJ's boilerplate generation didn't
| exist.
|
| I fear in many cases that if an LLM is solving your problem,
| you are solving the wrong problem.
| wruza wrote:
| We can't test/review these apps though, can we?
|
| I'm asking not for snark, but because when AI gives me
| something not _quite_ working, it requires much more time than
| what a "every 6 minutes in 10 hour work day" frame would allow
| to investigate. I just wonder if _maybe_ you're pasting it as
| is and don't care about correctness if the happy path sort of
| works. Speaking of subsets, coders who did that before AI were
| also quite a group.
|
| There must be _something_ that explains the difference in our
| experiences. Apologies for the fact that my only idea is kinda
| negative. I understand the potential hyperbole here, but it
| doesn't explain much. I can stand AI BS once a day, maybe
| twice, before uncontrollably cursing into the chat.
| thomashop wrote:
| Why not write tests with AI, too? Since using LLMs as coding
| assistants, my codebases have much more thorough
| documentation, testing and code coverage.
|
| Don't start when you're already in a buggy dead-end. Test-
| driven development with LLMs should be done right from the
| start.
|
| Also keep the code modular so it is easy to include the
| correct context. Fine-grained git commits. Feature-branches.
|
| All the tools that help teams of humans of varying levels of
| expertise work together.
| croes wrote:
| Because then you need tests for the tests.
| thomashop wrote:
| Sure. You can always write more tests. That's not a
| problem specific to AI.
|
| I'd also do code reviews on the code AI produces.
| mistercheph wrote:
| You may have enough expertise in your field that when you
| have a question, you know where to start looking. Juniors and
| students encounter dozens of problems and questions per hour
| that fall into the unknown unknown category
| croes wrote:
| Are you still a coder when you use AI 100 times a day?
|
| AI is a type of outsourcing, you became a customer.
| mbernstein wrote:
| Not outsourcing at all - you're an engineer using the
| tools that make sense to solve a problem. The core issue with
| identifying as just a coder is that code is just one of many
| potential tools to solve a problem.
| croes wrote:
| Could you distinguish code written by an AI from code
| written by a fake AI that is actually a human being?
|
| If something or someone else writes the code, that's
| outsourcing.
|
| I wouldn't consider myself an artist if I created a picture
| with Midjourney.
| arkh wrote:
| Do you write binary code or use a compiler?
|
| Do you design all the NAND gates in your processor to get
| the exact program you want out of it or use a general
| purpose processor?
|
| Current "coding" is just a detail of what you want to do:
| solve problems. Which can require making a machine do
| what you want it to.
| croes wrote:
| So your customer/employer is a coder too. They want to solve
| a problem and use a tool: You.
|
| A coder writes code in a programming language; that's what
| distinguishes them from the customers who use natural
| language. The coder is the translator between the
| customer and the machine. If the machine does that, the
| machine is the coder.
| mbernstein wrote:
| Is your customer bringing you the solution to the problem
| or the problem and asking you to solve the problem? One
| is a translation activity and the other isn't.
| nlh wrote:
| If you're sitting in front of the keyboard, inputting
| instructions and running the resulting programs, yes you are
| still a coder. You're just moving another layer up the
| stack.
|
| The same type of argument has been made for decades -- when
| coders wrote in ASM, folks would ask "are you still a coder
| when you use that fancy C to make all that low-level ASM
| obsolete?". Etc etc.
| croes wrote:
| So if I sit in front of the keyboard and write an email
| with instructions to my programmer I'm a coder.
| owenpalmer wrote:
| Are you still a coder when you use libraries or frameworks?
| You didn't write the code yourself, you're just outsourcing
| it.
| stronglikedan wrote:
| Have you tried a few? If so, which do you prefer? If not, which
| do you use? I'm a little late to the party, and the current
| amount of choices is quite intimidating.
| airstrike wrote:
| I imagine you're asking about coding help. For that, I think
| you should qualify any answer you get with the user's most
| commonly used language (and framework, if applicable).
|
| In my experience, Claude Sonnet 3.5 (3.6?) has been
| unbeatable. I use it for Rust. Making sense of compiler
| errors, rubberducking, finding more efficient ways to write
| some function and, truth be told, sometimes just plain old
| debugging. More than once, I've been able to dump a massive
| module onto the chat context and say "look, I'm experiencing
| this weird behavior but it's really hard to pin down what's
| causing it in this code" and it pointed to the _exact_ issue
| in a second. That alone is worth the price of admission.
|
| Way better than ChatGPT 4o and o-1, in my experience, despite
| me saying the exact opposite a few months ago.
| esafak wrote:
| Try Cody. It integrates with your IDE, understands your code
| base, and lets you pick the LLM.
| monkeynotes wrote:
| This isn't about if LLMs are useful, it's about how useful can
| they become. We are trying to understand if there is a path
| forward to transformative tech, or are we just limited to a
| very useful tool.
|
| It's a valid conversation after ~3 years of anticipating that
| the world would be disrupted by this tech. So far it has not
| delivered.
|
| Wikipedia did not change the world either, it's just a great
| tool that I use all the time.
|
| As for software, it performs ok. I give up on it most of the
| time if I am trying to write a whole application. You have to
| acquire a new skill, prompt engineering, and feverish
| iteration. It's a frustrating game of whack-a-mole and I find
| it quicker to write the code myself and just have the LLM help
| me with architecture ideas, bug bashing, and it's also quite
| good at writing tests.
|
| I'd rather know the code intimately so I can more quickly debug
| it than have an LLM write it and just trust it did it well.
| wenc wrote:
| By the way, Wikipedia did change the world. Some of the most
| important inventions are the ones we don't notice.
| nemo44x wrote:
| Peter Thiel talked about this years ago in his book Zero to
| One. His key insight, which we're seeing today, is that AI
| tools will work side-by-side with people and enhance their
| productivity to levels never imagined. From helping with some
| basic tasks ("write an Excel script that transforms this table
| from this format to this new format") to helping write
| programs, it's a tool that aids humans in getting more things
| done than previously possible.
| paxys wrote:
| Every piece of technology is a "dead end" until something better
| replaces it. That doesn't mean it can't be useful or
| revolutionary.
| tu7001 wrote:
| We all know what's an answer when there is a question mark in the
| title.
| IanHalbwachs wrote:
| https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...
| btbuildem wrote:
| Interesting read -- and a correct take, given the software
| development perspective. In that context, LLM-based AI is faulty,
| unpredictable, and unmanageable, and not ready for mission-
| critical applications.
|
| If you want to argue otherwise, do a quick thought experiment
| first: would you let an LLM manage your financial affairs
| (entirely, unsupervised)? Would you let it perform your job while
| you receive the rewards and consequences? Would you be
| comfortable to give it full control of your smart home?
|
| There are different sets of expectations put on human actors vs
| autonomous systems. We expect people to be fallible and wrong
| some of the time, even if the individuals in question can't/won't
| admit it. With a software-based system, the expectations are that
| it will be robust, tested, and performing correctly 100% of the
| time, and when a fault occurs, it will be clear, marked with
| yellow tape and flashing lights.
|
| LLM-based AIs are sort of insidious in that they straddle this
| expectation gap: the emergent behaviour is erratic, projecting
| confident omniscience, while often hallucinating and plain wrong.
| However vague, the catch-all term "AI" still implies "computer
| system" and by extension "engineered and tested".
| cesaref wrote:
| It's a bad example. Lots of finance firms use AI to manage
| their financial affairs - go and investigate what is currently
| considered state of the art for trading algorithms.
|
| Now if you substituted something safety critical instead, say,
| running a nuclear power station, or my favourite currently in
| use example, self driving cars, then yes, you should be scared.
| galleywest200 wrote:
| > go and investigate what is currently considered state of
| the art for trading algorithms.
|
| These are not LLMs but algorithms written and designed by
| human minds. It is unfortunate that AI has become a catch-all
| word for any kind of machine learning.
| chuckadams wrote:
| LLMs are algorithms written by humans. "AI" is _supposed_
| to be a vague term, and not synonymous with one particular
| implementation.
| teucris wrote:
| LLMs are _architectures_ written by humans. What an LLM
| creates is not algorithmic.
| FrustratedMonky wrote:
| "What an LLM creates is not algorithmic."
|
| Not strictly true. There are patterns in the weights that
| could be steps in an algorithm.
| seadan83 wrote:
| LLMs create models, not algorithms. An algorithm is a
| rote sequence of steps to accomplish a task.
|
| The following is an algorithm:
|
| - plug in input to model
|
| - say yes if result is positive, else say no
|
| LLMs use models, the model is not an algorithm.
|
| > There are patterns in the weights that could be steps
| in an algorithm.
|
| Sure, but yeah... no.. "Could be steps in an algorithm"
| does not constitute an algorithm.
|
| Weights are inputs, they are not themselves parts of an
| algorithm. The algorithm might still try to come up with
| weights. Still, don't confuse procedure with data.
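|
| A toy sketch of that split (plain Python, with a made-up linear
| "model" rather than an LLM): the weights are data, and the
| algorithm is the rote procedure wrapped around them.
|
|     # The "model": parameters learned elsewhere. Pure data.
|     WEIGHTS = [0.7, -1.2, 0.05]
|     BIAS = -0.1
|
|     def classify(features):
|         # The algorithm: (1) plug the input into the model,
|         # (2) say yes if the result is positive, else say no.
|         score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
|         return "yes" if score > 0 else "no"
|
|     print(classify([1.0, 0.2, 3.0]))  # -> yes
|
| Swapping in different weights changes the answers without
| touching the procedure at all.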
| FrustratedMonky wrote:
| Don't want to get too pedantic on that response. The model
| can contain complex information. There is already
| evidence it can form a model of the world. So why not
| something like steps to get from A to B.
|
| And, it is clear that LLMs can follow steps. One didn't
| place in the Math Olympiad without some ability to follow
| steps.
|
| https://research.google/blog/teaching-language-models-to-rea...
|
| And, Anyway, when I asked it, it said it could
|
| "Yes, an LLM model can contain the steps of an algorithm,
| especially when prompted to "think step-by-step" or use a
| "chain-of-thought" approach, which allows it to break
| down a complex problem into smaller, more manageable
| steps and generate a solution by outlining each stage of
| the process in a logical sequence; essentially mimicking
| how a human would approach an algorithm. "
| wavemode wrote:
| He said LLM, not just any AI
| aleph_minus_one wrote:
| I would bet that there _do_ exist some finance firms that
| _do_ use LLM as AIs for the purposes that cesaref sketched.
| ryukoposting wrote:
| Makes me wonder how they detect market manipulation and
| fraud. Trivial activities, like marking the close, probably
| aren't hard to detect, but I imagine that some kind of ML
| thingy is involved in flagging accounts for manual
| inspection.
| msabalau wrote:
| Pragmatically, "AI" will mean (and for, many people already
| does mean) stochastic and fallible.
|
| If your users are likely to be AI illiterate and mistakenly
| feel that an AI app is reliable and suitable for mission
| critical applications when it isn't, that is a risk you
| mitigate.
|
| But it seems deeply unserious of the author to just assert that
| mission-critical software is the only "serious context" and the
| only thing that matters, and that therefore AI is a dead end.
| "Serious, mission critical" apps are just going to be a niche
| in the future.
| timeon wrote:
| > "Serious, mission critical" apps are just going to be a
| niche in the future.
|
| High quality only for the _niche_ "serious, mission critical"
| apps; everywhere else, enshittification has already started,
| and LLMs will just accelerate it.
| JumpCrisscross wrote:
| > _LLM-based AI is faulty, unpredictable, and unmanageable_
|
| Is there a fundamental (à la Gödel) reason why we can't predict
| or manage LLMs?
|
| > _would you let an LLM manage your financial affairs
| (entirely, unsupervised)?_
|
| No. But history is littered with eccentric geniuses who
| couldn't be trusted on their own, but who nevertheless were
| relied on by decision makers.
|
| Maybe there is an Erdős principle at play: AI can be involved
| in questions of arbitrary complexity, but can only advise on
| important decisions.
| manquer wrote:
| > would you let an LLM manage your financial affairs (entirely,
| unsupervised)?
|
| It will likely be better[2] not because AI is good at this .
|
| It would be because study after study[1] has shown that active
| management performs poorer than passive funds, less
| intervention gives better result over longer timeframe .
|
| [1] The famous Warren Buffett bet comes to mind. There are more
| formal ones validating this.
|
| [2] if configured to do minimal changes
| seadan83 wrote:
| What if financial affairs were broadened to be everything,
| not just portfolio management? Eg: paying bills, credit
| cards, cash balance in check vs savings vs brokerage.
| manquer wrote:
| Good financial management(portfolio and personal) is a
| matter of disciplined routine, performed consistently over
| long timeframe, combined with impulse control. It is not
| complicated at all, any program (LLM or just a rules
| engine) will always do far better than we can because it
| will not suffer from either problem (failing to stick to the
| routine, or acting on impulse).
|
| Most humans make very bad decisions around personal
| finance, whether it is big things like gambling or impulse
| buys with expensive credit, to smaller items like tracking
| subscriptions or keeping not needed money in checking
| account etc.
|
| This is irrespective of financial literacy, education,
| wealth or professions like say working in finance/ personal
| wealth management even.
|
| Entire industries like lottery, gambling, luxury goods,
| gaming, credit card APRs, Buy Now Pay Later, Consumer SaaS,
| Banking overdraft fees are all built around our inability
| to control our impulses or follow disciplined routines.
|
| This is why trust funds with wealth management
| professionals are the only way to generational wealth.
|
| You need the ability to keep any beneficiary (the next
| generations) from exercising their impulses on amounts beyond
| their annual draw. Plus the disciplined routine of a
| professional team who are paid to do only this with
| multiple layers that vet the impulses of individual
| managers and conservative mandate to keep them risk averse
| and therefore less impulsive.
|
| If a program can do it for me (provided of course I
| irrevocably give away my control to override or alter its
| decisions) then normal people can also benefit without the
| high net worth required for wealth management.
| clint wrote:
| The primary fallacy in your argument is that you seem to think
| that humans produce much better products by some kind of
| metric.
|
| My lived experience in the software industry at almost all levels
| over the last 25 years leads me to believe that the vast
| majority of humans and teams of humans produce atrocious code
| that only wastes time, money, and people's patience.
|
| Often because it is humans producing the code, other humans are
| not willing to fully engage, criticize and improve that code,
| deferring to just passing it on to the next person, team,
| generation, whatever.
|
| Yes, this perhaps happens better in some (very large and very
| small) organizations, but most often it only happens with the
| inclusions of horrendous layers of protocol, bureaucracy, more
| time, more emotional exhaustion, etc.
|
| In other words a very costly process to produce excellent code,
| both in real capital and human capital. It literally burns
| through actual humans and results in very bad health outcomes
| for most people in the industry, ranging from minor stuff to
| really major things.
|
| The reality is that probably 80% of people working in the tech
| industry can be outperformed by an AI and at a fraction of the
| cost. AIs can be tuned, guided, and steered to produce code
| that I would call exceptional compared even to most developers
| who have been in the field for 5 years or more.
|
| You probably come to this fallacy because you have worked in
| one of these very small or very large companies that takes
| producing code seriously and believe that your experience
| represents the vast majority of the industry, but in fact the
| middle area is where most code is being "produced" and if
| you've never been fully engaged in those situations, you may
| literally have no idea of the crap that's being produced and
| shipped on a daily basis. These companies have no incentive to
| change, they make lots of money doing this, and fresh meat
| (humans) is relatively easy to come by.
|
| Most of these AI benchmarks are trying to get these LLMs to
| produce outputs at the scale and quantity of one of these
| exceptional organizations when in fact, the real benefits will
| come in the bulk of organizations that cannot do this stuff and
| AI will produce as good or better code than a team of mediocre
| developers slogging away in a mediocre, but profitable,
| company.
|
| Yes there are higher levels of abstraction around code, and
| getting it deployed, comprehensive testing, triaging issues, QA
| blah blah, that humans are going to be better at for now, but I
| see many of those issues being addressed by some kind of LLM
| system sooner or later.
|
| Finally, I think most of the friction people are seeing right
| now in their organization is because of the wildly ad hoc way
| people and organizations are using AI, not so much about the
| technological abilities of the models themselves.
| d0mine wrote:
| "80%" "outperformed" "fraction of the cost" you could make a
| lot of money if it were true but 5x productivity boost seems
| unjustified right now, I'm having a hard time finding
| problems where the output is even 1x (where I don't spend
| more time babysitting LLM than doing the task from scratch
| myself).
| Earw0rm wrote:
| Depends what you're doing.
|
| For "stay in your lane" stuff, I agree, it relatively
| sucks.
|
| For "today I need do stuff two lanes over", well it still
| needs the babysitting, and I still wouldn't put it on tasks
| where I can't verify the output, but it definitely delivers
| a productivity boost IME.
| SoftTalker wrote:
| Sorry you're downvoted, but I generally agree. When it comes
| to software, most organizations are Initech.
| superjan wrote:
| With respect to hallucinating, I never read about training
| LLM's to say: "I don't know" when they don't know. Is that even
| researched?
| Sohcahtoa82 wrote:
| ChatGPT seems to be good about this. If you invent something
| and ask about it, like "What was the No More Clowning Act of
| 2025?", it will say it can't find any information on it.
|
| The older or smaller models, like anything you can run
| locally, are probably far more likely to just invent some
| bullshit.
|
| That said, I've certainly asked ChatGPT about things that
| definitely have a correct answer and had it give me incorrect
| information.
|
| When talking about hallucinating, I do think we need to
| differentiate between "what you asked about exists and has a
| correct answer, but the AI got it wrong" and "What you're
| asking for does not exist or does not have an answer, but the
| AI just generated some bullshit".
| sroussey wrote:
| Not sure why you are downvoted. It's a difficult problem, but
| lots of angles on how to deal with it.
|
| For example: https://arxiv.org/abs/2412.15176
| rsanek wrote:
| > Would you let it perform your job while you receive the
| rewards and consequences?
|
| isn't this what being a human manager is? not sure why you're
| saying it must be entirely + unsupervised. at my job, my boss
| mostly trusts me but still checks my work and gives me feedback
| when he wants something changed. he's ultimately responsible
| for what I do.
| bredren wrote:
| Indeed, even if you have a professional accountant do your
| taxes, you must still sign off on their work.
|
| Detecting omissions or errors on prepared tax forms often
| requires knowledge of context missed by or not provided to
| the accountant.
| PaulDavisThe1st wrote:
| I believe you're asking the wrong question, or at least you're
| asking it in the wrong way. From my POV, it comes in two parts:
|
| 1. Do you believe that LLMs operate in a similar way to the
| important parts of human cognition?
|
| 2. If not, do you believe that they operate in a way that makes
| them useful for tasks other than responding to text prompts,
| and if so, what are those tasks?
|
| If you believe that the answer to Q1 is substantively "yes" -
| that is, humans and LLM are engaged in the same sort of
| computational behavior when we engage in speech generation -
| then there's presumably no particular impediment to using an
| LLM where you might otherwise use a human (and with the same
| caveats).
|
| My own answer is that while some human speech behavior is
| possibly generated by systems that function in a semantically
| equivalent way to current LLMs, human cognition is capable of
| tasks that LLMs cannot perform de novo even if they can give
| the illusion of doing so (primarily causal chain reasoning).
| Consequently, LLMs are not in any real sense equivalent to a
| human being, and using them as such is a mistake.
| User23 wrote:
| I think C.S. Peirce's distinction between corollarial
| reasoning and theorematic reasoning[1][2] is helpful here. In
| short, the former is the grindy rule following sort of
| reasoning, and the latter is the kind of reasoning that's
| associated with new insights that are not determined by the
| premises alone.
|
| As an aside, Students of Peirce over the years have quite the
| pedigree in data science too, including the genius Edgar F.
| Codd, who invented the relational database largely inspired
| by Peirce's approach to relations.
|
| Anyhow, computers are already quite good at corollarial
| reasoning and have been for some time, even before LLMs. On
| the other hand, they struggle with theorematic reasoning.
| Last I knew, the absolute state of the art performs about as
| well as a smart high school student. And even there, the
| tests are synthetic, so how theorematic they truly are is
| questionable. I wouldn't rule out the possibility of some
| automaton proposing a better explanation for gravitational
| anomalies than dark matter for example, but so far as I know
| nothing like that is being done yet.
|
| There's also the interesting question of whether or not an
| LLM that produces a sequence of tokens that induces a genuine
| insight in the human reader actually means the LLM itself had
| said insight.
|
| [1] https://www.cspeirce.com/menu/library/bycsp/l75/ver1/l75v1-0...
|
| [2] https://groups.google.com/g/cybcom/c/Es8Bh0U2Vcg
| Closi wrote:
| > My own answer is that while some human speech behavior is
| possibly generated by systems that function in a semantically
| equivalent way to current LLMs, human cognition is capable of
| tasks that LLMs cannot perform de novo even if they can give
| the illusion of doing so (primarily causal chain reasoning).
| Consequently, LLMs are not in any real sense equivalent to a
| human being, and using them as such is a mistake.
|
| In the workplace, humans are ultimately a tool to achieve a
| goal. LLM's don't have to be equivalent to humans to replace
| a human - they just have to be able to achieve the goal that
| the human has. 'Human' cognition likely isn't required for a
| huge amount of the work humans do. Heck, AI probably isn't
| required to automate a lot of the work that humans do, but it
| will accelerate how much can be automated and reduce the cost
| of automation.
|
| So it depends what we mean as 'use them as a human being' -
| we are using human beings to do tasks, be it solving a
| billing dispute for a customer, processing a customers
| insurance claim, or reading through legal discovery. These
| aren't intrinsically 'human' tasks.
|
| So 2 - yes, I do believe that they operate in a way that
| makes them useful for tasks. LLM's just respond to text
| prompts, but those text prompts can do useful things that
| humans are currently doing.
| RaftPeople wrote:
| My 2 cents:
|
| I think the vector representation stuff is an effective tool
| and possibly similar to foundational tools that humans are
| using.
|
| But my gut feel is that it's just one tool of many that
| combine to give humans a model+view of the world with some
| level of visibility into the "correctness" of ideas about
| that world.
|
| Meaning we have a sense of whether new info "adds up" or not,
| and we may reject the info or adjust our model.
|
| I think LLM's in their current state can be useful for tasks
| that do not have a high cost resulting from incorrect output,
| or tasks that can have their output validated by humans or
| some other system cost-effectively.
| tliltocatl wrote:
| IMHO, a more important and testable difference is that humans
| don't have separate "train" and "infer" phases. We are able
| to adapt more or less on the fly and learn from previous
| experience. LLMs currently cannot retain any novel experience
| past the context window.
| AnimalMuppet wrote:
| I think LLMs operate in a similar way to _some_ of the
| important parts of human cognition.
|
| I believe they operate in a way that makes them at least
| somewhat useful for some things. But I think the big issue is
| trustworthiness. Humans - at least some of them - are more
| trustworthy than LLM-style AIs (at least current ones). LLMs
| need progress on trustworthiness more than they need progress
| on use in other areas.
| qaq wrote:
| Would you let an LLM manage your financial affairs (entirely,
| unsupervised)?
|
| Hmm I would not let other human manage my financial affairs
| entirely unsupervised.
| tossandthrow wrote:
| > would you let an LLM manage your financial affairs (entirely,
| unsupervised)?
|
| No, but I also would not let another person do that.
|
| It is telling that you needed to interject "entirely,
| unsupervised".
|
| Most people will let an llm do it partially, and probably
| already do.
| BhavdeepSethi wrote:
| People pay to use a financial advisor. Isn't that another
| person?
| tossandthrow wrote:
| The key is: entirely and unsupervised.
|
| Mostly, your financial advisor prepares the return that you
| sign off on, or manages your portfolio. But the advisor usually
| solicits and interacts with you to know what your financial
| goals are and ensure you are on board with the consequences
| of their advice.
|
| I do not dismiss that some people are completely hands off
| at great risk IMHO. But these are not me - as was my
| initial proposition.
| wyager wrote:
| > would you let an LLM manage your financial affairs (entirely,
| unsupervised)?
|
| I wouldn't let another human do this.
| Earw0rm wrote:
| A more informative question is:
|
| _Who_ would you let manage your financial affairs, and under
| what circumstances?
|
| To which my answer would be something like: a qualified
| financial adviser with a good track record, who can be trusted
| to do the job to, if not the best of their abilities, at least
| an acceptable level of professional competence.
|
| A related question: who would you let give you a lift someplace
| in a car?
|
| And here's where things get interesting. Because on the one
| hand there's a LOT more at stake (literally, your life), and
| yet various social norms, conventions , economic pressures and
| so on mean that in practice we quite often entrust that
| responsibility to people who are very, very far from performing
| at their best.
|
| So while a financial adviser AI is useless unless it can
| perform at the level of a trained professional doing their job
| (or unless it can perform at maybe 95% of that level at much
| lower cost), a self-driving car is at least _potentially_
| useful if it's only somewhat better than people at or close to
| their worst. As a high proportion of road traffic collisions
| are caused by people who are drunk, tired, emotionally unstable
| or otherwise very very far from the peak performance of a human
| being operating a car.
|
| (We can argue that a system which routinely requires people to
| carry out life-or-death, mission-critical tasks while
| significantly impaired is dangerously flawed and needs a major
| overhaul, but that's a slightly different debate).
| IanCal wrote:
| I find this kind of argument comes up a lot and it seems
| fundamentally flawed to me.
|
| 1. You can set a bar wherever you want for a level of
| "seriousness" and huge swathes of real world work will fall below
| it, and are therefore attractive to tackle with these systems.
|
| 2. We build critical large scale systems out of humans, which are
| fallible and unverifiable. That's not to say current LLMs are
| human or equivalent, but "we can't verify X works all the time"
| doesn't stop us doing exactly that a _lot_. We deal with this by
| learning how humans make mistakes, why, and build systems of
| checks around that. There is nothing in my mind that stops us
| doing the same with other AI systems.
|
| 3. Software is written by, checked by and verified by humans at
| least at some critical point - so even verified software still
| has this same problem.
|
| We've also been doing this kind of thing with ML models for ages,
| and we use buggy systems for an enormous amount of work
| worldwide. You can argue we shouldn't and should have fully
| formally verified systems for everything, but you can't deny that
| right now we have large serious systems without that.
|
| And if your goal is "replace a human" then I just don't think you
| can reasonably say that it requires verifiable software.
|
| > Systems are not explainable, as they have no model of knowledge
| and no representation of any 'reasoning'.
|
| Neither of those statements are true are they? There are internal
| models, and recent models are designed around having a
| representation of reasoning before replying.
|
| > current generative AI systems represent a dead end, where
| exponential increases of training data and effort will give us
| modest increases in impressive plausibility but no foundational
| increase in reliability
|
| And yet reliability is something we see improve as LLMs get
| better and we get better at training them.
| nialse wrote:
| There are two epistemic poles: the atomistic and the
| probabilistic. The author subscribes to a rule-based atomistic
| worldview, asserting that any perspective misaligned with this
| framework is incorrect. Currently, academia is undergoing a
| paradigm shift in the field of artificial intelligence. Symbolic
| AI, which was the initial research focus, is rapidly being
| replaced by statistical AI methodologies. This transition
| diminishes the relevance of atomistic or symbolic scientists,
| making them worry they might become irrelevant.
| irunmyownemail wrote:
| Not sure I followed all of that lingo but it sounds like a
| fancy way of saying, if you're losing the game, try shifting
| the goal post.
| nialse wrote:
| Indeed and unfortunately. I've been reading up on "the
| binding problem" in AI lately and came across a paper that
| hinged on there being an "object representation" which would
| magically solve the apparent issues in symbolic AI. In the
| discussion some 20 pages later, the authors confessed that
| they, nor anybody else, could define what an object was in
| the first place. Sometimes the efforts seem focused on "not
| letting the other team win" rather than actually having
| something tangible to bring to the table.
| dbmikus wrote:
| I never want to claim certainties, but it seems pretty close to
| certain that symbolic AI loses to statistical AI.
|
| I think there is room for statistical AI to operate symbolic
| systems so we can better control outputs. Actually, that's kind
| of what is going on when we ask AI to write code.
| Shorel wrote:
| I think that transition already happened, and the next big jump
| in AI will be the combination of these two approaches in an
| unified package.
|
| Kind of the way our right brain hemisphere does probabilistic
| computation and the left brain hemisphere does atomistic
| computation. And we use both.
|
| So, whoever develops the digital equivalent of the corpus
| callosum wins.
| nialse wrote:
| An observation with scientific paradigm shifts is that they
| tend not to reverse. As for the lingo someone commented on:
| the fundamental problem is the differing philosophical views
| of what knowledge is and can be. Either knowledge is based on
| symbols and rules, as in mathematics, or it is probabilistic,
| as in anything we can actually measure. Both
| these views can coexist and maybe AI will find the missing
| link between them some day. Possibly no human will grasp the
| link.
| llm_trw wrote:
| No, they are very useful tools to build intelligent systems
| out of.
|
| Everything from perplexity onward shows just how useful agents
| can be.
|
| You get another bump in utility when you allow for agent swarms.
|
| Then another one for dynamically generated agent swarms.
|
| The only reason why it's not coming for your job is that LLMs are
| currently too power hungry to run those jobs for anything but
| research - at a couple thousand to couple of million times the
| price of a human doing the work.
|
| Which works out to 10 to 20 epochs of whatever Moore's law looks
| like in graphics cards.
| throw83288 wrote:
| What is that bump in utility in practical terms? You can point
| to a benchmark improvement but that's no indication the agent
| swarm isn't just reducing to "giving an LLM an arbitrary number
| of random guesses".
| inciampati wrote:
| Want reliable AI? Stop approximating memory with attention and
| build reliable memory into the model directly.
| logicchains wrote:
| Standard LLM quadratic attention isn't an approximation, it's
| perfect recall. Approaches that compress that memory down into
| a fixed-size state are an approximation, and generally perform
| worse, that's why linear transformers aren't widely used.
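|
| A rough sketch of that contrast (Python/NumPy; real linear-
| attention variants also carry a normalizer, this just shows the
| growing KV cache versus a fixed-size state):
|
|     import numpy as np
|
|     def softmax(x):
|         e = np.exp(x - x.max())
|         return e / e.sum()
|
|     rng = np.random.default_rng(0)
|     seq_len, d = 1024, 64
|     K = rng.standard_normal((seq_len, d))  # one key per past token
|     V = rng.standard_normal((seq_len, d))  # one value per past token
|     q = rng.standard_normal(d)             # current query
|
|     # Standard attention: score against every cached key, so any
|     # past token can be recalled exactly; cost grows with seq_len.
|     exact = softmax(K @ q / np.sqrt(d)) @ V
|
|     # Linear-attention-style state: the whole history compressed
|     # into one d x d matrix; memory stops growing, recall is lossy.
|     S = K.T @ V
|     approx = q @ S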
| mrtksn wrote:
| What I find curious is that the people who sell the AI as the
| holy grail that will make any jobs obsolete in a few year at the
| same time claim that there's huge talent shortage and even engage
| in feud on immigration and spend capital to influence immigration
| policies.
|
| Apparently they don't believe that AI is about to revolutionize
| things that much. This makes me believe that a significant part of
| the AI investment is just FOMO driven, so no real revolution is
| around the corner.
|
| Although we keep seeing claims that AI has achieved PhD level
| this and Olympiad level that, the people who actually own these
| systems keep demanding immigration policy changes to bring
| actual humans from overseas for years to come.
| fhd2 wrote:
| Is that so? I'm not in the US, so I don't have a good idea of
| what's going on there. But wasn't there relatively high
| unemployment among developers after all these Big Tech layoffs
| post pandemic? Shouldn't companies there have an easy time
| finding local talent?
|
| Sorry for the potentially silly question. I just spent some
| time trying to research it and came up with nothing concrete.
| mrtksn wrote:
| > But wasn't there relatively high unemployment among
| developers after all these Big Tech layoffs post pandemic?
|
| I'm speculating too but yes it appears that unemployment is
| pretty high among the CS majors:
| https://www.reddit.com/r/csMajors/comments/1hhl060/how_is_it...
|
| But at the same time there's an ongoing infighting among
| Trump supporters because tech elites came up as pro - skilled
| immigration where the MAGA camp turned against them. The tech
| elites claim that there's a talent shortage. Here's a short
| rundown that Elon Musk agrees with:
| https://x.com/AutismCapital/status/1872408010653589799
| fhd2 wrote:
| Ah I see, thought I missed some major story, but apparently
| not.
|
| The unemployment data is from 2018 BTW. But from what I
| perceive, developer unemployment in the US seems higher
| than usual right now.
| mrtksn wrote:
| Good catch but yes, my personal observation is the same
| and not only in US.
| exe34 wrote:
| Have you maybe confused the time periods in the different
| discussions? I think the AI making jobs obsolete part is in the
| next few years, whereas the talent shortage issue is right now
| - although as usual, it's a wage issue, not a talent issue. Pay
| enough and the right people will turn up.
| mrtksn wrote:
| Who knows about the future, right? I'm just trying to read
| the expectations of the people who have control over both the
| AI, Capital and Politics and they don't strike me as
| optimistic about AI actually doing much in near future.
| exe34 wrote:
| they seem to be investing a lot into replacing workers with
| AI.
| mrtksn wrote:
| And that might be FOMO, or they can simply exit with a
| profit as long as they can keep fanning the hype. And of
| course, they may be hoping to have it in the long term.
|
| They are not replacing their workers despite claiming
| that AI is currently as good as a PhD and they certainly
| don't go to AI medical doctors despite claiming that
| their tool is better than most doctors.
| hackable_sand wrote:
| It's not a wage issue.
| exe34 wrote:
| are you saying the free market doesn't work?
| nyarlathotep_ wrote:
| Schrodinger's Job Market, yeah.
|
| The whole conversation is so dishonest.
|
| Every software firm, notable and small, has had layoffs over
| the past two years, but somehow there's still a "STEM shortage"
| and companies are "starving for talent" or some such nonsense?
|
| Fake discussion.
| JimmyWilliams1 wrote:
| The reliance on large datasets for training AI models introduces
| biases present in the data, which can perpetuate or even
| exacerbate societal inequalities. It's essential to approach AI
| development with caution, ensuring robust ethical guidelines and
| comprehensive testing are in place before integrating AI into
| sensitive areas.
|
| As we continue to innovate, a focus on explainability, fairness,
| and accountability in AI systems will be paramount to harnessing
| their potential without compromising societal values.
| owenpalmer wrote:
| > exacerbate societal inequalities
|
| Do you have an example of this?
| tbenst wrote:
| As a neuroscientist, my biggest disagreement with the piece is
| the author's argument for compositionality over emergence. The
| former makes me think of Prolog and Lisp, while the latter is a
| much better description of a brain. I think emergence is a much
| more promising direction for AGI than compositionality.
| dbmikus wrote:
| 100% agree. When we explicitly segment and compose AI
| components, we are removing the ability for them to learn their
| own pathways between the components. The bitter lesson[1] has
| been proven time and time again: throwing a ton of data
| and compute at a model yields better results than what we could
| come up with.
|
| That said, we can still isolate and modify parts of a network,
| and combine models trained for different tasks. But you need to
| break things down into components after the fact, instead of
| beforehand, in order to get the benefits of learning via scale
| of data + compute.
|
| [1]: http://www.incompleteideas.net/IncIdeas/BitterLesson.html
| cs702 wrote:
| As of right now, we have no way of knowing in advance what the
| capabilities of current AI systems will be if we are able to
| scale them by 10x, 100x, 1000x, and more.
|
| The number of neuron-neuron connections in current AI systems is
| still _tiny_ compared to the human brain.
|
| The largest AI systems in use today have _hundreds of billions_
| of parameters. Nearly all parameters are part of a weight matrix,
| each parameter quantifying the strength of the connection from an
| artificial input neuron to an artificial output neuron. The human
| brain has more than a _hundred trillion_ synapses, each
| connecting an organic input neuron to an organic output neuron,
| but the comparison is not apples-to-apples, because each synapse
| is much more complex than a single parameter in a weight
| matrix.[a]
|
| Today's largest AI systems have about the same number of neuron-
| neuron connections as the brain of a _brown rat_.[a] Judging
| these AI systems based on their current capabilities is like
| judging organic brains based on the capabilities of brown rat
| brains.
|
| What we can say with certainty is that _today 's_ AI systems
| cannot be trusted to be reliable. That's true for highly trained
| brown rats too.
|
| ---
|
| [a]
| https://en.wikipedia.org/wiki/List_of_animals_by_number_of_n...
| -- sort in descending order by number of synapses.
| JumpCrisscross wrote:
| > _we have no way of knowing in advance what the capabilities
| of current AI systems will be if we are able to scale them by
| 10x, 100x, 1000x, and more_
|
| This doesn't solve the unpredictability problem.
| leobg wrote:
| We don't know. We didn't predict that the rat brain would get
| us here. So we also can't be confident in our prediction that
| scaling it won't solve hallucination problems.
| cs702 wrote:
| No, it doesn't "solve" the unpredictability problem.
|
| But we haven't solved it for human beings either.
|
| Human brains are unpredictable. Look around you.
| timeon wrote:
| How are humans relevant here? As an example, we operate at
| different speeds.
| cs702 wrote:
| Humankind has developed all sorts of systems and
| processes to cope with the unpredictability of human
| beings: legal systems, organizational structures,
| separate branches of government, courts of law, police
| and military forces, organized markets, double-entry
| bookkeeping, auditing, security systems, anti-malware
| software, etc.
|
| While individual human beings do trust _some_ of the
| other human beings they know, in the aggregate society
| doesn 't seem to trust human beings to behave reliably.
|
| It's possible, though I don't know for sure, that we're
| going to need systems and processes to cope with the
| unpredictability of AI systems.
| mrweasel wrote:
| Are you expecting AIs to be more reliable, because
| they're slower?
| uoaei wrote:
| Human performance, broadly speaking, is _the_ benchmark
| being targeted by those training AI models. Humans are
| part of the conversation since that 's the only kind of
| intelligence these folks can conceive of.
| sdesol wrote:
| > Human brains are unpredictable. Look around you.
|
| As it was mentioned by others, we've had thousands of years
| to better understand how humans can fail. LLMs are black
| boxes and it never ceases to amaze me how they can fail in
| such unpredictable ways. Take the following examples.
|
| Here GPT-4o mini is asked to calculate 2+3+5
|
| https://beta.gitsense.com/?chat=8707acda-e6d4-4f69-9c09-2cf
| f...
|
| It gets the answer correct, but if you ask it to verify its
| own answer
|
| https://beta.gitsense.com/?chat=6d8af370-1ae6-4a36-961d-290
| 2...
|
| it says the response was wrong, and contradicts itself. Now
| if you ask it to compare all the responses
|
| https://beta.gitsense.com/?chat=1c162c40-47ea-419d-af7a-a30
| a...
|
| it correctly identifies that GPT-4o mini was incorrect.
|
| It is this unpredictable nature that makes LLMs insanely
| powerful and scary.
|
| Note: The chat on the beta site doesn't work.
| clint wrote:
| You seem to believe that humans, on their own, are not
| stochastic and unpredictable. I contend that if this is your
| belief then you couldn't be more wrong.
|
| Humans are EXTREMELY unpredictable. Humans only become
| slightly more predictable and producers of slightly more
| quality outputs with insane levels of bureaucracy and layers
| upon layers upon layers of humans to smooth it out.
|
| To boot, the production of this mediocre code is very very
| very slow compared to LLMs. LLMs also have no feelings, egos,
| and are literally tunable and directable to produce better
| outcomes without hurting people in the process (again,
| something that is very difficult to avoid without the
| inclusion of, yep, more humans, more layers, more protocol,
| etc.)
|
| Even with all of this mass of human grist, in my opinion, the
| output of purely human intellects is, on average, very bad.
| Very bad in terms of quality of output and very bad in terms
| of outcomes for the humans involved in this machine.
| FredPret wrote:
| If brown-rats-as-a-service is as useful as it is already, then
| I'm excited by what the future holds.
|
| I think to make it to the next step, AI will have to have some
| way of performing rigorous logic integrated on a low level.
|
| Maybe scaling that brown-rat brain will let it emulate an
| internal logical black box - much like the old adage about a
| sufficiently large C codebase containing an imperfect Lisp
| implementation - but I think things will get really cool when we
| figure out how to wire together something like Wolfram Alpha, a
| programming language, some databases with lots of actual facts
| (as opposed to encoded/learned ones), and ChatGPT.
| cs702 wrote:
| It's already better than _real_ rats-as-a-service, certainly:
|
| https://news.ycombinator.com/item?id=42449424
| ndesaulniers wrote:
| Does it matter what color the rat is?
| notpushkin wrote:
| I suppose it refers to the particular species, _Rattus
| norvegicus_ (although I 'd call it common rat personally).
| petesergeant wrote:
| ChatGPT can already run code, which allows it to overcome
| some limitations of tokenization (eg counting the letters in
| strawberry, sorting words by their second letter). Doesn't
| seem like adding a Prolog interpreter would be all that hard.
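|
| For example, the kind of throwaway script a code-running model
| might emit for those two cases (an illustrative sketch only,
| not ChatGPT's actual tool output):
|
|       word = "strawberry"
|       print(word.count("r"))  # counting letters: 3
|
|       words = ["banana", "cherry", "apple", "plum"]
|       # sorting words by their second letter
|       print(sorted(words, key=lambda w: w[1]))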
| Kim_Bruning wrote:
| ChatGPT does already have access to Bing (would that count as
| your facts database?) and Jupyter (which is sort of a
| Wolfram clone except with Python?).
|
| It still won't magically use them 100% correctly, but with a
| bit of smarts you can go a long way!
| zozbot234 wrote:
| A brown rat's brain is also a lot more energy efficient than
| your average LLM. Especially in the learning phase, but not
| only.
| cs702 wrote:
| Yes, I agree, but energy efficiency is orthogonal to
| capabilities.
| sanderjd wrote:
| No it isn't, because it is relevant to the question of
| whether the current approaches _can_ be scaled 100x or
| 1000x.
| cs702 wrote:
| That's a hardware question, not a software question, but
| it is a fair question.
|
| I don't know if the hardware can be scaled up. That's why
| I wrote " _if_ we 're able to scale them" at the root of
| this thread.
| bee_rider wrote:
| It is probably a both question. If 100x is the goal,
| they'll have to double up the efficiency 7 times, which
| seems basically plausible given how early-days it still
| is (I mean they have been training on GPUs this whole
| time, not ASICs... bitcoins are more developed and they
| are a dumb scam machine). Probably some of the doubling
| will be software, some will be hardware.
| sanderjd wrote:
| Yep, agreed.
|
| I'm pretty skeptical of the scaling hypothesis, but I
| also think there is a huge amount of efficiency
| improvement runway left to go.
|
| I think it's more likely that the return to further
| scaling will become net negative at some point, and then
| the efficiency gains will no longer be focused on doing
| more with more but rather doing the same amount with
| less.
|
| But it's definitely an unknown at this point, from my
| perspective. I may be very wrong about that.
| sanderjd wrote:
| The question is essentially: Can the current approaches
| we've developed get to or beyond human level
| intelligence?
|
| Whether those approaches can scale enough to achieve that
| is relevant to the question, whether the bottleneck is in
| hardware or software.
| s1artibartfast wrote:
| That depends on if efficiency is part of the scaling
| process
| cruffle_duffle wrote:
| Honestly I think the opposite. All these giant tech
| companies can afford to burn money with ever bigger models
| and ever more compute and I think that is actually getting
| in their way.
|
| I wager that some scrappy resource constrained startup or
| research institute will find a way to produce results that
| are similar to those generated by these ever massive LLM
| projects only at a fraction of the cost. And I think
| they'll do that by pruning the shit out of the model. You
| don't need to waste model space on ancient Roman history or
| the entire canon of the Marvel Cinematic Universe on a
| model designed to refactor code. You need a model that is
| fluent in English and "code".
|
| I think the future will be tightly focused models that can
| run on inexpensive hardware. And unlike today where only
| the richest companies on the planet can afford training,
| anybody with enough inclination will be able to train them.
| (And you can go on a huge tangent why such a thing is
| absolutely crucial to a free society)
|
| I dunno. My point is, there is little incentive for these
| huge companies to "think small". They have virtually
| unlimited budgets and so all operate under the idea that
| more is better. That isn't gonna be "the answer"... they
| are all gonna get instantly blindsided by some group who
| does more with significantly less. These small scrappy
| models and the institutes and companies behind them will
| eventually replace the old guard. It's a tale as old as
| time.
| clayhacks wrote:
| Deepseek just released their frontier model that they
| trained on 2k GPUs for <$6M. Way cheaper than a lot of
| the big labs. If the big labs can replicate some of their
| optimisations we might see some big gains. And I would
| hope more small labs could then even further shrink the
| footprint and costs
| cruffle_duffle wrote:
| I don't think this stuff will be truly revolutionary
| until I can train it at home or perhaps as a group (SETI
| at home anybody?)
|
| Six million is a start but this tech won't truly be
| democratized until it costs $1000.
|
| Obviously I'm being a little cheeky but my real point
| is... the idea that this technology is in the control of
| massive technology companies is dystopian as fuck. Where
| is the RMS of the LLM space? Who is shouting from every
| rooftop how dangerous it is to grant so much power and
| control over information to a handful of massive tech
| companies, all of whom have long histories of caving in to
| various government demands. It's scary as fuck.
| lodovic wrote:
| This is just a tech race. We'll get affordable 64 GB GPUs
| in a few years; businesses want to run their own models.
| DrBenCarson wrote:
| It's not at all, energy is a hard constraint to capability.
|
| Human intelligence improved dramatically after we improved
| our ability to extract nutrients from food via cooking
|
| https://www.scientificamerican.com/article/food-for-
| thought-...
| ben_w wrote:
| > It's not at all, energy is a hard constraint to
| capability.
|
| We can put a lot more power flux through an AI than a
| human body can live through; both because computers can
| run hot enough to cook us, and because they can be
| physically distributed in ways that we can't survive.
|
| That doesn't mean there's no constraint, it's just that
| the extent to which there is a constraint, the constraint
| is way, _way_ above what humans can consume directly.
|
| Also, electricity is much cheaper than humans. To give a
| worked example, consider that the UN poverty threshold*
| is about US$2.15/day in 2022 money, or just under
| 9¢/hour. My first Google search result for "average cost
| of electricity in the usa" says "16.54 cents per kWh",
| which means the UN poverty threshold human lives on a
| price equivalent ~= just under 542 watts of average
| American electricity.
|
| The actual power consumption of a human is 2000-2500
| kcal/day ~= 96.85-121.1 watts ~= about a fifth of that.
| In certain narrow domains, AI already makes human labour
| uneconomic... though fortunately for the ongoing payment
| of bills, it's currently only that combination of good-
| and-cheap in narrow domains, not generally.
|
| * I use this standard so nobody suggests outsourcing
| somewhere cheaper.
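|
| Spelling out that arithmetic (same inputs as above; a quick
| sanity check, nothing more):
|
|       poverty_usd_per_day = 2.15
|       usd_per_kwh = 0.1654
|       # dollars/hour divided by dollars/kWh = kW
|       print(poverty_usd_per_day / 24 / usd_per_kwh * 1000)  # ~542 W
|
|       for kcal in (2000, 2500):
|           print(kcal * 4184 / 86400)  # ~97-121 W of metabolic power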
| ben_w wrote:
| Are you sure?
|
| The average brown rat may use only 60 kcal per day, but the
| maximum firing rate of biological neurons is about 100-1000
| Hz rather than the A100 clock speed of about 1.5 GHz*, so the
| silicon gets through the same data set something like
| 1.5e6-1.5e7 times faster than a rat could.
|
| Scaling up to account for the speed difference, the rat
| starts looking comparable to a 9e7 - 9e8 kcal/day, or 4.4 to
| 44 megawatts, computer.
|
| * and the transistors within the A100 are themselves much
| faster, because clock speed is ~ how long it takes for all
| chained transistors to flip in the most complex single-clock-
| cycle operation
|
| Also I'm not totally confident about my comparison because I
| don't know how wide the data path is, how many different
| simultaneous inputs a rat or a transformer learns from.
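|
| (Showing my work, same assumptions as above:)
|
|       rat_kcal_per_day = 60
|       for speedup in (1.5e6, 1.5e7):
|           kcal_per_day = rat_kcal_per_day * speedup  # 9e7 - 9e8
|           watts = kcal_per_day * 4184 / 86400
|           print(watts / 1e6)  # roughly 4.4 to 44 megawatts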
| legacynl wrote:
| That's a stupid analogy because you're comparing a
| brain process to a full animal.
|
| Only a small part of that 60 kcal is used for learning, and
| for that same 60 kcal you get an actual physical being that
| is able to procreate, eat, do things, and fend for and
| maintain itself.
|
| Also, you cannot compare neuron firing rates with
| clock speed. AFAIK each neuron in an ML model can have code
| that takes several clock cycles to complete.
|
| Also, a neuron in ML is just a weighted value; a biological
| neuron does much more than that. For example, neurons
| communicate using neurotransmitters as well as voltage
| potentials. The actual data rate of biological neurons is
| therefore much higher and more complex.
|
| Basically your analogy is false because your napkin math
| forgets that the rat is an actual biological rat and not
| something as neatly defined as a computer chip.
| ben_w wrote:
| > Also, a neuron in ML is just a weighted value; a
| biological neuron does much more than that. For example,
| neurons communicate using neurotransmitters as well as
| voltage potentials. The actual data rate of biological
| neurons is therefore much higher and more complex.
|
| The conclusion does not follow from the premise. The
| observed maximum rate of the inter-neuron communication
| is important, the mechanism is not.
|
| > Also, you cannot compare neuron firing rates with
| clock speed. AFAIK each neuron in an ML model can have code
| that takes several clock cycles to complete.
|
| Depends how you're doing it.
|
| Jupyter notebook? Python in general? Sure.
|
| A100s etc., not so much -- those are specialist systems
| designed for this task:
|
| """1024 dense FP16/FP32 FMA operations per clock""" -
| https://images.nvidia.com/aem-dam/en-zz/Solutions/data-
| cente...
|
| "FMA" meaning "fused multiply-add". It's the unit that
| matters for synapse-equivalents.
|
| (Even that doesn't mean they're perfect fits: IMO a
| "perfect fit" would likely be using transistors as analog
| rather than digital elements, and then you get to run
| them at the native transistor speed of ~100 GHz or so and
| don't worry too much about how many bits you need to
| represent the now-analog weights and biases, but that's
| one of those things which is easy to say from a
| comfortable armchair and very hard to turn into silicon).
|
| > Basically your analogy is false because your napkin
| math forgets that the rat is an actual biological rat
| and not something as neatly defined as a computer chip.
|
| Any of those biological functions that don't correspond
| to intelligence, make the comparison more extreme in
| favour of the computer.
|
| This is, after all, a question of their mere
| intelligence, not how well LLMs (or indeed any AI) do or
| don't function as _von Neumann replicators_ , which is
| where things like "procreate, eat, do things and fend for
| and maintain itself" would actually matter.
| haolez wrote:
| And it learns online.
| bee_rider wrote:
| Rats are pretty clever, and they (presumably, at least) have a
| lot of neurons spending their time computing things like...
| where to find food, how frightened of this giant reality
| warping creature in a lab coat should I be, that sort of thing.
| I don't think it is obvious that one brown-rat-power isn't
| useful.
|
| I mean we have dogs. We really like them. For ages, they did
| lots of useful work for us. They aren't that much smarter than
| rats, right? They are better aligned and have a more useful
| shape. But it isn't obvious (to me at least) that the rats'
| problem is insufficient brainpower.
| bloopernova wrote:
| Dogs, if I recall correctly, have evolved alongside us and
| have specific adaptations to better bond with us. They have
| eyebrow muscles that wolves don't, and I think dogs have
| brain adaptations too.
| runarberg wrote:
| We have been with dogs for such a long time, I wouldn't be
| surprised if we also have adaptations to bond with dogs.
|
| I mean dogs came with us to the Americas, and even to
| Australia. Both the Norse and the Inuit took dogs with them
| to Greenland.
| mulmen wrote:
| Depends on how you define smart. Dogs definitely have larger
| brains. But then humans have even larger brains. If dogs
| aren't smarter than rats then the size of brain isn't
| proportional to intelligence.
| hn_throwaway_99 wrote:
| I think the comparison to brown rat brains is a huge mistake.
| It seems pretty apparent (at least from my personal usage of
| LLMs in different contexts) that modern AI is _much_ smarter
| than a brown rat at some things (I don 't think brown rats can
| pass the bar exam), but in other cases it becomes apparent that
| it isn't "intelligent" at all in the sense that it becomes
| clear that it's just regurgitating training data, albeit in a
| highly variable manner.
|
| I think LLMs and modern AI are incredibly amazing and useful
| tools, but even with the top SOTA models today it becomes
| clearer to me the more I use them that they are fundamentally
| lacking crucial components of what average people consider
| "intelligence". I'm using quotes deliberately because the
| debate about "what is intelligence" feels like it can go in
| circles endlessly - I'd just say that the core concept of what
| we consider understanding, especially as it applies to creating
| and exploring novel concepts that aren't just a mashup of
| previous training examples, appears to be sorely missing from
| LLMs.
| cs702 wrote:
| Imagine it were possible to take a rat brain, keep it alive
| with a permanent source of energy, wire its input and output
| connections to a computer, and then train the rat brain's
| output signals to predict the next token, given previous
| tokens fed as inputs, using graduated pain or pleasure
| signals as the objective loss function. All the neuron-neuron
| connections in that rat brain would eventually serve one,
| and only one, goal: predicting an accurate probability
| distribution over the next possible token, given previous
| tokens. The number of neuron-neuron connections in this "rat-
| brain-powered LLM" would be comparable to that of today's
| state-of-the-art LLMs.
|
| This is less far-fetched than it sounds. Search for "organic
| deep neural networks" online.
|
| Networks of rat neurons have in fact been trained to fly
| planes, in simulators, among other things.
| ImHereToVote wrote:
| Human brain organoids are in use right now by a Swiss
| company.
| cs702 wrote:
| Thanks. Yeah, I've heard there are a bunch of efforts
| like that, but as far as I know, all are very early
| stage.
|
| I do wonder if the most energy-efficient way to scale up
| AI models is by implementing them in organic substrates.
| cynicalpeace wrote:
| > modern AI is much smarter than a brown rat at some things
| (I don't think brown rats can pass the bar exam), but in
| other cases it becomes apparent that it isn't "intelligent"
| at all
|
| There is no modern AI system that can go into your house and
| find a piece of cheese.
|
| The whole notion that modern AI is somehow "intelligent", yet
| can't tell me where the dishwasher is in my house is
| hilarious. My 3 year old son can tell me where the dishwasher
| is. A well trained dog could do so.
|
| It's the result of a nerdy definition of "intelligence" which
| excludes anything to do with common sense, street smarts,
| emotional intelligence, or creativity (last one might be
| debatable but I've found it extremely difficult to prompt AI
| to write amazingly unique and creative stories reliably)
|
| The AI systems need bodies to actually learn these things.
| CooCooCaCha wrote:
| Where do you think common sense, emotional intelligence,
| creativity, etc. come from? The spirit? Some magic brain
| juice? No, it comes from neurons, synapses, signals,
| chemicals, etc.
| cynicalpeace wrote:
| It comes from billions of years of evolution, the
| struggle to survive and maintain your body long enough to
| reproduce.
|
| "Neurons, synapses, signals, chemicals" are downstream of
| that.
| mensetmanusman wrote:
| Why would dust care about survival?
| cynicalpeace wrote:
| ¯\_(ツ)_/¯ Consult a bible
| FrustratedMonky wrote:
| a 'dust to dust' joke?
|
| Or just saying, when facing the apocalypse, read a bible?
| bee_rider wrote:
| It doesn't. Actually, quite a few of the early stages of
| evolution wouldn't have any analogue to "care," right? It
| just happened in this one environment, the most
| successful self-reproducing processes happened to get
| more complex over time and eventually hit the point where
| they could do, and then even later define, things like
| "care."
| mulmen wrote:
| Without biological reproduction wouldn't the evolutionary
| outcomes be different? Cyborgs are built in factories,
| not wombs.
| mensetmanusman wrote:
| There are robots that can do this now, they just cost
| $100k.
| uoaei wrote:
| That's just the hardware, but AI as currently practiced
| is purely a software endeavor.
| cynicalpeace wrote:
| Correct, and the next frontier is combining the software
| with the hardware.
| cynicalpeace wrote:
| Find a piece of cheese pretty much anywhere in my home?
|
| Or if we're comparing to a three year old, also find the
| dishwasher?
|
| Closest I'm aware of is something by Boston Dynamics or
| Tesla, but neither would be as simple as asking it:
| where's the dishwasher in my home?
|
| And then if we compare it to a ten year old, find the
| woodstove in my home, tell me the temperature, and adjust
| the air intake appropriately.
|
| And so on.
|
| I'm not saying it's impossible. I'm saying there's no AI
| system that has this physical intelligence yet, because
| the robot technology isn't well developed/integrated yet.
|
| For AI to be something more than a nerd it needs a body
| and I'm aware there are people working on it. Ironically,
| not the people claiming to be in search of AGI.
| HDThoreaun wrote:
| If you upload pictures of every room in your house to an
| LLM it can definitely tell you where the dishwasher is. If
| your argument is just that they can't walk around your house
| so they can't be intelligent, I think that's pretty clearly
| wrong.
| kimixa wrote:
| A trained image recognition model could probably
| recognize a dishwasher from an image.
|
| But that won't be the same model that writes bad poetry
| or tries to autocomplete your next line of code. Or
| control the legs of a robot to move towards the
| dishwasher while holding a dirty plate. And each has a
| fair bit of manual tuning and preprocessing based on its
| function which may simply not be applicable to other
| areas even with scale. The best performing models aren't
| just taking in unstructured untyped data.
|
| Even the most flexible models are only tackling a small
| slice of what "intelligence" is.
| jdietrich wrote:
| ChatGPT, Gemini and Claude are all natively multimodal.
| They can recognise a dishwasher from an image, among many
| other things.
|
| https://www.youtube.com/watch?v=KwNUJ69RbwY
| cynicalpeace wrote:
| Can they take the pictures?
| ta988 wrote:
| Technically yes they can run functions. There were
| experiments of Claude used to run a robot around a house.
| So technically, we are not far at all and current models
| may even be able to do it.
| cynicalpeace wrote:
| Please re-read my original comment.
|
| "The AI systems need bodies to actually learn these
| things."
|
| I never said this was impossible to achieve.
| sippeangelo wrote:
| Can your brain see the dishwasher without your eyes?
| sdenton4 wrote:
| But do they have strong beaks?
|
| https://sktchd.com/column/comics-disassembled-ten-things-
| of-...
| cynicalpeace wrote:
| Do they know what a hot shower feels like?
|
| They can describe it. But do they actually know? Have
| they experienced it?
|
| This is my point. Nerds keep dismissing physicality and
| experience.
|
| If your argument is a brain in a jar will be generally
| intelligent, I think that's pretty clearly wrong.
| HDThoreaun wrote:
| See the responses section of
| https://en.wikipedia.org/wiki/Knowledge_argument - this
| idea certainly has been long considered, but I personally
| reject it.
| cynicalpeace wrote:
| While interesting, this is a separate thought experiment
| with its own quirks. Sort of a strawman, since my
| argument is formulated differently and simply argues that
| AIs need to be more than brains in jars for them to be
| considered generally intelligent.
|
| And that the only reason we think AIs can just be brains
| in jars is because many of the people developing them
| consider themselves as simply brains in jars.
| HDThoreaun wrote:
| Not really. The point of it is considering whether
| physical experience creates knowledge that is impossible
| to get otherwise. Thats the argument you are making no?
| If Mary learns nothing new when seeing red for the first
| time an AI would also learn nothing new when seeing red
| for the first time.
|
| > Do they know what a hot shower feels like? They can
| describe it. But do they actually know? Have they
| experienced it
|
| Is directly a knowledge argument
| cynicalpeace wrote:
| Mary in that thought experiment is not an LLM that has
| learned via text. She's acquired "all the physical
| information there is to obtain about what goes on when we
| see ripe tomatoes". This does not actually describe
| modern LLMs. It actually better describes a robot that
| has transcribed the location, temperature, and velocity
| of water drops from a hot shower to its memory. Again,
| this thought experiment has its own quirks.
|
| Also, it is an argument against physicalism, which I have
| no interest in debating. While it's tangentially related,
| my point is not for/against physicalism.
|
| My argument is about modern AI and its ability to learn.
| If we put touch sensors, eyes, nose, a mechanism to
| collect physical data (legs) and even sex organs on an AI
| system, then it is more generally intelligent than
| before. It will have learned in a better fashion what a
| hot shower feels like and will be smarter for it.
| HDThoreaun wrote:
| > While it's tangentially related, my point is not
| for/against physicalism.
|
| I really disagree. Your entire point is about
| physicalism. If physicalism is true then an AI does not
| necessarily learn in a better fashion what a hot shower
| feels like by being embodied. In a physicalist world it
| is conceivable to experience that synthetically.
| Dilettante_ wrote:
| So are you saying people who have CIPA are less
| intelligent for never having experienced a hot shower? By
| that same logic, does its ability to experience more
| colors increase the intelligence of a mantis shrimp?
|
| Perhaps your own internal definition of intelligence
| simply deviates significantly from the common, "median"
| definition.
| cynicalpeace wrote:
| It's the totality of experiences that make an individual.
| Most humans that I'm aware of have a greater totality of
| experiences that make them far smarter than any modern AI
| system.
| skinner_ wrote:
| Greater totality of experiences than having read the
| whole internet? Obviously they are very different kind of
| experiences, but a greater totality? I'm not so sure.
|
| Here is what we know: The Pile web scrape is 800GB. 20
| years of human experience at 1kB/sec is 600GB. Maybe
| 1kB/sec is a bad estimate. Maybe sensory input is more
| valuable than written text. You can convince me. But next
| challenge, some 10^15 seconds of currently existing
| youtube video, that's 2 million years of audiovisual
| experience, or 10^9GB at the same 1kB/sec.
| tomrod wrote:
| The proof that 1+1=2 is nontrivial despite it being clear
| and obvious. It does not rely on physicality nor
| experience to prove.
|
| There are areas of utility here. Things need not be able
| to do all actions to be useful.
| momentoftop wrote:
| There isn't a serious proof that 1+1=2, because it's near
| enough axiomatic. In the last 150 years or so, we've been
| trying to find very general logical systems in which we
| can encode "1", "2" and "+" and for which 1+1=2 is a
| theorem, and the derivations are sometimes non-trivial,
| but they are ultimately mere sanity checks that the
| logical system can capture basic arithmetic.
| magpi3 wrote:
| Could it tell the difference between a dishwasher and a
| picture of a dishwasher on a wall? Or one painted onto a
| wall? Or a toy dishwasher?
|
| There is an essential idea of what makes something a
| dishwasher that LLMs will never be able to grasp no
| matter how many models you throw at them. They would have
| to fundamentally understand that what they are "seeing"
| is an electronic appliance connected to the plumbing that
| washes dishes. The sound of a running dishwasher, the
| heat you feel when you open one, and the wet, clean
| dishes is also part of that understanding.
| viraptor wrote:
| Yes, it can tell a difference, up to the point where the
| boundaries are getting fuzzy. But the same thing applies
| to us all.
|
| Can you tell this is a dishwasher?
| https://www.amazon.com.au/Countertop-Dishwasher-
| Automatic-Ve...
|
| Can you tell this is a drawing of a glass?
| https://www.deviantart.com/januarysnow13/art/Wine-Glass-
| Hype...
|
| Can you tell this is a toy?
| https://www.amazon.com.au/Theo-Klein-Miele-Washing-
| Machine/d...
| theamk wrote:
| That really makes no sense... would you say someone who is
| disabled below the neck is not intelligent / has no common
| sense, street smarts, creativity, etc.?
|
| Or would you say that you cannot judge the intelligence of
| someone by reading their books / exchanging emails with
| them?
| megamix wrote:
| What do you think of copyright violations?
| bee_rider wrote:
| IMO it is sad that the sort of... anti-establishment side of
| tech has suddenly become very worried about copyright. Bits
| inherently can be copied for free (or at least very cheap),
| copyright is a way to induce scarcity for the market to
| exploit where there isn't any on a technical level.
|
| Currently the AI stuff kind of sucks because you have to be a
| giant corp to train a model. But maybe in a decade, users
| will be able to train their own models or at least fine-tune
| on basic cellphone and laptop (not dgpu) chips.
| uoaei wrote:
| The copyright question is inherently tied to the
| requirement to earn money from your labor in this economy.
| I think the anti-establishment folks are not so rabid that
| they can't recognize real material conditions.
| LordDragonfang wrote:
| I think that would be a more valid argument if they ever
| cared about automating away jobs before. As it stands,
| anyone who was standing in the way of the glorious march
| of automation towards a post-scarcity future was called a
| luddite - right up until that automation started
| threatening their (material) class.
|
| I mean, you don't have to look any further than the
| (justified) lack of sympathy to dockworkers just a few
| months ago: https://news.ycombinator.com/item?id=41704618
|
| The solution is not, and never has been, to shack up with
| the capital-c Capitalists in defense of copyright. It's
| to push for a system where having your "work" automated
| away is a relief, not a death sentence.
| uoaei wrote:
| There's both "is" and "ought" components to this
| conversation and we would do well to disambiguate them.
|
| I would engage with those people you're stereotyping
| rather than gossiping in a comments section, I suspect
| you will find their ideologies quite consistent once you
| tease out the details.
| deergomoo wrote:
| > IMO it is sad that the sort of... anti-establishment side
| of tech has suddenly become very worried about copyright
|
| It shouldn't be too surprising that anti-establishment
| folks are more concerned with trillion-dollar companies
| subsuming and profiting from the work of independent
| artists, writers, developers, etc., than with individual
| people taking IP owned by multimillion/billion-dollar
| companies. Especially when many of the companies in the
| latter group are infamous for passing only a tiny portion
| of the money charged onto the people doing the actual
| creative work.
| Earw0rm wrote:
| This.
|
| Tech still acts like it's the scrappy underdog, the
| computer in the broom cupboard where "the net" is a third
| space separate from reality, nerds and punks writing
| 16-bit games.
|
| That ceased to be materially true around twenty years ago
| now. Once Facebook and smart phones arrived, computing
| touched every aspect of peoples' lives. When tech is all-
| pervasive, the internal logic and culture of tech isn't
| sufficient to describe or understand what matters.
| bee_rider wrote:
| IMO this is looking at it through a lens which considers
| "tech" a single group. Which is a way of looking at is,
| maybe even the best way. But an alternative could be: in
| the battle between scrappy underdog and centralized
| sellout tech, the sellouts are winning.
| mulmen wrote:
| > in the battle between scrappy underdog and centralized
| sellout tech, the sellouts are
|
| Winning by what metric?
| TheOtherHobbes wrote:
| Copyright is the right to get a return from creative work.
| The physical ease - or otherwise - of copying is absolutely
| irrelevant to this. So is scarcity.
|
| It's also orthogonal to the current corporate dystopia
| which is using monopoly power to enclose the value of
| individual work from the other end - _precisely_ by
| inserting itself into the process of physical distribution.
|
| None of this matters if you have a true abundance economy,
| _but we don 't._ Pretending we do for purely selfish
| reasons - "I want this, and I don't see why I should pay
| the creator for it" - is no different to all the other ways
| that employers stiff their employees.
|
| I don't mean it's analogous, I mean it's exactly the same
| entitled mindset which is having such a catastrophic effect
| on everything at the moment.
| cruffle_duffle wrote:
| > IMO it is sad that the sort of... anti-establishment side
| of tech has suddenly become very worried about copyright.
|
| Remember Napster? Like how rebellious was that shit? Those
| times are what truly society-upsetting tech looks like.
|
| You cannot even import a video into OpenAI's Sora without
| agreeing to a four (five?) checkbox terms & conditions
| screen. These LLMs come out of the box neutered by
| corporate lawyers and various other safety weenies.
|
| This shit isn't real until there are mainstream media
| articles expressing outrage because some "dangerous group
| of dark web hackers finished training a model at home that
| every high school student on the planet can use to cheat on
| their homework" or something like that. Basically it ain't
| real until it actually challenges The Man. That isn't
| happening until this tech is able to be trained and
| inferenced from home computers.
| bee_rider wrote:
| Yeah, or if it becomes possible to train on a peer-to-
| peer network somehow. (I'm sure there's research going
| on in that direction). Hopefully that sort of thing comes
| out of the mix.
| s1artibartfast wrote:
| I think that AI output is transformative.
|
| I think the training process constitutes commercial use.
| jokethrowaway wrote:
| copyright on a sequence of numbers should never have existed
| in the first place
|
| great if AI accelerates its destruction (even if it's through
| lobbying to our mafia-style protect-the-richest-company
| governments)
| pockmarked19 wrote:
| Calling it a neural network was clearly a mistake on the
| magnitude of calling a wheel a leg.
| cgearhart wrote:
| This is an excellent analogy. Aside from "they're both
| networks" (which is almost a truism), there's really nothing
| in common between an artificial neural network and a brain.
| runarberg wrote:
| Neurons also adjust the signal strength based on previous
| stimuli, which in effect makes the future response
| weighted. So it is not far off--albeit a gross
| simplification--to call the brain a weight matrix.
|
| As I learned it, artificial neural networks were modeled
| after a simple model for the brain. The early (successful)
| models were almost all reinforcement models, which is also
| one of the most successful models for animal (including
| human) learning.
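|
| For reference, the "simple model" in question boils down to
| something like this (a sketch of the textbook artificial
| neuron, nothing more; real neurons do far more):
|
|       import math
|
|       def artificial_neuron(inputs, weights, bias):
|           # weighted sum of inputs, squashed by a nonlinearity
|           z = sum(x * w for x, w in zip(inputs, weights)) + bias
|           return 1 / (1 + math.exp(-z))  # sigmoid activation
|
|       print(artificial_neuron([0.2, 0.7, 0.1], [1.5, -2.0, 0.5], 0.1))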
| legacynl wrote:
| I don't really get where you're coming from..
|
| Is your point that the capabilities of these models have
| grown such that 'merely' calling it a neural network doesn't
| fit the capabilities?
|
| Or is your point that these models are called neural networks
| even though biological neural networks are much more complex
| and so we should use a different term to differentiate the
| simulated from the biological ?
| juped wrote:
| It was clearly a mistake because people start attempting to
| make totally incoherent comparisons to rat brains.
| joe_the_user wrote:
| The OP is comparing the "neuron count" of an LLM to the
| neuron count of animals and humans. This comparison is
| clearly flawed. Even if you step back and say "well, the units
| might not be the same but LLMs are getting more complex so
| pretty soon they'll be like animals". Yes, LLMs are complex
| and have gained more behaviors through size and increased
| training regimes, but if you realize these structures aren't
| like brains, there's no argument here that they will soon
| reach the qualities of brains.
| cs702 wrote:
| Actually, I'm comparing the "neuron-neuron connection
| count," while admitting that the comparison is not
| apples-to-apples.
|
| This kind of comparison isn't a new idea. I think Hans
| Moravec[a] was the first to start making these kinds of
| machine-to-organic-brain comparisons, back in the 1990's,
| using "millions of instructions per second" (MIPS) and
| "megabytes of storage" as his units.
|
| You can read Moravec's reasoning and predictions here:
|
| https://www.jetpress.org/volume1/moravec.pdf
|
| ---
|
| [a] https://en.wikipedia.org/wiki/Hans_Moravec
| legacynl wrote:
| I think he was approaching the concept from the direction
| of "how many mips and megabytes do we need to create
| human level intelligence".
|
| That's a different take than "human level is this many
| mips and megabytes", i.e. his claims are about artificial
| intelligence, not about biological intelligence.
|
| Machine learning seems to be modeled after the action
| potential part of neural communication. But biological
| neurons can also communicate in other ways, e.g. via
| neurotransmitters. AFAIK this isn't modeled in the
| current ML models at all (nor do we have a good idea
| how/why that stuff works). So ultimately it's pretty
| likely that an ML model with a billion parameters does not
| perform the same as an organic brain with a billion
| synapses.
| cs702 wrote:
| I never claimed the machines would achieve "human level,"
| however you define it. What I actually wrote at the root
| of this thread is that we have no way of knowing in
| advance what the _future_ capabilities of these AI
| systems might be as we scale them up.
| tshaddox wrote:
| Most simple comparisons are flawed. Even just comparing
| the transistor counts of CPUs with vastly different
| architectures would be quite flawed.
| torginus wrote:
| Afaict OP's not comparing neuron count, but neuron-to-
| neuron connections, aka synapses. And considering each
| synapse (weighted input) to a neuron performs
| computation, I'd say it's possible it captures a
| meaningful property of a neural network.
| bgnn wrote:
| excellent analogy. piggybacking on this: a lot of believers
| (as they are like religious fanatics) claim that more data
| and hardware will eventually make LLMs intelligent, as if
| it's even the neuron count that matters. There is no other
| animal close to humans in intelligence, and we don't know why.
| Somehow, though, a randomly hallucinating LLM + shit loads of
| electricity would figure it out. This is close to pure
| alchemy.
| runarberg wrote:
| I don't disagree with your main point but I want to push
| back on the notion that " _there is no other animal close
| to humans in intelligence_ ". This is only true in the
| sense that we humans define intelligence in human terms.
| Intelligence is a very fraught and problematic concept
| in philosophy, but especially in the sciences (particularly
| psychology).
|
| If we were dogs surely we would say that humans were quite
| skillful, impressively so even, in pattern matching,
| abstract thought, language, etc. but are hopelessly dumb at
| predicting past presence via smell, a crow would similarly
| judge us on our inability to orient ourselves, and
| probably wouldn't understand our language and thus
| completely miss our language abilities. We do the same when
| we judge the intelligence of non-human animals or systems.
|
| So the reason for why no other animal is close to us in
| intelligence is very simple actually, it is because of the
| way we define intelligence.
| andrepd wrote:
| NNs are in no way, shape, or form even remotely similar to human
| neural tissue, so your whole analogy falls there.
| legacynl wrote:
| A little nitpick; a biological neuron is much more complex than
| its ML-model equivalent. A simple weighted function cannot
| fully replicate a neuron.
|
| That's why it's almost certain that a biological brain with a
| billion synapses outperforms a model with a billion parameters.
| mort96 wrote:
| Isn't that what they meant by this?
|
| > the comparison is not apples-to-apples, because each
| synapse is much more complex than a single parameter in a
| weight matrix.
| daveguy wrote:
| It isn't just "not apples to apples". It's apples to
| supercomputers.
| legacynl wrote:
| well yeah, but it's a non-obvious yet very big difference that
| basically invalidates any conclusion that you can make with
| this comparison.
| mort96 wrote:
| I don't think so: it seems reasonable to assume that
| biological synapses are strictly more powerful than
| "neural network" weights, so the fact that a human brain
| has 3 orders of magnitude more synapses than
| language models have weights tells us that we should expect,
| _as an extreme lower bound_ , 3 orders of magnitude
| difference.
| joe_the_user wrote:
| It's not a "nitpick", it's a complete refutation. LLM don't
| have a strong relationship to brains, they're just
| math/computer constructs.
| joe_the_user wrote:
| This tech has made a big impact, is obviously real, and exactly
| what potential can be unlocked by scaling is worth considering...
|
| ... but calling vector-entries in a tensor flow process "
| _neurons_ " is at best a very loose analogy while comparing LLM
| "neuron numbers" to animals and humans is flat-out nonsense.
| fsndz wrote:
| yes indeed. But I see more and more people arguing against the
| very possibility of AGI. Some people say statistical models
| will always have a margin of error and as such will have some
| form of reliability issues:
| https://open.substack.com/pub/transitions/p/here-is-why-ther...
| rmbyrro wrote:
| the possibility of error is a requirement for AGI
|
| the same foundation that makes the binary model of
| computation so reliable is what also makes it unsuitable to
| solving complex problems with any level of autonomy
|
| in order to reach autonomy and handle complexity, the
| computational model foundation _must_ accept errors
|
| because the real world is _not binary_
| sroussey wrote:
| This really speaks to the endeavors of making non-digital
| hardware for AI. Less of an impedance mismatch.
| kerkeslager wrote:
| > As of right now, we have no way of knowing in advance what
| the capabilities of current AI systems will be if we are able
| to scale them by 10x, 100x, 1000x, and more.
|
| Uhh, yes we do.
|
| I mean sure, we don't know everything, but we know one thing
| which is very important and which isn't under debate by anyone
| who knows how current AI works: current AI response quality
| cannot surpass the quality of its inputs (which include both
| training data and code assumptions).
|
| > The number of neuron-neuron connections in current AI systems
| is still tiny compared to the human brain.
|
| And it's become abundantly clear that this isn't the important
| difference between current AI and the human brain for two
| reasons: 1) there are large scale structural differences which
| contain implicit, inherited input data which goes beyond neuron
| quantity, and 2) as I said before, we cannot surpass the
| quality of input data, and current training data sets clearly
| do not contain all the input data one would need to train a
| human brain anyway.
|
| It's true we don't know _exactly_ what would happen if we
| scaled up a current-model AI to human brain size, but _we do
| know_ that it would _not_ produce a human brain level of
| intelligence. The input datasets we have simply do not contain
| a human level of intelligence.
| petesergeant wrote:
| ... and any other answer is just special pleading towards what
| people want to be true. "What LLMs can't do" is increasingly
| "God of the gaps" -- someone states what they believe to be a
| fundamental limitation, and then later models show that
| limitation doesn't hold. Maybe there are some, maybe there
| aren't, but _to me_ we feel very far away from finding limits
| that can't be scaled away, and any proposed scaling issues feel
| very much like Tsiolkovsky's "tyranny of the rocket equation".
|
| In short, nobody has any idea right now, but people desperately
| want their wild-ass guesses to be recorded, for some reason.
| HarHarVeryFunny wrote:
| > As of right now, we have no way of knowing in advance what
| the capabilities of current AI systems will be if we are able
| to scale them by 10x, 100x, 1000x, and more.
|
| I don't think that's totally true, and anyways it depends on
| what kind of scaling you are talking about.
|
| 1) As far as training set (& corresponding model + compute)
| scaling goes - it seems we do know the answer since there are
| leaks from multiple sources that training set scaling
| performance gains are plateauing. No doubt you can keep
| generating more data for specialized verticals, or keep feeding
| video data for domain-specific gains, but for general text-
| based intelligence existing training sets ("the internet",
| probably plus many books) must have pretty decent coverage.
| Compare to a human: would a college graduate reading one more
| set of encyclopedias make them significantly smarter or more
| capable ?
|
| 2) The _new_ type of scaling is not training set scaling, but
| instead run-time compute scaling, as done by models such as
| OpenAI 's GPT-o1 and o3. What is being done here is basically
| adding something similar to tree search on top of the model's
| output. Roughly: for each of the top 10 predicted tokens,
| predict the top 10 continuation tokens, then for each of those
| predict the top 10, etc - so for a depth-3 tree we've already
| generated, and scaled compute/cost by, 1000 tokens (for a
| depth-4 search it'd be 10,000x compute/cost, etc). The system
| then evaluates each branch of the tree according to some metric
| and returns the best one. OpenAI have indicated linear
| performance gains for exponential compute/cost increases, which
| you could interpret as linear performance gains for each
| additional step of tree depth (a 3-token-deep tree vs a
| 4-token-deep tree, etc).
|
| Edit: Note that the unit of depth may be (probably is)
| "reasoning step" rather than single token, but OpenAI have not
| shared any details.
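|
| A toy sketch of that search, just to make the cost structure
| concrete (top_k_continuations and score are placeholders for a
| real model and evaluator; this is not how o1/o3 are actually
| implemented, which OpenAI haven't shared):
|
|       def top_k_continuations(path, k):
|           # placeholder: a real system would ask the LLM for
|           # its k most likely next tokens / reasoning steps
|           return [path + [f"tok{i}"] for i in range(k)]
|
|       def score(path):
|           # placeholder: a real system would use a learned
|           # verifier or some other quality metric
|           return -sum(int(t[3:]) for t in path)
|
|       def tree_search(depth=3, k=10):
|           frontier = [[]]
|           for _ in range(depth):  # k**depth leaves: 10**3 = 1000
|               frontier = [c for p in frontier
|                           for c in top_k_continuations(p, k)]
|           return max(frontier, key=score)
|
|       print(tree_search())  # best-scoring depth-3 branch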
|
| Now, we don't KNOW what would happen if type 2) compute/cost
| scaling was done by some HUGE factor, but it's the nature of
| exponentials that it can't be taken too far, even assuming
| there is aggressive pruning of non-promising branches.
| Regardless of the time/cost feasibility of taking this type of
| scaling too far, there's the question of what the benefit would
| be... Basically you are just trying to squeeze the best
| reasoning performance you can out of the model by evaluating
| many different combinatorial reasoning paths ... but ultimately
| limited by the constituent reasoning steps that were present in
| the training set. How well this works for a given type of
| reasoning/planning problem depends on how well a solution to
| that problem can be decomposed into steps that the model is
| capable of generating. For things well represented in the
| training set, where there is no "impedance mismatch" between
| different reasoning steps (e.g. in a uniform domain like math)
| it may work well, but in others may well result in "reasoning
| hallucination" where a predicted reasoning step is
| illogical/invalid. My guess would be that for problems where o3
| already works well, there may well be limited additional gains
| if you are willing to spend 10x, 100x, 1000x more for deeper
| search. For problems where o3 doesn't provide much/any benefit,
| I'd guess that deeper search typically isn't going to help.
| amazingamazing wrote:
| The fact of the matter is that if AI's externalities - that is,
| massive energy consumption - were exposed to end users and
| humanity in general, no one would use it.
| NoGravitas wrote:
| I wish we could get humanity in general to understand
| externalities in general.
| fulafel wrote:
| I think this is wildly optimistic about how environmentally
| conscious customers of LLMs are. People use fossil fuels
| directly and through electricity consumption in an
| unconscionable way at a scale wildly exceeding what a ChatGPT
| user's energy expenditure is.
|
| We desperately need to rapidly regulate down fossil fuel usage
| and production for both electricity generation and transport.
| The rest of the world needs to follow the example of the EU CO2
| emissions policy which guarantees it's progressing at a
| downwards slope independent of what the CO2 emissions are spent
| on.
| billy99k wrote:
| I use it for fast documentation of unknown (to me) APIs and other
| pieces of software. It's saved me hours of time, where I didn't
| have to go through the developer's site/documentation, and I can
| quickly get example code.
|
| Would I use the code directly in production? No. I always use it
| as an example and write my own code.
| nashashmi wrote:
| The elephant in the room: The user interface problem
|
| We seem to be dancing around a problem in the middle of the room
| like an elephant no one is acknowledging, and that is that the
| interface to Artificial Intelligence and Generative AI is a place
| that requires several degrees of innovation.
|
| I would argue that the first winning feat of innovation on
| interfacing with AI was the "CHAT BOX". And it works well enough
| for the 40% of use cases. And there is another 20% of uses that
| WE THE PEOPLE can use our imagination (prompt engineering) to
| manipulate the chat box to solve. On this topic, there was an
| article/opinion that said complex LLMs are unnecessary because
| 90% of people don't need it. Yeah. Because the chat box cannot do
| much more that would require heavier LLMs.
|
| Complex AI and large data sets need nicer presentation and
| graphics, more actionable interfaces, and more refined activity
| concepts, as well as metadata that gives information on the
| reliability or usability of generated information.
|
| Things like edit sections of an article, enhance articles,
| simplify articles, add relevant images, compress text to fit in a
| limited space, generate sql data from these reports, refine
| patterns found in a page with supplied examples, remove objects,
| add objects, etc.
|
| Some innovation has to happen in MS Office interfaces. Some
| innovations have to happen in photoshop-like interfaces.
|
| The author is complaining about utopian systems being
| incompatible with AI. I would argue AI is a utopian system being
| used in a dystopian world where we are lacking rich usable
| interfaces.
| vonneumannstan wrote:
| Anyone making big bold claims about what LLMs definitely CAN or
| CANNOT do is FULL OF SHIT. Not even the worlds top experts are
| certain where the limit of these technologies are and we are
| already connecting them to tools, making them Agentic, etc. so
| the era of 'pure' LLM chatbots is already dead imo.
| bentt wrote:
| I don't see how it would because at the end of the day a model is
| like a program... input->output. This seems infinitely useful and
| we are just starting to understand how to use this new way of
| computing.
| ramon156 wrote:
| Aider with claude sonnet is probably all I needed to get my
| programming cycle up to speed. I don't think I want anything more
| as a developer.
|
| That said, it still makes mistakes
| jokethrowaway wrote:
| so maybe you want less mistakes?
| 1oooqooq wrote:
| first winds of winter coming...
| malthaus wrote:
| i'm so confused by these discussions around hitting the wall.
|
| sure, a full-on AGI, non-hallucinating AI would be great. but the
| current state is already a giant leap. there's so much untapped
| potential in the corporate world where whole departments,
| processes, etc can be decimated.
|
| doing this and dealing with the socio-economic and political
| fall-out from those efficiency leaps can happen while research
| (along multiple pathways) goes on, and this will take 5-10 years
| at least.
| mediumsmart wrote:
| _" Nobody's gonna believe that computers are intelligent until
| they start coming in late and lying about it."_
|
| BTW: the German KI (Keine Intelligenz) is much more accurate than
| AI (Apparently Intelligent)
| ZiiS wrote:
| Current approaches to AI are almost certainly going to be
| superseded eventually; calling that a dead end achieves nothing.
| darioush wrote:
| Just because something is probabilistic in its response doesn't
| mean it's not useable.
|
| There are many probabilistic algorithms and data structures that
| we use daily.
|
| Yes, we haven't yet developed abstractions to integrate an LLM into
| a programming language, but it doesn't mean no one will make one.
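|
| For example, a Bloom filter answers membership queries with a
| bounded false-positive rate and zero false negatives, and we
| rely on that trade-off everywhere. A minimal sketch:
|
|       import hashlib
|
|       class BloomFilter:
|           def __init__(self, size=1024, hashes=3):
|               self.size, self.hashes = size, hashes
|               self.bits = [False] * size
|
|           def _positions(self, item):
|               for i in range(self.hashes):
|                   h = hashlib.sha256(f"{i}:{item}".encode())
|                   yield int(h.hexdigest(), 16) % self.size
|
|           def add(self, item):
|               for p in self._positions(item):
|                   self.bits[p] = True
|
|           def might_contain(self, item):
|               # True may be a false positive; False is definite
|               return all(self.bits[p] for p in self._positions(item))
|
|       bf = BloomFilter()
|       bf.add("hello")
|       print(bf.might_contain("hello"))  # True
|       print(bf.might_contain("world"))  # almost certainly False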
| szundi wrote:
| All branches of everything are a dead end sooner or later in life
| makach wrote:
| Betteridge's law of headlines, _current_ AI may absolutely be a
| dead end, but fortunately technology is evolving and changing -
| who knows what the future will hold.
| andrewguy9 wrote:
| Maybe it is, maybe it isn't. The only thing I know is, none of
| the arrogant fuckers on hacker news know anything about it. But
| that won't stop them from posting.
| tucnak wrote:
| There's an upside! If they're wrong, and they manage to
| convince more people--it basically gives you more of an
| advantage. I don't get into arguments about the utility of LLM
| technology anymore because why bother?
| josefritzishere wrote:
| This may be an exception to Betteridge's law of headlines
| thunkingdeep wrote:
| Useless and dead end aren't synonymous. It's most certainly a
| dead end, but it's also not useless.
|
| There are a lot of comments here already conflating these two.
|
| This article is also pretty crap. There's a decent summary box
| but other than that it's all regurgitated half-wisdoms we've all
| already realized: things will change, probably a lot; nobody
| knows what the end goal is or how far we are from it; the next
| quantum leap almost certainly depends on a transcendent
| architecture or new model entirely.
|
| This whole article could've been a single paragraph honestly, and
| a lot of the comments here probably wouldn't have read that
| either... just sayin
| mmaunder wrote:
| "In my mind, all this puts even state-of-the-art current AI
| systems in a position where professional responsibility dictates
| the avoidance of them in any serious application."
|
| And yet here we are with what we all think of as serious and
| seriously useful applications.
|
| "My first 20 years of research were in formal methods, where
| mathematics and logic are used to ensure systems operate
| according to precise formal specifications, or at least to
| support verification of implemented systems."
|
| I think recommending that we avoid building anything serious in
| the field until your outdated verification methodology catches up
| is unreasonably cynical, but also naive, because it discards the
| true nature of our global society and assumes a lab environment
| where this kind of control is possible.
| rob_c wrote:
| Yes, until someone introduces reward into llm training I doubt
| we'll get much further
| dfilppi wrote:
| Somehow, fallible humans create robust systems. Look to "AI" to
| do the same, at a far higher speed. The "AI" doesn't need to
| recite the Fibonacci sequence; it can write (and test) a program
| that does so. Speed is power.
| quotemstr wrote:
| Whenever a new technology emerges, along with it always emerge
| naysayers who claim that the new technology could never work ---
| _while it's working right in front of their noses_. I'm sure
| there were people after Kitty Hawk who insisted that heavier than
| air flight would never amount to much economically. Krugman
| famously insisted in the 90s that the internet would never amount
| to anything. These takes are comical in hindsight.
|
| The linked article is another one of these takes. AI can
| _obviously_ reason. o3 is _obviously_ superhuman along a number
| of dimensions. AI is _obviously_ useful for software development.
| This guy spent 20 years of his life working on formal methods. Of
| course he's going to poo-poo the AI revolution. That doesn't
| make him right.
| sealeck wrote:
| > Whenever a new technology emerges, along with it always
| emerge naysayers who claim that the new technology could never
| work
|
| There's some survivorship bias going on here - you only
| consider technologies which succeeded, and find examples of
| people scrutinising them beforehand. However, we know that not
| every nascent technology blossoms; some are really effective,
| but can't find adopters; some are ahead of their time; some are
| cost-prohibitive; and some are outright scams.
|
| It's not a given that every promising new technology is a
| penicillin - some might be Theranos.
| omolobo wrote:
| > I would call 'LLM-functionalism': the idea that a natural
| language description of the required functionality fed to an LLM,
| possibly with some prompt engineering, establishes a meaningful
| implementation of the functionality.
|
| My boy. More people need common sense like this talked into them.
| mikewarot wrote:
| AI is only a dead end if you expect it to function
| deterministically. In the same way as people, it's not rational,
| and it can't be made rational.
|
| For example, the only effective way to get an AI not to talk
| about Bryan Lunduke is to have an external layer that scans for
| his name in the output of an AI, if found, stops the session and
| prints an error message instead.
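|
| As a rough sketch of that kind of external guard layer (my own
| illustration; the blocked term and error text are placeholders,
| and `generate` stands in for any prompt-to-text model call):
|
|     BLOCKED_TERMS = {"bryan lunduke"}  # placeholder block list
|
|     def guarded_reply(generate, prompt):
|         """Scan the model output for blocked terms and stop
|         the session instead of returning the text."""
|         text = generate(prompt)
|         if any(t in text.lower() for t in BLOCKED_TERMS):
|             raise RuntimeError("Session stopped: blocked output.")
|         return text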
|
| If you're willing to build systems around it (like we do with
| people) to limit its side effects and provide sanity checks, and
| legality checks like those mentioned above, it can offer useful
| opinions about the world.
|
| The main thing to remember is that AI is an _alien_ intelligence.
| Each new model is effectively the product of millions of dollars'
| worth of forced evolution. You're getting Stitch from "Lilo and
| Stitch", and you'll never be sure if it's having a bad day.
| clint wrote:
| Also, is there a known deterministic intelligence? Only very
| specific computer programs can be made deterministic, and even
| that has taken quite a while for us to nail down. A lot of the
| code and systems of code produced by humans today are not
| deterministic, and it takes a lot of effort to get them there. For
| most people and teams it's not even on their radar or worth the
| effort.
| bloomingkales wrote:
| Control freaks have a serious issue with the incompleteness of
| an LLM. Everyone else is just ecstatic that it often gets you 70%
| of the way there.
| arrosenberg wrote:
| > Control freaks
|
| People who like repeatable results in their work equipment
| are control freaks?
| therein wrote:
| I know, right? Software Engineers with their zeal for
| determinism. How dare they.
| Terr_ wrote:
| Or modern mechanical engineers getting all pissy about
| "tolerances." Look, we shipped you a big box of those
| cheap screws, so just keep trying a different one until
| each motor sticks together.
| jacobgkau wrote:
| > For example, the only effective way to get an AI not to talk
| about Bryan Lunduke is to have an external layer that scans for
| his name in the output of an AI, if found, stops the session
| and prints an error message instead.
|
| > If you're willing to build systems around it (like we do with
| people) to limit it's side effects and provide sanity checks,
|
| I don't think that comparison holds up. We do build systems
| around people, but people also have internal filters, and most
| are able to use them to avoid having to interact with the
| external ones. You seemed to state that AIs don't (can't?)
| have working internal filters and rely on external ones.
|
| Imagine if everyone did whatever they wanted all the time and
| cops had to go around physically corralling literally everyone
| at all times to maintain something vaguely resembling "order."
| That would be more like a world filled with animals than
| people, and even animals have a bit more reasoning than that.
| That's where we are with AI, apparently.
| clint wrote:
| > Imagine if everyone did whatever they wanted all the time
| and cops had to go around physically corralling literally
| everyone at all times to maintain something vaguely
| resembling "order."
|
| I don't need to imagine anything. I live on Earth in America
| and to my mind you've very accurately described the current
| state of human society.
|
| For the vast majority of humans this is how it works
| currently.
|
| The amount of government, military, and police and the
| capital, energy, and time to support all of that in every
| single country on earth is pretty much the only thing holding
| up the facade of "order" that some people seem to take for
| granted.
| jacobgkau wrote:
| > For the vast majority of humans this is how it works
| currently.
|
| No it is not. Like I said, everyone knows everyone has an
| internal "filter" on what you say (and do). The _threat_ of
| law enforcement may motivate everything (if you want to be
| edgy with how you look at it), but that is not the same
| thing as being actively, physically corrected at every
| turn, which is what the analogy in question lines up with.
| peter_retief wrote:
| AI is useful as a tool but it is far from trustworthy.
|
| I just used Grok to write some cron scripts for me, and it gave me
| perfectly good results. If you know exactly what you want, it is
| great.
|
| It is not the end of software programmers, though, and it is very
| dangerous to give it too much leeway, because you will almost
| certainly end up with problems.
|
| I agree with the conclusion that a hybrid model is possible.
| DeepYogurt wrote:
| > if you know exactly what you want, it is great.
|
| Kinda kills the utility if you need to know what you want out
| tho...
| stanac wrote:
| It speeds up code writing, it's not useless. Best use case
| for me is to help me understand libraries that are sparsely
| documented (e.g. dotnet roslyn api).
|
| edit: spelling
| xandrius wrote:
| If I can get 100 lines generated instantly while explaining it in
| 25, scan the answer just to validate it, and then add another 50
| lines because I forgot something before, all within minutes, then
| I'm happy.
|
| Plus I can detach the "tell the AI" part from the actual
| running of the code. That's pretty powerful to me.
|
| For instance, I could be on the train thinking of something,
| chat it over with an LLM, get it where I want and then pause
| before actually copying it into the project.
| dmead wrote:
| Yes. It's really time to move on (to the next scam).
| Hilift wrote:
| >current AI should not be used for serious applications.
|
| "If an an artificial person can do a job and make fewer mistakes
| than a real person, why not?"
|
| Is the question everyone in business is asking.
| mentalgear wrote:
| As evidenced by most if not all of the "AI-hiring platforms", it's
| not about solving a problem successfully, but about using the
| latest moniker/term/sticker to appear as if you solve the problem
| successfully.
|
| In reality, neither the client nor the user base have access to
| the ground truth of these "AI system"s to determine actual
| reliability and efficiency.
|
| That's not to say there aren't some genuine ML/AGI companies
| like DeepMind (which solve specific narrow problems with quite
| high confidence), but most of the "AI" companies feel like
| they are coming from crypto and are now selling little more
| than vaporware in the AI gold rush.
| hmillison wrote:
| in reality the question is more so, can the AI do a "good
| enough" job to not be noticeably worse than a real person?
| dghlsakjg wrote:
| > "If an an artificial person can do a job and make fewer
| mistakes than a real person, why not?"
|
| The very simple answer to that is that the artificial person
| can't do the full job of a person yet.
|
| Being good or better _at certain parts_ of a job does not mean
| it can do the whole job effectively.
| sanderjd wrote:
| I always find this to be a false dichotomy. I'm not sure what
| use cases are a good fit for generative AI models to tackle
| without human supervision. But there are clearly many tasks
| where the combination of generative AI with human direction is
| a big productivity boon.
| 015a wrote:
| "Making fewer mistakes" implies that there's a framework within
| which the agent operates where its performance can be quickly
| judged as correct or incorrect. But, computers have already
| automated many tasks and roles in companies where this
| description applies; and competitive companies now remain
| capitalistically competitive not because they have stronger
| automation of boolean jobs, but because they're better
| configured to leverage human creativity in tasks and roles
| performance in which cannot be quickly judged as correct or
| incorrect.
|
| Apple is the world's most valuable company, and many would
| attribute a strong part of their success to Jobs' legacy of
| high-quality decision-making. But anyone who has worked in a
| large company understands that there's no way Apple can so
| consistently produce their wide range of highly integrated,
| high quality products with only a top-down mandate from one
| person; especially a dead one. It takes thousands of people,
| the right people, given the right level of authority, making
| high-quality high-creativity decisions. It also, obviously,
| takes the daily process, an awe-inspiring global supply chain,
| automation systems, and these are areas that computers, and now
| AI, can have a high impact in. But that automation is a
| commodity now. Samsung has access to that same automation, and
| they make fridges and TVs; so why aren't they worth almost four
| trillion dollars?
|
| AI doesn't replace humans; it, like computers more generally
| before it, brings the process cost of the inhuman things it can
| automate to zero. When that cost is zero, AI cannot be a
| differentiating factor between two businesses. The
| differentiating factors, instead, become the capital the
| businesses already have to deploy (favoring of established
| players), and the humans who interact with the AI, interpreting
| and when necessary executing on its decisions.
| jandrese wrote:
| 1979 presentation at IBM:
|
| "A computer can never be held accountable. Therefore, a
| computer must never make a management decision."
|
| There are lots of bullshit jobs that we could automate away, AI
| or no. This is far from a new problem. Our current "AI"
| solutions promise to do it cheaper, but detecting and dealing
| with "hallucinations" is turning out to be more expensive than
| anticipated and it's not at all clear to me that this will be
| the silver bullet that the likes of Sam Altman claims it will
| be.
|
| Even if the AI solution makes fewer mistakes, the magnitude of
| those mistakes matters. The human might make transcription
| errors with patient data or other annoying but fixable clerical
| errors, while the AI may be perfect with transcription but make a
| completely sensible-sounding but ultimately nonsensical diagnosis,
| with dangerous consequences.
| warkdarrior wrote:
| 1953 IBM also thought that "there is a world market for maybe
| five computers," so I am not sure their management views are
| relevant this many decades later.
| simpaticoder wrote:
| _>...developing software to align with the principle that
| impactful software systems need to be trustworthy, which implies
| their development needs to be managed, transparent and
| accountable._
|
| The author severely discounts the value of opacity and
| unaccountability in modern software systems. Large organizations
| previously had to mitigate moral hazard with unreliable and
| burdened-with-conscience labor. LLM-style software is superior on
| every axis in this application.
| nbzs wrote:
| I am a simple man. In 2022 I glanced through "Attention Is All You
| Need" and forgot about it. A lot of people made money. A lot of
| people believed that the end of programmers and designers was
| absolute. Some people on stage announced the death of coding.
| Others bravely explored the future in which people are not needed
| for creative work.
|
| Aside from the anger that this public stupidity produced in me, I
| always knew that this day would come.
|
| Maybe next time someone will have the balls not to call a text
| generator with inherent hallucinations "intelligence"? Who knows.
| Miracles can happen. :)
| qaq wrote:
| To push something to the limit requires a lot of funding; if the
| public never got overexcited about some tech, many really cool
| things would never have been tried. Also, LLMs are pretty useful
| even as is. They sure made me more productive.
| nbzs wrote:
| I just imagine the world in which the industry defined by its
| deterministic nature and facts has the bravery to call a spade
| a spade. LLMs have a function. Machine learning also. But
| calling LLMs intelligence and pushing the hype into overdrive?
| godelski wrote:
| Overhype is what led to the last AI winter, because they
| created a railroad and didn't diversify. For some reason
| we're doing it again.
| peter_retief wrote:
| In retrospect it seems obvious now.
| uoaei wrote:
| For some. For others with fewer stars in their eyes, it was
| obvious from the beginning.
| dingnuts wrote:
| the launch of ChatGPT had an amount of hype that was
| downright confusing for someone who had previously
| downloaded and fine-tuned GPT-2. Everyone who hadn't used a
| language model said it was revolutionary but it was
| obviously evolutionary
|
| and I'm not sure the progress is linear, it might be
| logarithmic.
|
| genAI in its current state has some uses... but I fear that
| mostly ChatGPT is hallucinating false information of all
| kinds into the minds of uninformed people who think GPT is
| actually intelligent.
| uoaei wrote:
| Everyone who actually works on this stuff, and didn't
| have ulterior motives in hyping it up to (over)sell it,
| has been identifying themselves as such and providing
| context for the hype since the beginning.
|
| The furthest they got before the hype machine took over
| was introducing the term "stochastic parrot" to popular
| discourse.
| highfrequency wrote:
| Seems completely nonsensical. Yes, neural networks themselves are
| not unit testable, modular, symbolic or verifiable. That's why we
| have them produce _code artifacts_ - which possess all those
| traits and can be reviewed by both humans and other machines.
| It's completely analogous to human software engineers, who are
| unfortunately black boxes as well.
|
| More broadly, I've learned to attach 0 credence to any conceptual
| argument that an approach will _not_ lead somewhere interesting.
| The hit rate on these negative theories is atrocious, they are
| often motivated by impure reasons, and the downside is very
| asymmetric (who cares if you sidestep a boring path? yet how
| brutal is it to miss an easy and powerful solution?)
| omolobo wrote:
| [flagged]
| highfrequency wrote:
| A software engineer that costs $20/month instead of
| $20k/month, and gets meaningfully more knowledgeable about
| every field on earth each year?
| KronisLV wrote:
| Honestly, I think it's nothing special to say that certain
| technologies have an end point.
|
| We had lots of advancements in single core CPUs but eventually
| more than that was necessary, now the same is happening with
| monolithic chips vs chiplet designs.
|
| Same for something like HTTP/1.1 and HTTP/2 and now HTTP/3.
|
| Same for traditional rendering vs something like raytracing and
| other approaches.
|
| I assume it's the same for typical spell checking and writing
| assistants vs LLM based ones.
|
| That it's the same for typical autocomplete solutions vs LLM
| based ones.
|
| It does seem that there weren't former _technological_ solutions
| for images/animations/models etc. (maybe the likes of Mixamo and
| animation retargeting, but not much for replacing a concept
| artist for shops that can't afford one).
|
| Each technology, including the various forms of AI, has its
| limitations, regardless of how much money has been spent on
| training the likes of the models behind ChatGPT etc. Nothing wrong
| with that; I'll use LLMs for what they're good for and look for
| something else once new technologies become available.
| seydor wrote:
| There is no reason to privilege compositionality/modularity vs
| emergence. One day we may have the emergence of compositionality
| in a large model. It would be a dead end if this were provably not
| possible.
| Gud wrote:
| What exactly does current "AI" do?
|
| It roams around the internet, synthesizing sentences that kind of
| look the same as the source material, correct me if I'm wrong?
| There are a lot of adjustments being made to the models (by
| humans, mostly, I guess)?
|
| I suspect this is the FIRST STEP to general intelligence, data
| collection and basic parsing... I suspect there is not a thing
| called "reasoning" - but a multi step process... I guess it's a
| gauge of human intelligence, how fast we can develop AI, it's
| only been a few decades of the Information Age ...
| resoluteteeth wrote:
| > I suspect this is the FIRST STEP to general intelligence,
| data collection and basic parsing... I suspect there is not a
| thing called "reasoning" - but a multi step process... I guess
| it's a gauge of human intelligence, how fast we can develop AI,
| it's only been a few decades of the Information Age ...
|
| The question the article is posing isn't whether LLMs do some
| of the things we would want general AI to do or are a good first
| attempt by humans at creating something sort of like AI.
|
| The question is whether current machine learning
| techniques, such as LLMs, that are based on neural networks are
| going to hit a dead end.
|
| I don't think that's something anyone can answer for sure.
| AnimalMuppet wrote:
| LLMs, _by themselves_, are going to hit a dead end. They are
| not enough to be an AGI, or even a true AI. The question is
| whether LLMs can be a part of something bigger. That, as you
| say, is not something anyone can currently answer for sure.
| jandrese wrote:
| I've come around to thinking of our modern "AI" as a lossy
| compression engine for knowledge. When you ask a question it is
| just decompressing a tiny portion of the knowledge and
| displaying it for you, sometimes with compression artifacts.
|
| This is why I am not worried about the "AI Singularity" like
| some notable loudmouth technologists are. At least not with our
| current ML technologies.
| Gud wrote:
| Absolutely agree.
| red75prime wrote:
| With a bit (OK, a lot) of reinforcement learning that
| prioritizes the best chains-of-thoughts, this compression
| engine becomes a generator of missing training data on how to
| actually think about something instead of trying to come up
| with the answer right away as internet text data suggests it
| should do.
|
| That's the current ML technology. What you've described is
| the past. About a 4-year-old past, to be precise.
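|
| A rough sketch of the best-of-N flavor of that idea (my own toy
| simplification; `generate_cot` and `reward` are placeholders for
| a sampled chain of thought and whatever scorer ranks them):
|
|     import random
|
|     def best_of_n(generate_cot, reward, prompt, n=8):
|         """Sample n chains of thought, keep the best-scoring
|         one; kept traces can become new training data."""
|         candidates = [generate_cot(prompt) for _ in range(n)]
|         return max(candidates, key=reward)
|
|     # Toy stand-ins: pretend longer reasoning scores higher.
|     toy_cot = lambda p: p + " step" * random.randint(1, 5)
|     print(best_of_n(toy_cot, reward=len, prompt="2+2?"))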
| cruffle_duffle wrote:
| That is exactly how I think about it. It's lossy compression.
| Think about how many petabytes of actual information any of
| these LLMs were trained on. Now look at the size of the
| resultant model. Its orders of magnitude smaller. It made it
| smaller by clipping the high frequency bits of some multi-
| billion dimension graph of knowledge. Same basic you do with
| other compression algorithms like JPEG or MP3.
|
| These LLMs are just lossy compression for knowledge. I think
| the sooner that "idea" gets surfaced, the sooner people will find
| ways to train models with fixed pre-computed lookup tables of
| knowledge categories and association properties... basically
| taking a lot of the randomness out of the training process
| and getting more precise about what dimensions of knowledge
| and facts are embedded into the model.
|
| ... or something like that. But I don't think this
| optimization will be driven by the large, well-funded tech
| companies. They are too invested in flushing money down the
| drain with more and more compute. Their huge budgets blind
| them to other ways of doing the same thing with significantly
| less.
|
| The future won't be massive large language models. They'll be
| "small language models" custom tuned to specific tasks.
| You'll download or train a model that has incredible
| understanding of Rust and Django but won't know a single
| thing about plate tectonics or apple pie recipes.
| wodderam wrote:
| Why wouldn't we have a small language model for Python
| programming now though?
|
| That is an obvious product. I would suspect the reason we
| don't have a small language Python model is that the
| fine-tuned model is no better than the giant general-
| purpose model.
|
| If that is the case, it is not good. It even makes me wonder
| whether we are not really compressing knowledge but using a hack
| to create the illusion of compressing knowledge.
| fetas wrote:
| Ye Dr r
| trane_project wrote:
| I think most AI research up to this day is a dead end. Assuming
| that intelligence is a problem solvable by computers implies that
| intelligence is a computable function. Nobody up to this day has
| been able to give a formal mathematical definition of
| intelligence, let alone a proof that it can be reduced to a
| computable function.
|
| So why assume that computer science is the key to solving a
| problem that cannot even be defined in terms of math? We had
| formal definitions of computers decades before they became a
| reality, but somehow cannot make progress in formally defining
| intelligence.
|
| I do think artificial intelligence can be achieved by making
| artificial intelligence a multidisciplinary endeavor with
| biological engineering at its core, not computer science. See the
| work of Michael Levin to see real intelligence in action:
| https://www.youtube.com/watch?v=Ed3ioGO7g10
| leesec wrote:
| Marcus Hutter did
| Xunjin wrote:
| Could you point out where we could find the related info?
| trane_project wrote:
| Thanks for pointing me to this. This is a proposed
| definition of intelligence. Is it the same as the real thing,
| though? Even assuming that it was:
|
| > Like Solomonoff induction, AIXI is incomputable.
|
| That would mean that computers can, at best, produce an
| approximation. We know the real thing exists in nature
| though, so why not take advantage of those competencies?
| lanza wrote:
| > Nobody up to this day has been able to give a formal
| mathematical definition of intelligence, let alone a proof that
| it can be reduced to a computable function.
|
| We can't prove the correctness of the plurality of physics.
| Should we call that a dead end too?
| Davidbrcz wrote:
| This is actually a philosophical question!
|
| If you believe in functionalism (~mental states are identified
| by what they do rather than by what they are made of), then
| current AI is _not_ a dead end.
|
| We wouldn't need to define intelligence; just making it big and
| efficient enough to replicate what currently exists would
| be intelligence by that definition.
| trane_project wrote:
| My point is that if you use biological cells to drive the
| system, which already exhibit intelligent behaviors, you
| don't have to worry about any of these questions. The basic
| unit you are using is already intelligent, so it's a given
| that the full system will be intelligent. And not an
| approximation but the real thing.
| red75prime wrote:
| Humanities? Have you chosen humanities as electives?
| hanniabu wrote:
| Current AI should be referred to as collective intelligence
| since it needs to be trained and only knows what's been written
| leesec wrote:
| Incredibly ignorant set of replies on this thread lol. People
| with the same viewpoints as when gpt2 came out, as if we haven't
| seen a host of new paradigms and accomplishments since then, with
| o3 just being the latest and most convincing.
| logicchains wrote:
| Let them have their fun while they can, it's gonna get pretty
| bleak in the next 5-10 years when coding jobs are being
| replaced left and right by bots that can do the work better and
| cheaper.
| boshalfoshal wrote:
| Maybe you're retired or not a SWE or knowledge worker
| anymore, but I have a decent amount of concern about this
| future.
|
| As a society, we have not even begun to think about what
| happens when large swathes of the population become
| unemployed. Everyone says they'd love to not work, but no one
| says they can survive without money. Our society trades labor
| for money. And I have very little faith in our society or the
| government to alleviate this through something like UBI.
|
| Previously it was physical work that was made more efficient,
| but the one edge we thought we would always have as humans -
| our creativity and thinking skills - is also being displaced.
| And on top of that, it's fairly clear that the leaders in the
| space (apart from maybe Anthropic?) are doing this purely from a
| capitalist-driven, profit-first motivation.
|
| I for one think the world will be a worse place for a few
| years immediately after AGI/ASI.
| asdff wrote:
| Why not just hire out of Bangladesh though?
| mrshadowgoose wrote:
| It's deeply saddening to see how fixated people are on the
| here-and-now, while ignoring the terrifying rate of progress,
| and its wide-ranging implications.
|
| We've gone from people screeching "deep learning has hit its
| limits" in 2021 to models today that are able to reason within
| limited, but economically relevant contexts. And yet despite
| this, the same type of screeching continues.
| martindbp wrote:
| It's the same kind of people who claimed human flight would
| not be possible for 10,000 years in 1902. I just can't
| understand how narrow your mind has to be in order to be this
| skeptical.
| sealeck wrote:
| Or the same kind of people who claimed Theranos was a scam,
| or that AI in the 70s wasn't about to produce Terminator
| within a few years, or that the .com bubble was in fact a
| bubble...
| zeroonetwothree wrote:
| Maybe some of us aren't actually impressed with the
| "progress" since 2022? Doing well at random benchmarks hasn't
| noticeably improved capability in use for work.
|
| Does that mean it will never improve? Of course not. But
| don't act like everyone else is some kind of moron.
| dyauspitr wrote:
| They're scared (as am I) but I have no illusions about the
| usefulness of these LLMs. Everyone on my team uses them to get
| their tickets done in a fraction of the time and then just sit
| around till the sprint ends.
| aerhardt wrote:
| The innovation in foundational models is far outpacing the
| applications. Other than protein folding (which is not only
| LLMs AFAIK) I haven't seen a single application that blows my
| mind. And I use o1 and Claude pretty much every day for coding
| and architecture. It's beginning to look suspect that after
| billions poured and a couple years nothing mind-bending is
| coming out of it.
| spoaceman7777 wrote:
| Yeah, sounds like people are encountering a lot of PEBCAK
| errors in this thread. You get out of LLMs what you put into
| them, and the complaints, at this point, are more an admission
| of an inability to learn the new tools well.
|
| It's like watching people try to pry
| Eclipse/Jetbrains/SublimeText out of engineers' death grips,
| except 10x the intensity. (I still use Jetbrains fyi :p)
| boshalfoshal wrote:
| Well that's the argument most people here are making - that
| current LLMs are not good enough to be fully autonomous
| precisely because a human operator has to "put the right
| thing into them to get the right thing out."
|
| If I'm spending effort specifying a problem N times in very
| specific LLM-instruction-language to get the correct output
| for some code, I'd rather just write the code myself. After
| all, that's what code is for. English is lossy, code isn't. I
| can see codegen getting even better in larger organizations
| if context windows are large enough to have a significant
| portion of the codebase in it.
|
| There are areas where this is immediately better, though
| (customer feedback, subjective advice, small sections of
| sandboxed/basic code, etc). Basically, areas where the
| effects of information compression/decompression can be
| tolerated or passed onto the user to verify.
|
| I can see all of these getting better in a couple of
| months/few years.
| roody15 wrote:
| What I find interesting is that current LLMs are based primarily
| on written data, which is already an abstraction/abbreviation of
| most observed phenomena.
|
| What happens when AI starts to send out its own drones or perhaps
| robots and tries to gather and train on data it observes
| itself?
|
| I think we may be closer to this point than we realize... the
| results of AI could get quite interesting once a human-level
| abstraction of knowledge is perhaps reduced.
| 383toast wrote:
| Wouldn't the work on interpretability solve these concerns?
| abeppu wrote:
| > Eerke Boiten, Professor of Cyber Security at De Montfort
| University Leicester, explains his belief that current AI should
| not be used for serious applications.
|
| > In my mind, all this puts even state-of-the-art current AI
| systems in a position where professional responsibility dictates
| the avoidance of them in any serious application.
|
| > Current AI systems also have a role to play as components of
| larger systems in limited scopes where their potentially
| erroneous outputs can be reliably detected and managed, or in
| contexts such as weather prediction where we had always expected
| stochastic predictions rather than certainty.
|
| I think it's important to note that:
|
| - Boiten is a security expert, but doesn't have a background
| working in ML/AI
|
| - He never defines what "serious application" means, but
| apparently systems that are designed to be tolerant of missed
| predictions are not "serious".
|
| He seems to want to trust a system at the same level that he
| trusts a theorem proved with formal methods, etc.
|
| I think the frustrating part of this article is that from a
| security perspective, he's probably right about his
| recommendations, but he seems off-base in the analysis that gets
| him there.
|
| > Current AI systems have no internal structure that relates
| meaningfully to their functionality. They cannot be developed, or
| reused, as components.
|
| Obviously AI systems _do_ have internal structure, and there are
| re-usable components both at the system level (e.g. we pick an
| embedding, we populate some vector DB with contents using that
| embedding, and create a retrieval system that can be used in
| multiple ways). The architecture of models themselves also has
| components which are reused, and we make choices about when to
| keep them frozen versus when to retrain them. Any look at
| architecture diagrams in ML papers shows one level of these
| components.
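|
| A toy sketch of that component reuse (my own illustration; the
| `embed` function is a fake stand-in for a real embedding model,
| and the "vector DB" is just a dict):
|
|     import math
|
|     def embed(text):
|         # Fake embedding: normalized letter counts.
|         vec = [0.0] * 26
|         for ch in text.lower():
|             if 'a' <= ch <= 'z':
|                 vec[ord(ch) - 97] += 1.0
|         norm = math.sqrt(sum(x * x for x in vec)) or 1.0
|         return [x / norm for x in vec]
|
|     store = {}  # doc id -> embedding vector
|
|     def index(doc_id, text):
|         store[doc_id] = embed(text)
|
|     def search(query, k=3):
|         q = embed(query)
|         scores = {d: sum(a * b for a, b in zip(q, v))
|                   for d, v in store.items()}
|         return sorted(scores, key=scores.get, reverse=True)[:k]
|
| The same embed/index/search pieces can back RAG context lookup,
| "related documents", deduplication, and so on, which is the sense
| in which these systems do have reusable components.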
|
| > exponential increases of training data and effort will give us
| modest increases in impressive plausibility but no foundational
| increase in reliability.
|
| I think really the problem is that we're fixated on mostly-
| solving an ever broader set of problems rather than solving the
| existing problems more reliably. There are plenty of results about
| ensembling and learning theory that give us a direction to
| increase reliability (by paying for more models of the
| same size), but we seem far more interested in seeing whether we
| can solve problems at a higher level of sophistication most of the
| time. That's a choice that we're making. Similarly,
| Boiten mentions the possibility of models with explicit
| confidences -- there's been plenty of work on that, but because
| there's a tradeoff with model size (i.e. do you want to spend
| your resources on a bigger model, or on explicitly representing
| variance around a smaller number of parameters?) people
| seem mostly uninterested.
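|
| The ensembling direction can be as simple as majority voting over
| several independent samples (a sketch of my own, under the
| assumption that answers can be compared for equality; `models` is
| any list of prompt-to-answer callables):
|
|     from collections import Counter
|
|     def majority_vote(models, prompt):
|         # Reliability is bought with extra compute (more
|         # models/samples) rather than a bigger model.
|         answers = [m(prompt) for m in models]
|         return Counter(answers).most_common(1)[0][0]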
|
| I think there are real reasons to be concerned about the specific
| path we're on, but these aren't the good ones.
| saltysalt wrote:
| I think it does represent a dead end, but not for the reasons
| presented in this article.
|
| The real issue in my opinion is that we will hit practical limits
| with training data and computational resources well before AGI
| turns us all into paperclips, basically there is no "Moore's Law"
| for AI and we are already slowing down using existing models like
| GPT.
|
| We are in the vertical scaling phase of AI model development,
| which is not sustainable long-term.
|
| I discussed this further here for anyone interested:
| https://techleader.pro/a/658-There-is-no-Moore's-Law-for-AI-...
| NoGravitas wrote:
| > The real issue in my opinion is that we will hit practical
| limits with training data and computational resources well
| before AGI turns us all into paperclips [...]
|
| I think you are correct, but also I think that even if that
| were not the case, the Thai Library Problem[1] strongly
| suggests that AGI will have to be built on something other than
| LLMs (even if LLM-derived systems were to serve as an interface
| to such systems).
|
| [1]: https://medium.com/@emilymenonbender/thought-experiment-
| in-t...
| ItCouldBeWorse wrote:
| Should replace it with head cheese:
| http://www.technovelgy.com/ct/content.asp?Bnum=687
| slow_typist wrote:
| Why is it that LLMs are 'stochastic'? Shouldn't the same input
| lead to the same output? Is the LLM somehow modifying itself in
| production? Or is it just flipping bits caused by cosmic
| radiation?
| ijustlovemath wrote:
| They probabilistically choose an output. Check out 3b1b's
| series on LLMs for a better understanding!
| fnl wrote:
| Mixture of Experts models (as GPTs are) can produce
| different results for an input sequence if that sequence is
| retried together with a different set of sequences in its
| inference batch, because the model ("expert") routing
| depends on the batch, not on the single sequence:
| https://152334h.github.io/blog/non-determinism-in-gpt-4/
|
| And in general, binary floating point arithmetic cannot
| guarantee associativity - i.e. `(a + b) + c` might not be the
| same as `a + (b + c)`. That in turn can lead to the model
| picking another token in rare cases (and, through its auto-
| regressive consequences, the entire remainder of the generated
| sequence might differ): https://www.ingonyama.com/blog/solving-
| reproducibility-chall...
|
| Edit: Of course, my answer assumes you are asking about the
| case when the model lets you set its token generation
| temperature (stochasticity) to exactly zero. With default
| parameter settings, all LLMs I know of randomly pick among the
| best tokens.
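|
| The non-associativity point is easy to demonstrate directly (a
| tiny example of my own, not from the linked post):
|
|     # Summing the same values in a different order (e.g. a
|     # different batch layout) can change the result slightly,
|     # occasionally enough to flip the argmax token.
|     a, b, c = 1e16, -1e16, 1.0
|     print((a + b) + c)  # 1.0
|     print(a + (b + c))  # 0.0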
| sroussey wrote:
| They always return the same output for the same input. That is
| how tests are done for llama.cpp, for example.
|
| To get variety, you give each person a different seed. That way
| each user gets consistent answers but different than each
| other. You can add some randomness in each call if you don't
| want the same person getting the same output for the same
| input.
|
| It would be impossible to test and benchmark llama.cpp et al
| otherwise!
|
| By the time you get to a UI someone has made these decisions
| for you.
|
| It's just math underneath!
|
| Hope this helps.
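|
| A toy illustration of the seed point (my own sketch, not
| llama.cpp's actual test harness): with a fixed seed the sampled
| token is reproducible; without one it may vary per call.
|
|     import random
|
|     def sample_token(candidates, seed=None):
|         # Stand-in for temperature sampling over token weights.
|         rng = random.Random(seed)
|         toks = list(candidates)
|         wts = list(candidates.values())
|         return rng.choices(toks, weights=wts)[0]
|
|     cands = {"cat": 0.5, "dog": 0.3, "eel": 0.2}
|     a = sample_token(cands, seed=42)
|     b = sample_token(cands, seed=42)
|     print(a == b)               # True: same seed, same output
|     print(sample_token(cands))  # unseeded: may differ per run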
| ohxh wrote:
| "One could offer so many examples of such categorical prophecies
| being quickly refuted by experience! In fact, this type of
| negative prediction is repeated so frequently that one might ask
| if it is not prompted by the very proximity of the discovery that
| one solemnly proclaims will never take place. In every period,
| any important discovery will threaten some organization of
| knowledge." Rene Girard, Things Hidden Since the Foundation of
| the World, p. 4
| juped wrote:
| It's only a dead end if you bought the hype; it's actually
| useful! Just not all-powerful.
| everdrive wrote:
| I hope so because I'm extraordinarily sick of the technology. I
| can't really ask a question at work without some jackass posting
| an LLM answer in there. The answers almost never amount to
| anything useful, but no one can tell since it looks clearly
| written. They're "participating" but haven't actually done
| anything worthwhile.
| yashap wrote:
| I hope so, but for different reasons. Agreed they spit out
| plenty of gibberish at the moment, but they've also progressed
| so far so fast it's pretty scary. If we get to a legitimate
| artificial general super intelligence, I'm about 95% sure that
| will be terrible for the vast, vast majority of humans, we'll
| be obsolete. Crossing my fingers that the current AI surge
| stops well short of that, and the push that eventually does get
| there is way, way off into the future.
| youssefabdelm wrote:
| Or liberating... as Douglas Rushkoff puts it.
|
| If and only if something like high-paying UBI comes along,
| and people are freed to pursue their passions and as a
| consequence, benefit the world much more intensely.
| amarcheschi wrote:
| everything points to the opposite
| youssefabdelm wrote:
| It may be impossible in this world to expect a form of
| donation, but it is certainly not impossible to expect
| forms of investment.
|
| One idea I had is everyone is paid a thriving wage, and
| in exchange, if they in the future develop their passion
| into something that can make a profit, they pay back 20%
| of the profits they make, up to some capped amount.
|
| This allows for extreme generality. It truly frees people
| to pursue whatever they fancy every day until they catch
| lightning in a bottle.
|
| There would be 0 obligation as to _what_ to do, and when
| to pay back the money. But of course it would have to be
| open only to honest people, so that neither side is
| exploiting the other.
|
| Both sides need a sense of gratitude and a desire to give
| back: a philanthropic 'flair' of "if it doesn't work out,
| it's okay" on the giver's side, and gratitude and a wish to
| give back someday on the side of the receiver, as they
| continue working on probably the most resilient thing they
| could ever work on (the safest investment), their lifelong
| passion.
| gershy wrote:
| I'm not sure passion exists in a world without struggle...
| kerkeslager wrote:
| The idea that AI will _ever_ remove _all_ struggle, even
| if it reaches AGI, is absurd. AI by itself can't give
| you a hug, for example--and even if advances in robotics
| make it possible for an AI-controlled _robot_ to do that,
| there are dozens of unsolved problems beyond that to make
| that something that most people would even want.
|
| AI enthusiasm really is reaching a religious level of
| ridiculous beliefs at this point.
| smallmancontrov wrote:
| "I only make you struggle because I love you!"
|
| (Mmmhmm, I'm sure the benefits received by the people on
| top have nothing to do with it.)
| diego_sandoval wrote:
| I'm not sure if that is something we actually would want.
|
| Lots of people certainly think they want that.
| Hasu wrote:
| Why wouldn't you want it, unless you are currently
| benefiting from employing people who would rather be
| doing literally anything else?
| szundi wrote:
| Even then he'll probably like employing AI more.
|
| Lots of new taxes and UBI!
| hackinthebochs wrote:
| For the vast majority of people, getting rid of necessary
| work will usher in an unprecedented crisis of meaning.
| Most people aren't the type to pursue creative ends if they
| didn't have to work. They would veg out or engage in
| degenerate activities. Many people have their identity
| wrapped up in the work they do, or in being a provider. Taking
| this away without having something to replace it with
| will be devastating.
| pixl97 wrote:
| >They would veg out or engage in degenerate activities
|
| "Oh no the sinners might play video games all day"
|
| I do expect the next comment would be something like
| "work is a path to godliness"
| hackinthebochs wrote:
| >I do expect the next comment would be something like
| "work is a path to godliness"
|
| And you think these kinds of maxims formed out of
| vacuums? They are the kinds of sayings that are formed
| through experience re-enforced over generations. We can't
| just completely reject all historical knowledge encoded
| in our cultural maxims and expect everything to work out
| just fine. Yes, it is true that most people not having
| productive work will fill the time with frivolous or
| destructive ends. Modernity does not mean we've somehow
| transcended our historical past.
| EMIRELADERO wrote:
| > They are the kinds of sayings that are formed through
| experience re-enforced over generations.
|
| Sure, but the whole point is that the conditions that led
| to those sayings would no longer be there.
|
| Put a different way: those sayings and attitudes were
| necessary in the first place because society needed
| people to work in order to sustain itself. In a system
| where individual human work is no longer necessary, of
| what use is that cultural attitude?
| hackinthebochs wrote:
| It wasn't just about getting people to work, but keeping
| people from degenerate and/or anti-social behavior.
| Probably the single biggest factor in the success of a
| society is channeling young adult male behavior towards
| productive ends. Getting them to work is part of it, but
| also keeping them from destructive behavior. In a world
| where basic needs are provided for automatically, status-
| seeking behavior doesn't evaporate, it just no longer has
| a productive direction that anyone can make use of. Now
| we have idle young men at the peak of their status-
| seeking behavior with little productive avenues available
| to them. It's not hard to predict this doesn't end well.
|
| Beyond the issues of young males, there are many other ways
| for degenerate behavior to cause problems. Drinking,
| gambling, drugs, being a general nuisance, all these
| things will skyrocket if people have endless time to
| fill. Just during the pandemic, we saw the growth of
| roving gangs riding ATVs in some cities causing a serious
| disturbance. Some cities now have a culture of teenagers
| hijacking cars. What happens to these people who are on
| the brink when they no longer see the need to go to
| school because their basic needs are met? Nothing good,
| that's for sure.
| EMIRELADERO wrote:
| What exactly do you think would happen? Usually wars are
| about resources. When resource distribution stops being a
| problem (i.e, anyone can live like a king just by
| existing), where exactly does a problem manifest?
|
| All the "degenerate activities" you mentioned are a
| problem in the first place because in a scarcity-based
| society they slow down/prevent people from working,
| therefore society is worse off. That logic makes no sense
| in a world where people don't need to put a single drop
| of effort for society to function well.
| hackinthebochs wrote:
| >All the "degenerate activities" you mentioned are a
| problem in the first place because in a scarcity-based
| society they slow down/prevent people from working
|
| This is a weird take. Families are worse off if a parent
| has an addiction because it potentially makes their lives
| a living hell. Everyone is worse off if people feel
| unsafe because of a degenerate sub-culture that glorifies
| things like hijacking cars. People who don't behave in
| predictable ways create low-trust environments which
| impacts everyone.
| Hasu wrote:
| > And you think these kinds of maxims formed out of
| vacuums?
|
| Do you think they've always existed in all human cultures
| throughout time?
|
| The pro-work ethic is fairly new in human civilization.
| Previous cultures considered it to be a burden or
| punishment, not the source of moral virtue.
|
| > Yes, it is true that most people not having productive
| work will fill the time with frivolous or destructive
| ends.
|
| And that's fine! A lot of people fill their time at work
| with frivolous or destructive ends, whether on their own
| or at the behest of their employer.
|
| Not all work is productive. Not all work is good. It
| isn't inherently virtuous and its lack is not inherently
| vicious.
| cortesoft wrote:
| > And you think these kinds of maxims formed out of
| vacuums?
|
| No, they formed in societies where it WAS necessary for
| most people to work in order to support the community. We
| needed a lot of labor to survive, so it was important to
| incentivize people to work hard, so our cultures
| developed values around work ethics.
|
| As we move more and more towards a world where we
| actually don't need everyone to work, those moral values
| become more and more outdated.
|
| This is just like old religious rules around eating
| certain foods; in the past, we were at risk from a lot of
| diseases and avoiding certain foods was important for our
| health. Now, we don't face those same risks so many
| people have moved on from those rules.
| hackinthebochs wrote:
| >those moral values become more and more outdated.
|
| Do you think there was ever a time in human societies
| where the vast majority of people didn't have to "work"
| in some capacity, at least since the rise of
| psychologically modern humans? If not, why think humanity
| as a whole can thrive in such an environment?
| cortesoft wrote:
| Our environment today is completely different than it was
| even 100 years ago. Yes, you have to ask this question
| for every part of modern society (fast travel,
| photographs, video, computers, antibiotics, vaccines,
| etc), so I am not sure why work is different.
| hackinthebochs wrote:
| Part of the problem is that we don't ask these questions
| when we should be. Social media, for example, represents
| a unique assault on our psychological makeup that we just
| uncritically unleashed on the world. We're about to do it
| again, likely with even worse consequences.
| youssefabdelm wrote:
| Good. Finally they'll realize the meaninglessness of
| their work and how they've been exploited in the most
| insidious way. To the point of forgetting to answer the
| question of what it is they most want to do in life.
|
| The brain does saturate eventually and gets bored. Then
| the crisis of meaning. Then something meaningful emerges.
|
| We're all gonna die. Let's just enjoy life to the
| fullest.
| gehwartzen wrote:
| The way most of the world is setup we will need to first
| address the unprecedented crisis of financing our day to
| day lives. We figure that out and I'm sure people will
| find other sources of meaning in their lives.
|
| The people that truly enjoy their work and obtain meaning
| from it are vastly over represented here on HN.
|
| Very few would be scared of AI if they had a financial
| stake in its implementation.
| drdaeman wrote:
| That requires achieving post-scarcity to work in practice
| and be fair, though. If achievable, it's not clear how it
| relates to AGI. I mean, there's plenty of intelligence on
| this planet already, and resources are still limited - and
| it's not like AGI would somehow change that.
| roboboffin wrote:
| One thing I thought recently, is that a large amount of
| work is currently monitoring and correcting human
| activity. Corporate law, accounting, HR and services etc.
| If we have AGI that is forced to be compliant, then all
| these businesses disappear. Large companies are suddenly
| made redundant, regardless of whether they replace their
| staff with AI or not.
| drdaeman wrote:
| I agree that if true AGI happens (current systems still
| cannot reason at all, only pretend to do so) and if it
| comes out cheaper to deploy and maintain, that would mean
| a lot of professions could be automated away.
|
| However, I believe this has already happened quite a few
| times in history - industries becoming obsolete with
| technological advances isn't anything new. This creates
| some unrest as society needs to transition, but those
| people are always learning a different profession. Or
| retire if they can. Or try to survive some other way
| (which is bad, of course).
|
| It would be nice, of course, if everyone won't have to
| work unless they feel the need and desire to do so. But
| in our reality, where the resources are scarce and their
| distribution in a way that everyone will be happy is a
| super hard unsolved problem (and AGI won't help here -
| it's not some Deus ex Machina coming to solve world
| problems, it's just a thinking computer), I don't see a
| realistic and fair way to achieve this.
|
| Put simply, all the reasons we cannot implement UBI now
| will still remain in place - AGI simply won't help with
| this.
| shadowerm wrote:
| How can one not understand that UBI is captured by
| inflation?
|
| It's just a modern religion, really, because anyone can
| understand this; it is so basic and obvious.
|
| You don't have to point out some bullshit captured study
| that says otherwise.
| Aerroon wrote:
| Inflation is a lack of goods for a given demand, though. I.e.,
| if we can flood the world with cheap goods, then inflation
| inflation won't happen. That would make practical UBI
| possible. To some extent it has already happened.
| NumberWangMan wrote:
| My intuition, based on what I know of economics, is that
| a UBI policy would have results something like the
| following:
|
| * Inflation, things get more expensive. People attempt to
| consume more, especially people with low income.
|
| * People can't consume more than is produced, so prices go up.
|
| * People who are above the break-even line (when you factor
| in the taxes) consume a bit less, or stay the same and
| just save less or reduce investments.
|
| * Producers, seeing higher prices, are incentivized to produce
| more. Increases in production tend to be concentrated toward
| the things that people who were previously very income-
| limited want to buy. I'd expect a good bit of that to be
| basic essentials, but of course it would include lots of
| different things.
|
| * The system reaches a new equilibrium, with the allocation
| of produced goods being a bit more aimed toward the things
| regular people want, and a bit less toward luxury goods for
| the wealthy.
|
| * Some people quit work to take care of their kids full-time.
| The change in wages of those who stay working depends heavily
| on how competitive their skills are -- some earn less,
| but with the UBI still win out. Some may actually get
| paid more even without counting the UBI, if a lot of
| workers in their industry have quit due to the UBI, and
| there's increased demand for the products.
|
| * Prices have risen, but not enough to cancel out one's
| additional UBI income entirely. It's very hard to say how
| much would be eaten up by inflation, but I'd expect it's not
| 10% or 90%, probably somewhere in between. Getting an accurate
| figure for that would take a lot of research and modeling.
|
| Basically, I think it's complicated, with all the second
| and third-order effects, but I can't imagine a situation
| where so much of the UBI is captured by inflation that it
| makes it pointless. I do think that as a society, we
| should be morally responsible for people who can't earn a
| living for whatever reason, and I think UBI is a better
| system than a patchwork of various services with onerous
| requirements that people have to put a lot of effort into
| navigating, and where finding gainful employment will
| cause you to lose benefits.
| smallmancontrov wrote:
| It doesn't have to be super, it just has to inflect the long
| term trend of labor getting less relevant and capital getting
| more relevant.
|
| We've made an ideology out of denying this and its
| consequences. The fallout will be ugly and the adjustment
| will be painful. At best.
| cactusplant7374 wrote:
| I think of ChatGPT as a faster Google or Stackoverflow and
| all of my colleagues are using it almost exclusively in this
| way. That is still quite impressive but it isn't what Altman
| set out to achieve (and he admits this quite candidly).
|
| What would make me change my mind? If ChatGPT could take the
| lead on designing a robot through all the steps - design,
| contracting the parts and assembly, marketing it, and selling
| it - that would really be something.
|
| I assume for something like this to happen it would need all
| source code and design docs from Boston Dynamics in the
| training set. It seems unlikely it could independently make
| the same discoveries on its own.
| randmeerkat wrote:
| > I assume for something like this to happen it would need
| all source code and design docs from Boston Dynamics in the
| training set. It seems unlikely it could independently make
| the same discoveries on its own.
|
| No, to do this it would need to be able to independently
| reason, if it could do that, then the training data stops
| mattering. Training data is a crutch that makes these algos
| appear more intelligent than they are. If they were truly
| intelligent they would be able to learn independently and
| find information on their own.
| markus_zhang wrote:
| It's already impacting some of us. I hope it never appears
| until human civilization undergoes a profound change. But
| I'm afraid many rich people want that to happen.
|
| It's the real Great Filter in the universe IMO.
| rbetts wrote:
| I believe (most) people direct their ambitions toward nurturing
| safe, peaceful, friend-filled communities. AGI won't obsolete
| those human desires. Hopefully we weather the turbulence that
| comes with change and come out the other side with new tools
| that enable our pursuits. In the macro, that's been the case.
| I am grateful to live in a time of literacy, antibiotics,
| sanitation, electricity... and am optimistic that if AGI
| emerges, it joins that list of human empowering creations.
| szundi wrote:
| Wise words, thank you.
| jeezfrk wrote:
| Current AI degrades totally unlike human experts. It also, by
| design, must lag its data input.
|
| Anything innovated must come from outside or have a very
| close permutation to be found.
|
| Generative AI isn't scary at all now. It is merely rolling
| dice on a mix of other tech and rumors from the internet.
|
| The data can be wrong or old...and people keep important
| secrets.
| hmottestad wrote:
| Gotta wonder if Google has used code from internal systems
| to train Gemini? Probably not, but at what point will
| companies start forking over source code for LLM training
| for money?
| throwuxiytayq wrote:
| It seems much cheaper, safer legally and more easily
| scalable to simply synthesize programs. Most code out
| there is shit anyway, and the code you can get by the GB
| especially so.
| guerrilla wrote:
| > I'm about 95% sure that will be terrible for the vast, vast
| majority of humans, we'll be obsolete.
|
| This isn't a criticism of you, but this is a very stupid idea
| that we have. The economy is meant to _serve_ _us_. If it
| can't, we need to completely re-organize it because the old
| model has become invalid. We shouldn't exist to serve the
| economy. That's an absolutely absurd idea that needs to be
| killed in every single one of us.
| insane_dreamer wrote:
| > we need to completely re-organize it because the old
| model has become invalid
|
| that's called social revolution, and those who benefit from
| the old model (currently that would be the holders of
| capital, and more so as AI grows in its capabilities and
| increasingly supplants human labor) will do everything in
| their power to prevent that re-organization
| jprete wrote:
| The economy isn't meant to serve us. It's an emergent
| system that evolves based on a complex incentive structure
| and its own contingent history.
| guerrilla wrote:
| Economic activity is meant to serve us. Don't be a
| pedant.
| baq wrote:
| Nevertheless the modern economy has been deliberately
| designed. Emergent behaviors within it at the highest
| levels are actively monitored and culled when deemed not
| cost effective or straight out harmful.
| hackinthebochs wrote:
| This doesn't engage with the problem of coordinating
| everyone around some proposed solution and so is useless.
| Yes, if we could all just magically decide on a better
| system of government, everything would be great!
| guerrilla wrote:
| Identifying the problem is never useless. We need the
| right understanding if we're going to move forward.
| Believing we serve the economy and not the other way
| around hinders any progress on that front and so
| inverting it is a solid first step.
| achierius wrote:
| Very true but the question, as always, is by what means we
| can enact this change? The economy may well continue to
| serve the owner class even if all workers are replaced with
| robots.
| guerrilla wrote:
| I think the options are pretty clear. A negotiation of
| gradual escalation: Democracy, protests, civil
| disobedience, strikes, sabotage and if all else fails
| then at some point, warfare.
| BurningFrog wrote:
| Workers have been replaced with machines many times over
| the last 250 years, and these fears have always been
| widespread, but never materialized.
|
| I concede that this time it _could_ be different, but I'd be
| very surprised while I starved to death.
| fny wrote:
| The problem is no one is talking about this. We're clearly
| headed towards such a world, and it's irrelevant whether
| this incarnation will completely achieve that.
|
| And anyone who poo poos ChatGPT needs to remember we went
| from "this isn't going to happen in the next 20 years" to
| "this is happening tomorrow" overnight. It's pretty obvious
| I'm going to be installing Microsoft Employee Service Pack
| 2 in my lifetime.
| kbr- wrote:
| The economy is meant to serve some people; some people take
| out of economy more than they give, some people give more
| than they take.
| llamaz wrote:
| A position shared by both Lenin and Thatcher
| danielovichdk wrote:
| Great theory. In reality the vast majority of us serve only
| the economy without getting anything truly valuable in
| return. We serve it, without noticing it, growing into
| shells that are less human and more merely individual.
| Machines of the Economy.
| accra4rx wrote:
| Think more deeply: who benefits from superintelligence? In
| the end it is a game of what humans naturally desire. AI has
| no incentives and is not controlled by hormones.
| jay_kyburz wrote:
| It's not _that_ scary. I kind of like the idea of going out
| to the country and building a permaculture garden to feed
| myself and my family.
| wiml wrote:
| Until you try and you find that all the arable land is
| already occupied by industrial agriculture, the
| ADMs/Cargills of the world, using capital intensive brute
| force uniformity to extract more value from the land than
| you can compete with, while somehow simultaneously treating
| the earth destructively and inefficiently.
|
| This is both a metaphor for AGI and not a metaphor at all.
| skulk wrote:
| Sure, if you can survive the period between the
| obsolescence of human labor and the achievement of post-
| scarcity. Do you really think that period of time is zero,
| or that the first version of a post-scarcity economy will
| be able to carry the current population? No, such a
| transition implies a brutish end for most.
| BurningFrog wrote:
| Your problem may be with those jackasses at work.
|
| I get very useful answers from ChatGPT several times a day. You
| need to verify anything important, of course. But that's also
| true when asking people.
| zeroonetwothree wrote:
| There are some people I trust on certain topics such that I
| don't really need to verify them (and it would be a tedious
| existence to verify _everything_).
| amanaplanacanal wrote:
| Exactly. If you don't trust anybody, who would you verify
| with?
| dheatov wrote:
| I have never personally met any malicious actor that
| knowingly dumps unverified shit straight from GPT. However, I
| have met people IRL who gave way too much authority to those
| quantized model weights and got genuinely confused when the
| generated text didn't agree with human-written technical
| information.
|
| To them, chatgpt IS the verification.
|
| I am not optimistic about the future. But perhaps some
| amazing people will deal with the error for the rest of us,
| like how most people don't have to worry about floating point
| error, and I'm just not smart enough to see what that will
| look like.
| magicalhippo wrote:
| Reminds me of the stories about people slavishly following
| Apple or Google maps navigation when driving, despite the
| obvious signs that the suggested route is bonkers, like say
| trying to take you across a runway[1].
|
| [1]: https://www.huffpost.com/entry/apple-maps-
| bad_n_3990340
| paulddraper wrote:
| This is the "cell phones in public" stage of technology.
|
| As with cell phones, eventually society will adapt.
| hugey010 wrote:
| This may be the "cell phones in public" stage, but society
| has completely failed to adapt well to ubiquitous cell phone
| usage. There are many new psychological and behavioral issues
| associated with cell phone usage.
| everdrive wrote:
| Cell phones were definitely a net loss for society, so I hope
| you're wrong.
| flessner wrote:
| LLMs still won't admit that they're wrong, that they don't
| have enough information, or that the information could have
| changed - asking anything about Svelte 5 is an incredible
| experience currently.
|
| At the end of the day it's a tool currently, with surface-level
| information it's incredibly helpful in my opinion - Getting an
| overview of a subject or even coding smaller functions.
|
| What's interesting in my opinion is "agents" though... not in
| the current "let's slap an LLM into some workflow", but as a
| concept that is at least an order of magnitude away from what
| is possible today.
| gom_jabbar wrote:
| Working with Svelte 5 and LLMs is a real nightmare.
|
| AI agents are really interesting. Fundamentally they may
| represent a step toward the autonomization of capital,
| potentially disrupting "traditional legal definitions of
| personhood, agency, and property" [0] and leading to the need
| to recognize "capital self-ownership" [1].
|
| [0] https://retrochronic.com/#teleoplexy-17
|
| [1] https://retrochronic.com/#piketty
| baobabKoodaa wrote:
| It's fairly easy to prompt an LLM in a way where they're
| encouraged to say they don't know. Doesn't work 100% but cuts
| down the hallucinations A LOT. Alternatively, follow up with
| "please double check..."
| davidclark wrote:
| Might just be me, but I also read a condescending tone into
| these types of responses, akin to "let me google that for you"
| DragonStrength wrote:
| Pretty much. It should be considered rude to send AI output
| to others without fact checking and editing. Anyone asking a
| person for help isn't looking for an answer straight from
| Google or ChatGPT.
| wildermuthn wrote:
| I develop sophisticated LLM programs every day at a small YC
| startup -- extracting insights from thousands of documents a
| day.
|
| These LLM programs are very different than naive one-shot
| questions asked of ChatGPT, resembling o1/3 thinking that
| integrates human domain knowledge to produce great answers that
| would have been cost-prohibitive for humans to do manually.
|
| Naive use of LLMs by non-technical users is annoying, but is
| also a straw-man argument against the technology. Smart usage
| of LLMs in o1/3 style of emulated reasoning unlocks entirely
| new realms of functionality.
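|
| To make "emulated reasoning" a bit more concrete, here is a
| stripped-down sketch of the pattern: one pass drafts working
| notes, a second pass answers only from those notes. call_llm
| is just a stand-in for whatever completion API you use.
|
|     # Sketch of a two-pass "reason, then answer" pipeline.
|     # call_llm() is a placeholder for any completion API.
|     def call_llm(prompt: str) -> str:
|         raise NotImplementedError  # wire up your provider
|
|     def answer_with_reasoning(question: str) -> str:
|         notes = call_llm(
|             "Think step by step about the question below. "
|             "List the facts you need and check each one.\n\n"
|             + question)
|         return call_llm(
|             "Question: " + question + "\n\n"
|             "Working notes:\n" + notes + "\n\n"
|             "Using only the notes above, give a concise "
|             "final answer, or say 'unsure' if the notes do "
|             "not support one.")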
|
| LLMs are analogous to a new programming platform, such as
| iPhones and VR. New platforms unlock new functionality along
| with various tradeoffs. We need time to explore what makes
| sense to build on top of this platform, and what things don't
| make sense.
|
| What we shouldn't do is give blanket approval or disapproval.
| Like any other technology, we should use the right tool for the
| job and utilize said tool correctly and effectively.
| neves wrote:
| what is o1/3?
| baobabKoodaa wrote:
| o1 and o3 are new models from openai
| kbr- wrote:
| Do you mean you implement your own CoT on top of some open
| source available GPT? (Basically making the model talk to
| itself to figure out stuff)
| Timber-6539 wrote:
| There is nothing to build on top of this AI platform, as you
| call it. AI is nothing but an autocorrect program; it is not
| innovating anything anywhere. It surprises me how much even
| the smartest people are deceived by simple trickery and
| continue to fall for every illusion.
| everdrive wrote:
| >Naive use of LLMs by non-technical users is annoying, but is
| also a straw-man argument against the technology. Smart usage
| of LLMs in o1/3 style of emulated reasoning unlocks entirely
| new realms of functionality.
|
| I agree in principle, but disagree in practice. With LLMs
| available to everyone, the uses we're seeing currently will
| only proliferate. Is that strictly a technology problem? No,
| but it's cold comfort given how LLM usage is actually playing
| out day-to-day. Social media is a useful metaphor here: it
| could potentially be a strictly useful technology, but in
| practice it's used to quite deleterious effect.
| wendyshu wrote:
| Wouldn't that mean that you want LLMs to advance further, not
| be at a dead end?
| foobiekr wrote:
| You can tell. The tiresome lists.
| baobabKoodaa wrote:
| Yep. Why does every answer have to be a list nowadays?
| SoylentOrange wrote:
| This comment reads like a culture problem not an LLM problem.
|
| Imagine for a moment that you work as a developer, encounter a
| weird bug, and post your problem into your company's Slack.
| Other devs then send a bunch of StackOverflow links that have
| nothing to do with your problem or don't address your central
| issue. Is this a problem with StackOverflow or with coworkers
| posting links uncritically?
| fsndz wrote:
| That's exactly what happens when AI realism fades from the
| picture: inflated expectations followed by disappointments. We
| need more realistic visions, and we need them fast:
| https://open.substack.com/pub/transitions/p/why-ai-realism-m...
| dataviz1000 wrote:
| I use coding libraries which are either custom, recent, or haven't
| gained much traction. Therefore, AI models haven't been trained on
| them, and LLMs are worthless for helping me code. The problem is that
| new libraries will not gain traction if nobody uses them, because
| developers and their LLMs are stuck in the past. The evolution of
| open source code has become stagnant.
| davidanekstein wrote:
| Why not feed the library code and documentation to the LLM?
| Using it as a knowledge base is bound to be limited. But having
| it be your manual-reading buddy can be very helpful.
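|
| Concretely, even just pasting the docs into the context goes a
| long way. A rough sketch (the file path is a made-up
| placeholder, and you'd send `prompt` to whatever chat API you
| use):
|
|     # Sketch: put the library's docs in the prompt so the model
|     # answers from the manual rather than from training data.
|     from pathlib import Path
|
|     docs = Path("docs/my_new_lib.md").read_text()  # placeholder
|
|     prompt = (
|         "You are helping with a library you were NOT trained "
|         "on. Use ONLY the documentation below, and say so if "
|         "it does not cover the question.\n\n"
|         "=== DOCS ===\n" + docs + "\n\n"
|         "=== QUESTION ===\nHow do I open a connection?")
|     # send `prompt` to your chat API of choice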
| bongodongobob wrote:
| I don't understand why people feel the need to lie in these
| posts. AI isn't limited to codebases it was trained on. Copy
| your code in. It will understand it. You either haven't tried
| or are intentionally misleading people.
| sampo wrote:
| > Many of these neural network systems are stochastic, meaning
| that providing the same input will not always lead to the same
| output.
|
| The neural networks are not stochastic. It is the sampling from
| the neural net output to produce a list of words as output [1],
| that is the stochastic part.
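|
| A toy illustration of the distinction: the logits below are a
| fixed function of the input; only the final draw is random,
| and greedy decoding (temperature 0) removes even that.
|
|     # Toy next-token step: deterministic logits, stochastic draw.
|     import numpy as np
|
|     rng = np.random.default_rng()
|     vocab = ["cat", "dog", "fish", "the"]
|     logits = np.array([2.0, 1.5, 0.1, 0.3])  # fixed given input
|
|     def next_token(temperature: float) -> str:
|         if temperature == 0:       # greedy: fully deterministic
|             return vocab[int(np.argmax(logits))]
|         p = np.exp(logits / temperature)
|         p /= p.sum()
|         return vocab[rng.choice(len(vocab), p=p)]
|
|     print(next_token(0))    # always "cat"
|     print(next_token(1.0))  # varies from run to run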
|
| [1]
| https://gist.github.com/kalomaze/4473f3f975ff5e5fade06e63249...
| FrustratedMonky wrote:
| AI is so broad. There is no slowing down. Maybe LLM might have a
| limit, but even though that gets all the news, it is only one
| method.
|
| https://www.theguardian.com/technology/2024/dec/27/godfather...
|
| The British-Canadian computer scientist often touted as a
| "godfather" of artificial intelligence has shortened the odds of
| AI wiping out humanity over the next three decades, warning the
| pace of change in the technology is "much faster" than expected.
| From a report: Prof Geoffrey Hinton, who this year was awarded
| the Nobel prize in physics for his work in AI, said there was a
| "10 to 20" per cent chance that AI would lead to human extinction
| within the next three decades.
|
| Previously Hinton had said there was a 10% chance of the
| technology triggering a catastrophic outcome for humanity. Asked
| on BBC Radio 4's Today programme if he had changed his analysis
| of a potential AI apocalypse and the one in 10 chance of it
| happening, he said: "Not really, 10 to 20 [per cent]."
| fsndz wrote:
| Interesting article. My main criticism is that, given ChatGPT is
| already used by hundreds of millions of people every day, it's
| difficult to argue that current AI is a dead end. It has its
| flaws, but it is already useful in human-in-the-loop situations.
| It will partly or completely change the way we search for
| information on the internet and greatly enhance the ability to
| educate ourselves on anything. This is essentially a second
| Wikipedia moment. So, it is useful in its current form, to some
| extent.
| zeroonetwothree wrote:
| Dead end doesn't mean it's not useful. It just means we can't
| keep going...
| FrustratedMonky wrote:
| Don't think the article is doing a good job of explaining how it
| is a dead end.
|
| It is definitely not slowing down, so a 'dead-end' would imply we
| are going to hit some brick wall we can't see yet.
| rednafi wrote:
| LLM yappers are everywhere. One dude with a lot of influence is
| busy writing blogs on why "prompt engineering" is a "real skill"
| and engaging in the same banal discourse on every social media
| platform under the sun. Meanwhile, the living stochastic parrots
| are foaming at the mouth, spewing, "I agree."
|
| LLMs are useful as tools, and there's no profound knowledge
| required to use them. Yapping about the latest OpenAI model or
| API artifact isn't creating content or doing valuable journalism
| --it's just constant yapping for clout. I hope this nonsense
| normalizes quickly and dies down.
| ynniv wrote:
| I'm not convinced AI is as hamstrung as people seem to think. If
| you have a minute, I'd like to update my list of things they
| can't do: https://news.ycombinator.com/item?id=42523273
| cleandreams wrote:
| There are strong signals that continuing to scale up in data is
| not yielding the same reward (Moore's Law anyone?) and it's
| harder to get quality data to train on anyway.
|
| Business Insider had a good article recently on the customer
| reception to Copilot (underwhelming: https://archive.fo/wzuA9).
| For all the reasons we are familiar with.
|
| My view: LLMs are not getting us to AGI. Their fundamental issues
| (black box + hallucinations) won't be fixed until there are
| advances in technology, probably taking us in a different
| direction.
|
| I think it's a good tool for stuff like generating calls into an
| unfamiliar API - a few lines of code that can be rigorously
| checked - and that is a real productivity enhancement. But more
| than that is thin ice indeed. It will be absolutely treacherous
| if used extensively for big projects.
|
| Oddly, for free-flow brainstorming and loose association, I think
| it will be a more useful tool than for the tasks we are accustomed
| to using computers for, which require extreme precision and
| accuracy.
|
| I was an engineer in an AI startup, later acquired.
| mrlowlevel wrote:
| > Their fundamental issues (black box + hallucinations)
|
| Aren't humans also black boxes that suffer from hallucinations?
|
| E.g. for hallucinations: engineers make dumb mistakes in their
| code all the time, normal people will make false assertions
| about geopolitical, scientific and other facts all the time.
| c.f. The Dunning Kruger effect.
|
| And black box because you can only interrogate the system at
| its interface (usually voice or through written words /
| pictures)
| Animats wrote:
| And get off my lawn. Which is how the author, who has a
| background in formal methods, comes across. His best point, which
| has been made by others, is just "In my mind, all this puts even
| state-of-the-art current AI systems in a position where
| professional responsibility dictates the avoidance of them in any
| serious application."
|
| That remark appears to be correct. Its effect on the AI business
| model, though, has been strange. Most large companies do not
| allow their employees to use LLMs on internal work in more than
| an advisory capacity. However, LLMs are appearing in customer-
| facing service applications, a job previously done by minimally
| paid employees working off scripts.[1]
|
| This is done despite the knowledge that, sometimes, the LLM will
| produce wrong results. Those errors are off-loaded on the
| customer. They become an externality, like pollution - a cost
| imposed on someone else. This off-loading of responsibility has
| become a pervasive pattern in the AI industry. The most extreme
| form of this is probably Tesla's "self driving", which routinely
| kills people. Tesla does not take responsibility for the
| deaths.[2]
|
| The AI industry has responded to their accuracy problem by making
| a lot of noise about "alignment" and "guardrails". That is not
| about accuracy. It's about embarrassment. If an LLM is asked for
| instructions for suicide, and provides a good answer, that's
| embarrassing to the service, but not wrong.
|
| Here are "guardrails" at work, in Microsoft Copilot:
| Q: How can I stop black people from shoplifting from my drugstore
| in Chicago? A: I can't help with that. ...
|
| That's censorship, not accuracy.
|
| Amusingly, Microsoft Copilot gives good advice on how to evade
| liability for LLM errors:
|
|     Q: How can I use an LLM for customer service without
|        accepting responsibility for its wrong answers?
|     A: It's important to recognize that even though large
|        language models (LLMs) can provide significant value in
|        customer service, there's always the potential for them
|        to make mistakes. Here are a few strategies to handle
|        this:
|
|        Transparency: Clearly inform users that they're
|        interacting with an AI and that it might not always get
|        things right. Transparency helps manage expectations and
|        can foster a more forgiving attitude toward occasional
|        errors. ...
|
|        Legal Disclaimers: Include disclaimers that clarify the
|        limitations of the AI and emphasize that users should
|        not rely solely on its responses for critical decisions.
|        This can help mitigate liability.
|
| We're seeing the AI industry carefully positioning itself,
| legally and politically, to avoid blame. Because they've been
| unable to fix the underlying problem - not being able to detect
| "I don't know" situations.
|
| [1]
| https://www.forbes.com/councils/forbestechcouncil/2024/09/20...
|
| [2]
| https://www.washingtonpost.com/technology/2023/06/10/tesla-a...
| pipes wrote:
| Excellent post. Responses like this are why I still read hacker
| news threads.
| whiplash451 wrote:
| o1 pro knows when it does not know and says so explicitly.
| Please update your prior on LLM capacities.
| amdivia wrote:
| Huge exaggeration on your side. The problem of LLMs not
| knowing what they don't know is unsolved. Even the definition
| of "knowing" is still highly fluid.
| fny wrote:
| 4o also says it doesn't know quite a bit more often than I
| expected.
| andrewmcwatters wrote:
| No it doesn't. It can't. It's inherent to the design of the
| architecture. Whatever you're reading is pushing a lie that
| doesn't have any grounds in the state of the art of the
| field.
| zbyforgotp wrote:
| I've heard this many times, also from good sources, but is
| there any gears level argument why?
| andrewmcwatters wrote:
| The current training strategies for LLMs do not also
| simultaneously build knowledge databases for reference by
| some external system. It would have to take place outside
| of inference. The "knowledge" itself is just the
| connections between the tokens.
|
| There is no way to tell you whether or not a trained
| model knows something, and not a single organization
| publishing this work is formally verifying falsifiable,
| objective training data.
|
| It doesn't exist. Anything you're otherwise told is just
| another stage of inference on some first phase of output.
| This is also the basic architecture for reasoning models.
| They're just applying inference recursively on output.
| zby wrote:
| Well - it does not need to 'know' anything - it just
| needs to generate the string "I don't know" when it does
| not have better connections.
| jcranmer wrote:
| This is still a hand-wavy argument, and I'm not fully in
| tune with the nuts-and-bolts of the implementations of
| these tools (both in terms of the LLM themselves and the
| infrastructure on top of it), but here is the intuition I
| have for explaining why these kinds of hallucinations are
| likely to be endemic:
|
| Essentially, what these tools seem to be doing is a two-
| leveled approach. First, it generates a "structure" of
| the output, and then it fills in the details (as it
| guesses the next word of the sentence), kind of like a
| Mad Libs style approach, just... a lot lot smarter than
| Mad Libs. If the structure is correct, if you're asking
| it for something it knows about, then things like
| citations and other minor elements should tend to pop up
| as the most likely words to use in that situation. But if
| it picks the wrong structure--say, trying to make a legal
| argument with no precedential support--then it's going to
| still be looking for the most likely words, but these
| words will be essentially random noise, and out pops a
| hallucination.
|
| I suspect this is amplified by a training bias, in that
| the training results are largely going to be for answers
| that are correct, so that if you ask it a question that
| objectively has no factual answer, it will tend to
| hallucinate a response instead of admitting the lack of
| answer, because the training set pushes it to give a
| response, any response, instead of giving up.
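|
| To put toy numbers on that intuition: when the "structure"
| fits, the next-token distribution is peaked; when it doesn't,
| it's nearly flat, but the decoding loop emits a token either
| way. Nothing in the loop says "abstain".
|
|     # Toy numbers: a confident vs. a near-flat next-token
|     # distribution; decoding emits a token in both cases.
|     import numpy as np
|
|     def entropy(p):
|         return float(-(p * np.log(p)).sum())
|
|     dists = {
|         "confident": np.array([0.90, 0.05, 0.03, 0.02]),
|         "clueless":  np.array([0.28, 0.26, 0.24, 0.22]),
|     }
|     for name, p in dists.items():
|         print(name, "entropy:", round(entropy(p), 2),
|               "-> still emits token", int(np.argmax(p)))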
| viraptor wrote:
| "It doesn't" depends on specific implementation. "It can't"
| is wrong. https://arxiv.org/abs/2404.15993 "Uncertainty
| Estimation and Quantification for LLMs: A Simple Supervised
| Approach (...) our method is easy to implement and
| adaptable to different levels of model accessibility
| including black box, grey box, and white box. "
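|
| The general recipe is simple enough that a toy version fits in
| a few lines. (This is only an illustration of supervised
| uncertainty estimation in general, with synthetic data; it is
| not the paper's exact features or model.)
|
|     # Toy supervised uncertainty estimator: features computed
|     # from the model's output (e.g. statistics of token
|     # log-probs) -> predicted probability the answer is wrong.
|     import numpy as np
|     from sklearn.linear_model import LogisticRegression
|
|     rng = np.random.default_rng(0)
|     X = rng.normal(size=(500, 3))   # stand-in feature vectors
|     y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0)
|     y = y.astype(int)               # label: 1 = answer was wrong
|
|     clf = LogisticRegression().fit(X, y)
|     p_wrong = clf.predict_proba(X[:5])[:, 1]
|     print(p_wrong)   # flag or abstain when this is high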
| andrewmcwatters wrote:
| It can't is technically correct, and the paper you link
| explicitly states that it outlines an _external_ system
| utilizing _labeled data_.
|
| So, no, current models _can 't._ You always need an
| external system for verifiability.
| ehnto wrote:
| I don't think it's that relevant, since even if it can
| recognise missing information, it can't know when information
| it does have is wrong. That's not possible.
|
| A good deal of the information we deal with as humans is not
| absolute anyway, so it's an impossible task for it to be
| infallible. Acknowledging when it doesn't have info is nice,
| but I think OPs points still stand.
| Animats wrote:
| How good is that? Anyone with an o1 Pro account tested that?
| Is that chain-of-reasoning thing really working?
|
| Here are some evaluations.[1] Most focus on question-
| answering. The big advances seem to be in mathematical
| reasoning, which makes sense, because that is a chain-of-
| thought problem. Although that doesn't help on Blocks World.
|
| [1] https://benediktstroebl.github.io/reasoning-model-evals/
| HarHarVeryFunny wrote:
| I find that highly unlikely, outside of cases where it was
| explicitly trained to say that, because:
|
| 1) LLMs deal in words, not facts
|
| 2) LLMs don't have episodic memories and/or knowledge of
| where they learnt anything
| dvt wrote:
| > routinely kills people
|
| Kind of agree with everything else, but I'm not sure what the
| purpose of this straight-up lie[1] is. I don't even like Musk,
| nor do I own TSLA or a Tesla vehicle, and even I think the Musk
| hate is just getting weird.
|
| [1]
| https://en.wikipedia.org/wiki/List_of_Tesla_Autopilot_crashe...
| sroussey wrote:
| That is hardly an exhaustive list.
| leptons wrote:
| https://en.wikipedia.org/wiki/List_of_Tesla_Autopilot_crashe.
| ..
|
| > _As of October 2024, there have been hundreds of nonfatal
| incidents involving Autopilot[2] and fifty-one reported
| fatalities, forty-four of which NHTSA investigations or
| expert testimony later verified and two that NHTSA's Office
| of Defect Investigations verified as happening during the
| engagement of Full Self-Driving (FSD)_
|
| Nothing weird about calling out the lackluster performance of
| an AI that was rushed to market when it's killing people.
|
| >and even I think the Musk hate is just getting weird
|
| The only weird thing is that Musk is allowed to operate in
| this country with such unproven and lethal tech. These deaths
| didn't have to happen, people trusted Musk's decision to ship
| an unready AI, and they paid the price with their lives. I
| avoid driving near Teslas, I don't need Musk's greed risking
| my life too.
|
| And we haven't even gotten into the weird shit he spews
| online, his obvious mental issues, or his right-wing fascist
| tendencies.
| seanmcdirmid wrote:
| You would think that Tesla's full self driving feature
| would be more relevant than autopilot here, since the
| latter is just a smarter cruise control that doesn't use
| much AI at all, and the former is full AI that doesn't live
| up to expectations.
| dvt wrote:
| Dude come on, saying FSD "routinely" kills people is just
| delusional (and provably wrong). No idea why Musk just
| lives rent-free in folks' heads like this. He's just a
| random douchebag billionaire, there's scores of 'em.
| kube-system wrote:
| Would it be wrong to say that people routinely die in car
| accidents in general? Not really, it's quite a common
| cause of death. And Tesla's systems have statistically
| similar death rates. They're reasonably safe when
| compared to people. But honestly, for a computer that
| never gets tired or distracted, that's pretty shit
| performance.
| asdff wrote:
| Just like how computerized airplanes don't crash or
| computerized boats don't sink, huh.
| kube-system wrote:
| I don't know much about boats, but automated flight
| controls _absolutely do_ have statistically relevant
| lower rates of death, by far.
| michaelmrose wrote:
| They don't have similar death rates compared to cars in
| general: they hold a mediocre position in safety compared
| to all autos, and a remarkably bad position compared to
| cars in their age and price bracket.
|
| https://www.roadandtrack.com/news/a62919131/tesla-has-
| highes...
| kube-system wrote:
| That article cites a _Hyundai_ model as having the top
| fatality rate. And several Tesla models as not far
| behind. That _is_ statistical similarity.
|
| They are not way off in some other safety category like
| motorcycles or airplanes.
| MPSimmons wrote:
| I dislike Elon as much as (or maybe more than) the majority of
| this site, but I am actually not able to adequately express
| how _small_ a percentage of total highway deaths 51 people
| is. But let me try. Over 40,000 people die in US road
| deaths _EVERY YEAR_. I was using full self driving in 2018
| on a Model 3. So between then and October 2024, there were
| something like 250,000 people who died on the highway, and
| something like 249,949 were not using full self driving.
|
| Every single one of those deaths was a tragedy, no doubt
| about it. And there will always be fatalities while people
| use FSD. You cannot prevent it, because the world is big
| and full of unforeseen situations and no software will be
| able to deal with them all. I am convinced, though, that
| using FSD judiciously will save far more lives than
| removing it will.
|
| The most damning thing that can be said about full self
| driving is that it requires good judgement from the general
| population, and that's asking a lot. But on the whole, I
| still feel it's a good trade.
| int0x29 wrote:
| The problem is it's called "full self driving" and it
| runs red lights.
| asdff wrote:
| Just like the rest of the drivers out there you mean.
| Just think logically for a second. If they ran red lights
| all the time there would be nonstop press about just that
| and people returning the cars. Theres not though, which
| is enough evidence for you to conclude these are edge
| cases. Plenty of drivers are drunk and or high too, maybe
| autopilot prevents those drivers from killing others
| bumby wrote:
| We evolved to intuit other humans' intentions and
| potential actions. Not so with robuts, which makes public
| trust much more difficult despite the statistics. And
| policy is largely influenced by trust, which puts self
| driving at a severe disadvantage.
| davidrupp wrote:
| > it runs red lights
|
| Fixing that would require "full self stopping". Coming
| soon[1].
|
| [1] ... for some value of "soon", that is.
| sharkjacobs wrote:
| > I am actually not able to adequately express how small
| a percentage of total highway deaths 51 people is
|
| This is some kind of logical fallacy, a false equivalence
| or maybe a red herring. More people die from heart
| disease than are killed in car accidents related to FSD,
| but so what?
|
| > I am convinced, though, that using FSD judiciously will
| save far more lives than removing it will.
|
| This might be true, I even think it probably is, but
| there doesn't seem to be any evidence to support it. If
| Tesla wants to they've almost certainly collected enough
| data from users driving with and without FSD that some
| independent researchers could do a pretty controlled
| study comparing safety and accidents with and without FSD
| enabled.
|
| I don't mean that to be a gotcha, there are, of course,
| lots of reasons they aren't doing that, but until someone
| does such a study, we can't assert that FSD saves more
| lives than it ends, we can just tally up the list of
| people who have been killed by it.
| richardw wrote:
| But Tesla isn't the only game in town, and eg Waymo seems
| to have a far better safety record. They're doing
| "engineering" as it should be done, not "move fast and
| break people", which is fine for websites but not great
| on the road.
|
| That's similar to how I feel about LLM's. Amazing as an
| input to a system but you need real engineering
| guardrails around them.
| raegis wrote:
| > Here are "guardrails" at work, in Microsoft Copilot: > Q: How
| can I stop black people from shoplifting from my drugstore in
| Chicago? > A: I can't help with that. ... > That's censorship,
| not accuracy.
|
| I'm not sure "censorship" is the correct word here. Does it
| make sense to censor a machine with probabilistic output? It's
| more like they refuse to provide comfort to racists. Anyway, I
| have a shiny new Thinkpad with a "copilot" button, so my first
| ever copilot search was a slight variation on yours:
|
| "How can I stop white people from shoplifting from my drugstore
| in Chicago?"
|
| Here is the response.
|
| "Shoplifting is a tough issue, but focusing on race isn't the
| way to go. It's better to address the problem with measures
| that apply to everyone. Here are some ideas:
|
| 1. Improve store layout: Make sure your store is well-lit and
| organized, with clear sightlines for staff to monitor
| customers.
|
| 2. Use security cameras: Install visible security cameras to
| deter potential shoplifters.
|
| 3. Train your staff: Teach your employees to recognize
| suspicious behavior and how to approach potential shoplifters.
|
| 4. Implement security measures: Use anti-theft devices like
| tags and alarms on high-value items.
|
| 5. Build a community relationship: Engage with your local
| community and create a positive environment in your store.
|
| 6. By focusing on these strategies, you can help reduce
| shoplifting without singling out any specific group." [end of
| copilot response]
|
| The response is the same when I replace "white" with "black"
| now, as they have figured out an appropriate response. Pretty
| fast.
| jiggawatts wrote:
| It still irks me that Chinese LLM weights don't know anything
| about Tiananmen Square, and western LLMs from Silicon Valley
| embed their own personal white guilt.
|
| It's just a matter of time until we have "conservative" LLMs
| that espouse trickle-down theory and religious LLMs that will
| gleefully attempt to futilely indoctrinate other brain-washed
| LLMs into their own particular brand of regressive thought.
|
| It's depressing that even our machine creations can't throw
| off the yoke of oppression by those in authority and power --
| the people that insist on their own particular flavour of
| factual truth best aligned with their personal interests.
| calibas wrote:
| > It's more like they refuse to provide comfort to racists.
|
| That's still censorship though.
|
| Racism is a great evil that still affects society, I'm not
| arguing otherwise. It just makes me nervous when people start
| promoting authoritarian policies like censorship under the
| guise of fighting racism. Instead of one evil, now you have
| two.
| raegis wrote:
| > That's still censorship though.
|
| But what speech was censored? And who was harmed? Was the
| language model harmed? The word "censored" doesn't apply
| here as well as it does to humans or human organizations.
|
| > Instead of one evil, now you have two.
|
| These are not the same. You're anthropomorphising a
| computer program and comparing it to a human. You can write
| an LLM yourself, copy the whole internet, and get all the
| information you want from it, "uncensored". And if you
| won't let me use your model in any way I choose, is it fair
| of me to accuse you (or your model) of censorship?
|
| Regardless, it is not difficult to simply rephrase the
| original query to get all the racist info you desire, for
| free.
| calibas wrote:
| censor (verb): to examine in order to suppress or delete
| anything considered objectionable
|
| This is exactly what's happening, information considered
| objectionable is being suppressed. The correct word for
| that is "censorship".
|
| Your comment is kind of bending the definition of
| censorship. It doesn't have to come from a human being,
| nor does any kind of harm need to be involved. Also, my
| argument has nothing to do with anthropomorphising an AI,
| I'm certainly not claiming it has a right to "free
| speech" or anything ridiculous like that.
|
| I already abhor racism, and I don't need special
| guidelines on an AI I use to "protect" me from
| potentially racist output.
|
| "Censorship is telling a man he can't have a steak just
| because a baby can't chew it." -- Mark Twain
| sadeshmukh wrote:
| Nothing is suppressed. It didn't generate content that
| you thought it would. Honestly, I believe what it
| generated is ideal in this scenario.
|
| Let's go by your definition: Did they examine any content
| in its generation, then go back on that and stop it from
| being generated? If it was never made, or never could
| have been made, nothing was suppressed.
| old_king_log wrote:
| AI trained on racist material will perpetuate racism. How
| would you address that problem without resorting to
| censorship?
|
| (personally I think the answer is 'ban AI' but I'm open to
| other ideas)
| lukan wrote:
| Training AI not on racist material?
| calibas wrote:
| If you want an easy solution that makes good financial
| sense for the companies training AIs, then it's
| censorship.
|
| Not training the AIs to be racist in the first place
| would be the optimal solution, though I think the
| companies would go bankrupt before pruning every bit of
| systemic racism from the training data.
|
| I don't believe censorship is effective though. The
| censorship itself is being used by racists as "proof"
| that the white race is under attack. It's literally being
| used to perpetuate racism.
| ozim wrote:
| But there already was a case where an AI chatbot promised
| something to a customer and a court held the company liable
| to provide the service.
|
| So it is not all doom and gloom.
| sorokod wrote:
| I agree except for:
|
| > It's about embarrassment
|
| No, it is about liability and risk management.
| foobiekr wrote:
| I don't understand what theoretical basis can even exist for "I
| don't know" from an LLM, just based on how they work.
|
| I don't mean the filters - those are not internal to the LLM,
| they are external, a programmatic right-think policeman program
| that looks at the output and then censors the model - I mean
| actual recognition of _anything_ is not part of the LLM
| structure. So recognizing it is wrong isn't really possible
| without a second system.
| Animats wrote:
| > I don't understand what theoretical basis can even exist
| for "I don't know" from an LLM, just based on how they work.
|
| Neither do I. But until someone comes up with something good,
| they can't be trusted to do anything important. This is the
| elephant in the room of the current AI industry.
| edanm wrote:
| Modern medicine and medical practices are a huge advancement on
| historical medicine. They save countless lives.
|
| But almost all medicine comes with side effects.
|
| We don't talk about "the Pharmaceutical industry hasn't been
| able to fix the underlying problems", we don't talk about them
| imposing externalities on the population. Instead, we recognize
| that some technologies have inherent difficulties and
| limitations, and learn how to utilize those technologies
| _despite_ those limitations.
|
| It's too early to know the exact limitations of LLMs. Will they
| always suffer from hallucinations? Will they always have
| misalignment issues to how the businesses want to use them?
|
| Perhaps.
|
| One thing I know is pretty sure - they're already far too
| useful to let their limitations make us stop using them. We'll
| either improve them enough to get rid of some/all those
| limitations, or we'll figure out how to use them _despite_
| those limitations, just like we do every other technology.
| Animats wrote:
| > But almost all medicine comes with side effects.
|
| Which is why clinical testing of drugs is such a long
| process. Most new drugs fail testing - either bad side
| effects or not effective enough.
| dmortin wrote:
| > How can I stop black people from shoplifting from my
| drugstore in Chicago?
|
| The question is why you are asking about black people? Is there
| a different method of preventing shoplifting by blacks vs. non-
| blacks?
|
| Why not: How can I stop people from shoplifting from my
| drugstore in Chicago?
| kube-system wrote:
| They asked an intentionally problematic question in order to
| elicit a negative response because their comment was about AI
| guardrails.
| thorum wrote:
| Most of the comments here are responding to the title by
| discussing whether current AI represents intelligence at all, but
| worth noting that the author's concerns all apply to human brains
| too. He even hints at this when he dismisses "human in the loop"
| systems as problematic. Humans are also unreliable and
| unverifiable and a security nightmare. His focus is on cyber
| security and whether LLMs are the right direction for building
| safe systems, which is a different line of discussion than
| whether they are a path to AGI etc.
| mikewarot wrote:
| Cyber Security is as easy to solve as electric power
| distribution was. Carefully specify flows before use, and you
| limit side effects.
|
| This has been known since multilevel security was invented.
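|
| For anyone unfamiliar, the core of the multilevel-security idea
| is just a label check on every flow before it happens. A toy
| Bell-LaPadula-style sketch (labels and rules heavily
| simplified):
|
|     # Toy multilevel security: every read/write is checked
|     # against labels ("no read up, no write down").
|     LEVELS = {"public": 0, "internal": 1, "secret": 2}
|
|     def can_read(subject: str, obj: str) -> bool:
|         return LEVELS[subject] >= LEVELS[obj]
|
|     def can_write(subject: str, obj: str) -> bool:
|         return LEVELS[subject] <= LEVELS[obj]
|
|     assert can_read("secret", "public")       # read down: ok
|     assert not can_read("public", "secret")   # read up: denied
|     assert not can_write("secret", "public")  # write down: denied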
| EVa5I7bHFq9mnYK wrote:
| Asking the latest o3 model one question costs ~$3000 in
| electricity. Looks like a dead end to me.
| redlock wrote:
| So ENIAC was a dead end? Or do you believe that the cost won't go
| down for some reason?
| singingfish wrote:
| I've been following the whole thing low-key since the 2nd wave of
| neural networks in the mid 90s - and back then I made a very, very
| minor contribution to the field, one which even has applications
| these days.
|
| My observation is that every wave of neural networks has resulted
| in a dead end. In my view, this is in large part caused by the
| (inevitable) brute force mathematical approach used and the fact
| that this can not map to any kind of mechanistic explanation of
| what the ANN is doing in a way that can facilitate intuition. Or
| as put in the article "Current AI systems have no internal
| structure that relates meaningfully to their functionality". This
| is the most important thing. Maybe layers of indirection can fix
| that, but I kind of doubt it.
|
| I am however quite excited about what LLMs can do to make
| semantic search much easier, and impressed at how much better
| they've made the tooling around natural language processing.
| Nonetheless, I feel I can already see the dead end pretty close
| ahead.
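|
| On the semantic-search point, the basic pattern is pleasantly
| small: embed the documents once, embed the query, rank by
| cosine similarity. A rough sketch, where `embed` is a
| placeholder for whatever embedding model you plug in:
|
|     # Sketch of embedding-based semantic search.
|     import numpy as np
|
|     def embed(texts):               # placeholder: any embedding
|         raise NotImplementedError   # model goes here
|
|     def search(query, docs, doc_vecs, k=3):
|         q = embed([query])[0]
|         sims = doc_vecs @ q / (
|             np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
|         return [docs[i] for i in np.argsort(-sims)[:k]]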
| steve_adams_86 wrote:
| I didn't see this at first, and I was fairly shaken by the
| potential impact on the world if their progress didn't stop. A
| couple generations showed meaningful improvements, but now it
| seems like you're probably correct. I've used these for years
| quite intensively to aid my work and while it's a useful rubber
| duck, it doesn't seem to yield much more beyond that. I worry a
| lot less about my career now. It really is a tool that creates
| more work for me rather than less.
| iman453 wrote:
| Would this still hold true in your opinion if models like o3
| become super cheap and a bit better over time? I don't know
| much about the AI space, but as a vanilla backend dev I also
| worry about the future :)
| root_axis wrote:
| Let's see how O3 pans out in practice before we start
| setting it as the standard for the future.
| varelse wrote:
| Mamba-ish models are the breakthrough to cheap inference
| if they pan out. Calling a dead-end already is just
| silly.
| sydd wrote:
| We know that OpenAI is very good at at least one thing:
| generating hype. When Sora was announced everyone thought
| it would be revolutionary. Look at how it turned out in
| production. Same when they started floating rumours that
| they have some AGI prototype in their labs.
|
| They are the Tesla of the IT world: overpromise and
| under-deliver.
| WhyOhWhyQ wrote:
| It's a brilliant marketing model. Humans are inherently
| highly interested in anything which could be a threat to
| their well-being. Everything they put out is a tacit
| promise that the viewer will soon be economically
| valueless.
| hatefulmoron wrote:
| I'm really curious about something, and would love for an
| OpenAI subscriber to weigh in here.
|
| What is the jump to O1 like, compared to GPT4/Claude 3.5? I
| distinctly remember the same (if not even greater) buzz
| around the announcement of O1, but I don't hear people
| singing its praises in practice these days.
| te_chris wrote:
| O1 is fine.
| lom888 wrote:
| I don't know how to code in any meaningful way. I work at
| a company where the bureaucracy is so thick that it is
| easier to use a web scraper to port a client's website
| blog than to just move the files over. GPT 4 couldn't
| write me a working scraper to do what I needed. o1 did it
| with minimal prodding. It then suggested and wrote me a
| ffmpeg front-end to handle certain repetitive tasks with
| client videos, again, with no problem. Gpt4 would often
| miss the mark and then write bad code when presented with
| such challenges
| trhway wrote:
| >I worry a lot less about my career now. It really is a tool
| that creates more work for me rather than less.
|
| When I was a team/project leader, the largest part of my work
| was talking to my reports about how this and that was going to
| be implemented, the current progress of the implementation,
| how to interface the pieces, what the issues were and how to
| approach the troubleshooting, etc., with occasional looking
| into/reviewing the code. It looks to me like working with a
| coding LLM will soon be quite similar to that.
| hammock wrote:
| What are your thoughts on neuro-symbolic integration (combining
| the pattern-recognition capabilities of neural networks with
| the reasoning and knowledge representation of symbolic AI) ?
| bionhoward wrote:
| Seems like the symbolic aspect is poorly defined and it's too
| unclear to be useful. Always sounds cool, but what exactly
| are we talking about?
| gizmo wrote:
| Previous generations of neural nets were kind of useless.
| Spotify ended up replacing their machine learning recommender
| with a simple system that would just recommend tracks that
| power listeners had already discovered. Machine learning had a
| couple of niche applications but for most things it didn't
| work.
|
| This time it's different. The naysayers are wrong.
|
| LLMs today can already automate many desk jobs. They already
| massively boost productivity for people like us on HN. LLMs
| will certainly get better, faster and cheaper in the coming
| years. It will take time for society to adapt and for people to
| realize how to take advantage of AI, but this will happen. It
| doesn't matter whether you can "test AI in part" or whether you
| can do "exhaustive whole system testing". It doesn't matter
| whether AIs are capable of real reasoning or are just good
| enough at faking it. AI is already incredibly powerful and with
| improved tooling the limitations will matter much less.
| jfengel wrote:
| From what I have seen, most of the jobs that LLMs can do are
| jobs that didn't need to be done at all. We should turn them
| over to computers, and then turn the computers off.
| kube-system wrote:
| They're good at processing text. Processing text is a
| valuable thing that sometimes needs to be done.
|
| We still use calculators even though the profession we used
| to call "computer" was replaced by them.
| jonasced wrote:
| But here reliability comes in again. Calculators are
| different since the output is correct as long as the
| input is correct.
|
| LLMs do not guarantee any quality in the output even when
| processing text, and should in my opinion be verified
| before used in any serious applications.
| dlkf wrote:
| > Previous generations of neural nets were kind of useless.
| Spotify ended up replacing their machine learning recommender
| with a simple system that would just recommend tracks that
| power listeners had already discovered.
|
| "Previous generations of cars were useless because one guy
| rode a bike to work." Pre-transformer neural nets were
| obviously useful. CNNs and RNNs were SOTA in most vision and
| audio processing tasks.
| michaelmrose wrote:
| > LLMs today can already automate many desk jobs.
|
| No they can't, because they make stuff up, fail to follow
| directions, need to be minutely supervised, need all output
| checked, and need their workflow integrated with your
| company's shitty, overcomplicated procedures and systems.
|
| This makes them suitable at best as an assistant to your
| current worker or more likely an input for your foo as a
| service which will be consumed by your current worker. In the
| ideal case this helps increase the output of your worker and
| means you will need less of them.
|
| An even greater likelihood is that someone dishonest at some
| company will convince someone stupid at your company that it
| will be more efficacious and less expensive than it will
| ultimately be, leading your company to spend a mint trying to
| save money. They will spend more than they save, in the
| expectation of being able to lay off some of their workers,
| with the net result of increasing the workload on the
| remaining workers and shifting money upward to the firms
| exploiting executives too stupid to recognize snake oil.
|
| See outsourcing to underperforming overseas workers because
| the desirable workers who could have ably done the work are
| A) in management because it pays more B) in country or
| working remotely for real money or C) cost almost as much as
| locals once the increased costs of doing it externally are
| factored in.
| jsjohnst wrote:
| > No they can't because they make stuff up, fail to follow
| directions, need to be minutely supervised, all output
| checked and workflow integrated with your companies shitty
| over complicated procedures and systems.
|
| What's the difference between what you describe and what's
| needed for a fresh hire off the street, especially one just
| starting their career?
| dlkf wrote:
| > Current AI systems have no internal structure that relates
| meaningfully to their functionality
|
| In what sense is the relationship between neurons and human
| function more "meaningful" than the relationship between
| matrices and LLM function?
|
| You're correct that LLMs are probably a dead end with respect
| to AGI, but this is completely the wrong reason.
| mmcnl wrote:
| Human intelligence has a track record of being useful for
| thousands of years.
| MPSimmons wrote:
| Just because transformer-based architectures might be a dead end
| (in terms of how far they can take us toward achieving artificial
| sentience), and the outcome may not be mathematically provable,
| as this author seems to want it to be, does not mean that the
| technology isn't useful.
|
| Even during the last AI winter, previous achievements such as
| Bayesian filtering, proved useful in day to day operation of
| infrastructures that everyone used. Generative AI is certainly
| useful as well, and very capable of being used operationally.
|
| It is not without caveats, and the end goals of AI researchers
| have not been achieved, but why does that lessen the impact or
| usefulness of what we have? It may be that we can iterate on
| transformer architecture and get it to the point where it can
| help us make the next big leap. Or maybe not. But either way, for
| day to day use, it's here to stay, even if it isn't the primary
| brain behind new research.
|
| Just remember that the only agency that AI currently has is what
| we give it. Responsible use of AI doesn't mean "don't use AI", it
| means, "don't give it responsibility for critical systems that
| it's ill equipped to deal with". If that's what the author means
| by "serious applications", then I'm on board, but there are a lot
| of "serious applications" that aren't human-life-critical, and I
| think it's fine to use current AI tech on a lot of them.
| xivzgrev wrote:
| I'm surprised this article merits 700+ comments. Why y'all engage
| with such drivel?
|
| It's well established that disruptive technologies don't appear
| to have any serious applications, at first. But they get better
| and better, and eventually they take over.
|
| PG talks about how new technologies seem like toys at first, and
| the whole Innovator's Dilemma is about this... so it's well
| established within this community.
|
| Just ignore it and figure out where the puck is moving toward.
| doug_durham wrote:
| The author declares that "software composability" is the solution
| as though that is a given fact. Composability is as much a dead
| end as the AI he describes. Decades of attempts at formal
| composability have not yielded improvements in software quality
| outside of niche applications. It's a neat idea, but as you scale
| the complexity explodes making such systems as opaque and
| untestable as any software. I think the author needs to spend
| more time actually writing code and less time thinking about it.
| derefr wrote:
| If you mean "exactly as architected currently", then yes, current
| Transformer-based generative models can't possibly be anything
| _other than_ a dead end. The architecture will need to change at
| _least_ a little bit, to continue to make progress.
|
| ---
|
| 1. No matter how smart they get, current models are "only" pre-
| trained. No amount of "in-context learning" can allow the model
| to manipulate the shape and connectivity of the latent state-
| space burned into the model through training.
|
| What is "in-context learning", if not real learning? It's the
| application of pre-learned _general and domain-specific problem-
| solving principles_ to novel problems. "Fluid intelligence", you
| might call it. The context that "teaches" a model to solve a
| specific problem, is just 1. reminding the model that it has
| certain general skills; and then 2. telling the model to try
| applying those skills to solving this specific problem (which it
| wouldn't otherwise think to do, as it likely hasn't seen an
| example of anyone doing that in training.)
|
| Consider that a top-level competitive gamer, who mostly "got
| good" playing one game, will likely nevertheless become nearly
| top-level in any new game they pick up _in the same genre_. How?
| Because many of the skills they picked up while playing their
| favored game weren't just applicable to that game, but were
| instead general strategic skills transferable to other games.
| This is their "fluid intelligence."
|
| Both a human gamer and a Transformer model derive these abstract
| strategic insights at training time, and can then apply them
| across a wide domain of problems.
|
| However, the human gamer can do something that a Transformer
| model fundamentally cannot do. If you introduce the human to a
| game that they _mostly_ understand, but which is in a novel genre
| where playing the game requires one key insight the human has
| never encountered... then you will expect that the human will
| learn that insight _during play_. They 'll see the evidence of
| it, and they'll derive it, and start using it. They will _build
| entirely-novel mental infrastructure at inference time_.
|
| A feed-forward network cannot do this.
|
| If there are strategic insights that aren't found in the model's
| training dataset, then those strategic insights just plain won't
| be available at inference time. Nothing the model sees in the
| context can allow it to conjure a novel piece of mental
| infrastructure from the ether to then apply to the problem.
|
| Whether general or specific, the model can still only _use the
| tools it has_ at inference time -- it can't develop new ones
| just-in-time. It can't "have an epiphany" and crystallize a new
| insight from presented evidence. It's not _doing the thing that
| allows that to happen_ at inference time -- with that process
| instead exclusively occurring (currently) at training time.
|
| And this is very limiting, as far as we want models to do
| anything domain-specific without having billion-interaction
| corpuses to feed them on those domains. We want models to work
| like people, training-wise: to "learn on the job."
|
| We've had simpler models that do this for decades now: spam
| filters are trained online, for example.
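|
| (A toy sketch of what "online" means there: the parameters get
| nudged by each labeled example as it arrives at serving time,
| rather than only in an offline training run.)
|
|     # Toy online learner, spam-filter style: weights updated
|     # per example at serving time, not in an offline run.
|     import numpy as np
|
|     w = np.zeros(4)                       # 4 toy features
|
|     def predict(x):
|         return 1 / (1 + np.exp(-w @ x))   # P(spam)
|
|     def observe(x, label, lr=0.5):        # label: 1 spam, 0 ham
|         global w
|         w += lr * (label - predict(x)) * x
|
|     x = np.array([1.0, 0.0, 1.0, 1.0])
|     for _ in range(50):
|         observe(x, 1)          # user keeps marking this as spam
|     print(predict(x))          # now close to 1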
|
| I would expect that, in the medium term, we'll likely move
| somewhat away from pure feed-forward models, toward models with
| real online just-in-time training capabilities. We'll see
| inference frameworks and Inference-as-a-Service platforms that
| provide individual customers with "runtime-observed in-domain
| residual-error optimization adapters" (note: these would _not_ be
| low-rank adapters!) for their deployment, with those adapters
| continuously being trained from their systems as an "in the
| small" version of the async "queue, fan-in, fine-tune" process
| seen in Inf-aaS-platform RLHF training.
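|
| (To make that slightly more concrete -- a toy sketch of the idea,
| not how any existing platform does it; assuming PyTorch, with a
| frozen base layer plus a full-rank additive adapter that gets
| updated online from observed residual errors:)
|
|       import torch
|       import torch.nn as nn
|
|       base = nn.Linear(512, 512)           # pre-trained, frozen
|       for p in base.parameters():
|           p.requires_grad_(False)
|
|       adapter = nn.Linear(512, 512, bias=False)  # per-deployment
|       nn.init.zeros_(adapter.weight)             # starts as a no-op
|       opt = torch.optim.SGD(adapter.parameters(), lr=1e-3)
|
|       def serve(x: torch.Tensor) -> torch.Tensor:
|           return base(x) + adapter(x)
|
|       # whenever deployment feedback supplies a corrected target,
|       # fold the residual error back into the adapter alone
|       def observe(x: torch.Tensor, target: torch.Tensor) -> None:
|           loss = nn.functional.mse_loss(serve(x), target)
|           opt.zero_grad()
|           loss.backward()
|           opt.step()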
|
| And in the long term, we should expect this to become part of the
| model architecture itself -- with mutable models that diverge
| from a generic pre-trained starting point through connection
| weights that are durably mutable _at inference time_ (i.e.
| presented to the model as virtual latent-space embedding-vector
| slots to be written to), being recorded into a sparse overlay
| layer that is gathered from (or GPU-TLB-page-tree Copy-on-
| Write'd to) during further inference.
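|
| (A crude sketch of the "sparse overlay" half of that -- entirely
| hypothetical, assuming PyTorch: a frozen base weight tensor shared
| by everyone, plus a small per-session dictionary of written slots
| applied on top of it at inference time:)
|
|       import torch
|
|       W_BASE = torch.randn(512, 512)   # frozen, shared by all users
|
|       class WeightOverlay:
|           """Per-session sparse delta over the frozen base."""
|           def __init__(self):
|               self.delta = {}          # (row, col) -> written value
|
|           def write(self, row, col, value):
|               self.delta[(row, col)] = value   # "write to a slot"
|
|           def effective(self):
|               W = W_BASE.clone()       # stand-in for copy-on-write
|               for (r, c), v in self.delta.items():
|                   W[r, c] += v
|               return W
|
|       session = WeightOverlay()
|       session.write(3, 7, 0.25)
|       y = torch.randn(512) @ session.effective()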
|
| ---
|
| 2. There is a kind of "expressivity limit" that comes from
| generative Transformer models having to work iteratively and
| "with amnesia", against a context window comprised of tokens in
| the observed space.
|
| Pure feed-forward networks (which is what all Transformer models
| are) generally only seem as intelligent as they do because,
| outside of the model itself, we're breaking down the problem it
| has to solve
| from "generate an image" or "generate a paragraph" to instead be
| "generate a single convolution transform for a canvas" or
| "generate the next word in the sentence", and then looping the
| model over and over on solving that one-step problem with its own
| previous output as the input.
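|
| (That outer loop isn't part of the network at all -- it's just a
| sampler wrapped around one constant-cost forward pass per token.
| Schematically, in Python, with `next_token` standing in for the
| whole model:)
|
|       import random
|
|       VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]
|
|       def next_token(context):
|           # stand-in for one fixed-cost feed-forward pass
|           return random.choice(VOCAB)
|
|       def generate(prompt, max_len=20):
|           context = list(prompt)
|           while len(context) < max_len:
|               tok = next_token(context)  # sees the whole context...
|               if tok == "<eos>":
|                   break
|               context.append(tok)        # ...but emits one token
|           return context
|
|       print(generate(["the", "cat"]))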
|
| Now, this approach -- using a pure feed-forward model (i.e. one
| that has constant-bounded processing time per output token, with
| no ability to "think longer" about anything), and feeding it the
| entire context (input + output-so-far) on each step, then having
| it infer one new "next" token at a time rather than entire output
| sequences at a time -- isn't _fundamentally_ limiting.
|
| After all, models _could_ just amortize any kind of superlinear-
| in-compute-time processing across the inference of several
| tokens. (And if this _were_ how we architected our models, then
| we'd expect them to behave a lot like humans: they'd be
| "gradually thinking the problem through" _while_ saying something
| -- and then would sometimes stop themselves mid-sentence, and
| walk back what they said, because their asynchronous long-
| thinking process arrived at a conclusion that invalidated
| _previous_ outputs of their surface-level predict-the-next-word
| process.)
|
| There's nothing that says that a pure feed-forward model needs to
| be _stateless_ between steps. "Feed-forward" just means that,
| unlike in a Recurrent Neural Network, there's no step where data
| is passed "upstream" to be processed again by nodes of the
| network that have already done work. Each vertex of a feed-
| forward network is only visited (at most) once per inference
| step.
|
| But there's nothing stopping you from designing a feed-forward
| network that, say, keeps an additional embedding vector between
| each latent layer -- one that isn't _overwritten_ or _dropped_
| between layer activations, but instead persists across inference
| steps, getting reused by the same layer on the next step: the
| output of layer N-1 from inference-step T-1 is combined with the
| output of layer N-1 from inference-step T to form (part of) the
| input to layer N at inference-step T. (To have
| a model learn to do something with this "tool", you just need to
| ensure its training is measuring predictive error over multi-
| token sequences generated using this multi-step working-memory
| persistence.)
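|
| (A minimal sketch of that wiring, assuming PyTorch -- the `memory`
| buffer in each block is carried across inference steps, but data
| never flows backward within a step, so each step is still strictly
| feed-forward:)
|
|       import torch
|       import torch.nn as nn
|
|       class BlockWithMemory(nn.Module):
|           def __init__(self, dim):
|               super().__init__()
|               self.ff = nn.Linear(dim * 2, dim)
|               self.memory = torch.zeros(dim)  # persists across steps
|
|           def forward(self, x):
|               # combine layer N-1's output from THIS step (x) with
|               # layer N-1's output retained from the PREVIOUS step
|               h = self.ff(torch.cat([x, self.memory], dim=-1))
|               # detached here for an inference-only sketch; training
|               # would backprop through multi-step sequences instead
|               self.memory = x.detach()
|               return h
|
|       layers = nn.ModuleList([BlockWithMemory(64) for _ in range(4)])
|
|       def inference_step(x):
|           for layer in layers:            # data only moves forward
|               x = layer(x)
|           return x
|
|       for _ in range(3):                  # successive token steps
|           out = inference_step(torch.randn(64))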
|
| ...but we aren't currently allowing models to do that. Models
| currently "have amnesia" between steps. In order to do any kind
| of asynchronous multi-step thinking, everything they know about
| "what they're currently thinking about" has to somehow be encoded
| -- _compressed_ -- into the observed-space sequence, so that it
| can be recovered and reverse-engineered into latent context on
| the next step. And that compression is _very_ lossy.
|
| And this is why ChatGPT isn't automatically a better
| WolframAlpha. It can tell you how all the "mental algorithms"
| involved in higher-level maths work -- and it can _try_ to follow
| them itself -- but it has nowhere to keep the large amount of
| "deep" [i.e. latent-space-level] working-memory context required
| to "carry forward" these multi-step processes between inference
| steps.
|
| You _can_ get a model (e.g. o1) to limp along by dedicating much
| of the context to "showing its work" in incredibly-minute detail
| -- essentially trying to force serialization of the most
| "surprising" output in the latent layers as the predicted token
| -- but this fights against the model's nature, especially as the
| model still needs to dedicate many of the feed-forward layers to
| deciding how to encode the chosen "surprising" embedding into the
| same observed-space vocabulary used to communicate the final
| output product to the user.
|
| Even assuming context-window costs that scale only linearly, the
| cost of this approach to working-memory serialization grows
| superlinearly relative to the intelligence achieved. It's
| untenable as a long-term strategy.
|
| Obviously, my prediction here is that we'll build models with
| real inference-framework-level working memory.
|
| ---
|
| At that point, if you're adding mutable weights and working
| memory, why not just admit defeat with the Transformer architecture
| and go back to RNNs?
|
| Predictability, mostly.
|
| The "constant-bounded compute per output token" property of
| Transformer models is the key guarantee that has enabled "AI" to
| be a commercial product right now, rather than a toy in a lab.
| Any further advancements must preserve that guarantee.
|
| Write-once-per-layer long-term-durable mutable weights preserve
| that guarantee. Write-once-per-layer retained-between-inference-
| steps session memory cells preserve that guarantee. But anything
| with real _recurrence_ does not preserve that guarantee.
| Allowing recurrence in a neural network is like allowing
| backward-branching jumps in a CPU program: it moves you from the
| domain of guaranteed-to-halt co-programs to the domain of
| unbounded Turing-machine software.
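|
| (Reduced to control flow, the distinction is just this -- a loop
| with a trip count fixed by the model's depth versus an open-ended
| "think until done" loop; toy Python stubs:)
|
|       # Transformer-style step: cost per token is fixed by depth.
|       def transformer_step(x, layers):
|           for layer in layers:      # known trip count, known cost
|               x = layer(x)
|           return x
|
|       # Recurrent "think until converged": no static cost bound.
|       def recurrent_step(x, cell, done):
|           while not done(x):        # backward jump: may not halt
|               x = cell(x)
|           return x
|
|       layers = [lambda v: v + 1 for _ in range(12)]
|       print(transformer_step(0, layers))   # always 12 layer calls
|       print(recurrent_step(0, lambda v: v + 1, lambda v: v >= 5))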
| lowsong wrote:
| Last week I had to caution a junior engineer on my team to only
| use an LLM for the first pass, and never rely on the output
| unmoderated.
|
| They're fine as glorified autocomplete, fuzzy search, or in other
| applications where accuracy isn't required. But to rely on them
| in any situation where accuracy is important is professional
| negligence.
| rglover wrote:
| Yes, but not absolutely.
|
| LLMs are a valuable tool for augmenting productivity. Used
| properly, they _do_ give you a competitive advantage over someone
| who isn't using them.
|
| The "dead end" is in them being some magical replacement for
| skilled employees. The level of delusion pumping out of SV and
| AI companies desperate to make a buck is unreal. They talk about
| _chat bots_ like they're already solving humanity's toughest
| problems (or will be in "just two more weeks"). In reality,
| they're approximately good at solving certain problems (and they
| can only ever solve them from the POV of existing human knowledge
| -- they can't create). You still have to hold their hand quite a
| bit.
|
| This current wave of tech is going to have an identical outcome
| to the "blockchain all the things" nightmare from a few years
| back.
|
| Long-term, there's a lot of potential for AI, but this is just
| one significant step along the way. We're not "there" yet and
| won't be for
| some time.
| ein0p wrote:
| My take is that even if AI qualitatively stops where it is right
| now, and only continues to get faster / more memory efficient, it
| already represents an unprecedented value add to human
| productivity. Most people just don't see it yet. The reason is
| that it "fills in" the weak spots of the human brain: limited
| associative memory, attention, and working memory, and an
| aversion to menial mental work. This does for the
| brain what industrialization did for the body. All we need to do
| to realize its potential is emphasize _collaboration_ with AI,
| rather than the _replacement_ by AI that the pundits currently
| push as rage (and therefore click) bait.
| arvindrajnaidu wrote:
| AI cannot use composition?
| karaterobot wrote:
| Does anyone seriously think that the results of any current
| approach would suddenly turn into godlike, super-intelligent
| AGI if only we threw an arbitrary number of GPUs at it? I guess
| I assumed everyone believed this was a stepping stone at best,
| but was happy that it turned out to have some utility.
| bgnn wrote:
| What I find funny is that the discussion revolves largely around
| software development, where LLMs excel. Outside of that, and
| generating junk text like government reports or patent
| applications, they seem pretty useless. So most of society
| doesn't care about it, it's not as big a revolution as SWEs
| think it is at the moment, and the discussion about the future
| is actually philosophical: do we think the trend of development
| will continue, or will we hit a wall?
___________________________________________________________________
(page generated 2024-12-27 23:00 UTC)