[HN Gopher] Computing Inside an AI
       ___________________________________________________________________
        
       Computing Inside an AI
        
       Author : pongogogo
       Score  : 99 points
        Date   : 2024-12-14 09:52 UTC (1 day ago)
        
 (HTM) web link (willwhitney.com)
 (TXT) w3m dump (willwhitney.com)
        
       | t0lo wrote:
        | I've gotten to a point where I have a visceral reaction to
        | any intersection of AI and psychological thought. As a human
        | it dependably makes me feel sick. We're going to see a lot
        | of changes that are good, and not so good.
        
       | FezzikTheGiant wrote:
        | I think lots of apps are going to go in the
        | adaptive/generated-UI direction - even if it starts a lot
        | simpler than generating the code.
        
         | Towaway69 wrote:
          | Perhaps a UI based on a Salvador Dali painting - perhaps
          | we should also be questioning our UI concepts of slider,
          | button, window and co.
        
       | ilaksh wrote:
       | I think that Cerebras and Groq would be fun to experiment with
       | using normal LLMs for generating interfaces on the fly, since
       | they are so fast.
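        | 
        | A minimal sketch of the idea, hitting an OpenAI-compatible
        | endpoint and asking for a renderable JSON UI spec (the Groq
        | URL, model id, and JSON-mode support here are assumptions):
        | 
        |     import os, json, requests
        | 
        |     def generate_ui(task: str) -> dict:
        |         resp = requests.post(
        |             # assumed OpenAI-compatible Groq endpoint
        |             "https://api.groq.com/openai/v1/chat/completions",
        |             headers={"Authorization":
        |                      f"Bearer {os.environ['GROQ_API_KEY']}"},
        |             json={
        |                 "model": "llama-3.3-70b-versatile",  # assumed
        |                 # assumed JSON-mode support
        |                 "response_format": {"type": "json_object"},
        |                 "messages": [
        |                     {"role": "system", "content":
        |                      "Reply with a JSON UI spec: a list of "
        |                      "widgets, each {type, label, action}."},
        |                     {"role": "user", "content": task},
        |                 ],
        |             },
        |             timeout=30,
        |         )
        |         return json.loads(
        |             resp.json()["choices"][0]["message"]["content"])
        | 
        |     print(generate_ui("a panel for cropping a photo"))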
        
         | FezzikTheGiant wrote:
          | What's the cost difference between Groq/Cerebras and using
          | something else for inference on open-source models? I'm
          | guessing the speed comes at a cost?
        
           | ilaksh wrote:
            | I don't know off the top of my head; I've only played
            | with it a little, not seriously.
        
             | FezzikTheGiant wrote:
             | fair enough
        
           | el_isma wrote:
            | $0.6/$1 per M tokens on Groq/Cerebras vs $0.3 per M
            | tokens on DeepInfra (for Llama 3.3 70B).
           | 
           | But note the free tiers for groq and cerebras are _very_
           | generous.
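            | 
            | Back-of-envelope (reading that as Groq $0.6 and Cerebras
            | $1), assuming 10M tokens a month:
            | 
            |     # $ per 1M tokens (rates quoted above)
            |     rates = {"groq": 0.6, "cerebras": 1.0,
            |              "deepinfra": 0.3}
            |     tokens_m = 10  # millions of tokens per month
            |     for name, rate in rates.items():
            |         print(f"{name}: ${rate * tokens_m:.2f}/mo")
            |     # groq: $6.00/mo, cerebras: $10.00/mo,
            |     # deepinfra: $3.00/mo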
        
       | deadbabe wrote:
       | "Wherever they use AI as a tool they will, in the end, do the
       | same with human beings."
        
         | falcor84 wrote:
          | Why is that in quote marks? I couldn't find any matches in
          | TFA or elsewhere.
         | 
          | And as to the sentence itself, I'm unclear on what exactly
          | it's saying; people have been using other people as tools
          | since before recorded history. Leaving aside slavery, what
          | is it that you would say that HR departments and
          | capitalism in general do?
        
       | llm_trw wrote:
       | >Acting like a computer means producing a graphical interface. In
       | place of the charmingly teletype linear stream of text provided
       | by ChatGPT, a model-as-computer system will generate something
       | which resembles the interface of a modern application: buttons,
       | sliders, tabs, images, plots, and all the rest. This addresses
       | key limitations of the standard model-as-person chat interface:
       | 
       | Oh boy I can't wait for GPT Electron, so I can wait 60 seconds
       | for the reply to come back and then another 60 seconds for it to
       | render a sad face because I hit some guard rail.
        
         | Towaway69 wrote:
         | Not forgetting the computing power required to generate that
         | single sad face.
        
       | doug_durham wrote:
        | I appreciated the thought given in this piece. However, in
        | the age of LLMs these types of "what if we look at problems
        | this way..." pieces seem obsolete. Instead of asking the
        | question, just use an LLM to help you build the proof of
        | concept and see if it works.
       | 
        | Back in the pre-LLM days these types of thought pieces made
        | sense as a call to action, because the economics of creating
        | sophisticated proofs of concept were beyond any one person.
        | Now you can create implementations and iterate at nearly the
        | speed of thought. Instead of telling people about your idea,
        | show people your idea.
        
         | ilaksh wrote:
          | I'm kind of with you, in that you could build something
          | like it based on a fast LLM. But what they are actually
          | talking about is a new cutting-edge ML model that takes a
          | huge amount of data and compute to train.
        
           | doug_durham wrote:
           | I see your point, but that's not what I took away from the
           | article. To me it seems like an alternate way to use existing
            | models. In any case I think you could make a PoC that
            | touches on the main idea using an existing model.
        
             | ilaksh wrote:
             | Yes you can and there is at least one example, a web
             | application where you enter a URL and the LLM automatically
             | generates the page including links, you click a link, the
             | LLM fills it in on the fly. I can't remember the name of
             | it.
             | 
             | But they mention things like Oasis in the article that use
             | a specialized model to generate games frame-by-frame.
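              | 
              | A minimal sketch of that first idea (hypothetical;
              | Flask plus the openai client against any
              | chat-completions backend):
              | 
              |     from flask import Flask
              |     from openai import OpenAI
              | 
              |     app = Flask(__name__)
              |     client = OpenAI()
              | 
              |     @app.route("/<path:url>")
              |     def page(url):
              |         # Every request, link clicks included, is
              |         # answered by the model dreaming up HTML.
              |         r = client.chat.completions.create(
              |             model="gpt-4o-mini",  # assumed model id
              |             messages=[{"role": "user", "content":
              |                 f"Return only HTML for the imaginary "
              |                 f"page at /{url}. Use relative links "
              |                 "so clicks route back here."}],
              |         )
              |         return r.choices[0].message.content
              | 
              |     app.run()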
        
         | achierius wrote:
         | But LLMs are nowhere near being able to do what you suggest,
         | for anything that one person wouldn't have been able to do
         | beforehand.
        
           | llm_trw wrote:
            | If I cared enough about GUIs I could implement what the
            | OP said in two months by myself, with unlimited access
            | to a good coding model, something like QwQ.
            | 
            | The issue is training a multimodal model that can make
            | use of said GUI.
            | 
            | I don't believe that there is a better general interface
            | than text, however, so I won't bother.
        
           | throw646577 wrote:
           | No amount of repeating is ever, unfortunately, going to get
           | this across; LLMs are founding a new kind of alchemy.
        
           | bobxmax wrote:
           | They absolutely are. I'm somewhat non-technical but I've been
           | using Claude to hack MVPs together for months now.
        
           | doug_durham wrote:
            | Not in my experience. Used properly, an LLM is an
            | immense accelerator. Every time this comes up on HN we
            | get the same debate: one side says LLMs are time-wasting
            | toys, the other says they are transformative. You need
            | to know how to ask questions critically and critique
            | answers to use a search engine effectively. The same is
            | true for an LLM. Once you learn how to pose your
            | questions, and at what level to ask them, it is a
            | massive accelerator.
        
         | beepbooptheory wrote:
         | If we are never going to take the time to write, articulate, or
         | even think about things anymore, how can we still feel like we
         | have the authority or skills or even context to evaluate what
         | we generate?
        
         | dartos wrote:
          | > because the economics of creating sophisticated proofs
          | of concept were beyond any one person
         | 
         | What?
         | 
         | Are you trying to say it's too expensive for a single worker to
         | make a POC, or that one person can't make a POC?
         | 
         | Either way that's not true at all...
         | 
         | There have been one person software shops for a long long time.
        
       | mirekrusin wrote:
        | Why stop there? Let it figure out how to please us without
        | the need for sliders etc. We'll just relax. Now that's a
        | paradigm shift.
        
         | Towaway69 wrote:
          | That was my thought: is model-as-computer the best we can
          | do?
          | 
          | Isn't that limiting our perspective of AI models to being
          | computers, so that whatever computers can't do, the model
          | can't do either?
        
           | uxhacker wrote:
            | So what are the other models we could use?
        
             | Towaway69 wrote:
             | Perhaps metaphor would be better terminology than models.
             | 
             | AI as an animal we are trying to tame. Why does it have to
             | be a machine metaphor?
             | 
              | Perhaps AI is an ecosystem with which we all interact
              | at the same time. The author pointed out that
              | one-on-one interaction is too slow for the AI -
              | perhaps a many-to-one metaphor would be more
              | appropriate.
              | 
              | I agree with the author that we are using the wrong
              | metaphors when interacting with AI, but personally I
              | think we should go beyond repeating the mistakes of
              | the past by just extending our current state, i.e.
              | going from a physical desktop to a virtual "desktop".
        
               | uxhacker wrote:
                | How about PowerPoint as a metaphor? The challenge we
                | face is how to explain something complex. But don't
                | we also run into the issue that the medium is the
                | message? Just by using voice rather than an image,
                | don't we change the meaning? And is that necessarily
                | bad?
        
               | Towaway69 wrote:
               | > And is that necessarily bad?
               | 
                | Selecting a metaphor implies that one's imagination
                | is - at least partially - constrained by the
                | metaphor. AI as a PowerPoint would make using AI for
                | anything other than presentations seem unusual,
                | since that's what PowerPoint is used for.
               | 
                | Also, when the original author says "models as
                | computers", what does "computer" represent? A
                | mainframe the size of a small apartment, a
                | smartphone, a laptop, a Turing machine, or some
                | collection of server racks? Even the term "computer"
                | is broad enough to include many forms of
                | interaction: I interact with my smartphone visually
                | but with my server rack textually, yet both are
                | computers.
               | 
               | At least initially, AI seems to be something completely
               | different, almost god-like in its ability to provide us
               | with insightful answers and creative suggestions. God-
               | like meaning that judged from the outside, AI has the
               | ability to provide comforting support in times of need,
               | which is one characteristic of a god-like entity.
               | 
                | PowerPoint wasn't built to be a god-like provider of
                | answers to the most important questions. It would
                | indeed be surprising if a PowerPoint presentation
                | made the same impact as religious scriptures on
                | thousands or millions of people (individual
                | experiences aside).
        
           | TeMPOraL wrote:
           | > _That was my thought, is model as a computer the best we
           | can do?_
           | 
           | Nah, there's a better option. Instead of a computer, we
           | could... go for treating it as a _person_.
           | 
           | Yes, that's inverting the whole point of the
           | article/discussion here, but think about it: the main
           | limitation of a computer is that we have to tell it step-by-
           | step what to do, because it can't figure out what we _mean_.
           | Well, LLMs can.
           | 
           | Textual chat interface is annoying, particularly the way it
           | works now, but I'd say the models are fundamentally right
           | where they need to be - it's just that a human person doesn't
           | use a single thin pipe of a text chat to communicate with the
           | world; they may converse with others explicitly, but that's
           | augmented by orders of magnitude more of contextual inputs -
           | sights, sounds, smells, feelings, memory, all combining into
           | higher-level memories and observations.
           | 
           | This is what could be the better alternative to "LLM as
           | computer": double down on tools and automatic context
           | management, so the user inputs are merely the small fraction
           | of data that's provided explicitly; everything else, the
           | model should watch on its own. Then it might just be able to
           | reliably Do What I Mean.
        
       | holoduke wrote:
        | Amateur question: could there be a point where an LLM uses
        | less compute power to calculate a certain formula than
        | regular computation does?
        
         | logicchains wrote:
         | When the LLM knows how to simplify/solve the formula and the
         | person using it doesn't, it could be much more efficient than
         | directly running the brute-force/inefficient version provided
         | by the user. A simple example would be summing all numbers from
         | 0 to a billion; if you ask o1 to do this, it uses the O(1)
         | analytical solution, rather than the naive brute-force O(n)
         | approach.
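          | 
          | Concretely, the closed form it reaches for:
          | 
          |     n = 10**9
          |     brute = sum(range(n + 1))   # O(n): 1e9 additions, slow
          |     closed = n * (n + 1) // 2   # O(1): Gauss's formula
          |     assert brute == closed      # 500000000500000000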
        
           | stevesimmons wrote:
            | Though even in this case, it is enormously more
            | efficient to simply sum the first billion integers
            | rather than to find an analytic solution via a
            | 405B-parameter LLM...
        
         | piotr93 wrote:
          | Yes, an LLM could do it, since it can predict the next
          | token for pretty much anything. But what's the error
          | margin you are ready to tolerate?
        
       | JTyQZSnP3cQGa8B wrote:
       | I wish we had some kind of _Central Processing Unit_ to do this
       | instead of relying on hallucinating remote servers that need a
       | subscription.
        
       | piotr93 wrote:
        | The only computations that an LLM does are backprops and
        | forward passes. It cannot run an arbitrary program
        | description. Yes, it will hallucinate your program's output
        | if you feed it a good-enough starting prompt. But that's it.
        
         | FezzikTheGiant wrote:
          | Genuinely, what's the point of this comment? Are you
          | allergic to cool stuff? Honestly curious as to what you
          | were trying to achieve with it.
          | 
          | Nowhere in this post does the author say that it's ready
          | with the current state of models, or that he'd use a
          | foundation model for this. Why the hate?
        
         | logicchains wrote:
         | An LLM with chain of thought and unbounded compute/context can
         | run any program in PTIME: https://arxiv.org/abs/2310.07923 ,
         | which is a huge class of programs.
        
           | piotr93 wrote:
           | Woah super interesting, I didn't know about this. Will def
           | read it! Seems like I was wrong?
        
           | csmpltn wrote:
           | > "An LLM with unbounded compute/context"
           | 
           | This isn't a thing we have, or will have.
           | 
           | It's like saying that a computer with infinite memory, CPU
           | and power can certainly break SHA-256 and bring the world's
           | economy down with it.
        
             | stavros wrote:
              | No, it's like saying: "A computer can't crack SHA
              | hashes, it can only add and subtract numbers." "A
              | computer can crack any SHA hash." "Yes, given infinite
              | time."
             | 
             | The fact that you need infinite time for some of the stuff
             | doesn't mean you can't do any of the stuff.
        
             | hansonkd wrote:
              | I mean, it doesn't need to compute _all_ programs in a
              | humanly reasonable amount of time.
             | 
             | It just needs to be able to compute enough programs to be
             | useful.
             | 
             | Even our current infrastructure of precisely defined
             | programs and compilers isn't able to compute all programs.
             | 
              | It seems reasonable that in the future we could give
              | an LLM the Python language specification and a Python
              | program, and it would iteratively return the answer.
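              | 
              | A one-shot sketch of that (hypothetical prompt scheme;
              | a truly iterative version would loop on intermediate
              | traces):
              | 
              |     from openai import OpenAI
              | 
              |     client = OpenAI()
              | 
              |     def llm_run(spec: str, program: str) -> str:
              |         # Ask the model to act as an interpreter,
              |         # tracing the program before answering.
              |         r = client.chat.completions.create(
              |             model="gpt-4o",  # assumed model id
              |             messages=[
              |                 {"role": "system", "content":
              |                  "You are a Python interpreter. "
              |                  "Language spec:\n" + spec},
              |                 {"role": "user", "content":
              |                  "Trace this program step by step, "
              |                  "then print only its final stdout:\n"
              |                  + program},
              |             ],
              |         )
              |         return r.choices[0].message.content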
        
           | Vetch wrote:
           | Note that this is an expressibility (upper) bound on
           | transformers granted intermediate decoding steps. It says
           | nothing about their learnability, and modern LLMs are not
           | near that level of expressive capacity.
           | 
           | The authors also introduce projected pre-norm and layer-norm
           | hash to facilitate their proofs, another sense in which it is
           | an upper-bound on the current approach to AI, since these
           | concepts are not standard. Nonetheless, the paper shows how
           | allowing a number of intermediate decoding steps polynomial
           | in input size is already enough to run most programs of
           | interest (which are in P).
           | 
           | There are additional issues. This work relies on the concept
           | of saturated attention, however as context length grows in
           | real world transformers, self-attention deviates from this
           | model as it becomes noisier, with unimportant indices getting
           | undue focus (IIUC, due to precision issues and how softmax
           | assigns non-zero probability to every token). Finally, it's
            | worth noting that the more under-specified your problem
            | is, and the more complex the problem representation, the
            | more quickly the induced probabilistic inference problem
            | becomes intractable. Unless you're explicitly (and
            | wastefully) programming a simulated Turing machine
            | through the LLM, this will be far from real-time
            | interactive. Users should expect a Prolog-like
            | experience of spending most of their time working out
            | how to help the search.
           | 
            | Trivia: softmax also introduces another problem: the way
            | it is applied forces attention to always assign
            | importance to some tokens, often leading to a dumping of
            | focus on semantically unimportant tokens like
            | whitespace. This can lead to an overemphasis on such
            | tokens, possibly inducing spurious correlations on
            | whitespace that then propagate through the network, with
            | possibly unexpected negative downstream effects.
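            | 
            | A tiny illustration of that last point - softmax never
            | outputs an exact zero, so every token always gets some
            | attention mass:
            | 
            |     import numpy as np
            |     s = np.array([8.0, 0.1, -4.0])   # attention logits
            |     w = np.exp(s) / np.exp(s).sum()  # softmax
            |     print(w)  # ~[9.996e-01 3.7e-04 6.1e-06], no zeros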
        
       | K0balt wrote:
       | Very interesting paradigm shift.
       | 
       | Tangentially, I have considered the possible impact of
       | thermodynamic computing in its application to machine learning
       | models.
       | 
        | If (big if) we can get thermodynamic compute wells to work
        | at room temperature or with cheap microcryogenics, it's
        | foreseeable that we could have flash-scale AI accelerators
        | (thermodynamic wells could be very simple in principle, like
        | a flash cell).
       | 
       | That could give us the capability to run Tera-parameter models on
       | drive-size devices using 5-50 watts of power. In such a case, it
       | is foreseeable that it might become more efficient and economical
       | to simulate deterministic computing devices when they are
       | required for standard computing tasks.
       | 
        | My knee-jerk reaction is "probably not", but still, it's a
        | foreseeable possibility.
       | 
       | Hard to say what the ramifications of that might be.
        
       | thrwthsnw wrote:
        | This is the wrong direction; it's retrograde to shoehorn
        | NATURAL LANGUAGE UNDERSTANDING into existing GUI metaphors.
        | 
        | Instead of showing a "discoverable" palette of buttons and
        | widgets, which is limited by screen space, just ASK the
        | model what it can do, and make sure it can answer. People
        | obviously don't know to do that yet, so a simple on-screen
        | prompt to the user will be necessary.
       | 
        | Yes, we should have access to "sliders" and other controls
        | for fine-tuning the output or maintaining a desired setting
        | across generations, but those are secondary to the models'
        | ability to make sweeping and cohesive changes and to offer
        | many alternatives for the user to CHOOSE from before they
        | get to the stage of making fine-grained adjustments.
        
       | decasia wrote:
        | I think the proof that this is a good article is that
        | people's reactions to it are taking them in so many
        | different directions. It might or might not be very
        | actionable this year (I, for one, would like to see a lower
        | level of hallucination and mansplaining in LLM output before
        | it starts to hide itself behind a dynamically generated UI),
        | but it seems, for sure, good to think with.
        
       | someothherguyy wrote:
       | This seems like an inevitability. That is, eventually, "AI" will
       | be used to create adaptive interfaces for whatever the HCI user
       | wants: graphical, immersive, textual, voice, and so on.
        
       | irthomasthomas wrote:
        | On a related line of enquiry, both gemini-2-flash and
        | sonnet-3.5-original can act like computers, interpreting and
        | responding to instructions written in code. These two models
        | are the only ones that do it reliably.
       | 
       | Here's a thread
       | https://x.com/xundecidability/status/1867044846839431614
       | 
        | And an example function for Gemini written in shell, where
        | the system prompt is the function definition that interacts
        | with the model:
       | https://github.com/irthomasthomas/shelllm.sh/blob/main/shelp...
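        | 
        | The trick, roughly (a hypothetical Python rendering of the
        | shell version; the model id is an assumption): the system
        | prompt *is* a function definition, and the model is asked to
        | act as its interpreter.
        | 
        |     from openai import OpenAI
        | 
        |     client = OpenAI()
        | 
        |     SYSTEM = '''def shelp(task: str) -> str:
        |         """Return one shell command that does `task`.
        |         Output only the command, no prose."""'''
        | 
        |     def shelp(task: str) -> str:
        |         r = client.chat.completions.create(
        |             model="gemini-2.0-flash",  # assumed id/proxy
        |             messages=[
        |                 {"role": "system", "content": SYSTEM},
        |                 {"role": "user",
        |                  "content": f'shelp("{task}")'},
        |             ],
        |         )
        |         return r.choices[0].message.content
        | 
        |     print(shelp("find files larger than 1GB"))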
        
       | mithametacs wrote:
        | We're going to have to move past thinking of an LLM as just
        | a model.
       | 
        | It's a database. The WYSIWYG example would require different
        | object types to have different UI components, so if you
        | change what a container represents in the UI, all its
        | children should be recomputed.
       | 
        | We need a direct association between labels in the model
        | space and labels in the UI space.
        
         | Zr01 wrote:
         | Databases don't hallucinate.
        
           | mithametacs wrote:
           | Correct, they just don't return anything. Which is the right
           | behavior sometimes and the wrong behavior others.
        
       | padolsey wrote:
       | > communicating complex ideas in conversation is hard and lossy
       | 
       | True but..
       | 
       | > instead of building the website, the model would generate an
       | interface for you to build it, where every user input to that
       | interface queries the large model under the hood
       | 
        | This to me seems wildly _more_ lossy, though, because it is
        | by its nature immediately constraining. Whereas conversation
        | at least has the possibility of expansiveness and lateral
        | step-taking. I feel like mediating via an interface might
        | become too narrow too quickly, maybe?
       | 
        | For me, conversation, although linear and lossy, melds well
        | with how our brains work. I just wish the conversational UXs
        | we had access to were less rubbish, less linear. E.g. I'd
        | love Claude or any of the major AI chat interfaces to have a
        | 'forking' capability, so I can go back to a certain point in
        | the chat and fork off a new rabbit hole of context.
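        | 
        | A minimal sketch of that forkable history (names here are
        | illustrative): the chat is a tree, and forking just starts a
        | new child at an earlier node.
        | 
        |     from dataclasses import dataclass, field
        | 
        |     @dataclass
        |     class Node:
        |         role: str   # "user" or "assistant"
        |         text: str
        |         parent: "Node | None" = None
        |         children: list = field(default_factory=list)
        | 
        |         def reply(self, role, text):
        |             child = Node(role, text, parent=self)
        |             self.children.append(child)
        |             return child
        | 
        |         def context(self):
        |             # Rebuild the prompt for this branch only.
        |             node, msgs = self, []
        |             while node:
        |                 msgs.append({"role": node.role,
        |                              "content": node.text})
        |                 node = node.parent
        |             return list(reversed(msgs))
        | 
        |     root = Node("user", "Help me plan a website")
        |     a = root.reply("assistant", "Sure - what kind?")
        |     b = root.reply("assistant", "Fork: start from goals.")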
       | 
       | > nobody would want an email app that occasionally sends emails
       | to your ex and lies about your inbox. But gradually the models
       | will get better.
       | 
        | I think this is a huge impasse, though. And we can never
        | make models 'better' in this regard. What needs to get
        | 'better' - somehow - is how we mediate between models and
        | their levers into the computer (what they have permission to
        | do). It's a bad idea to have even a highly 'aligned' LLM
        | send emails on our behalf without keeping us in the loop.
        | The surface area for problems is just too great.
        
         | gavindean90 wrote:
          | Yeah, a forked-conversation UX is definitely one of my
          | most desired features.
        
       | sheeshkebab wrote:
       | Can it run DOOM yet?
        
       | handfuloflight wrote:
        | The author doesn't seem to engage with a core problem:
        | humans rely on muscle memory and familiar patterns. Dynamic
        | interfaces that change every session would force constant
        | relearning. That's death by a thousand micro-learning
        | curves, no matter how "optimal" each generated UI might be.
        
         | thorum wrote:
          | The solution is user interfaces that are stable but
          | infinitely customizable by the user for their personal
          | needs and preferences, rather than fixed until a developer
          | updates them.
        
       ___________________________________________________________________
       (page generated 2024-12-15 23:01 UTC)