[HN Gopher] Language models as compilers: Simulating pseudocode execution
       ___________________________________________________________________
        
       Language models as compilers: Simulating pseudocode execution
        
       Author : milliondreams
       Score  : 160 points
        Date   : 2024-04-04 19:46 UTC (1 day ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | spxneo wrote:
        | This seems quite promising. Using pseudo-code as an
        | intermediary step isn't new, but it seems like this takes it a
        | bit further. Will need to see some code and test it out.
        
       | Mathnerd314 wrote:
        | The phase 2 prompt is complete, but the phase 3 prompt's
        | initial part ends at "When constructing the main function,
        | ...", and there is no mention of random seeds, so I guess this
        | paper is not reproducible at all.
        
       | inciampati wrote:
       | It's going to be really fascinating to see this applied instead
       | of chain of thought and other kinds of reasoning approaches,
       | because it's generic. It should in principle work on every kind
       | of LLM.
        
       | pkoird wrote:
       | Any sufficiently advanced LLM is indistinguishable from Prolog.
       | 
        | I half-jest, but I envision LLM research heading towards a
        | parser-oriented setup where LLMs merely extract the entities
        | and relations and the actual logic is done by a logical engine
        | such as Prolog.
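        | 
        | A minimal sketch of that split, assuming a hypothetical
        | extract_facts() helper standing in for the LLM; the actual
        | inference then runs in SWI-Prolog via the real pyswip bindings:
        | 
        |     from pyswip import Prolog
        | 
        |     def extract_facts(text: str) -> list[str]:
        |         # Stand-in for an LLM call that turns prose into Prolog
        |         # facts; a real version would prompt a model here.
        |         return ["parent(tom, bob)", "parent(bob, ann)"]
        | 
        |     text = "Tom is Bob's father. Bob is Ann's father."
        |     prolog = Prolog()
        |     for fact in extract_facts(text):
        |         prolog.assertz(fact)
        | 
        |     # The LLM only extracted entities and relations; the actual
        |     # inference (who are Tom's grandchildren?) is done by the
        |     # engine.
        |     print(list(prolog.query("parent(tom, Y), parent(Y, Z)")))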
        
         | nickpsecurity wrote:
          | In truth, I thought it would be a great idea to train them on
          | turning English and code into formal specs and logic
          | languages. Then, we could use all our tools in those areas to
          | work from there.
          | 
          | Combining LLMs with rewriting logic, like Maude or the K
          | Framework, would be the most powerful option. The rewriting
          | tools plus LLMs could probably rapidly develop static
          | analyzers and code-porting tools.
        
         | pyinstallwoes wrote:
          | Hehe, this was the exact thinking going through my mind while
          | considering the underlying architecture for an LLM OS/word
          | processor that was self-aware of the documents across a
          | project, for consistency.
        
         | gpm wrote:
          | I envision a thread of research in the exact opposite
          | direction. Take a logic program/theorem/database query/...
          | and use an LLM to guide a search for a solution/proof/query
          | plan/...
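          | 
          | A rough sketch of what I mean, with llm_score() as a
          | hypothetical call that asks a model how promising a partial
          | proof/plan looks; the search itself is ordinary best-first:
          | 
          |     import heapq
          |     import itertools
          | 
          |     def llm_score(state) -> float:
          |         # Hypothetical: prompt a model to rate how promising
          |         # this partial solution is, and parse out a number.
          |         ...
          | 
          |     def guided_search(start, expand, is_goal):
          |         tie = itertools.count()  # tiebreaker, so states
          |         frontier = [(0.0, next(tie), start)]  # never compare
          |         seen = set()
          |         while frontier:
          |             _, _, state = heapq.heappop(frontier)
          |             if is_goal(state):
          |                 return state
          |             if state in seen:
          |                 continue
          |             seen.add(state)
          |             for nxt in expand(state):
          |                 # Higher model score = explored sooner.
          |                 heapq.heappush(
          |                     frontier, (-llm_score(nxt), next(tie), nxt))
          |         return None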
        
           | winwang wrote:
           | LLM the Ultimate Heuristic?
        
             | kevindamm wrote:
             | It would be convenient if true. I'm a little skeptical it
             | will go that far, but at the same time humans seem to
             | better understand a tricky problem after an internal
              | monologue about it or a quick conversation with a rubber
              | ducky... so maybe?
        
           | ithkuil wrote:
            | I wonder if the problem that LLMs solved was not the lack
            | of "intelligence" of logic-driven systems, but the lack of
            | a particular intelligence that is so crucial for us to make
            | effective use of the interaction with such tools: namely,
            | the ability to actually understand our natural language.
           | 
           | That feature by itself is not enough, but can be a very
           | effective glue to be used with other components of an
            | intelligent system. The analogy with the human brain would
            | be Broca's area vs. the rest of the brain.
           | 
           | Now, there are open questions about whether the
           | _architecture_ that underpins the LLMs is also good enough to
            | be used as a substrate for other functions, and what the
            | most effective way is for these different components of the
            | system to communicate with each other.
           | 
           | The analogy with the human brain can guide us (as well as
           | lead us astray), in that our brain, like biological systems
           | often do, re-purposes the basic building blocks to create
           | different subsystems.
           | 
           | It's not clear to me at which level we'll find the most
           | effective re-purposable building blocks.
           | 
            | It's easy to try (and people do) to use the top-level LLM
            | system as such a building block: have it produce plans,
            | connect it to external systems that feed information back,
            | and have it iterate again, (ab)using its language
            | processing as an API with the environment.
           | 
           | The human analogy of that is when we use external tools to
           | extend our cognitive capacity, like when we do arithmetic
           | using pencil and paper or when we scribble some notes to help
           | us think.
           | 
           | I think this level is useful and real but I wonder if we also
           | need to give more power to some lower levels too.
           | 
            | Granted, some of that "power" may already emerge during the
            | training of the LLMs, but I wonder if some more specialized
            | blocks might enhance the effectiveness.
        
         | lachlan_gray wrote:
         | You might enjoy parts of this interview with Stephen Wolfram
         | 
         | https://m.youtube.com/watch?v=PdE-waSx-d8
         | 
         | He's very much on this kind of beat. In general I have a
         | feeling that there are orders of magnitude to gain by
         | successfully applying computer science to "language
         | algorithms".
         | 
          | Feels like we are exploring very narrow paths of the
          | computations that can be performed with language. Like we
          | have an x86 CPU and we are building pocket calculators. So
          | much untapped potential.
         | 
          | Re: Prolog, I had a similar intuition at some point and tried
          | to make a stack-based programming language that uses a
          | language model as a kind of control/logic unit:
         | 
         | https://github.com/LachlanGray/silas
         | 
          | I was missing a bunch of CS background at the time, so I
          | didn't get very far, but I feel like there's a lot to be done
          | for this kind of thing.
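          | 
          | Roughly the shape of the idea (not the actual silas code;
          | llm_decide is a stand-in for whatever model call you like): a
          | toy stack machine where one opcode defers control to the
          | model:
          | 
          |     def llm_decide(prompt: str, options: list[str]) -> str:
          |         # Hypothetical: have the model pick one of the options.
          |         ...
          | 
          |     def run(program: list, stack: list) -> list:
          |         for op in program:
          |             if op == "choose":
          |                 # The model acts as the control/logic unit.
          |                 options = stack.pop()
          |                 stack.append(llm_decide("pick one", options))
          |             elif callable(op):
          |                 op(stack)  # ordinary deterministic op
          |             else:
          |                 stack.append(op)  # literal: push onto the stack
          |         return stack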
        
         | westoncb wrote:
          | I think this is an important perspective. It helps clarify
          | prompting as well because, used in a certain way, prompts are
          | effectively natural-language specifications of constraints,
          | which have the effect of 'partially configuring' the network
          | so that its inference selects from the configuration space of
          | its still-free params.
         | 
         | For example, if you tell it to reply in JSON (and it obeys),
         | you've just constrained its search space in a particular way.
         | There is space for very interesting informal programming that
         | can be done from this perspective, setting up constraints and
         | then allowing inference to solve within them. I've been using
         | this heavily.
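          | 
          | A tiny sketch of that constrain-then-solve loop, assuming a
          | hypothetical complete() wrapper around whatever chat API you
          | use; the constraint is stated up front and then enforced by
          | validation and retry:
          | 
          |     import json
          | 
          |     def complete(prompt: str) -> str:
          |         # Hypothetical LLM call.
          |         ...
          | 
          |     def constrained(prompt: str, keys: set[str],
          |                     retries: int = 3) -> dict:
          |         instruction = (prompt + "\nReply with a single JSON "
          |                        "object with exactly these keys: "
          |                        + ", ".join(sorted(keys)))
          |         for _ in range(retries):
          |             reply = complete(instruction)
          |             try:
          |                 obj = json.loads(reply)
          |                 if keys <= obj.keys():  # constraint satisfied
          |                     return obj
          |             except json.JSONDecodeError:
          |                 pass  # constraint violated; sample again
          |         raise ValueError("model never satisfied the constraint")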
         | 
         | When I was first getting deep into LLM stuff a few months ago
         | and contemplating latent space my main characterization was
         | that much of its high level behavior can be usefully grappled
         | with by viewing it as a kind of 'learned geometric prolog'.
         | 
         | I did a bunch of illustrations and talked about some of these
         | ideas here if anyone's curious:
         | https://x.com/Westoncb/status/1757910205478703277 (I think I
         | mostly dropped the prolog terminology in that presentation
         | because not everyone knows about it)
        
         | thesz wrote:
         | https://en.wikipedia.org/wiki/Cyc#MathCraft
         | 
          | Quote:
          | 
          |     One Cyc application aims to help students doing math at a
          |     6th grade level, helping them much more deeply understand
          |     that subject matter... Unlike almost all other
          |     educational software, where the computer plays the role
          |     of the teacher, this application of Cyc, called
          |     MathCraft, has Cyc play the role of a fellow student who
          |     is always slightly more confused than you, the user, are
          |     about the subject.
         | 
          | This is from 2017. I hadn't seen anything like this using LMs
          | in 2017, and I suspect it is still hard for LLMs today.
          | 
          | Cyc is a huge reasoning engine. You can call it Prolog, if
          | you want. I won't.
        
           | ogogmad wrote:
           | > I suspect it is still hard for LLMs
           | 
           | I just gave it to Claude: https://imgur.com/a/fQQOy1d
        
             | Sakos wrote:
             | 1) This is ridiculously cool
             | 
             | 2) The "action" text gives me such I put on my wizard hat
             | vibes
        
             | thesz wrote:
              | MathCraft [1] is little more [2] than chat.
              | 
              | [1] https://www.cyc.com/mathcraft/
              | 
              | [2] https://www.youtube.com/watch?v=pbrp7MzBDm0
        
         | pjmlp wrote:
          | The moment they are fully there, just like an executable
          | generated from a Prolog compiler, it will be time to realize
          | that programming is no more, except for the select few at the
          | big corps that create LLM compilers.
        
         | mirekrusin wrote:
          | It also has further-reaching consequences.
          | 
          | It creates a foundation for reinforcement learning without
          | human feedback - a missing piece of the puzzle.
          | 
          | Simplifying: propose a plausible theorem, try to find a
          | provable solution, reinforce the reasoning/solution path,
          | move the proved statement into the axioms, repeat.
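          | 
          | In pseudo-Python, with every named function a stand-in (none
          | of these are real APIs), the loop would look like:
          | 
          |     def propose_theorem(axioms): ...   # LLM proposes conjecture
          |     def find_proof(conj, axioms): ...  # proof search
          |     def reinforce(conj, proof): ...    # RL update on the path
          | 
          |     def self_play(axioms: set[str], steps: int) -> set[str]:
          |         for _ in range(steps):
          |             conjecture = propose_theorem(axioms)
          |             proof = find_proof(conjecture, axioms)
          |             if proof is not None:
          |                 reinforce(conjecture, proof)
          |                 # Proved -> usable as a lemma from now on.
          |                 axioms.add(conjecture)
          |         return axioms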
         | 
          | (Super)intelligence has many dimensions. One of the less
          | explored ones is exploiting concurrency in thought chains.
          | It's something very unnatural to us, but there is a lot to
          | gain if you're able to branch, collect feedback from dead
          | ends, and progress from different directions taken at the
          | same time.
        
         | ginko wrote:
          | I was wondering whether AI systems could be composed of
          | multiple internal agents that, when prompted, discuss
          | internally and agree on what to reply in the end. Some of the
          | agents could be LLMs, but others could be logic engines or
          | database frontends, for instance.
        
         | DebtDeflation wrote:
         | >where LLMs merely extract the entities and relations and the
         | actual logic is done by a logical engine such as Prolog.
         | 
         | Graph RAG is an emerging design pattern where factual
         | information is retrieved from knowledge graphs to augment the
         | textual information retrieved from document stores in classic
         | RAG before submitting to an LLM. What you are proposing is
         | essentially using the LLM to build the knowledge graph in the
         | first place. Would be interesting to see the two techniques
         | combined along with some sort of planner/optimizer in the
         | middle.
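          | 
          | A minimal sketch of the "LLM builds the graph" half, with
          | extract_triples() as a hypothetical model call; the retrieval
          | half then just walks the graph for facts to prepend to the
          | prompt:
          | 
          |     from collections import defaultdict
          | 
          |     def extract_triples(text: str) -> list[tuple[str, str, str]]:
          |         # Hypothetical: prompt an LLM for (subject, relation,
          |         # object) triples found in the text.
          |         ...
          | 
          |     def build_graph(docs: list[str]) -> dict:
          |         graph = defaultdict(list)
          |         for doc in docs:
          |             for s, r, o in extract_triples(doc):
          |                 graph[s].append((r, o))
          |         return graph
          | 
          |     def augment(question: str, entities: list[str],
          |                 graph: dict) -> str:
          |         # Classic-RAG document retrieval omitted; this only
          |         # adds the knowledge-graph facts.
          |         facts = [f"{s} {r} {o}"
          |                  for s in entities for r, o in graph.get(s, [])]
          |         return ("Known facts:\n" + "\n".join(facts)
          |                 + "\n\n" + question)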
        
         | astrange wrote:
         | The problem is that Prolog is already "AI" as in expert system
         | GOFAI - that approach already failed, so you shouldn't believe
         | people claiming that an entirely different connectionist
         | approach is going to prove that their failed alternative worked
         | all along.
        
           | codesnik wrote:
            | maybe that's because Prolog from the '80s basically
            | operated in a 1-bit space and had just thousands of
            | parameters, crafted by hand. But nowadays...
        
       | skeledrew wrote:
       | Seeing this makes me want to reactivate an old project[0]. Been
       | thinking more and more that LLMs could give it superpowers.
       | 
       | [0] https://pypi.org/project/neulang/
        
       | m3kw9 wrote:
        | If you train an LLM to compile, you probably also want to set
        | the randomness to zero; if that is the case, you've just "brute
        | forced" an actual compiler.
        
         | danielmarkbruce wrote:
         | you don't want a creative compiler?
        
           | fxcao wrote:
           | In this specific use case I think we should avoid any
           | creative aspect in the behavior of the LLM. Compiling might
            | look like a "word-for-word" translation, in a certain
            | manner. Isn't it?
        
           | layer8 wrote:
           | I see no inherent conflict between creativity and
            | determinism. You could give the compiler a seed value to
            | perturb the solution space.
        
           | m3kw9 wrote:
           | Why vote down a rhetorical question?
        
       | jumploops wrote:
       | English is terribly imprecise, so it makes sense to use pseudo
       | instructions to improve the bounds/outcome of a language model's
       | execution.
       | 
       | I do wonder how long hacks like this will be necessary; as it
       | stands, many of these prompting techniques are essentially
       | artificially expanding the input to enhance reasoning ability
       | (increasing tokens, thus increasing chance of success).
        
         | nimbleal wrote:
          | I hope there are more models trained on more precise inputs
          | going forward. I understand that natural language feels the
          | most futuristic, but while it has the lowest barrier to
          | entry, it's not only imprecise but also slow. Visual
          | approaches (for example ControlNets in Stable Diffusion, or
          | image input in ChatGPT, though both of these are somewhat
          | bolted on) and 2D semi-natural languages all merit further
          | inquiry.
         | 
         | Another (and perhaps the ultimate) possibility is to have some
         | way --- perhaps through simulations --- to directly expose the
         | model to the problem, rather than having a human/natural
         | language intermediary.
        
       | Voultapher wrote:
       | Non deterministic compilers, yay! Where do I sign up?
       | 
        | In all seriousness, miscompilations, or unexpected behavior
        | caused by the layers below you in general, are expensive to
        | find and fix. I think LLMs have a long way to go before such
        | use cases seem appealing to me.
        
         | ginko wrote:
         | Even regular compilers need quite a bit of nudging to give
         | deterministic results.
        
           | Phillipharryt wrote:
            | Correct me if I'm wrong here, but I am under the impression
            | they're only non-deterministic in the practical sense
            | (i.e., it produces this output on my machine, and I can't
            | know what minute differences there are on your machine),
            | but that's not non-deterministic in the truest sense. If
            | you have completely identical inputs, you will get the
            | exact same output; ergo, deterministic.
        
             | layer8 wrote:
             | You are correct. Compilers are deterministic, but
             | reproducible builds can be a challenge.
        
         | tiborsaas wrote:
         | It's better to have a non-deterministic compiler for a task
         | that would be really hard to write an algorithm for otherwise.
        
           | clbrmbr wrote:
           | But are LLMs really better at algorithm writing than you?
           | I've found that they work best when I've already pseudocoded
           | the algorithm.
        
             | tiborsaas wrote:
              | I didn't mean making it write the actual code and using
              | that, but there are tasks that are more error-prone or
              | near impossible to write with a traditional approach, so
              | using some zero-shot prompting is better than running
              | code.
              | 
              | That's why I find non-determinism acceptable when
              | otherwise it would be a pain to do something similar.
        
         | taneq wrote:
         | Non-determinism is an implementation detail, not an intrinsic
         | property, as I understand it (at least as long as you're
         | setting temperature to zero).
         | 
         | I likewise don't really think LLMs are the right tool for this
         | job, though. There's a whole class of systems that we built
         | because humans take a long time to learn new skills, are
         | fallible and non-repeatable, and get bored easily. Compilers
          | are in this group, along with sewing machines, CNC machines,
          | automatic gearboxes, and design rule checking in CAD.
         | 
         | Maybe they could provide heuristics for optimising compilers
         | with the output run through a formal verification check
         | afterwards?
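          | 
          | Something like this generate-and-verify loop, say, where both
          | helpers are stand-ins (the verifier would be a real
          | translation-validation / equivalence check, not an LLM):
          | 
          |     def llm_optimize(ir: str) -> str:
          |         # Hypothetical: model proposes a faster version of
          |         # the IR.
          |         ...
          | 
          |     def verified_equal(a: str, b: str) -> bool:
          |         # Stand-in for a formal equivalence check
          |         # (e.g. SMT-based translation validation).
          |         ...
          | 
          |     def optimize(ir: str, attempts: int = 5) -> str:
          |         # The model is only a heuristic; nothing unverified
          |         # ships.
          |         for _ in range(attempts):
          |             candidate = llm_optimize(ir)
          |             if verified_equal(ir, candidate):
          |                 return candidate
          |         return ir  # fall back to the original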
        
           | knightoffaith wrote:
           | >Non-determinism is an implementation detail, not an
           | intrinsic property, as I understand it (at least as long as
           | you're setting temperature to zero).
           | 
            | Right. A transformer outputs a probability distribution
            | over all possible tokens, from which the next token is
            | sampled and then appended to the input sequence, at which
            | point the process repeats. Temperature controls the entropy
            | of the distribution - higher temperature, higher entropy;
            | conversely, lower temperature, lower entropy. Technically,
            | zero temperature involves dividing by zero, so under the
            | hood it's simply set to an epsilon so small that the
            | entropy of the distribution is low enough that sampling
            | from it effectively always gives one token - the token with
            | the highest probability. And so at every step of inference,
            | the highest-probability token is emitted.
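            | 
            | In code (a sketch with numpy; real implementations differ
            | in details), the temperature-zero case is just argmax:
            | 
            |     import numpy as np
            | 
            |     def sample_token(logits: np.ndarray,
            |                      temperature: float) -> int:
            |         # T -> 0: the softmax collapses onto the highest-
            |         # probability token, i.e. greedy decoding.
            |         if temperature == 0.0:
            |             return int(np.argmax(logits))
            |         z = logits / temperature
            |         z -= z.max()  # numerical stability
            |         probs = np.exp(z) / np.exp(z).sum()
            |         return int(np.random.choice(len(logits), p=probs))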
        
         | eeue56 wrote:
         | I (kinda) solved this with neuro-lingo[0] with the concept of
         | pinning. Basically, once you have a version of a function
         | implementation that works, you can pin it and it won't be
          | regenerated when it's "compiled". The alternative approach
          | would be to have tests be the only code a developer writes,
          | and then have LLMs generate an implementation to match them,
          | running the tests to ensure it's valid.
         | 
         | - [0] https://github.com/eeue56/neuro-lingo
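          | 
          | The pinning idea boils down to roughly this (a sketch, not
          | the actual neuro-lingo code; generate_body stands in for the
          | GPT call):
          | 
          |     def generate_body(signature: str, comment: str) -> str:
          |         # Hypothetical LLM call that writes the function body.
          |         ...
          | 
          |     def compile_function(signature: str, comment: str,
          |                          pins: dict[str, str]) -> str:
          |         # Pinned implementations are reused verbatim on every
          |         # "compile"; only unpinned ones get regenerated.
          |         if signature in pins:
          |             return pins[signature]
          |         body = generate_body(signature, comment)
          |         pins[signature] = body  # caller persists pins to disk
          |         return body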
        
       | lionkor wrote:
       | How is it better than a compiler written by people?
        
       | eeue56 wrote:
       | I wrote a toy language along these lines a while back[0].
       | Basically, types and function signatures, with comments in
       | English, produce a valid program. You write a type and a comment,
       | and the compiler goes through GPT to run the code. Fun novel
       | idea.
       | 
       | [0] - https://github.com/eeue56/neuro-lingo
        
       | novideogame wrote:
       | I think the title is a little misleading. The main difference
       | between this paper and CoC (Chain of Code) is that the LLM is
       | instructed to make a plan to solve all the given instances and
       | then code that plan in pseudocode, while in CoC the plan is to
       | solve the single instance given.
       | 
        | From the paper: "The main difference between THINK-AND-EXECUTE
        | and CoC is that we use pseudocodes which are generated to
        | express logic shared among the tasks instances, while CoC
        | incorporates pseudocode as part of the intermediate reasoning
        | steps towards the solution of a given instance. Hence, the
        | results indicate the advantages of applying pseudocode for the
        | generation of task-level instruction over solely using them as
        | a part of rationales."
       | 
       | I find the phrase "as a part of rationales" a little strange, but
       | English is not my native language.
        
       | ingigauti wrote:
        | A couple of weeks ago I published a new programming language
        | called Plang (as in pseudo language) that uses an LLM to
        | translate user intent into executable code - basically an LLM
        | as a compiler.
        | 
        | It saves you an incredible amount of work, cutting code writing
        | down by 90%+. The built code is deterministic (it will never
        | change after build), and as a programmer you can validate the
        | code that will be executed. It compiles to C#, so it handles
        | GC, encoding, and the other things languages need to solve, and
        | I can focus on other areas.
        | 
        | Plang also has some features other languages don't have, e.g.
        | events on variables, built-in identity, and an interesting (I
        | think) approach to privacy.
        | 
        | I have not been advertising it much since it is still in early
        | development and I still create too many breaking changes, but
        | help is welcome (and needed), so if this is something that
        | interests you, the repo is at https://github.com/plangHQ
        
         | layer8 wrote:
          | In my experience it's not exactly trivial to validate code
          | you didn't write yourself, because you have to think through
          | it in similar depth as when you write it yourself. While on
          | the one hand you save the time of coming up with a solution,
          | the task of merely verifying an existing solution is also
          | more tedious, because it isn't intermixed with the problem-
          | solving activity that you perform when writing the code
          | yourself. There is an increased risk of falling for code that
          | looks correct at first blush but is still subtly wrong,
          | because you didn't spend time iterating on it. It doesn't
          | seem plausible to me that you would save 90% of the work,
          | unless it's boilerplate-heavy code that requires only a
          | little analytical thought.
        
           | ingigauti wrote:
            | I agree with you. I never liked how AI generates a ton of
            | code for us that you then need to read through and
            | understand. Plus, the fail rate is too high.
            | 
            | That is why I designed the language the way it is. You must
            | define each step you want to happen in your application.
            | Let's take user registration as an example; it looks like
            | this:
           | 
            | --- plang code ---
            | 
            | CreateUser
            | - Make sure %password% and %email% is not empty
            | - Hash %password%, write to %hashedPassword%
            | - Insert into users, %hashedPassword%, %email%
            | - Post, create user in MailChimp,
            |   Bearer:%Settings.MailChimpApi%, %email%
            | - Create bearer token from %email%, write to %bearer%
            | - Write %bearer% to web response
            | 
            | --- plang code ---
           | 
            | That is executable code in plang. It's easy to read through
            | and understand. You need to have domain knowledge, such as
            | what hashing and bearer tokens are. You are still
            | programming, just at a higher level.
            | 
            | Validating what will execute is something you need to
            | learn, just like with any language, but it is relatively
            | simple, and you start to trust the result with time (at
            | least I have).
           | 
            | Compared to the 130 or so lines of C# for the same logic
            | (https://gist.github.com/ingig/491ac9b13d65f40cc24ee5ae
            | d0408b...), that's about a 95% reduction in code, and I see
            | this repeatedly.
        
             | layer8 wrote:
             | But you have to double-check those 130 lines and think
             | through all possible cases (edge cases, error cases) for
             | each statement. I don't see how you save all that much
             | time. And your example output doesn't even contain error
             | handling, logging, and so on. There's also no way to
             | roundtrip any changes you want to add to the output code,
             | while still making changes to the high-level description.
             | (This is one reason why model-driven programming largely
             | failed.)
        
       | imranq wrote:
        | Reading the paper, the connection to compilers is more of an
        | analogy than a direct technical link.
       | 
        | The authors propose using an LLM to reframe the task as high-
        | level pseudocode, and then to reason over that code with the
        | specific details of the task.
       | 
        | No compilers were used or invoked - no real code was generated
        | or executed. It's just the idea that programming-language
        | syntax gives good structure for processing details, and a way
        | to interpret some of the results. Many of the other comments
        | here read as if their authors didn't read the paper at all and
        | are reacting to the headline.
        
       | emmender2 wrote:
        | Researchers are trying their damnedest to build a "reasoning"
        | layer using LLMs as the foundation. But they need to go back to
        | the drawing board and understand from first principles what it
        | means to reason. For this, in my view, they need to go back to
        | epistemology (and refer to Peirce and logicians like him).
        
       | 29athrowaway wrote:
        | Up next: an LLM that can tell me whether a program halts
        
       ___________________________________________________________________
       (page generated 2024-04-05 23:01 UTC)