[HN Gopher] Language models as compilers: Simulating pseudocode ...
___________________________________________________________________
Language models as compilers: Simulating pseudocode execution
Author : milliondreams
Score : 160 points
Date : 2024-04-04 19:46 UTC (1 day ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| spxneo wrote:
| This seems quite promising. Using pseudo-code as an
| intermediary step isn't new, but it seems like this takes it a
| bit further. Will need to see some code and test it out.
| Mathnerd314 wrote:
| The phase 2 prompt is complete, but the phase 3 prompt's
| initial part ends in "When constructing the main function,
| ...", and there is no mention of random seeds, so I guess this
| paper is not reproducible at all.
| inciampati wrote:
| It's going to be really fascinating to see this applied instead
| of chain of thought and other kinds of reasoning approaches,
| because it's generic. It should in principle work on every kind
| of LLM.
| pkoird wrote:
| Any sufficiently advanced LLM is indistinguishable from Prolog.
|
| I half-jest, but I envision LLM research heading towards a
| parser-oriented setup where LLMs merely extract the entities
| and relations and the actual logic is done by a logical engine
| such as Prolog.
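|
| Concretely, the split might look something like this sketch,
| where llm() is a hypothetical completion call and a toy
| forward-chainer stands in for Prolog:
|
|   def extract_facts(text):
|       # The LLM only parses: it turns prose into
|       # (subject, relation, object) triples.
|       reply = llm("Extract facts as subject|relation|"
|                   "object lines:\n" + text)
|       return {tuple(line.split("|"))
|               for line in reply.splitlines()
|               if line.count("|") == 2}
|
|   def forward_chain(facts, rules):
|       # The logic engine does the reasoning: apply
|       # rules until a fixed point is reached.
|       while True:
|           new = {f for rule in rules for f in rule(facts)}
|           if new <= facts:
|               return facts
|           facts = facts | new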
| nickpsecurity wrote:
| In truth, I thought it would be a great idea to train them on
| turning English and code into formal specs and logic
| languages. Then, we could use all our tools in those areas and
| work from there.
|
| Combining LLMs with rewriting logic, like Maude or the K
| Framework, would be the most powerful option. The rewriting
| tools plus LLMs could probably rapidly develop static
| analyzers and code-porting tools.
| pyinstallwoes wrote:
| Hehe, this was the exact thinking going through my mind when
| considering the underlying architecture for an LLM OS/word
| processor that is self-aware of the documents across a
| project, for consistency.
| gpm wrote:
| I envision a thread of research in the exact opposite
| direction: take a logic program/theorem/database query/... and
| use an LLM to guide a search for a solution/proof/query
| plan/...
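|
| Something like best-first search where the model only supplies
| the ordering (llm_score below is a hypothetical "how promising
| is this state?" call):
|
|   import heapq
|
|   def guided_search(start, expand, is_goal, llm_score):
|       # The classical search stays sound; the LLM
|       # only decides which branch to explore next.
|       frontier = [(llm_score(start), 0, start)]
|       seen, n = {start}, 1
|       while frontier:
|           _, _, state = heapq.heappop(frontier)
|           if is_goal(state):
|               return state
|           for nxt in expand(state):
|               if nxt not in seen:
|                   seen.add(nxt)
|                   heapq.heappush(frontier,
|                                  (llm_score(nxt), n, nxt))
|                   n += 1
|       return None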
| winwang wrote:
| LLM the Ultimate Heuristic?
| kevindamm wrote:
| It would be convenient if true. I'm a little skeptical it
| will go that far, but at the same time humans seem to
| better understand a tricky problem after an internal
| monologue about it or a quick conversation with a rubber
| ducky... so maybe?
| ithkuil wrote:
| I wonder if the problem that LLMs solved was not the lack of
| "intelligence" of logic-driven systems but the lack of a
| particular intelligence that is so crucial for us to make
| effective use of the interaction with such tools, namely the
| ability to actually understand our natural language.
|
| That feature by itself is not enough, but can be a very
| effective glue to be used with other components of an
| intelligent system. The analogy with the human brain would be
| Broca's area vs. the rest of the brain.
|
| Now, there are open questions about whether the
| _architecture_ that underpins the LLMs is also good enough to
| be used as a substrate for other functions, and what's the
| most effective way to have these different components of the
| system communicate with each other.
|
| The analogy with the human brain can guide us (as well as
| lead us astray), in that our brain, like biological systems
| often do, re-purposes the basic building blocks to create
| different subsystems.
|
| It's not clear to me at which level we'll find the most
| effective re-purposable building blocks.
|
| It's easy to try (and people do) to use the top-level LLM
| system as such a building block and have it produce plans,
| connect it to external systems that feed information back and
| have it iterate again on it, (ab)using its language
| processing as an API with the environment.
|
| The human analogy of that is when we use external tools to
| extend our cognitive capacity, like when we do arithmetic
| using pencil and paper or when we scribble some notes to help
| us think.
|
| I think this level is useful and real but I wonder if we also
| need to give more power to some lower levels too.
|
| Granted, some of that "power" may already emerge during the
| training of the LLMs, but I wonder if some more specialized
| blocks might enhance the effectiveness.
| lachlan_gray wrote:
| You might enjoy parts of this interview with Stephen Wolfram
|
| https://m.youtube.com/watch?v=PdE-waSx-d8
|
| He's very much on this kind of beat. In general I have a
| feeling that there are orders of magnitude to gain by
| successfully applying computer science to "language
| algorithms".
|
| It feels like we are exploring very narrow paths of the
| computations that can be performed with language. Like we have
| an x86 CPU and we are building pocket calculators. So much
| untapped potential.
|
| Re Prolog, I had a similar intuition at some point and tried
| to make a stack-based programming language that uses a
| language model as a kind of control/logic unit:
|
| https://github.com/LachlanGray/silas
|
| I was missing a bunch of CS background at the time, so I
| didn't get very far, but I feel like there's a lot to be done
| for this kind of thing.
| westoncb wrote:
| I think this is an important perspective. It helps clarify
| prompting as well because, used in a certain way, prompts are
| effectively natural-language specifications of constraints,
| which have the effect of 'partially configuring' the network
| so that its inference selects from the configuration space of
| its still-free params.
|
| For example, if you tell it to reply in JSON (and it obeys),
| you've just constrained its search space in a particular way.
| There is space for very interesting informal programming that
| can be done from this perspective, setting up constraints and
| then allowing inference to solve within them. I've been using
| this heavily.
|
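| The JSON case can even be made literal with constrained
| decoding: rather than hoping the model obeys, mask out every
| next token the format forbids. A rough sketch, where
| logits(prefix) is a hypothetical per-token call into a local
| model and allowed() encodes the grammar:
|
|   def constrained_step(prefix, allowed, logits):
|       # Drop tokens the constraint forbids after
|       # `prefix`, then pick greedily from the rest.
|       ok = {t: l for t, l in logits(prefix).items()
|             if t in allowed(prefix)}
|       return max(ok, key=ok.get)
|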
| When I was first getting deep into LLM stuff a few months ago
| and contemplating latent space, my main characterization was
| that much of its high-level behavior can be usefully grappled
| with by viewing it as a kind of 'learned geometric Prolog'.
|
| I did a bunch of illustrations and talked about some of these
| ideas here if anyone's curious:
| https://x.com/Westoncb/status/1757910205478703277 (I think I
| mostly dropped the prolog terminology in that presentation
| because not everyone knows about it)
| thesz wrote:
| https://en.wikipedia.org/wiki/Cyc#MathCraft
|
| Quote: "One Cyc application aims to help students doing math
| at a 6th grade level, helping them much more deeply understand
| that subject matter... Unlike almost all other educational
| software, where the computer plays the role of the teacher,
| this application of Cyc, called MathCraft, has Cyc play the
| role of a fellow student who is always slightly more confused
| than you, the user, are about the subject."
|
| This is from 2017. I haven't seen anything like this using LMs
| in 2017 and I suspect it is still hard for LLMs today.
|
| Cyc is a huge reasoning engine. You can call it Prolog, if
| you want. I won't.
| ogogmad wrote:
| > I suspect it is still hard for LLMs
|
| I just gave it to Claude: https://imgur.com/a/fQQOy1d
| Sakos wrote:
| 1) This is ridiculously cool
|
| 2) The "action" text gives me such I put on my wizard hat
| vibes
| thesz wrote:
| MathCraft [1] is little more [2] than chat.
| [1] https://www.cyc.com/mathcraft/ [2]
| https://www.youtube.com/watch?v=pbrp7MzBDm0
| pjmlp wrote:
| The moment they are fully there, just like an executable
| generated from a Prolog compiler, it will be time to realize
| that programming is no more, except for the select few at big
| corps that create LLM compilers.
| mirekrusin wrote:
| It also has further reaching consequences.
|
| It creates a foundation for reinforcement learning without
| human feedback - a missing piece of the puzzle.
|
| Simplifying: propose a plausible theorem, try to find a
| provable solution, reinforce the reasoning/solution path, move
| the proved statement into the axioms, repeat.
|
| (Super)intelligence has many dimensions. One of the less
| explored ones is exploiting concurrency in thought chains.
| It's something very un-natural to us, but there is a lot to
| gain if you're able to branch and collect feedback from dead
| ends and from progress in different directions taken at the
| same time.
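|
| In sketch form (model.propose, model.reinforce, and
| prover.search are hypothetical stand-ins for the LLM and a
| proof checker):
|
|   def self_improve(axioms, model, prover, steps):
|       for _ in range(steps):
|           conjecture = model.propose(axioms)
|           proof = prover.search(conjecture, axioms)
|           if proof is not None:
|               # The checker, not a human, supplies
|               # the reward signal.
|               model.reinforce(proof)
|               axioms.add(conjecture)
|       return axioms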
| ginko wrote:
| I was wondering if AI systems could be composed of multiple
| internal agents that, when prompted, are supposed to discuss
| internally and agree on what to reply in the end. Some of the
| agents could be LLMs, but others could be logic engines or
| database frontends, for instance.
| DebtDeflation wrote:
| >where LLMs merely extract the entities and relations and the
| actual logic is done by a logical engine such as Prolog.
|
| Graph RAG is an emerging design pattern where factual
| information is retrieved from knowledge graphs to augment the
| textual information retrieved from document stores in classic
| RAG before submitting to an LLM. What you are proposing is
| essentially using the LLM to build the knowledge graph in the
| first place. Would be interesting to see the two techniques
| combined along with some sort of planner/optimizer in the
| middle.
| astrange wrote:
| The problem is that Prolog is already "AI" as in expert system
| GOFAI - that approach already failed, so you shouldn't believe
| people claiming that an entirely different connectionist
| approach is going to prove that their failed alternative worked
| all along.
| codesnik wrote:
| Maybe that's because Prolog from the '80s basically operated
| in a 1-bit space and had just thousands of parameters, crafted
| by hand. But nowadays...
| skeledrew wrote:
| Seeing this makes me want to reactivate an old project[0]. Been
| thinking more and more that LLMs could give it superpowers.
|
| [0] https://pypi.org/project/neulang/
| m3kw9 wrote:
| If you train an LLM to compile, you probably also want to set
| the randomness to zero; if that is the case, you've just
| "brute forced" an actual compiler.
| danielmarkbruce wrote:
| you don't want a creative compiler?
| fxcao wrote:
| In this specific use case I think we should avoid any
| creative aspect in the behavior of the LLM. Compiling might
| look like a "word-for-word" translation, in a certain manner.
| Doesn't it?
| layer8 wrote:
| I see no inherent conflict between creativity and
| determinism. You could give the compiler a seed value to
| perturb the solution space.
| m3kw9 wrote:
| Why vote down a rhetorical question?
| jumploops wrote:
| English is terribly imprecise, so it makes sense to use pseudo
| instructions to improve the bounds/outcome of a language model's
| execution.
|
| I do wonder how long hacks like this will be necessary; as it
| stands, many of these prompting techniques are essentially
| artificially expanding the input to enhance reasoning ability
| (increasing tokens, thus increasing chance of success).
| nimbleal wrote:
| I hope there are more models trained on more precise inputs
| going forward. I understand that natural language feels the
| most futuristic, but while it has the lowest barrier to entry,
| it's not only imprecise but also slow. Visual approaches (for
| example ControlNets in Stable Diffusion, or image as input in
| ChatGPT, though both of these are somewhat bolted on) and 2D
| semi-natural languages all merit further inquiry.
|
| Another (and perhaps the ultimate) possibility is to have some
| way --- perhaps through simulations --- to directly expose the
| model to the problem, rather than having a human/natural
| language intermediary.
| Voultapher wrote:
| Non-deterministic compilers, yay! Where do I sign up?
|
| More seriously, miscompilations, or in general unexpected
| behavior caused by layers below you, are expensive to find and
| fix. I think LLMs have a long way to go before such use cases
| seem appealing to me.
| ginko wrote:
| Even regular compilers need quite a bit of nudging to give
| deterministic results.
| Phillipharryt wrote:
| Correct me if I'm wrong here, but I am under the impression
| they're only non-deterministic in the practical sense (i.e.,
| it produces this output on my machine, I can't know what
| minute differences there are on your machine), but that's not
| non-deterministic in the truest sense. If you have completely
| identical inputs you will get the exact same output, ergo,
| deterministic.
| layer8 wrote:
| You are correct. Compilers are deterministic, but
| reproducible builds can be a challenge.
| tiborsaas wrote:
| It's better to have a non-deterministic compiler for a task
| that would be really hard to write an algorithm for otherwise.
| clbrmbr wrote:
| But are LLMs really better at algorithm writing than you?
| I've found that they work best when I've already pseudocoded
| the algorithm.
| tiborsaas wrote:
| I didn't mean making it write the actual code and using
| that, but there are tasks that are more error-prone or near
| impossible to write with a traditional approach, so using
| some zero-shot prompting is better than running code.
|
| That's why I find non-determinism acceptable when it would
| otherwise be a pain to do something similar.
| taneq wrote:
| Non-determinism is an implementation detail, not an intrinsic
| property, as I understand it (at least as long as you're
| setting temperature to zero).
|
| I likewise don't really think LLMs are the right tool for this
| job, though. There's a whole class of systems that we built
| because humans take a long time to learn new skills, are
| fallible and non-repeatable, and get bored easily. Compilers
| are in this group along with sewing machines, CNC machines,
| automatic gearboxes, and design rules checking in CAD.
|
| Maybe they could provide heuristics for optimising compilers
| with the output run through a formal verification check
| afterwards?
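|
| Roughly: the model proposes, a checker has the final word.
| Differential testing below is a weak stand-in for real
| translation validation, and propose() is a hypothetical LLM
| call:
|
|   import random
|
|   def llm_optimize(f, propose, trials=10000):
|       g = propose(f)  # candidate "faster" version of f
|       for _ in range(trials):
|           x = random.randint(-10**6, 10**6)
|           if f(x) != g(x):
|               return f  # reject the miscompilation
|       return g          # provisionally accept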
| knightoffaith wrote:
| >Non-determinism is an implementation detail, not an
| intrinsic property, as I understand it (at least as long as
| you're setting temperature to zero).
|
| Right. A transformer outputs a probability distribution over
| all possible tokens from which the next token is sampled and
| then appended to the input sequence, at which point the
| process repeats. Temperature controls the entropy of the
| distribution: higher temperature, higher entropy;
| conversely, lower temperature, lower entropy. Technically
| zero temperature involves dividing by zero, so under the hood
| it's simply set to be an epsilon so small that the entropy of
| the distribution is low enough that sampling from it always
| effectively gives one token - the token with the highest
| probability. And so at every step in inference, the highest
| probability token is emitted.
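|
| A toy version of that sampling step, over a {token: logit}
| dict rather than a real vocabulary:
|
|   import math, random
|
|   def sample_token(logits, temperature):
|       # As temperature -> 0 this degenerates to
|       # greedy argmax decoding.
|       if temperature <= 0:
|           return max(logits, key=logits.get)
|       m = max(logits.values())  # shift for stability
|       w = {t: math.exp((l - m) / temperature)
|            for t, l in logits.items()}
|       r, acc = random.random() * sum(w.values()), 0.0
|       for tok, weight in w.items():
|           acc += weight
|           if acc >= r:
|               return tok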
| eeue56 wrote:
| I (kinda) solved this with neuro-lingo[0] and the concept of
| pinning. Basically, once you have a version of a function
| implementation that works, you can pin it, and it won't be
| regenerated when it's "compiled". The alternative approach
| would be to have tests be the only code a developer writes,
| and then make LLMs generate an implementation to match those
| tests, running the tests to ensure it's valid.
|
| - [0] https://github.com/eeue56/neuro-lingo
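|
| The pinning itself is essentially a content-addressed cache
| keyed on the spec, something like this sketch (generate() is
| the LLM call; the cache file name is made up):
|
|   import hashlib, json, os
|
|   def pinned(spec, generate, cache=".pins.json"):
|       pins = (json.load(open(cache))
|               if os.path.exists(cache) else {})
|       key = hashlib.sha256(spec.encode()).hexdigest()
|       if key not in pins:  # regenerate only on change
|           pins[key] = generate(spec)
|           json.dump(pins, open(cache, "w"))
|       return pins[key]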
| lionkor wrote:
| How is it better than a compiler written by people?
| eeue56 wrote:
| I wrote a toy language along these lines a while back[0].
| Basically, types and function signatures, with comments in
| English, produce a valid program. You write a type and a comment,
| and the compiler goes through GPT to run the code. Fun novel
| idea.
|
| [0] - https://github.com/eeue56/neuro-lingo
| novideogame wrote:
| I think the title is a little misleading. The main difference
| between this paper and CoC (Chain of Code) is that the LLM is
| instructed to make a plan to solve all the given instances and
| then code that plan in pseudocode, while in CoC the plan is to
| solve the single instance given.
|
| From the paper: "The main difference between THINK-AND-EXECUTE
| and CoC is that we use pseudocodes which are generated to
| express logic shared among the tasks instances, while CoC
| incorporates pseudocode as part of the intermediate reasoning
| steps towards the solution of a given instance. Hence, the
| results indicate the advantages of applying pseudocode for the
| generation of task-level instruction over solely using them as
| a part of rationales."
|
| I find the phrase "as a part of rationales" a little strange, but
| English is not my native language.
| ingigauti wrote:
| A couple of weeks ago I published a new programming language
| called Plang (as in pseudo language) that uses an LLM to
| translate user intent into executable code - basically an LLM
| as a compiler.
|
| It saves you an incredible amount of work, cutting code
| writing down by 90%+. The built code is deterministic (it will
| never change after build), and as a programmer you can
| validate the code that will be executed. It compiles to C#, so
| it handles GC, encoding, etc. that languages need to solve,
| and I can focus on other areas.
|
| Plang also has some features that other languages don't have,
| e.g. events on variables, built-in identity, and an
| interesting (I think) approach to privacy.
|
| I have not been advertising it much since it is still in early
| development and I still create too many breaking changes, but
| help is welcome (and needed), so if this is something that
| interests you, the repo is at https://github.com/plangHQ
| layer8 wrote:
| In my experience it's not exactly trivial to validate code you
| didn't write yourself, because you have to think through it in
| similar depth to when you write it yourself. While on the one
| hand you save the time of coming up with a solution, the task
| of merely verifying an existing solution is also more tedious,
| because it isn't intermixed with the problem-solving activity
| that you perform when writing the code yourself. There is an
| increased risk of falling for code that looks correct at first
| blush but is still subtly wrong, because you didn't spend time
| iterating on it. It doesn't seem plausible to me that you would
| save 90% of the work, unless it's boilerplate-heavy code that
| requires only a little analytical thought.
| ingigauti wrote:
| I agree with you. I never liked how AI generates a ton of
| code for us that you then need to read through and
| understand. Plus, the fail rate is too high.
|
| That is why I designed the language the way it is. You must
| define each step you want to happen in your application.
| Let's take user registration as an example; it looks like
| this:
|
| --- plang code ---
| CreateUser
| - Make sure %password% and %email% is not empty
| - Hash %password%, write to %hashedPassword%
| - Insert into users, %hashedPassword%, %email%
| - Post, create user in MailChimp,
|   Bearer:%Settings.MailChimpApi%, %email%
| - Create bearer token from %email%, write to %bearer%
| - Write %bearer% to web response
| --- plang code ---
|
| That is executable code in plang. It's easy to read through
| and understand. You need to have domain knowledge, such as
| what hashing and bearer tokens are. You are still programming,
| just at a higher level.
|
| Validating what will execute is something you need to learn,
| just like with any language, but it is relatively simple, and
| you start to trust the result with time (at least I have).
|
| Compared to the 130 lines or so of code in C# for the same
| logic, https://gist.github.com/ingig/491ac9b13d65f40cc24ee5ae
| d0408b... that's about a 95% reduction in code, and I see this
| repeatedly.
| layer8 wrote:
| But you have to double-check those 130 lines and think
| through all possible cases (edge cases, error cases) for
| each statement. I don't see how you save all that much
| time. And your example output doesn't even contain error
| handling, logging, and so on. There's also no way to
| roundtrip any changes you want to add to the output code,
| while still making changes to the high-level description.
| (This is one reason why model-driven programming largely
| failed.)
| imranq wrote:
| Reading the paper, the connection to compilers is more of an
| analogy than a direct technical link.
|
| The authors propose using an LLM to reframe the task as
| high-level pseudocode, and then to reason over that code with
| the specific details of the task.
|
| No compilers were used or compiled - no real code was
| generated or executed. It's just the idea that programming
| language syntax has good structure for processing details,
| and gives a way to interpret some of the results. Many of the
| other comments here seem like they didn't read the paper at
| all and are reacting to the headline.
| emmender2 wrote:
| Researchers are trying their damnedest to build a "reasoning"
| layer using LLMs as the foundation. But they need to go back
| to the drawing board and understand from first principles what
| it means to reason. For this, in my view, they need to go back
| to epistemology (and refer to Peirce and logicians like him).
| 29athrowaway wrote:
| Up next: an LLM that can tell me if a program stops.
___________________________________________________________________
(page generated 2024-04-05 23:01 UTC)