[HN Gopher] Beyond A*: Better Planning with Transformers
___________________________________________________________________
Beyond A*: Better Planning with Transformers
Author : jonbaer
Score : 277 points
Date : 2024-02-23 11:53 UTC (11 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| teleforce wrote:
| To the authors: please rename this new algorithm from the boring
| Searchformer to a more interesting name, for example Optimus Prime.
| anentropic wrote:
| also because googling for "Searchformer" brings up a different
| paper and model:
| https://www.sciencedirect.com/science/article/abs/pii/S01722...
|
| "Planformer" seems unused and would reduce confusion
|
| (also since "Search" implies RAG etc which is not what this
| model is about)
| krallistic wrote:
| "Search" has been the name for the A* space. So it makes
| absolute sense. "Planning" means in the symbolic AI space
| often something quite in the direction of PDDL/STRIPS/ICAPS
| planning.
| nick12r55t1 wrote:
| T*former or T-star
| evanmoran wrote:
| T* (T-star) is a great name.
| shrubble wrote:
| It is the trademarked name of a lens coating by Zeiss
| Optical, however.
| TeMPOraL wrote:
| T1* and T2* are common terms in magnetic resonance
| imaging, suggesting T _n_ * for any integer _n_
| should be free of bullshit IP protection. May I thus
| suggest T10*, following the pattern of a11y and i18n?
| teleforce wrote:
| Funny that you mention RAG, since the RAG authors have
| expressed their regret at not using a much better name, and
| the name stuck once it became very popular.
|
| On second thought, in the spirit of A*, perhaps they can
| rename it to O', short for Optimus Prime, but this would
| probably break most of the databases in the world [1], [2].
|
| [1] Exploits of a Mom:
|
| https://xkcd.com/327/
|
| [2] Prime (symbol):
|
| https://en.m.wikipedia.org/wiki/Prime_(symbol)
| basil-rash wrote:
| I can't think of any database that doesn't support the '
| character in its varchar columns.
| remexre wrote:
| I think the concern was that all the buggy apps that connect
| to databases would accidentally get SQL injected.
| basil-rash wrote:
| If they're susceptible to injections this is the absolute
| least of their problems. In fact their DB is already
| dropped.
| rdedev wrote:
| At this point we need a website cataloging all
| transformer-related names.
| leosanchez wrote:
| Megatron would be better IMO.
| verticalscaler wrote:
| Clearly Megatron because LLMs are decepticons.
| leosanchez wrote:
| Yes that was my reasoning :)
| wiz21c wrote:
| just deceptive
| moffkalast wrote:
| Nvidia: "We'll pretend you didn't say that."
| johndough wrote:
| https://github.com/NVIDIA/Megatron-LM
|
| > Megatron is a large, powerful transformer [...]
| FooBarWidget wrote:
| Wouldn't that be a trademark violation?
| wongarsu wrote:
| Trademarks are specific to classes of goods. A trademark for
| toys named Optimus Prime doesn't prevent you from making
| software called Optimus Prime.
| mminer237 wrote:
| While true, for something as famous as Optimus Prime, there
| could conceivably be liability for dilution by blurring.
| nkozyra wrote:
| I think Searchformer is kind of catchy. :shrug:
| willcipriano wrote:
| Sacagawea is my proposed name.
| jan_Sate wrote:
| Wouldn't it be overkill to train a model specifically for
| that?
|
| I wish someone could come up with an algorithm that would
| outperform a trained model. That way we would actually understand
| how it works.
| jvanderbot wrote:
| Well in this case, A* outperforms in terms of funding a
| solution when it exists. It's just that the transformer
| produces a solution faster, but misses some.
| SnowflakeOnIce wrote:
| Note that the paper doesn't measure execution time of the A*
| or Transformer-based solutions; it compares length of A*
| algorithm traces with length of traces generated by a model
| trained on A* execution traces.
|
| I suspect that the actual execution time would be far faster
| with the A* implementation than any of the transformers.
| bamboozled wrote:
| Was the pun intended?
| wongarsu wrote:
| Shortest-path search is a very well-understood problem, and for
| "simple" cases A* is the gold standard.
|
| You could probably tweak the heuristic used by A* to the
| problem to get similarly good results (at _much_ lower compute
| cost than running what amounts to an LLM with huge context
| length). But this is a toy problem. The interesting thing about
| the paper is that you might be able to get good-performing
| search algorithms for any problem with a cookie-cutter process,
| instead of spending many man-hours tweaking a hand-written
| search algorithm. It might also be possible to easily adapt it
| to much more complex search problems, like cooperatively
| evading other agents navigating in the same space.
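|
| For anyone who hasn't written one in a while, the heuristic
| really is just a pluggable function in A*. A minimal sketch
| on a 4-connected grid (the grid encoding and the Manhattan
| heuristic here are my own toy choices, purely illustrative):
|
|   import heapq
|
|   def astar(start, goal, walls, heuristic):
|       # Frontier ordered by f = g + h; g holds the best
|       # known cost to each cell.
|       frontier = [(heuristic(start, goal), 0, start)]
|       g = {start: 0}
|       while frontier:
|           _, cost, (x, y) = heapq.heappop(frontier)
|           if (x, y) == goal:
|               return cost
|           for nxt in ((x+1, y), (x-1, y),
|                       (x, y+1), (x, y-1)):
|               if nxt in walls:
|                   continue
|               new_cost = cost + 1
|               if new_cost < g.get(nxt, float("inf")):
|                   g[nxt] = new_cost
|                   f = new_cost + heuristic(nxt, goal)
|                   heapq.heappush(frontier,
|                                  (f, new_cost, nxt))
|       return None  # no path exists
|
|   def manhattan(a, b):
|       # Admissible on 4-connected grids; this function is
|       # the knob you tune (or learn) per problem.
|       return abs(a[0] - b[0]) + abs(a[1] - b[1])
|
| Swapping in a tighter heuristic shrinks the number of nodes
| popped without touching the rest of the algorithm, which is
| the cheap version of what the paper tries to learn.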
| Moldoteck wrote:
| Do you want the solution faster, or more precise? Depending
| on the answer you would pick either A* or this model.
| SnowflakeOnIce wrote:
| They don't report execution time in the paper. It's likely
| that the A* implementation would run faster in terms of CPU
| or wall clock.
| Moldoteck wrote:
| > "Transformer model that optimally solves previously unseen
| Sokoban puzzles 93.7% of the time, while using up to 26.8%
| fewer search steps than standard A* search."
|
| Don't fewer search steps imply faster?
| senand wrote:
| No, each step could take significantly more time (and
| resources).
| feverzsj wrote:
| Path planning is highly dynamic and requires real-time
| adjustment. In mature software, the heuristics are already
| highly optimized for local road/traffic data and users. So, I
| don't think AI can actually outperform them yet.
| jvanderbot wrote:
| I am extremely optimistic about using learned heuristics in
| discrete algorithms like A* or Focal search or the various
| families of ILP.
|
| In most modern discrete optimization libraries, e.g. CPLEX,
| it's the heuristics and tuning that explain the performance.
|
| I'm less sold on using an end-to-end learning approach to
| replace a well-understood optimal search routine, but that
| might be pearl-clutching.
|
| It just seems to me the authors missed that opportunity.
| boesboes wrote:
| I think it is just the bubble/hype effect around transformers
| and AI. I might try to use transformers to solve tic-tac-toe
| and apply for some VC moneys.
|
| In a few years we'll all be writing about how much more
| efficient actual code is compared to AI perhaps ;)
| losvedir wrote:
| Everyone on Transformer News will be talking about the latest
| web framework for GPT-7 and how to horizontally scale their
| system to handle dozens of requests a second, and how when
| you need to scale beyond that it will "be a good problem to
| have".
|
| Meanwhile, you and I will be talking about how our low-level,
| handwritten Ruby on Rails code is so much faster and more
| efficient, and closer to the metal so you can understand
| what's _really_ going on.
| nvrmnd wrote:
| I agree; learning admissible heuristics will retain worst-case
| performance, which has always been the measuring stick for
| these algorithms. It's not at all uncommon to find faster
| solutions for the average or even p99 cases that cannot provide
| guarantees on the worst case.
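|
| (For context: an admissible heuristic is one that never
| overestimates, i.e. h(n) <= h*(n) for every node n, where h*
| is the true cost-to-go. That inequality is exactly what gives
| A* its optimality guarantee, and it's the hard part to
| certify for a learned h.)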
| SnowflakeOnIce wrote:
| How would one go about proving that a learned heuristic
| (something from an AI model) is in fact admissible?
| jvanderbot wrote:
| For something like focal search, it doesn't even need
| admissibility: you just apply it as a second selection
| heuristic among the choices your admissible heuristic
| returns as 'top k' choices.
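|
| Something like this, roughly (a sketch of the focal-list
| idea, not any particular library; learned_score stands in
| for the trained model, and the textbook formulation uses a
| (1 + eps) band around the best f rather than a literal
| top-k):
|
|   def pop_focal(open_list, f, learned_score, eps=0.2):
|       # Focal set: nodes whose f-value is within (1 + eps)
|       # of the current best. The learned, possibly
|       # inadmissible score only breaks ties inside the set,
|       # so the result stays (1 + eps)-suboptimal at worst.
|       f_min = min(f(n) for n in open_list)
|       focal = [n for n in open_list
|                if f(n) <= (1 + eps) * f_min]
|       best = min(focal, key=learned_score)
|       open_list.remove(best)
|       return best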
| SnowflakeOnIce wrote:
| So a tiebreaker then?
| eterevsky wrote:
| While A* could be used to solve Sokoban, wouldn't some sort of
| Monte-Carlo Tree Search be better at it? I wonder how it
| would compare to this model.
| espadrine wrote:
| The part of MCTS that matters for solving it is the Bellman
| equation, which is a component of the Bellman-Ford shortest-
| path algorithm, so all of those methods are part of a
| continuum.
|
| Which one works best is pretty sensitive to the exploratory
| decision-making algorithm: do I search this cell first, or that
| one? That heuristic is the differentiator, whether it is put on
| top of a Dijkstra-style algorithm (A*) or a Bellman-Ford-style
| algorithm (MCTS).
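|
| For reference, the shortest-path form of that equation is
|
|   d(v) = min over edges (u, v) of [ d(u) + w(u, v) ],
|
| with d(source) = 0. Dijkstra/A* and Bellman-Ford both
| converge to the same fixed point; they differ in the order
| in which they relax edges, and the exploration heuristic
| decides that order.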
| abdellah123 wrote:
| What's with Cornell University?! They put out a constant
| stream of amazing AI papers and projects.
| losvedir wrote:
| What makes you say Cornell? Isn't this from Meta?
| tekknolagi wrote:
| I think it's either a joke or a misunderstanding about arXiv
| bravura wrote:
| Machine Translation used to involve complicated grammatical
| decoding, using search. Now we just use transformers for MT, with
| much simpler decoding that doesn't really need search.
|
| Perhaps we can reach full inception. Let's use our current best-
| of-breed predictive models to learn heuristics for neural
| architecture search (NAS) and search for new neural blocks
| (better than Transformers and Mamba).
| xg15 wrote:
| ...and finally enter a world where literally no one understands
| anymore how the technology works, not even the people
| developing it. Singularity, here we come...
| zwirbl wrote:
| We could then go on to create something akin to the
| mechanicum, create the position of tech priests, and have
| them pray to the machine god on our behalf
| moi2388 wrote:
| If you think that still needs to be created, you haven't
| worked in tech support
| TeMPOraL wrote:
| The priest class is there, what we need is to level up
| the machine god itself.
| smcameron wrote:
| The tech priests will build electric monks to do that job.
| throwuwu wrote:
| Barring the singularity, just because you find something with
| search doesn't mean you can't understand why it's better.
| FeepingCreature wrote:
| "Every time I fire a linguist, the performance of the speech
| recognizer goes up." --Frederick Jelinek
| amelius wrote:
| Is someone keeping a list of classical algorithms or
| NP-complete problems that are now better solved using deep
| learning?
| scscsc wrote:
| For your convenience, here is the list of NP-complete problems
| where "AI" works better than the state of the art in the worst
| case:
| cs702 wrote:
| Thank you for that. I needed the laugh :-)
| Davidzheng wrote:
| Probably soon AI will be able to find improvements to most of
| them? Like using AI to do search in algorithm space
| thomasahle wrote:
| For all we know, some of the current best (still
| exponential) algorithms were guided by AI. If a
| mathematician solves a problem using Mathematica, they
| don't usually write in the paper what tools they used.
| jvanderbot wrote:
| AI + classical algorithms is my sweet daydream. Trained
| heuristics (even better, domain-specific ones), deployed for
| classical A*, ILP families, focal search, etc.
|
| That is going to be really amazing.
| amelius wrote:
| Even solving large linear systems would already be amazing.
| But a SAT solver would be nice too :)
| jvanderbot wrote:
| I may be misunderstanding your comment. We have those
| solvers; I'm suggesting AI would plug into existing solvers.
| This is a ripe area for research.
| blt wrote:
| It is happening. There are papers on deep learning to
| improve the variable choice in branch-and-bound, etc.
| jvanderbot wrote:
| Yeah my first exposure was from Marco Pavone's lab, on ML
| heuristics for MICP in the context of a cubical tumbling
| robot, IIRC. Really cool stuff.
| nyrikki wrote:
| For those of you who didn't get the joke:
|
| (S)ETH is a bit of a bummer if you only consider the dominant
| term in big-O, for the general case.
|
| https://web.stanford.edu/class/cs354/scribe/lecture17.pdf
| j2kun wrote:
| From my understanding this is very much still active research,
| without any clear wins deployed in production settings.
| k2xl wrote:
| Shameless plug, but if you are interested in Sokoban type games
| check out https://thinky.gg . Features a fun Sokoban variant
| (called Sokopath) as well as another NP hard variant called
| Pathology (the goal is to get from point A to point B in the
| fewest steps).
|
| Many in the community have tried to create various solvers,
| and things get very hard once the grid gets to be over 5x5.
| However, some interesting levels with very high maximum step
| counts have been discovered by the thinky community (via
| simulated annealing).
| nikolay wrote:
| Thank you! Great games! Super nice execution! I love it!
| paidcompute wrote:
| Pretty impressive
| light_hue_1 wrote:
| > 26.8% fewer search steps than standard A* search
|
| So you're saying I shouldn't read this paper?
|
| A* is not particularly good. If all the algorithm can do is
| mirror it by training on its output, what's the point?
| casebash wrote:
| Yeah, at first I read that as it using 26.8% of the original
| steps, but reducing the number of steps by 26.8% is not that
| impressive. I wonder whether it actually reduces total search
| time as there is added overhead of running the neural network.
| light_hue_1 wrote:
| There's absolutely no way it reduces search time. A* is
| trivial to run per timestep. The time must be thousands to
| maybe hundreds of thousands of times slower.
|
| I publish in this area and this is a common thing for
| reviewers to bring up when authors don't report wall-clock
| time, and then for papers to be rejected. What's the value in
| making an algorithm that's drastically slower? Not much.
| ta8645 wrote:
| > What's the value in making an algorithm that's
| drastically slower?
|
| Perhaps as an important stepping stone? Deferred
| optimization and all that.
| ultra_nick wrote:
| If transformers can plan, then AGI only requires better
| education...
| throwuwu wrote:
| There are a lot more pieces: agency is a big one, but online
| learning is required too, and many other layers beyond that.
| th0ma5 wrote:
| That's probably the foreseeable future: ingesting ever-larger
| amounts of data, hoping it prevents hallucinations.
| nyrikki wrote:
| Approximating exhaustive search is not logic or causality.
| adamnemecek wrote:
| Both ML and A* are integral transforms.
| Salgat wrote:
| What do you mean by that?
| adi4213 wrote:
| For the auditory learners, here is this paper in a summarized
| audiobook format:
|
| https://player.oration.app/09fefe41-f2a7-4257-a25e-30e479b30...
| syassami wrote:
| Thanks for sharing, I've been looking for an app like this for
| ages.
| adi4213 wrote:
| Thank you so much for your comment! Apologies if my comment
| came off like an ad - I have mild dyslexia and struggle to
| keep up with ML papers; I hope others also find this useful!
| If you have any feedback or issues, feel free to email us at
| support [at] trurecord.com
| softfalcon wrote:
| I wonder if they tried the modified J* algorithm (an optimization
| of A* for game graph/path-finding) before going down this
| research route.
|
| It's in Game AI Pro 2 [0] if anyone is curious.
|
| [0] https://www.amazon.ca/Game-AI-Pro-Collected-
| Professionals/dp...
| mysterydip wrote:
| I love these books (and nice to see Steve Rabin still doing
| them), but $120 for an ebook? That's unexpected.
| hlfshell wrote:
| If you consider it like a textbook (very small niche audience
| for very specialized knowledge) it does kind of match up due
| to the small expected volume to sell.
|
| Authors still gotta eat.
| setr wrote:
| Worse yet... the Game AI Pro books are entirely free as
| digital copies. You'd basically be paying $120 for them to
| be stitched together into a single PDF, instead of one per
| chapter
|
| http://www.gameaipro.com/
| cjaybo wrote:
| That's one expensive imagemagick command!
| leeoniya wrote:
| also https://github.com/anvaka/ngraph.path
| ogogmad wrote:
| To be fair, they said at the end that their path-finder is not
| yet competitive with the SOTA. The paper tests how well a
| transformer does at predicting an execution trace (like in a
| JIT compiler) and whether this can help improve heuristics in
| things like path-finding. I'm wary because transformers are
| expensive.
| tintor wrote:
| "26.8% fewer search steps than standard A* search"
|
| So, slightly better than A*, which is far from SOTA on Sokoban
| (https://festival-solver.site/).
|
| What is impressive in this paper? Why is this on Hacker News?
| hlfshell wrote:
| What? Someone somewhere tried to do something and it wasn't
| the most optimal possible solution? We should just ban their
| account, honestly.
|
| Comments like these are antithetical to a strong technical
| sharing community.
| agumonkey wrote:
| Especially considering the number of trivial mainstream tech
| articles nowadays. It's cool to see more algorithmic topics.
| pqdbr wrote:
| I agree. OP's comment could be quickly rewritten into
| something useful just by changing the tone, for example:
|
| "26.8% fewer search steps than standard A* search" For
| reference of prior art, it's slightly better than A*, which
| is far from SOTA on Sokoban (https://festival-solver.site/).
| airstrike wrote:
| > Why is this on Hacker News?
|
| Anything that is on HN is on HN because the community likes it
| mindwok wrote:
| Some people find things interesting regardless of whether
| they break records.
| esafak wrote:
| Has anyone compared planning algorithms by complexity against
| optimality (error)?
| mromanuk wrote:
| Because they used a transformer to reach a nice solution,
| better than a typical A* search, which is the "naive" or
| go-to solution, and they didn't have to design an algorithm.
| I find it quite impressive that a simple encoder-decoder
| transformer can achieve so much.
| Negitivefrags wrote:
| So A* is the optimal search algorithm under the specific
| constraints it specifies. You can't do better.
|
| However, sometimes the specific domain you are searching has
| other constraints that can be exploited to do better than A*.
| An example of this is Jump Point Search, which exploits
| certain properties of grid searches when you can only move
| in certain ways.
|
| If you were able to write a general search algorithm that
| could effectively exploit whatever specific properties the
| underlying domain has, "automatically", without you having
| to work out what they are, that would be useful, right?
| tintor wrote:
| The paper's authors chose to compare against A* on Sokoban.
|
| A* can't solve even the first level of the original 90
| Sokoban levels.
| Analemma_ wrote:
| Because it's further evidence for the "unreasonable
| effectiveness of transformers", i.e. that transformers are a
| fully-general approach for _all_ kinds of learning tasks, not
| just next-token prediction. Obviously there is a strong and a
| weak version of that hypothesis and the strong version is
| probably not true, but to the extent that we appear to be
| homing in on Nature's "one true way" to learn how to
| accomplish a task, that seems like important news.
| DalasNoin wrote:
| It's on HN because it sounds similar to the so-called Q*
| algorithm, the supposedly secret OpenAI algorithm that has
| seen a huge amount of speculation.
| MooseBurger wrote:
| In certain computer science problems, a suboptimal action at
| time t may give rise to an optimal outcome at time >t.
|
| Why wouldn't this be the case for research generally? Has our
| community really devolved to the point where things should only
| be noteworthy insofar as they optimize for SOTA for a given
| problem?
|
| What a sad thought.
| RogerL wrote:
| It is in the abstract, very first line. "While Transformers
| have enabled tremendous progress in various application
| settings, such architectures still lag behind traditional
| symbolic planners for solving complex decision making tasks. In
| this work, we demonstrate how to train Transformers to solve
| complex planning tasks ..."
|
| This paper is interesting (to me) as an _example_ of how to
| use transformers for decision making. I don't particularly
| care if it is up to A* standards at the moment.
| tulio_ribeiro wrote:
| Amazing. Now do that to sorting algorithms.
| gene-h wrote:
| There has been more interesting work on using transformers for
| robot motion planning [0]. Getting a robot arm from point A
| to B without hitting stuff is a very difficult problem: it is
| both high-dimensional and continuous. Previous planning
| methods are both computationally intensive and not very good.
| This is one reason why robot motion appears 'unnatural' and
| robots are generally bad at many tasks we'd like them to do.
| This approach appears pretty competitive with other planning
| methods, planning near-optimal paths faster.
|
| [0] https://sites.google.com/ucsd.edu/vq-mpt/home
| goggy_googy wrote:
| This paper reminds me of the Neural Network Diffusion paper which
| was on the front page of HN yesterday in the sense that we are
| training another model to bypass a number of iterative steps (in
| the previous paper, those were SGD steps; in this one, A*
| exploration steps).
|
| On a different note, they chose such a bad heuristic for the
| A* for Sokoban. The heuristic they chose is "A* first matches
| every box to the closest dock and then computes the sum of
| all Manhattan distances between each box and dock pair". I
| played Sokoban for 20 minutes while reading the paper, and I
| feel this is a very poor exploration heuristic (you often
| need to move boxes away from the goal state to make
| progress).
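|
| In code, the heuristic as quoted is roughly this (my reading
| of the description; greedy closest-dock matching, so one dock
| can be counted for several boxes):
|
|   def sokoban_heuristic(boxes, docks):
|       # Sum over boxes of the Manhattan distance to the
|       # nearest dock. Never overestimates, but gives no
|       # credit for the detours Sokoban demands.
|       return sum(
|           min(abs(bx - dx) + abs(by - dy)
|               for dx, dy in docks)
|           for bx, by in boxes
|       )
|
| Since it never counts the moves that push a box away from a
| dock before it can reach one, it can underestimate badly,
| which matches your observation that it guides exploration
| poorly.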
| nyrikki wrote:
| I have a hunch they decided to train on that particular type
| of A* trace to avoid an exponential number of embeddings.
| cptroot wrote:
| I find it hard to believe that all the heavy-weight data
| processing and GPU computation really make a constant factor
| reduction in search steps worth it.
| SnowflakeOnIce wrote:
| It's also not clear to me how one would determine that an ML
| model-generated plan is indeed optimal, or how far from optimal
| it is. A*-based approaches give you these things.
| thesz wrote:
| Can transformers prove the absence of a solution? E.g., did
| they try to train them on unsolvable problems?
|
| I once tried my hand at PCB routing, and here's a simple
| problem of simultaneous multipath search that is unsolvable:
|
|     A.....b
|     B.....a
|
| A and B are starting points, and a and b are their
| respective goal points. It is a 2xN maze in which the order
| of the goal points is reversed relative to the starting
| points, so the two disjoint paths would have to cross.
|
| A search algorithm is sometimes required to prove the
| absence of a solution. A* can do that; transformers? I do
| not think so.
| nyrikki wrote:
| In this case, A* is in a finite, bounded 30x30 grid. A* is
| just a type of branch-and-bound algorithm whose degenerate
| form is exhaustive search, so it should be able to 'prove'
| the absence of a path by simply exhausting all paths without
| finding one.
|
| Real-world PCB routing is not as simple as this toy problem.
|
| Digging into why TSP with a positive integer Euclidean metric
| is NP-complete but with a real-valued Euclidean metric is
| NP-hard may be one way to think about it.
|
| But if you choose the right metric and heuristic, in what
| approximates a finite space, A* can run an exhaustive search
| in practical time.
|
| It is probably a good idea to separate search-space
| exhaustion from no-instance proving, which in the case of
| decision problems can cause confusion.
|
| If you think of NP as relating to yes-instances, co-NP is the
| no-instances.
| thesz wrote:
| You are a mathematician, am I right? Your answer is correct
| (sorta, as far as I am concerned) but does not add to the
| discussion I tried to open.
|
| Your answer does not show anything related to the ability
| of transformers to prove the absence of a solution, to be
| precise.
| tannhaeuser wrote:
| Planning is already well taken care of by established
| techniques ranging from graph search to SAT solvers, OR, and
| Prolog. The point usually is optimization over multiple
| feasible alternatives, which I have a hard time seeing
| transformers being up to. I see the role of LLM techniques
| more in translating natural-language descriptions into
| executable programs, but then Prolog is already very close,
| it being designed for classic NLP after all.
___________________________________________________________________
(page generated 2024-02-23 23:01 UTC)