[HN Gopher] Beyond A*: Better Planning with Transformers
       ___________________________________________________________________
        
       Beyond A*: Better Planning with Transformers
        
       Author : jonbaer
       Score  : 277 points
       Date   : 2024-02-23 11:53 UTC (11 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | teleforce wrote:
        | To the authors: please rename this new algorithm from the boring
        | Searchformer to a more interesting name, for example Optimus
        | Prime.
        
         | anentropic wrote:
         | also because googling for "Searchformer" brings up a different
         | paper and model:
         | https://www.sciencedirect.com/science/article/abs/pii/S01722...
         | 
         | "Planformer" seems unused and would reduce confusion
         | 
         | (also since "Search" implies RAG etc which is not what this
         | model is about)
        
           | krallistic wrote:
           | "Search" has been the name for the A* space. So it makes
           | absolute sense. "Planning" means in the symbolic AI space
           | often something quite in the direction of PDDL/STRIPS/ICAPS
           | planning.
        
             | nick12r55t1 wrote:
             | T*former or T-star
        
               | evanmoran wrote:
               | T* (T-star) is a great name.
        
               | shrubble wrote:
                | It is the trademarked name of a lens coating by Zeiss
                | Optical, however.
        
               | TeMPOraL wrote:
                | T1* and T2* are common terms in magnetic resonance
                | imaging, suggesting T_n_* for any integer _n_ should be
                | free of bullshit IP protection. May I thus suggest T10*,
                | following the pattern of a11y and i18n?
        
           | teleforce wrote:
            | Funny that you mention RAG, since the RAG authors expressed
            | regret at not using a better name, but the name stuck once it
            | became popular.
           | 
            | On second thought, in the spirit of A*, perhaps they can
            | rename it to O' (short for Optimus Prime), but this would
            | probably break most of the databases in the world [1], [2].
           | 
           | [1] Exploits of a Mom:
           | 
           | https://xkcd.com/327/
           | 
           | [2] Prime (symbol):
           | 
           | https://en.m.wikipedia.org/wiki/Prime_(symbol)
        
             | basil-rash wrote:
              | I can't think of any database that doesn't support the '
              | character in its varchar columns.
        
               | remexre wrote:
                | I think the concern was that all the buggy apps that
                | connect to databases would accidentally get SQL-injected.
        
               | basil-rash wrote:
                | If they're susceptible to injection, this is the absolute
                | least of their problems. In fact, their DB has already
                | been dropped.
        
           | rdedev wrote:
            | At this point we need a website cataloging all
            | transformer-related names.
        
         | leosanchez wrote:
         | Megatron would be better IMO.
        
           | verticalscaler wrote:
           | Clearly Megatron because LLMs are decepticons.
        
             | leosanchez wrote:
             | Yes that was my reasoning :)
        
               | wiz21c wrote:
               | just deceptive
        
           | moffkalast wrote:
           | Nvidia: "We'll pretend you didn't say that."
        
             | johndough wrote:
             | https://github.com/NVIDIA/Megatron-LM
             | 
             | > Megatron is a large, powerful transformer [...]
        
         | FooBarWidget wrote:
         | Wouldn't that be a trademark violation?
        
           | wongarsu wrote:
           | Trademarks are specific to classes of goods. A trademark for
           | toys named Optimus Prime doesn't prevent you from making
           | software called Optimus Prime.
        
             | mminer237 wrote:
              | While true, for something as famous as Optimus Prime, there
              | could conceivably be liability for dilution by blurring.
        
         | nkozyra wrote:
         | I think Searchformer is kind of catchy. :shrug:
        
         | willcipriano wrote:
         | Sacagawea is my proposed name.
        
       | jan_Sate wrote:
        | Wouldn't it be overkill to train a model specifically for that?
       | 
       | I wish someone could come up with an algorithm that would
       | outperform a trained model. This way we would actually understand
       | how it works.
        
         | jvanderbot wrote:
         | Well in this case, A* outperforms in terms of funding a
        | solution when it exists. It's just that the transformer produces
        | a solution faster, but misses some.
        
           | SnowflakeOnIce wrote:
           | Note that the paper doesn't measure execution time of the A*
           | or Transformer-based solutions; it compares length of A*
           | algorithm traces with length of traces generated by a model
           | trained on A* execution traces.
           | 
           | I suspect that the actual execution time would be far faster
           | with the A* implementation than any of the transformers.
        
           | bamboozled wrote:
            | Was the pun intended?
        
         | wongarsu wrote:
         | Shortest path search is a very well understood problem, and for
         | "simple" cases A* is the gold standard.
         | 
         | You could probably tweak the heuristic used by A* to the
         | problem to get similarly good results (at _much_ lower compute
         | cost than running what amounts to an LLM with huge context
         | length). But this is a toy problem. The interesting thing about
         | the paper is that you might be able to get good-performing
         | search algorithms for any problem with a cookie-cutter process,
         | instead of spending many man-hours tweaking a hand-written
         | search algorithm. It might also be possible to easily adapt it
         | to much more complex search problems, like cooperatively
         | evading other agents navigating in the same space.
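          | 
          | (Purely as an illustration of where that tweaking happens, not
          | anything from the paper: a textbook A* sketch with a pluggable
          | heuristic. The names grid_astar and manhattan are my own.)
          | 
          |   import heapq
          | 
          |   def manhattan(a, b):
          |       # Admissible heuristic for a 4-connected grid.
          |       return abs(a[0] - b[0]) + abs(a[1] - b[1])
          | 
          |   def grid_astar(walls, start, goal, h=manhattan):
          |       # walls: set of blocked (x, y) cells.
          |       frontier = [(h(start, goal), 0, start, None)]
          |       came_from, best_g = {}, {start: 0}
          |       while frontier:
          |           _, g, node, parent = heapq.heappop(frontier)
          |           if node in came_from:
          |               continue          # already expanded
          |           came_from[node] = parent
          |           if node == goal:      # walk parents back to start
          |               path = []
          |               while node is not None:
          |                   path.append(node)
          |                   node = came_from[node]
          |               return path[::-1]
          |           x, y = node
          |           for nxt in ((x+1, y), (x-1, y), (x, y+1), (x, y-1)):
          |               ng = g + 1
          |               if nxt not in walls and ng < best_g.get(nxt, 1e9):
          |                   best_g[nxt] = ng
          |                   f = ng + h(nxt, goal)
          |                   heapq.heappush(frontier, (f, ng, nxt, node))
          |       return None               # frontier exhausted: no path
          | 
          | Swapping manhattan for a better-informed (or learned) h is the
          | whole "tweaking" step; the search loop itself never changes.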
        
         | Moldoteck wrote:
          | Do you want the solution faster, or more precise? Depending on
          | the answer, you would pick either A* or this model.
        
           | SnowflakeOnIce wrote:
           | They don't report execution time in the paper. It's likely
           | that the A* implementation would run faster in terms of CPU
           | or wall clock.
        
             | Moldoteck wrote:
              | > Transformer model that optimally solves previously
              | > unseen Sokoban puzzles 93.7% of the time, while using up
              | > to 26.8% fewer search steps than standard A* search.
              | 
              | Doesn't fewer search steps imply faster?
        
               | senand wrote:
               | No, each step could take significantly more time (and
               | resources).
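                | 
                | Back-of-the-envelope, with made-up per-step costs just to
                | illustrate why fewer steps need not mean faster:
                | 
                |   a_steps, a_cost = 1000, 1e-6  # A* (hypothetical)
                |   m_steps, m_cost = 732, 1e-2   # model (hypothetical)
                |   print(a_steps * a_cost)       # 0.001 s total
                |   print(m_steps * m_cost)       # 7.32 s total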
        
         | feverzsj wrote:
         | Path planning is highly dynamic and requires real time
         | adjustment. In mature software, the heuristics are already
         | highly optimized for local road/traffic data and users. So, I
         | don't think AI can actually outperform them yet.
        
       | jvanderbot wrote:
       | I am extremely optimistic about using learned heuristics in
       | discrete algorithms like A* or Focal search or the various
       | families of ILP.
       | 
        | In most modern discrete optimization libraries, e.g. CPLEX, it's
        | the heuristics and tuning that explain the performance.
        | 
        | I'm less understanding of using an end-to-end learning approach
        | to replace a well-understood optimal search routine, but that
        | might be pearl clutching.
       | 
       | It just seems to me the authors missed that opportunity.
        
         | boesboes wrote:
         | I think it is just the bubble/hype effect around transformers
         | and AI. I might try to use transformers to solve tic-tac-toe
         | and apply for some VC moneys.
         | 
         | In a few years we'll all be writing about how much more
         | efficient actual code is compared to AI perhaps ;)
        
           | losvedir wrote:
           | Everyone on Transformer News will be talking about the latest
           | web framework for GPT-7 and how to horizontally scale their
           | system to handle dozens of requests a second, and how when
           | you need to scale beyond that it will "be a good problem to
           | have".
           | 
           | Meanwhile, you and I will be talking about how our low-level,
           | handwritten Ruby on Rails code is so much faster and more
           | efficient, and closer to the metal so you can understand
           | what's _really_ going on.
        
         | nvrmnd wrote:
          | I agree; learning admissible heuristics will retain worst-case
          | performance, which has always been the measuring stick for
          | these algorithms. It's not at all uncommon to find faster
          | solutions for the average or even p99 cases that cannot provide
          | guarantees on the worst case.
        
           | SnowflakeOnIce wrote:
           | How would one go about proving that a learned heuristic
           | (something from an AI model) is in fact admissible?
        
             | jvanderbot wrote:
              | For something like focal search, it doesn't even need
              | admissibility; you just apply it as a second selection
              | heuristic among the choices your admissible heuristic
              | returns as 'top k' choices.
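              | 
              | Rough sketch of that selection step (my own illustration;
              | learned_score stands in for whatever the learned heuristic
              | is):
              | 
              |   def focal_pick(open_heap, w, learned_score):
              |       # open_heap: heap of (f, node); w >= 1 is the
              |       # suboptimality bound. FOCAL is every node whose f is
              |       # within w of the best f; the learned heuristic only
              |       # picks among those, so the w-bound still holds even
              |       # if learned_score is useless.
              |       f_min = open_heap[0][0]
              |       focal = [n for f, n in open_heap if f <= w * f_min]
              |       return min(focal, key=learned_score)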
        
               | SnowflakeOnIce wrote:
               | So a tiebreaker then?
        
       | eterevsky wrote:
       | While A* could be used to solve Sokoban, wouldn't some sort of
        | Monte-Carlo Tree Search be better at it? I wonder how it would
        | compare to this model.
        
         | espadrine wrote:
         | The part of MCTS that matters for solving it is the Bellman
         | equation, which is a component of the Bellman-Ford shortest-
         | path algorithm, so all of those methods are part of a
         | continuum.
         | 
          | Which one works best is pretty sensitive to the exploratory
          | decision-making algorithm: do I search this cell first, or that
         | one? That heuristic is the differentiator, whether it is put on
         | top of a Dijkstra-style algorithm (A*) or a Bellman-Ford-style
         | algorithm (MCTS).
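          | 
          | The shared relaxation, written out (standard form, just for
          | reference):
          | 
          |   d(v) = \min_{(u,v) \in E} \bigl( d(u) + w(u,v) \bigr)
          | 
          | Dijkstra/A* applies it in best-first order (A* additionally
          | guided by the heuristic); Bellman-Ford sweeps it over all edges
          | repeatedly.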
        
       | abdellah123 wrote:
        | What's with Cornell University?! They constantly produce amazing
        | papers and projects on AI.
        
         | losvedir wrote:
         | What makes you say Cornell? Isn't this from Meta?
        
           | tekknolagi wrote:
            | I think it's either a joke or a misunderstanding about arXiv.
        
       | bravura wrote:
       | Machine Translation used to involve complicated grammatical
       | decoding, using search. Now we just use transformers for MT, with
       | much simpler decoding that doesn't really need search.
       | 
       | Perhaps we can reach full inception. Let's use our current best-
       | of-breed predictive models to learn heuristics for neural
        | architecture search (NAS) and search for new neural blocks
        | (better than the transformer and Mamba).
        
         | xg15 wrote:
          | ...and finally enter a world where literally no one understands
          | anymore how the technology works, not even the people
          | developing it. Singularity, here we come...
        
           | zwirbl wrote:
            | We could then go on to create something akin to the
            | Mechanicum, create the position of tech priests, and have
            | them pray to the machine god on our behalf.
        
             | moi2388 wrote:
              | If you think that still needs to be created, you haven't
              | worked in tech support.
        
               | TeMPOraL wrote:
                | The priest class is already there; what we need is to
                | level up the machine god itself.
        
             | smcameron wrote:
             | The tech priests will build electric monks to do that job.
        
           | throwuwu wrote:
           | Barring the singularity, just because you find something with
           | search doesn't mean you can't understand why it's better.
        
         | FeepingCreature wrote:
         | "Every time I fire a linguist, the performance of the speech
         | recognizer goes up." --Frederick Jelinek
        
       | amelius wrote:
        | Is someone keeping a list of classical algorithms or NP-complete
        | problems that are now better solved using deep learning?
        
         | scscsc wrote:
          | For your convenience, here is the list of NP-complete problems
          | where "AI" works better than the state of the art in the worst
          | case:
        
           | cs702 wrote:
           | Thank you for that. I needed the laugh :-)
        
           | Davidzheng wrote:
           | Probably soon AI will be able to find improvements to most of
           | them? Like using AI to do search in algorithm space
        
             | thomasahle wrote:
              | For all we know, some of the current best (still
              | exponential) algorithms were guided by AI. If a
              | mathematician solves a problem using Mathematica, they
              | don't usually write in the paper what tools they used.
        
           | jvanderbot wrote:
            | AI + classical algorithms is my sweet daydream: trained
            | heuristics (even better, domain-specific ones), deployed for
            | classical A*, ILP families, focal search, etc.
           | 
           | That is going to be really amazing.
        
             | amelius wrote:
             | Even solving large linear systems would already be amazing.
             | But a SAT solver would be nice too :)
        
               | jvanderbot wrote:
                | Maybe I misunderstand your comment. We have those
                | solvers. I'm suggesting AI would plug into existing
                | solvers. This is a ripe area for research.
        
             | blt wrote:
              | It is happening. There are papers on using deep learning to
              | improve variable selection in branch-and-bound, etc.
        
               | jvanderbot wrote:
               | Yeah my first exposure was from Marco Pavone's lab, on ML
               | heuristics for MICP in the context of a cubical tumbling
               | robot, IIRC. Really cool stuff.
        
           | nyrikki wrote:
            | For those of you who didn't get the joke here:
            | 
            | (S)ETH is a bit of a bummer if you only consider the dominant
            | term in big-O, for the general case.
           | 
           | https://web.stanford.edu/class/cs354/scribe/lecture17.pdf
        
         | j2kun wrote:
         | From my understanding this is very much still active research,
         | without any clear wins deployed in production settings.
        
       | k2xl wrote:
        | Shameless plug, but if you are interested in Sokoban-type games,
        | check out https://thinky.gg . It features a fun Sokoban variant
        | (called Sokopath) as well as another NP-hard variant called
        | Pathology (the goal is to get from point A to point B in the
        | fewest steps).
        | 
        | Many in the community have tried to create various solvers, and
        | things get very hard once the grid gets to be over 5x5. However,
        | some interesting levels with very high maximum step counts have
        | been discovered by the thinky community (via simulated
        | annealing).
        
         | nikolay wrote:
         | Thank you! Great games! Super nice execution! I love it!
        
       | paidcompute wrote:
       | Pretty impressive
        
       | light_hue_1 wrote:
       | > 26.8% fewer search steps than standard A* search
       | 
       | So you're saying I shouldn't read this paper?
       | 
        | A* is not particularly good. If all the algorithm can do is
        | mirror it by training on its output, what's the point?
        
         | casebash wrote:
          | Yeah, at first I read that as it using only 26.8% of the
          | original steps, but reducing the number of steps by 26.8% is
          | not that impressive. I wonder whether it actually reduces total
          | search time, given the added overhead of running the neural
          | network.
        
           | light_hue_1 wrote:
           | There's absolutely no way it reduces search time. A* is
           | trivial to run per timestep. The time must be thousands to
           | maybe hundreds of thousands of times slower.
           | 
           | I publish in this area and this is a common thing for
           | reviewers to bring up when authors don't report wall clock
           | time. And then for papers to be rejected. What's the value in
           | making an algorithm that's drastically slower? Not much.
        
             | ta8645 wrote:
             | > What's the value in making an algorithm that's
             | drastically slower?
             | 
             | Perhaps as an important stepping stone? Deferred
             | optimization and all that.
        
       | ultra_nick wrote:
       | If transformers can plan, then AGI only requires better
       | education...
        
         | throwuwu wrote:
          | There are a lot more pieces: agency is a big one, but online
          | learning is required too, and many other layers beyond that.
        
         | th0ma5 wrote:
          | That's probably the foreseeable future: ingesting ever larger
          | amounts of data, hoping it prevents hallucinations.
        
         | nyrikki wrote:
         | Approximating exhaustive search is not logic or causality.
        
       | adamnemecek wrote:
        | Both ML and A* are integral transforms.
        
         | Salgat wrote:
         | What do you mean by that?
        
       | adi4213 wrote:
       | For the auditory learners, here is this paper in a summarized
        | audiobook format:
       | 
       | https://player.oration.app/09fefe41-f2a7-4257-a25e-30e479b30...
        
         | syassami wrote:
         | Thanks for sharing, I've been looking for an app like this for
         | ages.
        
           | adi4213 wrote:
           | Thank you so much for your comment! Apologies if my comment
           | came off like an ad - I have mild dyslexia and struggle to
           | keep up with ML papers; I hope others also find this useful!
           | If you have any feedback or issues, feel free to email us at
           | support [at] trurecord.com
        
       | softfalcon wrote:
        | I wonder if they tried the modified J* algorithm (an optimization
        | of A* for game graph/path-finding) before going down this
        | research route.
       | 
       | It's in Game AI Pro 2 [0] if anyone is curious.
       | 
       | [0] https://www.amazon.ca/Game-AI-Pro-Collected-
       | Professionals/dp...
        
         | mysterydip wrote:
         | I love these books (and nice to see Steve Rabin still doing
         | them), but $120 for an ebook? That's unexpected.
        
           | hlfshell wrote:
            | If you consider it like a textbook (very small niche audience
            | for very specialized knowledge), it does kind of match up,
            | given the small expected sales volume.
           | 
           | Authors still gotta eat.
        
           | setr wrote:
            | Worse yet... the Game AI Pro books are entirely free as
            | digital copies. You'd basically be paying $120 for them to be
            | stitched together into a single PDF, instead of one per
            | chapter.
           | 
           | http://www.gameaipro.com/
        
             | cjaybo wrote:
             | That's one expensive imagemagick command!
        
         | leeoniya wrote:
         | also https://github.com/anvaka/ngraph.path
        
         | ogogmad wrote:
          | To be fair, they said at the end that their path-finder is not
          | yet competitive with the SOTA. The paper tests how well a
          | transformer does at predicting an execution trace (like in a
          | JIT compiler) and whether this can help improve heuristics in
          | things like path-finding. I'm wary because transformers are
          | expensive.
        
       | tintor wrote:
       | "26.8% fewer search steps than standard A* search"
       | 
        | So, slightly better than A*, which is far from SOTA on Sokoban
       | (https://festival-solver.site/).
       | 
       | What is impressive in this paper? Why is this on Hacker News?
        
         | hlfshell wrote:
         | What? Someone somewhere tried to do something and wasn't the
         | most optimal possible solution? We should just ban their
         | account honestly.
         | 
         | Comments like these are antithetical to a strong technical
         | sharing community.
        
           | agumonkey wrote:
           | Especially considering the amount of trivial mainstream tech
           | articles nowadays. It's cool to see more algorithmic topics.
        
           | pqdbr wrote:
            | I agree. OP's comment could be quickly rewritten into
            | something useful just by changing the tone, for example:
            | 
            | "26.8% fewer search steps than standard A* search." For
            | reference on prior art, it's slightly better than A*, which
            | is far from SOTA on Sokoban (https://festival-solver.site/).
        
         | airstrike wrote:
         | > Why is this on Hacker News?
         | 
         | Anything that is on HN is on HN because the community likes it
        
         | mindwok wrote:
          | Some people find things interesting regardless of whether they
          | break records.
        
         | esafak wrote:
         | Has anyone compared planning algorithms by complexity against
         | optimality (error)?
        
         | mromanuk wrote:
          | Because they used a transformer to reach a nice solution,
          | better than a typical A* search, which is the "naive" or go-to
          | solution. And they didn't have to think about designing an
          | algorithm. I find it quite impressive that a simple
          | encoder-decoder transformer can achieve so much.
        
         | Negitivefrags wrote:
          | So A* is the optimal search algorithm under the specific
          | constraints it assumes. You can't do better.
          | 
          | However, sometimes the specific domain you are searching has
          | other constraints that can be exploited to do better than A*.
          | An example of this is Jump Point Search, which exploits certain
          | properties of grid searches if you can only move in certain
          | ways.
          | 
          | If you were able to write a general search algorithm that can
          | effectively exploit whatever specific properties the underlying
          | domain has, "automatically," without you having to actually
          | work out what they are, that would be useful, right?
        
           | tintor wrote:
            | The paper's authors chose to compare against A* and Sokoban.
            | 
            | A* can't solve even the first of the original 90 Sokoban
            | levels.
        
         | Analemma_ wrote:
          | Because it's further evidence for the "unreasonable
          | effectiveness of transformers", i.e. that transformers are a
          | fully-general approach for _all_ kinds of learning tasks, not
          | just next-token prediction. Obviously there is a strong and a
          | weak version of that hypothesis and the strong version is
          | probably not true, but to the extent that we appear to be
          | homing in on Nature's "one true way" to learn how to accomplish
          | a task, that seems like important news.
        
         | DalasNoin wrote:
          | It's on HN because it sounds similar to the so-called Q*
          | algorithm, the supposed secret OpenAI algorithm that has seen a
          | huge amount of speculation.
        
         | MooseBurger wrote:
         | In certain computer science problems, a suboptimal action at
         | time t may give rise to an optimal outcome at time >t.
         | 
         | Why wouldn't this be the case for research generally? Has our
         | community really devolved to the point where things should only
         | be noteworthy insofar as they optimize for SOTA for a given
         | problem?
         | 
         | What a sad thought.
        
         | RogerL wrote:
         | It is in the abstract, very first line. "While Transformers
         | have enabled tremendous progress in various application
         | settings, such architectures still lag behind traditional
         | symbolic planners for solving complex decision making tasks. In
         | this work, we demonstrate how to train Transformers to solve
         | complex planning tasks ..."
         | 
          | This paper is interesting (to me) as an _example_ of how to use
          | transformers for decision making. I don't particularly care if
          | it is up to A* standards at the moment.
        
       | tulio_ribeiro wrote:
       | Amazing. Now do that to sorting algorithms.
        
       | gene-h wrote:
        | There has been more interesting work on using transformers for
        | robot motion planning [0]. Getting a robot arm from point A to
        | point B without hitting stuff is a very difficult problem: it is
        | both high-dimensional and continuous. Previous planning methods
        | are both computationally intensive and not very good. This is
        | one reason why robot motion appears 'unnatural' and why robots
        | are generally bad at many tasks we'd like them to do. This
        | approach appears pretty competitive with other planning methods,
        | planning near-optimal paths faster.
        | 
        | [0] https://sites.google.com/ucsd.edu/vq-mpt/home
        
       | goggy_googy wrote:
        | This paper reminds me of the Neural Network Diffusion paper that
        | was on the front page of HN yesterday, in the sense that we are
        | training another model to bypass a number of iterative steps (in
        | the previous paper, those were SGD steps; in this one, A*
        | exploration steps).
        | 
        | On a different note, they chose such a bad heuristic for the A*
        | for Sokoban. The heuristic they chose is "A* first matches every
        | box to the closest dock and then computes the sum of all
        | Manhattan distances between each box and dock pair". I played
        | Sokoban for 20 minutes while reading the paper, and I feel like
        | this is a very poor exploration heuristic (you often need to move
        | boxes away from the goal state to make progress).
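        | 
        | For concreteness, that heuristic amounts to something like this
        | (a sketch in my own words and names, matching the description
        | quoted above):
        | 
        |   def sokoban_heuristic(boxes, docks):
        |       # Sum, over boxes, of the Manhattan distance to the
        |       # nearest dock. Ignores walls, other boxes, deadlocks,
        |       # and the need to push boxes away from docks first.
        |       total = 0
        |       for bx, by in boxes:
        |           total += min(abs(bx - dx) + abs(by - dy)
        |                        for dx, dy in docks)
        |       return total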
        
         | nyrikki wrote:
          | I have a hunch they decided to train on that particular type of
          | A* trace to avoid an exponential number of embeddings.
        
       | cptroot wrote:
        | I find it hard to believe that all the heavyweight data
        | processing and GPU computation really make a constant-factor
        | reduction in search steps worth it.
        
         | SnowflakeOnIce wrote:
         | It's also not clear to me how one would determine that an ML
         | model-generated plan is indeed optimal, or how far from optimal
         | it is. A*-based approaches give you these things.
        
       | thesz wrote:
        | Can transformers prove the absence of a solution? E.g., did they
        | try to train them on unsolvable problems?
       | 
        | I once tried my hand at PCB routing, and here's a simple problem
        | of simultaneous multipath search that is unsolvable:
        | 
        |   A.....b
        |   B.....a
        | 
        | A and B are starting points and a and b are their respective
        | goal points. It is a 2xN maze; the order of the starting and
        | goal points is reversed.
       | 
        | The search algorithm is sometimes required to prove the absence
        | of a solution. A* can do that; transformers? I do not think so.
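        | 
        | (For the curious, a brute-force check of that instance; my own
        | illustration, nothing from the paper. It enumerates every simple
        | path for the A-a net and tests whether the B-b net can still be
        | routed around it.)
        | 
        |   from itertools import product
        | 
        |   N = 7
        |   cells = set(product(range(N), range(2)))  # 2 x N grid
        |   A, b = (0, 0), (N - 1, 0)                 # top row
        |   B, a = (0, 1), (N - 1, 1)                 # bottom row
        | 
        |   def neighbors(c):
        |       x, y = c
        |       cand = ((x+1, y), (x-1, y), (x, y+1), (x, y-1))
        |       return [s for s in cand if s in cells]
        | 
        |   def paths(src, dst, used):
        |       # Yield the vertex sets of all simple src-dst paths.
        |       if src == dst:
        |           yield used | {dst}
        |           return
        |       for n in neighbors(src):
        |           if n not in used:
        |               yield from paths(n, dst, used | {src})
        | 
        |   def reachable(src, dst, blocked):
        |       if src in blocked or dst in blocked:
        |           return False
        |       seen, stack = {src}, [src]
        |       while stack:
        |           c = stack.pop()
        |           if c == dst:
        |               return True
        |           for n in neighbors(c):
        |               if n not in blocked and n not in seen:
        |                   seen.add(n)
        |                   stack.append(n)
        |       return False
        | 
        |   ok = any(reachable(B, b, p) for p in paths(A, a, set()))
        |   print(ok)  # False: no disjoint routing exists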
        
         | nyrikki wrote:
          | In this case, A* is working in a finite, bounded 30x30 grid.
          | A* is just a type of branch-and-bound algorithm, whose
          | degenerate form is exhaustive search, so it should be able to
          | 'prove' the absence of a path by simply failing to find one
          | before exhausting all paths.
         | 
          | Real-world PCB routing is not as simple as this problem.
         | 
          | Digging into why TSP with a positive integer Euclidean metric
          | is NP-complete but with a real-valued Euclidean metric is
          | NP-hard may be one way to think about it.
         | 
          | But if you choose the right metric and heuristic, in what
          | approximates a finite space, A* can run an exhaustive search in
          | practical time.
         | 
          | It is probably a good idea to separate search-space exhaustion
          | from no-instance proving, which in the case of decision
          | problems can cause confusion.
          | 
          | If you think of NP as relating to yes-instances, co-NP relates
          | to the no-instances.
        
           | thesz wrote:
            | You are a mathematician, am I right? Your answer is correct
            | (sorta, as far as I am concerned) but does not add to the
            | discussion I tried to open.
            | 
            | Your answer does not show anything related to the ability of
            | transformers to prove the absence of a solution, to be
            | precise.
        
       | tannhaeuser wrote:
        | Planning is already well taken care of by established techniques:
        | graph search, SAT solvers, OR, Prolog, etc. The point usually is
        | optimization over multiple feasible alternatives, which I have a
        | hard time seeing transformers being up to. I see the role of LLM
        | techniques more in translating natural language descriptions to
        | executable programs - but then Prolog is already very close, it
        | being designed for classic NLP after all.
        
       ___________________________________________________________________
       (page generated 2024-02-23 23:01 UTC)