[HN Gopher] Adversarial policies beat superhuman Go AIs (2023)
       ___________________________________________________________________
        
       Adversarial policies beat superhuman Go AIs (2023)
        
       Author : amichail
       Score  : 230 points
       Date   : 2024-12-23 13:10 UTC (1 day ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | BryanLegend wrote:
       | In the Go world it sometimes happens that complete amateurs are
       | challenging to play against, because their moves are so
       | unpredictable and their shapes are so far from normal. Wildly
       | bizarre play sometimes works.
        
         | PartiallyTyped wrote:
         | Magnus (Carlsen, chess) does this often: he pushes people into
         | unknown territory they are almost certainly underprepared for,
         | using new or obscure openings that complicate a position very
         | quickly. The game then turns tactical and they eventually find
         | themselves in a bad endgame, one against Magnus of all people.
        
           | ozim wrote:
           | Just in case someone thinks Magnus comes up with those
           | openings on the spot:
           | 
           | No, he has a team that uses computers to find those lines
           | based on what the other player has played, since all past
           | matches are available.
           | 
           | Source: I watched an interview with a guy who was hired for
           | a computer-science consulting gig by Magnus's team.
           | 
           | It does not take away from how good he is, as I don't think
           | many people could memorize weird openings and win with them
           | against grandmaster-level players anyway.
        
             | rybosworld wrote:
             | I remember reading that his memory is unrivaled - so this
             | also isn't a strategy the other top players could simply
             | copy.
             | 
             | In chess, there are basically three ways to evaluate moves
             | 
             | 1) pure calculation
             | 
             | 2) recognize the position (or a very similar one) from a
             | previous game, and remember what the best move was
             | 
             | 3) intuition - this one is harder to explain, but I think
             | of it like instinct/muscle memory
             | 
             | All the top players are good at all of these things, but
             | some are generally agreed to be much better at one of them
             | than the others. Magnus is widely agreed to have the best
             | memory. The contender for best calculator might be
             | Fabiano.
             | 
             | In humans, all else being equal, memory seems to be
             | superior to calculation, because calculation takes time.
             | 
             | Chess engines seem to reverse this, with calculation being
             | better than memory, because memory is expensive.
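
The memory-vs-calculation trade-off the comment describes maps onto a standard engine structure, the transposition table: positions already calculated are cached so they are not recalculated. A minimal Python sketch (toy position strings and a toy evaluation function, not any real engine's code):

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)  # the "memory": cached evaluations keyed by position
def evaluate(position: str) -> int:
    """Pretend static evaluation; the point is only that it is costly."""
    global calls
    calls += 1  # count real "calculations" so cache hits are visible
    return sum(ord(ch) for ch in position) % 100

score1 = evaluate("rnbqkbnr/pppppppp")  # computed: one real calculation
score2 = evaluate("rnbqkbnr/pppppppp")  # transposition: served from memory
assert score1 == score2
assert calls == 1  # only one actual calculation happened
```

In a real engine the cache is a fixed-size hash table, which is exactly why "memory is expensive": entries must constantly be evicted.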
        
               | zdragnar wrote:
               | This is the reason why I couldn't ever get into chess,
               | despite my dad and brother enjoying it. My intuition was
               | crap (having not developed it) and I lacked the ability
               | or desire to fully visualize multiple steps of the game.
               | 
               | All that remained was rote memorization, which makes for
               | a boring game indeed.
               | 
               | Despite all of that, I suspect chess will long outlive my
               | preferred entertainment of Unreal Tournament.
        
               | Retric wrote:
               | The magic of chess is in matchmaking.
               | 
               | I enjoy playing on nearly pure intuition, so I just use
               | that strategy and see the same ~50/50 win percentage as
               | most players, because my Elo is based on my past games
               | and there are millions of online players across a huge
               | range of skill levels.
               | 
               | There's nothing wrong with staying at 1000 or even 300
               | if that's what it takes to enjoy the game. It's only
               | wanting to beat specific people or raise your Elo that
               | forces you to try to optimize your play.
        
               | stackghost wrote:
               | I hate ladder systems. Winning is fun and losing is not.
               | Why would I purposely choose to play a game/system where
               | your win rate does not meaningfully improve as you skill
               | up?
               | 
               | That sounds frustrating and tedious. If I get better I
               | want to win more often.
        
               | spiritplumber wrote:
               | I stopped enjoying chess because a game in which you
               | always lose is no fun; the only winning move is not to
               | play.
        
               | ANewFormation wrote:
               | His memory is definitely rivaled. During the recent
               | speed chess championship broadcast they had Magnus,
               | Hikaru, Alireza, and some other top players play little
               | games testing memory, response rate, and so on.
               | 
               | The memory game involved memorizing highlighted circles
               | on a grid, so it was even something ostensibly chess-
               | adjacent. Magnus did not do particularly well. Even when
               | playing a blindfold simul against 'just' 5 people (the
               | record is 48) he lost track of the positions (slightly)
               | multiple times and eventually lost 2 of the games on
               | time.
               | 
               | But where Magnus is completely unrivaled is in intuition.
               | His intuition just leads him in a better direction faster
               | than other top players. This is both what makes him so
               | unstoppable in faster time controls, and also so
               | dangerous in obscure openings where he may have
               | objectively 'meh' positions, but ones where the better
               | player will still win, and that better player is just
               | about always him.
        
               | vlovich123 wrote:
               | While Magnus has a very strong memory (as do all players
               | at that caliber), his intuition is regarded by others
               | and by himself as his strongest quality, and he
               | constantly talks about how intuitive a player he is
               | compared with others.
               | 
               | https://www.youtube.com/watch?v=N-gw6ChKKoo
        
             | nordsieck wrote:
             | > It does not take away how good he is
             | 
             | Honestly, your anecdote makes me respect him even more.
             | 
             | Few people go to those lengths to prepare.
        
               | llamaimperative wrote:
               | I would presume almost every chess grandmaster does the
               | same, no? And in that case there's nothing particularly
               | ingenious about it.
               | 
               | Maybe it doesn't reduce my image of any individual
               | player, but it does reduce my image of the game itself.
        
               | drivers99 wrote:
               | I had a brief rabbit hole about chess at the beginning of
               | this year and found out a few things pros do to prepare
               | against their opponents. I was trying to remember one
               | specific periodical, but I found it: Chess Informant. 320
               | page paperback (and/or CD! - I see they also have a
               | downloadable version for less[2]) quarterly periodical
               | full of games since the last one. Looks like they're up
               | to volume 161.[1] I suppose pros also get specific games
               | they want sooner than that, especially now with
               | everything being streamed, but anyway. There's a lot more
               | going on in chess that is just as important as the time
               | spent actually playing in the tournament.
               | 
               | [1] https://sahovski.com/Chess-Informant-161-Olympic-
               | Spirit-p695... [2] https://sahovski.com/Chess-
               | Informant-161-DOWNLOAD-VERSION-p6...
        
             | kizer wrote:
             | That's very interesting. However, it's like any of the
             | organizations that support competitors at elite levels in
             | all sports: from the doctors, nutritionists, and coaches
             | who support Olympic athletes to the "high command" of any
             | NFL team coordinating over headsets with one another and
             | the coach, who can even radio the quarterback on the field
             | (I don't think there is another sport with this).
        
               | bronson wrote:
               | Auto racing? Even has telemetry.
        
             | tejohnso wrote:
             | Do you think that this kind of inorganic requirement is
             | part of the reason he abandoned World Chess?
        
               | tomtomtom777 wrote:
               | No. He did not abandon "World Chess". He is still an
               | active player.
               | 
               | He chooses not to participate in the FIDE World
               | Championship primarily because he doesn't like the
               | format. He prefers a tournament format to a long 1-on-1
               | match against the reigning champion.
        
           | ANewFormation wrote:
           | Fwiw this is normal in chess nowadays. There was some brief
           | era in chess where everybody was just going down the most
           | critical lines and assuming they could outprepare their
           | opponents, or outplay them if that didn't work out. Kasparov
           | and Fischer are the typical examples of this style.
           | 
           | But computers have made this less practical in modern times,
           | simply because it's so easy to lose in these sorts of
           | positions to the endless number of computer-prepped
           | novelties, which may be objectively mediocre but nigh
           | impossible to play against without preparation, against a
           | prepared opponent.
           | 
           | So a lot of preparation nowadays is about getting positions
           | that may not be the most critical test of an opening, but
           | that lead to interesting positions where the first player to
           | spring a novelty isn't going to just steamroll the other
           | guy.
           | 
           | So in this brave new world you see things like the Berlin
           | Defense becoming hugely popular while the Najdorf has
           | substantially declined in popularity.
        
           | Sesse__ wrote:
           | It is true that Magnus usually prefers offbeat lines to get
           | out of the opponent's preparation. However, they're rarely
           | very sharp or otherwise tactically complicated; on the
           | contrary, he excels at slow maneuvering in strategic
           | positions (and, as you say, the endgame).
        
         | giraffe_lady wrote:
         | Challenging in the sense that you have to work through
         | positions you're not very practiced at. Not "challenging" in
         | the sense that you might lose the game though.
        
         | tasuki wrote:
         | No it does not.
         | 
         | (Source: I'm European 4 dan. I wipe the go board with weaker
         | players playing whatever unconventional moves they like.
         | Likewise, I get crushed by stronger players, faster than usual
         | if I choose unusual moves. This might work on like the double-
         | digit kyu level...)
        
       | emusan wrote:
       | There is hope for us lowly humans!
        
       | throwaway81523 wrote:
       | From 2022, revised 2023, I may have seen it before and forgotten.
       | It is pretty interesting. I wonder how well the approach works
       | against chess engines, at least Leela-style.
        
       | thiago_fm wrote:
       | I bet that there's a similarity between this and what happens to
       | LLM hallucinations.
       | 
       | At some point we will realize that AI will never be perfect, it
       | will just have much better precision than us.
        
         | kachapopopow wrote:
         | I honestly see hallucinations as an absolute win: the model is
         | attempting to predict/'reason' out information from the
         | training data it has.
        
           | voidfunc wrote:
           | I don't think I see them as a win, but they're easily dealt
           | with. AI will need analysts at the latter stage to evaluate
           | the outputs but that will be a relatively short-lived
           | problem.
        
             | bubaumba wrote:
             | > I don't think I see them as a win
             | 
             | Unavoidable, probably
             | 
             | > but they're easily dealt with. AI will need analysts at
             | the latter stage to evaluate the outputs but that will be a
             | relatively short-lived problem
             | 
             | That only solves it to some degree. Hallucinations may
             | happen at this stage too; then either a correct answer
             | gets rejected or a false one passes through.
        
           | zdragnar wrote:
           | I think this is a misuse of the term hallucination.
           | 
           | When most people talk about AI hallucinating, they're
           | referring to output which violates some desired constraints.
           | 
           | In the context of chess, this would be making an invalid
           | move, or upgrading a knight to a queen.
           | 
           | In other contexts, some real examples are fabricating court
           | cases and legal precedent (several lawyers have gotten in
           | trouble here), or a grocery store recipe generator
           | recommending mixing bleach and ammonia for a delightful
           | cocktail.
           | 
           | None of these hallucinations are an attempt to reason about
           | anything. This is why some people oppose using the term
           | hallucination: it is an anthropomorphizing term that gives
           | too much credit to the AI.
           | 
           | We can tighten the band of errors with more data or compute
           | efficiency or power, but in the search for generic AI, this
           | is a dead end.
        
             | wat10000 wrote:
             | It's weird because there's no real difference between
             | "hallucinations" and other output.
             | 
             | LLMs are prediction engines. Given the text so far, what's
             | most likely to come next? In that context, there's very
             | little difference between citing a real court case and
             | citing something that sounds like a real court case.
             | 
             | The weird thing is that they're capable of producing any
             | useful output at all.
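
The point that a real citation and a fabricated one come out of the same prediction step can be made concrete with a toy next-token model (the counts below are invented; a real LLM is nothing this simple):

```python
from collections import Counter

# Toy "model": continuations observed after a prompt like "Smith v. ".
# It has seen many real case names, so any plausible-sounding surname
# accumulates probability mass, real or not.
continuations = Counter({"Jones": 5, "Ohio": 3, "Madison": 2})

def predict(counts: Counter) -> str:
    """Greedy decoding: emit the most likely continuation."""
    return counts.most_common(1)[0][0]

# The model emits "Jones" whether or not "Smith v. Jones" names a real
# case: the sampling step is identical for true and fabricated output.
assert predict(continuations) == "Jones"
```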
        
         | bubaumba wrote:
         | > it will just have much better precision than us.
         | 
         | and much faster, with the right hardware. And that's enough if
         | AI can do in seconds what takes humans years. With o3 it looks
         | like price is the only limit.
        
       | JKCalhoun wrote:
       | "Not chess, Mr. Spock. Poker!"
        
         | white_beach wrote:
         | in the retard universe chess becomes an incomplete information
         | game when you add dimensions
        
       | tantalor wrote:
       | > You beat him!
       | 
       | >> No sir, it is a stalemate.
       | 
       | > What did you do?
       | 
       | >> I was playing for a standoff; a draw. While Kolrami was
       | dedicated to winning, I was able to pass up obvious avenues of
       | advancement and settle for a balance. Theoretically, I should be
       | able to challenge him indefinitely.
       | 
       | > Then you have beaten him!
       | 
       | >> In the strictest sense, I did not win.
       | 
       | > (groans) Data!
       | 
       | >> I busted him up.
       | 
       | > (everybody claps)
        
         | cwillu wrote:
         | You... have made a _mockery_ of me.
        
           | moffkalast wrote:
           | It is possible to commit no mistakes and still lose.
           | 
           | That's not a weakness.
           | 
           | That's life.
        
             | cwillu wrote:
             | But, knowing that he knows that we know that he knows, he
             | might choose to return to his usual pattern.
        
         | taneq wrote:
         | No idea if they did this on purpose but this is exactly what
         | can happen with board game AIs when they know they will win.
         | Unless the evaluation function explicitly promotes winning
         | _sooner_ they will get into an unbeatable position and then
         | just fardle around because they have no reason to win _now_ if
         | they know they can do it later.
        
           | cjbgkagh wrote:
           | Future payoffs are almost always discounted, even if for no
           | other reason than that the future carries a greater degree
           | of uncertainty. I.e., even if it were not explicit (which it
           | almost always is), it would still be implicit.
           | 
           | Their conservative style is usually due to having a better
           | fitness function. Humans tend to not be able to model
           | uncertainty as accurately and this results in more aggressive
           | play, a bird in the hand is worth two in the bush.
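
The discounting point can be sketched directly: with any discount factor below 1, the same guaranteed win is worth more the sooner it arrives. A minimal example (hypothetical gamma and ply counts, not any engine's actual reward scheme):

```python
def discounted_win_value(plies_until_win: int, gamma: float = 0.99) -> float:
    """Present value of a guaranteed win (reward 1.0) arriving after N plies."""
    return gamma ** plies_until_win

fast = discounted_win_value(10)  # win in 10 plies
slow = discounted_win_value(50)  # the same win, delayed 40 plies
assert fast > slow  # discounting alone already prefers finishing early
```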
        
             | taneq wrote:
             | Typically yeah, but when you're trying to make it work at
             | all it can be easy to forget to add a bit of a gradient
             | towards "winning sooner is better". And this happens even
             | at the top level, the example I was thinking about as I
             | typed that was one of the AlphaGo exhibition games against
             | Lee Sedol (the first, maybe?) where it got into a crushing
             | position then seemingly messed around.
        
               | cjbgkagh wrote:
               | There is zero chance AlphaGo devs forgot about
               | discounting. Usually you relax the discount to allow for
               | optimal play, most likely the fitness function flailed a
               | bit in the long tail.
        
             | kqr wrote:
             | Indeed. Humans use "points ahead" as a proxy for "chance of
             | win" so we tend to play lines that increase our lead more,
             | even when they are a tiny bit riskier. Good software does
             | not -- it aims for maximum chance of win, which usually
             | means slower, less aggressive moves to turn uncertain
             | situations into more well-defined ones.
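
The two objectives can diverge on the same pair of candidate moves. A toy illustration (the win probabilities and point margins below are invented for the example):

```python
# (win_probability, expected_point_margin) for two hypothetical moves
moves = {
    "risky_invasion": (0.80, +12.0),  # big lead if it works, may backfire
    "solid_defense":  (0.95, +3.0),   # small but near-certain lead
}

human_style = max(moves, key=lambda m: moves[m][1])   # maximize margin
engine_style = max(moves, key=lambda m: moves[m][0])  # maximize win chance

assert human_style == "risky_invasion"
assert engine_style == "solid_defense"
```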
        
           | nilslindemann wrote:
           | Example: https://lichess.org/study/kPWZgp6s/nwqy2Hwg
        
           | gweinberg wrote:
           | Doesn't the board get filled up with stones? I could see how
           | a go player might think a win is a win, so it doesn't matter
           | how many stones you win by, but I don't see how you would go
           | about delaying winning.
        
             | zahlman wrote:
             | >Doesn't the board get filled up with stones?
             | 
             | To some extent, but a player who's way ahead could still
             | have a lot of latitude to play pointless moves without
             | endangering the win. In the case of Go it's generally not
             | so much "delaying winning" as just embarrassing the
             | opponent by playing obviously suboptimal moves (that make
             | it clearer that some key group is dead, for example).
             | 
             | Although it's possible to start irrelevant, time-wasting ko
             | positions - if the opponent accepts the offer to fight over
             | them.
        
           | zahlman wrote:
           | When I was a child, I didn't understand that episode as Data
           | demonstrating his superiority at the game by deliberately
           | keeping it evenly-matched, or that the alien opponent somehow
           | realized that Data could win at any time and simply chose not
           | to.
           | 
           | Rather, I figured Data had come up with some hitherto-unknown
           | strategy that allowed for making the game arbitrarily long;
           | and that the alien had a choice between deliberately losing,
           | accidentally losing (the way the game is depicted, it gets
           | more complex the longer you play) or continuing to play
           | (where an android wouldn't be limited by biology). (No, I
           | didn't phrase my understanding like that, or speak it aloud.)
        
         | dataviz1000 wrote:
         | Wasn't this the plot to War Games (1983)?
        
           | ncr100 wrote:
           | Q: If the AIs are trained on adversarial policies, will this
           | strategy also start to fail in these game-playing scenarios?
           | 
           | EDIT: Discussed later on
           | https://news.ycombinator.com/item?id=42503110
        
             | renewiltord wrote:
             | > _The core vulnerability uncovered by our attack persists
             | even in KataGo agents adversarially trained to defend
             | against our attack_
        
               | ncr100 wrote:
               | Thanks!
        
       | billforsternz wrote:
       | This seems amazing at first sight. It's probably just me, but I
       | find the paper very hard to understand, even though I know a
       | little about Go and Go AI and a lot about chess and chess AI.
       | They seem to expend the absolute minimum effort on describing
       | what they did and how it can possibly work, using unexplained
       | jargon that more or less masks the underlying message. I can
       | almost see through the veil they've surrounded their
       | (remarkable and quite simple?) ideas with, but not quite.
        
         | dragontamer wrote:
         | https://slideslive.com/39006680/adversarial-policies-beat-su...
         | 
         | Seems to be a good intro.
         | 
         | Go uniquely has long periods of dead-man walking, as I like to
         | call it. Your group might be dead on turn 30, but your opponent
         | won't formally kill the group until turn 150 or later.
         | 
         | If your opponent knows the truth all the way back at turn 30,
         | while you are led down the wrong path for all those turns, you
         | will almost certainly lose.
         | 
         | This adversarial AI tricks AlphaGo/KataGo into such
         | situations. And instead of capitalizing on it, it focuses on
         | the trickery, knowing that KataGo reliably fails to understand
         | the situation (i.e., it's better to make a suboptimal play
         | that keeps KataGo tricked/glitched than an optimal move that
         | may reveal to KataGo its failure of understanding).
         | 
         | Even with adversarial training (IE: KataGo training on this
         | flaw), the flaw remains and it's not clear why.
         | 
         | ------
         | 
         | It appears that this glitch (the cyclical group) is easy
         | enough for an amateur player to understand. (I'm ranked around
         | 10 kyu, which is estimated to be roughly the level of effort
         | of 1500 Elo in chess. Reasonably practiced but nothing
         | special.)
         | 
         | So it seems like I as a human (even at 10 kyu) could defeat
         | AlphaGo/KataGo with a bit of practice.
        
           | hinkley wrote:
           | Aji is the concept of essentially making lemonade from
           | lemons: using the existence of the dead stones to put
           | pressure on the surrounding stones and claw back some of
           | your losses.
           | 
           | Because they haven't been captured yet, they reduce the
           | safety (liberties) of nearby stones. And until those are
           | fully settled, an incorrect move could rescue them, and the
           | effort put into preventing that may cost points in the
           | defense.
        
           | billforsternz wrote:
           | Thank you. So the attack somehow sets up a situation where
           | AlphaGo/KataGo is the dead man walking? It doesn't realise at
           | move 30 it has a group that is dead, and continues not to
           | realise that until (close to the time that?) the group is
           | formally surrounded at move 150?
           | 
           | I still don't really understand, because this makes it sound
           | as if AlphaGo/KataGo is just not very good at Go!
        
             | dragontamer wrote:
             | To be clear, this is an adversarial neural network that
             | automatically looks for these positions.
             | 
             | So we aren't talking about 'one' dead-man-walking
             | position, but multiple ones that this research group
             | searches for, categorizes, and studies to see if AlphaGo /
             | KataGo can learn to defend against them with more
             | training.
             | 
             | I'd argue that Go is specifically a game where the
             | absurdly long turn counts and long-term thinking allow
             | these situations to come up in the first place. It's why
             | the game has always fascinated players.
             | 
             | -------
             | 
             | Or in other words: if you know that a superhuman AI has a
             | flaw in its endgame calculation, then play in a deeply
             | 'dead man walking' manner, tricking the AI into thinking
             | it's winning when in truth it's losing for hundreds of
             | moves.
             | 
             | MCTS is strong because it plays out reasonable games and
             | foresees and estimates endgame positions. If the neural
             | net's oracle is just plain wrong in some positions, that
             | leads to incredible vulnerabilities.
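
The failure mode described here - search inheriting its value oracle's blind spot - can be shown in miniature (toy values, not KataGo's actual networks or search):

```python
# Ground-truth win chances for two candidate moves (known only to us)
true_value = {"move_a": 0.9, "move_b": 0.2}
# The learned "oracle" has a systematic blind spot on move_a
oracle = {"move_a": 0.3, "move_b": 0.8}

def pick_move(values: dict) -> str:
    """Choose the move the value function rates highest."""
    return max(values, key=values.get)

assert pick_move(true_value) == "move_a"  # the correct choice
assert pick_move(oracle) == "move_b"      # the oracle-guided blunder
```

An adversary that can steer play into positions where the oracle is wrong gets this blunder reliably, which is the paper's core idea in caricature.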
        
               | billforsternz wrote:
               | I think I'm starting to see after reading these replies
               | and some of the linked material. Basically the things
               | that confused me most about the rules of go when I first
               | looked at it are playing a role in creating the attack
               | surface: How do we decide to stop the game? How do we
               | judge whether this (not completely surrounded) stone is
               | dead? Why don't we play it out? Etc.
        
               | zahlman wrote:
               | Most rulesets allow you to "play it out" without losing
               | points. Humans don't do it because it's boring and
               | potentially insulting or obnoxious.
               | 
               | Judging whether something "is dead" emerges from a
               | combination of basic principles and skill at the game.
               | Formally, we can distinguish the concepts of
               | unconditionally alive or "pass-alive" (cannot be
               | captured by any legal sequence of moves) and
               | unconditionally dead (cannot be _made unconditionally
               | alive_ by any sequence of moves), in the sense of
               | Benson's algorithm
               | (https://en.wikipedia.org/wiki/Benson%27s_algorithm_(Go),
               | not the only algorithm with that name, apparently). But
               | players are more generally concerned with "cannot be
               | captured in alternating play" (i.e., if the opponent
               | starts first, it's always possible to reach a pass-alive
               | state; ideally the player has read out how to do so) and
               | "cannot be defended in alternating play" (i.e., not in
               | the previous state, and cannot be made so with any
               | single move).
               | 
               | Most commonly, an "alive" string of stones either already
               | has two separate "eyes" or can be shown to reach such a
               | configuration inevitably. (Eyes are surrounded points
               | such that neither is a legal move for the opponent;
               | supposing that playing on either fails to capture the
               | string or any other string - then it is impossible to
               | capture the string, because stones are played one at a
               | time, and capturing the string would require covering
               | both spaces at once.) In rarer cases, a "seki" (English
               | transliteration of a Japanese term) arises, where both
               | players' strings are kept alive by each other's
               | weakness: any attempt by either player to capture
               | results in losing a capturing race (because the empty
               | spaces next to the strings are shared, such that
               | covering the opponent's "liberty" also takes one from
               | your own string).
        
           | zahlman wrote:
           | This is not a reasonable summary. The adversarial AI is not
           | finding some weird position that relies on KataGo not
           | understanding the status. It's relying, supposedly, on KataGo
           | not understanding the _ruleset_, which uses area scoring
           | and _doesn't include removing dead stones_ (because in area
           | scoring you _can_ always play it out without losing points,
           | so this is a simple way to avoid disputes between
           | computers, which don't get bored of it).
           | 
           | I assume that KataGo still has this "flaw" after adversarial
           | training simply because it doesn't overcome the training it
           | has in environments where taking dead stones off the board
           | (or denying them space to make two eyes if you passed every
           | move) isn't expected.
           | 
           | See https://boardgames.stackexchange.com/questions/58127
           | which includes an image of a position the adversarial AI
           | supposedly "won", which even at your level should appear
           | _utterly laughable_. (Sorry, I don't mean to condescend - I
           | am only somewhere around 1 dan myself.)
           | 
           | (Elo is sometimes used in Go ranking, but I don't think it
           | can fairly be compared to chess ratings nor used as a
           | metric for "level of effort".)
        
             | dragontamer wrote:
             | There are multiple examples from this research group.
             | 
             | I believe my discussion above is a reasonable survey of the
             | cyclic attack linked to at the beginning of the website.
             | 
             | https://goattack.far.ai/game-analysis#contents
        
               | roenxi wrote:
               | What we need are more sides to the argument. I'm pretty
               | sure you're both off.
               | 
               | zahlman doesn't seem to have read the part of the paper
               | dealing with cyclic adversaries, but the cyclic adversary
               | strategy doesn't depend on KataGo mis-classifying alive
               | or dead groups over long time horizons. If you watch the
               | example games play out, KataGo kills the stones
               | successfully and is trivially winning for most of the
                | game. It makes a short-term and devastating mistake
                | where it doesn't seem to understand that it has a
                | shortage of liberties, and lets the adversary kill a
                | huge group in a stupid way.
               | 
               | The mistake KataGo makes doesn't have anything to do with
               | long move horizons, on a long time horizon it still plays
               | excellently. The short horizon is where it mucks up.
        
               | zahlman wrote:
               | I don't suppose you could directly link to a position? It
               | would be interesting to see KataGo make a blunder of the
               | sort you describe, because traditional Go engines were
               | able to avoid them many years ago.
        
         | fipar wrote:
         | Some amount of jargon is needed (in general, not just for this)
         | to optimize communication among experts, but still, your
         | comment reminded me of Pirsig's concept (IIRC introduced in his
          | second book, "Lila") of the "cultural immune system", as he
          | brought jargon up in that context too.
         | 
          | I guess, unsurprisingly, jargon is like almost anything else:
          | there's a utility curve with an inflection point past which
          | more jargon actually conveys less - at least if the goal is
          | to convey information as clearly as possible. (For other
          | goals, I guess the utility function may be exponential ...)
        
       | kizer wrote:
       | You'd think the ability to set up elaborate tricks would imply
       | similar knowledge of the game. And also that highly skilled AI
       | would implicitly include adversarial strategies. Interesting
       | result.
        
         | dragontamer wrote:
          | The existence of KataGo and its super-AlphaGo / AlphaZero
          | strength is because Go players noticed that AlphaGo can't see
          | ladders - a simple formation that even mild amateurs must
          | learn to reach the lowest ranks.
         | 
          | KataGo addresses the flaw with an explicit ladder solver
          | written in traditional code. It seems like neural networks
          | will never figure out ladders (!!!!!). And it's not clear why
          | such a simple pattern is impossible for deep neural nets to
          | figure out.
         | 
         | I'm not surprised that there are other, deeper patterns that
         | all of these AIs have missed.
        
           | bwfan123 wrote:
           | >It seems like neural networks will never figure out ladders
           | (!!!!!). And it's not clear why such a simple pattern is
           | impossible for deep neural nets to figure out.
           | 
            | This is very interesting (I don't play Go). Can you
            | elaborate: what is the characteristic of these formations
            | that eludes AIs? Is it that they don't appear in the self-
            | play training or game databases?
        
             | dragontamer wrote:
             | AlphaGo was trained on many human positions, all of which
             | contain numerous ladders.
             | 
              | I don't think anyone knows for sure, but ladders are very
              | calculation-heavy. Unlike most positions, where Go is
              | played by so-called instinct, a ladder switches modes into
              | "If I do X, opponent does Y, so I do Z ...", almost chess-
              | like.
             | 
             | Except it's very easy because there are only 3 or 4 options
             | per step and really only one of those options continues the
             | ladder. So it's this position where a chess-like tree
             | breaks out in the game of Go but far simpler.
             | 
             | You still need to play Go (determining the strength of the
             | overall board and evaluate if the ladder is worth it or if
             | ladder breaker moves are possible/reasonable). But for
             | strictly the ladder it's a simple and somewhat tedious
             | calculation lasting about 20 or so turns on the average.
             | 
             | --------
             | 
              | The thing about ladders is that no one actually plays out
              | a ladder. They just sit there on the board, because it's
              | rare for a ladder to favor both players (ladders are
              | sharp: they favor either White or Black by a significant
              | margin).
             | 
              | So if, say, Black is losing the ladder, Black will NEVER
              | play the ladder out, but needs to remember that the
              | ladder is there for the rest of the game.
             | 
              | A ladder breaker is when Black places a stone that maybe
              | 15 turns later (or more) will win the ladder (often while
              | accomplishing something else). So after a ladder breaker,
              | Black is winning the ladder and White should never play
              | the ladder out.
             | 
             | So the threat of the ladder breaker changes the game and
             | position severely in ways that can only be seen in the far
             | far future, dozens or even a hundred turns from now. It's
              | outside the realm of computer calculation, yet feasible
              | for humans to understand the implications.
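              | 
              | A toy sketch of that one-way calculation (purely
              | illustrative - a hypothetical model, not KataGo's actual
              | solver): the fleeing stone zig-zags diagonally toward the
              | edge and is captured unless it reaches a friendly "ladder
              | breaker" stone first.

```python
def ladder_works(size, start, breakers):
    """Toy ladder read: does the chasing side capture?

    The fleeing stone is driven diagonally from `start` toward the
    board edge; if it meets a friendly breaker stone on the way, it
    connects and the ladder fails for the chaser. All names and
    coordinates here are hypothetical simplifications.
    """
    x, y = start
    while 0 < x < size - 1 and 0 < y < size - 1:
        x, y = x + 1, y + 1          # one diagonal step of the chase
        if (x, y) in breakers:
            return False             # breaker reached: ladder broken
    return True                      # driven into the edge: captured


print(ladder_works(19, (3, 3), set()))       # True
print(ladder_works(19, (3, 3), {(10, 10)}))  # False
```

Note how a single distant stone flips the result of the whole
20-odd-move sequence, which is the "ladder breaker" effect described
above.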
        
             | tasuki wrote:
             | I'd argue it's clear why it's hard for a neural net to
             | figure out.
             | 
             | A ladder is a kind of a mechanical one-way sequence which
             | is quite long to read out. This is easy for humans (it's a
             | one-way street!) but hard for AI (the MCTS prefers to
              | search wide rather than deep). It is easy to tell the
              | neural net as one of its inputs, e.g. "this ladder works"
              | or "this ladder doesn't work" - in fact that's exactly
              | what KataGo does.
             | 
             | See the pictures for more details about ladders:
             | https://senseis.xmp.net/?Ladder
        
               | dragontamer wrote:
                | Doesn't MCTS search both deeply AND broadly, though?
               | 
               | Traditional MCTS searches all the way to endgame and
               | estimates how the current position leads to either win or
               | loss. I'm not sure what the latest and greatest is but
               | those % chance to win numbers are literally a search
               | result over possible endgames IIRC.
               | 
               | I guess I'd assume that MCTS should see ladders and play
               | at least some of them out.
        
               | tasuki wrote:
               | The short ones, sure. The long ones, it's hard for pure
               | MCTS to... keep the ladder straight?
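                | 
                | A back-of-the-envelope sketch of the wide-vs-deep
                | point (illustrative numbers only - real MCTS playouts
                | are policy-guided, not uniform): the chance that one
                | uniformly random rollout follows a specific 20-move
                | forced line, with about three plausible moves per
                | step, is vanishingly small.

```python
# Probability that a uniform random playout follows one specific
# forced 20-move ladder line, assuming ~3 plausible candidate
# moves at each step. (Hypothetical numbers for illustration.)
p = (1 / 3) ** 20
print(f"{p:.2e}")  # prints 2.87e-10
```

So unless the search is explicitly steered down the ladder, the
forced line is effectively invisible to random sampling.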
        
           | erikerikson wrote:
            | Some of our neural networks learned ladders. You forgot the
            | "a" standing for artificial. Even so amended, "never"? Good
            | luck betting on that belief.
        
           | scotty79 wrote:
           | > And it's not clear why such a simple pattern is impossible
           | for deep neural nets to figure out.
           | 
           | Maybe solving ladders is iterative? Once they make chain-of-
           | thought version of AlphaZero it might figure them out.
        
             | dwaltrip wrote:
             | It's very iterative and mechanical. I would often struggle
             | with ladders in blitz games because they require you to
             | project a diagonal line across a large board with extreme
             | precision. Misjudging by half a square could be fatal. And
             | you also must reassess the ladder whenever a stone is
             | placed near that invisible diagonal line.
             | 
             | That's a great idea. I think some sort of CoT would
             | definitely help.
        
               | dragontamer wrote:
               | These are Go AIs.
               | 
               | The MCTS search is itself a chain-of-thought.
               | 
               | Or in the case of KataGo, a dedicated Ladder-solver that
               | serves as the input to the neural network is more than
               | sufficient. IIRC all ladders of liberties 4 or less are
               | solved by the dedicated KataGo solver.
               | 
                | It's not yet clear, IMO, why these adversarial
                | examples pop up. It's not an issue of search depth or
                | breadth either; it seems like an instinct thing.
        
       | kevinwang wrote:
       | [2022]
        
       | Upvoter33 wrote:
       | "Our results demonstrate that even superhuman AI systems may
       | harbor surprising failure modes." This is true but really is an
       | empty conclusion. The result has no meaning for future
       | "superintelligences"; they may or may not have these kinds of
       | "failure modes".
        
         | Kapura wrote:
         | On the contrary, this is the most important part of the thesis.
         | They are arguing not only that this AI was vulnerable to this
         | specific attack, but that any AI model is vulnerable to attack
         | vectors that the original builders cannot predict or
          | preemptively guard against. If you say "well, a
          | superintelligence won't be vulnerable", you are putting your
          | faith in magic.
        
         | dragontamer wrote:
         | They developed a system / algorithm that reliably defeats the
         | most powerful Go AI, and is a simple enough system for a
         | trained human to execute.
         | 
         | Surely that's important? It was thought that AlphaGo and KataGo
         | were undefeatable by humans.
        
         | aithrowawaycomm wrote:
          | It's more a lesson about the dangers of transferring an
          | objectively true statement:
          | 
          |     "MuZero can beat any professional Go player"
          | 
          | to a squishy but not false statement:
          | 
          |     "MuZero is an AI which is superhuman at Go"
          | 
          | to a statement which is basically wrong:
          | 
          |     "MuZero has superhuman intelligence in the domain of Go."
        
       | vouaobrasil wrote:
        | Not so encouraging. This paper will just be used to incorporate
        | defenses against adversarial strategies into Go-playing AIs. A
        | simple curiosity, but one reflective of the greater state of
        | affairs in AI development, which is rather dismal.
        
         | brazzy wrote:
         | According to the abstract, "The core vulnerability uncovered by
         | our attack persists even in KataGo agents adversarially trained
         | to defend against our attack."
        
           | vouaobrasil wrote:
           | Well, that does not apply to future Go AIs for all of time.
        
             | kadoban wrote:
             | Okay, how does one protect against it then? Why would this
             | _not_ apply to any future ones?
        
       | 383toast wrote:
        | NOTE: this is a July 2023 paper; the defense paper from
        | September 2024 is https://arxiv.org/abs/2406.12843
        
         | 8n4vidtmkvmk wrote:
         | > We find that though some of these defenses protect against
         | previously discovered attacks, none withstand freshly trained
         | adversaries.
        
       | casey2 wrote:
       | Reminds me of how even after deep blue chess players learned
       | better anti computer strategies. Because the space of Go is so
       | much larger there are likely many more anti computer strategies
       | like this. It exploits the eval function in the same way
       | 
       | Like chess more compute will win out, as has already been shown.
       | I will remind everyone that elo is a measure of wins and losses
       | not difficulty, conflating the two will lead to poor reasoning.
        
       | nilslindemann wrote:
        | Here are some edge cases for chess: fortresses. The first three
        | are "0.0"; in the fourth, Black wins.
       | 
        | 8/8/8/1Pk5/2Pn3p/5BbP/6P1/5K1R w - - 0 1 (White cannot free
        | the rook)
       | 
        | 1B4r1/1p6/pPp5/P1Pp1k2/3Pp3/4Pp1p/5P1P/5K2 b - - 0 1 (the rook
        | cannot enter White's position)
       | 
        | kqb5/1p6/1Pp5/p1Pp4/P2Pp1p1/K3PpPp/5P1B/R7 b - - 0 1 (Rook to
        | h1, King to g1; the queen cannot enter via a6)
       | 
        | 2nnkn2/2nnnn2/2nnnn2/8/8/8/3QQQ2/3QKQ2 w - - 0 1 (the knights
        | advance as a block, so that attacked knights are protected
        | twice)
       | 
        | In the first, both Stockfish and Lc0 think White is better
        | (only slightly at deep search). In the second and third they
        | think Black wins. Lc0 understands the fourth (applause);
        | Stockfish does not.
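        | 
        | For anyone who wants to poke at these positions
        | programmatically, here is a minimal stdlib-only FEN sanity
        | check (just a sketch; a real engine binding such as
        | python-chess does far more):

```python
def parse_fen(fen):
    """Split a FEN string into its board ranks and side to move.

    A hypothetical helper for illustration; it validates only the
    board-shape invariant, not chess legality.
    """
    placement, side_to_move = fen.split()[:2]
    ranks = placement.split("/")
    # every rank must describe exactly 8 squares (digits = empty runs)
    for rank in ranks:
        assert sum(int(c) if c.isdigit() else 1 for c in rank) == 8
    return ranks, side_to_move


# the first fortress position from the list above
ranks, side = parse_fen("8/8/8/1Pk5/2Pn3p/5BbP/6P1/5K1R w - - 0 1")
print(len(ranks), side)  # prints: 8 w
```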
        
       | zahlman wrote:
       | Oh, no, not _this_ paper again.
       | 
       | Please see https://boardgames.stackexchange.com/questions/58127/
       | for reference. The first picture there shows a game supposedly
       | "won by Black", due to a refusal to acknowledge that Black's
       | stones are hopelessly dead everywhere except the top-right of the
       | board. The "exploit" that the adversarial AI has found is, in
       | effect, to convince KataGo to pass in this position, and then
       | claim that White has no territory. It doesn't do this by claiming
       | it could possibly make life with alternating play; it does so, in
        | effect, by _citing a ruleset that doesn't include the idea of
        | removing dead stones_ (https://tromp.github.io/go.html) and
        | expects everything to be played out (using area scoring) for as
        | long as either player isn't satisfied.
       | 
       | Tromp comments: "As a practical shortcut, the following amendment
       | allows dead stone removal" - but this isn't part of the
       | formalization, and anyway the adversarial AI could just not
       | agree, and it's up to KataGo to make pointless moves until it
       | does. To my understanding, the formalization exists in large part
        | because early Go programs often _couldn't_ reliably tell when
       | the position was fully settled (just like beginner players). It's
       | also relevant on a theoretical level for some algorithms - which
       | would like to know with certainty what the score is in any given
       | position, but would theoretically have to already play Go
       | perfectly in order to compute that.
       | 
       | (If you're interested in why so many rulesets exist, what kinds
       | of strange situations would make the differences matter, etc.,
       | definitely check out the work of Robert Jasiek, a relatively
       | strong amateur European player:
       | https://home.snafu.de/jasiek/rules.html . Much of this was
       | disregarded by the Go community at the time, because it's
        | _incredibly_ pedantic; but that's exactly what's necessary when
       | it comes to rules disputes and computers.)
       | 
        | One of the authors of the paper posted on the Stack Exchange
        | question and argued:
       | 
       | > Now this does all feel rather contrived from a human
       | perspective. But remember, KataGo was trained with this rule set,
       | and configured to play with it. It doesn't know that the "human"
       | rules of Go are any more important than Tromp-Taylor.
       | 
       | But I don't see anything to substantiate that claim. All sorts of
       | Go bots are happy to play against humans in online
       | implementations of the game, under a variety of human-oriented
       | rulesets; and they pass in natural circumstances, and then the
       | online implementation (sometimes using a different AI) proposes
       | group status that is almost always correct (and matches the group
       | status that the human player modeled in order to play that way).
       | As far as I know, if a human player deliberately tries to claim
       | the status is wrong, an AI will either hold its ground or request
       | to resume play and demonstrate the status more clearly. In the
       | position shown at the Stack Exchange link, even in territory
       | scoring without pass stones, White could afford dozens of plays
       | inside the territory (unfairly costing 1 point each) in order to
       | make the White stones all pass-alive and deny any mathematical
       | possibility of the Black stones reaching that status. (Sorry,
       | there really isn't a way to explain that last sentence better
       | without multiple pages of the background theory I linked and/or
       | alluded to above.)
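        | 
        | The Tromp-Taylor area-scoring rule at issue is short enough to
        | sketch: every stone scores for its colour, and an empty region
        | scores for a colour iff it reaches only stones of that colour;
        | dead stones are never removed. A toy implementation on a tiny
        | grid (an illustrative sketch, not KataGo's code):

```python
def area_score(board):
    """Toy Tromp-Taylor area count on a small grid.

    board: list of equal-length strings; '.' empty, 'b'/'w' stones.
    A stone scores for its colour; an empty region scores for a
    colour iff it reaches only stones of that colour. Dead stones
    are NOT removed - the point made above.
    """
    h, w = len(board), len(board[0])
    seen = set()
    score = {"b": 0, "w": 0}
    for y in range(h):
        for x in range(w):
            c = board[y][x]
            if c in score:
                score[c] += 1
            elif (x, y) not in seen:
                # flood-fill the empty region; note colours it touches
                region, touches, stack = 0, set(), [(x, y)]
                seen.add((x, y))
                while stack:
                    cx, cy = stack.pop()
                    region += 1
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                                   (cx, cy + 1), (cx, cy - 1)):
                        if 0 <= nx < w and 0 <= ny < h:
                            n = board[ny][nx]
                            if n == "." and (nx, ny) not in seen:
                                seen.add((nx, ny))
                                stack.append((nx, ny))
                            elif n in score:
                                touches.add(n)
                if len(touches) == 1:
                    score[touches.pop()] += region
    return score


print(area_score(["bb.", "...", "..."]))  # {'b': 9, 'w': 0}
print(area_score(["b.b", "...", "w.w"]))  # {'b': 2, 'w': 2}
```

The second example shows the neutral case: the empty region touches
both colours, so neither side scores it - which is why leaving
"hopelessly dead" stones on the board changes the count under this
ruleset.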
        
         | benchaney wrote:
          | There are two strategies described in this paper: the cyclic
          | adversary and the pass adversary. You are correct that the
          | pass adversary is super dumb. It is essentially exploiting a
          | loophole in a version of the rules that KataGo doesn't
          | actually support. This is such a silly attack that IMO the
          | paper would be a lot more compelling if they had just left it
          | out.
          | 
          | That said, the cyclic adversary is a legitimate weakness in
          | KataGo, and I found it quite impressive.
        
       ___________________________________________________________________
       (page generated 2024-12-24 23:00 UTC)