[HN Gopher] Adversarial policies beat superhuman Go AIs (2023)
___________________________________________________________________
Adversarial policies beat superhuman Go AIs (2023)
Author : amichail
Score : 299 points
Date : 2024-12-23 13:10 UTC (2 days ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| BryanLegend wrote:
| In the Go world, complete amateurs can sometimes be challenging
| to play against, because their moves are so unpredictable and
| their shapes are so far from normal. Wildly bizarre play
| sometimes works.
| PartiallyTyped wrote:
| Magnus (Carlsen, chess) does this often, he pushes people into
| unknown territory that they are most certainly underprepared
| for through new or obscure openings that complicate a position
| very quickly. The game then turns tactical and they eventually
| find themselves in a bad endgame, one against Magnus of all
| people.
| ozim wrote:
| Just in case someone thinks Magnus comes up with those
| openings on the spot:
|
| No, he has a team that uses computers to find those lines
| based on what the other player has played, since all past
| matches are available.
|
| Source: I watched an interview with a guy who was hired for
| a computer-science consulting gig by Magnus's team.
|
| It does not take away from how good he is, as I don't think
| many people could learn to remember weird openings and win
| from that against grandmaster-level players anyway.
| rybosworld wrote:
| I remember reading that his memory is unrivaled - so this
| also isn't a strategy the other top players could simply
| copy.
|
| In chess, there are basically three ways to evaluate moves
|
| 1) pure calculation
|
| 2) recognize the position (or a very similar one) from a
| previous game, and remember what the best move was
|
| 3) intuition - this one is harder to explain but, I think
| of it like instinct/muscle memory
|
| All the top players are good at all of these things. But
| some are agreed upon as much better than others. Magnus is
| widely agreed to have the best memory. The contender for
| best calculator might be Fabiano.
|
| In humans, all else being equal, memory seems to be
| superior to calculation, because calculation takes time.
|
| Chess engines seem to reverse this, with calculation being
| better than memory, because memory is expensive.
| zdragnar wrote:
| This is the reason why I couldn't ever get into chess,
| despite my dad and brother enjoying it. My intuition was
| crap (having not developed it) and I lacked the ability
| or desire to fully visualize multiple steps of the game.
|
| All that remained was rote memorization, which makes for
| a boring game indeed.
|
| Despite all of that, I suspect chess will long outlive my
| preferred entertainment of Unreal Tournament.
| Retric wrote:
| The magic of chess is in matchmaking.
|
| I enjoy using nearly pure intuition when playing, so I
| just use that strategy and see the same ~50/50 win
| percentage as most players, because my ELO is based on my
| past games and there are millions of online players
| across a huge range of skill levels.
|
| There's nothing wrong with staying at 1000 or even 300 if
| that's what it takes to enjoy the game. It's only if you
| want to beat specific people or raise your ELO that
| forces you to try and optimize play.
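The ~50/50 steady state described here falls directly out of the Elo update rule; a minimal sketch of the standard formulas (the K-factor of 32 is a common default, not specific to any site):

```python
def expected_score(r_a, r_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a, r_b, score_a, k=32):
    """A's new rating after a game (score_a: 1 win, 0.5 draw, 0 loss)."""
    return r_a + k * (score_a - expected_score(r_a, r_b))

# Equal ratings -> 50% expected score, so wins and losses balance out
# and your rating settles where you win about half your games.
print(expected_score(1000, 1000))              # 0.5
# A 400-point gap -> the stronger player is expected to score ~91%.
print(round(expected_score(1400, 1000), 2))    # 0.91
# Winning as expected barely moves you; upsets move you a lot.
print(update(1000, 1000, 1.0))                 # 1016.0
```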
| stackghost wrote:
| I hate ladder systems. Winning is fun and losing is not.
| Why would I purposely choose to play a game/system where
| your win rate does not meaningfully improve as you skill
| up?
|
| That sounds frustrating and tedious. If I get better I
| want to win more often.
| linguistbreaker wrote:
| But winning is only fun because you do not always win and
| almost proportionally so... If you get better you get to
| play better games against better opponents.
|
| The win or loss is ancillary to the experience for me.
| stackghost wrote:
| >The win or loss is ancillary to the experience for me.
|
| Maybe because I primarily play sports and not chess but
| this attitude is completely foreign and mystifying to me.
|
| Don't you feel bad when you lose? Why would you purposely
| engage in an ELO system that results in you feeling bad
| after 50% of games, and never gives you a sense of
| progress?
|
| Isn't that profoundly discouraging?
|
| Do you think Tiger Woods or Leo Messi wish they won fewer
| matches? Like I just can't get myself into a headspace
| where you're out for competition but are satisfied with a
| 50% win rate.
| lupire wrote:
| The ELO system does give you a sense of progress.
| Continuing to beat up weak players does not give you
| progress. It makes you the one-eyed king in the land of
| the blind.
|
| Do you think professional athletes like Woods and Messi
| are stupid because they could be playing in Farm League
| and winning every time against scrubs?
| stackghost wrote:
| >The ELO system does give you a sense of progress.
|
| By definition it does not, unless your definition of
| progress is "number go up".
|
| >Do you think professional athletes are stupid because
| they could be playing in Little League and winning every
| time against kids?
|
| So let me get this straight: are you seriously suggesting
| that you don't understand the difference between e.g. the
| format of the NHL or the FIFA world cup, and playing
| against literal children to pad one's win rate?
|
| Because I think you're probably not arguing in good faith
| with that last comment. Time for me to duck out of this
| conversation.
| Retric wrote:
| Progress is in the quality of the games not just an
| irrelevant number.
|
| If you have a major skill gap, games become boring. Try
| playing Martin bot for 3 hours.
| refulgentis wrote:
| I honestly don't understand your point and do understand
| his, and definitely don't understand why you took it so
| aggressively.
|
| All he's saying is it's boring to win all the time.
| hn3er1q wrote:
| It feels bad to lose but you also need the wins to feel
| good. Beating a low-ELO player is about as fun as beating
| small kids at basketball or something. For me it's not
| the win/loss that drives me but making fewer mistakes. If
| I lose a game where my opponent punished a minor
| mistake, fair enough; that took skill, I'll learn from
| it, and I don't feel bad. But if I lose because I made a
| blunder (an obvious tactical error), that sucks and I
| hate that.
| refulgentis wrote:
| Because that's not a Nash equilibrium: for every extra
| bit of fun you have, someone else has that much less fun,
| and thus has an incentive to switch their strategy (play
| on another site).
| omegaham wrote:
| You can always play in tournaments to figure out where
| you rank compared to a larger population!
| stackghost wrote:
| Indeed I much prefer a tournament format.
| lupire wrote:
| You would probably prefer the game Shooting Fish in a
| Barrel over the game Chess.
|
| Winning half the time is better because each of those
| wins means far, far more than a win against bad players.
|
| Playing down is only fun for insecure, unambitious
| people. If winning is the fun part, just cheat; don't
| seek out bad players to play against. Playing against bad
| players makes you bad at chess.
| stackghost wrote:
| Edit: never mind you're the same guy constructing
| strawman arguments in the other thread
| wslh wrote:
| I haven't read this thread that way: if you want to
| improve your skills, that is great, and it is your
| choice. But you should know, realistically speaking, that
| at a certain level you cannot improve any further in your
| lifetime, unless you are part of the elite.
| spiritplumber wrote:
| I stopped enjoying chess because a game in which you
| always lose is no fun; the only winning move is not to
| play.
| ANewFormation wrote:
| His memory is definitely rivaled. During the recent speed
| chess championship broadcast they had Magnus, Hikaru,
| Alireza, and some other top players play little games
| testing memory, response rate, and so on.
|
| The memory game involved memorizing highlighted circles on
| a grid, so it was even something ostensibly chess-adjacent.
| Magnus did not do particularly well. Even when playing a
| blindfold simul against 'just' 5 people (the record is 48)
| he lost track of the positions (slightly) multiple times
| and eventually lost 2 of the games on time.
|
| But where Magnus is completely unrivaled is in intuition.
| His intuition just leads him in a better direction faster
| than other top players. This is both what makes him so
| unstoppable in faster time controls, and also so
| dangerous in obscure openings where he may have
| objectively 'meh' positions, but ones where the better
| player will still win, and that better player is just
| about always him.
| lupire wrote:
| Short term memory is extremely different from lifelong
| memory.
| ANewFormation wrote:
| For sure, but 'memory' as people think of it plays a
| fairly small role in chess - mostly relegated to opening
| preparation which is quite short term - watch any player,
| including Magnus, stream and they all constantly forget
| or mix up opening theory in various lines. But of course
| if you expect to play a e.g. Marshall Gambit in your next
| game then you'll review those lines shortly before your
| game.
|
| Instead people think players have this enormous cache of
| memorized positions in their minds where they know the
| optimal move, but it's more about lots of ideas and
| patterns, which then show themselves immediately when you
| look at a position.
|
| Watch any world class player solve puzzles and you'll
| find they have often solved it before 'you' (you being
| any person under master level) have even been able to
| figure out where all the pieces are. And it's not like
| they've ever seen the exact position before (at least not
| usually), but they've developed such an extreme intuition
| that the position just instantly reveals itself.
|
| So one could call this some sort of memory as I suspect
| you're doing here with 'lifelong memory', but I think
| intuition is a far more precise term.
| vlovich123 wrote:
| While Magnus has a very strong memory (as do all players
| at that caliber), his intuition is regarded by others and
| by himself as his strongest quality, and he constantly
| talks about how intuitive a player he is compared with
| others.
|
| https://www.youtube.com/watch?v=N-gw6ChKKoo
| nordsieck wrote:
| > It does not take away how good he is
|
| Honestly, your anecdote makes me respect him even more.
|
| Few people go to those lengths to prepare.
| llamaimperative wrote:
| I would presume almost every chess grandmaster does the
| same, no? And in that case there's nothing particularly
| ingenious about it.
|
| It maybe doesn't reduce my image of any individual
| player, but it does reduce the image of the game itself.
| drivers99 wrote:
| I had a brief rabbit hole about chess at the beginning of
| this year and found out a few things pros do to prepare
| against their opponents. I was trying to remember one
| specific periodical, but I found it: Chess Informant. 320
| page paperback (and/or CD! - I see they also have a
| downloadable version for less[2]) quarterly periodical
| full of games since the last one. Looks like they're up
| to volume 161.[1] I suppose pros also get specific games
| they want sooner than that, especially now with
| everything being streamed, but anyway. There's a lot more
| going on in chess that is just as important as the time
| spent actually playing in the tournament.
|
| [1] https://sahovski.com/Chess-Informant-161-Olympic-
| Spirit-p695... [2] https://sahovski.com/Chess-
| Informant-161-DOWNLOAD-VERSION-p6...
| kizer wrote:
| That's very interesting. However it's like any of the
| organizations that support competitors at elite levels in
| all sports. From the doctors, nutritionists, coaches that
| support Olympic athletes to the "high command" of any NFL
| team coordinating over headset with one another and the
| coach, who can even radio the quarterback on the field
| (don't think there is another sport with this).
| bronson wrote:
| Auto racing? Even has telemetry.
| maeil wrote:
| Road cycling as well maybe? Tour de France.
| tejohnso wrote:
| Do you think that this kind of inorganic requirement is
| part of the reason he abandoned World Chess?
| tomtomtom777 wrote:
| No. He did not abandon "World Chess". He is still an
| active player.
|
| He chooses not to participate in the FIDE World
| Championship primarily because he doesn't like the
| format. He prefers a tournament format instead of a long
| 1-on-1 match against the running champion.
| ANewFormation wrote:
| Fwiw this is normal in chess nowadays. There was some brief
| era in chess where everybody was just going down the most
| critical lines and assuming they could outprepare their
| opponents, or outplay them if that didn't work out. Kasparov
| and Fischer are the typical examples of this style.
|
| But computers have made this less practical in modern
| times, simply because it's so easy to lose in these sorts
| of positions to the endless number of comp-prepped
| novelties, which may be objectively mediocre but are nigh
| impossible to play against without preparation when
| facing a prepared opponent.
|
| So a lot of preparation nowadays is about getting
| positions that may not be the most critical test of an
| opening, but that lead to interesting positions where the
| first player to spring a novelty isn't going to just
| steamroll the other guy.
|
| So in this brave new world you see things like the Berlin
| Defense becoming hugely popular while the Najdorf has
| substantially declined in popularity.
| Sesse__ wrote:
| It is true that Magnus usually prefers offbeat lines to get
| out of the opponent's preparation. However, they're rarely
| very sharp or otherwise tactically complicated; on the
| contrary, he excels at slow maneuvering in strategic
| positions (and, as you say, the endgame).
| giraffe_lady wrote:
| Challenging in the sense that you have to work through
| positions you're not very practiced at. Not "challenging" in
| the sense that you might lose the game though.
| tasuki wrote:
| No it does not.
|
| (Source: I'm European 4 dan. I wipe the go board with weaker
| players playing whatever unconventional moves they like.
| Likewise, I get crushed by stronger players, faster than usual
| if I choose unusual moves. This might work on like the double-
| digit kyu level...)
| emusan wrote:
| There is hope for us lowly humans!
| throwaway81523 wrote:
| From 2022, revised 2023, I may have seen it before and forgotten.
| It is pretty interesting. I wonder how well the approach works
| against chess engines, at least Leela-style.
| thiago_fm wrote:
| I bet that there's a similarity between this and what happens to
| LLM hallucinations.
|
| At some point we will realize that AI will never be perfect, it
| will just have much better precision than us.
| kachapopopow wrote:
| I honestly see hallucinations as an absolute win, it's
| attempting to (predict/'reason') information from the training
| data it has.
| voidfunc wrote:
| I don't think I see them as a win, but they're easily dealt
| with. AI will need analysts at the latter stage to evaluate
| the outputs but that will be a relatively short-lived
| problem.
| bubaumba wrote:
| > I don't think I see them as a win
|
| Unavoidable, probably
|
| > but they're easily dealt with. AI will need analysts at
| the latter stage to evaluate the outputs but that will be a
| relatively short-lived problem
|
| That only solves it to some degree. Hallucinations may
| happen at this stage too: either a correct answer gets
| rejected or a false one passes through.
| zdragnar wrote:
| I think this is a misuse of the term hallucination.
|
| When most people talk about AI hallucinating, they're
| referring to output which violates some desired constraints.
|
| In the context of chess, this would be making an invalid
| move, or upgrading a knight to a queen.
|
| In other contexts, some real examples are fabricating court
| cases and legal precedent (several lawyers have gotten in
| trouble here), or a grocery store recipe generator
| recommending mixing bleach and ammonia for a delightful
| cocktail.
|
| None of these hallucinations are an attempt to reason about
| anything. This is why some people oppose using the term
| hallucination- it is an anthropomorphizing term that gives
| too much credit to the AI.
|
| We can tighten the band of errors with more data, compute
| efficiency, or power, but in the search for general AI,
| this is a dead end.
| wat10000 wrote:
| It's weird because there's no real difference between
| "hallucinations" and other output.
|
| LLMs are prediction engines. Given the text so far, what's
| most likely to come next? In that context, there's very
| little difference between citing a real court case and
| citing something that sounds like a real court case.
|
| The weird thing is that they're capable of producing any
| useful output at all.
| bubaumba wrote:
| > it will just have much better precision than us.
|
| and much faster with the right hardware. And that's
| enough if AI can do in seconds what takes humans years.
| With o3, price is the only limit, it looks like.
| JKCalhoun wrote:
| "Not chess, Mr. Spock. Poker!"
| tantalor wrote:
| > You beat him!
|
| >> No sir, it is a stalemate.
|
| > What did you do?
|
| >> I was playing for a standoff; a draw. While Kolrami was
| dedicated to winning, I was able to pass up obvious avenues of
| advancement and settle for a balance. Theoretically, I should be
| able to challenge him indefinitely.
|
| > Then you have beaten him!
|
| >> In the strictest sense, I did not win.
|
| > (groans) Data!
|
| >> I busted him up.
|
| > (everybody claps)
| cwillu wrote:
| You... have made a _mockery_ of me.
| moffkalast wrote:
| It is possible to commit no mistakes and still lose.
|
| That's not a weakness.
|
| That's life.
| cwillu wrote:
| But, knowing that he knows that we know that he knows, he
| might choose to return to his usual pattern.
| taneq wrote:
| No idea if they did this on purpose but this is exactly what
| can happen with board game AIs when they know they will win.
| Unless the evaluation function explicitly promotes winning
| _sooner_ they will get into an unbeatable position and then
| just fardle around because they have no reason to win _now_ if
| they know they can do it later.
| cjbgkagh wrote:
| Future payoffs are almost always discounted, if for no
| other reason than that the future carries more
| uncertainty. I.e., even if discounting were not explicit
| (which it almost always is), it would still be implicit.
|
| Their conservative style is usually due to having a better
| fitness function. Humans tend to not be able to model
| uncertainty as accurately and this results in more aggressive
| play, a bird in the hand is worth two in the bush.
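A toy comparison of how discounting builds in that push to finish games (the discount factor of 0.99 is an illustrative assumption, not from any particular engine):

```python
def discounted_value(reward, steps, gamma=0.99):
    """Value today of a reward received `steps` moves in the future."""
    return reward * gamma ** steps

# A guaranteed win now is worth more than the same win 50 moves later,
# so a discounting agent prefers the shorter path even when both win.
win_now = discounted_value(1.0, 0)
win_later = discounted_value(1.0, 50)
print(win_now, round(win_later, 3))  # 1.0 0.605
```

Without the discount (gamma = 1.0) both lines score identically, which is exactly the "fardle around once the win is locked in" behavior described above.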
| taneq wrote:
| Typically yeah, but when you're trying to make it work at
| all it can be easy to forget to add a bit of a gradient
| towards "winning sooner is better". And this happens even
| at the top level, the example I was thinking about as I
| typed that was one of the AlphaGo exhibition games against
| Lee Sedol (the first, maybe?) where it got into a crushing
| position then seemingly messed around.
| cjbgkagh wrote:
| There is zero chance AlphaGo devs forgot about
| discounting. Usually you relax the discount to allow for
| optimal play, most likely the fitness function flailed a
| bit in the long tail.
| kqr wrote:
| Indeed. Humans use "points ahead" as a proxy for "chance of
| win" so we tend to play lines that increase our lead more,
| even when they are a tiny bit riskier. Good software does
| not -- it aims for maximum chance of win, which usually
| means slower, less aggressive moves to turn uncertain
| situations into more well-defined ones.
| nilslindemann wrote:
| Example: https://lichess.org/study/kPWZgp6s/nwqy2Hwg
| gweinberg wrote:
| Doesn't the board get filled up with stones? I can see
| how a go player might think a win is a win so it doesn't
| matter how many stones you win by, but I don't see how
| you would go about delaying winning.
| zahlman wrote:
| >Doesn't the board get filled up with stones?
|
| To some extent, but a player who's way ahead could still
| have a lot of latitude to play pointless moves without
| endangering the win. In the case of Go it's generally not
| so much "delaying winning" as just embarrassing the
| opponent by playing obviously suboptimal moves (that make
| it clearer that some key group is dead, for example).
|
| Although it's possible to start irrelevant, time-wasting ko
| positions - if the opponent accepts the offer to fight over
| them.
| zahlman wrote:
| When I was a child, I didn't understand that episode as Data
| demonstrating his superiority at the game by deliberately
| keeping it evenly-matched, or that the alien opponent somehow
| realized that Data could win at any time and simply chose not
| to.
|
| Rather, I figured Data had come up with some hitherto-unknown
| strategy that allowed for making the game arbitrarily long;
| and that the alien had a choice between deliberately losing,
| accidentally losing (the way the game is depicted, it gets
| more complex the longer you play) or continuing to play
| (where an android wouldn't be limited by biology). (No, I
| didn't phrase my understanding like that, or speak it aloud.)
| dataviz1000 wrote:
| Wasn't this the plot to War Games (1983)?
| ncr100 wrote:
| Q: If the AIs are trained on adversarial policies, will this
| strategy also start to fail in these game-playing scenarios?
|
| EDIT: Discussed later on
| https://news.ycombinator.com/item?id=42503110
| renewiltord wrote:
| > _The core vulnerability uncovered by our attack persists
| even in KataGo agents adversarially trained to defend
| against our attack_
| ncr100 wrote:
| Thanks!
| billforsternz wrote:
| This seems amazing at first sight. It's probably just me, but I
| find the paper to be very hard to understand even though I know a
| little bit about Go and Go AI and a lot about chess and chess AI.
| They seem to expend the absolute minimum amount of effort on
| describing what they did and how it can possibly work,
| unnecessarily using unexplained jargon to more or less mask the
| underlying message. I can almost see through the veil they've
| surrounded their (remarkable and quite simple?) ideas with, but
| not quite.
| dragontamer wrote:
| https://slideslive.com/39006680/adversarial-policies-beat-su...
|
| Seems to be a good intro.
|
| Go uniquely has long periods of dead-man walking, as I like to
| call it. Your group might be dead on turn 30, but your opponent
| won't formally kill the group until turn 150 or later.
|
| If your opponent knows the truth all the way back on turn
| 30, while you are led down the wrong path for those
| hundreds of turns, you will almost certainly lose.
|
| This adversarial AI tricks AlphaGo/KataGo into such situations.
| And instead of capitalizing on it, they focus on the trickery
| knowing that KataGo reliably fails to understand the situation
| (aka it's better to make a suboptimal play to keep KataGo
| tricked / glitched, rather than play an optimal move that may
| reveal to KataGo the failure of understanding).
|
| Even with adversarial training (IE: KataGo training on this
| flaw), the flaw remains and it's not clear why.
|
| ------
|
| It appears that this glitch (the cyclical group) is easy enough
| for an amateur player to understand (I'm ranked around 10kyu,
| which is estimated to be the same level of effort as 1500Elo
| chess. Reasonably practiced but nothing special).
|
| So it seems like I as a human (even at 10kyu) could defeat
| AlphaGo/KataGo with a bit of practice.
| hinkley wrote:
| Aji is the concept of essentially making lemonade from
| lemons by using the existence of the dead stones to put
| pressure on the surrounding pieces and claw back some of
| your losses.
|
| Because they haven't been captured yet they reduce the safety
| (liberties) of nearby stones. And until those are fully
| settled an incorrect move could rescue them, and the effort
| put into preventing that may cost points in the defense.
| billforsternz wrote:
| Thank you. So the attack somehow sets up a situation where
| AlphaGo/KataGo is the dead man walking? It doesn't realise at
| move 30 it has a group that is dead, and continues not to
| realise that until (close to the time that?) the group is
| formally surrounded at move 150?
|
| I still don't really understand, because this makes it sound
| as if AlphaGo/KataGo is just not very good at Go!
| dragontamer wrote:
| To be clear, this is an adversarial neural network that
| automatically looks for these positions.
|
| So we aren't talking about 'one' dead-man-walking
| position, but multiple ones that this research group
| searches for, categorizes, and studies to see if AlphaGo
| / KataGo can learn to defend against them with more
| training.
|
| I'd argue that Go is specifically a game where the
| absurdly long turn counts and long-term thinking allow
| these situations to come up in the first place. It's why
| the game has always fascinated players.
|
| -------
|
| Or in other words: if you know that a superhuman AI has a
| flaw in its endgame calculation, then play in a deeply
| 'dead man walking' manner, tricking the AI into thinking
| it's winning when in truth it's losing for hundreds of
| moves.
|
| MCTS is strong because it plays out reasonable games and
| foresees and estimates endgame positions. If the neural
| net's oracle is just plain wrong in some positions, that
| leads to incredible vulnerabilities.
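A hypothetical one-ply sketch of how a flawed value oracle misleads search (the move names and numbers are invented for illustration; real MCTS is far more involved):

```python
def pick_move(moves, oracle):
    """One-ply search: trust the value oracle, much as MCTS trusts its
    value net to estimate positions it cannot read out to the end."""
    return max(moves, key=oracle)

# Toy values: the "cyclic" continuation is actually losing (true win
# rate 0.1), but the flawed oracle has a blind spot and reads it as won.
true_value = {"solid": 0.6, "cyclic": 0.1}
flawed_value = {"solid": 0.6, "cyclic": 0.9}

print(pick_move(["solid", "cyclic"], true_value.get))    # solid
print(pick_move(["solid", "cyclic"], flawed_value.get))  # cyclic
```

The adversary only needs to steer play toward the region where the oracle's estimates are systematically wrong; the search then confidently walks into the trap.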
| billforsternz wrote:
| I think I'm starting to see after reading these replies
| and some of the linked material. Basically the things
| that confused me most about the rules of go when I first
| looked at it are playing a role in creating the attack
| surface: How do we decide to stop the game? How do we
| judge whether this (not completely surrounded) stone is
| dead? Why don't we play it out? Etc.
| zahlman wrote:
| Most rulesets allow you to "play it out" without losing
| points. Humans don't do it because it's boring and
| potentially insulting or obnoxious.
|
| Judging whether something "is dead" emerges from a
| combination of basic principles and skill at the game.
| Formally, we can distinguish concepts of unconditionally
| alive or "pass-alive" (cannot be captured by any legal
| sequence of moves) and unconditionally dead (cannot be
| _made unconditionally alive_ by any sequence of moves),
| in the sense of Benson's algorithm
| (https://en.wikipedia.org/wiki/Benson%27s_algorithm_(Go),
| not the only one with that name apparently). But
| players are more generally concerned with "cannot be
| captured in alternating play" (i.e., if the opponent
| starts first, it's always possible to reach a pass-alive
| state; ideally the player has read out how to do so) and
| "cannot be defended in alternating play" (i.e., not in
| the previous state, and cannot be made so with any single
| move).
|
| Most commonly, an "alive" string of stones either already
| has two separate "eyes" or can be shown to reach such a
| configuration inevitably. (Eyes are surrounded points
| such that neither is a legal move for the opponent;
| supposing that playing on either fails to capture the
| string or any other string - then it is impossible to
| capture the string, because stones are played one at a
| time, and capturing the string would require covering
| both spaces at once.)
|
| In rarer cases, a "seki" (English transliteration of
| Japanese - also see https://senseis.xmp.net/?Seki)
| arises, where both players' strings are kept alive by
| each other's weakness: any attempt by either player to
| capture results in losing a capturing race (because the
| empty spaces next to the strings are shared, such that
| covering the opponent's "liberty" also takes one from
| your own string). I say "arises", but typically the seki
| position is forced (as the least bad option for the
| opponent) by one player, in a part of the board where the
| opponent has an advantage and living by forming two eyes
| would be impossible.
|
| Even rarer forms of life may be possible depending on the
| ruleset, as well as global situations that prevent one
| from reducing the position to a sum of scores of groups.
| For example, if there is no superko restriction, a
| "triple ko" (https://senseis.xmp.net/?TripleKo) can
| emerge - three separate ko (https://senseis.xmp.net/?Ko)
| positions, such that every move must capture in the
| "next" ko in a cycle or else lose the game immediately.
|
| It gets much more complex than that
| (https://senseis.xmp.net/?GoRulesBestiary), although also
| much rarer. Many positions that challenge rulesets are
| completely implausible in real play and basically require
| cooperation between the players to achieve.
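The liberties underlying all of these life-and-death notions can be counted mechanically with a flood fill; a minimal sketch on a toy 3x3 corner (the dict board encoding is just for illustration, not any engine's representation):

```python
def liberties(board, start):
    """Count the liberties of the string of stones containing `start`.
    `board` maps (row, col) -> 'B', 'W', or '.' for empty."""
    color = board[start]
    stack, string, libs = [start], {start}, set()
    while stack:
        r, c = stack.pop()
        for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if nb not in board:
                continue  # off the edge of the board
            if board[nb] == '.':
                libs.add(nb)          # an adjacent empty point
            elif board[nb] == color and nb not in string:
                string.add(nb)        # same-colored stone: same string
                stack.append(nb)
    return len(libs)

# Top edge of a 3x3 corner: two black stones, one white stone.
#   B B W
#   . . .
#   . . .
board = {(r, c): '.' for r in range(3) for c in range(3)}
board[(0, 0)] = board[(0, 1)] = 'B'
board[(0, 2)] = 'W'
print(liberties(board, (0, 0)))  # 2 -- the points below the black pair
print(liberties(board, (0, 2)))  # 1 -- white is nearly captured
```

A string is captured exactly when its liberty count hits zero, which is why two eyes (two separate uncoverable liberties) guarantee life.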
| billforsternz wrote:
| Sorry this is mostly way over my head, but perhaps you
| can explain something to me that puzzled me when I looked
| at go 50 odd years ago now.
|
| (Please note, I absolutely do understand life requires
| two eyes, and why that is so, but my knowledge doesn't
| extend much further than that).
|
| So hypothetically, if we get to the point where play
| normally stops, why can't I put a stone into my
| opponent's territory? I am reducing his territory by 1
| point. So he will presumably object and take my "dead"
| stone off, first restoring the balance and then
| penalising me one point by putting the newly captured
| stone in my territory. But can't I insist that he
| actually surrounds the stone before he takes it off? That
| would take four turns (I would pass each time) costing
| him 4 points to gain 1. There must be a rule to stop
| this, but is it easily formally expressed? Or is it a)
| Complicated or b) Require some handwaving ?
| dragontamer wrote:
| > So hypothetically, if we get to the point where play
| normally stops, why can't I put a stone into my
| opponent's territory? I am reducing his territory by 1
| point. So he will presumably object and take my "dead"
| stone off, first restoring the balance and then
| penalising me one point by putting the newly captured
| stone in my territory. But can't I insist that he
| actually surrounds the stone before he takes it off? That
| would take four turns (I would pass each time) costing
| him 4 points to gain 1. There must be a rule to stop
| this, but is it easily formally expressed? Or is it a)
| Complicated or b) Require some handwaving ?
|
| There are multiple scoring systems (American, Chinese,
| and Japanese and a couple of others).
|
| * In Chinese scoring, stones do NOT penalize your score.
| So they capture your stone and gain +1 point, and lose 0
| points.
|
| * In American scoring, passing penalizes your score. So
| you place a stone (ultimately -1 point once captured),
| they place 4 stones (-4 points), and you pass 4 times (-4
| more points). This ends with -4 points to the opponent
| but -5 points to you: effectively a +1 point differential
| for them.
|
| * In Japanese scoring, the player will declare your stone
| dead. If you continue to object, the players play it out.
| Once it has been played out, time is rewound and the
| status of the stones is declared to be what both players
| now agree on (i.e.: I need 4 stones to kill your stone;
| if you keep passing I'll kill it).
|
| ---------
|
| So your question is only relevant to Japanese scoring (in
| the other two systems, you fail to gain any points). And
| in Japanese scoring, there is the "time rewind" rule for
| post-game debate. (You play out positions only to
| determine alive vs dead if there's a debate. This is
| rarely invoked because nearly everyone can instinctively
| see alive vs dead).
|
| IE: In Japanese scoring, the game has ended after both
| players have passed. Time "rewinds" to this point, any
| "play" is purely for the determination of alive vs dead
| groups.
|
| In all three cases, playing out such a position is
| considered a dick move and a waste of everyone's time.
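The American-scoring arithmetic above can be sketched as a tiny tally. This is a toy of the comment's own accounting, not a rules implementation; the function name and parameters are invented for illustration:

```python
def american_tally(passes=4, filling_stones=4):
    """Toy AGA-style accounting for the hopeless-invasion scenario
    described above: your dead stone costs you 1 point, each of your
    passes hands the opponent a pass stone (another -1 to you), and
    the opponent's stones spent filling their own territory cost
    them 1 point each under this territory-style count."""
    you = -1 - passes           # dead stone, plus one pass stone per pass
    opponent = -filling_stones  # moves spent inside their own territory
    return you, opponent

you, opponent = american_tally()
# Matches the comment: -5 points to you, -4 to the opponent,
# i.e. a net 1-point swing in the opponent's favor.
```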
| billforsternz wrote:
| Thank you, a longstanding mystery (for me) solved!
| dragontamer wrote:
| Amusingly, the endgame ritual is the same for all styles.
|
| You play every good move. Then you traditionally play the
| neutral moves (+0 points to either player) to make
| counting easier. Then the game ends as both players pass.
|
| In Chinese, American, or Japanese scoring, this process
| works to maximize your endgame score.
| zahlman wrote:
| Thanks - I've had to do basically this exact explanation
| countless times before. It'd be nice if there were an
| _obvious_ place to refer people for this info.
|
| That said, when I teach beginners I teach them Chinese
| scoring and counting. If they understand the principles
| it'll be easy enough to adapt later, and it doesn't
| change strategy in a practical way. They can play it out
| without worry, it makes more sense from an aesthetic
| standpoint ("you're trying to have stones on the board
| and keep them there, or have a place to put them later" -
| then you only have to explain that you still get points
| for the eyes you can't fill).
|
| It's also IMX faster to score on 9x9: you can shift the
| borders around (equally benefiting both players) to make
| some simple shapes and easily see who has the majority of
| the area and you aren't worrying about arranging
| territories into rectangles.
| lupire wrote:
| Chess is Python and Go is, uh, Go?
| zahlman wrote:
| This is not a reasonable summary. The adversarial AI is not
| finding some weird position that relies on KataGo not
| understanding the status. It's relying, supposedly, on KataGo
| not understanding the _ruleset_ which uses area scoring and
| _doesn't include removing dead stones_ (because in area
| scoring you _can_ always play it out without losing points,
| so this is a simple way to avoid disputes between computers,
| which don't get bored of it).
|
| I assume that KataGo still has this "flaw" after adversarial
| training simply because it doesn't overcome the training it
| has in environments where taking dead stones off the board
| (or denying them space to make two eyes if you passed every
| move) isn't expected.
|
| See https://boardgames.stackexchange.com/questions/58127
| which includes an image of a position the adversarial AI
| supposedly "won" which even at your level should appear
| _utterly laughable_. (Sorry, I don't mean to condescend - I
| am only somewhere around 1dan myself.)
|
| (ELO is sometimes used in Go ranking, but I don't think it
| can fairly be compared to chess ranking nor used as a metric
| for "level of effort".)
| dragontamer wrote:
| There are multiple examples from this research group.
|
| I believe my discussion above is a reasonable survey of the
| cyclic attack linked to at the beginning of the website.
|
| https://goattack.far.ai/game-analysis#contents
| roenxi wrote:
| What we need are more sides to the argument. I'm pretty
| sure you're both off.
|
| zahlman doesn't seem to have read the part of the paper
| dealing with cyclic adversaries, but the cyclic adversary
| strategy doesn't depend on KataGo mis-classifying alive
| or dead groups over long time horizons. If you watch the
| example games play out, KataGo kills the stones
| successfully and is trivially winning for most of the
| game. It makes a short term & devastating mistake where
| it doesn't seem to understand that it has a shortage of
| liberties and lets the adversary kill a huge group in a
| stupid way.
|
| The mistake KataGo makes doesn't have anything to do with
| long move horizons, on a long time horizon it still plays
| excellently. The short horizon is where it mucks up.
| zahlman wrote:
| I don't suppose you could directly link to a position? It
| would be interesting to see KataGo make a blunder of the
| sort you describe, because traditional Go engines were
| able to avoid them many years ago.
| roenxi wrote:
| Consider the first diagram in the linked paper (a, pg 2).
| It is pretty obvious that black could have killed the
| internal group in the top-right corner at any time for
| ~26 points. That'd be about enough to tip the game.
| Instead somehow black's group died giving white ~100
| points and white wins easily. Black would have had ~50
| moves to kill the internal group.
|
| Or if you want a replay, try
| https://goattack.far.ai/adversarial-policy-
| katago#contents - the last game (KataGo with 10,000,000
| visits - https://goattack.far.ai/adversarial-policy-
| katago#10mil_visi... - game 1 in the table) shows KataGo
| with a trivially winning position around move 200 that it
| then throws away with a baffling sequence of about 20
| moves. I'm pretty sure even as late as move 223 KataGo
| has an easily winning position, looks like it wins the
| capture race in the extreme lower left. It would have
| figured out the game was over by the capture 8 moves
| later.
| dragontamer wrote:
| I see what you mean.
|
| So dead man walking is a bad description. From your
| perspective it's still KataGo winning, but a series of
| serious blunders occurs in these attack positions.
| fipar wrote:
| Some amount of jargon is needed (in general, not just for this)
| to optimize communication among experts, but still, your
| comment reminded me of Pirsig's concept (IIRC introduced in his
| second book, "Lila") of the "cultural immune system", as he did
| bring jargon up in that context too.
|
| I guess, unsurprisingly, for jargon it is as for almost
| anything else: there's a utility function with an inflection
| point past which the output value actually decreases (if the
| goal is to convey information as clearly as possible; for
| other goals, I guess the utility function may be exponential
| ...)
| kizer wrote:
| You'd think the ability to set up elaborate tricks would imply
| similar knowledge of the game. And also that highly skilled AI
| would implicitly include adversarial strategies. Interesting
| result.
| dragontamer wrote:
| The existence of KataGo and its super-AlphaGo / AlphaZero
| strength is because Go players noticed that AlphaGo can't see
| ladders.
|
| A simple formation that even mild amateurs must learn to reach
| the lowest ranks.
|
| KataGo recognizes the flaw and has an explicit ladder solver
| written in traditional code. It seems like neural networks will
| never figure out ladders (!!!!!). And it's not clear why such a
| simple pattern is impossible for deep neural nets to figure
| out.
|
| I'm not surprised that there are other, deeper patterns that
| all of these AIs have missed.
| bwfan123 wrote:
| >It seems like neural networks will never figure out ladders
| (!!!!!). And it's not clear why such a simple pattern is
| impossible for deep neural nets to figure out.
|
| This is very interesting (I don't play Go) - can you
| elaborate? What is the characteristic of these formations
| that eludes AIs? Is it that they don't appear in the self-
| play training or game databases?
| dragontamer wrote:
| AlphaGo was trained on many human positions, all of which
| contain numerous ladders.
|
| I don't think anyone knows for sure, but ladders are very
| calculation heavy. Unlike a lot of positions where Go is
| played by so called instinct, a ladder switches modes into
| "If I do X opponent does Y so I do Z.....", almost chess
| like.
|
| Except it's very easy because there are only 3 or 4 options
| per step and really only one of those options continues the
| ladder. So it's this position where a chess-like tree
| breaks out in the game of Go but far simpler.
|
| You still need to play Go (determining the strength of the
| overall board and evaluate if the ladder is worth it or if
| ladder breaker moves are possible/reasonable). But for
| strictly the ladder it's a simple and somewhat tedious
| calculation lasting about 20 or so turns on the average.
|
| --------
|
| The thing about ladders is that no one actually plays out a
| ladder. They just sit there on the board, because it's rare
| for playing one out to be to both players' advantage
| (ladders are sharp: they either favor White or Black by
| significant margins).
|
| So if, say, Black is losing the ladder, Black will NEVER
| play the ladder out, but needs to remember that the ladder
| is there for the rest of the game.
|
| A ladder breaker is when Black places a stone that maybe in
| 15 turns (or later) will win the ladder (often while
| accomplishing something else). So after a ladder breaker,
| Black is winning the ladder and White should never play the
| ladder.
|
| So the threat of the ladder breaker changes the game and
| position severely in ways that can only be seen in the far,
| far future, dozens or even a hundred turns from now. It's
| outside the realm of computer calculation, yet the
| implications are feasible for humans to understand.
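The "one forced option per step" structure described above can be caricatured in a few lines. This is only a toy, assuming the whole fight reduces to walking a zig-zag diagonal toward the edge; a real solver (like KataGo's) must also track actual liberties and captures. `ladder_works` and its parameters are invented for illustration:

```python
def ladder_works(start, direction, size, breakers):
    """Toy ladder read: chase the defending stone along its zig-zag
    diagonal. If a defender-friendly stone (a ladder breaker) lies on
    the path before the edge, the chase fails; otherwise the defender
    runs out of board and is captured."""
    x, y = start
    dx, dy = direction  # e.g. (1, 1): the ladder runs toward one corner
    while 0 <= x < size and 0 <= y < size:
        if (x, y) in breakers:
            return False  # the breaker gives the defender an escape
        x, y = x + dx, y + dy
    return True  # the defender hit the edge: the ladder succeeds

# On an empty 19x19 board the ladder works; a single stone placed far
# ahead on the diagonal -- many turns before it matters -- breaks it.
assert ladder_works((3, 3), (1, 1), 19, set())
assert not ladder_works((3, 3), (1, 1), 19, {(15, 15)})
```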
| tasuki wrote:
| I'd argue it's clear why it's hard for a neural net to
| figure out.
|
| A ladder is a kind of a mechanical one-way sequence which
| is quite long to read out. This is easy for humans (it's a
| one-way street!) but hard for AI (the MCTS prefers to
| search wide rather than deep). It is easy to tell the
| neural net as one of its inputs eg "this ladder works" or
| "this ladder doesn't work" -- in fact that's exactly what
| KataGo does.
|
| See the pictures for more details about ladders:
| https://senseis.xmp.net/?Ladder
| dragontamer wrote:
| Doesn't MCTS deeply AND broadly search though?
|
| Traditional MCTS searches all the way to endgame and
| estimates how the current position leads to either win or
| loss. I'm not sure what the latest and greatest is but
| those % chance to win numbers are literally a search
| result over possible endgames IIRC.
|
| I guess I'd assume that MCTS should see ladders and play
| at least some of them out.
| tasuki wrote:
| The short ones, sure. The long ones, it's hard for pure
| MCTS to... keep the ladder straight?
| immibis wrote:
| I don't know that much about MCTS, but I'd think that
| since a ladder requires dozens of moves in a row before
| making any real difference to either player's position,
| they just don't get sampled if you are sampling randomly
| and don't know about ladders. You might find that all
| sampled positions lead to you losing the ladder, so you
| might as well spend the moves capturing some of your
| opponent's stones elsewhere?
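The sampling argument above is easy to put numbers on. Assuming, purely for illustration, that `b` replies look equally plausible at every ply, a uniformly random playout follows one specific `d`-ply forced sequence with probability b^-d:

```python
def p_follow_line(branching, depth):
    """Chance that a uniformly random playout plays one specific
    forced line, when `branching` moves look equally plausible at
    each ply. Illustrative numbers only."""
    return branching ** -depth

# A 20-ply ladder with ~4 plausible replies per step is effectively
# invisible to naive random sampling:
p = p_follow_line(4, 20)  # about 9e-13, i.e. ~1 in a trillion playouts
```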
| earnestinger wrote:
| https://senseis.xmp.net/?Ladder
|
| (Kind of like wikipedia for go players)
| erikerikson wrote:
| Some of our neural networks learned ladders. You forgot the
| "a" standing for artificial. Even so amended, "never"? Good
| luck betting on that belief.
| scotty79 wrote:
| > And it's not clear why such a simple pattern is impossible
| for deep neural nets to figure out.
|
| Maybe solving ladders is iterative? Once they make a chain-
| of-thought version of AlphaZero, it might figure them out.
| dwaltrip wrote:
| It's very iterative and mechanical. I would often struggle
| with ladders in blitz games because they require you to
| project a diagonal line across a large board with extreme
| precision. Misjudging by half a square could be fatal. And
| you also must reassess the ladder whenever a stone is
| placed near that invisible diagonal line.
|
| That's a great idea. I think some sort of CoT would
| definitely help.
| dragontamer wrote:
| These are Go AIs.
|
| The MCTS search is itself a chain-of-thought.
|
| Or in the case of KataGo, a dedicated Ladder-solver that
| serves as the input to the neural network is more than
| sufficient. IIRC all ladders of liberties 4 or less are
| solved by the dedicated KataGo solver.
|
| It's not clear why these adversarial examples pop up yet
| IMO. It's not an issue of search depth or breadth either,
| it seems like an instinct thing.
| dwaltrip wrote:
| Can MCTS dynamically determine that it needs to analyze a
| certain line to a much higher depth than normal due to
| the specifics of the situation?
|
| That's the type of flexible reflection that is needed. I
| think most people would agree that the hard-coded ladder
| solver in Katago is not ideal, and feels like a dirty
| hack. The system should learn when it needs to do special
| analysis, not have us tell it when to. It's good that it
| works, but it'd be better if it didn't need us to hard-
| code such knowledge.
|
| Humans are capable of realizing what a ladder is on their
| own (even if many learn from external sources). And it
| definitely isn't hard-coded into us :)
| dragontamer wrote:
| Traditional MCTS analyzes each line all the way to
| endgame.
|
| I believe neural-net based MCTS (ex: AlphaZero and
| similar) use the neural-net to determine how deep any
| line should go. (Ex: which moves are worth exploring?
| Well, might as well have that itself part of the training
| / inference neural net).
| scotty79 wrote:
| > The MCTS search is itself a chain-of-thought.
|
| I'm not quite sure it's a fair characterization.
|
| Either way...
|
| MCTS evaluates current position using predictions of
| future positions.
|
| To understand the value of ladders, the algorithm would
| need to iteratively analyse just the current layout of the
| stones on the board.
|
| Apparently the value of ladders is hard to infer from a
| probabilistic sample of predictions of the future.
|
| Ladders were an accidental human discovery, just because
| our attention is drawn to patterns. It just happens that
| they are valuable and can be mechanistically analyzed and
| evaluated. AI so far struggles to one-shot output
| solutions that would require running a small iterative
| program to calculate.
| kevinwang wrote:
| [2022]
| Upvoter33 wrote:
| "Our results demonstrate that even superhuman AI systems may
| harbor surprising failure modes." This is true but really is an
| empty conclusion. The result has no meaning for future
| "superintelligences"; they may or may not have these kinds of
| "failure modes".
| Kapura wrote:
| On the contrary, this is the most important part of the thesis.
| They are arguing not only that this AI was vulnerable to this
| specific attack, but that any AI model is vulnerable to attack
| vectors that the original builders cannot predict or
| preemptively guard against. If you say "well, a
| superintelligence won't be vulnerable," you are putting your
| faith in magic.
| dragontamer wrote:
| They developed a system / algorithm that reliably defeats the
| most powerful Go AI, and is a simple enough system for a
| trained human to execute.
|
| Surely that's important? It was thought that AlphaGo and KataGo
| were undefeatable by humans.
| aithrowawaycomm wrote:
| It's more a lesson about the dangers of transferring an
| objectively true statement: "MuZero can beat
| any professional Go player"
|
| to a squishy but not false statement: "MuZero
| is an AI which is superhuman at Go"
|
| to a statement which is basically wrong:
| "MuZero has superhuman intelligence in the domain of Go."
| vouaobrasil wrote:
| Not so encouraging. This paper will just be used to incorporate
| defense against adversarial strategies in Go playing AIs. A
| simple curiosity, but one reflective of the greater state of
| affairs in AI development which is rather dismal.
| brazzy wrote:
| According to the abstract, "The core vulnerability uncovered by
| our attack persists even in KataGo agents adversarially trained
| to defend against our attack."
| vouaobrasil wrote:
| Well, that does not apply to future Go AIs for all of time.
| kadoban wrote:
| Okay, how does one protect against it then? Why would this
| _not_ apply to any future ones?
| 383toast wrote:
| NOTE: this is a july 2023 paper, the defense paper in september
| 2024 is https://arxiv.org/abs/2406.12843
| 8n4vidtmkvmk wrote:
| > We find that though some of these defenses protect against
| previously discovered attacks, none withstand freshly trained
| adversaries.
| casey2 wrote:
| Reminds me of how, even after Deep Blue, chess players
| learned better anti-computer strategies. Because the space of
| Go is so much larger, there are likely many more anti-
| computer strategies like this; they exploit the eval function
| in the same way.
|
| As in chess, more compute will win out, as has already been
| shown. I will remind everyone that Elo is a measure of wins
| and losses, not difficulty; conflating the two will lead to
| poor reasoning.
| snovv_crash wrote:
| Elo also takes into account the strength of the opponent, which
| is a pretty good proxy for difficulty.
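For reference, opponent strength enters the rating through the standard Elo expected-score formula: an upset win against a much stronger player moves your rating more because your expected score against them was lower. A minimal sketch (real rating systems add K-factors and other tweaks):

```python
def elo_expected(ra, rb):
    """Expected score (win probability plus half the draw
    probability) for a player rated `ra` against one rated `rb`,
    per the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))

# Equal ratings give 0.5; a 400-point edge gives about 0.91, so an
# upset win against such a player earns a large rating update.
```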
| nilslindemann wrote:
| Here are some edge cases for chess: fortresses. The first three
| are "0.0"; in the fourth, Black wins.
|
| 8/8/8/1Pk5/2Pn3p/5BbP/6P1/5K1R w - - 0 1 (white can not free the
| rook)
|
| 1B4r1/1p6/pPp5/P1Pp1k2/3Pp3/4Pp1p/5P1P/5K2 b - - 0 1 (the rook
| can not enter white's position)
|
| kqb5/1p6/1Pp5/p1Pp4/P2Pp1p1/K3PpPp/5P1B/R7 b - - 0 1 (Rook to h1.
| King to g1, Queen can not enter via a6)
|
| 2nnkn2/2nnnn2/2nnnn2/8/8/8/3QQQ2/3QKQ2 w - - 0 1 (the knights
| advance as block, so that attacked knights are protected twice)
|
| In the first both Stockfish and Lc0 think white is better
| (slightly on a deep ply). In the second and in the third they
| think black wins. Lc0 understands the fourth (applause),
| Stockfish does not.
| diziet wrote:
| Links to these fortresses to those without familiarity with
| chess:
|
| https://lichess.org/analysis/standard/8/8/8/1Pk5/2Pn3p/5BbP/...
| https://lichess.org/analysis/fromPosition/1B4r1/1p6/pPp5/P1P...
| https://lichess.org/analysis/fromPosition/kqb5/1p6/1Pp5/p1Pp...
| https://lichess.org/analysis/fromPosition/2nnkn2/2nnnn2/2nnn...
| FartyMcFarter wrote:
| I'm not surprised that engines aren't tuned / haven't learned
| to evaluate positions like the last one (and probably most of
| the others) - there's absolutely no way this kind of position
| shows up in a real chess game.
| mtlmtlmtlmtl wrote:
| The last one, for sure won't happen. The two with the crazy
| pawn chains are unlikely, but these extremely locked
| structures do occasionally occur. And the first one is
| actually pretty plausible. The situation with the king on f1
| and the rook stuck in the corner is fairly thematic in some
| openings. It's just not well suited for engine analysis and
| fairly trivial for humans because we can eliminate large
| swathes of the game tree via logic.
|
| I.e. Assuming the black bishop and knight never move, we can
| see the kingside pawns will never move either. And the king
| will only ever be able to shuffle between f1 and g1.
| Therefore we can deduce the rook can never make a useful
| move. Now the only pieces that can make meaningful moves are
| the two connected passed pawns on the queenside, and the
| light-square bishop. Assume there was no bishop. The king can
| simply shuffle between b6 and c5, and the pawns are
| contained. Can the white bishop change any of this? No,
| because those two squares are dark squares, and in fact all
| of the black pieces are on dark squares. So the white bishop
| is useless. Ergo, no progress can be made. We've eliminated
| all the possible continuations based on a very shallow search
| using constraint based reasoning and basic deduction.
|
| Engines can't do any of this. No one has found a generalised
| algorithm to do this sort of thing(it's something I spend a
| silly amount of time trying to think up, and I've gotten
| nowhere with it). All they can do is explore paths to future
| possible positions, assign them a heuristic evaluation, and
| choose the best path they find.
|
| Although, I haven't actually tried to analyse position 1 with
| stockfish. I feel like on sufficient depth, it should find a
| forced repetition. Or the 50 move rule. Though it might waste
| a ton of time looking at meaningless bishop moves. Naively,
| I'd expect it to do 49 pointless bishop moves and king
| shuffles, then move a pawn, losing it, then another 49 moves,
| lose the other pawn. Then finally another 50 moves until
| running into the 50-move rule. So, back of the envelope, it
| would need to search to ~150 ply before concluding it's a
| draw, although pruning might actually mean it gets there
| significantly faster.
| prmph wrote:
| > Engines can't do any of this. No one has found a
| generalised algorithm to do this sort of thing(it's
| something I spend a silly amount of time trying to think
| up, and I've gotten nowhere with it).
|
| This is exactly why current AI cannot be said to actually
| think in the same fashion as humans, and why AI is very
| unlikely to reach AGI.
| zahlman wrote:
| Oh, no, not _this_ paper again.
|
| Please see https://boardgames.stackexchange.com/questions/58127/
| for reference. The first picture there shows a game supposedly
| "won by Black", due to a refusal to acknowledge that Black's
| stones are hopelessly dead everywhere except the top-right of the
| board. The "exploit" that the adversarial AI has found is, in
| effect, to convince KataGo to pass in this position, and then
| claim that White has no territory. It doesn't do this by claiming
| it could possibly make life with alternating play; it does so, in
| effect, by _citing a ruleset that doesn't include the idea of
| removing dead stones_ (https://tromp.github.io/go.html) and
| expects everything to be played out (using area scoring) for as
| long as either player isn't satisfied.
|
| Tromp comments: "As a practical shortcut, the following amendment
| allows dead stone removal" - but this isn't part of the
| formalization, and anyway the adversarial AI could just not
| agree, and it's up to KataGo to make pointless moves until it
| does. To my understanding, the formalization exists in large part
| because early Go programs often _couldn't_ reliably tell when
| the position was fully settled (just like beginner players). It's
| also relevant on a theoretical level for some algorithms - which
| would like to know with certainty what the score is in any given
| position, but would theoretically have to already play Go
| perfectly in order to compute that.
|
| (If you're interested in why so many rulesets exist, what kinds
| of strange situations would make the differences matter, etc.,
| definitely check out the work of Robert Jasiek, a relatively
| strong amateur European player:
| https://home.snafu.de/jasiek/rules.html . Much of this was
| disregarded by the Go community at the time, because it's
| _incredibly_ pedantic; but that's exactly what's necessary when
| it comes to rules disputes and computers.)
|
| One of the authors of the paper posted on the Stack Exchange
| question and argued
|
| > Now this does all feel rather contrived from a human
| perspective. But remember, KataGo was trained with this rule set,
| and configured to play with it. It doesn't know that the "human"
| rules of Go are any more important than Tromp-Taylor.
|
| But I don't see anything to substantiate that claim. All sorts of
| Go bots are happy to play against humans in online
| implementations of the game, under a variety of human-oriented
| rulesets; and they pass in natural circumstances, and then the
| online implementation (sometimes using a different AI) proposes
| group status that is almost always correct (and matches the group
| status that the human player modeled in order to play that way).
| As far as I know, if a human player deliberately tries to claim
| the status is wrong, an AI will either hold its ground or request
| to resume play and demonstrate the status more clearly. In the
| position shown at the Stack Exchange link, even in territory
| scoring without pass stones, White could afford dozens of plays
| inside the territory (unfairly costing 1 point each) in order to
| make the White stones all pass-alive and deny any mathematical
| possibility of the Black stones reaching that status. (Sorry,
| there really isn't a way to explain that last sentence better
| without multiple pages of the background theory I linked and/or
| alluded to above.)
| benchaney wrote:
| There are two strategies described in this paper. The cyclic
| adversary, and the pass adversary. You are correct that the
| pass adversary is super dumb. It is essentially exploiting a
| loophole in a version of the rules that Katago doesn't actually
| support. This is such a silly attack that IMO the paper would
| be a lot more compelling if they had just left it out.
|
| That said, the cyclic adversary is a legitimate weakness in
| Katago, and I found it quite impressive.
| zahlman wrote:
| What is "cyclic" about the adversarial strategy, exactly? Is
| it depending on a superko rule? That might potentially be
| interesting, and explanatory. Positions where superko matters
| are extremely rare in human games, so it might be hard to
| seed training data. It probably wouldn't come up in self-
| play, either.
| benchaney wrote:
| No, it isn't related to superko. It has to do with Katago
| misidentifying the status of groups that are wrapped around
| an opposing group. I assume the name cyclic has to do with
| the fact that the groups look like circles. There are
| images in the paper, but it is a straightforward misread
| of the life and death status of groups that are
| unambiguously dead regardless of rule set.
| immibis wrote:
| Oh, no, not _this_ response again.
|
| The AI is evaluated on the ruleset the AI is trained to play
| on, which is a Go variant designed to be easier for computers
| to implement.
|
| The fact that the AI might have won if using a different
| ruleset is completely irrelevant.
|
| The fact that the AI can be adapted to play in other rulesets,
| and this is frequently done when playing against human players,
| is irrelevant.
| benchaney wrote:
| It's not the same rule set though. The rule set they
| evaluated the AI on isn't one of the ones that it supports.
|
| Edit: This is confusing for some people because there are
| essentially two rule sets with the same name, but Tromp-
| Taylor rules as commonly implemented for actual play
| (including by Katago) involves dead stone removal, whereas
| Tromp Taylor rules as defined for Computer Science research
| doesn't. One might argue that the latter is the "real" Tromp
| Taylor rules (whatever that means), but at that point it is
| obvious that you are rules lawyering with the engine authors
| rather than doing anything that could reasonably be
| considered adversarial policy research.
| zahlman wrote:
| Thank you for the clarification.
| mNovak wrote:
| FYI: discussion [1] of this attack from late 2022, notably
| including lengthy discussion from the developer (hexahedron /
| lightvector) of KataGo, probably the most widely used super-human
| Go AI.
|
| Link is mid-thread, because the earlier version of the paper was
| less interesting than the revision later on.
|
| [1] https://forums.online-go.com/t/potential-rank-inflation-
| on-o...
| leoc wrote:
| Are there or may there be similar "attacks" on the LLM chatbots?
| canjobear wrote:
| Yes, this is an area of active research. For example
| https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm...
___________________________________________________________________
(page generated 2024-12-25 23:01 UTC)