[HN Gopher] Mastering Stratego
___________________________________________________________________
Mastering Stratego
Author : beefman
Score : 97 points
Date : 2022-12-01 20:20 UTC (2 hours ago)
(HTM) web link (www.deepmind.com)
(TXT) w3m dump (www.deepmind.com)
| [deleted]
| beefman wrote:
| The paper is sadly paywalled. I believe this is the preprint:
|
| https://arxiv.org/abs/2206.15378
| waprin wrote:
| Great article. I played Stratego a lot as a kid and it always
| felt simpler than chess, Go, or poker, so it's surprising that it
| has a much bigger game tree until you stop and think.
|
| I'm curious about the comparisons to poker. I know the hot
| algorithm in poker solvers is counterfactual regret minimization
| (CFR). The article indicates that the feedback cycle is too long
| for those algorithms to work, but I'd be curious to learn more
| about the relationship between CFR and what's tried here, if any.
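|
| For readers unfamiliar with CFR, here is a minimal sketch of
| regret matching, the per-decision update that CFR applies at each
| information set. It is a toy illustration only and says nothing
| about how DeepNash works; the function names and the
| rock-paper-scissors payoffs are assumptions made for the example.

```python
def regret_matching(cum_regret):
    """Turn cumulative regrets into a strategy (probabilities over actions)."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret: fall back to uniform play.
        return [1.0 / len(cum_regret)] * len(cum_regret)
    return [p / total for p in positive]


def update_regrets(cum_regret, action_values, strategy):
    """Add this round's regret: value of each action minus the strategy's value."""
    expected = sum(p * v for p, v in zip(strategy, action_values))
    return [r + (v - expected) for r, v in zip(cum_regret, action_values)]


# Toy round: the opponent always plays rock, so our payoffs for
# (rock, paper, scissors) are (0, +1, -1). These payoffs are hypothetical.
regrets = [0.0, 0.0, 0.0]
strategy = regret_matching(regrets)          # uniform at the start
regrets = update_regrets(regrets, [0.0, 1.0, -1.0], strategy)
print(regret_matching(regrets))              # [0.0, 1.0, 0.0]: shift to paper
```

| Full CFR repeats this update at every decision point of the game
| tree and averages the strategies over many iterations.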
| _HMCB_ wrote:
| I played it too. My goodness what a blast from the past. To be
| honest, none of my friends liked to play. I mostly played by
| myself it seems. LOL.
| sdwr wrote:
| I remember seeing a version of the paper earlier in the year (it
| talked a lot about getting the bot to be aggressive to avoid
| stalemates).
|
| Feels like the secret sauce has to be probability distributions
| guessing what all the pieces are.
|
| Bluffing in Stratego _seems_ like it requires long-term planning
| (if you move a 2 like a 10, you have to keep treating it like
| that for the bluff to work).
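|
| To make the "probability distributions over pieces" idea concrete,
| here is a small sketch of a belief update for one unknown enemy
| piece. It is purely illustrative and not a claim about what
| DeepNash does internally; the function names are hypothetical, and
| the piece counts are the standard 40-piece set.

```python
# Standard Stratego army: 40 pieces per side.
PIECE_COUNTS = {
    "flag": 1, "bomb": 6, "spy": 1, "scout": 8, "miner": 5,
    "sergeant": 4, "lieutenant": 4, "captain": 4, "major": 3,
    "colonel": 2, "general": 1, "marshal": 1,
}
IMMOBILE = {"flag", "bomb"}


def prior_from_counts(remaining):
    """Uniform-over-pieces prior: P(rank) is proportional to unseen pieces of that rank."""
    total = sum(remaining.values())
    return {rank: n / total for rank, n in remaining.items()}


def update_on_move(belief):
    """Condition on 'this piece just moved': flags and bombs are ruled out."""
    filtered = {r: (0.0 if r in IMMOBILE else p) for r, p in belief.items()}
    norm = sum(filtered.values())
    if norm == 0.0:
        raise ValueError("observation is impossible under the current belief")
    return {r: p / norm for r, p in filtered.items()}


belief = prior_from_counts(PIECE_COUNTS)
belief = update_on_move(belief)
print(round(belief["scout"], 3))  # 8 of the 33 movable pieces are scouts
```

| A real player would also weigh how the piece moved (the bluffing
| point above), which this sketch ignores.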
| dr_faustus wrote:
| Call me a cynic but the fact that after almost 10 years of AI
| hype we are still working our way down the list of popular board
| games is a bit of a downer for me. I mean, having AIs to play
| Stratego, Risk, Go, Diplomacy and what have you against is sure
| nice. But there are literally billions of dollars spent on these
| projects, and I've really come to the point where I just don't
| believe anymore that the current AI approaches will ever
| generalize to the real world, even in relatively limited scopes,
| without the need for significant human intervention and/or
| monitoring. What am I missing?
| zaptrem wrote:
| Have you tried this out yet? https://chat.openai.com/
|
| It's been providing real value to me over the past day for
| practicing Spanish, explaining Machine Learning concepts, and
| doing fancy write-ups in LaTeX. And this one can't even use
| Google yet! (other research teams have already created models
| capable of doing so, it's only a matter of time until these
| innovations are brought together in one place)
| VHRanger wrote:
| AI remains better than humans at anything that has well-defined
| rewards and a small time gap between action and feedback
| (either naturally, like poker, or by value function
| engineering, like Go or chess).
|
| The problem here is that it's missing the "glue" to more real
| world applications. This is where more humdrum software
| engineering comes in.
|
| Diplomacy in this is much more interesting than Stratego or
| beating the next video game - it mixes cooperative game theory
| with NLP and reinforcement learning.
| runarberg wrote:
| There was a time when I thought that maybe there was something
| more to AI than a fancy statistical model for when you need to
| fit non-linear data. But I'm solidly of the belief now that AI
| is precisely a very powerful statistical tool. I honestly think
| there was never any real strategy for getting AI to anything
| more than specialized learning for deeper inference using a lot
| of computational power.
|
| Don't get me wrong, using AI for that purpose is pretty amazing
| (but can also lead to some sketchy results if you don't know
| what you are doing[1]) but pretending it will lead to some
| "general AI" is nothing but hype IMO. And teaching AI to play
| these board games better than a grandmaster only serves to
| increase that hype.
|
| 1: https://www.vox.com/recode/2019/8/15/20806384/social-
| media-h...
| pbronez wrote:
| I'm not an AGI fanboy. I agree that the current line of
| inquiry (i.e. deep learning) won't get us there. I think
| neurosymbolic reasoning is needed. That work is still
| nascent, and worse, we don't have great ways to connect our
| current paradigm to it.
| clolege wrote:
| It's interesting to watch the videos they link of DeepMind
| playing against the top-level Stratego masters [0]. I usually
| find Stratego to be a bit of a dull game (less elegant and more
| drawn out than Go and chess), but I'm a sucker for watching top-
| level AIs play.
|
| Its skills for bluffing are both fascinating and a bit scary.
|
| [0] https://www.youtube.com/watch?v=HaUdWoSMjSY
| https://www.youtube.com/watch?v=L-9ZXmyNKgs
| https://www.youtube.com/watch?v=EOalLpAfDSs
| https://www.youtube.com/watch?v=MhNoYl_g8mo
| ep103 wrote:
| Are these games against Stratego masters? I'm watching the
| first one, but it doesn't say who they're playing against.
| sdwr wrote:
| Don't think the player pool is very deep; I doubt there are
| many masters around.
| clolege wrote:
| yep, top anonymized players
| VHRanger wrote:
| FWIW, one of the big things poker AI taught humans is massive
| overbets (e.g. going all in for $200 over a $15 pot).
|
| This is scary to do well in practice, because the
| mathematically optimal bluff frequency approaches 50% as you
| increase the overbet size.
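|
| The 50% limit can be checked with the usual toy model (an
| assumption here, not something stated in the thread): a single
| bet with a polarized range against a caller holding a pure
| bluff-catcher. The caller risks the bet B to win pot P plus B, so
| indifference requires bluffs to be B / (P + 2B) of the betting
| range. The function name below is hypothetical.

```python
def optimal_bluff_fraction(pot: float, bet: float) -> float:
    """Share of the betting range that should be bluffs to make a call break even.

    Caller's EV of calling: b * (pot + bet) - (1 - b) * bet = 0
    => b = bet / (pot + 2 * bet)
    """
    if pot < 0 or bet <= 0:
        raise ValueError("pot must be non-negative and bet must be positive")
    return bet / (pot + 2 * bet)


print(round(optimal_bluff_fraction(100, 100), 3))  # pot-sized bet: 0.333
print(round(optimal_bluff_fraction(15, 200), 3))   # $200 into $15: 0.482
```

| As B grows relative to P, B / (P + 2B) approaches 1/2 from below.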
| clolege wrote:
| Wow, that's crazy.
|
| It seems like it would be easier for AI to do, since it
| doesn't have any tells (it's easier to have a poker face when
| you don't have a face at all).
|
| I remember playing poker as a kid, and experimenting with
| pretending like my cards were good/bad with body language. I
| don't think that any professional players use that approach
| (they just have sunglasses and a straight face), but I wonder
| if AI could beat humans even more consistently if it
| developed a way to convey tells and fake tells?
| sdwr wrote:
| Watched some of the first game. I'd bet Stratego favors
| defence, advantage to the AI that has no/minimal concept of the
| value of time.
| clolege wrote:
| Yeah this is one of the reasons why I find it more dull than
| chess.
|
| There is an incentive to just _not_ move your pieces, so that
| the other player thinks they're bombs. As a result, players
| only activate 2-3 pieces at a time.
|
| In chess, on the other hand, you are constantly moving your
| pawns to the other side for promotion, or otherwise trying to
| activate/coordinate all of your pieces for an attack.
|
| It makes me think that if DeepMind's agent was trained to _not lose_
| instead of _win_, then the top strategy might be shuffling
| pieces and letting the enemy come to attack. No human would
| ever have the patience to play that way though.
| jstummbillig wrote:
| Can anyone shed light on how this is more challenging than the
| StarCraft or Dota agents, which also had to work with
| imperfect information?
| Tenoke wrote:
| StarCraft and Dota benefit a lot from having good micro.
| Stratego seems to be only macro. Micro is easyish for AI and
| requires less long-term thinking to get benefits from.
| adgjlsfhk1 wrote:
| StarCraft especially only resembles a strategy game in GM
| (and maybe high masters). Below that, the strategy is mostly
| "macro better so you have more units".
| machina_ex_deus wrote:
| Dota is a pretty local game: 70% tactics, 20% strategy, maybe
| 10% information. Yes, you have the warding game, but an AI pays
| no cost for looking at heroes' inventories (humans need to spend
| attention and move their map), so it already has a huge advantage
| over humans in the imperfect-information part. Usually, fighting
| into imperfect information is the bad choice.
|
| Stratego is 40% information, 40% strategy, maybe 10% tactics.
| If you know where the flag is, it's trivial to win in almost all
| situations. Fighting into imperfect information is literally
| the whole game.
| ghostbrainalpha wrote:
| Is tactics the same thing as execution?
|
| Like, is it just the speed of your clicking? Or is it more
| than that, like the most basic kinds of strategic decisions?
| keithnz wrote:
| In Dota, tactics are about the execution of abilities, often
| in coordination with other agents executing their abilities
| to get combo effects, while adapting to the situation as it
| unfolds.
| dtdynasty wrote:
| As an avid Dota player I wouldn't agree with your
| characterization that 70% of Dota 2 is your definition of
| tactics. What I've noticed differentiates player MMR the
| most is the strategy applied to each context. It's rarely
| the execution that's the problem as you can gain such
| overwhelming advantages through strategy.
| machina_ex_deus wrote:
| There's barely any long-term strategy in Dota; the only
| meaningful strategic decisions are items and heroes. Even
| ultimate usage has like a 2-minute window of importance.
| Wards too. And maybe the decision to push high ground,
| given how many games are lost because of it, but it's usually
| the tactical errors making most of the difference there.
|
| What's your MMR, out of curiosity?
| machina_ex_deus wrote:
| It's things like properly last-hitting creeps, good reaction
| timing, good reaction decisions, and coordinating real-time
| actions with teammates at millisecond resolution.
|
| It's obviously not about clicking fast, but it is about
| timing; sometimes 100 milliseconds of reaction time makes a
| huge difference in outcome. It is usually making decisions on
| very small time scales. Do you retreat or continue? Use an
| ability or hold it? Can you overextend?
|
| The only meaningful strategic decisions in Dota (ones you
| have a long time frame to decide and that affect the game for
| a long duration) are the draft (which the AI doesn't really
| master; they reduced the hero pool to simplify) and item
| purchases, and there are only a handful of them (~6) in an
| entire game. Other decisions don't really have a long
| "memory" time, a minute or two at the most. After two
| minutes every other decision is just reduced to the
| relative advantage between the teams.
|
| There used to be one hero in Dota which made it a strategy
| game instead (Techies). But it was like playing a different
| game, everyone hated it, and it was effectively removed.
| Techies was like playing Stratego against chess players;
| they obviously get pissed off at not playing the game they
| wanted.
| dtdynasty wrote:
| There are larger strategic decisions that are significant
| in Dota: which area of the map to play, which objectives
| are important and when, what type of fights we will win
| (fast and bursty), and when we will take them. Oftentimes
| these are thought of at the beginning of the game and
| affect gameplay throughout.
| rosmax_1337 wrote:
| Dota at the mid-casual and high-casual brackets (which is
| where you find most players) is also a social game.
| Establishing efficient leadership, communication and
| cooperation in a game gives you a huge advantage. At the
| low-casual and the pro levels, you find it becomes more a game
| of skill and strategy, funnily enough.
|
| (The old joke is that Dota is a 1 v 9 game, not a 5 v 5)
| beefman wrote:
| There's an extra space in the link to their code (at the end of
| the article). The correct URL is:
|
| https://github.com/deepmind/open_spiel/tree/master/open_spie...
| ArtWomb wrote:
| Wow! Thanks to DeepMind for OpenSpiel! Am looking forward to AI
| experimenting with Stratego, Battleships & Hanabi ;)
___________________________________________________________________
(page generated 2022-12-01 23:00 UTC)