[HN Gopher] Solving Probabilistic Tic-Tac-Toe
___________________________________________________________________
Solving Probabilistic Tic-Tac-Toe
Author : Labo333
Score : 63 points
Date : 2024-06-11 14:25 UTC (2 days ago)
(HTM) web link (louisabraham.github.io)
(TXT) w3m dump (louisabraham.github.io)
| H8crilA wrote:
| Are you sure there is only one solution to the set of max/min
| equations? I've been trying to solve the probabilistic tic-tac-
| toe since it was published, and I believe there are actually
| multiple solutions here.
|
| In fact I'm not sure that given a board there exists a best
| solution. I.e. I suspect that for most boards and for most fixed
| strategies you can create a counter-strategy that outperforms the
| given strategy (but perhaps then loses to some other strategy).
|
| My code looks completely different, but I suspect you can uncover
| alternate solutions if you mess with the binary search. For
| example replace `x = (a + b) / 2` with `x = (a + b) / 3`, or even
| better pick a random point between a and b.
| GistNoesis wrote:
| The article give the bellman equation that you need to solve.
|
| V(s)=max c scxV'(s+c) + fcxV'(s-c) + ncxV'(s)
|
| V'(s)=min c scxV(s-c) + fcxV(s+c) + ncxV(s)
|
| You have to solve for all states simultaneously.
|
| One way to solve them is by using fixed point iteration. aka
| successive approximation.
|
| Pick some small alpha and update the value function in a
| gradient ascent kind of way.
|
| Vi+1(s) = (1-alpha) _Vi(s)+alpha_ ( max c scxV'i(s+c)+
| fcxV'i(s-c) + ncxV'i(s) )
|
| V'i+1(s) = (1-alpha) _V 'i(s)+ alpha _ ( min c scxVi(s-c) +
| fcxVi(s+c) + ncxVi(s) )
|
| https://en.wikipedia.org/wiki/Fixed-point_iteration
| salamo wrote:
| I think there is. In this case we have to clarify that an
| optimal strategy can still lose due to random chance, so the
| optimal strategy will only be clear after some large number of
| games.
| Ruphin wrote:
| There is no hidden information and turns are sequentially. This
| means both players have full access to all information, and
| that implies there must be a single "best" move. If the board
| is at a certain state and one player is to play next, the
| strategy of the opponent is irrelevant for determining the
| optimal move.
|
| Proof:
|
| Suppose that it is possible that different strategies exist
| that outperform each other. Lets consider the case that there
| are three "optimal" strategies that counter each other, call
| them Rock Paper and Scissors. Since these are the three
| "optimal" strategies, both players know the exact move produced
| by these strategies. The first important part to note is that
| there is no cost to changing strategy during the game. If an
| opponent has one fixed strategy, say Rock, the "optimal"
| strategy is to change your strategy to Paper. Since you have
| full information on the state of the board, you do not have to
| guess the strategy of the opponent, because there is no hidden
| information. Now you could consider the opponent strategy to be
| hidden information, but this doesn't hold for the following
| reason: If knowing the opponent strategy would alter your
| choice of move (by for example changing to Rock that
| specifically counters his hidden Scissors), the opponent can
| freely change to Paper after you made your move, because he can
| see that you changed to Rock before making his next move. If it
| was impossible for the opponent to tell that you changed your
| strategy, that implies that each strategy produces the same
| board state, and that implies the strategies are identical.
| blharr wrote:
| Aren't the same conditions true for chess, though there is no
| proof that a best strategy exists?
|
| What if it's not just rock/paper/scissors, and there are more
| than 3 strategies?
| sobellian wrote:
| At least one of the players has a strategy that cannot be
| countered though. That is, either white can force a win,
| black can force a win, or both players can force a draw.
| There is no situation where every strategy can be
| countered.
| phoronixrly wrote:
| I love how the other day when the Show HN[1] post was submitted
| there were a bunch of 'Oh what a great problem to solve with AI',
| when it was painfully obvious that it is not necessary... Is
| there astroturfing here or is AI the current hammer that everyone
| defaults to?
|
| [1] https://news.ycombinator.com/item?id=40635397
| Ukv wrote:
| Unless there's a comment I'm missing, I don't see anyone
| recommending deep learning and neural networks. Comments such
| as "the AI is making really bad plays" mean in the videogame
| sense (the CPU player), as far as I can tell - which was just a
| simple heuristic in the original.
| renewiltord wrote:
| Interesting. When you have a hammer ("people are talking about
| AI but you don't need AI") everything (including "I'm trying to
| make a game AI") looks like a nail. You thought they were
| talking about LLMs, ML, DL? It's a video game AI, dude. If you
| have trouble understanding English you can use an LLM to
| translate for you. Just ask it "Are they talking about LLMs
| here?" after quoting the thing.
| primitivesuave wrote:
| I implemented an expectiminimax player without the neutral state
| optimization that you described 2 days ago:
| https://keshav.is/coding/pt3
|
| It seems like even a naive search of the game tree in the browser
| produces a fairly strong computer opponent to a human player. It
| would be interesting to see if this optimization produces a
| better computer player to standard expectiminimax as I
| implemented here:
| https://github.com/keshavsaharia/pt3/blob/master/src/minimax...
| igpay wrote:
| Hey all, I'm the author of Probabilistic Tic-Tac-Toe. I'm
| currently working with Louis to integrate this solver into the
| game itself so that people can play against it. I'm hoping to
| have it released by EOD and will update this comment when it's
| ready.
| 317070 wrote:
| I'm actually interested in being coached by the solver, so it
| can tell me when I made a suboptimal move and what the optimal
| move would have been.
|
| Like in chess, playing against stockfish isn't very fun, but
| having it as a way to get better is fun.
___________________________________________________________________
(page generated 2024-06-13 23:00 UTC)