[HN Gopher] Counterfactual Regret Minimisation or How I won any ...
___________________________________________________________________
Counterfactual Regret Minimisation or How I won any money in Poker?
Author : whoami_nr
Score : 108 points
Date : 2023-12-31 11:11 UTC (1 days ago)
(HTM) web link (rnikhil.com)
(TXT) w3m dump (rnikhil.com)
| Vecr wrote:
| That's evidential decision theory, right? You minimize the
| expected regret. If that's not risk averse enough for you, you
| can weight together multiple perturbation groups for your world
| model and utility function.
| blackbear_ wrote:
| I'm curious if one can make any money playing poker online while
| following some computer-optimized strategy. I assume many (most?)
| players are already doing this. Insights are appreciated :)
| 0xDEADFED5 wrote:
| poker bots have been around for as long as online poker has
| been. if you wanna get really sketchy have more than one bot on
| a table working together.
|
| pokertracker has been around forever too, lets you keep track
| of how certain people play so you can optimize strategy.
| whoami_nr wrote:
| Author here. Yes, you can still make money in online Poker. It's
| not as juicy as it was, and rake is very high these days, making
| micro stakes unbeatable. Moreover, the highest stakes usually run
| live and access is gated. The old 2000s dream of grinding from
| micros to become a high-stakes player is no longer possible.
| Regarding RTA and bots, they have always existed and constantly
| keep getting better. It's a cat and mouse game between websites
| and cheaters which will continue.
|
| GGPoker's anti fraud detection system:
| https://www.natural8in.com/security-ecology-agreement
|
| Remember this is the public stuff and they work with
| professional players to do a lot of behind-the-scenes stuff to
| ensure a fair game.
| ryandrake wrote:
| > GGPoker's anti fraud detection system:
|
| That just looks to me like a list of rules with vague
| promises of "sophisticated proprietary software, trust us
| bro" enforcing the rules. Who knows for sure how much of that
| is actually implemented? We have to take the company's word
| for it.
|
| I don't doubt some of that cheat detection exists, but some
| of it also seems pretty fantastic.
| rightbyte wrote:
| An approximately optimal bot would probably need tens of
| thousands of hands to be identified as a bot. I mean, the
| chance weights at the state nodes are not the same between
| different bots. The state is probably even compressed
| in different ways. The variance is way too high. It is not
| like chess.
|
| The "sophisticated proprietary software, trust us bro" is
| probably mainly used as a way to not do payouts. Online
| casinos are really shady.
|
| Winning in poker is about identifying "fish". Being 0.001%
| better than the other bots at the table won't beat the
| house.
| nhggfu wrote:
| guess you missed this thread OP
|
| https://forumserver.twoplustwo.com/29/news-views-
| gossip/supe...
|
| GG is not even doing basic modeling of theoretically possible
| vs actual win rates to identify outliers (cheaters..)
|
| I'd say that the assertion that these "named pro-players" are
| doing anything to ensure a fair game seems like utter PR fluff
| nonsense, esp. given they have 0 credentials / skills in data
| science / analysis etc.
| whoami_nr wrote:
| Yes, I have seen the thread. Catching a player using just
| the winrate is hard. 9k hands is peanuts and some of the
| assumptions in that thread (like a 53% VPIP player must
| have 0bb/100 winrate) are slightly far-fetched.
|
| I know for a fact that GG has a GTO detection algorithm. If
| you play too close to GTO/optimal strategy they
| investigate. A lot of RTA folks got caught this way.
|
| Jason Koon and Fedor Holz are part of the team which
| manually reviews statistical anomalies. It's stupid to think
| they don't know about the data side of Poker. Moreover, a
| ton of Poker players end up becoming traders and data
| scientists. There is a lot of skill overlap.
|
| I agree that some of the stuff is PR nonsense but to
| dismiss their entire anti-fraud operation is just stupid.
| They are literally the best in the world at this.
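
The point above that 9k hands is a small sample for judging a winrate can be made concrete with a little arithmetic. This is a minimal sketch, not anything GG or the thread participants use: it assumes hand results are independent and that the standard deviation is 100 bb per 100 hands, a figure picked here purely for illustration (the function names are hypothetical too).

```python
import math


def winrate_standard_error(hands: int, sd_per_100: float = 100.0) -> float:
    """Standard error (in bb/100) of an observed winrate after `hands` hands.

    sd_per_100 is an ASSUMED standard deviation in bb per 100 hands.
    """
    return sd_per_100 / math.sqrt(hands / 100)


def hands_needed(true_winrate: float, sd_per_100: float = 100.0, z: float = 2.0) -> int:
    """Hands needed before `true_winrate` sits z standard errors above zero."""
    return math.ceil(100 * (z * sd_per_100 / true_winrate) ** 2)


if __name__ == "__main__":
    # Uncertainty on a winrate measured over 9,000 hands.
    print(round(winrate_standard_error(9_000), 2))
    # Hands required to separate a 5 bb/100 winner from break-even at z = 2.
    print(hands_needed(5.0))
```

Under those assumptions the uncertainty after 9,000 hands is roughly 10 bb/100 either way, so a winrate alone says little at that sample size.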
| joelthelion wrote:
| Something that is surprising to me is that there are seemingly no
| strong open-source poker AIs available. Maybe it's because
| implementing CFR for poker is genuinely difficult?
| whoami_nr wrote:
| Author here. There are CFR implementations and even open source
| implementations of Pluribus. A google search shows me many such
| implementations.
|
| The tough part is not implementing the bot or replicating the
| research (Noam Brown mentions in the linked AMA that it cost
| just a couple hundred dollars to train). The hard part is
| the infra for cheating. Setting up multiple accounts on the
| website (bypassing KYC checks) and getting reliable cashouts
| (poker sites do a ton of AML checks). It's easier to do it at
| lower stakes but the cost of running the bot will eat up the
| win rate. On the flip side, the chances of you getting caught
| are very high when you play high stakes.
| joelthelion wrote:
| I'm not looking to cheat, but to simply play against strong
| AIs to train.
|
| However I haven't found any complete ready-to-play
| implementations. There are a few simple implementations of
| CFR on github, but that's a long way from a complete poker
| bot.
| whoami_nr wrote:
| Ah then, just go buy any of the training tools like GTO+,
| GTOBase, Vision (by Run It Once), GTOWiz, PLOMastermind tool
| by Jnandez etc. They are essentially coaching software to
| learn the game (powered by a solver) and a ton of them have
| a practice section where you can play against an AI. One of
| the earliest Poker solver devs, Oleg Ostroumov, made this
| back in 2013 or so: https://holdem.olegsolvers.com/ . You
| can play with that.
|
| On a side note, he also wrote about his story of building
| the first productized solver here:
|
| https://medium.com/@olegostroumov/worlds-first-poker-
| solver-...
| joelthelion wrote:
| Oleg's solver is fantastic, thank you! It's only for
| heads up, though.
|
| I'm still surprised by the difference with the chess
| world where all the strong engines are free and open-
| source. Maybe poker is less attractive to developers, or
| maybe poker devs are simply more inclined to make money
| (which I completely understand!)
| whoami_nr wrote:
| It's the latter. All the best engines are closed source.
| It makes more sense to capitalise on it by deploying it
| for coaching/playing.
| CrazyStat wrote:
| > I'm still surprised by the difference with the chess
| world where all the strong engines are free and open-
| source.
|
| FWIW, this is a fairly recent thing in chess (within the
| last decade). Before 2015 or so closed source commercial
| engines were better than open source options.
| actionflop wrote:
| You may be interested in https://actionflop.com/. It lets
| you play heads up no limit holdem against a game theory
| optimal bot (solved by CFR). It keeps track of your profit
| as well as your loss from making mistakes. It's a work in
| progress, and a coming feature will show you precisely what
| mistakes you made in each hand.
| rightbyte wrote:
| How does it know that an action given a hand and state is a
| mistake? It's quite rare for a play to be 0 probability
| in a strategy, so any action might be considered optimal.
| Does the tool need to know your "probability distribution"
| for the state and action?
| actionflop wrote:
| Right now, it's only keeping track of the mistakes where
| you pick a play which you should pick 0% of the time (for
| example, you should fold or raise but never call). These
| kinds of mistakes may not be quite as rare as you would
| think. For example, today, over about 1400 hands played
| against the bot, players have lost about 1350 in expected
| value by making these kinds of pure mistakes. This
| amounts to about 48 big blinds per 100 hands. (Granted,
| people might be playing haphazardly and not trying to
| play perfectly.)
|
| I am currently thinking about how to also measure
| mistakes which involve the player's distribution of
| choices when the best play is a mixed strategy. One
| possibility is to keep track of the player's distribution
| over time. This would probably require too large of a
| sample size, so one possibility would be to merge similar
| situations in the game tree when assessing this kind of
| mistake. Another possibility is to have the player
| somehow actually choose a mixed strategy when making a
| decision.
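
The "pure mistake" accounting described above can be sketched as follows. This is only an illustration of the idea, not actionflop's code: the function names, the dictionary layout, and the example numbers are hypothetical. The big blind of 2 units is also an assumption, inferred from the quoted figures (1350 lost over 1400 hands matches about 48 bb/100 only if one big blind is 2 units).

```python
def pure_mistake_loss(strategy: dict, action_ev: dict, chosen: str) -> float:
    """EV lost by `chosen` if the solver plays it 0% of the time, else 0.0.

    strategy:  action -> probability in the solver's (possibly mixed) strategy
    action_ev: action -> expected value of taking that action
    """
    if strategy.get(chosen, 0.0) > 0.0:
        return 0.0  # inside the solver's mix, so not counted as a pure mistake
    # Compare against the best action the solver actually plays.
    best_ev = max(ev for a, ev in action_ev.items() if strategy.get(a, 0.0) > 0.0)
    return max(0.0, best_ev - action_ev[chosen])


def loss_in_bb_per_100(total_loss: float, hands: int, big_blind: float) -> float:
    """Convert a total EV loss into big blinds lost per 100 hands."""
    return total_loss / big_blind / hands * 100


if __name__ == "__main__":
    # Hypothetical spot: fold or raise, never call.
    strategy = {"fold": 0.6, "raise": 0.4, "call": 0.0}
    action_ev = {"fold": 0.0, "raise": 0.0, "call": -1.5}
    print(pure_mistake_loss(strategy, action_ev, "call"))
    print(pure_mistake_loss(strategy, action_ev, "raise"))
    # Figures quoted in the comment, with an ASSUMED big blind of 2 units.
    print(round(loss_in_bb_per_100(1350, 1400, 2), 1))
```

Measuring mistakes inside a mixed strategy, the open question in the comment, would need the player's own action frequencies and is not covered by this sketch.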
| rightbyte wrote:
| Ok interesting. I think I way underestimated the amount
| of "always bad" actions.
| rightbyte wrote:
| The papers released back when CFR poker bots were a thing in a
| niche of academia (mainly around a yearly bot tournament)
| miss some key implementation points and are hard to reproduce.
| jeffreyrogers wrote:
| The commercially available solvers may be using CFR, but they are
| not anywhere near as strong as Pluribus because Pluribus pre-
| computed solutions for a reduced state space, then mapped the
| hand and actions into that reduced space and solved from there.
| That meant Pluribus could come up with a much better solution in
| much less time than the commercial solvers. This is also why most
| of the solvers only solve heads up.
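
Mapping real actions into a reduced space, as described above, can be illustrated with a toy example. This is a hedged sketch of the general idea only: the abstract bet sizes and the nearest-size rule are assumptions made here, and it is not a description of how Pluribus or any commercial solver actually translates actions.

```python
# Hypothetical abstraction: the only bet sizes the precomputed solution
# knows about, expressed as fractions of the pot.
ABSTRACT_BETS = (0.5, 1.0, 2.0)


def nearest_abstract_bet(bet: float, pot: float, sizes=ABSTRACT_BETS) -> float:
    """Map a real bet to the closest abstract size (as a fraction of the pot)."""
    if pot <= 0:
        raise ValueError("pot must be positive")
    fraction = bet / pot
    return min(sizes, key=lambda size: abs(size - fraction))


if __name__ == "__main__":
    # A bet of 70 into a pot of 100 is treated as the abstract half-pot bet.
    print(nearest_abstract_bet(70, 100))
    # A bet of 180 into a pot of 100 is treated as the abstract 2x-pot bet.
    print(nearest_abstract_bet(180, 100))
```

Hands can be grouped into buckets in a similar spirit, which is what keeps the state space small enough to solve ahead of time.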
| whoami_nr wrote:
| You are sort of correct. This used to be the case until 2 years
| back but now there are solvers like Monker which solve multiway
| spots fairly fast.
| yodon wrote:
| The OP mentions use of CFR in transportation logistics problems.
| Does anyone know of examples?
| munchler wrote:
| I used CFR to solve another card game called Setback (aka Auction
| Pitch), which is a trick-taking game that's similar to, but
| simpler than, Bridge.
|
| CFR is very effective, but slow and requires a lot of RAM. I had
| to create a smaller, abstract version of the game, solve that,
| and then map the result back to the actual game, so I didn't end
| up with a perfect Nash equilibrium, but the solution does still
| play at a super-human level.
|
| One of the interesting things about my approach is that it
| actually uses CFR at two separate levels: First it solves a
| single-deal version of the game, then it uses that solution to
| run CFR again on a repeated version of the game where each player
| accumulates points across multiple deals. (Bidding in Setback is
| highly score-dependent.)
|
| I think a similar approach might be possible for Hearts, but I
| haven't tried it yet. Solving Bridge with CFR may be beyond our
| current capability, but could also be possible in the future.
|
| [0]: https://www.bernsrite.com/Setback
|
| [1]: https://github.com/brianberns/Setback
|
| [2]: https://github.com/brianberns/Cfrm
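
For readers new to CFR, its basic building block is regret matching: play each action in proportion to its accumulated positive regret, and take the average strategy over time as the result. The sketch below applies that to rock-paper-scissors in self-play. It is a toy under stated assumptions (a one-shot game with no information sets, exact expected payoffs instead of sampling, an arbitrary starting bias), not full CFR and not the Cfrm library linked above.

```python
# PAYOFF[a][b]: payoff to the player choosing action a against action b.
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def strategy_from_regrets(regrets):
    """Regret matching: play in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)  # no positive regret: uniform


def train(iterations=20_000):
    """Self-play regret matching; returns each player's average strategy."""
    # Player 0 starts with an arbitrary bias toward rock so the run is not trivial.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3)) for a in range(3)
            ]
            expected = sum(strategies[player][a] * action_values[a] for a in range(3))
            for a in range(3):
                # Regret: how much better action a would have done than the mix.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for average in train():
        print([round(p, 3) for p in average])
```

Both average strategies should end up close to the uniform mix, the equilibrium of rock-paper-scissors. CFR applies the same update at every information set of a sequential game, weighting regrets by counterfactual reach probabilities, which is where the memory and time costs mentioned above come from.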
| rightbyte wrote:
| I made a CFR bot for limit Texas hold'em using the uni
| supercomputer around 2012. It was quite good, beat the
| reference bots, but it could not play online. I think there were
| house bots or already mainly good bots in the poker rooms at that
| time. Internet poker has been broken for a long long time.
| idopmstuff wrote:
| The issue in 2012 was more that the bad players all dropped off
| after the DOJ crackdown on online poker in 2011. It was still
| possible to play, but it became a pain in the ass to move money
| on and off the sites. That meant all the very casual players
| just didn't bother, and those were the ones you were making
| money from.
|
| I played very successfully from 2005-2009, and I tried a couple
| of times post-2011 - it was just devoid of the total idiots
| that would feed you money.
___________________________________________________________________
(page generated 2024-01-01 23:02 UTC)