[HN Gopher] Counterfactual Regret Minimisation or How I won any ...
       ___________________________________________________________________
        
       Counterfactual Regret Minimisation or How I won any money in Poker?
        
       Author : whoami_nr
       Score  : 108 points
       Date   : 2023-12-31 11:11 UTC (1 days ago)
        
 (HTM) web link (rnikhil.com)
 (TXT) w3m dump (rnikhil.com)
        
       | Vecr wrote:
       | That's evidential decision theory, right? You minimize the
       | expected regret. If that's not risk averse enough for you, you
       | can weight together multiple perturbation groups for your world
       | model and utility function.
        
       | blackbear_ wrote:
       | I'm curious if one can make any money playing poker online while
       | following some computer-optimized strategy. I assume many (most?)
       | players are already doing this. Insights are appreciated :)
        
         | 0xDEADFED5 wrote:
         | poker bots have been around for as long as online poker has
         | been. if you wanna get really sketchy have more than one bot on
         | a table working together.
         | 
         | pokertracker has been around forever too, lets you keep track
         | of how certain people play so you can optimize strategy.
        
         | whoami_nr wrote:
         | Author here. Yes, you can still make money in online Poker. It's
         | not as juicy as it was, and rake is very high these days, making
         | micro stakes unbeatable. Moreover, the highest stakes usually run
         | live and access is gated. The old 2000s dream of grinding from
         | micros to become a high-stakes player is no longer possible.
         | Regarding RTA and bots, they have always existed and constantly
         | keep getting better. It's a cat-and-mouse game between websites
         | and cheaters which will continue.
         | 
         | GGPoker's anti fraud detection system:
         | https://www.natural8in.com/security-ecology-agreement
         | 
         | Remember, this is only the public stuff; they also work with
         | professional players to do a lot of behind-the-scenes work to
         | ensure a fair game.
        
           | ryandrake wrote:
           | > GGPoker's anti fraud detection system:
           | 
           | That just looks to me like a list of rules with vague
           | promises of "sophisticated proprietary software, trust us
           | bro" enforcing the rules. Who knows for sure how much of that
           | is actually implemented? We have to take the company's word
           | for it.
           | 
           | I don't doubt some of that cheat detection exists, but some
           | of it also seems pretty fantastic.
        
             | rightbyte wrote:
             | An approximately optimal bot would probably need tens of
             | thousands of hands to be identified as a bot. I mean, the
             | node chance weights are not the same between different bots
             | for the state nodes. The state is probably even compressed
             | in different ways. The variance is way too high. It is not
             | like chess.
             | 
             | The "sophisticated proprietary software, trust us bro" is
             | probably mainly used as a way to not do payouts. Online
             | casinos are really shady.
             | 
             | Winning in poker is about identifying "fish". Being 0.001%
             | better than the other bots at the table won't beat the
             | house.
        
           | nhggfu wrote:
           | guess you missed this thread OP
           | 
           | https://forumserver.twoplustwo.com/29/news-views-
           | gossip/supe...
           | 
           | GG is not even doing basic modeling of theoretically possible
           | vs actual win rates to identify outliers (cheaters..)
           | 
           | I'd say that the assertion that these "named pro-players" are
           | doing anything to ensure a fair game seems like utter PR fluff
           | nonsense, esp given they have 0 credentials / skills in data
           | science / analysis etc.
        
             | whoami_nr wrote:
             | Yes I have seen the thread. Catching a player using just
             | the winrate is hard. 9k hands is peanuts and some of the
             | assumptions in that thread (like a 53% VPIP player must
             | have a 0bb/100 winrate) are slightly far-fetched.
             | 
             | I know for a fact that GG has a GTO detection algorithm. If
             | you play too close to GTO/optimal strategy they
             | investigate. A lot of RTA folks got caught this way.
             | 
             | Jason Koon and Fedor Holz are part of the team which
             | manually reviews statistical anomalies. It's stupid to think
             | they don't know about the data side of Poker. Moreover, a
             | ton of Poker players end up becoming traders and data
             | scientists. There is a lot of skill overlap.
             | 
             | I agree that some of the stuff is PR nonsense but to
             | dismiss their entire anti fraud operation is just stupid.
             | They are literally the best in the world at this.
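
To put a number on "9k hands is peanuts", here is a rough sketch of how
wide the uncertainty on an observed win rate is. The standard deviation of
100 bb/100 is an assumed round figure chosen only for illustration (real
values depend on the game and playing style), and the function names are
hypothetical. The calculation also assumes 100-hand blocks are independent.

```python
import math

def winrate_standard_error(hands: int, stdev_bb_per_100: float = 100.0) -> float:
    """Standard error of an observed win rate, in bb/100.

    stdev_bb_per_100 is an ASSUMED per-100-hands standard deviation.
    """
    blocks = hands / 100.0  # number of 100-hand blocks in the sample
    return stdev_bb_per_100 / math.sqrt(blocks)

def confidence_interval(observed_bb_per_100: float, hands: int,
                        stdev_bb_per_100: float = 100.0, z: float = 1.96):
    """Approximate 95% interval (normal approximation) for the true win rate."""
    se = winrate_standard_error(hands, stdev_bb_per_100)
    return observed_bb_per_100 - z * se, observed_bb_per_100 + z * se

if __name__ == "__main__":
    print(winrate_standard_error(9000))    # roughly 10.5 bb/100
    print(confidence_interval(5.0, 9000))  # interval comfortably includes 0
```

Under these assumptions, a 9k-hand sample cannot separate a solid winner
from a break-even player, which is why win rate alone is weak evidence.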
        
       | joelthelion wrote:
       | Something that is surprising to me is that there are seemingly no
       | strong open-source poker AIs available. Maybe it's because
       | implementing CFR for poker is genuinely difficult?
        
         | whoami_nr wrote:
         | Author here. There are CFR implementations and even open source
         | implementations of Pluribus. A google search shows me many such
         | implementations.
         | 
         | The tough part is not implementing the bot or replicating the
         | research (Noam Brown mentions in the linked AMA that it cost
         | just a couple hundred dollars to train). The hard part is
         | the infra for cheating: setting up multiple accounts on the
         | website (bypassing KYC checks) and getting reliable cashouts
         | (poker sites do a ton of AML checks). It's easier to do it at
         | lower stakes, but the cost of running the bot will eat up the
         | win rate. On the flip side, the chances of you getting caught
         | are very high when you play high stakes.
        
           | joelthelion wrote:
           | I'm not looking to cheat, but to simply play against strong
           | AIs to train.
           | 
           | However I haven't found any complete ready-to-play
           | implementations. There are a few simple implementations of
           | CFR on github, but that's a long way from a complete poker
           | bot.
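
For a sense of what a "simple implementation of CFR" looks like, here is a
minimal sketch of chance-sampled CFR on Kuhn poker (three cards, one bet),
the usual toy game for this algorithm. It is not taken from any of the
repositories mentioned in the thread; the names Node, cfr and train are
illustrative. A full no-limit hold'em bot additionally needs card and bet
abstraction, far more memory, and an actual game client.

```python
import random

ACTIONS = "pb"  # p = pass (check/fold), b = bet (bet/call)

class Node:
    """Regret and strategy accumulators for one information set."""
    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]

    def strategy(self, reach_weight):
        # Regret matching: play in proportion to positive cumulative regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        strat = [p / total for p in positive] if total > 0 else [0.5, 0.5]
        for a in range(2):
            self.strategy_sum[a] += reach_weight * strat[a]
        return strat

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]

def terminal_utility(cards, history):
    """Payoff for the player to act, or None if the hand is not over."""
    if len(history) < 2:
        return None
    player = len(history) % 2
    higher = cards[player] > cards[1 - player]
    if history[-1] == "p":
        if history == "pp":           # both checked: showdown for the ante
            return 1 if higher else -1
        return 1                      # opponent folded to a bet
    if history[-2:] == "bb":          # bet and call: showdown for ante + bet
        return 2 if higher else -2
    return None

def cfr(nodes, cards, history, p0, p1):
    util = terminal_utility(cards, history)
    if util is not None:
        return util
    player = len(history) % 2
    node = nodes.setdefault(str(cards[player]) + history, Node())
    strat = node.strategy(p0 if player == 0 else p1)
    action_utils = [0.0, 0.0]
    for a, act in enumerate(ACTIONS):
        if player == 0:
            action_utils[a] = -cfr(nodes, cards, history + act, p0 * strat[a], p1)
        else:
            action_utils[a] = -cfr(nodes, cards, history + act, p0, p1 * strat[a])
    node_util = sum(s * u for s, u in zip(strat, action_utils))
    opp_reach = p1 if player == 0 else p0
    for a in range(2):
        # Counterfactual regret: weighted by the opponent's reach probability.
        node.regret_sum[a] += opp_reach * (action_utils[a] - node_util)
    return node_util

def train(iterations, seed=0):
    rng = random.Random(seed)
    nodes = {}
    for _ in range(iterations):
        cards = rng.sample([1, 2, 3], 2)  # one random deal per iteration
        cfr(nodes, cards, "", 1.0, 1.0)
    return nodes

if __name__ == "__main__":
    for key, node in sorted(train(10000).items()):
        print(key, [round(p, 2) for p in node.average_strategy()])
```

Info set keys are the player's card followed by the action history, so
"1b" is the lowest card facing a bet. The average strategy there should
end up almost always folding, since calling is strictly worse.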
        
             | whoami_nr wrote:
             | Ah then, just go buy any of the training tools like GTO+,
             | GTOBase, Vision (by Run It Once), GTOWiz, PLOMastermind tool
             | by Jnandez etc. They are essentially coaching software to
             | learn the game (powered by a solver) and a ton of them have
             | a practice section where you can play against an AI. One of
             | the earliest Poker solver devs, Oleg Ostroumov, made this
             | back in 2013 or so: https://holdem.olegsolvers.com/ . You
             | can play with that.
             | 
             | On a side note, he also wrote about his story of building
             | the first productized solver here:
             | 
             | https://medium.com/@olegostroumov/worlds-first-poker-
             | solver-...
        
               | joelthelion wrote:
               | Oleg's solver is fantastic, thank you! It's only for
               | heads up, though.
               | 
               | I'm still surprised by the difference with the chess
               | world where all the strong engines are free and open-
               | source. Maybe poker is less attractive to developers, or
               | maybe poker devs are simply more inclined to make money
               | (which I completely understand!)
        
               | whoami_nr wrote:
               | It's the latter. All the best engines are closed source.
               | It makes more sense to capitalise on it by deploying it
               | for coaching/playing.
        
               | CrazyStat wrote:
               | > I'm still surprised by the difference with the chess
               | world where all the strong engines are free and open-
               | source.
               | 
               | FWIW, this is a fairly recent thing in chess (within the
               | last decade). Before 2015 or so closed source commercial
               | engines were better than open source options.
        
             | actionflop wrote:
             | You may be interested in https://actionflop.com/. It lets
             | you play heads up no limit holdem against a game theory
             | optimal bot (solved by CFR). It keeps track of your profit
             | as well as your loss from making mistakes. It's a work in
             | progress, and a coming feature will show you precisely what
             | mistakes you made in each hand.
        
               | rightbyte wrote:
               | How does it know that an action given a hand and state is a
               | mistake? It's quite rare for a play to be 0 probability
               | in a strategy, so any action might be considered optimal.
               | Would the tool need to know your "probability distribution"
               | for the state and action?
        
               | actionflop wrote:
               | Right now, it's only keeping track of the mistakes where
               | you pick a play which you should pick 0% of the time (for
               | example, you should fold or raise but never call). These
               | kinds of mistakes may not be quite as rare as you would
               | think. For example, today, over about 1400 hands played
               | against the bot, players have lost about 1350 in expected
               | value by making these kinds of pure mistakes. This
               | amounts to about 48 big blinds per 100 hands. (Granted,
               | people might be playing haphazardly and not trying to
               | play perfectly.)
               | 
               | I am currently thinking about how to also measure
               | mistakes which involve the player's distribution of
               | choices when the best play is a mixed strategy. One
               | possibility is to keep track of the player's distribution
               | over time. This would probably require too large of a
               | sample size, so one possibility would be to merge similar
               | situations in the game tree when assessing this kind of
               | mistake. Another possibility is to have the player
               | somehow actually choose a mixed strategy when making a
               | decision.
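
One way the "track the player's distribution over merged situations" idea
could look, purely as a sketch: count the player's actions per bucket of
similar situations and compare the observed frequencies with the solver's
mixed strategy once the sample is large enough. The class name, the bucket
labels, the min_samples threshold and the use of total variation distance
are all assumptions made here, not how actionflop works.

```python
from collections import Counter, defaultdict

class FrequencyTracker:
    """Tracks observed action frequencies per bucket of similar situations."""

    def __init__(self, min_samples=30):
        self.counts = defaultdict(Counter)  # bucket -> action -> count
        self.min_samples = min_samples      # assumed threshold, tune as needed

    def record(self, bucket, action):
        self.counts[bucket][action] += 1

    def deviation(self, bucket, target):
        """Total variation distance between observed and target frequencies.

        Returns None until the bucket has at least min_samples observations.
        0.0 means the player matches the target mix, 1.0 means no overlap.
        """
        seen = self.counts[bucket]
        n = sum(seen.values())
        if n < self.min_samples:
            return None
        actions = set(target) | set(seen)
        return 0.5 * sum(abs(seen[a] / n - target.get(a, 0.0)) for a in actions)

if __name__ == "__main__":
    tracker = FrequencyTracker(min_samples=4)
    for action in ["call", "call", "call", "call"]:
        tracker.record("river_bluffcatch", action)  # hypothetical bucket label
    # Solver mixes 50/50 here (made-up numbers); the player always calls.
    print(tracker.deviation("river_bluffcatch", {"call": 0.5, "raise": 0.5}))
```

Coarser buckets reach the sample threshold sooner but blur together
spots where the correct mix differs, which is the trade-off described in
the comment above.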
        
               | rightbyte wrote:
               | Ok interesting. I think I way underestimated the amount
               | of "always bad" actions.
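
A small sketch of how such "pure mistake" accounting could be done, plus
the arithmetic behind the figures quoted above. The strategy and EV
dictionaries are made-up inputs and the function names are hypothetical.
The big blind of 2 chips is an inference: it is the value that makes 1350
chips lost over 1400 hands come out to about 48 bb/100.

```python
def pure_mistake_loss(strategy, action_ev, chosen):
    """EV given up by picking an action the strategy plays 0% of the time.

    strategy:  action -> probability in the solver's strategy
    action_ev: action -> expected value of that action, in chips
    Returns 0.0 when the chosen action is in the support of the strategy.
    """
    if strategy.get(chosen, 0.0) > 0.0:
        return 0.0
    best_ev = max(action_ev[a] for a, p in strategy.items() if p > 0.0)
    return best_ev - action_ev[chosen]

def loss_in_bb_per_100(total_loss_chips, hands, big_blind):
    """Convert a total chip loss into big blinds per 100 hands."""
    return total_loss_chips / big_blind / hands * 100.0

if __name__ == "__main__":
    # Made-up spot: solver folds or raises, never calls.
    strategy = {"fold": 0.4, "raise": 0.6, "call": 0.0}
    action_ev = {"fold": 0.0, "raise": 0.0, "call": -1.5}
    print(pure_mistake_loss(strategy, action_ev, "call"))  # 1.5 chips lost
    print(loss_in_bb_per_100(1350, 1400, big_blind=2))     # about 48.2
```

Actions inside the support of an equilibrium mix all have the same EV,
so this measure only registers plays the solver never makes.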
        
         | rightbyte wrote:
         | The papers released back when CFR poker bots were a thing in
         | the academic niche (mainly around a yearly bot tournament)
         | miss some key implementation points and are hard to reproduce.
        
       | jeffreyrogers wrote:
       | The commercially available solvers may be using CFR, but they are
       | not anywhere near as strong as Pluribus because Pluribus pre-
       | computed solutions for a reduced state space, then mapped the
       | hand and actions into that reduced space and solved from there.
       | That meant Pluribus could come up with a much better solution in
       | much less time than the commercial solvers. This is also why most
       | of the solvers only solve heads up.
        
         | whoami_nr wrote:
         | You are sort of correct. This used to be the case until 2 years
         | back but now there are solvers like Monker which solve multiway
         | spots fairly fast.
        
       | yodon wrote:
       | The OP mentions use of CFR in transportation logistics problems.
       | Does anyone know of examples?
        
       | munchler wrote:
       | I used CFR to solve another card game called Setback (aka Auction
       | Pitch), which is a trick-taking game that's similar to, but
       | simpler than, Bridge.
       | 
       | CFR is very effective, but slow and requires a lot of RAM. I had
       | to create a smaller, abstract version of the game, solve that,
       | and then map the result back to the actual game, so I didn't end
       | up with a perfect Nash equilibrium, but the solution does still
       | play at a super-human level.
       | 
       | One of the interesting things about my approach is that it
       | actually uses CFR at two separate levels: First it solves a
       | single-deal version of the game, then it uses that solution to
       | run CFR again on a repeated version of the game where each player
       | accumulates points across multiple deals. (Bidding in Setback is
       | highly score-dependent.)
       | 
       | I think a similar approach might be possible for Hearts, but I
       | haven't tried it yet. Solving Bridge with CFR may be beyond our
       | current capability, but could also be possible in the future.
       | 
       | [0]: https://www.bernsrite.com/Setback
       | 
       | [1]: https://github.com/brianberns/Setback
       | 
       | [2]: https://github.com/brianberns/Cfrm
        
       | rightbyte wrote:
       | I made a CFR bot for limit Texas hold'em using the uni
       | supercomputer around 2012. It was quite good, beat the
       | reference bots, but it could not play online. I think there were
       | house bots or already mainly good bots in the poker rooms at that
       | time. Internet poker has been broken for a long long time.
        
         | idopmstuff wrote:
         | The issue in 2012 was more that the bad players all dropped off
         | after the DOJ crackdown on online poker in 2011. It was still
         | possible to play, but it became a pain in the ass to move money
         | on and off the sites. That meant all the very casual players
         | just didn't bother, and those were the ones you were making
         | money from.
         | 
         | I played very successfully from 2005-2009, and I tried a couple
         | of times post-2011 - it was just devoid of the total idiots
         | that would feed you money.
        
       ___________________________________________________________________
       (page generated 2024-01-01 23:02 UTC)