[HN Gopher] Who is the most accurate world chess champion?
       ___________________________________________________________________
        
       Who is the most accurate world chess champion?
        
       Author : piotrgrudzien
       Score  : 157 points
       Date   : 2021-12-02 09:18 UTC (2 days ago)
        
 (HTM) web link (lichess.org)
 (TXT) w3m dump (lichess.org)
        
       | BlanketLogic wrote:
       | > At the time of publishing, the last decisive game in the World
       | Championship was game 10 of the World Championships 2016 -- 1835
       | days ago, or 5 years and 9 days. Is the singularity being
       | reached, with man and machine minds melding towards inevitable
       | monochromatic matches?
       | 
        | Very, very unfortunate timing, but still a valid question.
        
       | ItsMonkk wrote:
        | One of the reasons poker players prefer tournaments is that the
        | format pushes them away from perfect Nash-equilibrium play and
        | into being exploitable: someone who sticks to unexploitable play
        | simply doesn't make it into the money as often as someone who
        | deviates. Winning 51% of the time means nothing when you need to
        | be in the top 10% to earn anything back.
       | 
        | It seems like just looking at ACPL isn't capturing this
        | correctly. If someone makes a mistake and loses some centipawns,
        | but it induces an even larger mistake from their opponent, that
        | wasn't a mistake, it was a risk.
        
       | mjw1007 wrote:
       | Is there a risk that this measure is telling us as much about how
       | likely a match was to contain difficult positions as about how
       | skilled the players were?
       | 
       | For example, Karpov and Kasparov sometimes agreed short draws. I
       | wonder if that is flattering their figures.
        
         | jeremyjh wrote:
          | Definitely - if there are lots of good moves to be found,
          | accuracy will be higher. This is why, when you analyze games
          | for suspicion of cheating, you cannot look only at the accuracy
         | figure - you have to take into account how challenging the
         | positions are. Lichess and chess.com both do this but they do
         | not tell us how, for obvious reasons.
        
           | freediver wrote:
           | > Lichess and chess.com both do this but they do not tell us
           | how, for obvious reasons.
           | 
           | Isn't lichess open source?
        
       | tromp wrote:
        | I'm most impressed by Capablanca's jump in highly accurate play
        | back in 1921, which would not be surpassed for another 60 years.
        
         | pk2200 wrote:
         | Capablanca preferred playing quiet, positional chess, often
         | patiently nursing tiny endgame advantages into a win. It's
         | generally much easier to play accurately in simple endgames
         | than complex middlegames, and that's a big factor in his low
         | ACPL score. (I don't mean to take anything away from him - he
         | was probably the best ever at that style of play.)
        
         | marvel_boy wrote:
         | Capablanca without any doubt. Also Morphy.
        
         | LudwigNagasena wrote:
          | In part, high accuracy means that the opponents are
          | approximately of equal strength. The way you win games in
          | chess is by complicating the position so much that your
          | opponent is forced to make a mistake while you are still able
          | to handle it.
          | 
          | If the accuracy is high, not only does it mean that the
          | players are good, it also means that they don't ask each
          | other serious questions. Put any human against Stockfish and,
          | I am sure, their ACPL will increase dramatically.
        
       | ummonk wrote:
       | I'm not sure how meaningful these numbers are. I get around 40-50
       | ACPL in my games, and I certainly wouldn't have been anywhere
       | near a match for Botvinnik.
        
       | LudwigNagasena wrote:
       | Precision is a murky concept in chess because it is not a solved
       | game. First, if the move doesn't change the best play result, can
       | it really be called imprecise? Only in terms of practical
       | chances.
       | 
        | And if we are talking about practical chances, why should we
        | rely on computer-centric evaluation? If a human has to choose
        | between a move that wins in theory but requires finding 40 best
        | moves to avoid losing, and a move that is a theoretical draw but
        | forces the opponent to find 40 best moves to avoid losing, what
        | should a human choose?
       | 
       | What is even the ACPL of a move from a tablebase? There is no
       | value, it is either a win, a draw or a loss. So while the whole
       | idea behind this exercise is intuitively appealing and certainly
       | captures some sense behind the idea of accuracy, it should be
       | taken with a grain of salt.
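        | 
        | For concreteness, ACPL is usually computed roughly like this (a
        | sketch assuming python-chess and a local Stockfish binary; the
        | article's exact recipe may differ): evaluate before and after
        | each move from the mover's point of view and average the drop,
        | floored at zero.
        | 
        | import chess
        | import chess.engine
        | import chess.pgn
        | 
        | def acpl(pgn_path, depth=18):
        |     engine = chess.engine.SimpleEngine.popen_uci("stockfish")
        |     limit = chess.engine.Limit(depth=depth)
        |     game = chess.pgn.read_game(open(pgn_path))
        |     board = game.board()
        |     losses = {chess.WHITE: [], chess.BLACK: []}
        |     for move in game.mainline_moves():
        |         mover = board.turn
        |         before = engine.analyse(board, limit)["score"]
        |         board.push(move)
        |         after = engine.analyse(board, limit)["score"]
        |         # eval drop for the side that moved, floored at 0
        |         drop = (before.pov(mover).score(mate_score=10000)
        |                 - after.pov(mover).score(mate_score=10000))
        |         losses[mover].append(max(0, drop))
        |     engine.quit()
        |     return {c: sum(v) / len(v)
        |             for c, v in losses.items() if v}
        | 
        | Of course this still just averages the engine's practical
        | estimates; it says nothing about the theoretical win/draw/loss
        | value.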
        
         | thom wrote:
         | Yep, agree with this. It was surprising to me to find out how
         | large the error bars were in computer evaluations of their own
         | games. In the last TCEC superfinal, for example, the majority
         | of the draws saw at least one position with an evaluation
         | higher than +/-1.0, and 3 of the 26 decisive games came after a
          | 0.0 evaluation. I assume that margin should be even bigger in
          | human games, so it's hard to see what there is to learn
          | (beyond some fun lines) from these numbers outside of opening
          | prep.
         | 
         | As for the tablebase question, it would be nice to see
         | win/forced-draw probabilities from engines instead of the
         | increasingly artificial material evaluation.
        
         | Barrin92 wrote:
         | >Precision is a murky concept in chess because it is not a
         | solved game
         | 
          | It's ironically also a murky concept for the opposite reason.
          | In some openings the analysis of GMs goes so deep that they
          | can fairly often play almost exclusively computer-aided prep.
          | There's a big difference between a 40-move game in that kind
          | of theoretical position and an off-beat game.
         | 
         | Depending on the style of the players and how open the games
         | are it gets pretty complicated to figure out what precision
         | actually implies for the games in any real sense.
        
       | [deleted]
        
       | pgcj_poster wrote:
       | I would have liked to see this go back far enough to include
       | Morphy, whom Fischer considered "the most accurate player who
       | ever lived." I would be surprised if Stockfish agreed, but it
       | would be interesting to see.
        
         | dfan wrote:
         | Kenneth Regan's work on Intrinsic Performance Ratings includes
         | estimated ratings for Morphy, which vary widely from event to
         | event but average around 2300, which I think matches the
         | intuitive perception of his strength that modern strong players
         | have. https://cse.buffalo.edu/~regan/papers/pdf/Reg12IPRs.pdf
         | 
         | (Of course, as with all historical players, he would be
         | stronger if he were re-animated today and exposed to modern
         | principles and openings.)
        
       | nomilk wrote:
       | Centipawn loss (or simply the engine's evaluation of a position)
       | doesn't take into account how realistically a human could hold a
       | position.
       | 
       | During yesterday's WCC Game 6 the computer evaluation meant
       | little when players were in time trouble. Anything could have
       | happened going into the first time control, despite the game
       | being dead drawn for the first 3.5 hours.
       | 
        | In the final stages the computer again evaluated the game as
        | drawn, but presumed Nepo could defend _perfectly_ for tens of
        | moves without a single inaccuracy. Super GMs can't do that given
        | hours or days, let alone minutes.
        | 
        | Last thought: did anyone else assume this was written in
        | R/ggplot2 at first glance? Seaborn and/or matplotlib look
        | strikingly like ggplot2 nowadays!
        
         | 4by4by4 wrote:
          | In my opinion, neural network engines like AlphaZero or Leela
          | Chess Zero do a much better job of assessing how difficult a
          | position is to hold. They also report the evaluation in a
          | completely different manner: win/draw/loss probabilities as
          | opposed to a centipawn score.
         | 
         | For example, in yesterday's game Stockfish was often giving a
         | drawn evaluation (0.00) where Leela Chess gave a win
         | probability of 30%+. I was posting about this during the game.
         | 
         | https://twitter.com/nik_king_/status/1466794534214504454?s=2...
        
           | LudwigNagasena wrote:
           | Stockfish also uses a neural network for evaluation.
        
             | 4by4by4 wrote:
              | True, it works differently though. And Stockfish does not
              | compute win/draw/loss probabilities as part of its eval;
              | it converts cp to WDL using an exponential fit based on
              | Stockfish vs Stockfish games. So the draw percentage in
              | Leela is a lot more interesting and useful.
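              | 
              | For intuition, a cp-to-score mapping of that general shape
              | (the 400 here is just the familiar Elo-style constant, an
              | assumption, not Stockfish's actual fitted model, which
              | also depends on material and move number):
              | 
              | def cp_to_expected_score(cp, scale=400.0):
              |     # illustrative logistic curve from centipawns to an
              |     # expected score in [0, 1]; not Stockfish's real fit
              |     return 1.0 / (1.0 + 10 ** (-cp / scale))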
        
             | ummonk wrote:
              | Yes, but at its heart it's still classical alpha-beta
              | search rather than Monte Carlo tree search.
        
         | polytely wrote:
          | Exhaustion also starts to play a role after playing for
          | something like 7 hours; at some point mistakes will start to
          | slip in.
        
         | eckesicle wrote:
         | I was thinking a bit about this during the game too.
         | 
         | Perhaps alongside centipawn loss (a measure of how many
         | hundredths of a pawn a player loses by making the non-optimal
         | move as determined by a chess AI engine) we could also measure
         | the difficulty of any position.
         | 
         | Stockfish (a popular chess engine) roughly works by
         | constructing a tree of possible moves and evaluating the score
         | according to some heuristic at its maximum depth. The best
         | result at depth n (25 I believe) is considered the best move
         | and incurs 0 centipawn loss.
         | 
          | Perhaps we can define the difficulty of a position by the
          | relative centipawn loss at each preceding depth in the tree?
         | The difficulty of a position is then determined by the depth at
         | which the best move no longer changes.
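          | 
          | A rough sketch of that, assuming python-chess and a local
          | Stockfish binary (depth 25 is just a placeholder):
          | 
          | import chess
          | import chess.engine
          | 
          | def stabilisation_depth(fen, max_depth=25):
          |     # depth from which the best move stops changing
          |     engine = chess.engine.SimpleEngine.popen_uci("stockfish")
          |     board = chess.Board(fen)
          |     best = []
          |     for d in range(1, max_depth + 1):
          |         lim = chess.engine.Limit(depth=d)
          |         best.append(engine.analyse(board, lim)["pv"][0])
          |     engine.quit()
          |     # walk back while shallower searches already agreed
          |     # with the final choice
          |     d = max_depth
          |     while d > 1 and best[d - 2] == best[-1]:
          |         d -= 1
          |     return d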
        
           | ashtonbaker wrote:
           | This is an interesting thought! A couple of other scattered
           | thoughts I had about this:
           | 
           | - Engine evaluation of a leaf of the tree will always be
           | different and more sophisticated than human heuristics. So
           | there's a problem where a human can't be expected to follow
           | down some lines. Of course, this is always changing, as
           | humans seek to understand engine heuristics better. Carlsen's
           | "blunder" at move 33 was a good example of this, from my
           | memory.
           | 
           | - Maybe there's a difficulty metric like "sharpness", some
           | function of the number of moves which do not incur a
           | significant centipawn loss. Toward the end of game 6, Carlsen
           | faced a relatively low sharpness on his moves, whereas
           | Nepomniachtchi faced a high sharpness, and despite the
           | theoretical draw, this difference will prove to be decisive
           | between humans. This seems like it could interact in
           | interesting ways with your difficulty metric - for example,
           | what does it mean if sharpness is only revealed at high
           | depth?
           | 
           | - It would be interesting to take the tree generated by
           | stockfish, and weight the tree at each node by the
           | probability that a human player would evaluate the position
           | as winning. Then you could give a probability of ending up at
           | each terminal position of the tree. Maybe some sort of deep
            | learning model trained on players' previous games? Time
           | controls add such a confounding factor to this, but it would
           | be so interesting to see "wild engine lines" highlighted in
           | real-time.
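            | 
            | For the sharpness idea above, a back-of-the-envelope
            | version could just count how many moves the engine thinks
            | are playable (a sketch assuming python-chess, a local
            | Stockfish binary and an arbitrary 50-centipawn threshold):
            | 
            | import chess
            | import chess.engine
            | 
            | def playable_moves(fen, threshold=50, depth=18, width=10):
            |     # moves within `threshold` centipawns of the best;
            |     # fewer playable moves = a sharper position to face
            |     eng = chess.engine.SimpleEngine.popen_uci("stockfish")
            |     board = chess.Board(fen)
            |     infos = eng.analyse(board,
            |                         chess.engine.Limit(depth=depth),
            |                         multipv=width)
            |     eng.quit()
            |     scores = []
            |     for info in infos:
            |         rel = info["score"].pov(board.turn)
            |         scores.append(rel.score(mate_score=10000))
            |     best = max(scores)
            |     return sum(1 for s in scores
            |                if best - s <= threshold)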
        
         | Someone wrote:
         | > In the final stages the computer again evaluated the game as
         | drawn, but presumed Nepo could defend perfectly for tens of
         | moves without a single inaccuracy.
         | 
         | I agree that it's not very useful to compare with table bases,
         | especially given the "30 seconds time added per move" regime
         | this was played under by the time they reached the position.
         | 
         | However, I don't think the table bases even have enough
         | information to indicate how close to losing a theoretically
          | drawn position is. So, I don't think this required perfect
         | accuracy to defend against (defining 'inaccuracy' as any move
         | for black that either makes it take longer to reach a draw or
         | moves to a losing position. That, I think, is the most
         | reasonable definition)
        
         | jeremyjh wrote:
         | You are definitely right about the evaluation; I switched
         | between several streams and I don't think there was anyone
         | saying that they didn't prefer white after around move 39,
         | despite a 0.0 eval for a lot of those positions. But part of
         | the reason the eval is misleading is because it might be
         | reflecting a sequence of "only moves" - where only one move can
         | hold that evaluation and it may be very hard to find some of
         | those moves for black, while white has lots of good moves in
          | each position. While that is a problem with human
          | interpretation of the eval, I do not see how it invalidates
          | the use of ACPL, which is an average across the entire game.
        
         | oneoff786 wrote:
         | This is merely the seaborn "darkgrid" style option. You need to
         | set it explicitly if you want this effect.
        
         | nextaccountic wrote:
          | Why isn't the time left for each player an input to the
          | evaluator? It shouldn't assume that everyone has plenty of
          | time!
        
           | 4by4by4 wrote:
           | I've thought about this, too. Sadly, times only started being
           | recorded in the last 20 years and last time I tried I
           | couldn't find a large dataset with the times included.
           | 
            | Chess.com did do one study and found that a large percentage
            | of mistakes occur in moves 36-40, because in some time
            | controls additional time is added at move 40.
        
       | blt wrote:
       | > _If we'd used a different chess engine, even a weaker version
       | of the same one -- such as Stockfish 12 -- it may have found the
       | 2018 World Championship the most accurate in history (assuming
       | both players prepared and trained using Stockfish 12 in 2018)._
       | 
       | This would be a really good follow-up experiment. If the
       | theorized result really happens, we would have strong evidence
       | that players are "overfitting" to their training chess engine. It
       | would also be interesting to see how stable the historical
       | figures look between different engines.
        
       | KingOfCoders wrote:
       | Yes "Alan Turing, was the first recorded to have tried, creating
       | a programme called Turochamp in 1948."
       | 
       | But also
       | 
       | "Since 1941 Zuse worked on chess playing algorithms and
       | formulated program routines in Plankalkul in 1945."
       | 
       | https://www.chessprogramming.org/Konrad_Zuse
        
       | umutisik wrote:
       | If "accuracy" measures how well a player matches computer chess,
       | then as players continue to study more and more with chess
       | programs, you would expect their play to match the programs more
       | and more.
       | 
       | Personally I find it odd to measure how well the players match
        | the computer program and call it accuracy. The computers do not
        | expand the game tree exhaustively, so they give only one
        | estimate of true minimax accuracy.
       | 
       | When Lee Sedol made move 78 in game 4 against AlphaGo, it reduced
       | his accuracy but won him the game.
        
         | RockofStrength wrote:
         | Move 78 was humanity's last great stand against AI in a board
         | game. Lee Sedol, tired and inspired, reddens AlphaGo's ears
         | with a move plucked from a higher dimension.
         | 
         | It now seems humorous that Kasparov once accused people of
         | helping computers behind the scenes. Now chess masters have
          | been caught huddled in bathroom stalls with their smartphones.
         | Chess commentators choose to willfully ignore chess engines in
         | their presentations, in order to enable our understanding of
         | the analysis. The torch has been passed.
        
           | hyperpape wrote:
           | We should be clear: move 78 didn't really work, except that
           | the engine got confused. Other humans and later versions of
           | Go engines can refute it.
        
         | chki wrote:
         | This is addressed in the article by the way.
        
         | thepete2 wrote:
         | I don't know if this is a thing, but chess players might also
         | steer the game in a direction/position which their opponent
         | hasn't studied much, but they have. There's a "social" side to
         | this seemingly "mathematical" game, no?
        
           | JulianWasTaken wrote:
            | This happens mostly in the opening (the beginning of the
            | game, which is also where engines tend to be a bit less
            | informative), but yes, it is definitely part of the chess
           | metagame, and you'll often see commentators talk about
           | whether someone is "still in prep" or has gotten out of it.
           | It often can lead to time advantages if one gets an opponent
           | out of prep.
        
           | thom wrote:
           | Fabiano Caruana (previous World Championship challenger) has
           | said that he's happy to find lines where the machines have
            | you slightly behind, purely because they're less likely to have
           | been studied in detail by your opponent. Even with perfect
           | recall of the first 20/30 moves in various lines, players are
           | still going to steer away from some lines based on their and
           | their opponent's strengths (tough against super GMs with few
           | weaknesses though). So you're definitely right, I think
           | there's a lot of game theory here, albeit much of it settled
           | by your team ahead of the actual match.
        
       | oh_my_goodness wrote:
       | It's strange how many times the article says 'chess software' has
       | improved since (Turing's day, the 1990s, whenever). Sure, the
       | software is better, but six orders of magnitude in hardware
       | performance haven't hurt either.
        
         | ummonk wrote:
         | The hardware improvement has been huge, but on the other hand
         | if you pit Stockfish NNUE against top 1990s software on equal
         | modern hardware, Stockfish would win handily. It's really been
         | both hardware and software improving.
        
       | marcodiego wrote:
        | Why not use something similar to AlphaGo Zero to carefully
        | analyze the chess games of a deceased player until it is able
        | to mimic that player's decisions?
       | 
        | It could bring many players "back to life". It would even be
        | possible to watch "impossible matches" like Kasparov vs
       | Capablanca!
        
         | LudwigNagasena wrote:
          | That seems like a complex problem. People don't play enough
          | chess games in their lifetime to produce a dataset on the
          | scale required for neural networks. You would need to train a
          | general chess engine and then tweak it with few-shot learning.
          | But I doubt it could capture the high-level ideas behind a
          | player's style unless someone comes up with a smart
          | architecture for that.
        
         | leelin wrote:
         | Chess.com has the "personality bots" that supposedly play with
         | the style of various well-known players, streamers, and GMs.
         | 
         | But I remember watching Hikaru Nakamura stream once playing
         | through each of these bots (and beating them fairly easily). He
         | commented that several of the bots were doing things the real
          | players would never do, both in style and even the opening
          | move (1.e4 for a player who almost always opens 1.d4).
         | 
         | It was fairly early after the personality bots came out, so
         | maybe they've fixed it by now.
        
           | 8note wrote:
           | My guess on the personality bots is that they set the bot to
            | play at the players' current rating, rather than training
            | ML on their games.
        
           | marcodiego wrote:
           | Chessmaster 3k had that feature. But I was never good enough
           | in chess to evaluate how well it worked. Still, I thought
           | about the simplest method:                 - get a chess
           | playing algorithm (I think it will probably well with minimax
           | or mcts) with many tunables,            - use a genetic
           | algorithm to adjust the tunables of the first algorithm; use
           | how similar it plays (make it choose a move on positions from
           | a database of games from said player) as a goal function.
           | 
           | Doesn't seem terribly complicated to do, but don't know how
           | similar to a human it would play.
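            | 
            | The goal function could be as simple as a move-match rate
            | over positions from the player's games, e.g. (a sketch
            | assuming python-chess and an engine whose tunables are
            | exposed as UCI options):
            | 
            | import chess
            | import chess.engine
            | 
            | def match_rate(positions, engine_path, options, depth=12):
            |     # positions: list of (fen, uci_move_the_player_chose)
            |     eng = chess.engine.SimpleEngine.popen_uci(engine_path)
            |     eng.configure(options)  # the tunables being evolved
            |     hits = 0
            |     for fen, played in positions:
            |         board = chess.Board(fen)
            |         limit = chess.engine.Limit(depth=depth)
            |         result = eng.play(board, limit)
            |         hits += result.move == chess.Move.from_uci(played)
            |     eng.quit()
            |     return hits / len(positions)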
        
         | thom wrote:
         | You'd be training purely on games with outdated theory, in
         | which case the engine would lose to those trained from more
         | modern repertoires. Or you'd let it learn through self play
         | after initially showing it the human games, in which case it
         | would probably quickly lose the identifiable stylistic aspects
         | of its initial training.
        
           | csee wrote:
           | The point isn't to make an unbeatable chess player, it's to
           | 'bring them back to life'.
        
             | thom wrote:
             | But what I mean is Kasparov would destroy Capablanca. Even
             | outside of what one might consider 'raw chess talent', he
             | was drawing on decades of better theory and would deploy
             | that knowledge. It would be hard to simulate Kasparov as if
             | he were taught chess in Capablanca's time (maybe not
             | impossible and a fascinating project, I just don't see how
             | you'd do it).
        
               | msbarnett wrote:
               | > It would be hard to simulate Kasparov as if he were
               | taught chess in Capablanca's time (maybe not impossible
               | and a fascinating project, I just don't see how you'd do
               | it).
               | 
               | I don't think they were suggesting that's the result they
               | wanted - if you could somehow magically reanimate
               | Capablanca in real life and pit him against peak
               | Kasparov, he might lose badly.
               | 
               | A neural net having the same outcome is essentially
               | what's being asked for. Kasparov raised on Capablanca's
               | era chess or vice versa would be unrecognizably different
               | players, and I don't think anybody expects an AI to
               | simulate their soul.
        
               | thom wrote:
               | Fair enough. I don't think this is as interesting an
               | experiment as people think though. Nobody wants to see
               | Morphy on zero points but that's what would happen.
        
               | thom wrote:
               | I already want to take this back because Morphy probably
               | would pick up points if we ran a tournament on the basis
               | of all official and unofficial world champions, plus
               | people with openings named after them. But the
               | correlation between date of peak and performance would be
               | extremely high.
        
       | lisper wrote:
       | Just in case you, like me, were wondering what the word
       | "accurate" means in this context:
       | 
       | https://support.chess.com/article/1135-what-is-accuracy-in-a...
        
       | thom wrote:
       | I think it would be worth looking at a player's accuracy in terms
       | of their cohort's standard deviation, given that theory is more
       | or less shared across all players. Even then, the best players
       | now have the best teams and computers, so a lot of Magnus's
       | accuracy in this game is a credit to Jan Gustafsson et al. I've
        | been thinking about how you might capture a player's accuracy
        | outside of their prep; that seems a better measure, but even
        | then you're so often choosing between five +0.0 moves by the
        | middlegame, and you could easily play many totally accurate
        | moves if you didn't feel like agreeing a draw. I know some have
        | looked at Markov models of a player's likelihood of blundering
        | to analyse this instead.
       | 
       | Personally I've never felt Magnus enjoyed the modern game with as
       | much opening preparation as we have now. It seems like he's only
       | in the last few years invested the time in this, instead of
       | relying on his technique to win even from losing positions. I
        | hope AlphaZero's demonstration that fun positional ideas like
        | pawn sacrifices and h4-everywhere still work reinvigorated him
        | somewhat during his dominant first half of 2019, so there's
        | still hope the machines haven't just drained the romance from
        | the game, even if their ideas remain dominant.
        
       | hippodamos wrote:
        | Sorry, I am hijacking this thread. I am on a quest to find the
        | rules of the chess variant Finesse by GM Walter Browne. If
        | anyone knows them:
       | 
       | https://lookingforfinesse.github.io/lookingforfinessevariant...
        
       | raymondh wrote:
        | For historical human-to-human games, it would be more
        | interesting to see how well players targeted the weaknesses of
        | their opponents. That skill likely mattered more than absolute
        | accuracy as measured by computers.
        
       | bjourne wrote:
        | In chess, ACPL roughly works like goals scored (or conceded) in
        | football. Goals happen when the defending team makes mistakes.
        | A team that is a master of defense will concede few goals, but
        | it will also score few, since defending well requires playing
        | cautiously. It's the same with attacking, aggressive teams:
        | they both score and concede more goals than average.
        
       ___________________________________________________________________
       (page generated 2021-12-04 23:01 UTC)