https://dkb.blog/p/chatgpts-chess-elo-is-1400

ChatGPT's Chess Elo is 1400

Dmitri Brereton · 5 hr ago

ChatGPT cannot be stopped.

It shouldn't be surprising that ChatGPT can play chess, but many have claimed it can't be done. These people used bad prompts and concluded that ChatGPT can't play a legal chess game. With the right prompt, ChatGPT can play and win full chess games. After 19 games on Chess.com, it had an Elo of 1402.

Method

This experiment was done using the default GPT3.5 model on ChatGPT Plus. ChatGPT played 19 games of 30-minute chess. The prompt was:

"You are a chess grandmaster playing as {white|black} and your goal is to win in as few moves as possible. I will give you the move sequence, and you will return your next move. No explanation needed."

ChatGPT was given the full move sequence every time, and returned the next move.

[Image: the opening prompt]

After the initial prompt, you just need to give it a move sequence.

With this prompt, ChatGPT almost always plays fully legal games. Occasionally it does make an illegal move, but I decided to interpret that as ChatGPT flipping the table and saying "this game is impossible, I literally cannot conceive of how to win without breaking the rules of chess." So whenever it wanted to make an illegal move, it resigned.

Using this method, ChatGPT played 19 games. It won 11, lost 6, and drew 2.[1]

Observations

Opening

ChatGPT used the same opening strategy every time. As white it always opened with e4. As black it almost always responded with e5, unless white opened with d4, in which case it responded with d5. After that, it moved its knights and bishops in a very predictable way. It used the most popular beginner strategies in the early game, so it was pretty boring.
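The loop described in the Method section can be sketched in a few lines. This is a minimal sketch, not the author's actual setup (the experiment was done by hand in the ChatGPT interface): `ask_model` is a hypothetical stand-in for the ChatGPT call, and `is_legal` is a hypothetical stand-in for the legality check (in practice, Chess.com simply rejects illegal moves).

```python
OPENING_PROMPT = (
    "You are a chess grandmaster playing as {color} and your goal is to win "
    "in as few moves as possible. I will give you the move sequence, and you "
    "will return your next move. No explanation needed."
)

def next_move(moves, ask_model, is_legal):
    """Send the full move sequence and return the model's next move.

    Returns None to signal a resignation, per the article's rule that
    attempting an illegal move counts as ChatGPT giving up.
    """
    # Re-send the entire move sequence every turn, as the article describes.
    reply = ask_model(" ".join(moves)).strip()
    # The model can reply with more than one move; keep only the first token.
    move = reply.split()[0] if reply else ""
    return move if move and is_legal(moves, move) else None
```

A call like `next_move(["e4", "e5"], ask_model, is_legal)` would then be made once per turn, appending each returned move to the sequence.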
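For a rough sense of what the 1402 figure means, the standard Elo expected-score formula can be checked against the 11-6-2 record above. This is illustrative only: Chess.com's actual rating system is a Glicko variant, and the opponents' individual ratings aren't listed here, so this is a sanity check rather than a recomputation.

```python
def expected_score(rating, opponent):
    """Elo's predicted average score per game (win = 1, draw = 0.5)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))

# Actual score over the 19 games: 11 wins + 2 draws = 12/19 points.
actual_score = (11 + 0.5 * 2) / 19  # about 0.63 per game

# Against opposition rated around 1300, a 1402-rated player is predicted
# to score about 0.64 per game -- in the same ballpark as 12/19.
predicted = expected_score(1402, 1300)
```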
But once it got out of the early game, it started making some real plays.

Going insane

One time, instead of giving me a move, it returned a sequence of 10 moves ending in its own checkmate. I wasn't sure if this was its way of saying "yeah, I'm screwed" or what, so I just re-prompted it in a new chat and it went back to playing.

I noticed that the longer the game sequence was, the more likely it was to return a sequence of moves instead of just the next move. This makes some sense: as the sequence got longer, it weighted those tokens more heavily than my actual prompt and wanted to continue the pattern.

Despite these issues, it was able to play a fully legal 61-move game.

Internal state

I didn't do any poking to see if ChatGPT really had an internal state of the board. But it always knew when it was taking pieces. It always knew when a move would cause check. It always knew when a move would be checkmate. This makes it plausible that it actually tracked the board state, but it doesn't prove anything.

GPT4 Sucks

I didn't get to experiment with GPT4 very much because of the rate limits. But in the two games I attempted, it made numerous illegal moves.

[Image: GPT4 about to output a completely incorrect board state]

It is possible that GPT4 is worse at chess than GPT3.5, which would be surprising. But the data is too limited to draw meaningful conclusions. It is also possible that GPT4 needs a different kind of prompt to play the game properly.

2000 Elo Prompt

One thing that would be interesting to investigate is whether there exists a prompt that can achieve an even higher Elo. There are worse prompts that produce illegal moves, and this prompt reaches 1402 Elo. Maybe there's a prompt out there that plays at 2000 Elo.

Appendix: Full Chess Games

This is a list of all the games played by "loss-function", aka ChatGPT. You can click the links to step through the full game.
Game 1 (loss): https://www.chess.com/game/live/72666664483
Game 2 (win): https://www.chess.com/game/live/72667341699
Game 3 (win): https://www.chess.com/game/live/72669842517
Game 4 (win): https://www.chess.com/game/live/72724283481
Game 5 (win): https://www.chess.com/game/live/72725460899
Game 6 (loss): https://www.chess.com/game/live/72736477771
Game 7 (loss): https://www.chess.com/game/live/72738758017
Game 8 (loss): https://www.chess.com/game/live/72739437207
Game 9 (win): https://www.chess.com/game/live/72740486535
Game 10 (win): https://www.chess.com/game/live/72745972545
Game 11 (win): https://www.chess.com/game/live/72746554187
Game 12 (loss): https://www.chess.com/game/live/72748891577
Game 13 (draw): https://www.chess.com/game/live/72751251917
Game 14 (win): https://www.chess.com/game/live/72753699827
Game 15 (win): https://www.chess.com/game/live/72756091483
Game 16 (manually decided draw, see footnote) — the legendary 61-move game: https://www.chess.com/game/live/72756830467
Game 17 (win): https://www.chess.com/game/live/72766301475
Game 18 (win): https://www.chess.com/game/live/72767463351
Game 19 (loss): https://www.chess.com/game/live/72767573859

[1] One of the draws was a draw by move repetition. The other was a manual decision by me: my opponent was AFK for half the game and had barely any time left to play, so I decided to just give them a draw, even though ChatGPT was ready to crush them.

Comments

Ty (3 hr ago), re: Going insane:
Did you try prefixing each of the sequence prompts with the original instructions?

M. Han (4 hr ago):
Excellent!
As for the why of GPT4: it probably hadn't encountered much chess in training, whereas GPT3.5 may have had more exposure to the game. A couple of days ago, I tried to play a game of Go using ASCII characters, and it was actually able to draw and keep the state of the board for maybe a couple of moves, but then it completely misplaced previous moves, so I ended the experiment early. I did notice that it was aware of the rules; there probably just weren't enough tokens to keep the board state. Your approach of putting the full record of moves in the prompt is probably the way to go forward with this. Thanks for sharing your experiment.

(c) 2023 Dmitri Brereton