[HN Gopher] How to Build a Chess Engine and Fail
       ___________________________________________________________________
        
       How to Build a Chess Engine and Fail
        
       Author : xlinux
       Score  : 100 points
       Date   : 2024-11-19 06:38 UTC (16 hours ago)
        
 (HTM) web link (obrhubr.org)
 (TXT) w3m dump (obrhubr.org)
        
       | zelphirkalt wrote:
        | All this, I guess, comes after laying the groundwork, like
        | implementing a bitboard representation or something similar,
        | and implementing the logic for telling whether a move is
        | valid or invalid. That in itself is already a lot of work. If
        | I understand correctly, the idea here is "merely" writing the
        | engine and taking everything else as a given.
        
         | obrhubr wrote:
          | The game's implementation itself was provided as part of
          | the competition by Sebastian Lague. I completely agree that
          | writing the move logic, validation, etc. is a difficult
          | undertaking, especially when it comes to optimisation,
          | which is what allows the bots built on top to perform well.
        
           | zelphirkalt wrote:
           | That makes sense for such a competition. Thanks for making
           | this even clearer!
           | 
            | Of course another interesting competition could be to
            | develop _all_ of a chess game and engine and see how
            | little code people can get away with. But that would be
            | an entirely different competition.
        
           | james_marks wrote:
              | Is there not an open standard for this? I glanced and
              | didn't see anything, but it seems crazy that every
              | chess project would need to define the game itself.
        
             | qsort wrote:
              | There are open components for generic purposes, e.g.
              | draw the board given a PGN, show the moves, check if
              | moves are legal and so on, but they aren't suitable for
              | engine internals. There are several bit tricks for
              | storing positions, hashing schemes for transposition
              | tables and the like that are key to unlocking the
              | maximum possible performance, and they are usually tied
              | to the engine implementation.
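              | 
              | To make the hashing point concrete, a minimal sketch of
              | Zobrist hashing (illustrative only, not tied to any
              | particular engine): each (piece type, square) pair gets
              | a random 64-bit key, the keys of occupied squares are
              | XORed together, and a move updates the hash with two
              | XORs instead of a full rescan, which is what makes
              | transposition-table lookups cheap.
              | 
              |     import random
              | 
              |     random.seed(42)
              |     # one 64-bit key per (piece type, square):
              |     # 12 piece types x 64 squares
              |     KEYS = [[random.getrandbits(64)
              |              for _ in range(64)]
              |             for _ in range(12)]
              | 
              |     def zobrist(piece_on):
              |         # piece_on: dict square -> piece index 0..11
              |         h = 0
              |         for sq, piece in piece_on.items():
              |             h ^= KEYS[piece][sq]
              |         return h
              | 
              |     def after_move(h, piece, from_sq, to_sq):
              |         # incremental update: two XORs per move
              |         h ^= KEYS[piece][from_sq]
              |         h ^= KEYS[piece][to_sq]
              |         return h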
        
       | vunderba wrote:
        | As part of my undergrad work, I used principles similar to
        | those in the article (steady-state genetic algorithms) to
        | create a bot capable of playing Reversi, where the fitness
        | function was loosely defined as a set of "weights" on each
        | square of the 8x8 board. These were used as part of the
        | evaluation function in the classic minimax algorithm.
       | 
       | We trained over the course of 5-6 days, and the end generation
       | was capable of beating an intermediate player but got completely
       | destroyed by experts. It was a fun project!
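        | 
        | For a rough idea of what that looked like, a sketch with
        | hand-picked weights (in the GA, the 64 numbers below are the
        | genome being evolved; the board encoding here is assumed):
        | 
        |     # illustrative values; the GA evolves these
        |     WEIGHTS = [
        |         [100, -20, 10,  5,  5, 10, -20, 100],
        |         [-20, -50, -2, -2, -2, -2, -50, -20],
        |         [ 10,  -2,  1,  1,  1,  1,  -2,  10],
        |         [  5,  -2,  1,  0,  0,  1,  -2,   5],
        |         [  5,  -2,  1,  0,  0,  1,  -2,   5],
        |         [ 10,  -2,  1,  1,  1,  1,  -2,  10],
        |         [-20, -50, -2, -2, -2, -2, -50, -20],
        |         [100, -20, 10,  5,  5, 10, -20, 100],
        |     ]
        | 
        |     def evaluate(board, me, opp):
        |         # board[r][c] is me, opp or None
        |         score = 0
        |         for r in range(8):
        |             for c in range(8):
        |                 if board[r][c] == me:
        |                     score += WEIGHTS[r][c]
        |                 elif board[r][c] == opp:
        |                     score -= WEIGHTS[r][c]
        |         return score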
        
       | janalsncm wrote:
        | I am curious why the author chose a genetic algorithm rather
        | than standard backprop to distill the eval. Logistic
        | regression seems like a pretty reasonable choice, and it'll
        | be a lot faster than a genetic algorithm. Add an L1 penalty
        | for sparsity.
       | 
        | In the past I've tried distilling not just the eval function
        | but the outcome of the search (something like depth 20) into
        | a neural net. It kind of works, but not very well until the
        | net is pretty big. Deepmind did something similar.
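        | 
        | A minimal sketch of that setup with scikit-learn (placeholder
        | data only; the features, labels and C value are assumptions,
        | not anyone's actual pipeline):
        | 
        |     import numpy as np
        |     from sklearn.linear_model import LogisticRegression
        | 
        |     # X: one row of hand-crafted features per position
        |     # y: 1 if the side to move went on to win, else 0
        |     X = np.random.rand(10000, 20)
        |     y = np.random.randint(0, 2, 10000)
        | 
        |     # smaller C -> stronger L1 penalty -> sparser weights
        |     model = LogisticRegression(penalty="l1",
        |                                solver="liblinear", C=0.1)
        |     model.fit(X, y)
        |     weights = model.coef_[0]  # many entries end up 0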
        
         | obrhubr wrote:
          | I didn't think of that, but I guess the conclusion about
          | the model's performance would have remained the same.
        
           | janalsncm wrote:
            | Reading over your article again, it seems like you made
            | the same "mistake" as I did. In other words, the evals
            | you're seeing in the Lichess PGN aren't the raw outputs
            | of the Stockfish evaluation function; they're the output
            | of the search, which is a highly nonlinear function of
            | millions of evals. If I recall from their docs, it's
            | something like depth 18, so your chances of distilling
            | it with 20k parameters are essentially zero.
           | 
           | (I put mistake in quotes here because Deepmind showed that
           | with a server farm of TPUs it is possible to distill the
           | search as well. So it's not _impossible_ per se.)
           | 
           | But that's ok! You've created a super fast evaluation
           | function. Instead of trying to distill the depth 18 output,
           | it will be much more realistic to distill the depth zero
           | output, the NNUE. If you rerun those FENs in Stockfish you
           | can pretty quickly create an easier dataset.
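            | 
            | A rough sketch of that relabelling step with python-chess
            | (assumes a Stockfish binary on the PATH; the depth and
            | mate_score choices are arbitrary):
            | 
            |     import chess
            |     import chess.engine
            | 
            |     engine = chess.engine.SimpleEngine.popen_uci(
            |         "stockfish")
            | 
            |     def shallow_eval(fen, depth=1):
            |         board = chess.Board(fen)
            |         info = engine.analyse(
            |             board, chess.engine.Limit(depth=depth))
            |         # centipawns from White's side; mates are
            |         # mapped to a large fixed value
            |         return info["score"].white().score(
            |             mate_score=100000)
            | 
            |     fens = []  # your existing FEN strings go here
            |     labels = [shallow_eval(fen) for fen in fens]
            |     engine.quit()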
        
       | bob1029 wrote:
       | > To be exact every participant has a maximum of 1024 tokens at
       | his disposal to craft the best chess bot they can.
       | 
       | I'd be quite tempted to try linear genetic programming with a
       | variant of a Brainfuck-style UTM for this kind of constraint.
       | Assuming 1024 tokens = 1024 bytes = 1024 instructions, I think
       | there is _some_ degree of performance that is feasible.
       | 
       | This is really interesting to me because hand-coding BF to do
       | something like chess is absolutely beyond me. I wouldn't even
       | know where to begin. A LGP-coded BF program would almost
       | certainly outperform anything I could design with my own hands.
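        | 
        | For anyone curious what the substrate might look like, a
        | minimal bounded Brainfuck interpreter (a sketch only: the
        | contest actually imposed C#, and how board state gets fed in
        | and moves read out is left completely open here). The
        | evolved genome would be the `code` string:
        | 
        |     def run_bf(code, inp=(), max_steps=100_000, cells=4096):
        |         # assumes balanced brackets; evolved programs
        |         # would need to be repaired or rejected otherwise
        |         jumps, stack = {}, []
        |         for i, ch in enumerate(code):
        |             if ch == "[":
        |                 stack.append(i)
        |             elif ch == "]":
        |                 j = stack.pop()
        |                 jumps[i], jumps[j] = j, i
        |         tape = [0] * cells
        |         out, inp = [], list(inp)
        |         ptr = ip = steps = 0
        |         while ip < len(code) and steps < max_steps:
        |             ch = code[ip]
        |             if ch == ">":
        |                 ptr = (ptr + 1) % cells
        |             elif ch == "<":
        |                 ptr = (ptr - 1) % cells
        |             elif ch == "+":
        |                 tape[ptr] = (tape[ptr] + 1) % 256
        |             elif ch == "-":
        |                 tape[ptr] = (tape[ptr] - 1) % 256
        |             elif ch == ".":
        |                 out.append(tape[ptr])
        |             elif ch == ",":
        |                 tape[ptr] = inp.pop(0) if inp else 0
        |             elif ch == "[" and tape[ptr] == 0:
        |                 ip = jumps[ip]
        |             elif ch == "]" and tape[ptr] != 0:
        |                 ip = jumps[ip]
        |             ip += 1
        |             steps += 1
        |         return out  # the LGP fitness function scores this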
        
         | obrhubr wrote:
          | This is highly specific to C#, which was the language
          | imposed on all participants. But I agree that some
          | languages might be especially well suited to these kinds
          | of tasks, and it would be interesting to see which ones.
        
       | senthilnayagam wrote:
        | I built a set of simple games in pygame, with the code and
        | solvers generated by AI.
        | 
        | I have implemented 2048, Minesweeper, Sudoku and chess. The
        | first three are single-player games, and I have made good
        | progress on them.
        | 
        | I still don't understand the UCI interface and haven't
        | thought through solving chess. I hope to take another try at
        | it this weekend.
       | 
       | https://github.com/muonium-ai/simplegames
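        | 
        | For reference, the engine side of UCI is small. A minimal
        | sketch using python-chess to track the position (the engine
        | name and the random move choice are placeholders):
        | 
        |     import random
        |     import sys
        | 
        |     import chess
        | 
        |     board = chess.Board()
        |     for line in sys.stdin:
        |         cmd = line.split()
        |         if not cmd:
        |             continue
        |         if cmd[0] == "uci":
        |             print("id name toy-engine")
        |             print("uciok", flush=True)
        |         elif cmd[0] == "isready":
        |             print("readyok", flush=True)
        |         elif cmd[0] == "position":
        |             if cmd[1] == "startpos":
        |                 board = chess.Board()
        |                 moves = cmd[3:]
        |             else:  # "fen" followed by six FEN fields
        |                 board = chess.Board(" ".join(cmd[2:8]))
        |                 moves = cmd[9:]
        |             for mv in moves:
        |                 board.push_uci(mv)
        |         elif cmd[0] == "go":
        |             mv = random.choice(list(board.legal_moves))
        |             print("bestmove " + mv.uci(), flush=True)
        |         elif cmd[0] == "quit":
        |             break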
        
       | snickerbockers wrote:
       | This is something that's been on my bucket list for a while now,
       | although I'd do it using "normal" programming instead of DL.
       | 
        | I feel like if you can brute-force every possible board
        | configuration for the next 3 turns and then make the move
        | that leads to more desirable outcomes, that ought to be
        | enough to thrash most amateurs. Probably not good enough to
        | beat somebody who actually understands chess tactics on a
        | deeper level, but I expect most non-masters are just "winging
        | it" and making things up as they go along, so a machine that
        | can think farther ahead than you would win more often than
        | not.
       | 
       | But like I said, this is all just me fantasizing about a project
       | I haven't even started yet so my hypothesis could easily be
       | wrong.
        
         | immibis wrote:
          | I did exactly this when the Laser chess variant was posted
          | on Hacker News about a year ago. It does indeed work pretty
          | well. Players eventually learn patterns that look good at 3
          | moves but not at greater depth, though.
          | 
          | It's how real chess engines like Stockfish work, but they
          | are highly optimized so they can search much greater
          | depths.
        
         | vunderba wrote:
         | > bruteforce every possible board configuration for the next 3
         | turns and then make the move that leads to more desirable
         | outcomes
         | 
          | I'd recommend familiarizing yourself with the minimax
          | algorithm, since that's exactly what it does: "desirable
          | outcome" is essentially the _tunable_ evaluation part of
          | the equation.
         | 
         | https://en.wikipedia.org/wiki/Minimax
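          | 
          | A bare-bones sketch of it with python-chess, with the
          | evaluation function left as a parameter (janalsncm's reply
          | below is about what goes in it):
          | 
          |     import chess
          | 
          |     def minimax(board, depth, evaluate):
          |         # evaluate() scores a position for White;
          |         # that's the tunable "desirable outcome" part
          |         if depth == 0 or board.is_game_over():
          |             return evaluate(board), None
          |         best_move = None
          |         maxing = board.turn == chess.WHITE
          |         best = -float("inf") if maxing else float("inf")
          |         for move in board.legal_moves:
          |             board.push(move)
          |             score, _ = minimax(board, depth - 1, evaluate)
          |             board.pop()
          |             if (maxing and score > best) or \
          |                (not maxing and score < best):
          |                 best, best_move = score, move
          |         return best, best_move
          | 
          |     # "3 turns" is ~6 plies; real engines add alpha-beta
          |     # pruning so the same work reaches far deeper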
        
         | janalsncm wrote:
         | > make the move that leads to more desirable outcomes
         | 
         | This is kind of begging the question, right? How do you know
         | what a "desirable" outcome is, other than mate-in-N positions?
         | (In your case, N<=3.) The point of an evaluation function is to
         | take a position and return a value which describes how
         | desirable it is.
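          | 
          | A toy example of such a function with python-chess
          | (material plus a crude mobility bonus; the mate and draw
          | cases are the only values we actually know to be right):
          | 
          |     import chess
          | 
          |     VALUES = {chess.PAWN: 100, chess.KNIGHT: 320,
          |               chess.BISHOP: 330, chess.ROOK: 500,
          |               chess.QUEEN: 900}
          | 
          |     def evaluate(board):
          |         # centipawns from White's point of view
          |         if board.is_checkmate():
          |             return -100000 if board.turn else 100000
          |         if board.is_stalemate():
          |             return 0
          |         score = 0
          |         for p, v in VALUES.items():
          |             score += v * len(board.pieces(p, chess.WHITE))
          |             score -= v * len(board.pieces(p, chess.BLACK))
          |         # small bonus for having more legal moves
          |         mob = board.legal_moves.count()
          |         score += 2 * mob if board.turn else -2 * mob
          |         return score
          | 
          |     print(evaluate(chess.Board()))  # 40: equal material,
          |                                     # 20 moves for White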
        
       | vikhack wrote:
        | I don't really know how to do that, but I'm putting in the
        | effort.
        
       | skeltoac wrote:
       | I especially enjoyed the link to The Bitter Lesson by Rich
       | Sutton, which I hadn't read before. Now I wonder what
       | "discoveries" have been built into today's AI models and how they
       | might come to be detrimental.
       | 
       | http://www.incompleteideas.net/IncIdeas/BitterLesson.html
        
         | PaulHoule wrote:
         | Has there ever been a serious effort to play chess by "rule-
         | based" methods as opposed to search?
         | 
         | (... other than the evaluation function being based on
         | handwritten or learned rules)
        
           | janalsncm wrote:
           | In the past, yes. That was essentially the approach up until
           | the 1980s. Computers were too slow to run millions of
           | evaluations per move, so you got a lot of efforts in a more
           | heuristic-driven direction.
           | 
           | Examples of heuristic engines are Hans Berliner's CAPS-II and
           | PARADISE by David Wilkins. PARADISE used a database of 200
           | rules.
           | 
           | There are also engines that try to retroactively apply rules
           | to Stockfish evals but they're fairly confusing in my
           | experience. It's like being told the answer and having to
           | come up with a reason after the fact.
        
         | janalsncm wrote:
          | Probably LLM-maximalist ideas that posit infinite "scaling
          | laws" for LLMs (they are not laws), leading to ridiculous
          | conclusions like the idea that building a $1 trillion
          | cluster is the fastest way to AGI. People like Leopold
          | Aschenbrenner are in this camp.
         | 
          | Imagine if LLMs were the only way we had to play chess.
          | You'd need a centralized server, and peak performance
          | wouldn't even best a grandmaster. You'd spend $1 trillion
          | building a super cluster because that's all you know.
         | 
          | <-- This is where AI is today.
         | 
          | And then some startup creates Stockfish, a chess engine
          | better than any LLM or grandmaster, one that can run on a
          | smartphone.
        
       | deadbabe wrote:
       | Why build a chess engine these days, just use an LLM?
        
         | emptybits wrote:
         | Why consume when you can create?
         | 
         | Building is often more fun and almost always more educational
         | than just using. Even in "failure". :-)
        
         | ziddoap wrote:
         | Fun, a learning experience, practicing a skill set, curiosity,
         | etc.
         | 
         | Take your pick.
        
         | achierius wrote:
          | LLMs are actually quite bad at chess, even compared to
          | human players, and certainly compared to proper modern
          | chess engines.
        
           | PaulHoule wrote:
            | They usually struggle to consistently generate valid
            | moves.
            | 
            | And that's pivotal.
            | 
            | If you have a program that always makes valid moves and
            | gives up when it has lost, you've written a proper
            | chess-playing program. It may play badly, but it plays.
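            | 
            | That bar is genuinely low for classical code; a sketch
            | with python-chess:
            | 
            |     import random
            |     import chess
            | 
            |     # always plays a legal move and stops when the game
            |     # is over, which already clears the bar above
            |     board = chess.Board()
            |     while not board.is_game_over():
            |         moves = list(board.legal_moves)
            |         board.push(random.choice(moves))
            |     print(board.result())  # "1-0", "0-1" or "1/2-1/2"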
        
         | janalsncm wrote:
          | Your question sounds flippant but it's actually quite deep.
          | Why aren't LLMs good at chess? The answer is likely that
          | being good at chess at some level requires search (and
          | probably some ordering constraint on your evaluation
          | function too that I'm not smart enough to figure out).
          | 
          | LLMs aren't searching. They are memorizing. This explains
          | their extremely poor performance on out-of-domain
          | positions, whereas Stockfish can easily understand them.
        
       | binarymax wrote:
       | Doesn't look like failure to me - looks like a fun project that
       | disproved your distillation hypothesis!
       | 
       | Great write up and thanks for sharing.
        
       ___________________________________________________________________
       (page generated 2024-11-19 23:00 UTC)