Evolving Chess Puzzles. An exploration of Evolutionary AI | by Robert Elmes | Mar, 2024

An exploration of Evolutionary AI

A chess puzzle, generated using the theory of evolution. Checkmate in 2 moves for white…

Evolutionary Algorithms (EAs) are a subset of AI that solve problems using methods inspired by biological evolution. From optimizing neural networks to resource scheduling, they have a stunning range of applications in the real world. Their beauty emerges through a shift in focus in what’s required to solve a problem. Instead of describing the steps required to reach a goal, EAs describe what the goal looks like.

In this article I will explore how we can utilize this fantastic AI to generate chess puzzles, the benefits it provides, and the drawbacks we need to consider.

A chess puzzle is a legal chess position, where one unique combination of moves results in a win, often ending in a checkmate. They are typically found by analysing databases of competitive games between human players.

By generating my own puzzles using nothing but code, randomness, and a sprinkle of biology, an interesting, diverse database of puzzles can be created. Lets explore how.

Evolutionary Algorithms typically work by randomly generating a large population of results, then picking the ‘fittest’ results using a heuristic and finally taking those ‘fittest’ results and generating subsequent random populations. They are inspired by Darwin’s theory of natural selection, where the animals in a population which are more likely to survive are also more likely to pass on their traits to the next generation. After many generations, sometimes hundreds of thousands, the population converges on an optimal result. So how can we apply this to chess?

With chess, we can create a population of random legal positions by simulating games where the program takes it in turns to play random moves for black and white a random number of times. By repeating this process tens of thousands of times, large samples of random positions can be analyzed for fitness.

Below, you can see a function from my Board class, which returns a list of moves.

public List<(int[] from, int[] to)> GetAllPotentialMoves(Colour currentColour) 
{
var activePieces = ActivePieces.Find(p => p.colour == currentColour);
var allLegalMoves = new List<(int[] from, int[] to)>();

foreach (var piece in activePieces.pieces)
{
var moves = piece.GetLegalMoves(this);

allLegalMoves.AddRange(moves);
}

return allLegalMoves;
}

Once a population of positions has been generated, the real tricky bit starts. The key to any Evolutionary Algorithm is how you evaluate your heuristic. In my case, only positions where a single solution leading to a checkmate were considered for a puzzle. After narrowing those results down, heuristic is a measure of how difficult it is to choose the correct moves to win the game. But how can a computer program estimate how difficult it is for a human to interpret a chess position?

A puzzle generated using a heuristic favoring knights on the board. Checkmate in 2 moves.

One option is to look at the structure of the puzzle. Is the king safe? Are there moves that don’t solve the puzzle, but look good? Do we sacrifice any material? What pieces are we moving? By evaluating many factors, we can create a measure of difficulty. The issue with this approach is it’s really hard to decide how to create a final score from so many factors. Rigid rules also completely ignore biases in human perception. It might be that even subtle changes to a chess position make it much harder for some individuals to pick the correct move.

So, how can we get a better idea of human performance? By utilizing large databases filled with real games, machine learning models have been trained to play chess like players of certain levels. Through these models we can get a better idea how players of different abilities might attempt a puzzle. Can an AI trained on 1200 rated players solve the puzzle? What about 1600, 1900? The benefit of this approach is it delves deeper into the minds of real players. However, machine learning models are not without their drawbacks. These AIs don’t play like a real player, they play like an approximation of a player. They’re also trained on real, regular games, meaning they might be unreliable evaluating randomized chess positions.

By combining the machine learning models with complex and detailed rule based evaluation, I created a best of both worlds type scenario. A heuristic that both understands the composition of the puzzle, whilst at the same time considering how humans might approach it.

Once the best puzzles in a population have been found, the next step is to create new generations. This can be done through many evolution inspired techniques. I chose to use crossover and mutation.

Crossover involves randomly merging the features of two results in the hope you might end up with the best features of both. We can cross over similar chess positions by going back a number of moves to a shared starting place, then picking legal moves used to reach each result. Perhaps moving the queen gave one puzzle a really good property, and moving the knight made another puzzle interesting. By combining both of these features we create an even more compelling problem.

Similarly, we can mutate puzzles by backtracking and then going forwards a number of moves. Depending on the number of moves you go backwards and forwards it can change the puzzle subtly or massively. Too much mutation and you can find the algorithm never improving, too little and your best result could converge on a single value too quickly.

The most common issue with Evolutionary Algorithms is converging too fast. Initially, the puzzles I was generating stopped improving after only a few generations. In the real world, physical boundaries such as mountains, deserts and seas have prevented populations from crossing over their DNA, allowing genetic diversity to be preserved. Without enough genetic diversity, a population won’t evolve vary far. By running smaller populations of chess puzzles in parallel for a little while, I gave them breathing room enough to maintain some diversity and avoid converging too early.

Evolutionary Algorithms can also be very slow. Chess is certainly no exception. Running heuristic evaluation on millions of chess positions requires a considerable amount of processing. Generally, the longer you run a chess engine on a position the more accurate it can predict the next best move. By finding the sweet spot in time spent analysing each position, picking out the most promising ones and looking at them in much more detail, I could optimise the time a reasonable amount. Deciding when to stop generating is also crucial. If a sample has stopped improving for several generations then perhaps it’s best to start again with a new random population, as it may be unable to improve much further. After countless optimisations, my home PC is able to generate over 1000 challenging puzzles per day using evolution.

Finally, diagnosing errors can be incredibly difficult. With many programs you can expect certain outputs given certain inputs. With evolution it’s a different kettle of fish. I spent a lot of time scratching my head wondering why my population was converging too quickly. Was it position generation? Was it the evolutionary methods, perhaps the heuristic? It can be easy to not even notice if some things aren’t working as intended when the expected output of a program can not be clearly defined.

However, issues aside, the power and potential of this AI technique shines bright for all to see. Using just my old PC I’ve been able to generate almost 50,000 chess puzzles in 3 months, containing an abundance of weird and wonderful positions.

The random nature of the algorithm means that it creates an incredibly colourful and diverse set of puzzles. Interesting tactical problems we rarely see in chess such as queen sacrifices, knight promotions and en passant are easy to find using evolution, but difficult using databases of real games. However, the nonsensical nature of the puzzles makes them less applicable to real world scenarios. Although great fun, an argument could be made that puzzles based on real games are better for learning common patterns in chess games.

As well as being incredibly productive, the algorithm is also exceptionally flexible. Shatranj, lopsided chess boards, it’s easy to extend the EA to work with any derivative of chess. This extendable nature is where the evolutionary technique really excels. You just can’t do this with databases of games, as they simply don’t exist!

A Shatranj puzzle generated by the algorithm. Can you checkmate the white king in 2 moves?

Although a forgotten corner of AI to many, I’ve shown how evolution can be used to create a novel solution to a real world problem. There’s much unexplored potential with this technology. With generative AI on the rise, I wonder what other funky applications people will find for EAs in the future…

You can experience the puzzles for yourself on my website, chesspuzzler.com.

Unless otherwise noted, all images are by the author.