# Hexapawn: a simple game to illustrate reinforced learning

Hexapawn is a simple game invented by Martin Gardner. There are relatively few possible positions, which means that it makes a nice example to explore reinforcement learning strategies employed by artificial intelligences.

Firstly, here are the rules of Hexapawn:

## Build your own learning machine

In our activity at science fairs, we use an analogue machine that is a collection of drawers, each representing a position in the game (if you want to be fancy you can call this the state space). Hexapawn is a very simple game, but even still you need at least 37 drawers if you are playing black, or 33 if you are playing white. We found that small storage drawers made the ideal machine, such as those found here.

Each drawer should be labelled with its corresponding position and a token corresponding to every possible valid move from that position is placed within the drawer.

The possible positions and valid moves were computed by writing a small computer program, which is left as an exercise for the reader. To save you the work here are all possible positions and corresponding moves for machines to play white or black. Each position is given a number, below which is written the number of valid moves from that position. The corresponding valid moves are marked by the same number with a star (asterisk) next to it; for example any position marked *1 is a valid move from position 1.

You now have all the material that you need to build your own machine. Print out and cut up the appropriate set of positions and moves and use them to label and fill the drawers. For extra durability, we laminated the paper before cutting it up.

## Use the class as learning machine

As an alternative, you can get a class (or two) to be the learning machine. Each student represents one or two positions and they should have cards that represent every possible valid move from their position(s). For this purpose, here are all the possible moves for white or black formatted so that each move will fill one piece of A4 paper. You can print these out and give them to the class.

## Training the machine

Once you have your machine you need train it to play the game and you do that by playing against it. You can play on a portion of a chessboard, or if you are really dedicated you can cut up an old chessboard to make a 3x3 grid, which is what we did. Part of the fun is trying to find the correct position in the set of drawers, or having the students work out whether it is their move or not.

Below is one way of training a machine that plays black. Of course, you can train your machine by using variations of the same basic idea which is to penalise bad moves and reward good moves. If you wish to pursue these ideas further than you can read about machine learning in general and, more specifically, reinforcement learning.

## Extension Activities

There are many possible extensions to this activity.

• Consider finding the optimum strategy by using a systematic case-by-case analysis rather than a probabilistic approach.

In fact, hexapawn is a solved game and it is known that for a 3x3 grid white always wins. If you have a 4x3 grid then black always wins.

• How can the strategy be applied to other simple games?
• Can you translate the algorithm being used into a computer program?
• What is the computational complexity of the algorithm?