Hexapawn is a simple game invented by Martin Gardner. There are relatively few possible positions, which means that it makes a nice example to explore reinforcement learning strategies employed by artificial intelligences.

Firstly, here are the rules of Hexapawn:

In our activity at science fairs, we use an analogue machine that is a collection of drawers, each representing a position in the game (if you want to be fancy you can call this the state space). Hexapawn is a very simple game, but even still you need at least 37 drawers if you are playing black, or 33 if you are playing white. We found that small storage drawers made the ideal machine, such as those found here.

Each drawer should be labelled with its corresponding
position and a token corresponding to ** every possible **
valid move from that position is placed within the drawer.

The possible positions and valid moves were computed by writing a
small computer program, which is left as an exercise for the
reader. To save you the work here are all
possible positions and corresponding moves for machines
to play white or
black. Each position is given a
number, below which is written the number of valid moves from that
position. The corresponding valid moves are marked by the same
number with a star (asterisk) next to it; for example any position
marked ^{*}1 is a valid move from position 1.

You now have all the material that you need to build your own machine. Print out and cut up the appropriate set of positions and moves and use them to label and fill the drawers. For extra durability, we laminated the paper before cutting it up.

Once you have your machine you need **train** it to play
the game and you do that by playing against it. You can play on a
portion of a chessboard, or if you are really dedicated you can cut
up an old chessboard to make a 3x3 grid, which is what we did. Part
of the fun is trying to find the correct position in the set of
drawers, or having the students work out whether it is their move or not.

Below is one way of training a machine that plays black. Of course, you can train your machine by using variations of the same basic idea which is to penalise bad moves and reward good moves. If you wish to pursue these ideas further than you can read about machine learning in general and, more specifically, reinforcement learning.

There are many possible extensions to this activity.

- Consider finding the optimum
strategy by using a systematic case-by-case analysis
rather than a probabilistic approach.

In fact, hexapawn is a solved game and it is known that for a 3x3 grid white always wins. If you have a 4x3 grid then black always wins.

- How can the strategy be applied to other simple games?
- Can you translate the algorithm being used into a computer program?
- What is the computational complexity of the algorithm?