Games

Games require different search methods since we are often dealing with an adversary.

Game Tree

A tree in which the nodes denote board configurations, and the branches indicate how one board configuration can be transformed into another by a single move.  A single move by one player is called a ply; a pair of plies, one from each player, makes up one full move.

We need a way to rank game outcomes. The ranking function is called a utility function.  A simple utility function might give -1, 0 or +1 for a loss, a draw and a win respectively.
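For noughts and crosses (considered next), a minimal sketch of such a utility function, scored from X's point of view, might look like the following (the 9-character board encoding is an assumption of this sketch, not part of the notes):

    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

    def utility(board):
        # board is a 9-character string of 'X', 'O' and ' ' squares.
        for a, b, c in LINES:
            if board[a] == board[b] == board[c] != ' ':
                return 1 if board[a] == 'X' else -1   # win for X, else loss
        return 0   # no completed line: a draw (assuming a terminal position)

    print(utility("XXX" "OO " "   "))   # 1: X has completed the top row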
Consider noughts and crosses with two players called MAX and MIN. If MAX is X and goes first then the tree is as follows.

MAX will always try to maximise the utility function of the terminal state, and MIN will always try to minimise it.
An exhaustive search of the game tree for a solution is usually not possible.
Consider the following simple game tree.
The utility function is only applied to terminal nodes. If MAX makes move A1 then MIN should make move A11. Thus the result of making move A1 is a value of 3 for the utility function. Similarly, A2 leads to a value of 2 and A3 to a value of 2. MAX wants to maximise the utility function, and so A1 is the best move to make.

Minimax consists of 5 steps.

  1. Generate the complete game tree.
  2. Apply the utility function to all the terminal states.
  3. Use the utilities of the terminal states to calculate a utility value for their parents, taking the maximum or the minimum depending on whose move it is at that depth.
  4. Continue up to root node.
  5. Choose the move with the highest backed-up value at the root node.
The chosen move is called the minimax decision because it maximises the utility under the assumption that the opponent will play perfectly to try to minimise it.
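A minimal Python sketch of these steps, with the game tree represented as nested lists whose leaves are terminal utilities. The leaf values are illustrative, chosen only so that the backed-up values match the example above (A1 gives 3, A2 and A3 give 2):

    def minimax(node, maximising):
        if isinstance(node, int):       # terminal state: apply the utility function
            return node
        values = [minimax(child, not maximising) for child in node]
        return max(values) if maximising else min(values)   # back the value up

    # Moves A1, A2, A3 each lead to a MIN node with three terminal children.
    tree = [[3, 12, 8],    # MIN will choose 3
            [2, 4, 6],     # MIN will choose 2
            [14, 5, 2]]    # MIN will choose 2

    values = [minimax(child, False) for child in tree]
    print(values)                       # [3, 2, 2]
    print(values.index(max(values)))    # 0, i.e. A1 is the minimax decision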

If the maximum depth of the tree is m and there are b moves at each node, then the time complexity of minimax is O(b^m).  The algorithm is depth first, so the memory requirement is only O(bm).

Evaluation functions

Minimax assumes that we can search all the way to the terminal states, but this is often not practical.

Instead, we can use a heuristic evaluation function to estimate which moves lead to better terminal states. A cutoff test must then be used to limit the search.

Choosing a good evaluation function is very important to the efficiency of this method. The evaluation function must agree with the utility function on terminal states, and it must be a good predictor of the terminal values. If the evaluation function were infallible then no search would be necessary: just choose the move that gives the best position. The better the evaluation function, the less search needs to be done.

e.g. the following shows some evaluations of chess positions.

The evaluation function is often a linear combination of features.
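Concretely, Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s), where the fi are features of the position and the wi their weights. A minimal sketch of such a function for chess, using material count as the only kind of feature (the piece weights and the board encoding are illustrative assumptions, not from these notes):

    # Conventional material values; upper case is MAX (White), lower case MIN (Black).
    WEIGHTS = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

    def evaluate(pieces):
        # pieces: the piece letters currently on the board, e.g. "QRPPkq".
        score = 0
        for p in pieces:
            w = WEIGHTS.get(p.upper(), 0)   # kings carry no material weight
            score += w if p.isupper() else -w
        return score

    print(evaluate("QRPP" "kq"))   # 9 + 5 + 1 + 1 - 9 = 7: White is ahead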

Cutting off search

The easiest ways to stop the search are to impose a fixed depth limit or to use iterative deepening.
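A sketch of a fixed-depth cutoff, reusing the nested-list trees from earlier. At the cutoff a static evaluation stands in for the true utility; the stand-in heuristic used here (the mean of the leaves below a node) is purely illustrative:

    def static_eval(node):
        if isinstance(node, int):
            return node
        return sum(static_eval(c) for c in node) / len(node)   # crude estimate

    def minimax_cutoff(node, maximising, depth):
        if isinstance(node, int):          # true terminal state
            return node
        if depth == 0:                     # cutoff test: fixed depth reached
            return static_eval(node)       # estimate instead of searching on
        values = [minimax_cutoff(c, not maximising, depth - 1) for c in node]
        return max(values) if maximising else min(values)

    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
    print(minimax_cutoff(tree, True, 1))   # searches one ply, then estimates

For iterative deepening, this is simply called with depth = 1, 2, 3, ... until the time budget runs out, keeping the result of the last completed iteration.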

Heuristic Continuation

You search for N ply in a tree and find a very good move. But perhaps, if you had searched just one ply further, you would have discovered that this move is actually very bad.

In general:

The analysis may stop just before your opponent captures one of your pieces, or just before you capture one of your opponent's pieces.

This is called the horizon effect: a good (or bad) move may be just over the horizon.

The Singular Extension Heuristic:

Search should continue as long as one move's static value stands out from the rest. If we don't use this heuristic, we risk being misled by the horizon effect.
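A sketch of the test itself; the margin used to decide that one value "stands out" is an assumption made for illustration:

    MARGIN = 1.5   # hypothetical threshold for a value standing out

    def extend_search(static_values):
        # Continue past the cutoff while the best move's static value
        # stands out from all the others by at least MARGIN.
        best, *rest = sorted(static_values, reverse=True)
        return not rest or best - rest[0] >= MARGIN

    print(extend_search([9.0, 3.5, 3.0]))   # True: 9.0 is singular, keep searching
    print(extend_search([4.0, 3.8, 3.0]))   # False: no singular move, cut off here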

e.g. consider the following chess position. Black is ahead in material, but if White can reach the eighth rank with its pawn then White can win. Black can stall this for some time by checking the white king, and so a depth-limited search will never see that this is a bad position for Black.

Alpha Beta Pruning

Using the minimax algorithm allows a chess-playing program to look ahead about 3 or 4 ply. This is roughly as good as an intermediate human player, but still fairly easy to beat. Minimax, however, is wasteful because it evaluates some branches of the game tree that it need not. The principle behind alpha-beta pruning is:

If you have an idea that is surely bad, do not take time to see how terrible it is.

Consider the following example:

Search proceeds exactly as for minimax, evaluating A11, A12, A13 and hence A1, until, after looking at A21, we find that A2 must have a utility of at most 2. We now know that A2 is a bad choice because A1 has a higher utility, and we need not evaluate any more branches below A2.

The following diagram shows this for a general case.

If we have a better choice m, either at the parent node of n or at any choice point further up, then n will never be reached in actual play.

Minimax is depth first, so all that need be remembered is the best choice found so far for the player (MAX) and the best choice found so far for the opponent (MIN). These are called alpha and beta respectively.

The following algorithm illustrates this.
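A minimal Python sketch, reusing the nested-list trees from earlier. alpha holds the best value found so far for MAX, beta the best found so far for MIN, and a branch is pruned as soon as they cross:

    def alphabeta(node, maximising, alpha=float('-inf'), beta=float('inf')):
        if isinstance(node, int):                  # terminal state
            return node
        if maximising:
            value = float('-inf')
            for child in node:
                value = max(value, alphabeta(child, False, alpha, beta))
                alpha = max(alpha, value)
                if alpha >= beta:                  # MIN has a better choice above
                    break                          # prune the remaining children
            return value
        else:
            value = float('inf')
            for child in node:
                value = min(value, alphabeta(child, True, alpha, beta))
                beta = min(beta, value)
                if alpha >= beta:                  # MAX has a better choice above
                    break                          # prune the remaining children
            return value

    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
    print(alphabeta(tree, True))   # 3, found without evaluating every leaf

On the example above, the second and third leaves below A2 are never evaluated: after the first leaf (2), beta falls to 2, which is already below the value 3 that MAX has secured from A1.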

Effectiveness of alpha-beta pruning

What are the maximum savings possible?

Suppose the tree is ordered as follows:
 
 

^                ____________________________o____________________________
                /                            |                            \
v      ________o________              _______o________             ________o________
      /        |        \           /        |        \           /        |        \
^    o         o         o         o         o         o         o         o         o
   / | \     / | \     / | \     / | \     / | \     / | \     / | \     / | \     / | \
 14 15 16  17 18 19  20 21 22  13 14 15  26 27 28  29 30 31  12 13 14  35 36 37  38 39 40
  *  *  *  *         *         *  *  *                       *  *  *
Only those nodes marked (*) need be evaluated.
How many static evaluations are needed?

If b is the branching factor (3 above) and d is the depth (3 above), then the number s of static evaluations is:

s = 2b^(d/2) - 1                      if d is even

s = b^((d+1)/2) + b^((d-1)/2) - 1     if d is odd

For our tree d=3, b=3 and so s=11.
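A quick sketch that checks the formula against the tree above (s here is just a function implementing the two cases):

    def s(b, d):
        if d % 2 == 0:
            return 2 * b ** (d // 2) - 1
        return b ** ((d + 1) // 2) + b ** ((d - 1) // 2) - 1

    print(s(3, 3))   # 11, matching the starred leaves in the diagram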

This is only for a perfectly ordered tree. It gives a lower bound on the number of evaluations of approximately 2b^(d/2), i.e. the best case is O(b^(d/2)); the worst case is b^d (plain minimax).
In practice, for reasonable games, the complexity is O(b^(3d/4)). Using minimax with alpha-beta pruning allows us to look ahead about half as far again as without.
 

Games of Chance

Many games, such as backgammon, involve chance.

To represent such a game we need to add chance nodes to the game tree. In the tree below, chance nodes are shown as circles. Below each circle are the possible outcomes of the dice throw, with the probability of each throw occurring (1/36 for a double, 1/18 otherwise).
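As a quick check of those probabilities: there are 21 distinct throws of two dice (6 doubles and 15 non-doubles), and the probabilities sum to 1:

    from fractions import Fraction

    throws = [(i, j) for i in range(1, 7) for j in range(i, 7)]
    probs = [Fraction(1, 36) if i == j else Fraction(1, 18) for i, j in throws]
    print(len(throws), sum(probs))   # 21 1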


Now each position has no known outcome, only an expected outcome. The expected value for a chance node C above a MAX node is:

expectimax(C) = sum over dice throws d_i of P(d_i) * max over s in S(C, d_i) of utility(s)

where P(d_i) is the probability of throw d_i and S(C, d_i) is the set of positions reachable from C by a legal move after throw d_i. The expectimin value is defined similarly for a chance node above a MIN node.
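A minimal sketch of expectiminimax over explicit trees. A node here is either a terminal utility, a ('max', children) or ('min', children) pair, or a ('chance', [(probability, child), ...]) pair; the representation and the example values are illustrative assumptions:

    def expectiminimax(node):
        if isinstance(node, (int, float)):     # terminal utility
            return node
        kind, children = node
        if kind == 'chance':                   # back up the expected value
            return sum(p * expectiminimax(c) for p, c in children)
        values = [expectiminimax(c) for c in children]
        return max(values) if kind == 'max' else min(values)

    # MAX chooses between two moves, each leading to a coin-flip chance node.
    tree = ('max', [('chance', [(0.5, 3), (0.5, -1)]),     # expected value 1.0
                    ('chance', [(0.9, 2), (0.1, -10)])])   # expected value 0.8
    print(expectiminimax(tree))   # 1.0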

Complexity of expectiminimax

Minimax has a complexity of O(b^m), where b is the branching factor and m is the depth. If each level also contains chance nodes with n possible outcomes each, then the complexity becomes O(b^m n^m).
This extra cost can make games of chance very difficult to solve.
 

State of the art

The following diagram suggests that by 1997 a chess program would be as good as the best human. This is in fact what happened: in 1997 IBM's Deep Blue beat the world champion, Garry Kasparov.
Deep Blue had 32 PowerPC CPUs and 256 dedicated hardware devices for evaluating board positions. It also had 100 years' worth of grandmaster chess games and an opening and endgame move database.

see: http://www.chess.ibm.com/meet/html/d.3.2.html for more details of Deep Blue