If you have an idea that is surely bad, do not take time to see how terrible it is.

Consider the example: It is not necessary to evaluate the branch at the right (with the 8) since the 1 is already lower than the 2 and will not be chosen.

Procedure To perform minimax search with the ALPHA-BETA procedure,

- If the level is the top level, let alpha be negative maximum and let beta be positive maximum.
- If the limit of search has been reached, compute the value of the current position relative to the appropriate player. Report the result.
- If the level is a minimizing level,
- Until all children are examined with ALPHA-BETA or until alpha is equal to or greater than beta,
- Use the ALPHA-BETA procedure, with the current alpha and beta values, on a child; note the value reported.
- Compare the value reported with the beta value; if the reported value is smaller, reset beta to the new value.
- Report beta.
- Otherwise, the level is a maximizing level:
- Until all children are examined with ALPHA-BETA or alpha is equal to or greater than beta,
- Use the ALPHA-BETA procedure, with the current alpha and beta value, on a child; note the value reported.
- Compare the value reported with the alpha value; if the reported value is larger, reset alpha to the new value.
- Report alpha.

What are the maximum savings possible?

Suppose the tree is ordered such that: Best move for both players is always the leftmost node at each decision point.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 * * * * * * * * * * *

How many static evaluations are need to determine this.

If b is the branching factor (3 above) and d is the depth (3 above)

-----------------------------------------------------

s = 2b^{d/2} - 1 IF d even

s = b^{(d+1)/2} + b^{(d-1)/2} - 1 IF d
odd

-----------------------------------------------------

For our tree d=3, b=3 and so s=11.

This is only for the perfectly arranged tree. It gives a lower bound on the number of evaluations.

This is approximately 2b^{d/2}

The worst case is b^{d}

In practice we do end up nearer the best case rather than the worst case.

The depth of search tree is often called the **ply**.

Alpha-Beta search reduces the exponent on the expression for the number of nodes searched.

Other methods to handle exponential growth:

Minimax with lookahead 1 Minimax with lookahead 2 Minimax with lookahead 3 Minimax with lookahead 4 . . . etc. until time runs out

If time runs out at level M then use results at level M - 1.

This will be useful when there is a time constraint since we have always seen as far ahead as possible in the time.

How many more nodes are we evaluating if we apply the static evaluator at every ply of the tree

b = branching factor d = level/depth

number of nodes at bottom (leaves) of tree depth d is
b^{d}

number of nodes in the rest of the tree tree is

b^{0}+ b^{1}+b^{2}+........+b^{d-1}

=(b^{d} - 1)/(b - 1)

The ratio of the number of nodes in the bottom level to the number of nodes up to the bottom level is:

b^{d}(b- 1)/(b^{d} - 1)

This is approximately b-1.

e.g. if b=16 then the bottom level has about 15 times as many nodes as the rest of the tree.

If b is not very small, you don't lose much by evaluating all nodes versus only leaf nodes.

You search for N ply in a tree, and find a very good move.

but...

Perhaps, if you had searched just one ply further you would have discovered that this move is actually very bad.

In general:

Your analysis may stop just before your opponent captures one of your pieces.

or

Your analysis stops just before you capture one your opponent's pieces.

This is called the horizon effect: a good (or bad) move may be just over the horizon.

The Singular Extension Heuristic:

Search should continue as long as one moves static value stands out from the rest.

If we don't use this heuristic, we risk harm from the Horizon Effect.

Use a heuristic (the static evaluation perhaps) to *order
moves*.

Tapered Search: Use a fast static evaluator to rank each node's children.

r(child) = r(parent) - r(child among siblings)

Pattern recognition involves labeling a pattern.

e.g.

Picture of Tree -> "Tree"

Input in this case is a digitised image.

There are a large number of possible input images. For example

- An image is 256*256 pixels (picture elements)
- Each pixel is 256 levels of grey (colour is not important for recognising trees)

then there are 256^{65536} Possible images. We
can think of this as a 65536 dimensional space where a point in this space
represents a possible image or *pattern.*

This space is called the *pattern space.*

The task of a pattern recognition program is to label points in the pattern space. If we can label all the points in pattern space that are trees then we will be able to recognise trees.

There may be a number of different things we want to recognise
(label). Each of these is called a *class.*

It is obviously infeasible to label all the points so we have to label areas. For example: Rugby Players and Ballet Dancers.

Use two measurements height and weight. Rugby players are heavy and short and Ballet dancers are generally tall and light.

Simplest Method

- Find an
**ideal example**for each class. - Define a
**measure of similarity**in the pattern space.

For an unknown pattern we can now measure its similarity with all the stored examples and choose the most similar.

**Problem:** The classes may occupy areas in the pattern
space such that the similarity measure does not work.

e.g.

Q and O

THE and CAT