Solving Problems by Searching

Simple reflex agents are limited because they cannot plan ahead. They also have no knowledge of what their actions do nor of what they are trying to achieve.
Now we investigate one type of goal-based agent called a problem-solving agent. This type of agent decides what to do by finding sequences of actions that lead to desirable states.
How can an agent formulate an appropriate view of the problem it faces?

Problem-Solving Agents
Intelligent agents are supposed to act in such a way that the environment goes through a sequence of states that maximizes the performance measure.
Unfortunately, this specification is difficult to translate into a successful agent design. The task is simplified if the agent can adopt a goal and aim to satisfy it.

Example Suppose the agent is in Auckland and wishes to get to Wellington. There are a number of factors to consider, e.g., the cost, speed, and comfort of the journey.

Goals such as this help to organize behaviour by limiting the objectives that the agent is trying to achieve. Goal formulation, based on the current situation, is the first step in problem solving. In addition to formulating a goal, the agent may wish to decide on some other factors that affect the desirability of different ways of achieving the goal.

We will consider a goal to be a set of states - just those states in which the goal is satisfied.
Actions can be viewed as causing transitions between states.
How can the agent decide on what types of actions to consider?

Problem formulation is the process of deciding what actions and states to consider. For now let us assume that the agent will consider actions at the level of driving from one city to another.

The states will then correspond to being in particular towns along the way.

The agent has now adopted the goal of getting to Wellington, so unless it is already there, it must transform the current state into the desired one. Suppose that there are three roads leaving Auckland but that none of them lead directly to Wellington. What should the agent do? If it does not know the geography it can do no better than to pick one of the roads at random.

However, suppose the agent has a map of the area. The purpose of a map is to provide the agent with information about the states it might get itself into and the actions it can take. The agent can use the map to consider subsequent steps of a hypothetical journey that will eventually reach its goal.

In general, an agent with several intermediate options of unknown value can decide what to do by first examining different possible sequences of actions that lead to states of known value and then choosing the best one.

This process is called search. A search algorithm takes a problem as input and returns a solution in the form of an action sequence. Once a solution is found, the actions it recommends can be carried out. This is called the execution phase. Hence, we have a simple "formulate, search, execute" design for the agent.

Formulating Problems
Formulating problems is an art. First, we look at the different amounts of knowledge that an agent can have concerning its actions and the state that it is in. This depends on how the agent is connected to its environment.

There are four essentially different types of problems.

Knowledge and problem types
Let us consider the vacuum world - we need to clean the world using a vacuum cleaner. For the moment we will simplify it even further and suppose that the world has just two locations. In this case there are eight possible world states. There are three possible actions: left, right, and suck. The goal is to clean up all the dirt, i.e., the goal is equivalent to the set of states {7,8}.

First, suppose that the agent's sensors give it enough information to tell exactly which state it is in (i.e., the world is completely accessible) and suppose it knows exactly what each of its actions does.

Then it can calculate exactly what state it will be in after any sequence of actions. For example, if the initial state is 5, then it can calculate the result of the action sequence {right, suck}.

This simplest case is called a single-state problem.

Now suppose the agent knows all of the effects of its actions, but world access is limited. For example, in the extreme case, the agent has no sensors so it knows only that its initial state is one of the set {1,2,3,4,5,6,7,8}. In this simple world, the agent can succeed even though it has no sensors. It knows that the action {right} will cause it to be in one of the states {2,4,6,8}. In fact, it can discover that the sequence {right, suck, left, suck} is guaranteed to reach a goal state no matter what the initial state is.

In this case, when the world is not fully accessible, the agent must reason about sets of states that it might get to, rather than single states. This is called the multiple-state problem.
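This reasoning over sets of states is easy to simulate. The sketch below is a minimal model of the two-location vacuum world; it encodes states as (location, dirt_left, dirt_right) tuples rather than the numbers 1-8 used in these notes, and tracks the set of states the sensorless agent might be in:

```python
from itertools import product

def step(state, action):
    """Deterministic transition for the two-location vacuum world.
    A state is (location, dirt_left, dirt_right); location 0 is the left square."""
    loc, dirt_left, dirt_right = state
    if action == "left":
        return (0, dirt_left, dirt_right)
    if action == "right":
        return (1, dirt_left, dirt_right)
    if action == "suck":
        return (0, False, dirt_right) if loc == 0 else (1, dirt_left, False)
    raise ValueError(action)

def update(belief, action):
    """Image of a belief state (a set of possible states) under an action."""
    return {step(s, action) for s in belief}

# The sensorless agent starts knowing only that it is in one of the 8 states.
belief = set(product((0, 1), (True, False), (True, False)))
for action in ["right", "suck", "left", "suck"]:
    belief = update(belief, action)
```

After the four actions the belief state collapses to the single clean state (0, False, False), so the sequence is guaranteed to reach the goal from any starting state.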

The case of ignorance about the effects of actions can be treated similarly.

Example Suppose the suck action sometimes deposits dirt when there is none. Then if the agent is in state 4, sucking will place it in one of {2,4}. There is still a sequence of actions that is guaranteed to reach the goal state.

However, sometimes ignorance prevents the agent from finding a guaranteed solution sequence. Suppose that the agent has the nondeterministic suck action as above and that it has a position sensor and a local dirt sensor. Suppose the agent is in one of the states {1,3}. The agent might formulate the action sequence {suck, right, suck}. The first action would change the state to one of {5,7}; moving right would then change the state to {6,8}. If the agent is, in fact, in state 6, the plan will succeed, otherwise it will fail. It turns out there is no fixed action sequence that guarantees a solution to this problem.

The agent can solve the problem if it can perform sensing actions during execution. For example, starting from one of {1,3}: first suck dirt, then move right, then suck only if there is dirt there. In this case the agent must calculate a whole tree of actions rather than a single sequence, i.e., plans now have conditionals in them that are based on the results of sensing actions.

For this reason, we call this a contingency problem. Many problems in the real world are contingency problems. This is why most people keep their eyes open when walking and driving around.

Single-state and multiple-state problems can be handled by similar search techniques. Contingency problems require more complex algorithms. They also lend themselves to an agent design in which planning and execution are interleaved.

We will only consider cases where guaranteed solutions consist of a single action sequence.

The last type of problem is the exploration problem. In this type of problem, the agent has no information about the effects of its actions. The agent must experiment, gradually discovering what its actions do and what sorts of states exist. This is search in the real world rather than in a model.

Well-defined problems and solutions

We have seen that the basic elements of a problem definition are the states and actions. To capture these ideas more precisely, we need the following:

- the initial state that the agent knows itself to be in;
- the set of operators, i.e., the actions available to the agent;
- a goal test, which the agent can apply to a state to determine whether it is a goal state;
- a path cost function, which assigns a cost to each path.

Together these define the state space of the problem: the set of all states reachable from the initial state by any sequence of actions.

A path in the state space is simply any sequence of actions leading from one state to another.

It may also be the case that one solution is preferable to another, even though they both reach the goal. To capture this idea, we use the notion of path cost. Together the initial state, operator set, goal test, and path cost function define a problem.

To deal with multiple-state problems, a problem consists of an initial state set; a set of operators specifying for each action the set of states reached from any given state; and a goal test and path cost function. The state space is replaced by the state set space.

Measuring problem-solving performance
The effectiveness of a search can be measured in at least three ways:

- Does it find a solution at all?
- Is it a good solution (one with a low path cost)?
- What is the search cost (the time and memory required to find a solution)?

The total cost of a search is the sum of the path cost and search cost.
What makes problem solving an art is deciding what goes into the description of states and operators and what should be left out. The process of removing detail is called abstraction. Without it, intelligent agents would be swamped.

Example Problems

We will talk both about toy problems and real-world problems. Toy problems are used to illustrate and exercise various techniques. Real-world problems are usually much harder and we usually care about the solutions.

Toy problems
The 8-puzzle problem

One important trick is to notice that rather than use operators such as "move the 3 tile into the blank space," it is more sensible to have operators such as "the blank space changes places with the tile to its left." This is because there are fewer of the latter kind of operator. This leads to the following formulation:

The 8-queens problem
The goal of this problem is to place 8 queens on the board so that none can attack the others. The following is not a solution!

There are two main kinds of formulation. The incremental formulation involves placing queens one by one on the board. The complete-state formulation starts with all 8 queens on the board and moves them around until a solution is found.

In the incremental formulation, we have 64^8 possible sequences to investigate. A more sensible choice would use the fact that placing a queen on a square that is already under attack cannot work, because subsequent placings will not undo the attack. So, try the following instead: it is easy to see that the actions given can generate only states with no attacks; but sometimes no actions will be possible. For example, after making the first seven choices (left to right) in the above figure, there is no action available. However, in this formulation there are only 2057 possible sequences to investigate.
The right formulation makes a big difference to the size of the search space.

Vacuum World

State Space with sensors

State space with no sensors (self loops omitted) - initial state is centre-top

Real-world problems

Route finding, e.g., find a route from Auckland to Wellington.

The Travelling Salesperson Problem is a famous touring problem in which each city in a network of connected cities must be visited exactly once. The goal is to find the shortest tour. This problem is NP-hard.

VLSI layout - a VLSI chip can have millions of gates, and the positioning of and connections to every gate are crucial to the successful operation of the chip. CAD tools are used in every phase of the process. Two difficult subtasks are cell layout and channel routing. The goal in cell layout is to place every cell on the chip with no overlap and enough room in between to place connecting wires. There are additional constraints, such as placing particular cells near each other. Channel routing is the process of placing the connecting wires.

Robot navigation - Guide a robot about Mars.

Assembly sequencing - Robot assembly of complex objects.

Searching for Solutions

The idea behind most search techniques is to maintain and extend a set of partial solution paths.

Generating action sequences
To solve the route planning example problem, the first step is to test if the current state is the goal state. If not, we must consider some other states. This is done by applying the operators to the current state, thereby generating a new set of states. This process is called expanding the state. Whenever there are multiple possibilities, a choice must be made about which one to consider further.

This is the essence of search - choosing one option and putting the others aside for later, in case the first choice does not lead to a solution. We continue choosing, testing, and expanding until a solution is found or until there are no more states that can be expanded. The choice of which state to expand first is determined by the search strategy.

It is helpful to think of the process as building a search tree that is superimposed over the state space. The root of the tree is a search node corresponding to the initial state. The leaves of the tree are nodes that do not have successors either because they are goal states or because no operators can be applied to them.

The general search algorithm is described as follows:

function GENERAL-SEARCH(problem, strategy) returns a solution, or failure
    initialize the search tree using the initial state of problem
    loop do
        if there are no candidates for expansion then return failure
        choose a leaf node for expansion according to strategy
        if the node contains a goal state then return the corresponding solution
        else expand the node and add the resulting nodes to the search tree
    end

Data structures for search trees
We will assume a search tree node is a five component structure with the following components:

- the state that the node corresponds to;
- the parent node that generated this node;
- the operator that was applied to generate the node;
- the depth of the node in the tree;
- the cost of the path from the initial state to the node.

Remember that a node is a bookkeeping data structure used to represent a search tree for a particular problem instance. This is what makes a node different from a state.
Nodes have depths and parents, whereas states do not.

The EXPAND function is responsible for calculating each of the five components above for the nodes that it generates.
We also need to represent the collection of nodes that are waiting to be expanded - this collection is called the fringe or frontier.
For efficiency, the collection of nodes on the fringe will be implemented as a queue using the following operations:

- MAKE-QUEUE(elements): creates a queue with the given elements;
- EMPTY?(queue): returns true only if there are no elements in the queue;
- REMOVE-FRONT(queue): removes and returns the element at the front of the queue;
- QUEUING-FN(queue, elements): inserts a set of elements into the queue.

Different varieties of queuing functions produce different search algorithms.

Now we can write a more formal version of the general search algorithm.
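As a sketch (variable names are mine, not from the notes), the node structure and the general algorithm might look like the following, with the queuing function passed in as a parameter:

```python
from collections import deque

class Node:
    """Bookkeeping for the search tree: state, parent, action, path cost, depth."""
    def __init__(self, state, parent=None, action=None, step_cost=0):
        self.state, self.parent, self.action = state, parent, action
        self.path_cost = (parent.path_cost if parent else 0) + step_cost
        self.depth = parent.depth + 1 if parent else 0

def expand(node, successors):
    """successors(state) yields (action, next_state, step_cost) triples."""
    return [Node(s, node, a, c) for a, s, c in successors(node.state)]

def general_search(initial_state, goal_test, successors, queuing_fn):
    """GENERAL-SEARCH: the queuing function alone fixes the strategy."""
    fringe = deque([Node(initial_state)])
    while fringe:
        node = fringe.popleft()
        if goal_test(node.state):
            return node
        queuing_fn(fringe, expand(node, successors))
    return None

def enqueue_at_end(fringe, nodes):      # gives breadth-first behaviour
    fringe.extend(nodes)

def enqueue_at_front(fringe, nodes):    # gives depth-first behaviour
    fringe.extendleft(reversed(nodes))
```

Swapping one queuing function for the other changes the search strategy without touching the algorithm itself.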

Search Strategies

We will evaluate search strategies in terms of four criteria: completeness, time complexity, space complexity, and optimality.

Uninformed (or blind) search strategies have no information about the number of steps or the path cost from the current state to the goal.
In a route finding problem, given several choices of cities to go to next, uninformed search strategies have no way to prefer any particular choices.
Informed (or heuristic) search strategies use considerations to prefer choices. For example, in the route finding problem with a map, if a choice is in the direction of the goal city, prefer it.

Even though uninformed search is less effective than heuristic search, uninformed search is still important because there are many problems for which information used to make informed choices is not available.

We now present six uninformed search strategies:

Breadth-first search
In this strategy, the root node is expanded first, then all of the nodes generated by the root are expanded before any of their successors. Then these successors are all expanded before any of their successors.

In general, all the nodes at depth d in the search tree are expanded before any nodes at depth d+1. This kind of search can be implemented by using the queuing function ENQUEUE-AT-END.
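A minimal self-contained sketch (paths are stored whole for clarity, which wastes some memory compared with the node structure described earlier):

```python
from collections import deque

def breadth_first_search(start, goal_test, successors):
    """Expand shallowest unexpanded nodes first: the fringe is a FIFO queue,
    so newly generated paths go to the end (ENQUEUE-AT-END)."""
    fringe = deque([[start]])
    while fringe:
        path = fringe.popleft()
        if goal_test(path[-1]):
            return path            # the shallowest solution is found first
        for s in successors(path[-1]):
            fringe.append(path + [s])
    return None
```

Because the fringe is first-in first-out, every path of depth d is expanded before any path of depth d+1.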

Breadth-first search is complete, and it is optimal provided the path cost is a nondecreasing function of the depth of the node. Let's look at the amount of time and memory it requires. Consider a hypothetical search tree with b successors for every node; such a tree is said to have a branching factor of b. If the first solution is at depth d, then the maximum number of nodes expanded before reaching a solution is 1 + b + b^2 + b^3 + ... + b^d.

This is not good. For example consider the following table for b=10, 1 node/ms, 100 bytes/node:

Note that memory requirements are a bigger problem here than execution time.
In general, exponential complexity search problems cannot be solved for any but the smallest instances.
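The arithmetic behind that table is easy to reproduce (assuming, as above, b = 10, one node expanded per millisecond, and 100 bytes per node):

```python
def bfs_requirements(b, d, nodes_per_ms=1, bytes_per_node=100):
    """Worst-case node count, time (seconds), and memory (bytes) for
    breadth-first search to depth d with branching factor b."""
    nodes = sum(b**i for i in range(d + 1))    # 1 + b + b^2 + ... + b^d
    seconds = nodes / nodes_per_ms / 1000
    return nodes, seconds, nodes * bytes_per_node

for depth in (2, 4, 6, 8, 10, 12):
    nodes, secs, mem = bfs_requirements(10, depth)
    print(f"d={depth:2}  {nodes:>16,} nodes  {secs:>14,.1f} s  {mem/1e12:10.4f} TB")
```

At depth 12 this gives about 1.1 trillion nodes, roughly 35 years at one node per millisecond, and about 111 terabytes of memory, which is why memory rather than time is usually the binding constraint.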

Uniform cost search (Dijkstra's Algorithm)
This strategy modifies breadth-first search to account for general path cost. Now the lowest cost node on the fringe is always the first one expanded (recall that the cost of a node is the cost of the path to that node).

When certain conditions are met, the first solution found is guaranteed to be the cheapest.

Consider an example

Uniform cost search finds the cheapest solution provided the cost of a path never decreases as it gets longer, i.e., g(SUCCESSOR(n)) >= g(n) for every node n.
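A sketch using a priority queue keyed on g, the path cost so far (the names are mine):

```python
import heapq

def uniform_cost_search(start, goal_test, successors):
    """Always expand the lowest-cost node on the fringe first.
    successors(state) yields (next_state, step_cost) pairs."""
    frontier = [(0, start, [start])]           # entries are (g, state, path)
    best_g = {start: 0}
    while frontier:
        g, state, path = heapq.heappop(frontier)
        if goal_test(state):
            return g, path                     # cheapest, given g nondecreasing
        if g > best_g.get(state, float("inf")):
            continue                           # stale queue entry, skip it
        for nxt, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g, nxt, path + [nxt]))
    return None
```

The condition above matters: because step costs cannot reduce g, the first goal popped off the priority queue is guaranteed to be the cheapest.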

Depth-first search

This search strategy always expands one node to the deepest level of the tree. Only when a dead-end is encountered does the search back up and expand nodes at shallower levels.

This strategy can be implemented by calling GENERAL-SEARCH with a queuing function that always puts the newest nodes at the front (ENQUEUE-AT-FRONT).
Depth-first search has modest memory requirements, storing only a single path from the root to the current node along with the remaining unexpanded sibling nodes for each node on the path. For a state space with branching factor b and maximum depth m, depth-first search requires storage of only b·m nodes, in contrast to b^d for breadth-first search. Using the same assumptions as our analysis of breadth-first search, depth-first search would require 12 kilobytes instead of 111 terabytes at depth 12, a factor of about 10 billion times less space.

For problems that have a lot of solutions, depth-first search is often faster than breadth-first search because it has a good chance of finding a solution after exploring only a small portion of the search space.

However, depth-first search can get stuck on a wrong path in infinite search spaces.

It is common to implement depth-first search with a recursive function that calls itself on each of its children in turn. In this case, the queue is stored implicitly in the local state of each invocation on the calling stack.
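A recursive sketch of this implicit-stack version (it assumes a finite, acyclic space, per the caveat above):

```python
def depth_first_search(state, goal_test, successors, path=None):
    """Recursive depth-first search: the fringe lives implicitly in the
    chain of suspended calls on the stack. Unsafe on infinite or loopy
    state spaces, as noted above."""
    path = (path or []) + [state]
    if goal_test(state):
        return path
    for s in successors(state):
        result = depth_first_search(s, goal_test, successors, path)
        if result is not None:
            return result
    return None
```

Each recursive call explores one subtree completely before its siblings are considered, which is exactly the deepest-first ordering.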

Depth-limited search
This strategy avoids the pitfalls of depth-first search by imposing a cut-off on the maximum depth of a path. Depth-first search is used to search to the given depth.
Depth-limited search is not optimal, and if we choose a depth limit that is too shallow it is not even complete; it is complete only when the limit is at least the depth of the shallowest solution.

Iterative deepening search
The hard part about depth-limited search is picking the right depth limit. Most of the time we will not know what to pick for a depth limit. Iterative deepening search is a strategy that side-steps the issue of choosing the depth limit by trying all possible limits in increasing order.

In effect, this search strategy keeps the best features of both breadth-first and depth-first search. It has the modest memory requirements of depth-first search but is optimal and complete like breadth-first search. Also it does not get stuck in dead ends.

Iterative deepening may seem wasteful because many states are expanded multiple times. For most problems, however, the overhead of this multiple expansion is actually rather small because a majority of the nodes are at the bottom of the tree.

So instead of

1 + b + b^2 + ... + b^(d-1) + b^d

nodes expanded in a single search to depth d, we have

(d+1)·1 + d·b + (d-1)·b^2 + ... + 2·b^(d-1) + 1·b^d

because the nodes at depth i are regenerated in every iteration whose limit is i or greater.

For b=10 and d=5, the numbers are 111,111 and 123,456, so iterative deepening is about 11% more costly than a single depth-limited search in this case.
In terms of complexity, iterative deepening has the same asymptotic time complexity as breadth-first search, O(b^d), but has space complexity of only O(b·d).
In general, iterative deepening is the preferred search method when there is a large search space and the depth of the solution is unknown.
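A sketch of the strategy, together with the node-count arithmetic for b = 10, d = 5 quoted above:

```python
def iterative_deepening_search(start, goal_test, successors, max_depth=25):
    """Run depth-limited search with limits 0, 1, 2, ... until a solution
    appears; the shallowest solution is found first, as in breadth-first."""
    def dls(state, limit, path):
        if goal_test(state):
            return path
        if limit == 0:
            return None
        for s in successors(state):
            result = dls(s, limit - 1, path + [s])
            if result is not None:
                return result
        return None
    for limit in range(max_depth + 1):
        result = dls(start, limit, [start])
        if result is not None:
            return result
    return None

# Node counts for b = 10, d = 5: one depth-d pass versus all passes 0..d.
b, d = 10, 5
single_pass = sum(b**i for i in range(d + 1))                 # 111,111
all_passes = sum((d + 1 - i) * b**i for i in range(d + 1))    # 123,456
```

Nodes at depth i are regenerated d+1-i times, hence the weighted sum; the overhead concentrates on the cheap shallow levels, which is why it stays small.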

Bidirectional Search
The idea here is to search forward from the initial state and backward from the goal state simultaneously. When it works well, the search meets in the middle. For problems where the branching factor is b in both directions, bidirectional search can make a big difference.

The solution will be found in O(b^(d/2)) steps because both searches have to go only half way. For example, if b=10 and d=6, breadth-first search generates 1,111,111 nodes, whereas bidirectional search generates only 2,222 nodes. Even though this sounds great in theory, there are several issues that must be addressed before this algorithm can be implemented.

The complexity figure assumes that the process of testing for intersection of the two frontiers can be done in constant time. Also, in order to have the two searches meet, the frontier of one of them must be kept entirely in memory, so the space complexity is O(b^(d/2)).
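A sketch for the simple case of reversible operators on an explicit graph, where each frontier is a breadth-first wave and the intersection test is a hash lookup (constant time, as the complexity figure assumes):

```python
from collections import deque

def bidirectional_search(start, goal, neighbors):
    """Two breadth-first waves, expanded alternately, until they meet.
    Assumes operators are reversible, so `neighbors` serves both directions."""
    if start == goal:
        return [start]
    fwd_parent, bwd_parent = {start: None}, {goal: None}
    fwd_wave, bwd_wave = deque([start]), deque([goal])

    def expand(wave, parent, other_parent):
        """Advance one wave by a full layer; return a meeting state, if any."""
        for _ in range(len(wave)):
            s = wave.popleft()
            for n in neighbors(s):
                if n not in parent:
                    parent[n] = s
                    if n in other_parent:      # the two frontiers intersect
                        return n
                    wave.append(n)
        return None

    while fwd_wave and bwd_wave:
        meet = expand(fwd_wave, fwd_parent, bwd_parent)
        if meet is None:
            meet = expand(bwd_wave, bwd_parent, fwd_parent)
        if meet is not None:
            path = []
            s = meet
            while s is not None:               # meeting point back to start
                path.append(s)
                s = fwd_parent[s]
            path.reverse()
            s = bwd_parent[meet]
            while s is not None:               # meeting point on to goal
                path.append(s)
                s = bwd_parent[s]
            return path
    return None
```

Stitching the two parent chains together at the meeting state yields the full path; each wave only ever reaches about half the solution depth.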

Comparison of Search Strategies

Avoiding Repeated States

We have so far ignored one of the most important complications to the search process: the possibility of wasting time by expanding states that have already been encountered. For many problems this is unavoidable; any problem whose operators are reversible suffers from it. For example, in route-finding problems the search trees are infinite, but if we prune some of the repeated states, we can cut the tree down to a finite size.

Even when the tree is finite, avoiding repeated states can yield exponential speedups. The classic example is a state space that contains only m+1 states, whose search tree nevertheless contains 2^m branches, one for every possible path.

There are three ways to deal with repeated states, in increasing order of effectiveness and complexity:

- Never return to the state you just came from.
- Never create paths with cycles, i.e., never generate a state that already appears in the path from the root.
- Never generate any state that was ever generated before.

To implement this last option, search algorithms often make use of a hash table that stores all the nodes that are generated. This makes checking for repeated nodes fairly efficient. Whether or not to do this depends on how "loopy" the state space is.
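A sketch of that last option: a closed set (here a Python set, standing in for the hash table) records every state ever generated:

```python
from collections import deque

def graph_search(start, goal_test, successors):
    """Breadth-first search that never puts an already-generated state back
    on the fringe. Returns the solution path and the number of expansions."""
    seen = {start}
    fringe = deque([[start]])
    expansions = 0
    while fringe:
        path = fringe.popleft()
        expansions += 1
        if goal_test(path[-1]):
            return path, expansions
        for s in successors(path[-1]):
            if s not in seen:                  # the repeated-state check
                seen.add(s)
                fringe.append(path + [s])
    return None, expansions
```

On a reversible chain of m+1 states, tree search would bounce back and forth generating exponentially many paths, while this version expands each state at most once.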

Constraint satisfaction search

A constraint satisfaction problem (CSP) is a special kind of search problem in which states are defined by the values of a set of variables and the goal test specifies a set of constraints that the values must obey.

Example The 8-queens problem can be posed as a CSP in which the variables are the locations of the eight queens; the possible values are squares on the board; and the constraints state that no two queens can be in the same row, column, or diagonal.

A solution to a CSP specifies values for all of the variables such that the constraints are met.

Cryptarithmetic and VLSI layout can be posed as CSPs.

Constraints can be unary, binary, or higher-order: a unary constraint concerns the value of a single variable, a binary constraint relates a pair of variables, and a higher-order constraint involves three or more variables.

Constraints can also be either absolute or preference.

Each variable vi in a CSP has a domain Di which is the set of possible values that the variable can take on. The domain can be either discrete or continuous.

A unary constraint specifies the allowable subset of a variable's domain and a binary constraint specifies the allowable subset of the cross-product of the two domains. In discrete CSPs with finite domains, constraints can be represented simply by enumerating the allowable combinations of values.

Example In 8-queens, the no-attack constraint between two queens can be represented by a set of pairs of allowable values: {<1,3>, <1,4>, <1,5>, ..., <2,4>, <2,5>, ...}.

Constraints involving continuous variables cannot be enumerated in this way. Here we consider only CSPs with discrete, absolute, binary or unary constraints.

The initial state will be the state in which all variables are unassigned. Operators will assign a value to a variable from the set of possible values. The goal test will check if all variables are assigned and all constraints satisfied.

We can use depth-limited search because all solutions are of depth n, where n is the number of variables.

In the most naive implementation, any unassigned variable in a state can be assigned a value by an operator. Hence, the branching factor is the sum of the cardinalities of the different domains, Σi |Di|, e.g., 64 in the 8-queens problem. Then the search space would be 64^8.
But note that the order of assignments makes no difference to the final solution. So we can fix an ordering and choose a value for a single variable at each node. This results in a search space of size Πi |Di|, or 8^8 in 8-queens.

A straightforward depth-limited search will examine all of these possible states. CSPs include some well-known NP-complete problems, so in the worst case any algorithm for solving CSPs will take exponential time.

In most real problems, however, we can take advantage of problem structure to eliminate large parts of the search space.

In CSPs, the goal test is a set of constraints which can be considered individually, as opposed to all together. Depth-limited search on a CSP wastes time searching when constraints have already been violated. For example, suppose we put the first two queens in the top row. DLS will examine all 8^6 possibilities for the remaining queens before discovering that no solution exists in that subtree.

A substantial improvement is to insert a test before the successor generation step to check whether any constraint has been violated by the variable assignments up to this point. The resulting algorithm backtracks immediately to try something else. This is called backtracking search.
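A sketch for n-queens with one variable per column (so each domain is the set of rows), testing the constraints the moment each value is assigned:

```python
def attacks(placed, col, row):
    """True if a queen at (col, row) is attacked by an already-placed queen.
    placed[c] is the row of the queen in column c."""
    return any(r == row or abs(r - row) == col - c
               for c, r in enumerate(placed))

def backtracking_search(n=8, placed=()):
    """Assign columns left to right; retreat the moment a constraint fails,
    instead of waiting for a complete assignment."""
    col = len(placed)
    if col == n:
        return list(placed)
    for row in range(n):
        if not attacks(placed, col, row):
            solution = backtracking_search(n, placed + (row,))
            if solution is not None:
                return solution
    return None
```

The early constraint test is the entire difference from plain depth-limited search: whole subtrees below a violated constraint are never generated.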

Backtracking still has some failings. Suppose the squares chosen for the first six queens make it impossible to place the seventh queen, because they attack all 8 squares in the last column. Backtracking will still try all possible placings of the seventh queen. Forward checking avoids this problem by looking ahead to detect unsolvability. Each time a variable is instantiated, forward checking deletes from the domains of the as-yet-uninstantiated variables all of those values that conflict with the variables assigned so far. If any of the domains becomes empty, then the search backtracks immediately.
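Extending the same n-queens formulation with forward checking: after each assignment, prune the attacked rows from every later column's domain, and retreat as soon as any domain empties (a sketch; the names are mine):

```python
def forward_checking_search(n=8):
    """n-queens with forward checking: domains[c] holds the rows still
    available for column c; an empty domain triggers immediate backtracking."""
    def search(assignment, domains):
        col = len(assignment)
        if col == n:
            return assignment
        for row in sorted(domains[col]):
            new_domains = [set(d) for d in domains]
            consistent = True
            for c in range(col + 1, n):        # prune the later columns
                new_domains[c] -= {row, row + (c - col), row - (c - col)}
                if not new_domains[c]:
                    consistent = False         # dead end detected early
                    break
            if consistent:
                solution = search(assignment + [row], new_domains)
                if solution is not None:
                    return solution
        return None
    return search([], [set(range(n)) for _ in range(n)])
```

Unlike plain backtracking, the seventh queen of the example above is never even tried: the moment the sixth assignment empties a later column's domain, the search retreats.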

Forward checking is a special case of arc consistency checking. A state is arc-consistent if every variable has a value in its domain that is consistent with each of the constraints on that variable. This can be achieved by successive deletion of values that are inconsistent with some constraint. As values are deleted, other values may become inconsistent because they relied on the deleted values. This behaviour is called constraint propagation: choices are gradually narrowed down. Sometimes arc consistency is all that is needed to completely solve a problem.
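The successive-deletion loop can be sketched as the classic AC-3 procedure (the encoding is mine: each directed arc (x, y) carries a predicate that a pair of values must satisfy):

```python
from collections import deque

def ac3(domains, arcs):
    """Enforce arc consistency. `domains` maps variable -> set of values;
    `arcs` maps a directed pair (x, y) -> predicate(value_x, value_y).
    Returns False if some domain is wiped out (no solution possible)."""
    queue = deque(arcs)
    while queue:
        x, y = queue.popleft()
        ok = arcs[(x, y)]
        # delete every value of x that has no supporting value of y
        unsupported = {vx for vx in domains[x]
                       if not any(ok(vx, vy) for vy in domains[y])}
        if unsupported:
            domains[x] -= unsupported
            if not domains[x]:
                return False
            # x's domain shrank, so arcs pointing into x must be rechecked:
            # this requeueing is the constraint propagation described above
            queue.extend(arc for arc in arcs if arc[1] == x)
    return True

# Constraint propagation on A < B with both domains {1, 2, 3}:
domains = {"A": {1, 2, 3}, "B": {1, 2, 3}}
arcs = {("A", "B"): lambda a, b: a < b,
        ("B", "A"): lambda b, a: a < b}
ac3(domains, arcs)   # narrows A to {1, 2} and B to {2, 3}
```

Here propagation alone does not pick a solution, but every surviving value participates in one; in general a search phase may still be needed after AC-3 reaches its fixpoint.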