CSE 150 LECTURE NOTES

Tuesday January 13, 2004
 

ANNOUNCEMENTS 

Our discussion board is up and running at http://webboard.ucsd.edu/.  Your login is your UCSD email username, and your password is your PID number.


A* HEURISTIC SEARCH

The A* algorithm is
  fringe := make-queue(initial-node)

  loop
   if empty?(fringe) then return FAIL
   else
     n := remove-front(fringe)
     if isa-goal?(state(n)) then return n
     else
        fringe := sorted-insert(fringe,eval(expand(n)))
 end loop

This variety of best-first search is called A* for historical reasons.

The idea is that if we want to find the optimal path to the goal, i.e. the shortest or cheapest path, then the next node to explore should be the one that looks like it will yield the overall cheapest path.

The best estimate of the total cost of the path through a candidate node n is f(n) = g(n) + h(n) where h is the same estimator function of remaining distance as above and  g(n) = cost of path from the initial state to state(n).

The heuristic scoring function h is defined as follows:

h(n) = estimated cost of the cheapest path from the state represented by the node n to a goal state

The names f, g, h, and n are standard, so you should remember to use this notation. 

Remember we assume that  (1) the h function is admissible, and that (2) the f = g + h function is monotonic.  In words, h must be conservative, i.e. it must be an under-estimator.

 

THE IMPORTANT THEOREM ABOUT A*

Theorem:
(a) Optimality: the first path to a goal found by A* is an optimal path to a goal.
(b) Completeness: if a path to a goal exists, then A* will find it.
(c) Efficiency: no algorithm that is optimal and complete and  has only g and h available as knowledge will always expand  fewer nodes than A*.
Proof of optimality:   See previous lecture.

Proof of completeness: Suppose there are a finite number of nodes with f value less than the optimal cost c*.  Then A* will eventually expand all these nodes with lower f value and arrive at a solution.  Can there be infinitely many such nodes?  No, if (a) all nodes have finite branching factor and (b) the cost of each edge is greater than some epsilon > 0.

Proof of efficiency (idea only!): Consider an algorithm that uses the same information, i.e. the g and h functions, but does not visit all nodes with sub-optimal f value.  Then we can construct an adversary who chooses the true costs so that one of the unvisited nodes is actually the optimal solution.
 
 

REMARKS ON A*

An "online" exploration algorithm is one that guides searching and acting simultaneously.  A* and related search algorithms are "offline" since the agent uses search to "imagine" an optimal path, and only afterwards executes the optimal path.

A* is not fully offline: the agent cannot explicitly "see" in a global way the entire state space.  Rather, the state space becomes revealed to the agent one state at a time as the search algorithm proceeds.  The initial state is known explicitly.  A complete map is available implicitly; running the search algorithm makes this map explicit.

A* is a global search algorithm: the next node expanded is the one that looks best among all seen so far.  We'll discuss local search methods later.

A* expands all nodes with f(n) = g(n) + h(n) < c* and some nodes with f(n) = c*, where c* is the minimum cost of any goal node.  For a given search problem, g(n) and c* are fixed.  Everything else being fixed, if h(n) is bigger then A* explores fewer nodes.  In the limit where h(n) is perfect, A* explores only nodes along optimal path(s).  (If there is more than one such path, we would like to explore only one such path, but A* would need a tie-breaker for choosing which node to remove from the fringe in order to guarantee this.)

The proof of optimality depends on the heuristic function h being admissible, but not on monotonicity.  The function f(n) is monotonic if f never decreases along any path from the root.  Combined with admissibility, this assumption says that f = g+h does not become less accurate as an estimate of the total cost as you move towards a goal.

Let's draw the state space that A* explores, and let's draw contours of equal value of f(n).  If the function f is monotonic, then the contours are properly nested.

What is called optimality here is an "offline" property, while efficiency is an "online" property.  (Note on terminology: Optimality is sometimes called admissibility, in which case efficiency is sometimes called optimality.)

Like all variants of the general search algorithm, A* may explore multiple nodes that correspond to the same real-world state.  Some variants of the general search algorithm use a data structure called the closed list to remember already-visited nodes.  Each node discovered through expansion is then compared against all nodes on the closed list to detect whether the new node corresponds to an already-visited state.  Using this terminology, the fringe is called the open list.  


FINDING HEURISTIC FUNCTIONS

It is common in AI that a heuristic idea is good but proving that it is good is difficult, precisely because heuristics are not guaranteed.  Here is an important general principle for studying a heuristic:  look for assumptions under which the heuristic is guaranteed to be correct.

The practical usefulness of A* depends entirely on having a good heuristic function.  Consider the 8 puzzle and two heuristic functions: Both h1 and h2 always underestimate the number of tiles that must be moved to reach a solution, so they are both admissible.  (Each move affects only one tile, so summing over each tile considered separately is legitimate for h2.)

For all states of the 8 puzzle, h1(n) <= h2(n).  Hence h2 is a better estimator than h1.  See Table 4.8 for statistics on the effectiveness of these heuristics.

If one cannot guarantee that a heuristic function h is always an under-estimator, then A* is not guaranteed to find a shortest path.  However it can be proved that if h is "close to" an under-estimator, then A* finds a path that is "close to" being shortest.


NOTES ON NOT EXPANDING THE SAME STATE REPEATEDLY

This issue is often called "eliminating duplicates."  We did not discuss it in class.

Definition:  A duplicate is a node that contains the same state as some previously generated node.

It is not enough to look for duplicates in the "fringe".  We also must check for duplicates among nodes already expanded, hence removed from the fringe; otherwise we miss some duplicates.  Suppose state s1 leads to s2 leads to s1.  The second time we create a node with state s1, the first node with state s1 will no longer be in the fringe.  So if we check for duplicates inside the fringe only, we won't realize that the new node with s1 is  a duplicate.

The nodes that have already been expanded (hence removed from the fringe) are often called the "closed" list.  The fringe is then called the "open" list; see Section 3.5 of the book.  Using the "open" and "closed" lists gives a variant of the "general search algorithm" called "graph-search."  The last paragraph before "Memory-bounded heuristic search" on page 101 indicates that A* means "A* with graph-search" for the authors.  Hence, Figure 4.8 does use duplicate elimination, it seems.

When we eliminate duplicates, the question arises of what g and h values to use for the node we keep, because the duplicates may have different g and/or h values.  This issue is discussed on page 99 to motivate the concept of monotonicity.

Let m and n be duplicate nodes, i.e. both containing the same state s.  We might as well take the best path from the initial state to s, so we should take whatever g-value is smaller.  Assume that h(n) is a deterministic function of states.  (This is the usual case.)  Then h(n) will be the same for all nodes representing the same state; these nodes can differ in g(n) only.

With monotonicity, a newly generated node must have f-value >= its parent's f-value.  Nodes are taken from the fringe in f-value order, so the generated node must have f-value >= f-value of every node in the "closed" list.  So it must have g-value >= g-value of each of its
duplicates in the closed list.  Hence the best g-value for any state is the g-value obtained the first time a node for the state is created.

Therefore, if an old node n with state s is in the closed list, and a new node with state s is created, the new node can be discarded.  If the old node is in the open list (i.e. the fringe) then either the old or the new node may have the best g-value.

The book says that A* runs out of memory before it runs out of time on page 101.  Suppose the average branching factor is 1+x where x >= 1 always.  Then for every 1000 nodes removed from the fringe, 1000+1000*x are added.  This means that memory use increases by 1000*x, so the size of the fringe increases linearly with time, with no maximum.  Important note: the size of the closed list is linearly proportional to the size of the open list, so the closed list does not increase big-O memory usage.

The book does not suggest any tie-breaking policy for nodes that have same f-value.  The policy "when f-values are equal, expand the node with larger g-value first" is sensible.  Larger g-value means smaller h-value, so this can be implemented easily when g-values are integers under 100 (e.g. for the 8-puzzle) as f(n) = g(n) + 1.01*h(n).  Of course we still need a different tie-breaker for when g and h are both equal.




Copyright (c) by Charles Elkan, 2004.