CSE 150 LECTURE NOTES

Tuesday January 20, 2004
 

ANNOUNCEMENTS 

See Kristin Branson's notes from last Friday's section and this web board summary for advice about the current assignment.

 

BACKTRACKING: DFS FOR CSPs

We can apply the general search algorithm to a CSP as follows.  We need to say what the states are, and then give an initial state, a set of operators for expanding a state, and a predicate for recognizing goal states.  Let's say that a state consists of some variables with single values assigned to them, and some unassigned variables.  Then: The maximum depth of the search tree is the number of variables n, and all goal nodes are at this depth, so applying DFS is safe, i.e. sound and complete.

This algorithm is called generate-and-test, because each state is fully instantiated before it is tested.  (Assigning a value to an unassigned variable is called instantiating the variable.)  The rest of our analysis of CSPs will look at improvements to it, staying within the same framework as much as possible.

Incremental consistency checking is the idea that each constraint should be checked as soon as the variables involved have been assigned values.  If any constraint is not satisfied at a node in the search tree, then the entire subtree rooted at this node cannot contain a solution, so the node should not be expanded, and search should continue with a sibling node.  This refinement of DFS is called backtracking search.
 
 

CONSTRAINT PROPAGATION

The idea of constraint propagation is, informally, that as we go deeper in the search tree, we should keep track of the set of consistent values for each variable.  If any set becomes empty, we should backtrack.

In other words, each time the domain of one variable is narrowed down, all inconsistent values should be removed from the domains of the all other variables.

Let's consider a special case of constraint propagation called forward checking.  As soon as all the variables except one of a constraint have been assigned values, all values of the remaining variable that are inconsistent should be removed from its domain.

For the 4-queens problem, suppose the queens are placed column by column, so the domain of each variable consists of the four alternative rows.  Let the leftmost queen be placed in the lowest row.  Then the inconsistent values for the remaining variables are indicated by x marks:
 



x


x

x

Q x x x

If the next queen is placed in the second column as follows, then more values for the third and fourth variables become inconsistent, as shown in red:
 


x x

Q x x

x x
Q x x x

There are no feasible values now for the third variable, so the algorithm backtracks and tries the remaining alternative value for the second variable:
 

Q x x


x

x
x
Q x x x

The alternative assignment now leaves a single possible value for the third variable:
 

Q x x


x x

x Q x
Q x x x

This assignment for the third variable makes the last remaining value for the fourth variable inconsistent (shown in green), so the algorithm backtracks to the third variable, then to the second variable, and finally to the first variable:
 


x

x

Q x x x

x

This leads to solution with no further backtracking:
 

Q x x

x x Q
Q x x x

x Q x

Notice that each time we backtrack, values that were considered inconsistent can no longer be said to be inconsistent, so these values must be added back.  Therefore, you have to be able to undo updates to the data structure representing each set of remaining values.

Also note that the idea of forward checking is applicable to constraints of any arity.  As soon as k-1 variables involved in a constraint of arity k have been instantiated, we check each value of the last variable for consistency.
 
 

A HEURISTIC FOR VARIABLE ORDERING

For soundness and completeness, it does not matter in which order variables are chosen for instantiation.  In order to improve efficiency (we hope) we can do dynamic variable ordering, i.e. use different orderings in different parts of the search tree (also called search rearrangement).

Note that we have two nested choice points:

If we reach inconsistency after choosing one particular variable, we do not need to try other variables at the same choice point.  But if we reach inconsistency after choosing one particular value, we do need to try all other possible values for this variable.  Therefore, variable ordering is usually more important than value ordering.

A good heuristic for reducing computational complexity is always to instantiate next the remaining variable that is most constrained, i.e. the one whose domain contains the fewest remaining values.

Why is this a good idea?  We only need to try one variable, so we would ideally choose the variable that leads to inconsistency being detected the quickest, in order to prune away this whole part of the search tree as quickly as possible.  The most constrained variable is a guess at the ideal variable.

By definition, heuristics are not guaranteed to be effective.  Even when they appear to work in practice, we cannot prove that they will always work, or prove why they work.  It is difficult to do good research on heuristics, because it is difficult to establish conclusions definitively.  For example, the most constrained variable ordering heuristic might be beneficial, but possibly for a reason different from the one suggested above.
 
 

A HEURISTIC FOR VALUE ORDERING

Given that we have decided which variable to instantiate, in what order should we try its values that have not been eliminated already by constraint propagation?

As mentioned above, if a value leads to failure (i.e. inconsistency), we must still try the other values.  Only if a value leads to success (i.e. to a goal node) can we stop searching.  So, we should try first a value that seems most likely to lead to success.  This is a value that causes the smallest reduction in available values for other variables.

In other words, we should choose the value that allows the most flexibility for the still uninstantiated variables.  One way to make this idea precise is as follows:

for each value x of Vi separately, perform forward checking
measure the size of the remaining domains of all uninstantiated variables
sort the values x of Vi according to this measure.
If one value of Vi does lead to a goal node, then the forward checking done on other values of Vi will be wasted work.  Otherwise, if we do have to consider all values of Vi, then performing forward checking in advance will not impose any net cost in time (although it may in space).


THE DAVIS-PUTNAM-LOVELAND PROCEDURE

DPLL is an algorithm for propositional satisfiability.  This is the fastest known algorithm for satisfiability testing that is not just sound, but also complete, unlike all randomized local search methods.  A version was first published in 1960 by Martin Davis and Hilary Putnam and the algorithm is often called the Davis-Putnam method.  However the method in its present form is really due to Davis, G. Logemann, and Donald Loveland in 1962.




Copyright (c) by Charles Elkan, 2004.