This algorithm is called generate-and-test, because each state is fully instantiated before it is tested. (Assigning a value to an unassigned variable is called instantiating the variable.) The rest of our analysis of CSPs will look at improvements to it, staying within the same framework as much as possible.
Incremental consistency checking is the idea that each
constraint should be checked as soon as the variables involved have been
assigned values. If any constraint is not satisfied at a node in
the search tree, then the entire subtree rooted at this node cannot
contain a solution, so the node should not be expanded, and search
should continue with a sibling node. This refinement of DFS is
called backtracking search.
In other words, each time the domain of one variable is narrowed down, all inconsistent values should be removed from the domains of the all other variables.
Let's consider a special case of constraint propagation called forward checking. As soon as all the variables except one of a constraint have been assigned values, all values of the remaining variable that are inconsistent should be removed from its domain.
For the 4-queens problem, suppose the queens are placed column by
column, so the domain of each variable consists of the four alternative
rows. Let the leftmost queen be placed in the lowest row.
Then the inconsistent values for the remaining variables are indicated
by x marks:
| x | |||
| x | |||
| x | |||
| Q | x | x | x |
If the next queen is placed in the second column as follows, then
more values for the third and fourth variables become inconsistent, as
shown in red:
| x | x | ||
| Q | x | x | |
| x | x | ||
| Q | x | x | x |
There are no feasible values now for the third variable, so the
algorithm backtracks and tries the remaining alternative value for the
second variable:
| Q | x | x | |
| x | |||
| x | x | ||
| Q | x | x | x |
The alternative assignment now leaves a single possible value for
the third variable:
| Q | x | x | |
| x | x | ||
| x | Q | x | |
| Q | x | x | x |
This assignment for the third variable makes the last remaining
value for the fourth variable inconsistent (shown in green), so the algorithm backtracks to the
third variable, then to the second variable, and finally to the first
variable:
| x | |||
| x | |||
| Q | x | x | x |
| x |
This leads to solution with no further backtracking:
| Q | x | x | |
| x | x | Q | |
| Q | x | x | x |
| x | Q | x |
Notice that each time we backtrack, values that were considered inconsistent can no longer be said to be inconsistent, so these values must be added back. Therefore, you have to be able to undo updates to the data structure representing each set of remaining values.
Also note that the idea of forward checking is applicable to
constraints of any arity. As soon as k-1 variables involved in a
constraint of arity k have been instantiated, we check each value of the
last variable for consistency.
Note that we have two nested choice points:
A good heuristic for reducing computational complexity is always to instantiate next the remaining variable that is most constrained, i.e. the one whose domain contains the fewest remaining values.
Why is this a good idea? We only need to try one variable, so we would ideally choose the variable that leads to inconsistency being detected the quickest, in order to prune away this whole part of the search tree as quickly as possible. The most constrained variable is a guess at the ideal variable.
By definition, heuristics are not guaranteed to be effective.
Even when they appear to work in practice, we cannot prove that they
will always work, or prove why they work. It is difficult to do
good research on heuristics, because it is difficult to establish
conclusions definitively. For example, the most constrained
variable ordering heuristic might be beneficial, but possibly for a
reason different from the one suggested above.
As mentioned above, if a value leads to failure (i.e. inconsistency), we must still try the other values. Only if a value leads to success (i.e. to a goal node) can we stop searching. So, we should try first a value that seems most likely to lead to success. This is a value that causes the smallest reduction in available values for other variables.
In other words, we should choose the value that allows the most flexibility for the still uninstantiated variables. One way to make this idea precise is as follows:
If one value of Vi does lead to a goal node, then the forward checking done on other values of Vi will be wasted work. Otherwise, if we do have to consider all values of Vi, then performing forward checking in advance will not impose any net cost in time (although it may in space).for each value x of Vi separately, perform forward checking
measure the size of the remaining domains of all uninstantiated variables
sort the values x of Vi according to this measure.