CSE 150 LECTURE NOTES

February 26, 2004
 
 

ANNOUNCEMENTS

Today's topic is more on unification, and strategies for making resolution efficient for proving theorems in first-order logic.

Handouts today: the graded second assignments.


UNIFICATION

When two clauses in FOL are resolved together, we may need to match a variable in one clause with a constant in the other.  This process is called unification.  The result of unification is a substitution, i.e. a set of bindings, where each binding is an association between a variable and a term.  For example, consider these two CNF clauses in FOL:
forall x   loves(x,mother(x))
forall y,z loves(y,z) -> likes(y,z)
The substitution that unifies the two  loves  literals is a set of two bindings: { y := x, z := mother(x) }

In general, a substitution S is a set of individual substitutions of the form x := g where x is a variable and g is a term.  Each righthand side term may be ground (i.e. containing no variables) or not.

Applying the substitution S is usually written using postfix notation.  The equation pS = qS says that after applying S to p and to q, the results are syntactically identical.  Terminology: we say that S unifies p and q, or that S is a unifying substitution for p and q.

There is a standard algorithm that given two sentences, finds their unique most general unifying substitution.  Notes:

See the book by Russell and Norvig for an actual step-by-step unification algorithm.

The substitution that unifies two terms can have size exponential in the size of the input terms.  For example, consider unifying A = P( f(a,a) g(u,u) h(v,v) i(w,w) ) and B = P(u,v,w,x).  The unifying substitution is

{ u := f(a,a)
  v := g(f(a,a),f(a,a))
  w := h( g(f(a,a),f(a,a)), g(f(a,a),f(a,a)) )
  x := ...    }
What this example shows is that the usual notation for writing down terms should be avoided.  Good data structures for implementing unification and resolution are beyond the scope of this course.
 
 

CONTROL STRATEGIES FOR RESOLUTION

When doing a resolution proof, typically at each step there are many pairs of parent clauses that could be resolved.  A control strategy is a policy for prioritizing which resolutions to perform next.  Most strategies have the form "At least one parent of each resolution step must satisfy the following syntactic property: ...."

Definition: A control strategy for resolution is complete if its use preserves refutation-completeness, i.e. if false can be proved, it can be proved while respecting the strategy.

If a policy allows only a finite number of possible resolutions, then theorem-proving always terminates in bounded time.  In this case completeness must fail because it would contradict the fact that FOL is only semi-decidable, and not decidable.

Any policy that places a fixed upper bound on the length of the clauses that may be used in a proof only allows a finite number of resolutions, hence is incomplete.  This means that allowing axioms to be used multiple times is not enough to preserve completeness. 

 

INCOMPLETE STRATEGIES

Let's look first at two strategies that are not complete, but are still used often for efficiency.

Unit resolution is the policy that at least one parent of each resolution step must be a unit clause, i.e. a single literal.  Unit resolution is what we called unit propagation when discussing the DPLL algorithm for propositional satisfiability.

Under unit resolution the conclusion of each resolution step is shorter than its non-unit parent, so unit resolution must be incomplete.  For example, it fails to derive false from these four clauses:

{ P v Q    ~P v Q    P v ~Q   ~P v ~Q }
Input resolution is the policy that at least one parent of each resolution step must be in the original knowledge base, or part of the negated goal.  The eats(jo,fish) proof from last week is an example of input resolution.

This strategy is not complete either, but it can be efficient.  It fails on the same example as unit resolution.  (Just assume that any one of the four clauses is the negated goal.)

Both unit resolution and input resolution are complete when all clauses are Horn clauses.  A Horn clause is a clause with at most one positive literal.

Unit preference is the heuristic of prioritizing resolutions where one parent is a unit clause.  This heuristic does not actually eliminate any resolutions.  It can be used in combination with other strategies that do, without changing their completeness or incompleteness
 
 

SET OF SUPPORT STRATEGY

In general, this strategy is the policy that at least one parent of each resolution step must be a clause in a subset of sentences called the set of support.

Usually, the set of support consists of the clauses obtained by negating the goal, plus all descendants of these clauses.  Intuitively, with this strategy, the goal "supports" each resolution, at least indirectly, so each resolution is relevant to obtaining a contradiction.

Theorem: This choice of the supporting set yields a complete strategy if the original knowledge base is consistent.
Proof idea: In this case we'll never obtain a contradiction without using the negated goal somehow.

Note that the set of support increases at each step of the proof because each descendant of the original set of support (i.e. the clauses making up the negated goal) is added to the set of support.

Set-of-support resolution works effectively on the dolphin example.  Given this knowledge base:

Those who can read are literate:      ~r(x) v l(x)
Dolphins are not literate:           ~d(y) v ~l(y)
Some dolphins are intelligent:              d(a)
                            i(a)
To prove that there exists someone who is intelligent but can't read, we refute that "intelligent implies can read," that is we prove false from that.  Starting with the negated goal ~i(z) v r(z) the proof is:
~i(z) v r(z)        i(a)
------------------------
              r(a)                ~r(x) v l(x)
              --------------------------------
                        l(a)                ~d(y) v ~l(y)
                        ---------------------------------
                                      ~d(a)     d(a)
                                      --------------
                                            false
Notice that at each step of the proof, from top to bottom, the set of support strategy gave us only one choice about which resolution to perform next, given the heuristic of always using the clause proved most recently.