Lecture 20 - Satisfiability is NP-complete

1. The Classes P and NP; NP-complete problems

P is the class of languages that can be recognized in polynomial time by a one-tape deterministic TM. The TM halts in polynomial time on all inputs.
A multitape TM that makes T(n) moves can be simulated by a single-tape TM that makes O((T(n))²) moves. (HMU, Theorem 8.10, p. 347.) If T(n) is a polynomial function of n, then (T(n))² is still a polynomial function of n.
NP is the class of languages that can be recognized in polynomial time by a nondeterministic TM. NP stands for nondeterministic polynomial time.
A nondeterministic single-tape TM that makes T(n) moves can be simulated by a deterministic single-tape TM that makes 2^O(T(n)) moves using breadth-first search on the computation tree. (HMU, Sect. 8.4.4, pp. 347-349.) It is not known whether the simulation always requires exponential time.
A language L is NP-complete if (1) L is in NP and (2) every language in NP is polynomial-time reducible to L.
Today, thousands of important NP-complete problems have been identified, including Clique, Hamiltonian-Circuit, Independent Set, Node-Cover, Subset-Sum, and Traveling Salesman.
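As a rough illustration of the breadth-first simulation mentioned above, the following Python sketch explores the computation tree of a nondeterministic process level by level. The names nd_accepts, successors, and is_accepting are hypothetical, and a configuration is modeled as a plain Python tuple rather than a real TM ID. The toy machine guesses, integer by integer, whether to add each number to a running sum.

```python
from collections import deque

def nd_accepts(start, successors, is_accepting, max_moves):
    """Return True if some branch of the nondeterministic computation
    reaches an accepting configuration within max_moves moves."""
    frontier = deque([(start, 0)])  # (configuration, moves used so far)
    while frontier:
        config, moves = frontier.popleft()
        if is_accepting(config):
            return True
        if moves == max_moves:
            continue  # do not expand this branch any further
        for nxt in successors(config):
            frontier.append((nxt, moves + 1))
    return False

# Toy nondeterministic machine (an assumption for illustration only):
# a configuration is (index of next number, running total).
numbers = [3, 5, 8, 13]
target = 16

def successors(config):
    i, total = config
    if i == len(numbers):
        return []  # no moves left
    # Two nondeterministic choices: skip numbers[i] or include it.
    return [(i + 1, total), (i + 1, total + numbers[i])]

def is_accepting(config):
    i, total = config
    return i == len(numbers) and total == target

found = nd_accepts((0, 0), successors, is_accepting, max_moves=len(numbers))
```

With two choices per move and T moves per branch, the frontier can grow to 2^T configurations, which is the source of the exponential bound on the deterministic simulation.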

2. Verifiers

Another way to define NP is to use the notion of a polynomial-time verifier.
Formally, a verifier for a language L is an algorithm V that accepts pairs of strings (w, c) where c is a certificate or proof that the string w is in L. The language defined by V is L = { w | V accepts (w, c) for some certificate c }.
For example, the certificate c could be a string specifying a sequence of correct moves that a nondeterministic Turing machine for L can make starting from the initial ID q₀w and ending in an accepting ID. All the verifier needs to do is check that this is indeed a valid sequence of moves that leads to acceptance. Verifying the validity of an accepting sequence of moves may be done much faster than finding an accepting sequence of moves.
For a problem like subset-sum, the certificate could be a subset of the given integers that sums to zero. All the verifier needs to do is check that the integers are a subset of the set of integers in the problem instance and that the integers in this subset sum to zero. Again, verifying that a subset is valid and sums to zero may be much faster than finding such a subset.
The time taken by the verifier is measured solely in terms of the length of w.
A polynomial-time verifier runs in polynomial time in the length of w. Note that for a polynomial-time verifier, the length of the certificate needs to be bounded by a polynomial function of the length of w.
A language L is said to be polynomially verifiable if it has a polynomial-time verifier.
Another way to define NP is to say it is the class of languages that have polynomial-time verifiers. It is not hard to prove that a language is in NP iff it has a polynomial-time verifier.
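The subset-sum verifier described above might be sketched in Python as follows. The function name verify_subset_sum is hypothetical, the instance is assumed to be a set of distinct integers, and the certificate is assumed to be a non-empty list of integers (the empty subset is excluded so that the problem is not trivial).

```python
def verify_subset_sum(instance, certificate):
    """Check that certificate is a non-empty subset of instance summing to zero."""
    chosen = set(certificate)
    return (
        len(certificate) > 0                  # assumption: empty subset not allowed
        and len(chosen) == len(certificate)   # no integer is used twice
        and chosen <= set(instance)           # every chosen integer is in the instance
        and sum(certificate) == 0             # the chosen integers sum to zero
    )

instance = {-7, -3, -2, 5, 8}
ok = verify_subset_sum(instance, [-3, -2, 5])    # valid certificate
bad_sum = verify_subset_sum(instance, [-7, 8])   # subset, but sums to 1
not_subset = verify_subset_sum(instance, [1, -1])  # sums to 0, but not a subset
```

The verifier only inspects the given certificate, so its running time is polynomial in the size of the instance; it never searches over the exponentially many candidate subsets.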

3. The Satisfiability Problem (SAT)

An expression E is satisfiable if there exists a truth assignment to the variables in E that makes E true.
The satisfiability problem (SAT) is to determine whether a given boolean expression is satisfiable.
We can view SAT as the language { E | E is the encoding of a satisfiable boolean expression }.
In 1971, using a slightly different definition of NP-completeness, Stephen Cook showed that SAT is NP-complete. At roughly the same time, Leonid Levin independently showed that SAT is NP-complete.
SAT plays a role for NP-completeness similar to the one that the universal language or Post's Correspondence Problem plays for computability theory. SAT can be used to prove that another problem is NP-complete by showing that the other problem is in NP and that SAT can be reduced to it in polynomial time.
Shortly after Cook published his result, Richard Karp wrote an influential paper that showed many important optimization problems arising in theory and practice were also NP-complete. Karp used the definition of NP-completeness that we are using here. Sometimes the term Karp-completeness is used for our version of NP-completeness.

4. Normal Forms for Boolean Expressions

In boolean expressions:

Logical AND, as in x ∧ y, is often called conjunction and is sometimes written as a product, as in xy.
Logical OR, as in x ∨ y, is often called disjunction and is sometimes written as a sum, as in x + y.
Logical NOT, as in ¬x, is often called negation and is sometimes written with an overbar, as in x̅.
A literal is a variable or a negated variable.

E.g., x and ¬x are both literals.

A clause is the logical OR (disjunction) of one or more literals.

E.g., x ∨ ¬y is a clause.

A boolean expression is in conjunctive normal form (CNF) if it is the logical AND (conjunction) of clauses.

E.g., (x ∨ ¬y) ∧ (¬x ∨ z) is in CNF.

A boolean expression is in k-CNF if it is the logical AND of clauses each one of which is the logical OR of exactly k distinct literals.

E.g., (w ∨ ¬x ∨ y) ∧ (x ∨ ¬y ∨ z) is in 3-CNF.
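One possible way to represent these definitions in code is sketched below. The representation is an assumption made for illustration: a literal is a pair (variable name, is_positive), a clause is a list of literals, and a CNF expression is a list of clauses. The function names is_cnf and is_k_cnf are hypothetical.

```python
def is_cnf(clauses):
    """A CNF expression is an AND of clauses, each with one or more literals."""
    return all(len(clause) >= 1 for clause in clauses)

def is_k_cnf(clauses, k):
    """Every clause must be the OR of exactly k distinct literals."""
    return all(len(clause) == k and len(set(clause)) == k for clause in clauses)

# (x OR NOT y) AND (NOT x OR z)
two_cnf = [[("x", True), ("y", False)],
           [("x", False), ("z", True)]]

# (w OR NOT x OR y) AND (x OR NOT y OR z)
three_cnf = [[("w", True), ("x", False), ("y", True)],
             [("x", True), ("y", False), ("z", True)]]

# x OR x is a clause, but its two literals are not distinct.
repeated = [[("x", True), ("x", True)]]
```

Under this representation, the two example expressions from the text are recognized as 2-CNF and 3-CNF respectively, while a clause that repeats a literal fails the distinctness requirement.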

Two boolean expressions are equivalent if they evaluate to the same value on any truth assignment to their variables.
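Equivalence can be checked directly from this definition by comparing the two expressions on every truth assignment. The sketch below assumes each expression is given as a Python function of its variables; the name equivalent is hypothetical. It takes 2^n evaluations for n variables, so it is only practical for small expressions.

```python
from itertools import product

def equivalent(f, g, num_vars):
    """Compare f and g on all 2**num_vars truth assignments."""
    return all(
        f(*values) == g(*values)
        for values in product([False, True], repeat=num_vars)
    )

# Distributive law: x AND (y OR z) versus (x AND y) OR (x AND z)
lhs = lambda x, y, z: x and (y or z)
rhs = lambda x, y, z: (x and y) or (x and z)
same = equivalent(lhs, rhs, 3)

# x OR y versus x AND y differ, for example when x is true and y is false.
different = equivalent(lambda x, y: x or y, lambda x, y: x and y, 2)
```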

5. The Problems CSAT and kSAT

CSAT: Given a boolean expression E in CNF, is E satisfiable?

We can view CSAT as the language { E | E is the encoding of a satisfiable CNF boolean expression }.
CSAT is NP-complete.

kSAT: Given a boolean expression E in k-CNF, is E satisfiable?

1SAT and 2SAT are in P; kSAT is NP-complete for k ≥ 3.
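For contrast with the polynomial-time cases, here is a minimal brute-force decision procedure for CSAT. It is not an efficient algorithm: it tries all 2^n truth assignments for n variables. The representation (a clause is a list of (variable name, is_positive) pairs) and the name brute_force_csat are assumptions made for this sketch.

```python
from itertools import product

def brute_force_csat(clauses):
    """Return a satisfying assignment as a dict, or None if there is none."""
    variables = sorted({name for clause in clauses for name, _ in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # A literal (name, positive) is true when the variable's value
        # matches its sign; a clause needs at least one true literal.
        if all(
            any(assignment[name] == positive for name, positive in clause)
            for clause in clauses
        ):
            return assignment
    return None

# (x OR NOT y) AND (NOT x OR z) is satisfiable.
satisfiable = brute_force_csat([[("x", True), ("y", False)],
                                [("x", False), ("z", True)]])

# x AND NOT x is not satisfiable.
unsatisfiable = brute_force_csat([[("x", True)], [("x", False)]])
```

No algorithm is known that avoids exponential time on every CNF input, which is consistent with CSAT being NP-complete.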

6. SAT is NP-complete: the Cook-Levin Theorem

SAT is in NP

Given a boolean expression E of length n, a multitape nondeterministic Turing machine can guess a truth assignment T for E in O(n) time.
The NTM can then evaluate E using the truth assignment T in O(n²) time.
If E(T) = 1, then the NTM accepts E.
The multitape NTM can be simulated by a single-tape nondeterministic TM in O(n⁴) time.
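The "check" half of this guess-and-check argument can be sketched in Python. The guessed truth assignment T plays the role of a certificate, and evaluating E under T is a single pass over the expression. The nested-tuple encoding of expressions and the name evaluate are assumptions made for this illustration; a TM would work on a string encoding instead.

```python
def evaluate(expr, assignment):
    """Evaluate a boolean expression under a truth assignment.

    expr is one of: ("var", name), ("not", e), ("and", e1, e2), ("or", e1, e2).
    assignment maps variable names to True or False.
    """
    op = expr[0]
    if op == "var":
        return assignment[expr[1]]
    if op == "not":
        return not evaluate(expr[1], assignment)
    if op == "and":
        return evaluate(expr[1], assignment) and evaluate(expr[2], assignment)
    if op == "or":
        return evaluate(expr[1], assignment) or evaluate(expr[2], assignment)
    raise ValueError(f"unknown operator: {op!r}")

# E = x AND NOT (y OR z)
E = ("and", ("var", "x"), ("not", ("or", ("var", "y"), ("var", "z"))))

accepts = evaluate(E, {"x": True, "y": False, "z": False})
rejects = evaluate(E, {"x": True, "y": True, "z": False})
```

Each node of the expression is visited once, so checking a guessed assignment takes time polynomial in the length of E; the hard part is finding an assignment that makes E true.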

If L is in NP, then there is a polynomial-time reduction of L to SAT.
If an NTM M accepts an input w of length n in p(n) time, then M has a sequence of IDs α₀, α₁, …, α_k such that:

α₀ is the initial ID of M with input w.
α₀ |– α₁ |– … |– α_k where k ≤ p(n).
α_k is an ID with an accepting state.
Each α_i consists only of nonblanks unless α_i ends in a state and a blank, and extends from the initial head position to the right.

From M and w we can construct a boolean expression E_M,w that is satisfiable iff M accepts w within p(n) moves. See HMU, pp. 440-446 for details.

7. Practice Problems

Show that if A is NP-complete and A is in P, then P = NP.
Show that if A is NP-complete and A is polynomially reducible to a problem B in NP, then B is NP-complete.
List all satisfying truth assignments for x ∧ (y ∨ ¬x) ∧ (z ∨ ¬y).
A boolean expression is a tautology if it is true for all truth assignments. Show that the boolean expression x ∧ y ∨ ¬x ∨ ¬y is a tautology.
Show that the two boolean expressions ¬(x ∨ y) and ¬x ∧ ¬y are equivalent.
Show that 1SAT is in P.
[Hard] Show that 2SAT is in P.

8. Reading Assignment

HMU: Sections 10.1-10.3

aho@cs.columbia.edu

verma@cs.columbia.edu