COMS W3261 CS Theory
Lecture 21: Restricted Satisfiability Problems

1. SAT is NP-Complete

It is easy to show that SAT is in NP: Given a Boolean expression E, a polynomial-time NTM can guess and verify a satisfying truth assignment to E if one exists.
We must also show that if L is any language in NP, then there is a polynomial-time reduction of L to SAT.

Let M be a p(n)-time single-tape NTM for L that accepts an input string w of length n. Let T be a (p(n)+1)x(p(n)+1) tableau (array) in which the rows represent the ID's of a branch of the computation tree of M on w. The first row is the starting ID q₀w and each row follows the previous one according to a valid move by M. We say the tableau is accepting if any row of the tableau is an accepting ID. Thus, an accepting tableau represents an accepting path in the computation tree of M on w. The problem of determining whether M accepts w is equivalent to determining whether an accepting tableau for M on w exists.
We now outline a polynomial-time reduction from M and w to a Boolean expression E_M,w such that M accepts w iff E_M,w is satisfiable. The variables of the Boolean expression are of the form y_ijA. If the variable y_ijA is assigned the value TRUE, then this means cell i,j of the tableau contains the symbol A, where A is either a tape symbol or state of M. The Boolean expression is designed so that a satisfying assignment to its variables corresponds to an accepting tableau for M on w.
The overall form of E_M,w is U ∧ S ∧ N ∧ F where

U turns on exactly one variable for each cell of the tableau,
S makes sure the first row of the tableau corresponds to the initial ID,
N makes sure each row of the tableau corresponds to a valid ID that can follow the ID in the previous row of the tableau, and
F makes sure there is a row in the tableau that corresponds to an accepting ID.

E_M,w can be constructed in time polynomial in the length of w and its length is also polynomial in the length of w. See HMU, pp.438-446 for details.
If E_M,w is satisfiable, then there is a an accepting tableau for M on w, and hence a proof that M accepts w. This then shows that there is a polynomial-time reduction of L to SAT.

We can now use the NP-completeness of SAT to prove other languages are NP-complete by showing that the other language is in NP and that SAT can be reduced to it in polynomial time.
A boolean expression is in conjunctive normal form (CNF) if it is the logical AND of clauses where a clause is the logical OR of one or more literals. (Colloquially, the AND of ORs.) For example, (x ∨ ¬y) ∧ (¬x ∨ z) is a CNF boolean expression with two clauses.
We define CSAT as the language { E | E is a satisfiable CNF boolean expression }.
We can prove that CSAT is NP-complete.

First, we need to show that CSAT is in NP. This is easy because, as with SAT, a NTM can guess and verify a satisfying assignment for a CNF boolean expression in nondeterministic polynomial time.
Next, we would like to show we can reduce an instance E of SAT into an instance F of CSAT in polynomial time such that E is satisfiable iff F is satisfiable. However, this part of the proof is a bit more complicated because we need to do this reduction in polynomial time.
Two boolean expressions are equivalent if they have the same value on every truth assignment. However, converting a boolean expression E to an equivalent CNF boolean expression E’ may make E’ exponentially longer and thus take exponential time (in the length of E).
To avoid this exponential blow-up, we transform E into another not necessarily equivalent boolean expression F but with the property that F is at most quadratically bigger than E and E is satisfiable iff F is satisfiable. Moreover, we can do this transformation in time polynomial in the length of E.
We do the transformation E into F in two stages.

We first push all the NOT's in E down to the leaves of the syntax tree for E to produce an equivalent expression E' in which all the NOT's are negations of variables. This can be done using DeMorgan's laws and eliminating double negations in time that is at most a quadratic function of the length of E. See HMU, Sect. 10.3.2, pp. 449-450. For example, under this transformation the expression ¬( (¬(x ∨ y)) ∧ (¬x ∨ y) ) becomes x ∨ y ∨ x ∧ ¬y.
We now take the expression E' and produce in at most quadratic time a CNF expression F that is an extension of E' such that a truth assignment T for E' makes E' true iff there exists an extension S of T that makes F true. See HMU, Theorem 10.13, pp. 452-455 for details.

A boolean expression is in k-CNF if it is the logical AND of clauses each one of which is the logical OR of exactly k distinct literals. For example, (w ∨ ¬x ∨ y) ∧ (x ∨ ¬y ∨ z) is in 3-CNF.
We define kSAT as the language { E | E is a satisfiable k-CNF boolean expression }.
3SAT is NP-complete. It is a very useful tool for showing other languages in NP are NP-complete.
We can prove 3SAT is NP-complete as follows.

We first note that 3SAT is in NP because a NTM can guess and verify a solution for 3SAT in nondeterministic polynomial time.
We can reduce CSAT to 3SAT in polynomial time. Given a CNF expression E = c₁ ∧ c₂ ∧ . . . ∧ c_k, we can replace each of the clauses by a one or more clauses with exactly three distinct literals to create a 3-CNF expression F such that a truth assignment that satisfies E can be extended into a truth assignment that satisfies F. The time taken to construct F from E is linear in the length of E. See HMU, Theorem 10.15, p. 457 for details.

It is interesting to note that 1SAT and 2SAT are in P but kSAT is NP-complete for k ≥ 3.

Even though SAT is NP-complete (a worst-case result), software tools called SAT solvers have been developed using various algorithms and heuristics that seem to work well for instances of large practical SAT problems arising in areas such as electronic design automation, model checking, software verification, and artificial intelligence.
Satisfiability Modulo Theories (SMT) are generalizations of SAT where an SMT instance is a formula in first-order logic in which some function and predicate symbols have additional interpretations. A simple example would be an instance of SAT in which some of the binary variables are replaced by linear inequalities. Many SMT solvers have been built. In a typical use of an SMT solver we would translate various kinds of assertions attached to a program into SMT formulas to see if the assertions are always true. SMT solvers allow a richer set of questions to be posed than is possible with SAT solvers.

Given the boolean expression (x₁ ∧ y₁) ∨ (x₂ ∧ y₂) ∨ . . . ∨ (x_n ∧ y_n), construct an equivalent CNF expression. How many clauses does your CNF expression have?
Given the boolean expression v ∧ w ∨ x ∧ y ∨ z, construct an equivalent 3-CNF expression.
Given the boolean expression E = (w ∨ ¬x ∨ y) ∧ (x ∨ ¬y ∨ z), using the method in the proof of Theorem 10.13 in HMU construct a 3-CNF expression that is satisfiable iff E is satisfiable.
Prove that CSAT is NP-complete.
Given a boolean expression E in disjunctive normal form, how hard is it to determine whether E is satisfiable?
Prove that 3SAT is NP-complete.