Computer Science Theory

Lecture 8: October 1, 2012

Pushdown Automata

- Review
- Pushdown automata (PDA)
- Instantaneous descriptions of PDA's
- The language of a PDA
- Deterministic PDA
- From a CFG to an equivalent PDA
- From a PDA to an equivalent CFG

- Context-free grammars
- Derivations and parse trees
- Ambiguity

- A pushdown automaton is an ε-NFA with a pushdown stack (last-in, first-out stack).
- Pushdown automata define exactly the context-free languages.
- There are seven components to a PDA P = (Q, Σ, Γ, δ, q_{0}, Z_{0}, F):
  - Q is a finite set of states.
  - Σ is a finite set of input symbols (the input alphabet).
  - Γ is a finite set of stack symbols (the stack alphabet).
  - δ is a transition function from Q × (Σ ∪ {ε}) × Γ to finite subsets of Q × Γ*:
    - Suppose δ(q, *a*, X) contains (p, γ). Then whenever P is in state q, looking at the input symbol *a* with X on top of the stack, P may go into state p, move to the next input symbol, and replace X on top of the stack by the string γ.
    - The second argument, *a*, may be ε, in which case P makes the move without looking at the input symbol and does not move to the next input symbol.
    - Note that P is nondeterministic, so there may be more than one pair in δ(q, *a*, X).
  - q_{0} is the start state.
  - Z_{0} is the start stack symbol.
  - F is the set of final (accepting) states.
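
The seven components can be written down directly as data. The following is a minimal sketch, not a standard library API: it assumes every input and stack symbol is a single character, uses the empty string for ε, writes stack strings with the top symbol first, and stores δ as a dictionary. The names `make_pda` and `moves` and the example PDA for { `wcw`^{R} } are illustrative choices.

```python
EPS = ""  # assumption: the empty string stands in for ε


def make_pda(states, sigma, gamma, delta, q0, z0, finals):
    """Bundle the seven components and check them against the definition."""
    assert q0 in states and z0 in gamma and finals <= states
    for (q, a, X), targets in delta.items():
        # The domain of delta is Q x (Sigma ∪ {ε}) x Gamma.
        assert q in states and X in gamma
        assert a == EPS or a in sigma
        for p, pushed in targets:
            # Each target is a pair (state, string over the stack alphabet).
            assert p in states and all(Y in gamma for Y in pushed)
    return {"Q": states, "Sigma": sigma, "Gamma": gamma, "delta": delta,
            "q0": q0, "Z0": z0, "F": finals}


def moves(pda, q, a, X):
    """Return delta(q, a, X); a missing entry means the empty set."""
    return pda["delta"].get((q, a, X), set())


# Hypothetical example: a PDA for { w c w^R | w in {a, b}* }.
delta = {}
for X in "abZ":
    delta[("q0", "a", X)] = {("q0", "a" + X)}  # push a on top of X
    delta[("q0", "b", X)] = {("q0", "b" + X)}  # push b on top of X
    delta[("q0", "c", X)] = {("q1", X)}        # midpoint: start matching
delta[("q1", "a", "a")] = {("q1", EPS)}        # pop a matching a
delta[("q1", "b", "b")] = {("q1", EPS)}        # pop a matching b
delta[("q1", EPS, "Z")] = {("q2", "Z")}        # bottom marker reached: accept

P = make_pda({"q0", "q1", "q2"}, {"a", "b", "c"}, {"a", "b", "Z"},
             delta, "q0", "Z", {"q2"})
print(moves(P, "q0", "a", "Z"))  # {('q0', 'aZ')}
```

Returning the empty set for a missing dictionary entry mirrors the convention that δ(q, *a*, X) is empty when no move is defined.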

- We can represent a configuration of the PDA P above by a triple (q, *w*, γ), called an instantaneous description (ID), where:
  - q is the state of the finite-state control.
  - *w* is the string of remaining input symbols.
  - γ is the string of symbols on the stack. If γ = XYZ, then X is on top of the stack.
- Suppose δ(q, *a*, X) contains (p, α). Then to represent a single move of P we write
  - (q, *aw*, Xβ) |– (p, *w*, αβ)
  - for all strings *w* in Σ* and β in Γ*.
- Note that *a* may be ε.
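
The single-move relation |– can be computed mechanically from δ. The sketch below assumes single-character symbols, the empty string for ε, and stack strings written top first; `successors` is a hypothetical helper name, and the small δ (which accepts { a^n b^n | n ≥ 1 } by empty stack) is only an illustration.

```python
EPS = ""  # assumption: the empty string stands in for ε


def successors(delta, ident):
    """Return every ID reachable from `ident` in exactly one move."""
    q, w, stack = ident
    if not stack:
        return set()  # every move needs a symbol X on top of the stack
    X, beta = stack[0], stack[1:]
    result = set()
    # ε-moves: (q, w, Xβ) |– (p, w, αβ); the input is left untouched.
    for p, alpha in delta.get((q, EPS, X), ()):
        result.add((p, w, alpha + beta))
    # Moves on an input symbol a: (q, aw', Xβ) |– (p, w', αβ).
    if w:
        for p, alpha in delta.get((q, w[0], X), ()):
            result.add((p, w[1:], alpha + beta))
    return result


# Hypothetical example transition function.
delta = {
    ("q", "a", "Z"): {("q", "AZ")},
    ("q", "a", "A"): {("q", "AA")},
    ("q", "b", "A"): {("p", EPS)},
    ("p", "b", "A"): {("p", EPS)},
    ("p", EPS, "Z"): {("p", EPS)},
}

print(successors(delta, ("q", "aabb", "Z")))  # {('q', 'abb', 'AZ')}
print(successors(delta, ("p", "", "Z")))      # {('p', '', '')}
```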

- A PDA P = (Q, Σ, Γ, δ, q_{0}, Z_{0}, F) can define a language in two ways.
- Acceptance by final state: P can accept an input string w by reading all of it during a sequence of moves and entering a final state.
  - Formally, we define L(P), the language accepted by P by final state, to be the set of input strings w such that P can go from its initial ID (q_{0}, w, Z_{0}) in a sequence of zero or more moves to an accepting ID of the form (q, ε, α), where q is a final state and α is any stack string (perhaps empty).
- Acceptance by empty (null) stack: P can accept an input string by reading all of it and emptying its stack.
  - Formally, we define N(P), the language accepted by P by empty stack, to be the set of input strings w such that P can go from its initial ID (q_{0}, w, Z_{0}) in a sequence of zero or more moves to an accepting ID of the form (q, ε, ε) for any state q.
  - Note that the final states of a PDA accepting by empty stack are irrelevant.
- These two modes of acceptance are equivalent. That is, L has a PDA that accepts it by final state iff L has a PDA that accepts it by empty stack.
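
Both acceptance conditions can be tested by searching the IDs reachable from the initial ID. The sketch below is one possible way to do this, under stated assumptions: single-character symbols, the empty string for ε, stack strings written top first, and a `limit` on the number of IDs explored (a PDA with ε-loops can reach infinitely many IDs, so a search that hits the limit may wrongly report rejection). All function names are illustrative. The example PDA for { `wcw`^{R} } pops Z_{0} as it enters its final state, so it accepts the same strings in both modes.

```python
from collections import deque

EPS = ""  # assumption: the empty string stands in for ε


def successors(delta, ident):
    """One-move successors of the ID (q, w, stack)."""
    q, w, stack = ident
    if not stack:
        return set()
    X, beta = stack[0], stack[1:]
    out = {(p, w, alpha + beta) for p, alpha in delta.get((q, EPS, X), ())}
    if w:
        out |= {(p, w[1:], alpha + beta)
                for p, alpha in delta.get((q, w[0], X), ())}
    return out


def reachable_ids(delta, q0, z0, w, limit=10_000):
    """Breadth-first search from the initial ID (q0, w, Z0), bounded by limit."""
    start = (q0, w, z0)
    seen, queue = {start}, deque([start])
    while queue and len(seen) < limit:
        for nxt in successors(delta, queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def accepts_by_final_state(delta, q0, z0, finals, w):
    # Accepting ID: (q, ε, α) with q final and any stack α.
    return any(q in finals and rest == ""
               for q, rest, _ in reachable_ids(delta, q0, z0, w))


def accepts_by_empty_stack(delta, q0, z0, w):
    # Accepting ID: (q, ε, ε) for any state q.
    return any(rest == "" and stack == ""
               for _, rest, stack in reachable_ids(delta, q0, z0, w))


# Hypothetical PDA for { w c w^R | w in {a, b}* }.
delta = {}
for X in "abZ":
    delta[("q0", "a", X)] = {("q0", "a" + X)}
    delta[("q0", "b", X)] = {("q0", "b" + X)}
    delta[("q0", "c", X)] = {("q1", X)}
delta[("q1", "a", "a")] = {("q1", EPS)}
delta[("q1", "b", "b")] = {("q1", EPS)}
delta[("q1", EPS, "Z")] = {("q2", EPS)}  # pop Z0 and enter the final state

print(accepts_by_final_state(delta, "q0", "Z", {"q2"}, "abcba"))  # True
print(accepts_by_empty_stack(delta, "q0", "Z", "abcab"))          # False
```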

- A PDA is deterministic (DPDA) if there is never a choice for a next move in any instantaneous description.
- A PDA (Q, Σ, Γ, δ, q_{0}, Z_{0}, F) is deterministic if:
  - δ(*q*, *a*, *X*) has at most one member for any *q* in Q, *a* in Σ ∪ {ε} and *X* in Γ.
  - If δ(*q*, *a*, *X*) is nonempty for some *a* in Σ, then δ(*q*, ε, *X*) must be empty.
- A DPDA can recognize { `wcw`^{R} | `w` is any string of `a`'s and `b`'s }.
- A PDA can recognize { `ww`^{R} | `w` is any string of `a`'s and `b`'s }, but no DPDA can recognize this language.
- If L is a regular language, then L can be recognized by a DPDA.
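
The two determinism conditions can be checked directly on a transition table. The sketch below assumes δ is stored as a dictionary from (state, input symbol or ε, stack symbol) to a set of (state, pushed string) pairs, with the empty string for ε; `is_deterministic` and the two example tables are illustrative names. The check only classifies a given PDA; it says nothing about whether some other, deterministic PDA exists for the same language.

```python
EPS = ""  # assumption: the empty string stands in for ε


def is_deterministic(delta):
    """Check both DPDA conditions on a transition table."""
    for (q, a, X), targets in delta.items():
        # Condition 1: delta(q, a, X) has at most one member.
        if len(targets) > 1:
            return False
        # Condition 2: a move on a real input symbol excludes an ε-move
        # for the same state and stack symbol.
        if a != EPS and targets and delta.get((q, EPS, X)):
            return False
    return True


def wcwr_delta():
    """Hypothetical table for { w c w^R }: the c marks the midpoint."""
    delta = {}
    for X in "abZ":
        delta[("q0", "a", X)] = {("q0", "a" + X)}
        delta[("q0", "b", X)] = {("q0", "b" + X)}
        delta[("q0", "c", X)] = {("q1", X)}
    delta[("q1", "a", "a")] = {("q1", EPS)}
    delta[("q1", "b", "b")] = {("q1", EPS)}
    delta[("q1", EPS, "Z")] = {("q2", "Z")}
    return delta


def wwr_delta():
    """Hypothetical table for { w w^R }: the midpoint must be guessed."""
    delta = {}
    for X in "abZ":
        delta[("q0", "a", X)] = {("q0", "a" + X)}
        delta[("q0", "b", X)] = {("q0", "b" + X)}
        delta[("q0", EPS, X)] = {("q1", X)}  # guess that the middle is here
    delta[("q1", "a", "a")] = {("q1", EPS)}
    delta[("q1", "b", "b")] = {("q1", EPS)}
    delta[("q1", EPS, "Z")] = {("q2", "Z")}
    return delta


print(is_deterministic(wcwr_delta()))  # True
print(is_deterministic(wwr_delta()))   # False
```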

- Given a CFG *G*, we can construct a PDA *P* such that N(*P*) = L(*G*).
- The PDA will simulate leftmost derivations of *G*.
- Algorithm to construct a PDA for a CFG
  - Input: a CFG *G* = (V, T, Q, S), where Q here denotes the set of productions.
  - Output: a PDA *P* such that N(*P*) = L(*G*).
  - Method: Let *P* = ({q}, T, V ∪ T, δ, q, S), where the set of final states is omitted because *P* accepts by empty stack, and
    - δ(*q*, ε, *A*) = {(*q*, β) | *A* → β is in Q } for each nonterminal *A* in V.
    - δ(*q*, *a*, *a*) = {(*q*, ε)} for each terminal *a* in T.
- For a given input string *w*, the PDA simulates a leftmost derivation for *w* in *G*.
- We can prove that N(*P*) = L(*G*) by showing that *w* is in N(*P*) iff *w* is in L(*G*):
  - If part: If *w* is in L(*G*), then there is a leftmost derivation
    - S = γ_{1} ⇒ γ_{2} ⇒ ... ⇒ γ_{n} = *w*
    - We show by induction on *i* that *P* simulates this leftmost derivation by the sequence of moves
    - (*q*, *w*, S) |–* (*q*, *y*_{i}, α_{i})
    - such that if γ_{i} = *x*_{i}α_{i}, then *x*_{i}*y*_{i} = *w*.
  - Only-if part: If (*q*, *x*, A) |–* (*q*, ε, ε), then A ⇒* *x*.
    - We can prove this statement by induction on the number of moves made by *P*.
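
The construction above can be sketched in a few lines. The code assumes single-character grammar symbols, the empty string for ε, and productions given as a dictionary from each nonterminal to a list of bodies; `cfg_to_pda`, `accepts_by_empty_stack`, and the example grammar S → aSb | ε are illustrative choices. The search is bounded by `limit` because some grammars (left-recursive ones, for example) let the PDA expand nonterminals forever without reading input, in which case the bounded search may wrongly report rejection.

```python
from collections import deque

EPS = ""  # assumption: the empty string stands in for ε


def cfg_to_pda(variables, terminals, productions):
    """Build delta for the one-state PDA P with N(P) = L(G)."""
    delta = {}
    for A in variables:
        # Expand: replace nonterminal A on top of the stack by a body of A.
        delta[("q", EPS, A)] = {("q", body) for body in productions.get(A, [])}
    for a in terminals:
        # Match: pop terminal a when it is also the next input symbol.
        delta[("q", a, a)] = {("q", EPS)}
    return delta


def accepts_by_empty_stack(delta, start_symbol, w, limit=10_000):
    """Bounded breadth-first search for an ID of the form (q, ε, ε)."""
    start = ("q", w, start_symbol)
    seen, queue = {start}, deque([start])
    while queue and len(seen) < limit:
        q, rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True
        if not stack:
            continue  # stack emptied before the input was consumed
        X, beta = stack[0], stack[1:]
        nxt = {(p, rest, alpha + beta)
               for p, alpha in delta.get((q, EPS, X), ())}
        if rest:
            nxt |= {(p, rest[1:], alpha + beta)
                    for p, alpha in delta.get((q, rest[0], X), ())}
        for ident in nxt - seen:
            seen.add(ident)
            queue.append(ident)
    return False


# Hypothetical example grammar: S -> aSb | ε, generating { a^n b^n | n >= 0 }.
productions = {"S": ["aSb", ""]}
delta = cfg_to_pda({"S"}, {"a", "b"}, productions)

print(accepts_by_empty_stack(delta, "S", "aabb"))  # True
print(accepts_by_empty_stack(delta, "S", "aab"))   # False
```

On input `aabb` the accepting run expands S twice, matches `aa`, expands S to ε, and matches `bb`, which corresponds to the leftmost derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb.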

- Given a PDA *P*, we can construct a CFG *G* such that L(*G*) = N(*P*).
- The basic idea of the proof is to generate the strings that cause *P* to go from state *q* to state *p*, popping a symbol X off the stack, by a nonterminal of the form [*q*X*p*].
- Algorithm to construct a CFG for a PDA
  - Input: a PDA *P* = (Q, Σ, Γ, δ, q_{0}, Z_{0}, F).
  - Output: a CFG *G* = (V, Σ, R, S) such that L(*G*) = N(*P*).
  - Method:
    - Let the nonterminal S be the start symbol of *G*. The other nonterminals in V will be symbols of the form [*p*X*q*] where *p* and *q* are states in Q, and X is a stack symbol in Γ.
    - The set of productions R is constructed as follows:
      - For all states *p*, R has the production S → [*q*_{0}Z_{0}*p*].
      - If δ(*q*, *a*, X) contains (*r*, Y_{1}Y_{2} … Y_{k}), where *a* is in Σ ∪ {ε}, then R has the productions
        - [*q*X*r*_{k}] → *a* [*r*Y_{1}*r*_{1}] [*r*_{1}Y_{2}*r*_{2}] … [*r*_{k-1}Y_{k}*r*_{k}]
        - for all lists of states *r*_{1}, *r*_{2}, … , *r*_{k}. If k = 0, the production is [*q*X*r*] → *a*.
- We can prove that [*q*X*p*] ⇒* *w* iff (*q*, *w*, X) |–* (*p*, ε, ε).
- From this, we have [*q*_{0}Z_{0}*p*] ⇒* *w* iff (*q*_{0}, *w*, Z_{0}) |–* (*p*, ε, ε), so we can conclude L(*G*) = N(*P*).
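
The production rules above can be enumerated mechanically. In the sketch below a nonterminal [*q*X*p*] is represented by the tuple `(q, X, p)`, a production by a pair `(head, body)`, and ε by the empty string; stack symbols are assumed to be single characters and `pda_to_cfg` is a hypothetical name. The example is a one-state PDA with δ(q, i, Z) = {(q, ZZ)} and δ(q, e, Z) = {(q, ε)}, chosen only because the resulting grammar is small.

```python
from itertools import product

EPS = ""  # assumption: the empty string stands in for ε


def pda_to_cfg(states, delta, q0, z0):
    """Return the set of productions (head, body) of the grammar G."""
    productions = set()
    # S -> [q0 Z0 p] for every state p.
    for p in states:
        productions.add(("S", ((q0, z0, p),)))
    for (q, a, X), targets in delta.items():
        for r, pushed in targets:
            k = len(pushed)  # pushed = Y1 Y2 ... Yk
            # One production for every list of states r1, ..., rk.
            # For k = 0 there is exactly one (empty) list.
            for rs in product(states, repeat=k):
                head = (q, X, rs[-1] if k else r)
                body = [a] if a != EPS else []
                prev = r
                for Y, nxt in zip(pushed, rs):
                    body.append((prev, Y, nxt))  # [r_{i-1} Y_i r_i]
                    prev = nxt
                productions.add((head, tuple(body)))
    return productions


# Hypothetical one-state example PDA.
delta = {
    ("q", "i", "Z"): {("q", "ZZ")},
    ("q", "e", "Z"): {("q", EPS)},
}
G = pda_to_cfg(["q"], delta, "q", "Z")
for head, body in sorted(G, key=repr):
    print(head, "->", body)
```

With a single state the grammar has three productions: S → [qZq], [qZq] → i [qZq] [qZq], and [qZq] → e. With n states, a move that pushes k symbols contributes n^k productions, so the grammar can be much larger than the PDA.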

- Construct a PDA that accepts { `wcw`^{R} | `w` is any string of `a`'s and `b`'s } by final state.
- Construct a PDA that accepts { `wcw`^{R} | `w` is any string of `a`'s and `b`'s } by empty stack.
- Construct a PDA that accepts { `ww`^{R} | `w` is any string of `a`'s and `b`'s } by final state.
- Construct a PDA that accepts { `ww`^{R} | `w` is any string of `a`'s and `b`'s } by empty stack.
- Construct a PDA *P* such that N(*P*) = L(*G*) where *G* is S → (S)S | ε.

- HMU: Ch. 6

aho@cs.columbia.edu