COMS W3261
Computer Science Theory
Lecture 11: October 10, 2012
Decision and Closure Properties of CFL's
Outline
- Cocke-Younger-Kasami algorithm
- Testing emptiness of a CFG
- Closure properties of CFL's
- Nonclosure properties of CFL's
- Undecidable CFL problems
1. Cocke-Younger-Kasami Algorithm for Testing Membership in a CFL
- Input: a Chomsky normal form CFG G = (V, T, P, S) and a string w = a_1 a_2 ... a_n in T*.
- Output: "yes" if w is in L(G), "no" otherwise.
- Method: The CYK algorithm is a dynamic programming algorithm that fills in a triangular table of entries X_ij, 1 ≤ i ≤ j ≤ n, where X_ij is the set of nonterminals A such that A ⇒* a_i a_{i+1} ... a_j.
    for i = 1 to n do
        for each production A → a_i in P do
            add A to X_ii
    for len = 2 to n do                  // fill in the table row by row; row len holds substrings of length len
        for i = 1 to n − len + 1 do      // fill in the cells of each row from left to right
            j = i + len − 1
            for k = i to j − 1 do
                for each production A → BC in P do
                    if B is in X_ik and C is in X_{k+1,j} then
                        add A to X_ij
    if S is in X_1n then
        output "yes"
    else
        output "no"
For i < j, the algorithm adds nonterminal A to X_ij iff there is a production A → BC in P and an index k, i ≤ k < j, such that B ⇒* a_i a_{i+1} ... a_k and C ⇒* a_{k+1} a_{k+2} ... a_j.
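To compute entry X_ij, we examine at most n pairs of entries: (X_ii, X_{i+1,j}), (X_{i,i+1}, X_{i+2,j}), and so on until (X_{i,j-1}, X_jj).
There are O(n^2) entries and each takes O(n) pairs to compute, so for a fixed grammar the running time of the CYK algorithm is O(n^3).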
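The table-filling method above can be sketched in Python. This is only a minimal sketch under stated assumptions, not the lecture's own code: the function name `cyk` and the rule encoding (pairs `(A, a)` for A → a, triples `(A, B, C)` for A → BC) are chosen here for illustration, and indices are 0-based rather than 1-based. The example grammar is likewise an assumption, picked to be small.

```python
def cyk(unit_rules, binary_rules, start, w):
    """Return True iff w is derivable from `start` in a CNF grammar.

    unit_rules:   iterable of (A, a) pairs, one per production A -> a
    binary_rules: iterable of (A, B, C) triples, one per production A -> BC
    (This encoding is an assumption made for the sketch.)
    """
    n = len(w)
    if n == 0:
        return False  # a CNF grammar as defined here cannot derive the empty string
    # X[i][j] is the set of nonterminals deriving w[i..j] (0-based, inclusive).
    X = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(w):
        X[i][i] = {A for (A, a) in unit_rules if a == ch}
    for length in range(2, n + 1):          # substring length (table row)
        for i in range(n - length + 1):     # left end of the substring
            j = i + length - 1              # right end of the substring
            for k in range(i, j):           # split point: w[i..k] and w[k+1..j]
                for (A, B, C) in binary_rules:
                    if B in X[i][k] and C in X[k + 1][j]:
                        X[i][j].add(A)
    return start in X[0][n - 1]


# Example: a CNF grammar for { a^n b^n | n >= 1 } (hypothetical, not from the lecture).
unit = [("A", "a"), ("B", "b")]
binary = [("S", "A", "B"), ("S", "A", "C"), ("C", "S", "B")]
print(cyk(unit, binary, "S", "aabb"))  # True
print(cyk(unit, binary, "S", "aab"))   # False
```

The three nested loops over length, i, and k give the cubic dependence on n; the innermost loop over productions contributes a factor that depends only on the grammar.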
2. Testing Emptiness of a CFG
- Problem: Given a CFG G, is L(G) empty?
- The emptiness problem is decidable: L(G) is nonempty iff the start symbol of G is generating, that is, iff it derives some string of terminals.
- The naive algorithm has O(n^2) time complexity, where n is the size of G (the sum of the lengths of the productions).
- With a more sophisticated list-processing algorithm, the emptiness problem can be solved in linear time. See HMU, p. 302.
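The naive test can be sketched as a fixed-point computation of the generating symbols. This is a sketch under assumptions, not the linear-time algorithm from HMU: the function name `is_empty` and the encoding of productions as `(head, body)` pairs with `body` a tuple of symbols are chosen here for illustration.

```python
def is_empty(productions, terminals, start):
    """Return True iff the grammar generates no terminal string.

    productions: list of (head, body) pairs, body a tuple of symbols
                 (an empty tuple encodes an epsilon-production)
    terminals:   set of terminal symbols
    (This encoding is an assumption made for the sketch.)
    """
    generating = set(terminals)   # every terminal generates itself
    changed = True
    while changed:                # repeat full passes until nothing new is found
        changed = False
        for head, body in productions:
            if head not in generating and all(s in generating for s in body):
                generating.add(head)
                changed = True
    return start not in generating


# Hypothetical example: B -> Bb never terminates, so S -> AB generates nothing.
productions = [("S", ("A", "B")), ("A", ("a",)), ("B", ("B", "b"))]
print(is_empty(productions, {"a", "b"}, "S"))                       # True
print(is_empty(productions + [("B", ("b",))], {"a", "b"}, "S"))     # False
```

Each pass scans the whole grammar and every pass except the last adds at least one nonterminal, which is where the quadratic bound of the naive approach comes from.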
3. Closure Properties of CFL's
- The context-free languages are closed under
  - substitution
    - Let Σ be an alphabet and let L_a be a language for each symbol a in Σ. These languages define a substitution s on Σ, with s(a) = L_a.
    - If w = a_1 a_2 ... a_n is a string in Σ*, then s(w) = { x_1 x_2 ... x_n | x_i is a string in s(a_i) for 1 ≤ i ≤ n }.
    - If L is a language, s(L) is the union of the sets s(w) for all w in L.
    - If L is a CFL over Σ and s(a) is a CFL for each a in Σ, then s(L) is a CFL.
  - union
  - concatenation
  - Kleene star
  - homomorphism
  - reversal
  - intersection with a regular set
  - inverse homomorphism
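Closure under union, concatenation, and Kleene star can be illustrated with the standard grammar constructions, each of which adds a new start symbol. The sketch below is an illustration under assumptions, not code from the lecture: grammars are encoded as dictionaries mapping each nonterminal to a list of bodies (tuples of symbols), the function names are hypothetical, and the two input grammars are assumed to use disjoint nonterminal names.

```python
def union(g1, s1, g2, s2, new_start="S"):
    """Grammar for L(g1) ∪ L(g2): add S -> S1 | S2."""
    assert new_start not in g1 and new_start not in g2
    g = {**g1, **g2}
    g[new_start] = [(s1,), (s2,)]
    return g, new_start


def concat(g1, s1, g2, s2, new_start="S"):
    """Grammar for L(g1) L(g2): add S -> S1 S2."""
    assert new_start not in g1 and new_start not in g2
    g = {**g1, **g2}
    g[new_start] = [(s1, s2)]
    return g, new_start


def star(g1, s1, new_start="S"):
    """Grammar for L(g1)*: add S -> epsilon | S1 S."""
    assert new_start not in g1
    g = dict(g1)
    g[new_start] = [(), (s1, new_start)]
    return g, new_start


# Hypothetical inputs: S1 generates { a^n b^n | n >= 0 }, S2 generates c*.
g1 = {"S1": [("a", "S1", "b"), ()]}
g2 = {"S2": [("c", "S2"), ()]}
g_union, start_symbol = union(g1, "S1", g2, "S2")
print(g_union[start_symbol])  # [('S1',), ('S2',)]
```

Each construction only adds productions for the new start symbol, so the result is again a CFG.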
4. Nonclosure Properties of CFL's
- The context-free languages are not closed under
  - intersection
    - L1 = { a^n b^n c^i | n, i ≥ 0 } and L2 = { a^i b^n c^n | n, i ≥ 0 } are CFL's. But L = L1 ∩ L2 = { a^n b^n c^n | n ≥ 0 } is not a CFL.
  - complement
    - Suppose comp(L) is context free whenever L is context free. Since L1 ∩ L2 = comp(comp(L1) ∪ comp(L2)) and the CFL's are closed under union, this would imply the CFL's are closed under intersection, a contradiction.
  - difference
    - Suppose L1 – L2 is context free whenever L1 and L2 are context free. If L is a CFL over Σ, then comp(L) = Σ* – L would be context free, since Σ* is a CFL. This would imply the CFL's are closed under complement, a contradiction.
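The identity L1 ∩ L2 = { a^n b^n c^n | n ≥ 0 } used in the intersection argument can be sanity-checked on all short strings. The sketch below is a finite check, not a proof, and it says nothing about whether any of the languages is context free; the predicate names and the length bound are assumptions made for illustration.

```python
import re
from itertools import product

BLOCKS = re.compile(r"(a*)(b*)(c*)")  # strings of the shape a...ab...bc...c


def in_L1(w):
    """w is in { a^n b^n c^i | n, i >= 0 }."""
    m = BLOCKS.fullmatch(w)
    return bool(m) and len(m.group(1)) == len(m.group(2))


def in_L2(w):
    """w is in { a^i b^n c^n | n, i >= 0 }."""
    m = BLOCKS.fullmatch(w)
    return bool(m) and len(m.group(2)) == len(m.group(3))


def in_L(w):
    """w is in { a^n b^n c^n | n >= 0 }."""
    m = BLOCKS.fullmatch(w)
    return bool(m) and len(m.group(1)) == len(m.group(2)) == len(m.group(3))


def strings_up_to(max_len, alphabet="abc"):
    """Yield every string over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


# Check the identity on every string over {a, b, c} of length at most 6.
print(all(in_L(w) == (in_L1(w) and in_L2(w)) for w in strings_up_to(6)))  # True
```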
5. Undecidable CFL Problems
- We say a problem that cannot be solved by any Turing machine is undecidable.
There is no algorithm that can solve an undecidable problem.
- We shall see that several fundamental questions about context-free grammars and languages are undecidable, such as:
  - Is a given CFG ambiguous?
  - Given a CFG, is there another equivalent CFG that is unambiguous?
  - Do two given CFG's generate the same language?
  - Is the intersection of the languages generated by two CFG's empty?
  - Given a CFG G = (V, T, P, S), is L(G) = T*?
6. Practice Problems
- Let G be the following grammar:
    S → AB | BC
    A → BA | a
    B → CC | b
    C → AB | a
- Use the CYK algorithm to determine whether aabab is in L(G).
- Modify the CYK algorithm to report the number of distinct parse trees
there are for a given string w in a CNF grammar G.
- Let min(L) = { w | w is in L but no proper prefix of w is in L }.
Are the CFL's closed under the min operation?
- Let max(L) = { w | w is in L but for no string x other than ε is wx in L }.
Are the CFL's closed under the max operation?
- Let init(L) = { w | wx is in L for some string x (possibly the empty string) }.
Are the CFL's closed under the init operation?
- Let cycle(L) = { w | we can write w as xy where yx is in L }.
Are the CFL's closed under the cycle operation?
- Let half(L) = { w | there exists a string x such that |w| = |x| and wx is in L }.
Are the CFL's closed under the half operation?
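To get a feel for the five operations before attempting the closure questions, it may help to apply them to small finite languages. The sketch below only restates the definitions for finite sets of Python strings and does not address closure; the function names are hypothetical.

```python
def min_op(L):
    """Strings of L with no proper prefix in L."""
    return {w for w in L if not any(w[:i] in L for i in range(len(w)))}


def max_op(L):
    """Strings of L that have no nonempty extension in L."""
    return {w for w in L if not any(v != w and v.startswith(w) for v in L)}


def init_op(L):
    """All prefixes (including the empty and full prefix) of strings of L."""
    return {w[:i] for w in L for i in range(len(w) + 1)}


def cycle_op(L):
    """All strings xy such that yx is in L, i.e. all rotations of strings of L."""
    return {v[i:] + v[:i] for v in L for i in range(len(v) + 1)}


def half_op(L):
    """First halves of the even-length strings of L."""
    return {w[: len(w) // 2] for w in L if len(w) % 2 == 0}


L = {"a", "ab", "abc", "b"}
print(sorted(min_op(L)))          # ['a', 'b']
print(sorted(max_op(L)))          # ['abc', 'b']
print(sorted(cycle_op({"abc"})))  # ['abc', 'bca', 'cab']
```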
7. Reading Assignment
aho@cs.columbia.edu