# COMS W3261 Computer Science Theory Lecture 4: September 17, 2012 Equivalence of Regular Expressions and Finite Automata

## Overview

• We say that a regular expression E and a finite automaton A are equivalent if L(E) = L(A).
• We will prove that regular expressions and finite automata are equivalent in definitional power by showing how to convert
• a regular expression into an equivalent epsilon-NFA
• an epsilon-NFA into an equivalent DFA
• a DFA into an equivalent regular expression

## 1. Review

• Deterministic finite automata
• Nondeterministic finite automata
• The subset construction

## 2. ε-NFA: NFA with Epsilon-Transitions

• An ε-NFA is an NFA (Q, Σ, δ, q0, F) whose transition function δ is a mapping from Q × (Σ ∪ {ε}) to P(Q), the set of subsets of Q.
• The language of an ε-NFA is the set of all strings that spell out a path from the start state to a final state. There can be ε-transitions along this path; they contribute no symbols to the string.

## 3. Epsilon-Closures

• We define ECLOSE(q), the ε-closure of a state q of an ε-NFA, recursively as follows:
• State q is in ECLOSE(q).
• If state p is in ECLOSE(q), then all states in δ(p, ε) are also in ECLOSE(q).
• We can compute the ε-closure of a set of states S by taking the union of the ε-closures of all the individual states in S.
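
The recursive definition above can be computed with a simple worklist search. The following is a minimal Python sketch, not code from the lecture: the dictionary encoding of δ, the name `eclose`, and the use of the empty string to label ε-transitions are all assumptions made for illustration.

```python
EPS = ""  # assumed convention: the empty string labels an ε-transition

def eclose(delta, states):
    """Return the ε-closure of a set of states.

    delta maps (state, symbol) pairs to sets of states; missing keys
    mean "no transition".
    """
    closure = set(states)      # basis: every state is in its own closure
    stack = list(states)
    while stack:
        p = stack.pop()
        # induction: everything in δ(p, ε) joins the closure
        for r in delta.get((p, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

# Hypothetical ε-NFA fragment: 0 -ε-> 1 -ε-> 2 and 2 -a-> 3
delta = {(0, EPS): {1}, (1, EPS): {2}, (2, "a"): {3}}
print(sorted(eclose(delta, {0})))  # [0, 1, 2]
```

Because the function takes a set of states, the closure of a set S comes out as the union of the closures of its members.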

## 4. McNaughton-Yamada-Thompson Algorithm: Converts an RE to an equivalent ε-NFA

• Theorem: If L = L(R) for some regular expression R, then there is an ε-NFA N such that L = L(N).
• Proof: See HMU, Sect. 3.2.3, pp. 102-107.
• The proof is in the form of an algorithm that takes as input a regular expression R of length n and recursively constructs from it an equivalent ε-NFA that has
• exactly one start state and one final state,
• at most 2n states,
• no arcs coming into its start state,
• no arcs leaving its final state,
• at most two arcs leaving any nonfinal state
• This algorithm was discovered by McNaughton and Yamada, and then independently by Ken Thompson, who used it in the string-matching program `grep` on Unix. On an input string of length m, an n-state MYT ε-NFA can be efficiently simulated in time O(mn) using a two-stack algorithm.
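
To make the recursive construction concrete, here is a hedged Python sketch in the style of the HMU construction. It is an illustration under stated assumptions, not the lecture's own code: the regular expression is assumed to be already parsed into nested tuples (`("sym", a)`, `("cat", r1, r2)`, `("alt", r1, r2)`, `("star", r)`), the empty string labels ε-transitions, and the names `thompson`, `eclose`, and `accepts` are hypothetical. The simulation keeps one set of current states rather than the two-stack formulation mentioned above.

```python
from itertools import count

EPS = ""  # assumed convention: the empty string labels an ε-transition

def thompson(node, delta, fresh):
    """Add an ε-NFA fragment for a regex AST node to delta; return (start, final)."""
    kind = node[0]
    if kind == "sym":                       # basis: one arc labeled with the symbol
        s, f = next(fresh), next(fresh)
        delta.setdefault((s, node[1]), set()).add(f)
        return s, f
    s1, f1 = thompson(node[1], delta, fresh)
    if kind == "cat":                       # R1 R2: ε-arc from final of R1 to start of R2
        s2, f2 = thompson(node[2], delta, fresh)
        delta.setdefault((f1, EPS), set()).add(s2)
        return s1, f2
    if kind == "alt":                       # R1 + R2: new start and final states
        s2, f2 = thompson(node[2], delta, fresh)
        s, f = next(fresh), next(fresh)
        delta.setdefault((s, EPS), set()).update({s1, s2})
        delta.setdefault((f1, EPS), set()).add(f)
        delta.setdefault((f2, EPS), set()).add(f)
        return s, f
    if kind == "star":                      # R1*: allow skipping or repeating R1
        s, f = next(fresh), next(fresh)
        delta.setdefault((s, EPS), set()).update({s1, f})
        delta.setdefault((f1, EPS), set()).update({s1, f})
        return s, f
    raise ValueError(f"unknown node kind: {kind}")

def eclose(delta, states):
    """ε-closure of a set of states."""
    closure, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in delta.get((p, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def accepts(delta, start, final, w):
    """Simulate the ε-NFA on w by tracking the set of reachable states."""
    current = eclose(delta, {start})
    for a in w:
        moved = set()
        for p in current:
            moved |= delta.get((p, a), set())
        current = eclose(delta, moved)
    return final in current

# (a+b)*a as a hand-built AST
ast = ("cat", ("star", ("alt", ("sym", "a"), ("sym", "b"))), ("sym", "a"))
delta, fresh = {}, count()
start, final = thompson(ast, delta, fresh)
n_states = next(fresh)                      # number of states allocated
print(n_states, accepts(delta, start, final, "abba"))
```

Each symbol, union, and star allocates two new states and concatenation allocates none, which is where a bound of the form "at most 2n states" comes from.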

## 5. Converting an ε-NFA to an equivalent DFA

• We can eliminate all ε-transitions from an ε-NFA by converting it into an equivalent DFA using the subset construction.
• Given an ε-NFA E = (QE, Σ, δE, qE, FE), we construct the DFA D = (QD, Σ, δD, qD, FD) as follows:
• QD = P(QE).
• δD is computed as follows:
• Let S = {p1, p2,..., pk} be a state of D and let a be an input symbol in Σ. Let {r1, r2,..., rm} be the union of δE(pi, a) for i = 1, 2, ..., k.
Then, δD(S, a) = ECLOSE({r1, r2,..., rm}).
• qD = ECLOSE(qE).
• FD = { S | S is in QD and S contains a state in FE }.
• As with the subset construction, we can prove by induction that L(D) = L(E).
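
The construction above can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the lecture's code: δE is encoded as a dictionary from (state, symbol) pairs to sets, the empty string labels ε-transitions, and the function names are hypothetical. As a design choice, the sketch builds only the subsets reachable from qD instead of all of P(QE), which accepts the same language.

```python
EPS = ""  # assumed convention: the empty string labels an ε-transition

def eclose(delta, states):
    """ε-closure of a set of states, returned as a hashable frozenset."""
    closure, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in delta.get((p, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def epsilon_nfa_to_dfa(delta, start, finals, alphabet):
    """Subset construction with ε-closures; explores reachable subsets only."""
    q0 = eclose(delta, {start})                 # qD = ECLOSE(qE)
    dfa_delta, seen, work = {}, {q0}, [q0]
    while work:
        S = work.pop()
        for a in alphabet:
            moved = set()
            for p in S:                         # union of δE(p, a) over p in S
                moved |= delta.get((p, a), set())
            T = eclose(delta, moved)            # δD(S, a) = ECLOSE(that union)
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                work.append(T)
    dfa_finals = {S for S in seen if S & finals}  # subsets containing a state of FE
    return seen, dfa_delta, q0, dfa_finals

def dfa_accepts(dfa_delta, q0, dfa_finals, w):
    S = q0
    for a in w:
        S = dfa_delta[(S, a)]
    return S in dfa_finals

# Hypothetical ε-NFA for a*b: 0 -a-> 0, 0 -ε-> 1, 1 -b-> 2, final state 2
delta = {(0, "a"): {0}, (0, EPS): {1}, (1, "b"): {2}}
states, dfa_delta, q0, dfa_finals = epsilon_nfa_to_dfa(delta, 0, {2}, "ab")
print(len(states), dfa_accepts(dfa_delta, q0, dfa_finals, "aab"))
```

The empty subset appears as an ordinary DFA state here; it plays the role of the dead state.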

## 6. Kleene's Algorithm: Converting a DFA to an Equivalent Regular Expression

• Given a DFA A, Kleene's algorithm constructs a regular expression R from A such that L(R) = L(A).
• Suppose the states of A are numbered 1, 2, ..., n.
• Kleene's algorithm is a dynamic programming algorithm that constructs a regular expression R[i,j,k] denoting the set of strings that label paths from state i to state j with no intermediate state on the path numbered higher than k, as follows:
```
// Base case (k = 0): paths with no intermediate states.
for (i = 1; i <= n; i++)
    for (j = 1; j <= n; j++)
        if (i != j)
            if (there are transitions from state i to state j labeled a1, a2, ..., am)
                R[i,j,0] = a1 + a2 + ... + am;
            else
                R[i,j,0] = ∅;
        else  // i == j: the empty path contributes ε
            if (there are transitions from state i to state i labeled a1, a2, ..., am)
                R[i,i,0] = ε + a1 + a2 + ... + am;
            else
                R[i,i,0] = ε;
// Inductive case: additionally allow state k as an intermediate state.
for (k = 1; k <= n; k++)
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++)
            R[i,j,k] = R[i,j,k-1] + R[i,k,k-1](R[k,k,k-1])*R[k,j,k-1];
```
• Assuming the start state is 1, the regular expression for the DFA A is then the sum (union) of all expressions `R[1,j,n]` where j is a final state.
• Note that Kleene's algorithm has the same dynamic programming structure as Warshall's transitive closure algorithm and Floyd's all-pairs shortest paths algorithm for directed graphs. Replacing union, concatenation, and star with Boolean operations, or with min and +, yields those algorithms.
• Example 3.5, HMU, pp. 95-97.
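
As an executable companion to the pseudocode, here is a hedged Python sketch. Several things in it are assumptions rather than part of the lecture: it emits Python `re` syntax (`|` for union, `None` standing for ∅, the empty string for ε) so that the result can be checked mechanically, it keeps only the table for the current k instead of a three-dimensional array, and the helper names and the two-state example DFA are hypothetical.

```python
import re
from itertools import product

def union(r, s):
    if r is None: return s                  # ∅ + s = s
    if s is None: return r
    return r if r == s else f"(?:{r}|{s})"

def concat(r, s):
    if r is None or s is None: return None  # ∅ annihilates concatenation
    return r + s                            # ε is "", so εs = s

def star(r):
    if r is None or r == "": return ""      # ∅* = ε* = ε
    return f"(?:{r})*"

def kleene(n, delta, start, finals):
    """delta maps (state, symbol) to a state; states are numbered 1..n."""
    # Base case k = 0: direct transitions, plus ε on the diagonal.
    R = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            r = "" if i == j else None
            for (p, a), q in delta.items():
                if p == i and q == j:
                    r = union(r, re.escape(a))
            R[i][j] = r
    # Inductive case: R[i,j,k] = R[i,j,k-1] + R[i,k,k-1](R[k,k,k-1])*R[k,j,k-1]
    for k in range(1, n + 1):
        new = [[None] * (n + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                via_k = concat(concat(R[i][k], star(R[k][k])), R[k][j])
                new[i][j] = union(R[i][j], via_k)
        R = new
    result = None                           # union over all final states j
    for f in sorted(finals):
        result = union(result, R[start][f])
    return result

def dfa_accepts(delta, start, finals, w):
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in finals

# Hypothetical DFA over {0, 1} accepting strings with at least one 0.
delta = {(1, "1"): 1, (1, "0"): 2, (2, "0"): 2, (2, "1"): 2}
regex = kleene(2, delta, start=1, finals={2})
pattern = re.compile(regex)
# Compare the regex with the DFA on every string of length at most 5.
agree = all(
    bool(pattern.fullmatch("".join(w))) == dfa_accepts(delta, 1, {2}, "".join(w))
    for m in range(6) for w in product("01", repeat=m)
)
print(agree)
```

The sketch applies only the identities involving ∅ and ε, so its output is much longer than a hand-simplified expression; that growth is the duplicated work that state elimination tries to avoid.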

## 7. Converting a DFA to an equivalent RE by Eliminating States

• Kleene's algorithm gives us a mechanical way to construct a regular expression from a DFA, or for that matter, from any NFA or ε-NFA.
• Another approach that avoids duplicating work is to eliminate states, one at a time, from the DFA using the procedure outlined in Section 3.2.2 of HMU.
• Example 3.6, HMU, pp. 101-102.
• You can see an expanded treatment of state elimination with more examples on pp. 583-588 of Chapter 10 of Aho and Ullman, Foundations of Computer Science.

## 8. Practice Problems

1. Use the MYT algorithm to construct an equivalent ε-NFA for the regular expression a+b*a.
2. Show the behavior of your ε-NFA on the input string bba.
3. Use the subset construction to convert your ε-NFA into a DFA.
4. Show the behavior of your DFA on the input string bba.
5. Consider the DFA D with:
   • Q = {`1`, `2`, `3`}
   • Σ = {`a`, `b`}
   • δ:

     | State | `a` | `b` |
     |-------|-----|-----|
     | `1`   | `2` | `1` |
     | `2`   | `3` | `1` |
     | `3`   | `3` | `2` |

   • Start state: `1`
   • F = {`3`}

a) Use Kleene's algorithm to construct a regular expression for L(D). Simplify your expressions as much as possible at each stage.
b) Construct a regular expression for L(D) by eliminating state 2.