Regular Languages Flashcards
Definition 1.1 (Finite automaton)
A finite automaton (FA) M is a 5-tuple M =
(Q, Σ, δ, q0, F) where
1. Q is a finite set called the states,
2. Σ is a finite set called the alphabet,
3. δ : Q × Σ → Q is the transition function,
4. q0 ∈ Q is the start state, and
5. F ⊆ Q is the set of accept states / final states.
STD
State Transition Diagram
Definition 1.2 (Strings accepted by M)
Let M = (Q, Σ, δ, q0, F) be a finite automaton
and w = w1w2 · · · wn be a string over alphabet Σ.
M accepts w if there exists a sequence of states r0, r1, . . . , rn, such that all following
three conditions hold:
- r0 = q0 (M starts in start state.)
- δ(ri, wi+1) = ri+1, for i = 0, . . . n − 1 (State change follows transition function.)
- rn ∈ F (M ends up in accept state)
If M does not accept w, it rejects it.
Definition 1.3 (Computation of an FA)
Let M = (Q, Σ, δ, q0, F) be a finite automaton
and w = w1w2 · · · wn be a string over alphabet Σ.
A computation of M on w is a sequence of states c = r0, r1, . . . , rn, such that the
following two conditions hold:
1. r0 = q0 (M starts in start state.)
2. δ(ri, wi+1) = ri+1, for i = 0, . . . n − 1 (State change follows transition function.)
We call c an accepting computation, if rn ∈ F, otherwise we call it a rejecting computation.
Definition 1.4 (Language of machine M)
Let M = (Q, Σ, δ, q0, F) be a finite automaton.
The language of machine M L(M) is the set of all strings that are accepted by M.
We say: M recognizes L(M).
Definition 1.5 (Regular language)
A language is called a regular language if some
finite automaton recognizes it.
Definition 1.6 (Regular operations)
Let A and B be languages. We define the regular
operations union, concatenation and star as follows:
Union: A U B := {x | x ∈ A or x ∈ B}
Concatenation: A ◦ B := {xy|x ∈ A and y ∈ B}.
Star: A* := {x1x2 . . . xk|k ≥ 0 and each xi ∈ A}
Theorem 1.1 The class of regular languages is closed under the union operation.
In other words, if A1 and A2 are regular languages, so is A1 ∪ A2.
Theorem 1.2 The class of regular languages is closed under the concatenation operation.
In other words, if A1 and A2 are regular languages, then A1 ◦ A2 is regular.
Definition 1.7 (Nondeterministic finite automaton)
A nondeterministic finite automaton (NFA) is a 5-tuple (Q, Σ, δ, q0, F), where
- Q is a finite set of states,
- Σ is a finite alphabet,
- δ : Q × Σε → P(Q) is the transition function, with Σε := Σ ∪ {ε},
- q0 ∈ Q is the start state, and
- F ⊆ Q is the set of accept states.
Definition 1.8 (Strings accepted by NFA N)
Let N = (Q, Σ, δ, q0, F) be a nondeterministic finite automaton and w be a string over alphabet Σ.
N accepts w if we can write w as w = y1y2 . . . ym, yi ∈ Σε and if there exists a sequence of states r0, r1, . . . , rm (in Q), such that all following three conditions hold:
- r0 = q0 (N starts in start state.)
- ri+1 ∈ δ(ri, yi+1), for i = 0, . . . m − 1 (State change follows transition function.)
- rm ∈ F (N ends up in accept state)
If N does not accept w, it rejects it.
Definition 1.9 (Computation branch of an NFA)
Let N = (Q, Σ, δ, q0, F) be a nondeterministic finite automaton and w = w1w2 · · · wn be a string over alphabet Σ.
A computation branch of N on w allows for a modification of the input string w to a string w = y1y2 · · · ym, yi ∈ Σε and is a sequence of states c = r0, r1, . . . , rm, such that the following two conditions hold:
- r0 = q0 (N starts in start state.)
- ri+1 ∈ δ(ri, wi+1), for i = 0, . . . m − 1
(State change follows transition function.)
We call c an accepting computation branch, if rn ∈ F, otherwise we call it a rejecting computation branch.
Definition 1.10 (Language of NFA N)
Let N = (Q, Σ, δ, q0, F) be a nondeterministic
finite automaton.
The language of N L(N) is the set of all strings that are accepted by N. We say: N
recognizes L(N).
Theorem 1.3 Every nondeterministic finite automaton has an equivalent
deterministic finite automaton, i.e. there exists for every NFA N a DFA M such that L(M) = L(N).
Corollary 1.1 A language is regular if and only if some
nondeterministic finite automaton recognizes it.
Theorem 1.4 The class of regular languages is closed
under the star operation.
Definition 1.11 (Regular Expressions)
Let Σ be an alphabet. A regular expression (RE) R over the alphabet Σ describes a language L(R) over Σ. It is inductively defined such that R is RE if
R is
1. a for some a ∈ Σ, with L(a) = {a}
2. ε, with L(ε) = {ε}
3. ∅, with L(∅) = ∅
4. (R1 ∪ R2) (R1, R2 REs), with L(R1 ∪ R2) = L(R1) ∪ L(R2)
5. (R1 ◦ R2) (R1, R2 REs), with L(R1 ◦ R2) = L(R1) ◦ L(R2)
6. (R∗1), (R1 RE), with L(R∗1) = L(R1)∗
Parentheses in an RE can be omitted. Then, evaluation is done in the precedence order ∗, ◦, ∪. If clear from the context, the concatenation operator ◦ does not have to be written down. Moreover, “R1 = R2” means L(R1) = L(R2).
Theorem 1.5 A language is regular if and only …
if some regular expression describes it
Definition 1.12 (generalized nondeterministic finite automaton)
A generalized nondeterministic finite automaton (GNFA), (Q, Σ, δ, qstart, qaccept), is a 5-tuple, with
- Q is the finite set of states,
- Σ is the finite input alphabet,
- δ : Q × Q → R (R: set of all REs over Σ), where δ(q, qstart) and δ(qaccept, q) are undefined for all q ∈ Q.
- qstart ∈ Q \ {qaccept} is the start state, and
- qaccept ∈ Q \ {qstart} is the accept state
Definition 1.13 (Strings accepted by a GNFA)
A GNFA G = (Q, Σ, δ, qstart, qaccept) accepts a string w ∈ Σ∗, if there exists a decomposition w = w1w2 · · · wk (wi ∈ Σ∗) and a sequence of states q0, q1, . . . , qk such that
1. q0 = qstart is the start state
2. qk = qaccept is the accept state, and
3. for each i = 1, . . . , k we have wi ∈ L(Ri), where Ri = δ(qi−1, qi).
Otherwise w is rejected. L(G) is the language recognized by G.
Theorem 1.6 (Pumping Lemma)
Let Σ be an alphabet and A be a language over Σ. If A is a regular language, then there is a number p, the pumping length, where, if s is any string in A of length at least p, then s may be divided into three pieces, s = xyz, satisfying the following conditions: 1. for each i ≥ 0, xyi z ∈ A, 2. |y| > 0, and 3. |xy| ≤ p.