1.3 REGULAR EXPRESSIONS (reglulegar segðir) Flashcards
What are regular expressions?
we can use the regular operations to build up expressions describing languages, which are called regular expressions. An example is:
(0∪1)0∗.
Σ∗ describes
the language consisting of all strings over that alphabet.
Σ∗1 is ?
the language that contains all strings that end in a 1.
Which operations are done first in regex?
The star operation is done first,
followed by concatenation,
and finally union,
unless parentheses change the usual order.
Formal definition of a regular expession
Say that R is a regular expression if R is
1. a for some a in the alphabet Σ,
2. ε,
3. ∅,
4. (R1 ∪ R2), where R1 and R2 are regular expressions,
5. (R1 ◦ R2), where R1 and R2 are regular expressions, or
6. (R1∗), where R1 is a regular expression.
Don’t confuse the regular expressions ε and ∅. What is the different?
The expression ε represents the language containing a single string—namely, the empty string—whereas ∅ represents the language that doesn’t contain any strings.
0∗10∗
{w| w contains a single 1}.
Σ∗1Σ∗
{w| w has at least one 1}.
Σ∗001Σ∗
{w| w contains the string 001 as a substring}.
1∗(01+)∗
{w| every 0 in w is followed by at least one 1}.
(ΣΣ)∗
{w| w is a string of even length}.
(ΣΣΣ)∗
{w| the length of w is a multiple of 3}.
01 ∪ 10
{01, 10}.
0Σ∗0 ∪ 1Σ∗1 ∪ 0 ∪ 1
{w| w starts and ends with the same symbol}.
(0∪ε)1∗
01∗ ∪1∗.
The expression 0 ∪ ε describes the language {0, ε}, so the concatenation operation adds either 0 or ε before every string in 1∗ .
(0∪ ε)(1∪ ε)
{ε,0,1,01}.
1∗∅
∅.
Concatenating the empty set to any set yields the empty set.
∅∗
{ε}.
The star operation puts together any number of strings from the language to get a string in the result. If the language is empty, the star operation can put together 0 strings, giving only the empty string.
A language is regular if and only if some regular expression describes it.
Say that we have a regular expression R describing some lan- guage A. We show how to convert R into an NFA recognizing A. By Corol- lary 1.40, if an NFA recognizes A then A is regular.
We convert the regular expression (ab ∪ a)∗ to an NFA in a sequence of stages.
If a language is regular, then it is described by a regular expression.
PROOF IDEA We need to show that if a language A is regular, a regular expression describes it. Because A is regular, it is accepted by a DFA. We describe a procedure for converting DFAs into equivalent regular expressions.
How do we convert regex to DFA ?
We break this procedure into two parts, using a new type of finite automaton called a generalized nondeterministic finite automaton, GNFA. First we show how to convert DFAs into GNFAs, and then GNFAs into regular expressions.
What are Generalized nondeterministic finite automata?
Generalized nondeterministic finite automata are simply nondeterministic fi- nite automata wherein the transition arrows may have any regular expressions as labels, instead of only members of the alphabet or ε.
For convenience, we require that GNFAs always have a special form that meets the following conditions.
The start state has transition arrows going to every other state but no arrows coming in from any other state.
There is only a single accept state, and it has arrows coming in from every other state but no arrows going to any other state. Furthermore, the accept state is not the same as the start state.
Except for the start and accept states, one arrow goes from every state to every other state and also from each state to itself.
We can easily convert a DFA into a GNFA in the special form. How ?
We simply add a new start state with an ε arrow to the old start state and a new accept state with ε arrows from the old accept states. If any arrows have multiple labels (or if there are multiple arrows going between the same two states in the same direction), we replace each with a single arrow whose label is the union of the previous labels. Finally, we add arrows labeled ∅ between states that had no arrows.
A generalized nondeterministic finite automaton is a 5-tuple, (Q, Σ, δ, qstart, qaccept), where
- Q is the finite set of states,
- Σ is the input alphabet,
- δ: (Q − {qaccept}) × (Q − {qstart})−→R is the transition function,
- qstart is the start state, and
- qaccept is the accept state.