Stream-based Text Processing Flashcards
What is a DFA ?
Deterministic Finite Automaton
Formally defined as a 5-tuple: (Q, Σ, δ, q0, F)
– Q is a set of states
– Σ is an input alphabet
– δ: Q × Σ → Q is a transition function
– q0 ∈ Q is the start state
– F ⊆ Q is a set of final or accepting states
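A minimal Python sketch of how the 5-tuple drives a run over an input stream. The machine itself is a hypothetical example (it accepts binary strings containing an even number of 1s), and the names `SIGMA`, `DELTA`, `Q0`, `F`, and `dfa_accepts` are assumptions made for illustration.

```python
# Hypothetical DFA: accepts binary strings with an even number of 1s.
Q = {"even", "odd"}                 # set of states
SIGMA = {"0", "1"}                  # input alphabet
DELTA = {                           # transition function: (state, symbol) -> state
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}
Q0 = "even"                         # start state
F = {"even"}                        # accepting states


def dfa_accepts(text):
    """Consume text one symbol at a time; accept if the final state is in F."""
    state = Q0
    for symbol in text:
        if symbol not in SIGMA:
            return False            # symbol outside the alphabet: reject
        state = DELTA[(state, symbol)]
    return state in F
```

Because δ returns exactly one state for each (state, symbol) pair, the run needs only a single `state` variable and one pass over the input.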
What is an NFA ?
Non-deterministic Finite Automaton
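Formally: (Q, Σ, δ, q0, F), as for a DFA, except that δ: Q × (Σ ∪ {ε}) → 2^Q maps each state and symbol to a set of possible next states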
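A minimal Python sketch of simulating an NFA by tracking the set of states it could currently be in. The machine is a hypothetical example (strings over {a, b} that end in "ab"), it has no ε-transitions, and the names `NFA_DELTA`, `NFA_START`, `NFA_FINAL`, and `nfa_accepts` are assumptions made for illustration.

```python
# Hypothetical NFA: accepts strings over {a, b} that end in "ab".
NFA_DELTA = {                       # (state, symbol) -> set of next states
    ("q0", "a"): {"q0", "q1"},      # non-deterministic choice on 'a'
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
NFA_START = "q0"
NFA_FINAL = {"q2"}


def nfa_accepts(text):
    """Follow every possible transition in parallel; accept if any path ends in a final state."""
    current = {NFA_START}
    for symbol in text:
        current = {
            nxt
            for state in current
            for nxt in NFA_DELTA.get((state, symbol), set())
        }
    return bool(current & NFA_FINAL)
```

A missing entry in `NFA_DELTA` simply means that path dies, which is why `.get(..., set())` is used instead of a direct lookup.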
Regular expression: Literal ?
/words/
REX: Character class ?
/./ (any character except a newline, by default)
REX: any of the characters ?
/[abc]/ (a or b or c)
REX: range of characters ?
/[0-9]/, /[a-z]/, /[A-Za-z0-9_-]/
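REX: case sensitive ?
/[_-]/ (underscore or hyphen only)
/[A-Z_-]/ (uppercase letter, underscore, or hyphen; lowercase letters do not match)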
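A short Python sketch of literals, the dot, and bracketed character classes, using the standard `re` module. The sample strings are made up for illustration, and Python writes patterns as raw strings rather than between slashes.

```python
import re

text = "Id_42-x"

literal = re.search(r"words", "some words here") is not None  # literal match
dot = re.findall(r"a.c", "abc a-c ac")           # '.' needs exactly one character
any_of = re.findall(r"[dx]", text)               # d or x
digits = re.findall(r"[0-9]", text)              # range of characters
upper_only = re.findall(r"[A-Z_-]", text)        # classes are case sensitive
ignore_case = re.findall(r"[A-Z_-]", text, flags=re.IGNORECASE)
```

Here `upper_only` is `['I', '_', '-']`, while passing `re.IGNORECASE` also picks up the lowercase `d` and `x`.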
^
start of line
$
end of line
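A short Python sketch of the anchors `^` (start of line) and `$` (end of line). The sample log text is made up. In Python the anchors apply per line only when `re.MULTILINE` is passed; without it, `^` matches only at the start of the whole string.

```python
import re

log = "error: disk full\nwarning: error rate high\nerror"

# With MULTILINE, ^ and $ anchor at every line boundary.
starts = re.findall(r"^error", log, flags=re.MULTILINE)
ends = re.findall(r"error$", log, flags=re.MULTILINE)
whole_line = re.findall(r"^error$", log, flags=re.MULTILINE)

# Without MULTILINE, ^ anchors only at the start of the string.
string_start_only = re.findall(r"^error", log)
```

Lines 1 and 3 start with "error", so `starts` has two matches, while only line 3 ends with it.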
\s
white space
\S
not white space
\d
digit
\D
not digit
\w
word character (letter, digit, or underscore)
\W
not a word character
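A short Python sketch of the shorthand classes `\s`, `\S`, `\d`, `\D`, `\w`, and `\W`. The sample line is made up. Note that for `str` patterns Python treats these classes as Unicode-aware unless `re.ASCII` is passed, so the ASCII-only reading is an assumption of this example.

```python
import re

line = "id=42\tname=Ada_L"

digits = re.findall(r"\d", line)          # each digit
non_digits = re.findall(r"\D+", line)     # runs of non-digits
fields = re.split(r"\s", line)            # split on white space (the tab)
non_space = re.findall(r"\S+", line)      # runs of non-white-space
words = re.findall(r"\w+", line)          # runs of letters, digits, underscore
non_word = re.findall(r"\W", line)        # everything else
```

Each uppercase shorthand is the complement of its lowercase partner, so `\s` and `\S` together cover every character in the line.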
(a|b)
a or b
[^abc]
not a or b or c
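A short Python sketch of alternation `(a|b)` and the negated class `[^abc]`. The sample strings are made up. `re.finditer` is used for the alternation because `re.findall` would return only the contents of the capturing group.

```python
import re

text = "gray grey groy"

# (a|e) accepts either letter in that position.
either = [m.group(0) for m in re.finditer(r"gr(a|e)y", text)]

# [^abc] matches any single character that is not a, b, or c.
not_abc = re.findall(r"[^abc]", "cabbage")
```

Here `either` is `['gray', 'grey']` and `not_abc` is `['g', 'e']`.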
*
0 or more
+
1 or more
?
0 or 1
{3}
exactly 3
{3,}
3 or more
{3,5}
3 or 4 or 5
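A short Python sketch of the quantifiers above. The helper `matches_fully` is a hypothetical name introduced for illustration; it uses `re.fullmatch` so that each pattern must account for the entire string.

```python
import re


def matches_fully(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# Each list holds one True/False result per test string, in order.
star = [matches_fully(r"ab*", s) for s in ("a", "ab", "abbb")]
plus = [matches_fully(r"ab+", s) for s in ("a", "ab", "abbb")]
optional = [matches_fully(r"colou?r", s) for s in ("color", "colour", "colouur")]
exactly_3 = [matches_fully(r"\d{3}", s) for s in ("12", "123", "1234")]
at_least_3 = [matches_fully(r"\d{3,}", s) for s in ("12", "123", "12345")]
three_to_5 = [matches_fully(r"\d{3,5}", s) for s in ("12", "1234", "123456")]
```

The only difference between `star` and `plus` is the first case: `ab*` accepts a bare "a", while `ab+` requires at least one "b".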
Perl variable names start with $, @, or % ?
$a — a scalar variable
@a — an array variable
%a — an associative array (or hash)