Language & Computation Flashcards
TW3V19002 UU
Formal automata
Models with step by step computation.
Formal grammars
Recipes to create patterns and sentences.
List the four language types from Chomsky’s hierarchy from simplest to most complicated:
- Regular
- Context-free
- Context-sensitive
- Recursively enumerable
List the four automata from Chomsky’s hierachy from simplest to most complicated:
- Finite state
- Pushdown
- Linearly bounded
- Turing machine
Describe the language and automaton of the type 3 grammar in Chomsky’s hiearchy:
Type 3 grammar has a regular language and a finite state automaton.
Describe the language and automaton of the type 2 grammar in Chomsky’s hiearchy:
Type 2 grammar has a context-free language and a pushdown automaton.
Describe the language and automaton of the type 1 grammar in Chomsky’s hiearchy:
Type 1 grammar has a context-sensitive language and a linearly bounded automaton.
Describe the language and automaton of the type 0 grammar in Chomsky’s hiearchy:
Type 0 grammar has a recursively enumerable language and a turing machine.
What is the trade-off found in Chomsky’s hierarchy concerned with?
Expressivity and complexity.
Greater expressivity requires…
greater complexity.
What does a(n English) syllable consist of?
The onset, the nucleus and the coda.
Concatenation
One symbol after another that form a sequence.
In what three ways can you make combinations in the formal grammar RegEx?
- Concatenation
- Choice
- Repetition
FSA
finite state automaton
What is the problem for finite automata on palindroms?
The computational power is not enough, it does not have any memory.
How is the stack of a pushdown automaton organized?
First In, Last Out
FILO
first in, last out
PDA
Pushdown automaton
Which grammar types have polynomial complexity?
Type 3 and type 2.
FA
finite automaton
The program (in FA/FSA)
The finite set of instructions for changing from state to state while reading.
Deterministic FA
FAs where there is only one instruction for each combination of state and read symbol.
State diagram
A representation for an FA in which each state is a circle and the instructions are arrows.
Give the formal definition of a deterministic FA:
A deterministic FA M is a quintuple [K, Sigma, delta, q0, F] with
K = finite set of states,
Sigma = finite set (the alphabet),
Q0 E K = the initial state,
F [subset] K = set of final states,
delta = f: K x Sigma -> K (the transition function).
Give the formal definition of a situation in a deterministic FA:
Given a dFA M = [K, Sigma, d, q0, F], a situation of M is a triple (x,q,y) where q E k and x,y E Sigma^x.
Lemmatization
The task of determining that two words have the same root
Stemming
A simple form of lemmatization where suffixes are stripped from the end of the word.
Sentence segmentation
Breaking up a text into individual sentences using cues like periods or exclamation points.
Concatenation
Putting characters in sequence.
What is the hierachy in regex?
- Parentheses ()
- Counters *, +, ? and {}
- Sequences and anchors ^, $
- Disjunction |
What is precision?
Minimise false positivies.
Recall
Minimize false negatives
Regex: find a pattern that contains either t or T.
/[tT]/
Regex: find a pattern that matches ‘tea’ or ‘Tea’.
/[tT]ea/
Regex: find any number in range 2, 3, 4, or 5.
/[2-5]/
Regex: find any letter in range c, d, e, f, g, h.
/[c-h]/
Regex: find any upper case letter.
/[A-Z]/
Regex: find any lowercase letter.
/[a-z]/
Regex: find any single digit.
/[0-9]/
Regex: find any single character, except f.
/[^f]/
Regex: find any single character, but neither a nor A.
/[^aA]/
Regex: find either f or ^.
/[f^]/
Regex: find the pattern c^f.
/c^f/
Regex: find ‘horse’ or ‘horses’.
/horses?/
Regex: what does ? mean?
The preceding character or nothing.
Regex; what does * mean?
Zero or unbounded amount of the previous character
/aa*/ (regex)
Any string of one a followed by zero or more a’s
What does . mean in regex?
It’s a wildcard, any single character.
How do you search for the start of a line with regex?
/^/
How do you search for the end of a line with regex?
$
How do you search for a word boundary in regex?
\b
How do you search for a non-word boundary in regex?
\B
How do you use the . as an actual dot in regex?
.
How do you show disjunction in regex?
|
How do you search for either ‘cat’ or ‘dog’ in regex?
/cat|dog/
When do you use parentheses in regex?
When you want the sequence to act as a single character to the operators around it.
How do you look for as little text as possible when using a Kleene star?
*?
How do you look for as little text as possible when using a Kleene plus in regex?
+?
How do you look for exactly x occurrences of the previous character?
/character{x}/
What is a shortcut in regex for any alphanumeric/underscore character?
\w
What is a shortcut in regex for any digit?
\d
Give the shortcut in regex for any non-digit:
\D
Give the shortcut in regex for any non-alphanumeric:
\W
Give the shortcut to look for a whitespace in regex:
\s
Give the shortcut in regex for a non-whitespace:
\S
How do you search for a - b occurrences of the previous char in regex?
/char{a-b}/
How do you look for at least n occurrences of the previous in regex?
/char{n,}/
How do you look for up to m occurrences of the previous in regex?
/char{,m}/
How do you search for the asterisk in regex?
*
How do you search for the question mark in regex?
\?
How do you find a newline in regex?
\n
How do you find a tab in regex?
\t
When do you use \1 in regex?
To refer to whatever string matched the first item in parentheses.
What do you use when you don’t want to capture the group between parentheses in regex?
(?:pattern)
Fragment
broken-off word
Lemma
Set of lexical forms having the same stem, pos and word sense.
Word form
The full inflected or derived form of the word.
Types (in corpora)
The number of distinct words in a corpus.