Module 1 - chatbot and fundamentals Flashcards

1
Q

A definite noun refers to a ____________ of a noun(s), while an indefinite noun refers to a ____________ of a noun(s)

A

A definite noun refers to a specific instance of a noun, while an indefinite noun refers to a general category of nouns

A woman –> indefinite
thereafter,
The woman –> definite

2
Q

Phonetics and phonology:

A

how words are related to sounds that realize them

3
Q

Morphology:

A

how words are constructed from more basic meaning units

4
Q

Syntax:

A

how words can be put together to form correct utterances

What structural role each word plays in the sentence
What phrases are subparts of other phrases

5
Q

Lexical semantics:

A

what words mean

sole vs soul

6
Q

Compositional semantics:

A

how word meanings combine to form larger meanings

7
Q

Pragmatics:

A

how situation affects interpretation of utterance

Context matters

8
Q

Discourse structure:

A

how preceding utterances affect the processing of the next utterance

Friend 1: I’m hungry.
Friend 2: Let’s go to the Fuji Gardens. (restaurant)

Friend 1: It’s a beautiful day.
Friend 2: Let’s go to the Fuji Gardens.

9
Q

Morphology: How words are constructed from more basic units, called __________

A

How words are constructed from more basic units, called morphemes

adverb/adjective, pluralization, suffixes…

friend + ly = friendly
friend is the noun
suffix -ly turns it into an adjective (attached to an adjective such as quick, -ly forms an adverb: quickly)

10
Q

Temporal Interpretation

a subset of Discourse

A

Understanding of time impacts your meaning of the sentence.

“Max fell. John pushed him”

him refers to Max; the pushing happened before the falling; here the second sentence is an explanation for the first.

11
Q

World Knowledge

a subset of discourse

A

What we know about the world and what we can assume our hearer
knows about the world is intimately tied to our ability to use language

I took the fugu from the plate and ate it.
Here “the fugu” refers to the dish made from fugu, not a live fugu fish.

12
Q

_ is a fundamental problem of computational linguistics. Resolving _ is a crucial goal.

A

ambiguity

13
Q

Normalization: Stemming is

A

the process of reducing a word to its stem/root word.
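
As a rough illustration of stemming, here is a minimal sketch that strips a few common suffixes. The suffix list and the name `naive_stem` are assumptions made for this example; this is not the Porter stemmer or any library's algorithm.

```python
# Hypothetical suffix list, for illustration only. Real stemmers use a
# much larger, carefully ordered set of rules.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


print([naive_stem(w) for w in ["jumping", "jumped", "jumps", "quickly", "cat"]])
# ['jump', 'jump', 'jump', 'quick', 'cat']
```

Note that a stem need not be a dictionary word: this sketch turns "studies" into "studi".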

14
Q

Normalization: Lemmatization is

A

Related to stemming: it reduces a word to its canonical form, the lemma, which is the dictionary form of the word.

better —> good
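
To contrast with stemming, lemmatization can be sketched as a dictionary lookup. The tiny `LEMMAS` table and the `lemmatize` function are hypothetical and exist only for this example; a real lemmatizer relies on a full lexicon and usually on the word's part of speech.

```python
# Hypothetical mini-dictionary mapping inflected forms to their lemmas.
LEMMAS = {
    "better": "good",
    "best": "good",
    "went": "go",
    "mice": "mouse",
    "was": "be",
}


def lemmatize(word):
    """Return the dictionary form if known, otherwise the lowercased word."""
    key = word.lower()
    return LEMMAS.get(key, key)


print(lemmatize("better"))  # good
print(lemmatize("Went"))    # go
print(lemmatize("table"))   # table
```

No suffix rule could map "better" to "good", which is why a lookup is needed here.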

15
Q

Normalization: everything else

A

substitution and removal
- chars set to upper/lower
- remove numbers
- remove punctuation
- etc
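
The substitution and removal steps listed above could look roughly like this with the standard library. The function name `normalize` and the order of the steps are choices made for this sketch.

```python
import re
import string


def normalize(text):
    """Lowercase the text, then remove numbers and punctuation."""
    text = text.lower()                       # chars set to lower case
    text = re.sub(r"\d+", "", text)           # remove numbers
    # remove every ASCII punctuation character
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace


print(normalize("Hello, World! It's 2024."))
# hello world its
```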

16
Q

stop word removal

A

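removing very common words (e.g. the, a, is, of) that carry little meaning on their own, so that the remaining words are more informative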
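
Stop word removal is usually a simple filter over tokens. In this sketch the `STOP_WORDS` set is a small hypothetical list; NLP libraries ship much longer ones.

```python
# Hypothetical stop word list, kept short for illustration.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and"}


def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


print(remove_stop_words("The cat is in the garden".split()))
# ['cat', 'garden']
```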

17
Q

Tokenization is

A

splitting text into smaller units (tokens), typically single words
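
A minimal word-level tokenizer can be sketched with one regular expression. This assumes tokens are runs of word characters and that punctuation is dropped; real tokenizers handle contractions, hyphens, and similar cases more carefully.

```python
import re


def tokenize(text):
    """Return runs of word characters as tokens; punctuation is dropped."""
    return re.findall(r"\w+", text)


print(tokenize("I took the fugu from the plate and ate it."))
# ['I', 'took', 'the', 'fugu', 'from', 'the', 'plate', 'and', 'ate', 'it']
```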

18
Q

POS tagging

A

Part-of-speech tagging: the process of tagging each word in a sentence with a particular part of speech (POS) based on its position in the sentence and its context.

19
Q

N-grams are the combination of

A

multiple words used together.

used when we want to preserve sequence info in the doc, like what word is likely to follow a given one.

n- refers to number of words together e.g. bi-gram, tri-gram.

each individual word would be called a unigram. Unigrams don't contain any sequence info because each word is taken individually.
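
One way to build n-grams from a token list is a sliding window. The helper name `ngrams` is an assumption for this sketch.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "the cat sat on the mat".split()
print(ngrams(tokens, 1))  # unigrams: single words, no sequence information
print(ngrams(tokens, 2))  # bi-grams: each word paired with the one that follows
# [('the', 'cat'), ('cat', 'sat'), ('sat', 'on'), ('on', 'the'), ('the', 'mat')]
```

Counting how often each bi-gram occurs is one simple way to estimate which word is likely to follow a given one.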

20
Q

vectorization is

A

the process of converting text into numbers - machine readable.

21
Q

BOW

it is a method of __________

A

Bag of Words method for vectorization.

The table in the lecture showed the count of each word in each sentence. The representation loses the order of the words.
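
A count table of this kind can be sketched with `collections.Counter`. The function name `bag_of_words` and the two example sentences are made up for this illustration.

```python
from collections import Counter


def bag_of_words(sentences):
    """Return the sorted vocabulary and one count vector per sentence."""
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


vocab, vectors = bag_of_words(["the dog bit the man", "the man bit the dog"])
print(vocab)    # ['bit', 'dog', 'man', 'the']
print(vectors)  # [[1, 1, 1, 2], [1, 1, 1, 2]]
```

The two sentences mean different things but get identical vectors, which shows that word order is lost.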

22
Q

Type of regular expression

Literals are

A

normal text characters

23
Q

Type of regular expression

Metacharacters are

A

characters that have special meanings in regex:

. ^ $ * + ? | \ [ ] { } ( )

An escape character (backslash) is needed to use them literally.

24
Q

Use of metacharacter:

a.b

period

see regex101.com

A

wildcard, any character except a newline

matches acb or azb or a&b
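
A short sketch with Python's `re` module shows the wildcard and its escaped, literal counterpart. The variable names are chosen for this example.

```python
import re

wildcard = re.compile(r"a.b")   # '.' matches any one character except a newline
literal = re.compile(r"a\.b")   # escaped '.' matches only a literal period

for s in ["acb", "azb", "a&b", "a.b", "a\nb", "ab"]:
    print(repr(s), bool(wildcard.fullmatch(s)), bool(literal.fullmatch(s)))

# re.escape adds the backslash for you
print(re.escape("a.b"))  # a\.b
```

Only "a.b" matches the escaped pattern, while the wildcard pattern rejects "a\nb" (newline) and "ab" (no middle character).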

25
# Use of metacharacter: a*b | star ## Footnote see regex101.com
zero or more of the preceding character | matches b or aaab or ab; does not match a
26
# Use of metacharacter: a+b | plus
one or more of the preceding character | matches ab or aaaab
27
# Use of metacharacter: a?b | question mark
preceding character is optional (zero or one occurrence) | matches b or ab; within cab it matches ab; within abb it matches ab and then b
28
# Use of metacharacter: a{2,4}b | curly brace ## Footnote char{n,m}char
match the preceding character at least n times but not more than m times | If the pattern is unanchored, it will match a substring: for example, in aaaaaab it will match aaaab
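
The four quantifiers from the cards above can be checked in Python. This sketch uses `re.fullmatch` to test whole strings, then `re.findall` and `re.search` to show substring matching; the `EXAMPLES` table and the helper `fully_matches` are names made up for this example.

```python
import re

# pattern -> (strings that should match in full, strings that should not)
EXAMPLES = {
    r"a*b": (["b", "ab", "aaab"], ["a"]),
    r"a+b": (["ab", "aaaab"], ["b"]),
    r"a?b": (["b", "ab"], ["aab"]),
    r"a{2,4}b": (["aab", "aaaab"], ["ab", "aaaaab"]),
}


def fully_matches(pattern, text):
    """True when the whole string, not just a substring, matches."""
    return re.fullmatch(pattern, text) is not None


for pattern, (good, bad) in EXAMPLES.items():
    assert all(fully_matches(pattern, s) for s in good)
    assert not any(fully_matches(pattern, s) for s in bad)

# Unanchored searching finds substrings instead of requiring a full match.
print(re.findall(r"a?b", "abb"))                 # ['ab', 'b']
print(re.search(r"a{2,4}b", "aaaaaab").group())  # aaaab
```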
29
# Use of metacharacter: [ab] | square bracket
match any single character present in the set inside the brackets
30
# Use of metacharacter: [^ab] | caret
negation: match any single character except those in the set | matches x; does not match a or b
31
# Use of metacharacter: (abc) | parens
groups the enclosed characters, which are matched literally, in order, and case-sensitively | matches abc; does not match acb; does not match ABC
32
# Use of metacharacter: a|A | pipe
or | matches both a and A
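
Character sets, negation, groups, and alternation can be tried out the same way. This is a small sketch using `re.fullmatch`; the helper `fully_matches` is a name invented for this example.

```python
import re


def fully_matches(pattern, text, flags=0):
    """True when the whole string matches the pattern."""
    return re.fullmatch(pattern, text, flags) is not None


print(fully_matches(r"[ab]", "a"))       # True
print(fully_matches(r"[ab]", "ab"))      # False: a set matches one character
print(fully_matches(r"[^ab]", "x"))      # True
print(fully_matches(r"[^ab]", "a"))      # False
print(fully_matches(r"(abc)", "abc"))    # True
print(fully_matches(r"(abc)", "acb"))    # False: order matters
print(fully_matches(r"(abc)", "ABC"))    # False: case sensitive by default
print(fully_matches(r"(abc)", "ABC", re.IGNORECASE))  # True
print(fully_matches(r"(abc)+", "abcabc"))  # True: a quantifier applies to the group
print(fully_matches(r"a|A", "A"))        # True
```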
33
# python regular expressions findall
returns a list containing all matches
34
# python regular expressions Search/Match
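search returns a Match object if there is a match anywhere in the string; match does so only if the match starts at the beginning of the string. Both return None otherwise.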
35
# python regular expressions split
Returns a list where the string has been split at each match
36
# python regular expressions sub
Replaces one or many matches with a string
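
The `re` functions from the last four cards can be compared side by side. The sample sentence and variable names are chosen for this sketch.

```python
import re

text = "Max fell. John pushed him."

names = re.findall(r"[A-Z]\w+", text)    # list of all matches
found = re.search(r"John", text)         # Match object: a match anywhere
at_start = re.match(r"John", text)       # None: text does not start with John
parts = re.split(r"\.\s*", text)         # split at each match
replaced = re.sub(r"him", "Max", text)   # replace every match

print(names)         # ['Max', 'John']
print(found.span())  # (10, 14)
print(at_start)      # None
print(parts)         # ['Max fell', 'John pushed him', '']
print(replaced)      # Max fell. John pushed Max.
```

The trailing empty string in `parts` appears because the text ends with a match of the separator pattern.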
37
FSA means
Finite State Automaton (plural: Finite State Automata)
38
FSAs recognize ___________ represented by __________
FSAs recognize the **regular languages** represented by **regular expressions**
39
what does q mean in an FSA
the states. Q is the set of states. q0 is the start state. qn is the final state.
40
ε | epsilon, it's not sigma? check chapter 2
ε (epsilon) stands for the empty string (a string with no characters); it is not Σ (sigma), which denotes the input alphabet. In automata theory, epsilon transitions allow the automaton to move between states without reading any input symbol. Epsilon transitions are particularly useful in non-deterministic finite automata (NFAs).
41
2 types of FSA
deterministic and non-deterministic
42
deterministic FSA has how many transitions?
at most one transition from each state for each input symbol
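
A deterministic FSA can be simulated with a transition table keyed by (state, symbol). The machine below is a made-up example that recognizes the same language as the regular expression a+b; the state names and the function `dfa_accepts` are assumptions for this sketch.

```python
# (state, input symbol) -> next state. Each pair appears at most once,
# which is what makes the automaton deterministic.
DFA_TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q1", "a"): "q1",
    ("q1", "b"): "q2",
}
DFA_START = "q0"
DFA_ACCEPTING = {"q2"}


def dfa_accepts(text):
    """Follow one transition per symbol; reject if none is defined."""
    state = DFA_START
    for symbol in text:
        state = DFA_TRANSITIONS.get((state, symbol))
        if state is None:
            return False
    return state in DFA_ACCEPTING


for s in ["ab", "aaaab", "b", "aba", ""]:
    print(repr(s), dfa_accepts(s))
```

Only "ab" and "aaaab" are accepted; the other strings either hit a missing transition or stop in a non-accepting state.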
43
non-deterministic FSA has how many transitions?
a choice of several transitions from a state for the same input symbol
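
A non-deterministic FSA can be simulated by tracking the set of all states it could be in. The machine below is a made-up example for the regular expression (a|b)*ab?, with one ε-transition; using the empty string as the ε label, the state names, and the function names are all choices made for this sketch.

```python
EPSILON = ""  # label used here for ε-transitions

# (state, symbol) -> set of possible next states
NFA_TRANSITIONS = {
    ("q0", "a"): {"q0", "q1"},   # a choice of two transitions on 'a'
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
    ("q1", EPSILON): {"q2"},     # move to q2 without reading any input
}
NFA_START = "q0"
NFA_ACCEPTING = {"q2"}


def epsilon_closure(states):
    """Add every state reachable through ε-transitions alone."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in NFA_TRANSITIONS.get((state, EPSILON), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def nfa_accepts(text):
    """Accept if any possible path ends in an accepting state."""
    current = epsilon_closure({NFA_START})
    for symbol in text:
        moved = set()
        for state in current:
            moved |= NFA_TRANSITIONS.get((state, symbol), set())
        current = epsilon_closure(moved)
    return bool(current & NFA_ACCEPTING)


for s in ["a", "ab", "ba", "abb", "b", ""]:
    print(repr(s), nfa_accepts(s))
```

The strings "a", "ab", and "ba" are accepted; "abb", "b", and the empty string are not.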