W4 L2 Flashcards

1
Q

what is lexical semantics

A

a branch of NLP that deals with word senses

but what counts as a sense-bearing unit?

2
Q

does a word always have the same meaning

A

no! the same spelling of a word can have different meanings

3
Q

homonymy

A

when two words are spelt the same but have different, unrelated meanings

i went to the bank
we sat on the bank

4
Q

what is WordNet

A

the largest and most popular database of lexical relations

5
Q

what is a homophone

A

same pronunciation with different spelling

to, too, two

these are hard for speech recognition models

6
Q

homographs

A

same spelling but different pronunciation

bass (the drum)
bass (the fish)

hard for text-to-speech systems

7
Q

what is polysemy

A

when a word's meaning is extended or transferred to a related sense

bank
blood bank

8
Q

what is metonymy

A

a subtype of polysemy

ie Shakespeare

author: Shakespeare wrote Hamlet

works of author: i studied Shakespeare at school

turkey the animal vs turkey the meat

9
Q

what is synonymy

A

words that have similar meanings

ie sofa and couch

10
Q

antonymy

A

words that have opposite meanings

big small

11
Q

what are hyponyms and hypernyms

A

hypo is a sub-term (more specific)

and hyper is a super-term (more general)

Socrates is a man
all men are mortal
Socrates is mortal

inference goes from hypo to hyper

but not from hyper to hypo:
not all men are Socrates

12
Q

what are semantic fields

A

they group various terms that are related to the same domain

ie flight booking, plane, price, meal
-> domain of air travel

13
Q

what is a concept

A

an abstract idea representing the fundamental characteristics of what it represents

14
Q

what is classical concept theory

A

Aristotle

concepts have a definitional structure: a list of features, and all members of the class must have these features

15
Q

what is prototype concept theory

A

properties of concepts are not definitional

members tend to possess them, but they are not strictly required

ie members tend to look similar, but there are no super-restrictive requirements

16
Q

what is the theory-theory of concepts

A

categorization by concept

as new evidence comes in, new members join and definitions change

definitions are in relation to each other

17
Q

what is the bag of words method

A

machine learning models require numbers, so for classification we turn words, lemmas, synsets, and concepts into features

this ignores word order, so it is called the bag-of-words method
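a minimal bag-of-words sketch in plain Python (the toy sentence and vocabulary here are illustrative, not from the lecture):

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    # count every token, ignoring order, then read off counts for the vocabulary
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

vocab = ["lottery", "win", "meeting"]
features = bag_of_words("Win the lottery win big", vocab)  # [1, 2, 0]
```

because order is discarded, any reshuffling of the same words produces the same feature vector.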

18
Q

what do we need to apply to features before classification

A

feature selection, feature weighting, normalization

19
Q

what is feature selection

A

selecting the most informative of the features

20
Q

what is feature weighting and normalisation

A

weighting the most important features after feature selection (eg tf-idf)
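a sketch of one common weighting scheme, tf-idf, in plain Python — this uses raw term frequency times log inverse document frequency with no smoothing, and the toy corpus is invented for illustration:

```python
import math

def tf_idf(term, doc, corpus):
    # term frequency: how often the term appears in this document
    tf = doc.count(term) / len(doc)
    # inverse document frequency: rare-across-corpus terms get boosted
    df = sum(1 for d in corpus if term in d)
    idf = math.log(len(corpus) / df)
    return tf * idf

corpus = [
    ["win", "lottery", "now"],
    ["team", "meeting", "now"],
    ["lottery", "numbers", "now"],
]
```

note that a word appearing in every document (here "now") gets weight 0 — it carries no class information.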

21
Q

what is the power of bayes formula

A

bayes formula allows us to flip the conditions

P(class|content) = P(content|class) × P(class) / P(content)
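the flip can be checked numerically with a toy spam example (the probabilities below are invented for illustration):

```python
# assumed toy numbers: prior and per-class chance of seeing the word "lottery"
p_spam = 0.4
p_lottery_given_spam = 0.5
p_lottery_given_ham = 0.05

# law of total probability: P("lottery") summed over both classes
p_lottery = p_lottery_given_spam * p_spam + p_lottery_given_ham * (1 - p_spam)

# bayes formula flips the condition: P(spam | "lottery")
p_spam_given_lottery = p_lottery_given_spam * p_spam / p_lottery
```

here the easy-to-estimate P("lottery"|spam) = 0.5 is turned into the quantity we actually want, P(spam|"lottery") ≈ 0.87.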

22
Q

what is the P(class | content) in bayes formula

A

posterior probability
this is the posterior probability we are interested in; we need the other terms of bayes formula to compute it

23
Q

what is P(class) in bayes formula

A

prior probability

our belief about the class distribution before we see more evidence

24
Q

what is P(content |class) in bayes formula

A

the likelihood

it shows how likely it is to see this exact combination of features given the class

25
Q

what is the P(Content) in bayes formula

A

the probability of the data

it is independent of the class and is treated as a normalization factor

26
Q

what is the maximum a posteriori (MAP) decision rule

A

ŷ = argmax_y P(Y=y|X)

  = argmax_y P(X|Y=y) P(Y=y) / P(X)

  = argmax_y P(X|Y=y) P(Y=y)

(P(X) does not depend on y, so it can be dropped)

27
Q

why is estimating P(content |spam) easier in practice than estimating P(spam |content)

A

because of the naive independence assumption

we assume that the occurrence of each feature given the class is independent of the occurrence or non-occurrence of any other feature given the class

28
Q

what can we do when probabilities are independent

A

we can multiply probabilities

ie P(f1|spam) × … × P(fn|spam)

29
Q

how would we estimate the priors in spam vs ham

A

ex

P(spam) = num(spam emails in training set) / num(all emails in training set)

P(ham) = num(ham emails in training set) / num(all emails in training set)

note all priors in the set should sum up to one: P(spam) + P(ham) = 1

30
Q

how to calculate feature probabilities

ex what is the chance of ‘lottery’ given its spam

A

P(“lottery”|spam) = num(spam emails containing “lottery”) / num(spam emails in training set)

note that for each word the feature probabilities should sum up to one, ie P(word|class) + P(no word|class) = 1, for both the spam and ham cases
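putting the priors, the feature probabilities, and the MAP rule together, a toy naive bayes spam scorer might look like this — a minimal sketch with invented training emails, and a tiny probability floor for unseen words instead of proper smoothing:

```python
from collections import Counter, defaultdict

def train(emails):
    # emails: list of (token list, label); estimate priors and P(word|class) by counting
    class_counts = Counter(label for _, label in emails)
    word_counts = defaultdict(Counter)
    for tokens, label in emails:
        word_counts[label].update(tokens)
    priors = {c: n / len(emails) for c, n in class_counts.items()}
    likelihoods = {
        c: {w: n / sum(counts.values()) for w, n in counts.items()}
        for c, counts in word_counts.items()
    }
    return priors, likelihoods

def classify(tokens, priors, likelihoods, unseen=1e-6):
    # MAP rule with the naive independence assumption:
    # score(class) = P(class) * product of P(word|class)
    scores = {}
    for c, prior in priors.items():
        score = prior
        for w in tokens:
            score *= likelihoods[c].get(w, unseen)  # tiny floor for unseen words
        scores[c] = score
    return max(scores, key=scores.get)

emails = [
    (["win", "lottery"], "spam"),
    (["lottery", "prize"], "spam"),
    (["team", "meeting"], "ham"),
    (["lunch", "meeting"], "ham"),
]
priors, likelihoods = train(emails)
```

in practice Laplace (add-one) smoothing would replace the `unseen` floor, and log-probabilities would be summed instead of multiplied to avoid underflow.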

31
Q

what are some ways we can evaluate the success of our algorithm

A

accuracy, confusion matrix

precision, recall, f1

32
Q

what is formula for accuracy

A

correct/ (correct+incorrect)

not ideal for unbalanced classes

33
Q

what is a confusion matrix

A

A confusion matrix is a table used to evaluate the performance of a classification model by comparing predicted and actual outcomes. It shows True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).

                actual positive | actual negative
predicted pos |       TP       |       FP
predicted neg |       FN       |       TN

34
Q

why is a confusion matrix better than accuracy score

A

It is useful because it provides detailed insights into model performance, helps calculate metrics like accuracy, precision, recall, and F1-score, and highlights specific areas where the model may be making errors.

35
Q

what is precision

A

Precision: Measures how many predicted positives are actually correct. Formula: TP / (TP + FP)

36
Q

what is recall

A

Recall: Measures how many actual positives were correctly predicted. Formula: TP / (TP + FN)

37
Q

what is f1

A

F1-Score: Harmonic mean of precision and recall, balancing both. Formula: 2 × (Precision × Recall) / (Precision + Recall)
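the four metrics above can all be derived from the confusion-matrix counts; a short sketch, with illustrative counts:

```python
def metrics(tp, fp, fn, tn):
    # derive the evaluation scores from confusion-matrix counts
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # of predicted positives, how many were right
    recall = tp / (tp + fn)             # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# assumed toy counts: 8 TP, 2 FP, 4 FN, 86 TN
scores = metrics(tp=8, fp=2, fn=4, tn=86)
```

this toy case also shows why accuracy is weak for unbalanced classes: accuracy is 0.94 while recall is only 0.67.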