W4 L2 Flashcards
what is lexical semantics
a branch of nlp that deals with word senses and meaning
but what is a sense bearing unit
does a word always have the same meaning
no! the same spelling of a word can have different meanings
homonymy
when two words are spelt the same but have different meanings
i went to the bank
we sat on the bank
what is word net
the largest and most widely used database of lexical relations
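the idea of a sense inventory like WordNet can be sketched with a tiny lookup table (toy data here, not the real WordNet; in practice the database is usually accessed through NLTK's wordnet corpus):

```python
# toy WordNet-style sense inventory (hypothetical entries, not real WordNet data)
senses = {
    "bank": [
        {"gloss": "a financial institution", "hypernym": "institution"},
        {"gloss": "sloping land beside a body of water", "hypernym": "slope"},
    ],
}

def synsets(word):
    # return all recorded senses of a word (empty list if unknown)
    return senses.get(word, [])
```

the real WordNet also links synsets to each other via synonymy, antonymy, and hyponym/hypernym relations.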
what is a homophone
same pronunciation with different spelling
to too two
these are hard for speech models
homographs
same spelling but different pronunciation
drum bass
fish bass
hard for text to speech agents
what is polysemy
when a word's meaning is extended or transferred, so one word has multiple related senses
bank
blood bank
what is metonymy
a subtype of polysemy
ie Shakespeare
author: Shakespeare wrote Hamlet
works of author: i studied Shakespeare at school
turkey animal vs turkey meat
what is synonymy
words that have similar meanings
ie sofa and couch
antonymy
words that have opposite meanings
big small
what are hyponyms and hypernyms
a hyponym is a sub-term (more specific)
and a hypernym is a super-term (more general)
Socrates is a man
all men are mortal
Socrates is mortal
inference goes from hypo to hyper
but not from hyper to hypo
not all men are Socrates
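the one-directional inference above can be sketched as a walk up a hypernym chain (toy taxonomy, hypothetical data):

```python
# toy taxonomy: each term maps to its direct hypernym (hypothetical data)
hypernym = {"socrates": "man", "man": "mortal"}

def is_a(term, category):
    # follow hypernym links upward; inference only works hypo -> hyper
    while term in hypernym:
        term = hypernym[term]
        if term == category:
            return True
    return False
```

going the other way (hyper -> hypo) fails, matching "not all men are Socrates".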
what are semantic fields
they group various terms that are related to the same domain
ie flight booking, plane, price, meal
-> domain of air travel
what is a concept
an abstract idea representing the fundamental characteristics of what it represents
what is classical concept theory
Aristotle
concepts have a definitional structure: a list of features, and all members of the class must have these features
what is prototype concept theory
properties of concepts are not definitional
members tend to possess them but are not strictly required to
ie members tend to look similar, without needing to meet overly restrictive requirements
what is the theory theory of concepts
categorization by concept
as new evidence comes in, new members join and definitions change
definitions are in relation to each other
what is the bag of words method
machine learning models require numbers, so for classification we turn words, lemmas, synsets, and concepts into features
this ignores word order, so it's called the bag of words method
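a minimal bag-of-words sketch: count each token and discard order (plain whitespace tokenization assumed, no lemmatization):

```python
from collections import Counter

def bag_of_words(text):
    # lowercase, split on whitespace, count occurrences; word order is discarded
    return Counter(text.lower().split())
```

because order is discarded, "a b" and "b a" produce the same feature vector.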
what do we need to apply to features for classification
feature selection, feature weighting, normalization
what is feature selection
selecting the most important of the features
what is feature weighting and normalization
weighting the most important features after feature selection (eg tf-idf)
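tf-idf weighting can be sketched as follows (using the common log-idf variant; real libraries such as scikit-learn apply slightly different smoothing):

```python
import math
from collections import Counter

def tfidf(docs):
    # docs: list of token lists; returns one {term: weight} dict per document
    N = len(docs)
    df = Counter()                    # document frequency: how many docs contain each term
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)             # term frequency within this document
        weights.append({t: tf[t] * math.log(N / df[t]) for t in tf})
    return weights
```

a term that appears in every document gets idf = log(1) = 0, so it carries no weight.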
what is the power of bayes formula
bayes formula allows us to flip the conditions
P(class|content) = P(content|class) × P(class) / P(content)
what is the P(class | content) in bayes formula
posterior probability
this is the posterior probability we are interested in; we compute it from the other terms in bayes formula
what is P(class) in bayes formula
prior probability
our belief about the class distribution before we see any evidence
what is P(content |class) in bayes formula
the likelihood
it shows how likely it is to see this exact combination of features given the class
what is the P(Content) in bayes formula
the probability of the data
independent of the class and is treated like a normalization factor
what is the maximum a posteriori (MAP) decision rule
ŷ = argmax_y P(y|x) = argmax_y P(x|y)P(y) / P(x) = argmax_y P(x|y)P(y)
(P(x) is the same for every class, so it can be dropped)
why is estimating P(content |spam) easier in practice than estimating P(spam |content)
because of the naive independence assumption
we assume that the occurrence of each feature given the class is independent of the occurrence/non-occurrence of any other feature given the class
what can we do when probabilities are independent
we can multiply probabilities
ie P(f1|spam) × … × P(fn|spam)
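under the independence assumption, a class score is just the prior times the per-feature likelihoods; a sketch with hypothetical probabilities:

```python
def class_score(prior, likelihoods):
    # P(class) * P(f1|class) * ... * P(fn|class)
    score = prior
    for p in likelihoods:
        score *= p
    return score
```

in practice implementations sum log-probabilities instead of multiplying, to avoid numerical underflow with many features.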
how would we estimate the priors in spam vs ham
ex
P(spam) = num(spam emails in training set) / num(all emails in training set)
P(ham) = num(ham emails in training set) / num(all emails in training set)
note all priors should sum to one: P(spam) + P(ham) = 1
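estimating the priors from label counts might look like this (toy labels assumed):

```python
from collections import Counter

def estimate_priors(labels):
    # P(class) = num(class examples) / num(all examples); priors sum to 1
    counts = Counter(labels)
    total = len(labels)
    return {c: n / total for c, n in counts.items()}
```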
how to calculate feature probabilities
ex what is the chance of 'lottery' given its spam
P("lottery"|spam) = num(spam emails containing "lottery") / num(spam emails in training set)
note that for each word, the feature probabilities should sum to one: P(word|class) + P(no word|class) = 1, for both the spam and ham cases
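the per-word likelihood can be sketched as a document-frequency estimate (toy emails, no smoothing; real implementations usually add Laplace smoothing so unseen words don't give zero probability):

```python
def feature_prob(word, emails):
    # P(word|class) = fraction of this class's emails containing the word
    containing = sum(1 for e in emails if word in e.lower().split())
    return containing / len(emails)
```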
what are some ways we can evaluate the success of our algorithm
accuracy, confusion matrix
precision, recall, f1
what is formula for accuracy
correct/ (correct+incorrect)
not ideal for unbalanced classes
what is a confusion matrix
A confusion matrix is a table used to evaluate the performance of a classification model by comparing predicted and actual outcomes. It shows True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
                actual positive   actual negative
predicted pos        TP                FP
predicted neg        FN                TN
why is a confusion matrix better than accuracy score
It is useful because it provides detailed insights into model performance, helps calculate metrics like accuracy, precision, recall, and F1-score, and highlights specific areas where the model may be making errors.
what is precision
Precision: Measures how many predicted positives are actually correct. Formula: TP / (TP + FP)
what is recall
Recall: Measures how many actual positives were correctly predicted. Formula: TP / (TP + FN)
what is f1
F1-Score: Harmonic mean of precision and recall, balancing both. Formula: 2 × (Precision × Recall) / (Precision + Recall)
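the metrics above follow directly from the four confusion-matrix cells; a quick sketch:

```python
def metrics(tp, fp, fn, tn):
    # precision: of the predicted positives, how many were actually positive
    precision = tp / (tp + fp)
    # recall: of the actual positives, how many were found
    recall = tp / (tp + fn)
    # f1: harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    # accuracy: all correct predictions over all predictions
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy
```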