W3 L1 Information extraction Flashcards
why information extraction
we need it to get structured information (the facts we want) from unstructured, potentially messy input
ie where was einstein born
-> we need to extract this information from the many relevant documents
are terms like albert einstein a challenge for information extraction
yes, words that should be treated as a single unit are a challenge: some words are parts of a larger group and should be grouped together
ie albert einstein is one unit, not two separate words (see the sketch below)
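a minimal sketch in python of grouping multi-word names into single units; the phrase list and function name are made up for illustration, real systems learn these groupings from data

```python
# a toy phrase list; real extraction systems learn multi-word units from data
known_phrases = {("albert", "einstein"), ("new", "york")}

def group_tokens(tokens):
    # merge adjacent token pairs that form a known multi-word unit
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in known_phrases:
            out.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

print(group_tokens("where was albert einstein born".split()))
# ['where', 'was', 'albert einstein', 'born']
```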
does the order of words matter in sentences and information extraction
yes, words occupy specific positions in a sentence for a reason and relate to each other in particular ways
the order of words matters, it's not random
ie "was einstein born albert where" is garbage
what are parts of speech and pos tagging
parts of speech are word categories like nouns, verbs, adjectives, and adverbs
part of speech tagging is labelling each word in a sentence with its part of speech
why is pos tagging hard
there are many different interpretations of the same words
ie
we can fish
-> we go fishing
-> we put fish in cans
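a minimal sketch of this ambiguity as data; the tag labels (PRON, MODAL, VERB, NOUN) are illustrative, not from the lecture

```python
# the same three words admit two different part-of-speech analyses,
# so a tagger cannot decide from the words alone

# reading 1: "we go fishing" -- "can" is a modal, "fish" a verb
reading_1 = [("we", "PRON"), ("can", "MODAL"), ("fish", "VERB")]

# reading 2: "we put fish in cans" -- "can" is the verb, "fish" a noun
reading_2 = [("we", "PRON"), ("can", "VERB"), ("fish", "NOUN")]

for reading in (reading_1, reading_2):
    print(reading)
```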
what is a markov chain
a model that defines the probability of a sequence of random variables/states
what is in a markov chain
states and transitions
together, these give us markov chains
each transition between states has a transition probability
the probabilities over which state the chain starts in form the start distribution
what is the assumption we are making in a markov chain
they are memoryless
in order to predict the weather for tomorrow you only need to consider the weather today
what is a first order markov chain
only the current state matters in determining the future
recall what these notations are
S, A, pi
S = s1…sN set of N states
A = a11…aNN transition probability matrix, where each aij is the probability of moving from state i to state j
each row of A must sum to 1
pi = pi1…piN initial probability distribution
the sum of pi = 1
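a minimal sketch of this notation in python, using a toy weather chain with made-up probabilities

```python
import numpy as np

S = ["sunny", "rainy"]                 # S = s1...sN: set of N states

A = np.array([[0.8, 0.2],              # A: transition probability matrix,
              [0.4, 0.6]])             # A[i, j] = P(next = s_j | current = s_i)

pi = np.array([0.7, 0.3])              # pi: initial probability distribution

assert np.allclose(A.sum(axis=1), 1)   # each row of A sums to 1
assert np.isclose(pi.sum(), 1)         # pi sums to 1

# memoryless: the probability of sunny -> sunny -> rainy needs only
# pi and one transition probability per step
p = pi[0] * A[0, 0] * A[0, 1]
print(p)  # 0.7 * 0.8 * 0.2 = 0.112
```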
how can we apply markov models to language
we use markov chains to calculate the probability of a word appearing given the previous word
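a minimal sketch of a word-level markov chain (a bigram model) estimated from a tiny made-up corpus; the counting is simplified and real models need far more data and smoothing

```python
from collections import Counter

corpus = "we can fish . we can swim . we fish .".split()

bigrams = Counter(zip(corpus, corpus[1:]))   # counts of adjacent word pairs
unigrams = Counter(corpus)                   # counts of single words

def p_next(word, prev):
    # P(word | prev) ~ count(prev, word) / count(prev)
    return bigrams[(prev, word)] / unigrams[prev]

print(p_next("can", "we"))   # 2/3: "we" is followed by "can" in 2 of 3 cases
print(p_next("fish", "we"))  # 1/3
```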
how can we use hidden markov models for pos tagging
if we record observations that depend on hidden events, we can try to identify what the hidden events underneath were that caused our surface observations
in pos tagging what would be the observations and the hidden states
observations = words in sentences
hidden states = part of speech
what elements does a hidden markov model have
states, transition matrix, initial probability distribution
AND
O = o1…oT sequence of T observations
B = bi(ot) observation likelihoods (emission probabilities): the probability of observation ot being generated from state i
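a minimal sketch of these hmm elements in python, with made-up numbers; it scores one tag sequence for "we can fish" by multiplying the initial, transition, and emission probabilities

```python
import numpy as np

states = ["PRON", "VERB", "NOUN"]   # hidden states: the tags
vocab = ["we", "can", "fish"]       # observations: the words

pi = np.array([0.6, 0.2, 0.2])      # initial tag distribution

A = np.array([[0.1, 0.7, 0.2],      # tag-to-tag transition matrix
              [0.2, 0.3, 0.5],
              [0.3, 0.4, 0.3]])

B = np.array([[0.9, 0.05, 0.05],    # B[i, k] = P(word k | tag i),
              [0.1, 0.5, 0.4],      # the emission probabilities
              [0.05, 0.25, 0.7]])

# joint probability of "we can fish" with hidden tags PRON -> VERB -> NOUN
tags, obs = [0, 1, 2], [0, 1, 2]
p = pi[tags[0]] * B[tags[0], obs[0]]
for t in range(1, len(obs)):
    p *= A[tags[t - 1], tags[t]] * B[tags[t], obs[t]]
print(p)  # 0.6 * 0.9 * 0.7 * 0.5 * 0.5 * 0.7 = 0.06615
```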