Intro To NLP Flashcards

1
Q

What is NLP?

A

The development of systems with knowledge of human language

2
Q

What subject areas does NLP combine? (4)

A

Linguistics, computer science, maths, psychology

3
Q

What is the main challenge of NLP, and what does it mean?

A

Ambiguity: a piece of language can have more than one possible interpretation

4
Q

What are the 4 types of ambiguity?

A

Phonological, lexical, syntactic, semantic

5
Q

Describe phonological ambiguity

A

Words that sound the same but have different meanings (homophones)

e.g. "red" and "read" (past tense), "flower" and "flour"

6
Q

Describe lexical ambiguity

A

Due to a word having multiple senses

e.g. "I am going to the bank" (a financial institution or a river bank?)
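
A minimal sketch of why "bank" is ambiguous to a program, assuming a hypothetical two-sense inventory (`SENSES`) and a simple cue-word overlap heuristic. Neither is from the original card; a real system would draw its senses from a proper lexical resource.

```python
# Hypothetical mini sense inventory: each sense has a few cue words.
SENSES = {
    "bank": {
        "financial institution": {"money", "account", "loan", "cash"},
        "river edge": {"river", "water", "fishing", "boat"},
    }
}

def candidate_senses(word):
    """Return every sense listed for the word (empty list if unknown)."""
    return list(SENSES.get(word.lower(), {}))

def pick_sense(word, sentence):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears, i.e. the word stays ambiguous.
    """
    context = set(sentence.lower().split())
    best, best_overlap = None, 0
    for sense, cues in SENSES.get(word.lower(), {}).items():
        overlap = len(cues & context)
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best
```

With no helpful context, `pick_sense("bank", "I am going to the bank")` returns `None`: both senses remain possible, which is exactly the ambiguity the card describes.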

7
Q

Describe syntactic ambiguity

A

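A sentence has more than one possible structure, e.g. due to a word having more than one possible part of speech

e.g. "I saw her duck" ("duck" as a noun: the bird she owns; "duck" as a verb: she lowered her head)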

8
Q

Describe semantic ambiguity

A

Due to a lack of knowledge of the world: a sentence has multiple possible interpretations unless world knowledge is available.
e.g. "The children ate the cookies because they were very hungry."
Were the children hungry, or were the cookies hungry?

9
Q

What are the two approaches to NLP?

A

Symbolic and statistical/ML-based (plus hybrids of the two)

10
Q

Describe the symbolic approach to NLP

A

Rule- and dictionary-based
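
As an illustration, a rule- and dictionary-based part-of-speech tagger might look like the sketch below. The lexicon entries, suffix rules and tag names are all hypothetical, chosen only to show the idea of hand-written knowledge.

```python
# Hypothetical hand-written dictionary: word -> part-of-speech tag.
LEXICON = {
    "the": "DET", "a": "DET", "i": "PRON",
    "her": "PRON", "saw": "VERB", "duck": "NOUN",
}

def tag_word(word):
    """Tag one word: dictionary lookup first, then hand-written suffix rules."""
    w = word.lower()
    if w in LEXICON:
        return LEXICON[w]
    if w.endswith("ly"):
        return "ADV"
    if w.endswith("ing") or w.endswith("ed"):
        return "VERB"
    return "NOUN"  # fallback rule for unknown words

def tag(sentence):
    """Tag every whitespace-separated word in the sentence."""
    return [(w, tag_word(w)) for w in sentence.split()]
```

Every decision can be traced to a dictionary entry or a rule, which is what makes this style interpretable. It also shows the cost: "duck" is always tagged as a noun unless someone writes another rule.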

11
Q

What are the advantages of a symbolic approach? (3)

A
  • expert knowledge yields highly predictive results
  • interpretable results
  • good when labelled data is hard to obtain
12
Q

What are the disadvantages of a symbolic approach? (3)

A
  • shortage of experts
  • rule writing is laborious
  • domain adaptation is problematic
13
Q

Describe the statistical/ML approach to NLP

A

Use a large amount of data to discover patterns
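
A minimal sketch of the idea, assuming a tiny hypothetical set of labelled (word, tag) pairs. The "pattern" discovered here is simply each word's most frequent tag, a common baseline rather than a full ML model.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: (word, tag) pairs labelled by hand.
TRAINING = [
    ("the", "DET"), ("duck", "NOUN"), ("swims", "VERB"),
    ("ducks", "NOUN"), ("duck", "VERB"), ("duck", "NOUN"),
]

def train(pairs):
    """Count tags per word and keep the most frequent tag for each word."""
    counts = defaultdict(Counter)
    for word, tag in pairs:
        counts[word.lower()][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def predict(model, word, default="NOUN"):
    """Look up the learned tag; fall back to a default for unseen words."""
    return model.get(word.lower(), default)
```

Nothing about "duck" was written by hand: `train(TRAINING)` learns that it is usually a noun because that label occurs twice and the verb label once. More data would change the model without anyone rewriting rules.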

14
Q

What are the advantages of a statistical/ML approach? (2)

A
  • can generalise to unseen examples
  • good when a dictionary is unavailable

15
Q

What are the disadvantages of a statistical/ML approach? (3)

A
  • needs labelled data, which needs human annotators, which is time-consuming
  • must be retrained for a new domain
  • black box: the model can't be inspected
16
Q

What is a pipeline?

A

A system of components, each of which tackles a different problem

17
Q

What would be an example NLP pipeline?

A

sentence segmentation - tokenisation - POS tagging - parsing - information extraction
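
The first three stages can be sketched as chained functions, each consuming the previous one's output. This is a rough illustration only: the regular expressions, the toy lexicon and the tag names are assumptions, and parsing and information extraction are left out.

```python
import re

# Hypothetical toy lexicon used by the tagging stage.
TOY_LEXICON = {"the": "DET", "i": "PRON", "her": "PRON", "saw": "VERB", "duck": "NOUN"}

def segment_sentences(text):
    """Stage 1: split after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenise(sentence):
    """Stage 2: split into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def pos_tag(tokens):
    """Stage 3: dictionary lookup; punctuation is PUNCT, unknown words are UNK."""
    tagged = []
    for token in tokens:
        if re.fullmatch(r"[^\w\s]", token):
            tagged.append((token, "PUNCT"))
        else:
            tagged.append((token, TOY_LEXICON.get(token.lower(), "UNK")))
    return tagged

def run_pipeline(text):
    """Run the stages in order: segmentation, then tokenisation, then tagging."""
    return [pos_tag(tokenise(s)) for s in segment_sentences(text)]
```

Because each component tackles one problem, any stage can be swapped for a better one (for example a learned tagger) without touching the others.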

18
Q

What is the difference between NLP and text mining?

A

NLP is about language structure, linguistics and grammar.
Text mining is not grammatical; it uses things like word frequencies to find correlations.
NLP text mining is text mining driven by NLP.
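
For contrast, a frequency count of the kind text mining relies on needs no grammar at all. The sketch below uses only the Python standard library; the function name and the word pattern are illustrative choices.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count lowercase word occurrences, ignoring grammar and word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))
```

For example, `word_frequencies("The duck saw the other duck.")` counts "the" and "duck" twice each, without knowing which word is a noun or a verb.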