Lexical Knowledge Bases Flashcards

1
Q

What is a lexicon?

A

In computer science, a machine-readable dictionary that supports NLP functions such as part-of-speech (POS) tagging, inflection (oxen rather than oxes), and verb transitivity (does the verb need an object?).

2
Q

What is a lexical knowledge base?

A
  • Organizes words into senses
  • Links senses via relations (examples below)
  • Goes beyond a lexicon because it connects the words in it (synonyms, antonyms, hyponyms/hypernyms, holonyms/meronyms). WordNet is a lexical knowledge base.
3
Q

What is a hyponym/hypernym relationship?

A

A hyponym is a more specific term; a hypernym is a more general one. In the dog example, a hyponym might be springer spaniel, which covers less than dog. A hypernym would be mammal, which covers more than just dogs.

4
Q

What is a meronym/holonym relationship?

A

A meronym is a part of something else, e.g., a wheel is part of a car, so wheel is a meronym of car. A holonym is the whole that other things are part of, so car is a holonym of wheel.

5
Q

What is a semantic network?

A

A knowledge base made up of a network of different types of relationships between words. A "has-a" relationship would be: a cat has fur. An "is-a" relationship would be: a cat is a mammal.

6
Q

What is ontological distance?

A

Uses the hypernym links in WordNet to count the steps needed to get from one word to another. The shorter the distance, the more similar the words are. It can also be computed between documents, or between words in two documents.
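
The step counting can be sketched on a toy hierarchy. This is a minimal illustration, not WordNet itself: the `HYPERNYM` table and both function names are made up for the example, and each word is assumed to have a single sense and a single hypernym.

```python
# Toy hypernym links (child -> parent). Hypothetical data, not real WordNet.
HYPERNYM = {
    "springer spaniel": "dog",
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}

def hypernym_chain(word):
    """Return the word followed by each successive hypernym up to the root."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def ontological_distance(a, b):
    """Count steps from a up to the closest shared ancestor, then down to b."""
    chain_a = hypernym_chain(a)
    chain_b = hypernym_chain(b)
    for steps_a, node in enumerate(chain_a):
        if node in chain_b:
            return steps_a + chain_b.index(node)
    return None  # no shared ancestor in this toy hierarchy

print(ontological_distance("dog", "cat"))                   # 2
print(ontological_distance("springer spaniel", "sparrow"))  # 5
```

Dog and cat meet at mammal after one step each, so they come out closer than springer spaniel and sparrow, which only meet at animal.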

7
Q

Limits of the ontological distance approach

A

Word sense ambiguity. We don’t know which sense of a word such as chair was meant. We usually just use sense #1, which should be right about 70% of the time.

8
Q

What is monosemy?

A

The property of a word having only one meaning.

9
Q

What is polysemy?

A

The property of a word having more than one possible meaning. The more common a word is, the more polysemous it tends to be.

10
Q

Building or extending lexical knowledge bases

A

You can supplement WordNet by taking information from other sources such as dictionaries, encyclopedias, and taxonomies. Examples are the Getty Vocabularies, Amazon, Urban Dictionary, and Wiktionary.

Urban Dictionary

1) Check the robots.txt file to find out whether we have permission to crawl the site, e.g., http://website.com/robots.txt
2) If you’re good to go, go to the sitemap XML and use Beautiful Soup to extract all the words
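
The two steps can be sketched with the standard library alone. This is a rough illustration under stated assumptions: the robots.txt and sitemap contents are made-up inline strings (a real run would download them from the site), `allowed_urls` and `my-crawler` are hypothetical names, and `xml.etree.ElementTree` stands in for Beautiful Soup so the example has no third-party dependency.

```python
from urllib.robotparser import RobotFileParser
import xml.etree.ElementTree as ET

# Hypothetical robots.txt content for http://website.com/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical sitemap listing one public page and one disallowed page.
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://website.com/define/yeet</loc></url>
  <url><loc>http://website.com/private/admin</loc></url>
</urlset>
"""

def allowed_urls(robots_txt, sitemap_xml, user_agent="my-crawler"):
    """Return the sitemap URLs that robots.txt lets this user agent fetch."""
    # Step 1: load the robots.txt rules.
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    # Step 2: read every <loc> entry from the sitemap.
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", ns)]
    return [url for url in urls if rules.can_fetch(user_agent, url)]

print(allowed_urls(ROBOTS_TXT, SITEMAP_XML))
# ['http://website.com/define/yeet']
```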

Getty has a JSON API

Wiktionary can be downloaded as XML

Urban Dictionary also has an undocumented API (a lot of sites have undocumented APIs, search Stack Exchange)

Encyclopedic Resources

  • Wikipedia
  • IMDB
  • DotDash (about.com)
  • Investopedia
  • International Encyclopedia of the First World War
  • Internet Encyclopedia of Philosophy

Taxonomical organizations

  • Curlie (DMOZ)
  • Sitemaps from CNN Money, Vogue, LA Times, SFGate
11
Q

Applications of lexical knowledge bases

A

Examples:

1) Enhance search engines
- Query expansion: Adding more words to the words typed in.
- Related searches
- "More like this" function, which uses pseudo-relevance feedback

2) Writing evaluation and advice
(helping children or adults learn to write better)
- If you use "fun" three times, suggest another word
- Suggest that the writer be more specific
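
Query expansion from the first application can be sketched as follows. This is a minimal illustration: the `SYNONYMS` table is a made-up stand-in for a lexical knowledge base such as WordNet, and `expand_query` is a hypothetical helper name.

```python
# Hypothetical synonym table standing in for a lexical knowledge base.
SYNONYMS = {
    "car": ["automobile", "auto"],
    "cheap": ["inexpensive", "affordable"],
}

def expand_query(query):
    """Append known synonyms of each query term, keeping the typed terms first."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for synonym in SYNONYMS.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded

print(expand_query("cheap car rental"))
# ['cheap', 'car', 'rental', 'inexpensive', 'affordable', 'automobile', 'auto']
```

The same lookup could drive the writing-advice case: a word repeated several times would be looked up and its synonyms offered as alternatives.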

12
Q

Capsules

A

Short text for every search result

13
Q

Pseudorelevance feedback

A

Assuming the top results of the initial query are relevant, then using terms from those results to expand the query and retrieve more results.
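
A rough sketch of the idea, assuming the search engine has already returned a ranked list of documents. The function name, its parameters, and the sample documents are all hypothetical, and raw term frequency is used as the simplest possible way to pick expansion terms.

```python
from collections import Counter

def pseudo_relevance_expand(query_terms, ranked_docs, top_k=2, n_new_terms=2):
    """Treat the top_k ranked documents as relevant and add their most frequent new terms."""
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    new_terms = [term for term, _ in counts.most_common(n_new_terms)]
    return list(query_terms) + new_terms

# Hypothetical ranked results for the query "jaguar".
ranked = [
    "jaguar speed cat habitat",
    "jaguar cat habitat rainforest",
    "jaguar car dealership",
]
# The two top documents both mention "cat" and "habitat", so those terms are added.
print(pseudo_relevance_expand(["jaguar"], ranked))
```

No user feedback is involved: the top results are simply assumed to be relevant, which is why the feedback is called "pseudo".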

14
Q

What is a graph?

A

A data structure that lets you represent relationships. It has two main parts: vertices (nodes), where the data is stored, and edges (connections), which link the nodes. An ontology defines the schema for a graph, much as a schema defines a table. Once the graph is populated with actual instances, it becomes a knowledge graph.
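
The ontology-versus-instances distinction can be sketched in a few lines. This is an illustrative toy, not a real graph database: the `KnowledgeGraph` class, the `ONTOLOGY` triple, and the node types are assumptions made up for the example.

```python
# Ontology (schema): which relation is allowed between which node types.
ONTOLOGY = {
    ("Restaurant", "serves", "Dish"),
}

class KnowledgeGraph:
    def __init__(self, ontology):
        self.ontology = ontology
        self.node_types = {}  # vertex name -> node type
        self.edges = []       # (source, relation, target) triples

    def add_node(self, name, node_type):
        self.node_types[name] = node_type

    def add_edge(self, source, relation, target):
        """Add an edge only if the ontology allows it between these node types."""
        signature = (self.node_types[source], relation, self.node_types[target])
        if signature not in self.ontology:
            raise ValueError(f"edge not allowed by ontology: {signature}")
        self.edges.append((source, relation, target))

    def neighbors(self, source, relation):
        """Return every target reachable from source over the given relation."""
        return [t for s, r, t in self.edges if s == source and r == relation]

# Adding instances turns the empty schema into a small knowledge graph.
kg = KnowledgeGraph(ONTOLOGY)
kg.add_node("Hunan Noodle House", "Restaurant")
kg.add_node("Tan Tan Noodle", "Dish")
kg.add_edge("Hunan Noodle House", "serves", "Tan Tan Noodle")
print(kg.neighbors("Hunan Noodle House", "serves"))  # ['Tan Tan Noodle']
```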

15
Q

Challenges

A

- Freshness: Is the information up to date?
- Coverage: Do we have all the information we need?
- Correctness: Is our information accurate? Correctness is always hard. What is true and correct? There has to be human validation.
You can have two out of three.

Entity resolution: If multiple sources or entries describe the same entity, we need to clean them up or decide which source will be the authoritative one.

16
Q

UberEats query2vec example

A
Word = a search query
Context = a restaurant: queries such as "spicy food" or "Tan Tan Noodle" would be in the context of the restaurant Hunan Noodle House
17
Q

Knowledge graph construction

A

- Ingest knowledge from structured and unstructured sources. Unstructured sources include web pages; extraction from them could be ML-based, rule-based, tree-based, etc.

18
Q

Troponym

A

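A verb that names a more specific manner of doing another verb, e.g., to stroll is a troponym of to walk. It is the verb counterpart of a hyponym.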