L10: Topic Modelling Flashcards
The objective of topic modelling is to provide a tool for organizing information.
TRUE/FALSE
TRUE
Which of the following are steps for topic modelling?
A) Discover the thematic structure: which themes do the documents belong to?
B) Annotate the documents according to themes
C) Use the annotations to organize, summarize, search and form predictions
ALL ARE CORRECT
Topic modelling provides methods for automatically organizing, understanding, searching, and summarizing large electronic archives.
TRUE/FALSE
TRUE
Topic models help determine the probability that each document is associated with a given theme or topic.
TRUE/FALSE
TRUE
Latent Dirichlet Allocation (LDA) is a probabilistic model used in topic modeling to discover underlying topics within a collection of documents; it assumes that each document is a mixture of topics, and each topic is a mixture of words, providing insights into the thematic structure of the text corpus.
TRUE/FALSE
TRUE
The output of LDA: for each document in the corpus, the probability that it is associated with each of the k topics specified by the user.
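The kind of output described above can be illustrated with a toy collapsed Gibbs sampler, which is one common way to fit LDA. This is a minimal sketch, not the inference method of any particular package: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the example documents are all illustrative assumptions.

```python
import random

def lda_gibbs(docs, k, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not optimized).

    docs: list of token lists; k: number of topics chosen by the user.
    Returns (doc_topic, topic_word, vocab).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    v = len(vocab)
    word_id = {w: i for i, w in enumerate(vocab)}

    n_dk = [[0] * k for _ in docs]      # tokens per (document, topic)
    n_kw = [[0] * v for _ in range(k)]  # tokens per (topic, word)
    n_k = [0] * k                       # tokens per topic
    z = []                              # topic assigned to each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            t = rng.randrange(k)
            z[d].append(t)
            n_dk[d][t] += 1
            n_kw[t][word_id[w]] += 1
            n_k[t] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, t = word_id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][t] -= 1
                n_kw[t][wi] -= 1
                n_k[t] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (n_dk[d][j] + alpha) * (n_kw[j][wi] + beta) / (n_k[j] + v * beta)
                    for j in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                z[d][i] = t
                n_dk[d][t] += 1
                n_kw[t][wi] += 1
                n_k[t] += 1

    # Smoothed estimates: each row is a probability distribution.
    doc_topic = [
        [(n_dk[d][j] + alpha) / (len(doc) + k * alpha) for j in range(k)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[j][wi] + beta) / (n_k[j] + v * beta) for wi in range(v)]
        for j in range(k)
    ]
    return doc_topic, topic_word, vocab

# Hypothetical mini-corpus: two sports-like and two politics-like documents.
docs = [s.split() for s in [
    "goal match team goal",
    "team match player",
    "vote election party",
    "party vote policy election",
]]
doc_topic, topic_word, vocab = lda_gibbs(docs, k=2)
for row in doc_topic:
    print([round(p, 2) for p in row])  # one probability per topic, per document
```

Each row of `doc_topic` sums to 1 and gives the estimated probability that the document is associated with each of the k topics; each row of `topic_word` is the corresponding distribution over words for one topic.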
Structural Topic Modelling (STM) is very similar to LDA, but it additionally uses metadata about the documents (metadata is data that provides information about other data, e.g., characteristics or properties).
TRUE/FALSE
TRUE
The name of the author and the date on which the document was produced are examples of what?
Metadata used in Structural Topic Modelling (STM)
What is the utility of stLDA-C?
stLDA-C is useful for topic modelling of short texts, where LDA usually performs poorly
What is the primary goal of topic modelling?
Identifying hidden thematic structure
In topic modelling, what does a “topic” represent?
A cluster of related words that tend to occur together (formally, a probability distribution over words)
What is the purpose of the term “bag-of-words” in topic modelling?
It ignores the order of words and considers only their frequency.
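A minimal sketch of the bag-of-words idea, assuming simple lowercase whitespace tokenization (the helper name `bag_of_words` and the example sentences are illustrative):

```python
from collections import Counter

def bag_of_words(text):
    """Return word counts for a text; word order is discarded."""
    return Counter(text.lower().split())

a = bag_of_words("the dog chased the cat")
b = bag_of_words("the cat chased the dog")

print(a == b)    # True: same words and counts, different order
print(a["the"])  # 2
```

The two sentences mean different things, yet they map to the same bag of words, which is exactly the information a bag-of-words model gives up.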
What is the purpose of using a term-document matrix in topic modeling?
It records how often each term (word) occurs in each document
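A small term-document matrix can be built by hand as follows. This is a sketch under simple assumptions (whitespace tokenization, terms as rows and documents as columns); the helper name `term_document_matrix` and the three example documents are hypothetical.

```python
from collections import Counter

def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] is the number of times
    term vocab[i] occurs in document j."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix

docs = ["cats chase mice", "dogs chase cats", "mice eat cheese"]
vocab, matrix = term_document_matrix(docs)

for term, row in zip(vocab, matrix):
    print(f"{term:<8}{row}")
# cats    [1, 1, 0]
# chase   [1, 1, 0]
# cheese  [0, 0, 1]
# dogs    [0, 1, 0]
# eat     [0, 0, 1]
# mice    [1, 0, 1]
```

Each row shows in which documents a term appears and how often; topic models such as LDA take counts of this kind as input.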
What is the primary objective of Structural Topic Modeling (STM)?
Analysing the relationship between topics and document metadata
Understanding the emotional sentiment expressed in text can be facilitated by _____?
Sentiment analysis