Lecture 7 Flashcards

1
Q

Old methods, new data - challenges

A

› In general, modeling social influence is complex
› Observations are not independent
› What is the relevant network of a consumer?

2
Q

Text analysis - Two approaches

A

› Directly observed information (≈ counting): counting words
  • Number of verbs, nouns, etc.
  • Number of positive and negative words (e.g., shown in a word cloud)

› Latent information (≈ intelligence)
  • Groups of words/sentences that relate to a certain latent topic
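
A minimal sketch of the counting approach, using only the Python standard library. The word lists `POSITIVE` and `NEGATIVE`, the function name `count_sentiment_words`, and the sample review are hypothetical and only for illustration; a real analysis would rely on an established sentiment lexicon.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}

def count_sentiment_words(text):
    """Count tokens and the positive/negative words among them."""
    # Lowercase the text and keep runs of letters (and apostrophes) as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    positive = sum(counts[word] for word in POSITIVE)
    negative = sum(counts[word] for word in NEGATIVE)
    return {"tokens": len(tokens), "positive": positive, "negative": negative}

review = "Great room and great staff, but the breakfast was bad."
result = count_sentiment_words(review)
print(result)
```

Everything here is directly observed: the counts say nothing about which latent topic (room, staff, food) the words belong to.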

3
Q

Latent Dirichlet Allocation (LDA)

A

“Latent topics are defined by a collection of words with a relatively high probability of usage and not from the prevalence or significance of single words”

4
Q

LDA assumptions

A

› Assumptions:
  • Each document is characterized by a mixture of topics.
  • Each topic is characterized by a discrete probability distribution over words.
› Think of a dictionary of all words in all documents.
› Each topic is a unique set of probabilities of potential word use.
› Words that are likely to occur ‘in a topic’ are used to label/identify the topics.
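
The assumptions above can be sketched as a small generative simulation. This is an illustration under stated assumptions, not an estimation routine: the vocabulary, the two topic distributions, the concentration value `alpha=0.5`, and the helper names (`sample_dirichlet`, `generate_document`, `top_words`) are all hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one probability vector from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, vocab, n_words, alpha, rng):
    """Generate a document: draw a topic mixture, then a topic and a word per position."""
    theta = sample_dirichlet([alpha] * len(topics), rng)  # document's topic mixture
    words = []
    for _ in range(n_words):
        z = rng.choices(range(len(topics)), weights=theta)[0]  # topic for this word
        words.append(rng.choices(vocab, weights=topics[z])[0])  # word from that topic
    return theta, words

def top_words(topic, vocab, k=2):
    """Return the k most probable words of a topic, used to label it."""
    ranked = sorted(zip(vocab, topic), key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in ranked[:k]]

# The "dictionary" of all words, and one probability vector over it per topic.
vocab = ["room", "bed", "clean", "pasta", "wine", "tasty"]
topics = [
    [0.35, 0.30, 0.25, 0.04, 0.03, 0.03],  # hypothetical "hotel" topic
    [0.03, 0.03, 0.04, 0.30, 0.25, 0.35],  # hypothetical "food" topic
]

rng = random.Random(0)
theta, words = generate_document(topics, vocab, n_words=8, alpha=0.5, rng=rng)
print(theta, words)
print(top_words(topics[0], vocab), top_words(topics[1], vocab))
```

In practice the direction is reversed: only the words are observed, and the topic mixtures and topic-word probabilities are inferred from them.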

5
Q

Büschken and Allenby (2016)

A

Bag of sentences instead of bag of words.
› A piece of text typically contains multiple topics.
› But a single sentence typically pertains to one topic.
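
A simplified sketch of the bag-of-sentences idea: one topic is drawn per sentence, and every word in that sentence comes from that topic. This is not the authors' estimation procedure; the function name `generate_review`, the vocabulary, the topic probabilities, and the mixture `theta` are hypothetical.

```python
import random

def generate_review(theta, topics, vocab, sentence_lengths, rng):
    """Generate sentences where all words in a sentence share a single topic."""
    sentences = []
    for n_words in sentence_lengths:
        # One topic draw per sentence (standard LDA would draw one per word).
        z = rng.choices(range(len(topics)), weights=theta)[0]
        words = rng.choices(vocab, weights=topics[z], k=n_words)
        sentences.append((z, words))
    return sentences

vocab = ["room", "bed", "clean", "pasta", "wine", "tasty"]
# Hypothetical topics with disjoint vocabularies, to make the constraint visible.
topics = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # "hotel" words only
    [0.0, 0.0, 0.0, 0.3, 0.3, 0.4],  # "food" words only
]
theta = [0.5, 0.5]  # the review as a whole mixes both topics

rng = random.Random(1)
review = generate_review(theta, topics, vocab, sentence_lengths=[4, 3, 5], rng=rng)
for topic_id, words in review:
    print(topic_id, words)
```

The review can still cover several topics, but each individual sentence stays within one.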
