W2 T2 Flashcards
A Web Scraper will only give you data that
you can see on the website (available and visible to visitors): product features, ratings, reviews, etc.
A Web Scraper does NOT give you
what customers/visitors do on that website: what they click and how they click (non-visible information)
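To make the point about visible data concrete, here is a minimal sketch that uses only Python's standard library. The HTML snippet, the class names (name, rating, review) and the VisibleFieldScraper class are all made up for illustration; a real page would need its own selectors. The scraper can only collect what is in the page markup; nothing in the page tells it what visitors clicked.

```python
from html.parser import HTMLParser

# Hypothetical product page snippet (invented for this example).
PAGE = """
<div class="product">
  <h1 class="name">Trail Shoe</h1>
  <span class="rating">4.5</span>
  <p class="review">Very comfortable.</p>
  <p class="review">Runs a bit small.</p>
</div>
"""

class VisibleFieldScraper(HTMLParser):
    """Collects the text of elements whose class is name, rating or review."""

    def __init__(self):
        super().__init__()
        self.current = None   # class of the element we are currently inside
        self.fields = {}      # collected text, grouped by class name

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in ("name", "rating", "review"):
            self.current = css_class

    def handle_data(self, data):
        # Store non-empty text only while inside a field of interest.
        if self.current and data.strip():
            self.fields.setdefault(self.current, []).append(data.strip())

    def handle_endtag(self, tag):
        self.current = None

scraper = VisibleFieldScraper()
scraper.feed(PAGE)
print(scraper.fields)
```

Click behaviour would have to come from a separate source, because it never appears in the page a visitor sees.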
Text Analytics is a broad domain in data science, with many methods and approaches available.
In Digital Marketing practice we mainly use two methods, with the help of Natural Language Processing / Machine Learning.
1 Sentiment (Emotion-Valence) Analysis
2 Topic Analysis or Modeling
1 Sentiment (Emotion-Valence) Analysis:
Quantifying the positivity/negativity or the emotions in text data.
2 Topic Analysis or Modeling:
Detecting/classifying/clustering the main topics in a large textual data file.
What do sentiment/topic analysis tools do?
They analyze valence automatically, using machine-learning-based models/algorithms.
Sentiment Analysis (in practice) distinguishes 4 different classes
(i) Positivity/Valence (ii) Polarity/Extremity (iii) Subjectivity/Emotionality or, if more advanced, (iv) Emotions, in a detailed breakdown
Sentiment Models/Techniques can be: 2
1 Keyword (lexicon) based: simpler and faster, but more classification errors (see the example on the right)
2 Contextual-semantic based: more accurate; looks at the meaning of the whole sentence, not only the keywords (example: on the left)
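The classification error of the keyword approach can be illustrated with a toy scorer. This is only a sketch: the LEXICON values and the function names lexicon_sentiment and label are invented for this example and do not come from any course tool.

```python
import re

# Hypothetical mini-lexicon: word -> valence score (illustrative values only).
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}

def lexicon_sentiment(text):
    """Sum the valence of every lexicon word found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

def label(score):
    """Map a numeric score to a coarse sentiment class."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

reviews = [
    "I love this phone, the camera is great",
    "The battery is not good",  # negation: the keyword method misreads this
]
for review in reviews:
    print(review, "->", label(lexicon_sentiment(review)))
```

The second review is labelled positive because the scorer only sees the keyword "good" and ignores "not". A contextual-semantic model is meant to avoid this kind of mistake.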
Sentiment Analysis: Subjectivity & Polarity
SUBJECTIVITY analysis classifies content into objective (facts) or subjective (opinions)
POLARITY analysis indicates the strength of an opinion (how fierce it is) as positive, neutral, or negative
What can be clues for sarcasm? 3
Interjections, punctuation marks, quotes, etc.
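A rough sketch of how these three clues could be flagged with simple rules. The interjection list, the regular expressions and the function name sarcasm_clues are assumptions made for this example; finding a clue only suggests that a text deserves a closer look and does not prove sarcasm.

```python
import re

# Hypothetical interjection list, for illustration only.
INTERJECTIONS = {"oh", "wow", "yeah", "ugh", "gee"}

def sarcasm_clues(text):
    """Return which of the three surface clues appear in the text."""
    clues = []
    words = re.findall(r"[a-z]+", text.lower())
    if any(word in INTERJECTIONS for word in words):
        clues.append("interjection")
    # Repeated ! or ? marks, or an ellipsis, count as a punctuation clue.
    if re.search(r"[!?]{2,}|\.{3}", text):
        clues.append("punctuation")
    # A word or phrase wrapped in double quotes ("scare quotes").
    if re.search(r'"[^"]+"', text):
        clues.append("quotes")
    return clues

print(sarcasm_clues('Oh wow, what a "fast" delivery!!!'))
print(sarcasm_clues("The delivery was fast."))
```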
Weakness of Lexicon-Dictionary based methods in Sentiment Analysis:
they only look at keywords, and mostly not at the actual meaning or context
But many firms still use dictionary/keyword-based methods, because
they are easy, practical, less costly, and require less investment in data science
Topic Modeling (Analysis) is
mostly classifying text data on the basis of the main topics covered/mentioned.
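As a simplified illustration of assigning a main topic to a text, the sketch below counts keyword hits per topic. This is keyword matching, not a statistical topic model, and the topics, keywords and the function name main_topic are hypothetical.

```python
import re
from collections import Counter

# Hypothetical topics and keywords (invented for this example).
TOPIC_KEYWORDS = {
    "delivery": {"delivery", "shipping", "arrived", "late"},
    "price": {"price", "expensive", "cheap", "cost"},
    "quality": {"quality", "broken", "sturdy", "material"},
}

def main_topic(text):
    """Return the topic with the most keyword hits, or 'other' if none match."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for topic, keywords in TOPIC_KEYWORDS.items():
        counts[topic] = sum(1 for word in words if word in keywords)
    topic, hits = counts.most_common(1)[0]
    return topic if hits > 0 else "other"

for review in [
    "Shipping was late and the delivery box arrived damaged",
    "Too expensive for the price",
    "Nice colour",
]:
    print(review, "->", main_topic(review))
```

Real topic modeling tools typically discover the topics from the data instead of relying on a hand-written keyword list.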
Sentiment & Topic Analysis: Tools 2
1.1 MonkeyLearn (simple: for this course)
1.2 Evaluative Lexicon (for this course)
Evaluative Lexicon
More advanced lexicon-based sentiment detection with a higher level of quantification (valence, extremity, emotionality scores). Academically validated. Has its limitations (lexicon based).