Lecture 4 Flashcards
Overview of brand perception data
Several market research companies track the perception of firms and brands
This includes variables such as attitudes towards the firm, customer satisfaction, reputation, quality perceptions, etc.
Indication for how strong a brand is in the hearts and minds of consumers
Some potential limitations (brand perception data)
No actual purchase behaviour (e.g. sales)
Response bias
Sampling bias
Overview of stock return data
A stock price is the current price at which a share of stock trades on the market
A company's stock price reflects investor perception of its ability to earn and grow its profits in the future
Issues within and outside of a company may cause a stock price to move in either direction
Stock price data is available for free (e.g. yahoofinance.com)
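Returns are usually derived from consecutive prices. A minimal sketch, assuming the simple (non-logarithmic) return and hypothetical closing prices; the function name `simple_returns` is made up for this example:

```python
def simple_returns(prices):
    """Simple return per period: (current price - previous price) / previous price."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]

# Hypothetical closing prices for three consecutive trading days.
closing_prices = [100.0, 102.0, 99.96]

returns = simple_returns(closing_prices)
print(returns)  # one return per pair of consecutive days
```

A series of n prices yields n - 1 returns, because the first day has no previous price to compare against.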
Some potential limitations (stock price data)
Sampling bias (i.e. only available for traded companies)
Only focus on investors
Text data
Huge amounts of text data: online reviews, social media posts, texts, customer service calls, open-ended survey questions, firm annual reports, advertisements, newspaper articles, movie scripts, song lyrics, etc.
Text data: textual data goes beyond social media
firm to firm
consumer to consumer
society to society
Cleaning big data
Most time-consuming and least enjoyable data science task, survey says
What data scientists spend the most time doing
Building training sets: 3%
Cleaning and organizing data: 60%
Collecting data sets: 19%
Mining data for patterns: 9%
Refining algorithms: 4%
Other: 5%
Statistical data editing
Observed data generally contain errors and missing values, so they must undergo preliminary preparation before they can be analyzed
Process of checking observed data, and, when necessary, correcting them
Essential tasks (statistical data editing)
Error localization: determine which values are erroneous
Correction: correct missing and erroneous data in the best possible way
Consistency: adjust values such that all edits become satisfied
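The three tasks above can be sketched in a few lines of Python. This is an illustration only, not a standard method: the records, the field names (`age`, `income`), the edit rule (age between 0 and 120) and the choice of median imputation are all assumptions made for the example.

```python
from statistics import median

# Hypothetical survey records; None marks a missing value.
records = [
    {"id": 1, "age": 34, "income": 42000},
    {"id": 2, "age": 230, "income": 38000},   # implausible age
    {"id": 3, "age": None, "income": 51000},  # missing age
    {"id": 4, "age": 45, "income": 47000},
]

def age_is_valid(value):
    """Assumed edit rule: age must be present and between 0 and 120."""
    return value is not None and 0 <= value <= 120

# 1. Error localization: find the records whose age violates the edit rule.
flagged_ids = [r["id"] for r in records if not age_is_valid(r["age"])]

# 2. Correction: impute the median of the valid ages (one simple choice among many).
valid_ages = [r["age"] for r in records if age_is_valid(r["age"])]
imputed_age = median(valid_ages)
cleaned = [
    r if age_is_valid(r["age"]) else {**r, "age": imputed_age}
    for r in records
]

# 3. Consistency: confirm that every record now satisfies the edit rule.
all_consistent = all(age_is_valid(r["age"]) for r in cleaned)

print(flagged_ids)
print(all_consistent)
```

Median imputation is only one possible correction; re-contacting the respondent or using a model-based estimate are alternatives when the stakes are higher.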
Interviewer error
Interviewers may not be giving the respondents the correct instructions
Omissions
Respondents often fail to answer a single question or a section of the questionnaire, either deliberately or inadvertently
Ambiguity
A response might be illegible or unclear
Inconsistencies
Sometimes two responses can be logically inconsistent. For example, a respondent who is a lawyer may have checked a box indicating that he or she did not complete high school
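A logical inconsistency of this kind can be caught with a cross-field check. The sketch below assumes hypothetical field names (`occupation`, `completed_high_school`) and a hypothetical list of occupations that require finished schooling:

```python
# Hypothetical set of occupations that cannot be held without finishing high school.
DEGREE_OCCUPATIONS = {"lawyer", "physician", "teacher"}

def is_inconsistent(response):
    """Flag a response whose occupation contradicts its schooling answer."""
    return (
        response["occupation"] in DEGREE_OCCUPATIONS
        and not response["completed_high_school"]
    )

# Hypothetical questionnaire responses.
responses = [
    {"id": 1, "occupation": "lawyer", "completed_high_school": False},
    {"id": 2, "occupation": "lawyer", "completed_high_school": True},
    {"id": 3, "occupation": "cashier", "completed_high_school": False},
]

inconsistent_ids = [r["id"] for r in responses if is_inconsistent(r)]
print(inconsistent_ids)
```

The check only flags the record; deciding which of the two answers is wrong still requires judgment or follow-up.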
Lack of cooperation
In a long questionnaire with hundreds of attitude questions, a respondent might rebel and check the same response in a long list of questions
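This pattern is often called straight-lining. One simple way to screen for it is to measure the longest run of identical consecutive answers. The sketch below uses hypothetical answers on a 1-5 scale and an assumed threshold `MAX_RUN`; both would need to be adapted to the actual questionnaire.

```python
from itertools import groupby

def longest_identical_run(answers):
    """Return the length of the longest run of identical consecutive answers."""
    return max((len(list(group)) for _, group in groupby(answers)), default=0)

# Hypothetical answers to ten attitude questions on a 1-5 scale.
engaged = [4, 2, 5, 3, 3, 1, 4, 2, 5, 3]
rebelling = [4, 2, 3, 3, 3, 3, 3, 3, 3, 3]

MAX_RUN = 5  # assumed threshold; choose it to fit the questionnaire

flag_engaged = longest_identical_run(engaged) > MAX_RUN
flag_rebelling = longest_identical_run(rebelling) > MAX_RUN
print(flag_engaged, flag_rebelling)
```

A long run is a warning sign rather than proof, since a respondent may genuinely hold the same view on many items.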
Ineligible respondent
An inappropriate respondent may be included in the sample (e.g. underage respondents)
Data coding
Specifying how the information should be categorized to facilitate the analysis. The main purpose is to transform the data into a form suitable for the analysis
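As an illustration, coding can be as simple as mapping free-form answers to numeric categories with a codebook. The codebook, the missing-value code `-9` and the function name `code_response` below are assumptions made for this sketch:

```python
# Hypothetical codebook mapping normalized answers to numeric codes.
CODEBOOK = {
    "very dissatisfied": 1,
    "dissatisfied": 2,
    "neutral": 3,
    "satisfied": 4,
    "very satisfied": 5,
}
MISSING_CODE = -9  # assumed convention for answers that are not in the codebook

def code_response(raw):
    """Normalize whitespace and case, then look the answer up in the codebook."""
    key = raw.strip().lower()
    return CODEBOOK.get(key, MISSING_CODE)

# Hypothetical raw answers as they might be typed in.
raw_answers = ["Satisfied", " neutral ", "VERY SATISFIED", "n/a"]
coded = [code_response(answer) for answer in raw_answers]
print(coded)
```

Keeping an explicit code for unmatched answers makes them easy to find and review later instead of silently dropping them.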
Data matching
Task of identifying, matching, and merging records that correspond to the same entities from several databases or even within one database
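A minimal sketch of matching records across two databases, assuming exact matching on a normalized firm name (real projects often need fuzzy or probabilistic matching). The records, field names, tickers and the list of legal suffixes are all hypothetical:

```python
import re

LEGAL_SUFFIXES = {"inc", "corp", "corporation", "ltd", "co"}  # assumed list

def match_key(name):
    """Normalize a firm name: lowercase, drop punctuation and legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)

# Hypothetical records from two separate databases.
perception = [
    {"firm": "Acme Inc.", "reputation": 7.8},
    {"firm": "Globex Corporation", "reputation": 6.1},
]
stock = [
    {"company": "ACME, Inc", "ticker": "ACM"},
    {"company": "Initech Ltd", "ticker": "INI"},
]

# Index one database by its match key, then look up each record of the other.
stock_by_key = {match_key(r["company"]): r for r in stock}

merged = []
for record in perception:
    match = stock_by_key.get(match_key(record["firm"]))
    if match is not None:
        merged.append({**record, "ticker": match["ticker"]})

print(merged)
```

Only firms found in both databases end up in `merged`, which is also how combining perception data with stock data can narrow a sample to traded companies.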