Lecture 4 Flashcards

1
Q

overview of brand perception data

A

several market research companies track the perception of firms and brands

This includes variables such as attitudes towards the firm, customer satisfaction, reputation, quality perceptions, etc.

Indication for how strong a brand is in the hearts and minds of consumers

2
Q

Some potential limitations (brand perception data)

A

No actual purchase behaviour (e.g. sales)

Response bias

Sampling bias

3
Q

Overview of stock return data

A

Refers to the current price at which a share of stock is trading on the market

A company's stock price reflects investor perception of its ability to earn and grow its profits in the future

Issues within and outside of a company may cause a stock price to move in either direction

Stock price data is available for free (e.g. yahoofinance.com)

4
Q

Some potential limitations (stock price data)

A

Sampling bias (i.e. only available for traded companies)

Only focus on investors

5
Q

Text data

A

Huge amounts of text data: online reviews, social media posts, texts, customer service calls, open-ended survey questions, firm annual reports, advertisements, newspaper articles, movie scripts, song lyrics, etc.

6
Q

Text data: textual data goes beyond social media

A

firm to firm

consumer to consumer

society to society

7
Q

Cleaning big data

A

Most time-consuming and least enjoyable data science task, survey says

8
Q

What data scientists spend the most time doing

A

Building training sets 3%

Cleaning and organizing data 60%

Collecting data sets 19%

Mining data for patterns 9%

Refining algorithms 4%

other 5%

9
Q

Statistical data editing

A

observed data generally contains errors and missing values. Thus, the data must undergo preliminary preparation before the data can be analyzed

Process of checking observed data, and, when necessary, correcting them

10
Q

Essential tasks (statistical data editing)

A

Error localization: determine which values are erroneous

Correction: correct missing and erroneous data in the best possible way

Consistency: adjust values such that all edits become satisfied

11
Q

Interviewer error

A

interviewers may not be giving the respondents the correct instructions

12
Q

Omissions

A

respondents often fail to answer a single question or a section of the questionnaire, either deliberately or inadvertently

13
Q

Ambiguity

A

a response might not be legible or it might be unclear

14
Q

Inconsistencies

A

Sometimes two responses can be logically inconsistent. For example, a respondent who is a lawyer may have checked a box indicating that he or she did not complete high school
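
A check like the lawyer example can be written as a simple edit rule. This is only a sketch: the field names (`occupation`, `education`) and the list of occupations that imply a diploma are assumptions made for illustration, not part of the lecture.

```python
# Hypothetical edit rule: some occupations imply at least a high-school diploma.
DEGREE_OCCUPATIONS = {"lawyer", "physician"}  # assumed list, for illustration only

def find_inconsistent(records):
    """Return the indices of records that violate the edit rule."""
    flagged = []
    for i, rec in enumerate(records):
        needs_diploma = rec["occupation"] in DEGREE_OCCUPATIONS
        if needs_diploma and rec["education"] == "no_high_school":
            flagged.append(i)
    return flagged

responses = [
    {"occupation": "lawyer", "education": "no_high_school"},   # inconsistent
    {"occupation": "lawyer", "education": "university"},
    {"occupation": "cashier", "education": "no_high_school"},  # consistent
]
flagged = find_inconsistent(responses)
print(flagged)
```

Flagging only localizes the error; which of the two answers to correct is a separate decision.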

15
Q

Lack of cooperation

A

In a long questionnaire with hundreds of attitude questions, a respondent might rebel and check the same response in a long list of questions
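
One possible way to spot this pattern (often called straightlining) is to look for long runs of identical answers. The sketch below is an illustration only; the threshold of 10 identical answers in a row is an arbitrary assumption, not a rule from the lecture.

```python
def longest_run(answers):
    """Length of the longest run of identical consecutive answers."""
    best = current = 0
    previous = object()  # sentinel that equals no real answer
    for a in answers:
        current = current + 1 if a == previous else 1
        best = max(best, current)
        previous = a
    return best

def is_straightliner(answers, max_run=10):
    """Flag a respondent whose longest run reaches the (assumed) threshold."""
    return longest_run(answers) >= max_run

engaged = [1, 3, 2, 5, 4, 4, 2, 1, 3, 5, 2, 4]
bored = [3] * 12  # same box checked twelve times in a row

print(is_straightliner(engaged), is_straightliner(bored))
```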

16
Q

Ineligible respondent

A

An inappropriate respondent may be included in the sample (e.g. underage respondents)

17
Q

data coding

A

specifying how the information should be categorized to facilitate the analysis. The main purpose is to transform the data into a form suitable for the analysis

18
Q

Data matching

A

task of identifying, matching and merging records that correspond to the same entities from several databases or even within one database
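
A minimal sketch of matching on a normalized key, assuming two hypothetical sources (`crm` and `shop`) that both store a name and a zip code. Real record linkage usually needs fuzzier comparison than this exact-key approach.

```python
def normalize(name):
    """Lowercase the name and drop punctuation and extra whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

def match_records(db_a, db_b):
    """Merge records from two sources that share a normalized (name, zip) key."""
    index = {(normalize(r["name"]), r["zip"]): r for r in db_b}
    merged = []
    for rec in db_a:
        key = (normalize(rec["name"]), rec["zip"])
        if key in index:
            # Fields from db_a win when both sources define the same field.
            merged.append({**index[key], **rec})
    return merged

crm = [{"name": "Anna  Schmidt", "zip": "50667", "email": "anna@example.com"}]
shop = [
    {"name": "anna schmidt.", "zip": "50667", "orders": 3},
    {"name": "Ben Meyer", "zip": "10115", "orders": 1},
]
merged = match_records(crm, shop)
print(merged)
```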

19
Q

Data imputation

A

Process of estimating missing data and filling these values into the data set
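
Mean imputation is one simple way to do this; it is shown here only as an illustration, since the card does not name a specific method. Missing values are assumed to be stored as `None`.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

ratings = [4, None, 5, 3, None]  # hypothetical survey ratings
imputed = impute_mean(ratings)
print(imputed)
```

Filling every gap with the same value shrinks the variance of the variable, which is one reason more refined methods exist.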

20
Q

Data adjusting

A

Process to enhance the quality of the data for the data analysis (e.g. weighting, variable respecification, scale transformation)

21
Q

Common procedures for statistically adjusting data

A

Weighting: Procedure by which each observation (e.g. consumer responses) in the database is assigned a number according to some pre-specified rule

For example, Weighting is used to make the sample data more representative

Variable respecification: procedure in which the existing data are modified to create new variables or in which a large number of variables are reduced to fewer variables

For example, six categories are summarized into four categories

Scale transformation: procedure to adjust the scale to ensure comparability with other scales

For example, some respondents (e.g. from different cultures) may consistently use the lower end of the rating scale, whereas others may consistently use the upper end. These differences can be corrected for
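
The three procedures can be sketched as follows. The concrete choices are assumptions for illustration: weights are computed as population share divided by sample share, the six-to-four recoding map is made up, and the scale transformation is a within-respondent z-score.

```python
from collections import Counter
from statistics import mean, pstdev

def poststratification_weights(groups, population_share):
    """Weighting: weight = population share / sample share of the group."""
    counts = Counter(groups)
    n = len(groups)
    return [population_share[g] / (counts[g] / n) for g in groups]

# Variable respecification: hypothetical mapping of six categories onto four.
RECODE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 4, 6: 4}

def standardize(ratings):
    """Scale transformation: z-scores within one respondent.

    Assumes the ratings are not all identical (otherwise the SD is zero).
    """
    m, s = mean(ratings), pstdev(ratings)
    return [(r - m) / s for r in ratings]

groups = ["f", "f", "f", "m"]            # sample over-represents "f"
weights = poststratification_weights(groups, {"f": 0.5, "m": 0.5})
recoded = [RECODE[c] for c in [1, 2, 3, 4, 5, 6]]
low_user = standardize([1, 2, 3])        # uses the lower end of the scale
high_user = standardize([5, 6, 7])       # uses the upper end of the scale

print(weights, recoded, low_user, high_user)
```

After standardizing, the two respondents have the same profile even though their raw ratings sit at opposite ends of the scale.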

22
Q

two main ways we can use text data

A

Language REFLECTS

Text reflects intentions, actions, relationships, context and more

E.g. people tweet about near vs. far events with different degrees of concreteness

Brand positioning maps

Customer service that uses “I” vs “we” can have greater impact on customer satisfaction

Language AFFECTS

Text affects perceptions, firm outcomes and more

E.g. online chatter increases stock value

Narrative reviews are more persuasive than non-narrative reviews

Frames impact implicit attitudes about consumption practices

23
Q

Types of scales and informative statistics: location parameters (dispersion parameters)

A

nominal (mode)

Ordinal (median, mode)

Interval + ratio (mean, median, mode) + (variance and SD)

24
Q

Mode

A

value in a measurement series (category) with maximum frequency

25
Q

Median

A

value that lies in the middle of a frequency distribution

26
Q

mode meaning and limits

A

low data requirements (nominal scaling) + intuitive understanding

Limits: ambiguous in interpretation if multiple mode values exist
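
The ambiguity with several modes is easy to see on a small example. The brand choices below are made-up data; `statistics.multimode` returns every value that reaches the maximum frequency.

```python
from statistics import multimode

brand_choices = ["A", "B", "A", "C", "B"]  # hypothetical nominal data
modes = multimode(brand_choices)
print(modes)  # two modes: "A" and "B" both occur twice
```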

Cannot be used for analysis with advanced statistical methods

27
Q

Median meaning and limits

A

low data requirements (ordinal scale) + low sensitivity to outliers

Limits: cannot be used for analysis with advanced statistical methods

28
Q

mean (meaning and limits)

A

most popular location parameter

basis for many advanced statistical analyses (t-test, variance analysis etc)

limits: sensitive to outliers

high scale requirements (interval scaling)
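
A small illustration of the outlier sensitivity, using made-up income figures: one extreme value moves the mean a lot and the median only slightly.

```python
from statistics import mean, median

incomes = [30, 32, 35, 38, 40]     # hypothetical values, e.g. in thousands
with_outlier = incomes + [500]     # add one extreme observation

print(mean(incomes), median(incomes))            # 35 and 35
print(mean(with_outlier), median(with_outlier))  # 112.5 and 36.5
```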

29
Q

discrete distribution

A

Name: binomial distribution, Poisson distribution, multinomial distribution

E.g. Customer retention rate, frequency of purchase, brand selection probability

30
Q

Continuous distribution

A

name: normal distribution, log-normal distribution, χ² (chi-squared) distribution, t-distribution

e.g. image ratings, scales, special distribution in inferential statistics

31
Q

Empirical distribution

A

Many features exhibit normal distribution in reality (e.g. body size)

32
Q

Distribution model for statistical parameters

A

statistical parameters such as mean and variance are approximately normally distributed under repeated sampling
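
This can be checked with a small simulation. The setup is an assumption for illustration: samples of size 25 are drawn repeatedly from a uniform(0, 1) population, which is itself not normal, and the sample means are collected. Their spread should be close to the population SD divided by the square root of the sample size.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is reproducible

def sample_means(n_samples, sample_size):
    """Means of repeated samples drawn from a uniform(0, 1) population."""
    return [
        mean([rng.random() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

means = sample_means(2000, 25)
center = mean(means)                              # close to 0.5
spread = pstdev(means)                            # close to the value below
theoretical_spread = (1 / 12) ** 0.5 / 25 ** 0.5  # population SD / sqrt(n)

print(round(center, 3), round(spread, 4), round(theoretical_spread, 4))
```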

33
Q

mathematical base distribution

A

Distributions in inferential statistics are derived from normal distribution

34
Q

Distribution in error theory

A

random errors in repeated measurements exhibit normal distribution

35
Q

explanatory power and limits of correlation analysis

A

measurement of the linear association strength between two metrically scaled variables

Direction of the correlation is visible

Values are comparable across different variables due to restriction to the interval [-1, +1]

No dependence on the sample size

Strength of the correlation in the sense of the explained variance can be identified (r^2 = explained variance)

prerequisite for statistical verification of (linear) causal relationships

Limits: only linear correlations can be depicted

no sufficient evidence for the presence of a causal relationship

strength of the correlation in the sense of a leverage effect cannot be identified

Spurious association possible if background variables are not controlled for (-> partial correlation coefficient as workaround)
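
A short sketch of two of these points, on made-up numbers. `pearson_r` is a plain implementation of the usual correlation coefficient, and `partial_r` uses the standard first-order partial correlation formula. The example shows that a perfect but nonlinear relation can yield r = 0, and that an association of 0.5 nearly vanishes once a background variable z (correlated 0.7 with both x and y) is controlled for.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

x = [-2, -1, 0, 1, 2]
r_linear = pearson_r(x, [2 * v + 1 for v in x])  # perfect linear relation
r_curved = pearson_r(x, [v ** 2 for v in x])     # perfect but nonlinear relation
r_partial = partial_r(0.5, 0.7, 0.7)             # hypothetical correlations

print(r_linear, r_linear ** 2, r_curved, round(r_partial, 3))
```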

36
Q

How to identify causal relationships

A

1) evidence for a strong association (e.g. correlation) between two variables

2) Changing of the cause variable precedes changing of the result variable (e.g. through a time lag)

3) Evidence that no rival explanation (other correlated parameter) exists for the observed association of the variable

Experiments establish (the best) conditions that make it possible to determine causal relationships

37
Q

Features of an experiment

A

1) formulate a causal relationship

2) evaluation of the directional influence of one or more independent variables on one or more dependent variables
(definition of the independent variables to be manipulated, definition of the dependent variables to be measured, definition of the variation steps (manipulation) of the independent variables)

3) controlling of all disturbing influences (control variables) to exclude distortion of the results
(selection of the test subjects and assignment to the groups, controlling of the selection bias, minimizing the influence of other external variables)

38
Q

experimental group

A

test subjects who are exposed to the experimental stimulus, e.g. a new advertisement

39
Q

Control group

A

test subjects who are not exposed to the experimental stimulus

40
Q

Randomizing

A

random assignment of test subjects to experimental/ control groups
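
Random assignment can be sketched as a shuffle followed by a split. The subject IDs 1 to 20 and the even split are assumptions for illustration.

```python
import random

def randomize(subjects, seed=None):
    """Shuffle the subjects and split them into two groups of (near) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject IDs; the seed only makes the example reproducible.
experimental, control = randomize(range(1, 21), seed=42)
print(sorted(experimental), sorted(control))
```

Because chance alone decides group membership, systematic differences between the groups (selection bias) become unlikely.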

41
Q

Matching

A

test subjects in experimental and control groups share specific criteria (e.g. gender, age)

42
Q

Stimulus

A

variation of a variable that should trigger a behavioral reaction in people (e.g. response to price changes)

43
Q

Lab experiment

A

performance of the experiment in an artificial laboratory environment

Test subjects are aware that they are participating in a test

Advantages: higher internal validity because stimuli can be more effectively manipulated and external factors better controlled, lower costs

Disadvantages: test subjects do not react as in a natural environment, making generalizations and predictions of the effect difficult

Lower external validity

44
Q

Field experiment

A

Performance of the experiment in a natural environment

Test subjects are not aware that they are part of an experiment

e.g. introduction of a new sales promotion plan to retailers

Advantages: higher external validity because test subjects are acting under real conditions

Easier to predict and generalize the effect

Disadvantages: cost intensive

Activities visible to competitors

less manipulation freedom (e.g. limits to changes in the price)

More difficult to control extraneous factors
