Lecture 4 Flashcards

1
Q

Statistical data editing

A

Observed data generally contain errors and missing values. The data must therefore undergo preliminary preparation before they can be analysed.

Statistical data editing is the process of checking observed data and, when necessary, correcting them.

2
Q

Essential tasks

A

Error localisation: determine which values are erroneous

Correction: correct missing and erroneous data in the best possible way

Consistency: adjust values so that all edits (edit rules) are satisfied
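
As a rough illustration of error localisation, the sketch below checks one record against a few edit rules and lists the rules it fails. The record, the field names and the rules themselves are hypothetical and chosen only for this example.

```python
# Hypothetical survey record; field names are assumptions for illustration.
record = {"age": 15, "occupation": "lawyer", "years_of_schooling": 0}

# Each edit rule is a predicate that must be true for a consistent record.
edit_rules = {
    "age_in_range": lambda r: 0 <= r["age"] <= 120,
    "lawyer_has_schooling": lambda r: r["occupation"] != "lawyer" or r["years_of_schooling"] > 0,
    "lawyer_is_adult": lambda r: r["occupation"] != "lawyer" or r["age"] >= 18,
}

def failed_edits(r):
    """Return the names of all edit rules that the record violates."""
    return [name for name, rule in edit_rules.items() if not rule(r)]

violations = failed_edits(record)
print(violations)
```

Knowing which edits fail narrows down which values are likely to be erroneous; deciding how to correct them is a separate step.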

3
Q

Statistical data editing - why does the data need to be edited?

A

Interview error: interviewers may not be giving the respondents the correct instructions

Omissions: respondents often fail to answer a single question or a section of the questionnaire, either deliberately or inadvertently

Ambiguity: a response might not be legible or it might be unclear

Inconsistencies: sometimes two responses can be logically inconsistent, e.g. a lawyer may tick a box saying they didn’t attend school

Lack of cooperation: in a long questionnaire with hundreds of attitude questions, a respondent might rebel and check the same response in a long list of questions

Ineligible respondent: an inappropriate respondent may be included in the sample (e.g. underage respondents)

4
Q

Interview error

A

Interviewers may not be giving the respondents the correct instructions

5
Q

Omissions

A

Respondents often fail to answer a single question or a section of the questionnaire, either deliberately or inadvertently

6
Q

Data techniques to prepare data for model estimation

A

Data coding
Data matching
Data imputation
Data adjusting

7
Q

Data coding

A

Data coding is specifying how the information should be categorised to facilitate the analysis, i.e. transforming the data into a form suitable for the analysis
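
A minimal sketch of data coding, assuming a five-point agreement question. The codebook and the helper `code_response` are hypothetical; a real study would define its own coding scheme.

```python
# Hypothetical codebook for a five-point agreement question.
CODEBOOK = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def code_response(text):
    """Map a text answer to its numeric code; None marks an uncodable answer."""
    return CODEBOOK.get(text.strip().lower())

raw = ["Agree", " neutral ", "STRONGLY AGREE", "maybe"]
coded = [code_response(answer) for answer in raw]
print(coded)
```

Answers that do not fit the scheme come back as `None`, so they can be reviewed during editing rather than silently dropped.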

8
Q

Data matching

A

Data matching is the task of identifying, matching and merging records that correspond to the same entities from several databases or even within one database
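
One simple way to match records is to build a normalised key and join on it. The sketch below assumes, purely for illustration, that name plus postcode identifies a customer; the two tables and their fields are made up.

```python
def match_key(record):
    """Build a normalised key; assumes name plus postcode identifies an entity."""
    name = record["name"].strip().lower()
    postcode = record["postcode"].replace(" ", "").upper()
    return (name, postcode)

# Two hypothetical sources describing overlapping customers.
crm = [{"name": "Ann Lee", "postcode": "ab1 2cd", "email": "ann@example.com"}]
orders = [
    {"name": " ann lee", "postcode": "AB12CD", "total": 40},
    {"name": "Bob Ray", "postcode": "XY9 8ZW", "total": 15},
]

# Index one source by key, then look up each record of the other source.
index = {match_key(r): r for r in crm}
merged = []
for order in orders:
    customer = index.get(match_key(order))
    if customer is not None:
        merged.append({**customer, **order})

print(merged)
```

Exact key matching misses records with spelling variations; handling those needs approximate comparison, which is beyond this sketch.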

9
Q

Data imputation

A

Data imputation is the process of estimating missing data and filling these values into the data set
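
Mean imputation is one of the simplest ways to do this, shown below as a sketch. The data are invented, missing values are assumed to be stored as `None`, and the card itself does not prescribe a particular imputation method.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

incomes = [30, None, 50, 40, None]  # hypothetical data, None = missing
completed = impute_mean(incomes)
print(completed)
```

Mean imputation keeps the sample mean unchanged but understates the variance, so it is a starting point rather than a recommendation.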

10
Q

Data adjusting

A

Data adjusting is a process to enhance the quality of the data for the data analysis (e.g. weighting, variable respecification, scale transformation)

11
Q

Common procedures for statistically adjusting data

A

Weighting = each observation is assigned a number according to some pre-specified rule, e.g. weighting is used to make the sample data more representative

Variable respecification = existing data are modified to create new variables, or a large number of variables are reduced to fewer variables, e.g. six categories are summarised in four categories

Scale transformation = adjust the scale to ensure compatibility with other scales, e.g. some respondents may consistently use the lower end of a rating scale and others the upper end
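
The three procedures can be sketched in a few lines. All numbers below are hypothetical: the population shares, the sample counts, the six-to-four category mapping and the ratings are assumptions made for the example.

```python
from statistics import mean, pstdev

# Weighting: weight = population share / sample share for each group.
population_share = {"men": 0.5, "women": 0.5}
sample_counts = {"men": 80, "women": 20}
n = sum(sample_counts.values())
weights = {g: population_share[g] / (sample_counts[g] / n) for g in sample_counts}

# Variable respecification: collapse six categories into four.
collapse = {1: 1, 2: 1, 3: 2, 4: 3, 5: 4, 6: 4}
collapsed = [collapse[c] for c in [1, 3, 6]]

# Scale transformation: standardise each respondent's ratings (z-scores).
# Note: this fails if a respondent gives identical ratings (standard deviation 0).
def standardise(ratings):
    m, s = mean(ratings), pstdev(ratings)
    return [(r - m) / s for r in ratings]

low_user = standardise([1, 2, 3])   # uses the lower end of the scale
high_user = standardise([3, 4, 5])  # uses the upper end of the scale
print(weights, collapsed, low_user, high_user)
```

After standardisation the two respondents have the same profile, which is the point of the transformation: differences in how people use the scale are removed.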

12
Q

Two main ways we can use data

A

Two main ways we use data:
Language reflects: text reflects intentions, relationships, context and more
E.g. people tweet about nearby events; brand positioning

Language affects: text affects perceptions, firm outcomes and more
E.g. online chatter increases stock value; narrative reviews are more persuasive than non-narrative reviews

13
Q

Mode

A

The mode is the value (category) in a measurement series with the maximum frequency (multiple mode values are possible)

Low data requirements (nominal scaling)

Limits: ambiguous in interpretation if multiple mode values exist; cannot be used for analysis with advanced statistical methods
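
A short sketch of the ambiguity, using Python's standard `statistics.multimode` (Python 3.8+) on invented brand choices:

```python
from statistics import multimode

# Hypothetical nominal data: brand chosen by five respondents.
brands = ["A", "B", "A", "C", "B"]

# multimode returns every value that shares the maximum frequency.
modes = multimode(brands)
print(modes)
```

Here both "A" and "B" occur twice, so there is no single mode to report.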

14
Q

Median

A

The median is the value that lies in the middle of a frequency distribution (same number of instances above and below the median)

Low data requirements (ordinal scaling); low sensitivity to outliers

Limits: cannot be used for analysis with advanced statistical methods

15
Q

Mean

A

The mean is the most popular location parameter and the basis for many advanced statistical analyses (t-test, analysis of variance)
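
To see how the mean and the median differ in their sensitivity to outliers, the sketch below adds one extreme value to a small, invented series of purchase amounts.

```python
from statistics import mean, median

# Hypothetical purchase amounts.
spend = [10, 12, 11, 13, 12]
with_outlier = spend + [200]

print(mean(spend), median(spend))
print(mean(with_outlier), median(with_outlier))
```

The single outlier moves the mean from 11.6 to 43, while the median stays at 12.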

16
Q

Important distributions in stats

A

Discrete distributions: binomial, Poisson and multinomial distributions, e.g. customer retention rate, frequency of purchase, brand selection probability

Continuous distributions: normal distribution, e.g. image ratings

17
Q

Explanatory power

A

Explanatory power of the correlation coefficient: it measures the strength of the linear association between two metrically scaled variables. The direction of the correlation is visible. Values are comparable across different variables because the coefficient is restricted to the interval [-1, 1]
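
A minimal sketch of the Pearson correlation coefficient, written out by hand so the calculation is visible. The helper `pearson_r` and the price and sales figures are assumptions made for this example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: sales fall by the same amount for every price step.
price = [1, 2, 3, 4, 5]
sales = [10, 8, 6, 4, 2]
r = pearson_r(price, sales)
print(round(r, 6))
```

The sign gives the direction (here negative) and the magnitude gives the strength; a perfectly linear relationship like this one yields a value of -1, up to floating-point rounding.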

18
Q

Limits to explanatory power:

A

Only linear correlation can be depicted. A correlation is not sufficient evidence for the presence of a causal relationship. The strength of the correlation in the sense of a leverage effect (how much one variable changes when the other changes) cannot be identified
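
Two of these limits can be shown with a small sketch. The helper `pearson_r` and all data are hypothetical; the first case is a perfect but non-linear relationship, the second compares two linear relationships with very different slopes.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [-2, -1, 0, 1, 2]

# Limit 1: y depends perfectly on x, but not linearly.
r_quadratic = pearson_r(x, [v ** 2 for v in x])

# Limit 2: very different leverage (slope 2 vs slope 10), same correlation.
r_slope_2 = pearson_r(x, [2 * v for v in x])
r_slope_10 = pearson_r(x, [10 * v for v in x])

print(r_quadratic, r_slope_2, r_slope_10)
```

The quadratic relationship gives a correlation of zero, and both linear relationships give a correlation of one, so the coefficient says nothing about how large the effect of a change in x is.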

19
Q

How to identify causal relationships

A
  1. Evidence for a strong association (e.g. correlation) between two variables
  2. A change in the cause variable precedes the change in the result variable (e.g. through a time lag)
  3. Evidence that no rival explanation (another correlated parameter) exists for the observed association of the variables
20
Q

Experimental group

A

The experimental group consists of test subjects who are exposed to the experimental stimulus, e.g. a new advertisement

21
Q

Control groups

A

Test subjects who are not exposed to the experimental stimulus and serve as the basis for comparison with the experimental group

22
Q

Randomising

A

Random assignment of test subjects to the experimental and control groups
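
Random assignment can be sketched as shuffling the subjects and splitting the list in half. The function `randomise`, the subject labels and the seed are hypothetical; the seed is only there to make the example reproducible.

```python
import random

def randomise(subjects, seed=None):
    """Shuffle a copy of the subjects and split it into two groups."""
    rng = random.Random(seed)
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"S{i}" for i in range(1, 9)]  # hypothetical subject labels
experimental, control = randomise(subjects, seed=42)
print(experimental, control)
```

Because chance alone decides group membership, systematic differences between the groups other than the stimulus become unlikely.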

23
Q

Matching

A

Test subjects in the experimental and control groups share specific criteria, e.g. gender, age

24
Q

Stimulus

A

Variation of a variable that should trigger a behavioural reaction in people (e.g. response to price changes)

25
Q

Laboratory experiment

A

A laboratory experiment is the performance of the experiment in an artificial environment; test subjects are aware that they are participating in an experiment, e.g. a new product test for mobile phones

Advantages: higher internal validity, because stimuli can be manipulated more effectively and external factors can be controlled

Disadvantages: test subjects do not react as they would in a natural environment, making generalisation and prediction of the effect difficult

26
Q

Field experiment

A

A field experiment is the performance of the experiment in a natural environment; test subjects are not aware that they are part of an experiment, e.g. introduction of a new sales promotion plan to retailers

Advantages: higher external validity, because test subjects act under real conditions; easier to predict and generalise the effect

Disadvantages: cost intensive; activities are visible to competitors; less manipulation is possible (e.g. limits to changes in prices); extraneous factors are more difficult to control