lecture 4 summary Flashcards

1
Q

statistical data editing

A

is the process of checking observed data and correcting them if necessary

2
Q

error localization

A

determines which values are erroneous
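
A minimal sketch of how candidate erroneous values could be flagged with simple edit rules. The record, the field names, and the rules are illustrative assumptions, not part of the lecture. The sketch only marks fields involved in a violated rule; deciding which of those fields is actually wrong needs further judgment.

```python
# Hypothetical survey record; field names and values are made up for illustration.
record = {"age": 15, "marital_status": "married", "income": -200}

# Each edit rule pairs the fields it involves with a check that a valid record must pass.
EDIT_RULES = [
    (("age",), lambda r: 0 <= r["age"] <= 120),
    (("income",), lambda r: r["income"] >= 0),
    (("age", "marital_status"),
     lambda r: not (r["age"] < 16 and r["marital_status"] == "married")),
]


def suspect_fields(rec):
    """Return the set of fields that appear in at least one violated rule."""
    suspects = set()
    for fields, check in EDIT_RULES:
        if not check(rec):
            suspects.update(fields)
    return suspects


flagged = suspect_fields(record)
```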

3
Q

we recognize the following types of errors

A

Interviewer error: interviewers may not give the respondents the correct instructions

Omission: respondents often fail to answer a single question or a section of the questionnaire, either deliberately or inadvertently

Ambiguity: a response might not be legible, or it might be unclear

Inconsistencies: sometimes two responses can be logically inconsistent

Lack of cooperation: in a long questionnaire with hundreds of attitude questions, a respondent might rebel and check the same response in a long list of questions

Ineligible respondent: an inappropriate respondent may be included in the sample (e.g. underage respondents)

4
Q

Data coding

A

is specifying how the information should be categorized to facilitate the analysis. The main purpose is to transform the data into a form suitable for the analysis

5
Q

Data matching

A

is the task of identifying, matching and merging records that correspond to the same entities from several databases or even within one database
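
A minimal sketch of matching and merging records from two sources, assuming a normalized name is good enough as a matching key. The two lists, their field names, and the normalization rule are hypothetical; real matching usually needs more robust keys or fuzzy comparison.

```python
def normalize(name):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


# Two hypothetical sources that describe partly the same people.
crm = [{"name": "Ada Lovelace", "email": "ada@example.com"}]
billing = [
    {"name": "  ada   LOVELACE", "balance": 120.0},
    {"name": "Alan Turing", "balance": 5.0},
]

# Index one source by the normalized key, then look up each record of the other.
index = {normalize(r["name"]): r for r in crm}

merged = []
for row in billing:
    match = index.get(normalize(row["name"]))
    if match is not None:
        # Merge both records, keeping the spelling of the name from the first source.
        merged.append({**match, **row, "name": match["name"]})
```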

6
Q

Data imputation

A

is the process of estimating missing data and filling these values into the dataset
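
One simple way to do this is mean imputation, sketched below on made-up data. It is only one of several possible estimation rules and is shown here as an assumption, not as the method the lecture prescribes.

```python
from statistics import mean

# Hypothetical age column; None marks a missing answer.
ages = [23, None, 31, 27, None, 39]

# Estimate the missing values from the observed ones ...
observed = [a for a in ages if a is not None]
fill_value = mean(observed)

# ... and fill the estimate into the gaps.
imputed = [fill_value if a is None else a for a in ages]
```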

7
Q

Data adjusting

A

refers to the process of enhancing the quality of the data for the data analysis

8
Q

Weighting

A

is the procedure by which each observation in the database is assigned a number according to some pre-specified rule
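
A small sketch of what weighting does to a summary statistic. The scores and weights are invented, and the rule behind the weights (the second respondent stands for an under-represented group) is an assumption for illustration.

```python
# Hypothetical satisfaction scores and the weight assigned to each observation.
responses = [4, 2, 5]
weights = [1.0, 2.0, 1.0]  # assumed rule: the second respondent counts double

unweighted_mean = sum(responses) / len(responses)
weighted_mean = sum(w * x for w, x in zip(weights, responses)) / sum(weights)
```

With these numbers the weighted mean is lower than the unweighted one, because the low score carries more weight.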

9
Q

Variable re-specification

A

is the procedure in which the existing data are modified to create new variables, or in which a large number of variables are reduced into fewer variables
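
Both cases can be sketched on made-up respondents: a new categorical variable derived from age, and three attitude items reduced to one score. The field names, the age cut-offs, and the choice of a simple average are assumptions.

```python
respondents = [
    {"age": 17, "q1": 4, "q2": 5, "q3": 3},
    {"age": 45, "q1": 2, "q2": 1, "q3": 3},
]


def age_group(age):
    """Collapse a numeric age into a coarse category (cut-offs are illustrative)."""
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


for r in respondents:
    # New variable created from existing data.
    r["age_group"] = age_group(r["age"])
    # Several variables reduced to a single one.
    r["attitude_score"] = (r["q1"] + r["q2"] + r["q3"]) / 3
```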

10
Q

scale transformation

A

is the procedure to adjust the scale to ensure comparability with other scales
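
As an illustration, answers given on a 5-point and on a 7-point scale can be mapped to a common 0 to 1 range. Min-max rescaling is an assumed choice here; standardizing to z-scores would be another option.

```python
def rescale(value, low, high):
    """Map a value from the range [low, high] onto [0, 1]."""
    return (value - low) / (high - low)


# The same raw answer "4" means different things on different scales.
on_five_point = rescale(4, 1, 5)
on_seven_point = rescale(4, 1, 7)
```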

11
Q

The mode

A

is the value in a measurement series (category) with maximum frequency (multiple mode values are possible)
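
A short sketch with Python's standard library, on made-up data, showing that more than one mode can exist:

```python
from statistics import multimode

# 2 and 3 both occur twice, so the series has two modes.
modes = multimode([1, 2, 2, 3, 3, 4])

# The mode also works for categories, not only numbers.
colour_modes = multimode(["red", "blue", "red"])
```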

12
Q

Median

A

is the value that lies in the middle of a frequency distribution (same number of instances above and below the median)
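
A short sketch with Python's standard library, on made-up data. For an even number of values, `statistics.median` averages the two middle values.

```python
from statistics import median

odd_case = median([7, 1, 3])      # sorted: 1, 3, 7 -> middle value
even_case = median([1, 3, 5, 7])  # average of the two middle values 3 and 5
```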

13
Q

discrete distributions

A

such as the binomial, Poisson, and multinomial distributions

14
Q

Continuous distributions

A

such as the normal, log-normal, t, and F distributions

15
Q

What a positive correlation and a negative correlation reflect

A

A positive correlation reflects a tendency for a high value in one variable to be associated with a high value in a second variable.

A negative correlation reflects an association between a high value in one variable and a low value in a second variable

16
Q

Correlation analysis

A

is a measure of the strength of the linear association between two metrically scaled variables. Values are comparable across different variables because the correlation coefficient is restricted to the interval [-1, 1]. It has the following limitations:

Only linear correlations can be depicted

No sufficient evidence for the presence of a causal relationship

Strength of the correlation in the sense of a leverage effect cannot be identified

Spurious association is possible if background variables are not controlled for
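
The first and third limitations can be sketched with a hand-written Pearson coefficient on made-up data. The use of Pearson's r is an assumption, since the card does not name a specific coefficient. A perfect but curved relationship yields a coefficient of zero, and two lines with very different slopes yield the same coefficient.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long number sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [-2, -1, 0, 1, 2]

positive = pearson_r(xs, [2 * x + 1 for x in xs])  # rising straight line
negative = pearson_r(xs, [-3 * x for x in xs])     # falling straight line
curved = pearson_r(xs, [x ** 2 for x in xs])       # exact dependence, but not linear

# Slope (the "leverage") is invisible: a shallow and a steep line look the same.
shallow = pearson_r(xs, [0.1 * x for x in xs])
steep = pearson_r(xs, [100 * x for x in xs])
```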

17
Q

Causality means

A

that a change in one variable will produce a change in another. If we can claim for sure that x causes y, we can talk about a causal relationship. If there are theoretical reasons why a different variable, such as z, causes a change in both x and y, we need to control for this variable (e.g. strategy to increase ice cream prices); otherwise we cannot claim causality. Furthermore, we need to make sure that x causes y and not that y causes x. The first approach to determining the direction of causation is to draw on logic and previous theories. Theory always comes first. The second approach is to consider that there is usually a time lag between cause and effect, so if such a time lag can be postulated, a causal relationship can be identified.

18
Q

There are three ways to identify a causal relationship

A

There is theoretical evidence for a strong association, or correlation, between two variables

A change in the cause variable precedes a change in the result variable

There is evidence that no rival explanation (another correlated parameter) exists for the observed association of the variables

19
Q

Experiment

A

features the formulation of a causal relationship (hypothesis). It is an evaluation of the directional influence of one or more independent variables on one or more dependent variables

20
Q

Experimental group

A

test subjects who are exposed to the experimental stimulus, e.g. a new advertisement

21
Q

Control group

A

test subjects who are not exposed to the experimental stimulus

22
Q

Randomizing

A

random assignment of test subjects to experimental / control groups
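
A minimal sketch of random assignment, assuming eight hypothetical test subjects and two equally sized groups. The fixed seed is there only to make the sketch reproducible; a real experiment would not need it.

```python
import random

subjects = [f"S{i}" for i in range(1, 9)]  # hypothetical subject IDs S1..S8

rng = random.Random(42)  # fixed seed for reproducibility of this sketch only
shuffled = subjects[:]   # copy so the original order is preserved
rng.shuffle(shuffled)

# Split the shuffled list in half: chance alone decides who lands in which group.
half = len(shuffled) // 2
experimental_group = shuffled[:half]
control_group = shuffled[half:]
```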

23
Q

Matching

A

test subjects in experimental and control groups share specific criteria

24
Q

stimulus

A

variation of a variable that should trigger a behavioural reaction in people

25
Q

Laboratory experiment

A

is an experiment carried out in an artificial (laboratory) environment. Test subjects are aware that they are participating in a test

26
Q

Field experiment

A

is an experiment carried out in a natural environment. Test subjects are not aware that they are part of an experiment

27
Q

limits of correlation analysis

A

only linear relations can be depicted

no evidence for causal relationship

strength of the correlation in the sense of a leverage effect cannot be identified

spurious association is possible if background variables are not accounted for