Lecture 3 Flashcards

1
Q

Data coding

A

Specifying how the information should be categorized to
facilitate the analysis. The main purpose is to transform the data into a
form suitable for analysis.

2
Q

Data matching

A

Task of identifying, matching and merging records that
correspond to the same entities from several databases or even within
one database
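
A minimal sketch of exact-key matching after normalization. The record fields (`name`, `city`, `email`) and both helper functions are hypothetical illustrations; real record linkage often relies on fuzzy comparison, which is not shown here.

```python
def normalize(name):
    """Lowercase a name and collapse repeated whitespace."""
    return " ".join(name.lower().split())

def match_records(db_a, db_b):
    """Merge records from two databases that share the same normalized name."""
    index = {normalize(r["name"]): r for r in db_a}
    merged = []
    for r in db_b:
        key = normalize(r["name"])
        if key in index:
            # Combine the fields of both records into one entity.
            merged.append({**index[key], **r, "name": key})
    return merged

db_a = [{"name": "Anna  Meyer", "city": "Berlin"}]
db_b = [
    {"name": "anna meyer", "email": "a@example.com"},
    {"name": "Bob Stone", "email": "b@example.com"},
]
merged = match_records(db_a, db_b)
```

Only the record present in both databases ends up in `merged`.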

3
Q

Data imputation

A

Process of estimating missing data and filling these values into the data set
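
One simple way to do this is mean imputation, sketched below. The function name `impute_mean` and the sample data are assumptions for illustration; other estimation methods exist and may fit better.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

ages = [20, None, 30, 40, None]
imputed = impute_mean(ages)
```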

4
Q

Data adjusting

A

Process to enhance the quality of the data for the data analysis (e.g., weighting, variable respecification, scale transformation)

5
Q

Weighting

A

Procedure by which each observation (e.g., a consumer response) in the database is
assigned a number (its weight) according to some pre-specified rule
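
A small sketch of how assigned weights might enter an analysis, here a weighted mean. The ratings and weights are made-up values, and `weighted_mean` is a hypothetical helper.

```python
def weighted_mean(values, weights):
    """Average the values, counting each one in proportion to its weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

ratings = [4, 2, 5]          # three consumer responses
weights = [1.0, 2.0, 1.0]    # the second respondent counts double
result = weighted_mean(ratings, weights)
```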

6
Q

Variable respecification

A

Procedure in which the existing data are modified to create new variables, or in which a large number of
variables are reduced into fewer variables

For example, six categories are collapsed into four categories.

7
Q

Scale transformation

A

Procedure to adjust the scale to ensure comparability with other scales, e.g., converting between grading systems in different countries
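
A minimal sketch of one possible transformation, a linear rescaling between two ranges. The 0 to 100 and 1 to 5 scales are assumptions for illustration; actual grade conversions between countries usually follow their own official rules.

```python
def rescale(value, old_min, old_max, new_min, new_max):
    """Map a value linearly from one scale range onto another."""
    fraction = (value - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)

# 75 points on a 0-100 scale expressed on a 1-5 scale
converted = rescale(75, 0, 100, 1, 5)
```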

8
Q

How to identify causal relationships

A

Evidence for a strong association (e.g. correlation) between two variables.

Evidence that no rival explanation (other correlated parameter) exists for the observed association of the variables.

A change in the cause variable precedes the change in the result variable (e.g. through a time lag).

9
Q

Experimental group

A

Test subjects who are exposed to the experimental stimulus, e.g. a new
advertisement

10
Q

Control group

A

Test subjects who are not exposed to the experimental stimulus

11
Q

Randomizing

A

Random assignment of test subjects to experimental / control groups
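
Random assignment can be sketched as shuffling the subjects and splitting the list in half. The function name `randomize`, the subject IDs and the equal group sizes are assumptions made for this example.

```python
import random

def randomize(subjects, seed=None):
    """Shuffle the subjects and split them into two groups."""
    rng = random.Random(seed)   # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

experimental, control = randomize(range(1, 11), seed=42)
```

Every subject lands in exactly one of the two groups, and chance alone decides which one.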

12
Q

Matching

A

Test subjects in experimental and control groups share specific criteria (e.g.
gender, age)

13
Q

Stimulus

A

Variation of a variable that should trigger a behavioral reaction in people (e.g.
response to price changes)

14
Q

Entity extraction

A

Which entities (words) do people write about?

15
Q

Topic Modelling

A

Which topics do people write about?

Examples: topics in movie reviews, motivations to host on Airbnb

16
Q

Sentiment analysis

A

How positive/negative is the text?
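
A toy sketch of a lexicon-based approach: count positive words and subtract negative ones. The word lists and the function `sentiment_score` are invented for illustration; practical sentiment analysis uses much larger lexicons or trained models.

```python
# Hypothetical mini-lexicons
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}

def sentiment_score(text):
    """Return (#positive words - #negative words) for the text."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

score = sentiment_score("Great plot, but poor acting. I love it!")
```

A score above zero suggests a positive text, below zero a negative one.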

17
Q

Relationships between entities (words)

A

How do words relate to each other (What side effects are mentioned with the drug?)

18
Q

Writing style

A

What is the writing style of the text? (e.g., identifying personality traits of social media users)

19
Q

Mode

A

Low data requirements
Intuitive understanding

Ambiguous if multiple mode values exist
Cannot be used with advanced statistical methods

20
Q

Median

A

Low data requirements
Low sensitivity to outliers

Can’t be used with advanced statistical methods

21
Q

Mean

A

Most popular location parameter
Basis for many advanced statistical analyses

Sensitive to outliers
High scale requirements (interval scaling)
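
The properties of mode, median and mean can be checked with a short sketch using Python's `statistics` module. The income figures are made-up sample data.

```python
from statistics import mean, median, multimode

incomes = [30, 32, 32, 35, 40]
with_outlier = incomes + [500]   # add one extreme value

mean_before, mean_after = mean(incomes), mean(with_outlier)
median_before, median_after = median(incomes), median(with_outlier)

# The mode is ambiguous when several values are equally frequent.
modes = multimode([1, 1, 2, 2, 3])
```

The single outlier pulls the mean far upward, while the median barely moves.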

22
Q

What is causality?

A

Causality means that a change in variable X causes a change in variable Y. We need to consider control variables before claiming causality.

23
Q

How to identify causal relationships?

A

Evidence for strong association between two variables

A change in the cause variable precedes the change in the result variable

Evidence that no rival explanation exists for the observed association of the variables