Lecture 1 & 2 Flashcards

1
Q

estimates

A

Claims about the population, based on data from a sample

2
Q

why is statistics important

A
  1. Statistical models help us draw inferences from huge datasets
  2. Uncertainty → inferences always come with some level of uncertainty; statistics allows us to measure this uncertainty
  3. Statistics gives us tools to process information in a principled manner in order to draw inferences (claims) about the world
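
As a rough illustration of "measuring uncertainty", the sketch below computes a sample mean together with its standard error and an approximate 95% interval. The numbers in `sample` are made up for illustration, and the 1.96 multiplier assumes a normal approximation (with a sample this small, a t-quantile would usually be more appropriate).

```python
import math
import statistics

# Hypothetical sample of 10 measurements (invented for illustration).
sample = [52, 48, 55, 61, 47, 50, 58, 53, 49, 57]

mean_value = statistics.mean(sample)
# Sample standard deviation (divides by n - 1).
std_dev = statistics.stdev(sample)
# Standard error of the mean: how much the sample mean would vary
# from one sample to the next.
std_error = std_dev / math.sqrt(len(sample))

# Approximate 95% interval, assuming the normal approximation holds.
low = mean_value - 1.96 * std_error
high = mean_value + 1.96 * std_error

print(f"estimate: {mean_value:.1f}, approx. 95% interval: [{low:.1f}, {high:.1f}]")
```

The estimate is a single number, while the interval expresses how uncertain that number is.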
3
Q

spurious correlations

A

Occurs when two variables are statistically related but not directly causally related. The relationship falsely appears causal, usually because of an unseen third factor (a confounder).
E.g. ice cream consumption is correlated with the number of shark attacks, but eating ice cream does not cause shark attacks.
Third factor → being at the beach
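
The beach example can be sketched as a small simulation. Everything here is an assumption made for illustration: the variable names, the coefficients, and the noise levels are invented, and `shark_attack_risk` is a made-up score rather than real data. Both outcomes depend only on `beach_visitors`, yet they end up strongly correlated with each other; the partial correlation (one common way to control for a third variable) is close to zero.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is reproducible

# The confounder: how busy the beach is on each of 1000 simulated days.
beach_visitors = [rng.uniform(0, 100) for _ in range(1000)]
# Both outcomes depend on the confounder, not on each other.
ice_cream_sales = [2.0 * b + rng.gauss(0, 10) for b in beach_visitors]
shark_attack_risk = [0.05 * b + rng.gauss(0, 0.5) for b in beach_visitors]

r_ice_shark = pearson(ice_cream_sales, shark_attack_risk)
r_ice_beach = pearson(ice_cream_sales, beach_visitors)
r_shark_beach = pearson(shark_attack_risk, beach_visitors)

# Partial correlation of ice cream and shark risk, controlling for the beach.
partial = (r_ice_shark - r_ice_beach * r_shark_beach) / math.sqrt(
    (1 - r_ice_beach ** 2) * (1 - r_shark_beach ** 2)
)

print(f"raw correlation:     {r_ice_shark:.2f}")
print(f"partial correlation: {partial:.2f}")
```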

4
Q

causes of spurious correlations

A
  1. coincidence
  2. confounding variables
  3. small sample size
  4. overfitting
5
Q

overfitting

A

Occurs when a statistical model fits its training data too closely. When this happens, the model cannot perform accurately on unseen data, defeating its purpose.
When the model memorizes the noise in the training set, it becomes “overfitted” and is unable to generalize well to new data.
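
A minimal sketch of this effect, under stated assumptions: the true relationship is taken to be `y = 2x`, the training data carry a hand-picked noise pattern of ±0.3 (fixed rather than random, so the run is reproducible), and the "overfitted" model is a degree-9 polynomial that passes through all 10 training points. A plain straight-line fit serves as the simpler comparison model. None of these choices come from the lecture; they are only for illustration.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the polynomial that passes exactly through every (xs[i], ys[i])."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        basis = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                basis *= (x - xm) / (xj - xm)
        total += yj * basis
    return total

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

def mse(predictions, targets):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Training data: true signal 2x plus fixed noise of alternating sign.
x_train = [i / 9 for i in range(10)]
y_train = [2 * x + 0.3 * (-1) ** i for i, x in enumerate(x_train)]

# Unseen data: points halfway between the training points, true signal only.
x_test = [(a + b) / 2 for a, b in zip(x_train, x_train[1:])]
y_test = [2 * x for x in x_test]

slope, intercept = fit_line(x_train, y_train)

poly_train_mse = mse([lagrange_predict(x_train, y_train, x) for x in x_train], y_train)
poly_test_mse = mse([lagrange_predict(x_train, y_train, x) for x in x_test], y_test)
line_train_mse = mse([slope * x + intercept for x in x_train], y_train)
line_test_mse = mse([slope * x + intercept for x in x_test], y_test)

print(f"polynomial: train MSE {poly_train_mse:.4f}, test MSE {poly_test_mse:.4f}")
print(f"line:       train MSE {line_train_mse:.4f}, test MSE {line_test_mse:.4f}")
```

The polynomial reproduces the training data perfectly because it has memorized the noise, and it pays for that on the unseen points, where the simpler line does better.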

6
Q

variables

A

a characteristic of a concept that takes on different values from one case to another or, for a given case, from one time to another

7
Q

types of variables

A
  1. nominal
  2. ordinal
  3. quantitative
8
Q

nominal variables

A

Different categories with no natural ordering (one category is not “more” than another).
E.g. religion, continent, colour, political party.
“Blue is different from red” → meaningful
“Blue is more than red” → nonsensical

9
Q

ordinal variables

A

Different categories, with a meaningful ordering.
The distance between categories is not meaningful.

E.g. very dissatisfied < dissatisfied < neither dissatisfied nor satisfied < satisfied < very satisfied

10
Q

quantitative variables

A

Different values, with a meaningful ordering AND a meaningful distance between any two values.

e.g. Number of votes, Temperature (Degrees), GDP (€)
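
One way to keep the three variable types apart is to ask which summaries make sense for each. The sketch below uses small invented datasets (the party colours, survey answers, and vote counts are all hypothetical): nominal data support only counting, ordinal data also support ranking, and quantitative data also support differences and averages.

```python
from statistics import mean, mode

# Nominal: categories can only be compared for equality, so the most
# frequent category (the mode) is a sensible summary.
parties = ["green", "red", "blue", "red"]
most_common_party = mode(parties)

# Ordinal: categories have an order, so sorting and a median make sense,
# but the gaps between categories carry no meaning.
SATISFACTION_ORDER = [
    "very dissatisfied",
    "dissatisfied",
    "neither dissatisfied nor satisfied",
    "satisfied",
    "very satisfied",
]
rank = {label: position for position, label in enumerate(SATISFACTION_ORDER)}
answers = ["satisfied", "very dissatisfied", "satisfied", "dissatisfied", "very satisfied"]
sorted_answers = sorted(answers, key=rank.__getitem__)
median_answer = sorted_answers[len(sorted_answers) // 2]

# Quantitative: distances are meaningful, so differences and means make sense.
votes = [1200, 800, 1600]
average_votes = mean(votes)
vote_gap = votes[2] - votes[1]

print(most_common_party, "|", median_answer, "|", average_votes, "|", vote_gap)
```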
