Week two Flashcards

1
Q

What is kurtosis?

A

Kurtosis is a measure of how heavy the tails of a probability distribution are relative to its centre around the mean, usually judged against a normal distribution.

2
Q

What is leptokurtic?

A

Leptokurtic refers to a probability distribution that has more data centred around the mean and fatter tails than a normal distribution (positive excess kurtosis).

3
Q

What is platykurtic?

A

Platykurtic refers to a probability distribution that has thinner tails than a normal distribution (negative excess kurtosis).
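As a rough illustration of the three kurtosis cards, the sketch below computes excess kurtosis (the fourth standardised moment minus 3, so a normal distribution scores about 0) for three simulated samples. The function name `excess_kurtosis`, the seed, and the choice of a uniform and a Laplace sample are assumptions made for this example only; they are not from the course material.

```python
import random
from statistics import fmean


def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardised moment minus 3."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])  # variance
    m4 = fmean([(v - mean) ** 4 for v in values])  # fourth central moment
    return m4 / m2 ** 2 - 3.0


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000

# Simulated (synthetic) samples, for illustration only.
normal = [rng.gauss(0, 1) for _ in range(n)]
uniform = [rng.uniform(-1, 1) for _ in range(n)]  # thin tails: platykurtic
# The difference of two exponential draws follows a Laplace distribution,
# which has fat tails: leptokurtic.
laplace = [rng.expovariate(1) - rng.expovariate(1) for _ in range(n)]

for name, sample in [("normal", normal), ("uniform", uniform), ("laplace", laplace)]:
    print(f"{name:8s} excess kurtosis = {excess_kurtosis(sample):+.2f}")
```

A negative result points towards a platykurtic shape and a positive result towards a leptokurtic one; values near zero are consistent with normal-like tails.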

4
Q

What is a Q-Q plot?

A

A Q-Q plot is a quantile-quantile plot. It plots the quantiles of a sample against the quantiles of a theoretical distribution, most often the normal, so we can assess whether the sample is approximately normal. It is typically used when checking the assumptions of a statistical test, such as normality.

In regression, for example, one assumption is that the residuals are normally distributed. The ordered residuals from a sample are plotted against the corresponding quantiles of a theoretical normal distribution. If the points fall close to a straight line, the residuals are approximately normally distributed.
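A minimal sketch of the idea behind a Q-Q plot, using only the Python standard library: it pairs sorted residuals with standard-normal quantiles and summarises how straight the resulting line is with a correlation coefficient. The helper names `qq_points` and `pearson_r`, the plotting position `(i + 0.5) / n`, and the simulated residuals are all assumptions for this example; in practice the points would usually be drawn and inspected by eye.

```python
import random
from statistics import NormalDist, fmean


def qq_points(sample):
    """Pair each sorted observation with the matching standard-normal quantile."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()
    # (i + 0.5) / n is one common choice of plotting position (an assumption here).
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, ordered


def pearson_r(xs, ys):
    """Pearson correlation, used here as a summary of how straight the Q-Q line is."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
# Simulated (synthetic) residuals, for illustration only.
normal_resid = [rng.gauss(0, 2) for _ in range(500)]
skewed_resid = [rng.expovariate(1) for _ in range(500)]

r_normal = pearson_r(*qq_points(normal_resid))
r_skewed = pearson_r(*qq_points(skewed_resid))
print(f"normal residuals: r = {r_normal:.3f}")
print(f"skewed residuals: r = {r_skewed:.3f}")
```

The normal residuals should give a correlation very close to 1, while the skewed residuals bend away from the line and give a noticeably lower value.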

5
Q

If the central limit theorem (CLT) allows us to use statistics based on normality, why does it matter if our data are skewed, and why does kurtosis matter?

A
6
Q

What is homoscedasticity?

What is heteroscedasticity?

Why are these descriptors important?

A

Homoscedasticity refers to equal variation of one variable across all values of another variable.

e.g. if we had depression scores for people aged 10 to 50 and the distribution was homoscedastic, then the variation in depression scores would be the same at each age.

If the distribution was heteroscedastic, then the variation would not be the same. For example, we might see that depression scores for adolescents vary a lot, whereas depression scores for people over 40 cluster around a similar value, i.e. do not vary much.
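The depression-score example can be sketched with simulated data: one dataset where the spread is the same at every age, and one where the spread shrinks as age increases. Every number here (mean of 20, the standard deviations, the age bands) is invented for illustration, and `spread_by_group` is a hypothetical helper name.

```python
import random
from statistics import pstdev

rng = random.Random(2)  # fixed seed so the run is repeatable
ages = [10, 20, 30, 40, 50]

# Synthetic scores, for illustration only.
# Homoscedastic: the same spread (sd = 5) at every age.
homo = {age: [rng.gauss(20, 5) for _ in range(2000)] for age in ages}
# Heteroscedastic: the spread shrinks with age (sd = 60 / age).
hetero = {age: [rng.gauss(20, 60 / age) for _ in range(2000)] for age in ages}


def spread_by_group(groups):
    """Standard deviation of the scores within each age group."""
    return {group: pstdev(values) for group, values in groups.items()}


for label, data in [("homoscedastic", homo), ("heteroscedastic", hetero)]:
    spreads = spread_by_group(data)
    ratio = max(spreads.values()) / min(spreads.values())
    print(label, {age: round(sd, 2) for age, sd in spreads.items()},
          f"largest/smallest sd = {ratio:.2f}")
```

A largest-to-smallest ratio near 1 is what homoscedasticity looks like in this summary; a ratio well above 1 signals heteroscedasticity.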

7
Q

What are some reasons for heteroscedasticity? (look at slides)

A

One of the variables may not be normally distributed.

The relationship between the variables is heteroscedastic in nature.

There is an error in measurement.

8
Q

Why do we transform our data? When do we need to consider doing this?

A

Many parametric statistical tests assume that the variables (or the model residuals) are normally distributed. If they are not, we can transform the data so that its distribution is closer to normal. This allows us to use parametric tests without violating their assumption of normality.
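As one hedged example of what a transformation does, the sketch below applies a natural-log transform to simulated right-skewed data and compares the skewness before and after. The lognormal sample is chosen precisely because its logarithm is normal, so it is a best case; real data will rarely transform this cleanly. The function name `skewness` and all parameters are assumptions for this example.

```python
import math
import random
from statistics import fmean


def skewness(values):
    """Population skewness: third standardised moment (0 for symmetric data)."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])  # variance
    m3 = fmean([(v - mean) ** 3 for v in values])  # third central moment
    return m3 / m2 ** 1.5


rng = random.Random(3)  # fixed seed so the run is repeatable
# Synthetic right-skewed data, for illustration only (all values are positive).
raw = [rng.lognormvariate(0, 0.8) for _ in range(5000)]
logged = [math.log(v) for v in raw]  # log transform requires values > 0

print(f"skewness before transform: {skewness(raw):.2f}")
print(f"skewness after log transform: {skewness(logged):.2f}")
```

The raw data should show clear positive skew, and the transformed data a skewness close to zero. Note that any test run afterwards describes the transformed scale, not the original units.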

9
Q

When would we use a log transformation?

When would we use a square root transformation?

A
10
Q

What are the steps that we go through when transforming data? (look at slides)

A
11
Q

What are the limitations of transforming data?

A
12
Q

What is the relationship between skewness, kurtosis, and homoscedasticity?

A