Topic 1 - Populations, Samples and Normal Distributions Flashcards

1
Q

What are descriptive statistics?

A

Brief descriptive coefficients that summarise a given data set, which may cover an entire population or a sample of it.

2
Q

What is the 5 number summary?

A

Minimum and maximum values (together giving the range).
Median (2nd quartile, middle value).
Lower and upper quartiles (1st and 3rd quartiles, a quarter of the way in from each end of the ordered dataset).
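As a rough illustration, the five values can be computed with Python's standard `statistics` module. The data below are hypothetical, and the "inclusive" quartile method is only one of several conventions, so other tools may give slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    # method="inclusive" is an assumed convention; others exist.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical sample
summary = five_number_summary(data)
print(summary)
```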

3
Q

What is the interquartile range?

A

The 3rd quartile minus the 1st quartile (Q3 - Q1).

4
Q

What would the IQR be of a dataset with 1st quartile 105.6 and 3rd quartile 111.1?

A

IQR = 3rd - 1st.
= 111.1 - 105.6 = 5.5
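The same subtraction as a minimal Python sketch. The variable names are illustrative, and the result is rounded because floating-point subtraction can leave tiny trailing errors.

```python
q1 = 105.6  # 1st quartile from the card
q3 = 111.1  # 3rd quartile from the card

# IQR is the distance between the quartiles.
iqr = q3 - q1
print(round(iqr, 1))
```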

5
Q

How would you display the 5 number summary?

A

Box and whisker plot.

6
Q

Describe a box plot.

A

Top of box = 3rd quartile
Line across box = 2nd quartile (median).
Bottom of box = 1st quartile.
So 50% of the data lie within the box.
Whiskers extend towards the min and max, but only as far as a given multiple of the box height (the IQR), so that any outliers beyond them can be seen as separate points.
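A minimal sketch of the whisker rule, assuming the common convention of 1.5 times the box height (the card does not state which multiple is used). The function names and the sample values are hypothetical; the quartiles are borrowed from the IQR card.

```python
def whisker_limits(q1, q3, k=1.5):
    """Return (lower, upper) limits; k=1.5 is an assumed, commonly used multiple."""
    iqr = q3 - q1  # box height
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the values lying beyond the whisker limits."""
    lower, upper = whisker_limits(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


lower, upper = whisker_limits(105.6, 111.1)
sample = [96.0, 104.2, 107.9, 110.3, 121.5]  # hypothetical observations
outliers = find_outliers(sample, 105.6, 111.1)
print(outliers)
```

Plotting libraries typically draw each whisker to the most extreme observation that still lies inside these limits, and mark the remaining points individually.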

7
Q

What are outliers?

A

Exceptionally small or large values.

8
Q

What graph is a good way to display all data in a continuous sample?

A

Histograms

9
Q

What happens to the shape of distribution as a sample size increases?

A

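It becomes more and more regular, approaching the shape of the underlying population distribution (a smooth normal curve when the population is normal).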

10
Q

What is inferential statistics?

A

The practice of using sampled data to draw conclusions or make predictions about the population the sample was drawn from.

11
Q

How is a population often defined in inferential stats?

A

In terms of unknown parameters, e.g. µ and σ.

12
Q

What are the associated parameters of normal distribution?

A

µ - measure of location (mean)
σ - measure of spread (SD)

13
Q

Why is the mean not a good representative of skewed data?

A

Because the mean is unduly influenced by a few extreme values in a sample, so it doesn’t represent the whole sample well.
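A small illustration with made-up numbers: one very large value pulls the mean well above most of the observations, while the median stays near the bulk of the data.

```python
import statistics

# Hypothetical right-skewed sample: five similar values and one very large one.
values = [20, 22, 25, 27, 30, 250]

mean_value = statistics.mean(values)      # dragged upwards by 250
median_value = statistics.median(values)  # middle of the ordered values

print(mean_value, median_value)
```

Here five of the six observations are below the mean, which is why the median is usually the more representative summary for skewed data.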

14
Q

What is an interpretable feature of the normal distribution curve?

A

Area under the curve.

15
Q

If we call P the area under the curve up to any value of X, what is the meaning of P?

A

The probability that a randomly chosen member of the population has a value < X (cumulative probability).

16
Q

What is the probability of observations larger than X?

A

1 - P
Because the curve is symmetric about the mean, the proportion lying above a value a given distance above the mean equals the proportion lying below the value the same distance below the mean.
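The cumulative probability P, its complement 1 - P and the symmetry of the curve can be checked with `statistics.NormalDist` from the Python standard library. The population values (µ = 100, σ = 15) and the cut-off X = 115 are assumptions chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical population: mean 100, SD 15.
population = NormalDist(mu=100, sigma=15)

x = 115
p_below = population.cdf(x)  # P: area under the curve up to X
p_above = 1 - p_below        # probability of an observation larger than X

# Symmetry: the point the same distance below the mean (here 85)
# has the same area below it as X has above it.
mirror = 2 * population.mean - x
p_below_mirror = population.cdf(mirror)

print(round(p_below, 4), round(p_above, 4), round(p_below_mirror, 4))
```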

17
Q

What does sample mean (m) estimate?

A

Population mean (µ)

18
Q

What does sample SD (s) estimate?

A

Population SD (σ)

19
Q

What are mean and SD useful for?

A

Summarising normally distributed data only.

20
Q

When would you use the median or IQR?

A

When the data are not normal, or when you are unsure whether they are normal.

21
Q

What do the Greek letters µ and σ refer to?

A

The population mean µ
The population SD σ

22
Q

What do the Roman letters m and s refer to?

A

The sample mean m
The sample SD s

23
Q

What can be estimated by the sample mean and SD?

A

Population mean and SD
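A hedged simulation sketch of this idea: draw a sample from a normal population whose µ and σ are known (hypothetical values chosen here), then compare the sample mean m and sample SD s with them. The seed and sample size are arbitrary.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population parameters (normally unknown in practice).
mu, sigma = 50.0, 10.0

# Draw a random sample of 1000 observations from the population.
sample = [rng.gauss(mu, sigma) for _ in range(1000)]

m = statistics.mean(sample)   # sample mean, estimates mu
s = statistics.stdev(sample)  # sample SD (n - 1 denominator), estimates sigma

print(round(m, 2), round(s, 2))
```

The estimates will not equal µ and σ exactly, but they tend to get closer as the sample size grows.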
