Distributions, Sample & Populations Flashcards

1
Q

What is a continuous variable?

A

A variable that can take any value within a range, including fractional values

e.g. temperature could be 4C, 10.34C or -0.0000513C

2
Q

What is a discrete variable?

A

A numeric variable that can only take distinct, countable values

e.g. number of cars shown

3
Q

What is a histogram?

A
  • A plot that visualises the distribution of a dataset by showing how many values fall into each range
4
Q

What do histograms for continuous data look like?

A

The x-axis is split into “bins”
Each bin covers a set range of values, and the height of its bar shows how many data points fall in that range

  • Increasing the number of bins in the histogram gives more resolution
  • Fewer bins are less noisy but tend to be less informative
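
The effect of the bin count can be seen by counting the same data into different numbers of bins. This is a minimal sketch using only plain Python; the function name `histogram_counts` and the temperature values are made up for illustration, and it assumes equal-width bins and data that are not all identical.

```python
def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins  # assumes hi > lo
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Made-up temperature readings, for illustration only.
temps = [4.0, 10.34, 7.2, 8.1, 9.9, 5.5, 6.3, 7.7, 8.8, 7.0]

few_bins = histogram_counts(temps, 2)   # coarse view of the data
many_bins = histogram_counts(temps, 5)  # finer view of the same data
print(few_bins, many_bins)
```

With two bins the data only splits into "lower half" and "upper half"; with five bins the central cluster becomes visible, at the cost of each bin holding fewer points.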
5
Q

What are the 4 distribution shape metrics?

A
  • Mean
  • Variance
  • Skewness
  • Kurtosis
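
As a rough illustration, the four metrics (mean, variance, skewness and kurtosis) can be computed from the moments of the data. This sketch assumes the population form of each moment (dividing by n) and reports excess kurtosis (kurtosis minus 3, so a normal distribution scores 0); other conventions exist, and the function name `shape_metrics` is hypothetical.

```python
def shape_metrics(values):
    """Return (mean, variance, skewness, excess kurtosis) of the values."""
    n = len(values)
    mean = sum(values) / n
    # Central moments, population form (dividing by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    variance = m2
    skewness = m3 / m2 ** 1.5          # 0 for a symmetric distribution
    excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution
    return mean, variance, skewness, excess_kurtosis


symmetric = shape_metrics([1, 2, 3, 4, 5])
right_tailed = shape_metrics([1, 1, 1, 2, 10])
print(symmetric)
print(right_tailed)
```

The symmetric data has zero skewness, while the data with one large value has positive skewness (a long tail on the right).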
6
Q

Describe the mean in terms of distribution shape

A

Changing the mean shifts the centre of mass along the axis, but the shape stays the same

7
Q

Describe variance in terms of distribution shape

A

Variance stretches or compresses the distribution around its mean (higher variance = wider spread)

8
Q

Describe skewness in terms of distribution shape

A

Skewness describes asymmetry: negative skewness gives a long tail on the left, positive skewness gives a long tail on the right

9
Q

Describe kurtosis in terms of distribution shape

A

Affects the peak and the tails (high = sharp peak with heavier tails)

10
Q

What do you use to test for normal distributions?

A

Shapiro-Wilk W (the test statistic)
Shapiro-Wilk p (the p-value)

11
Q

How does Shapiro-Wilk W test for normal distributions?

A
  • It tests the null hypothesis that our data is normally distributed

If the test is non-significant = no evidence against normality, so we treat the data as normal

If the test is significant = the distribution is significantly different from a normal distribution.
W ranges from 0 to 1, and values closer to 1 indicate more normal data

12
Q

What does the Shapiro-Wilk p-value tell you?

A

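The probability of getting a W statistic at least this extreme if the data really were normally distributed. A small p-value (e.g. below 0.05) indicates a significant difference from normality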
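
The decision rule for reading a Shapiro-Wilk result can be sketched as a small helper. This only interprets a W statistic and p-value that have already been computed elsewhere (for example by a statistics package); the function name `interpret_shapiro` and the default threshold of 0.05 are assumptions for illustration, not part of the test itself.

```python
def interpret_shapiro(w, p, alpha=0.05):
    """Turn a Shapiro-Wilk W statistic and p-value into a plain verdict."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if p < alpha:
        # Significant result: reject the null hypothesis of normality.
        return f"W={w:.3f}, p={p:.3f}: significantly different from normal"
    # Non-significant result: no evidence against normality.
    return f"W={w:.3f}, p={p:.3f}: no significant departure from normality"


# Hypothetical results, for illustration only.
print(interpret_shapiro(0.98, 0.40))
print(interpret_shapiro(0.85, 0.01))
```

A non-significant result does not prove the data are normal; it only means the test found no evidence against normality.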

13
Q

How does sampling work?

A
  • We can’t test everyone, so we can only take a sample from our population
  • The issue is that our populations aren’t homogeneous. There will be lots of additional variability that we can’t control
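
A small simulation can show why this matters: different samples from the same population give different sample means. This is a sketch under made-up assumptions (a simulated population of 10,000 values with mean 50 and standard deviation 10, and samples of 30), using only Python's standard library.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 values with mean 50 and SD 10.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw several independent samples of 30 and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 30)) for _ in range(5)
]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {m:.2f}")
```

Each sample mean lands near the population mean but not exactly on it, which is the variability that the standard error of the mean tries to quantify.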
14
Q

How is the standard error of a mean used to make inferences about a sample of data?

A

It indicates how well we believe our sample mean approximates the population mean (smaller standard error = more precise estimate)

15
Q

What is the calculation for the standard error of a mean?

A

Standard error of the mean = (sample standard deviation) divided by (the square root of the total number of data points)

Sample mean = (sum of all individual data points) divided by (total number of data points)

Sample standard deviation = the square root of [(the sum of the squared differences between the sample mean and each individual data point) divided by (total number of data points minus 1)]
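
The calculation can be sketched in a few lines of Python. This assumes the usual definition of the standard error of the mean (sample standard deviation divided by the square root of n) and the n - 1 form of the sample standard deviation, which is what `statistics.stdev` computes; the function name `standard_error` and the data are made up for illustration.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    n = len(values)
    sd = statistics.stdev(values)  # sample SD, divides by n - 1
    return sd / math.sqrt(n)


# Made-up data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"mean = {statistics.mean(data)}")
print(f"standard error = {standard_error(data):.4f}")
```

Because n appears under a square root in the denominator, quadrupling the sample size only halves the standard error.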
