research skills 7 data distribution Flashcards

1
Q

what is a normal distribution?

A

a distribution in which data cluster symmetrically around the mean, giving a bell-shaped curve

2
Q

what are the characteristics of a normal distribution?

A
  • x axis - continuous scale of measurements
  • peak of curve = mean
  • y axis - probability density
2
Q

what is probability density ?

A
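  • the total area under the curve represents the whole population (a probability of 1)
  • describes the relationship between observations and their probability: the area under the curve between two values is the probability of an observation falling in that range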
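As a rough illustration of "area under the curve = probability", the sketch below uses Python's standard-library `statistics.NormalDist`. The choice of a standard normal (mean 0, SD 1) is an assumption made only for the example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1 (an illustrative choice).
dist = NormalDist(mu=0, sigma=1)

# Height of the curve (probability density) at the mean, i.e. the peak.
peak_density = dist.pdf(0)

# Probability = area under the curve between two x values.
within_one_sd = dist.cdf(1) - dist.cdf(-1)   # about 68% of the population
within_two_sd = dist.cdf(2) - dist.cdf(-2)   # about 95% of the population

print(f"density at mean: {peak_density:.3f}")
print(f"within 1 SD: {within_one_sd:.1%}, within 2 SD: {within_two_sd:.1%}")
```

Note that the density itself is not a probability; only an area under the curve is.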
3
Q

how is the Z score calculated ?

A

Z = (X - mean) / SD

X = value on x axis

use the z score to look up the corresponding percentage in a standard normal (z) table
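A minimal sketch of the calculation, with the table lookup replaced by `statistics.NormalDist().cdf` from the Python standard library. The exam-score numbers (mean 70, SD 10, X = 85) are made up for illustration.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical example: scores with mean 70 and SD 10, observed value 85.
z = z_score(85, mean=70, sd=10)

# Equivalent of the z-table lookup: percentage of the distribution below x.
percent_below = NormalDist().cdf(z) * 100

print(f"z = {z}, about {percent_below:.1f}% of values fall below X")
```

The brackets matter: subtract the mean from X first, then divide by the SD.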

4
Q

what is a population and a sample ?

A

A population is the entire collection of individuals or data points of interest; a sample is a smaller group drawn from that population.

5
Q

what are the types of sampling ?

A

Random – every individual is chosen entirely by chance
Systematic – individuals selected at regular intervals (e.g. every 10th)
Stratified – population divided into subgroups that share a characteristic first, then sampled from each subgroup
Clustered – population divided into clusters, and whole clusters are chosen as the sample
Convenience – whoever is easiest to reach, e.g. the first volunteers through the door
Quota – a set number is needed from each subgroup
Purposive – relies on the judgment of the collector
Snowball – one chosen person recruits the next person
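Three of these methods (random, systematic, stratified) can be sketched with the standard-library `random` module. The population of 100 people split into groups "A" and "B" is hypothetical, and the proportional allocation used for the stratified sample is one common choice, not the only one.

```python
import random

random.seed(0)  # fixed seed so the example is repeatable

# Hypothetical population: 100 people, 60 in group A and 40 in group B.
population = [{"id": i, "group": "A" if i < 60 else "B"} for i in range(100)]
sample_size = 10

# Random: every individual has the same chance of selection.
random_sample = random.sample(population, sample_size)

# Systematic: pick a random start, then take every k-th individual.
k = len(population) // sample_size
start = random.randrange(k)
systematic_sample = population[start::k]

# Stratified: split into subgroups first, then sample randomly within each,
# here in proportion to the subgroup's share of the population.
strata = {}
for person in population:
    strata.setdefault(person["group"], []).append(person)

stratified_sample = []
for members in strata.values():
    n_from_group = round(sample_size * len(members) / len(population))
    stratified_sample.extend(random.sample(members, n_from_group))
```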

6
Q

what is a sampling distribution?

A

the distribution formed by the means of many different samples drawn from the same population

7
Q

what is the central limit theorem ?

A

The sampling distribution of the mean is approximately a normal distribution if the sample size is large enough, even when the population itself is not normally distributed.
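The theorem can be checked informally with a small simulation. The sketch below assumes a deliberately skewed population (exponential, mean 1), a sample size of 40 and 2000 repeated samples; all three are arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the example is repeatable

def draw_sample(n):
    """Draw n values from a skewed (exponential) population with mean 1."""
    return [random.expovariate(1.0) for _ in range(n)]

sample_size = 40
sample_means = [mean(draw_sample(sample_size)) for _ in range(2000)]

# The sample means cluster around the population mean (1.0) ...
mean_of_means = mean(sample_means)
# ... with a spread close to population SD / sqrt(n) = 1 / sqrt(40).
spread_of_means = stdev(sample_means)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {spread_of_means:.3f}")
```

Plotting a histogram of `sample_means` should show a roughly bell-shaped curve even though the individual values are heavily skewed.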

8
Q

what is a confidence interval ?

A

a range of values, calculated from sample data, that is likely to contain the true population parameter (like the mean) with a certain level of confidence, typically 95%

Sample size = smaller samples are less accurate and less representative, so the interval is wider; larger samples give more accuracy and a narrower interval. A large sample is conventionally taken to be 30 or more.

Variation = if the variation in the real population is high then the variation in the sample will also be high, which widens the interval

9
Q

how do you calculate a confidence interval?

A

You need:
the number of observations (n)
the mean (X)
and the standard deviation (s)

Decide what confidence level we want: 95% or 99% are common choices. Then find the “Z” value for that confidence level (1.96 for 95%, 2.576 for 99%)

Plug all the numbers into the equation: X ± Z × s / √n
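A minimal sketch of these steps, assuming the usual z-based formula mean ± Z × s / √n. The Z value is obtained from `statistics.NormalDist().inv_cdf` rather than a table, and the sample figures (n = 40, mean 175, s = 20) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(n, mean, sd, confidence=0.95):
    """z-based confidence interval for the mean: mean +/- Z * sd / sqrt(n)."""
    # Two-sided interval: leave (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin

# Hypothetical sample: 40 observations, mean 175, standard deviation 20.
low, high = confidence_interval(n=40, mean=175.0, sd=20.0)
low99, high99 = confidence_interval(n=40, mean=175.0, sd=20.0, confidence=0.99)

print(f"95% CI: {low:.1f} to {high:.1f}")
print(f"99% CI: {low99:.1f} to {high99:.1f}")
```

Asking for more confidence (99% instead of 95%) gives a wider interval for the same data.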

10
Q

when do you use the t score?

A

use instead of the z value when calculating a confidence interval
More suitable for sample sizes of under 30.
need the degrees of freedom (n - 1) to look up the t value for the chosen confidence level
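A small sketch of the same interval using a t value. The lookup dictionary `T_95` is a hypothetical helper holding a few two-sided 95% critical values copied from a standard t table, and the sample (n = 10, mean 50, s = 5) is made up.

```python
from math import sqrt

# Two-sided 95% critical t values from a standard t table,
# keyed by degrees of freedom (only a few rows, for illustration).
T_95 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}

def t_confidence_interval(n, mean, sd):
    """95% confidence interval for the mean using t instead of z."""
    df = n - 1                 # degrees of freedom
    t = T_95[df]               # table lookup; raises KeyError if df is not listed
    margin = t * sd / sqrt(n)
    return mean - margin, mean + margin

# Hypothetical small sample: 10 observations, mean 50, standard deviation 5.
low_t, high_t = t_confidence_interval(n=10, mean=50.0, sd=5.0)
margin_t = high_t - 50.0
margin_z = 1.96 * 5.0 / sqrt(10)   # what the z value would have given

print(f"t-based 95% CI: {low_t:.2f} to {high_t:.2f}")
```

For small samples the t value is larger than 1.96, so the interval is wider than the z-based one; as the degrees of freedom grow, t approaches z.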
