Year 1 Stats Flashcards

1
Q

Discrete vs continuous data

A

Discrete data takes only specific, countable values, e.g. shoe size (binomial distribution)

Continuous data can take any value in a range and is measured, e.g. shoe length (normal distribution / histograms)

2
Q

Target population

A

All the members of the population that would ideally take part in your study

3
Q

Sample

A

A subset of a target population

4
Q

Sampling frame

A

A list or database of the target population

5
Q

Census

A

Measures or observes every member of the population

6
Q

Advantages and Disadvantages of census

A
  Advantage:
  1. Completely accurate - collects data from everyone
  Disadvantages:
  1. Expensive/time consuming
  2. Cannot be used when testing destroys the item
  3. Hard to process large quantities of data
7
Q

Steps to simple random

A
  1. Have a sampling frame and assign a number to every member of it
  2. Use a random number generator to pick the members of the sample
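
The two steps above can be sketched in Python. This is only an illustration: the function name, the `seed` parameter and the `student_…` labels are assumptions, not part of the original card.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size distinct members; every possible sample is equally likely."""
    rng = random.Random(seed)
    # Step 1: give every member of the sampling frame a number (1, 2, 3, ...).
    numbered = dict(enumerate(sampling_frame, start=1))
    # Step 2: use a random number generator to pick distinct numbers.
    chosen_numbers = rng.sample(sorted(numbered), sample_size)
    return [numbered[i] for i in chosen_numbers]

# Hypothetical sampling frame of 50 students.
frame = [f"student_{i}" for i in range(1, 51)]
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```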
8
Q

Advantages of using a sample (2)

A
  1. Less time consuming/cheaper than census
  2. Less data to process

9
Q

Disadvantages of using a sample (2)

A
  1. May be less accurate than a census
  2. May not give any information about small subgroups of the population

10
Q

Advantages of simple random sampling (2)

A
  1. Minimises bias
  2. Representative of whole population

11
Q

Disadvantages of simple random sampling (2)

A
  1. Need sampling frame
  2. Time consuming/expensive

12
Q

What is simple random sampling

A

When every possible sample of a given size has the same probability of being selected.

13
Q

What is stratified sampling

A

When the population is divided into mutually exclusive strata and a simple random sample is taken from each stratum, with the size of each sample proportional to the size of its stratum in the population

14
Q

Advantages of systematic (2)

A
  1. Quick and easy to use
  2. Assures that the population will be evenly sampled

15
Q

Disadvantages of systematic (2)

A

Need sampling frame

There may be missing values in the population

16
Q

What is systematic sampling

A

When you choose a starting point at random, then systematically select members at regular intervals (a fixed number apart) from the list
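
A minimal sketch of systematic sampling, assuming the interval is taken as population size divided by sample size (rounded down) and the random start falls within the first interval. The function name and the example frame are hypothetical.

```python
import random

def systematic_sample(sampling_frame, sample_size, seed=None):
    """Random start, then every k-th member of the sampling frame."""
    rng = random.Random(seed)
    k = len(sampling_frame) // sample_size   # assumed interval between picks
    start = rng.randrange(k)                 # random start within the first k members
    return [sampling_frame[start + i * k] for i in range(sample_size)]

# Hypothetical frame: members numbered 1 to 100, sample of 10 (so k = 10).
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=0)
print(sample)
```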

17
Q

Advantages of stratified (2)

A
  1. Minimises selection bias by making sure no strata are over/under represented
  2. Frequencies for each group in the sample proportional to each group in the population
18
Q

Disadvantages of stratified (2)

A

Need sampling frame

Strata must be clearly defined

19
Q

What is quota sampling

A

When the population is split into groups or strata, then the researcher selects members from each group until a quota is filled. It is non-random and can be biased

20
Q

Advantages of quota (2)

A
  1. Don’t need sampling frame
  2. Frequencies for each group in the sample can be proportional to each group in the population

21
Q

What is opportunity sampling

A

Taking a sample from the members of the population who are available at the time the study is carried out. It is non-random and can be biased

22
Q

Advantage of opportunity sampling

A

Easy to select sample

23
Q

Formula for stratified

A

(Number in stratum / number in whole population) * sample size
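
A small worked sketch, reading the formula as (number in the stratum / number in the whole population) * sample size. The year-group names and sizes are made-up illustration data, and rounding to a whole number of people is an assumption.

```python
def stratum_sample_size(stratum_size, population_size, sample_size):
    """Number to sample from one stratum, before rounding."""
    return stratum_size / population_size * sample_size

# Hypothetical strata: 120 students in Year 12, 80 in Year 13, overall sample of 40.
strata = {"Year 12": 120, "Year 13": 80}
population = sum(strata.values())
allocation = {
    name: round(stratum_sample_size(size, population, 40))
    for name, size in strata.items()
}
print(allocation)
```

Each stratum would then be sampled with a simple random sample of the allocated size.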

24
Q

Measuring outliers

A

Any value less than LQ - 1.5(IQR)

Any value greater than UQ + 1.5(IQR)
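
A sketch of the two boundaries applied to a small data set. The data and the function names are hypothetical; the quartiles are supplied by hand rather than computed.

```python
def outlier_fences(lq, uq):
    """Return (lower boundary, upper boundary) using 1.5 * IQR."""
    iqr = uq - lq
    return lq - 1.5 * iqr, uq + 1.5 * iqr

def find_outliers(data, lq, uq):
    low, high = outlier_fences(lq, uq)
    return [x for x in data if x < low or x > high]

# Hypothetical data with LQ = 10.5 and UQ = 14.5, so IQR = 4.
data = [2, 10, 11, 12, 13, 14, 15, 30]
print(outlier_fences(10.5, 14.5))
print(find_outliers(data, 10.5, 14.5))
```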

25
Q

In a box plot diagram what does it mean if group A median is larger than group B median

A

On average group A gets higher results

26
Q

In a box plot diagram what does it mean if group A IQR is larger than group B IQR

A

Group A is less consistent in the results as data more spread out

27
Q

Frequency density

A

Frequency / Class width

The area of each bar is proportional to the frequency: F = kA
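
A short sketch of the calculation for a histogram, assuming k = 1 so that the area of each bar equals its frequency. The class boundaries and frequencies are made-up.

```python
def frequency_density(frequency, lower, upper):
    """Frequency divided by class width."""
    return frequency / (upper - lower)

# Hypothetical classes as (lower boundary, upper boundary, frequency).
classes = [(0, 10, 5), (10, 30, 20), (30, 40, 15)]
densities = [frequency_density(f, lo, hi) for lo, hi, f in classes]

# With k = 1, area = frequency density * class width gives back the frequency.
areas = [d * (hi - lo) for d, (lo, hi, _) in zip(densities, classes)]
print(densities, areas)
```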

28
Q

Independent vs dependent

A

The independent variable does not rely on the other variable, whilst the dependent variable does. The independent variable goes on the x axis

29
Q

What an upwards very straight line says about correlation

A

It’s a strong positive correlation. When one variable increases so does the other.

30
Q

What is correlation

A

Describes a linear relationship between two variables

31
Q

What is bivariate data

A

Data which has pairs of values for two variables

32
Q

PMCC

A

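Product moment correlation coefficient. Measures the strength and direction of the linear correlation between two variables, on a scale from -1 to 1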
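
For reference, a sketch of how the PMCC can be computed by hand using the standard formula r = Sxy / sqrt(Sxx * Syy). This formula is not stated on the card, and the function name and data are illustrative.

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical bivariate data lying exactly on straight lines.
r_up = pmcc([1, 2, 3, 4], [2, 4, 6, 8])     # perfectly straight, upwards
r_down = pmcc([1, 2, 3, 4], [8, 6, 4, 2])   # perfectly straight, downwards
print(r_up, r_down)
```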

33
Q

What does ‘b’ tell you in the formula

y = a + bx

A

The change in y for each unit change in x
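
A tiny illustration of this interpretation; the values of a and b are hypothetical.

```python
def predict(a, b, x):
    """Value of y on the regression line y = a + bx."""
    return a + b * x

a, b = 2.0, 0.5  # hypothetical intercept and gradient
# Increasing x by one unit changes the predicted y by exactly b.
change = predict(a, b, 11) - predict(a, b, 10)
print(change)
```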

34
Q

Why extrapolation unreliable

A

It assumes the trend continues outside the range of the data, which may not be true

35
Q

mean calculation

A

Sum x / n or, for a frequency table, Sum fx / Sum f

36
Q

How to find the median point and quartiles from 8 values of discrete data

A

8+1=9
9/2 = 4.5 so the median is halfway between the 4th and 5th values

To find the lower quartile, find the median of the lowest half (4 values) of the data. The upper quartile is the median of the highest half
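
The same method as a sketch in Python. The data values are made-up, and taking the upper quartile as the median of the highest half is an assumption mirroring the lower-quartile rule.

```python
import math

def median(sorted_values):
    """Median via the (n + 1) / 2 position rule (positions count from 1)."""
    n = len(sorted_values)
    position = (n + 1) / 2
    lower = sorted_values[int(position) - 1]
    upper = sorted_values[math.ceil(position) - 1]
    return (lower + upper) / 2

def quartiles(sorted_values):
    """Lower and upper quartiles as medians of the lowest and highest halves."""
    n = len(sorted_values)
    return median(sorted_values[: n // 2]), median(sorted_values[(n + 1) // 2 :])

# Hypothetical set of 8 discrete values, already in order.
data = [3, 5, 7, 8, 10, 12, 15, 20]
print(median(data), quartiles(data))
```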

37
Q

If grouped continuous data, how would you find mean

A

Find the midpoint of each class, enter the midpoints and frequencies into the calculator, then press 1-Var. The result is an estimate of the mean
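
What the calculator does can be sketched as follows: an estimate of the mean as Sum fx / Sum f using class midpoints. The classes are made-up illustration data.

```python
def estimated_mean(classes):
    """Estimate the mean of grouped data from (lower, upper, frequency) classes."""
    total_fx = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    total_f = sum(f for _, _, f in classes)
    return total_fx / total_f

# Hypothetical grouped data: midpoints 5, 15, 25 with frequencies 4, 10, 6.
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
mean = estimated_mean(classes)
print(mean)
```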

38
Q

If grouped continuous data, how would you find median

A

Total frequency / 2 gives the position of the median, then use linear interpolation within the class that contains it
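
A sketch of linear interpolation for the median, assuming values are spread evenly within each class. The function name and classes are hypothetical.

```python
def interpolated_median(classes):
    """Estimate the median of grouped data given (lower, upper, frequency) classes."""
    total = sum(f for _, _, f in classes)
    target = total / 2            # position of the median
    cumulative = 0
    for lower, upper, f in classes:
        if cumulative + f >= target:
            # Fraction of the way through this class, scaled by the class width.
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f

# Hypothetical grouped data with total frequency 20, so the target position is 10.
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
estimate = interpolated_median(classes)
print(estimate)
```

Here the median is the 10th value, which is 6 values into the 10 to 20 class of 10 values, so the estimate is 10 + (6/10) * 10.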

39
Q

Rule of thumb for choosing which set to use for linear interpolation

A

Always go to the set above, unless the value is right on the boundary; then use the set below

40
Q

What is standard deviation

A

A measure of how spread out the data is about the mean

41
Q

What does it mean if group A's standard deviation is higher than group B's

A

Group A's data points are on average further from the mean, so its results are less consistent

42
Q

Meaning of Sxx and Sx

A

Sxx is the sum of the squared deviations from the mean:
Sxx = sum of (x - x(bar))^2
Sx is the standard deviation of the sample:
Sx = square root of (Sxx / (n - 1))

43
Q

Standard deviation from summary statistics

A

Square root of ((Sum of x^2) / n - x(bar)^2)
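
A sketch of the summary-statistics formula, sqrt(Sum of x^2 / n - x(bar)^2), checked on a small made-up data set. The function name is hypothetical.

```python
import math

def sd_from_summary(sum_x, sum_x2, n):
    """Standard deviation from Sum x, Sum x^2 and n."""
    mean = sum_x / n
    return math.sqrt(sum_x2 / n - mean ** 2)

# Hypothetical data: Sum x = 40, Sum x^2 = 232, n = 8, so the mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sd = sd_from_summary(sum(data), sum(x ** 2 for x in data), len(data))
print(sd)
```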

44
Q

Boundaries for outliers using standard deviation

A

X(bar) - 2sd

X(bar) + 2sd

45
Q

What to do if you have a constant k in a discrete random variable distribution
P(X=x) = 3k(4-x)(x^2+1).
x = 0,1,2

A

Substitute x = 0, 1, 2 into the function
The resulting probabilities must add up to 1
Solve that equation for k
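
Worked through for the distribution in the question, as a sketch using exact fractions. The helper name `weight` is just for illustration.

```python
from fractions import Fraction

def weight(x):
    """P(X = x) with k divided out: 3(4 - x)(x^2 + 1)."""
    return 3 * (4 - x) * (x ** 2 + 1)

values = [0, 1, 2]
# P(X=0) = 12k, P(X=1) = 18k, P(X=2) = 30k, so 60k = 1.
total = sum(weight(x) for x in values)
k = Fraction(1, total)
probabilities = {x: k * weight(x) for x in values}
print(k, probabilities)
```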

46
Q

What does a uniform distribution mean

A

All possible values of the random variable have the same probability

47
Q

What is a probability distribution

A

Describes the probability of any outcome in a sample space

48
Q

Random variable

A

A variable whose value is determined by a random experiment

49
Q

How to do binomial distribution on calculator for multiple values

A

Go to Bpd and enter the Numtrial and probability (p) values. Then select List (List 1) rather than Variable to find the probabilities of several individual values.
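
To check calculator output, the same individual probabilities can be sketched in Python using the standard binomial formula P(X = x) = nCx * p^x * (1 - p)^(n - x). The formula is not stated on the card, and the values of n, p and the list of x values are made-up.

```python
from math import comb

def binomial_pd(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Hypothetical example: 10 trials, p = 0.5, probabilities for x = 0, 1, 2.
n, p = 10, 0.5
xs = [0, 1, 2]
probs = [binomial_pd(x, n, p) for x in xs]
print(probs)
```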

50
Q

What is hypothesis testing

A

Building evidence for a case against the null hypothesis

51
Q

What does reducing the significance level on a hypothesis test mean

A

Stronger evidence is needed to reject the null hypothesis

52
Q

What is the significance level

A

The probability of incorrectly rejecting the null hypothesis

53
Q

What does it mean if PMCC gets closer to 1 or -1

A

Closer to 1 means closer to perfect positive correlation; closer to -1 means closer to perfect negative correlation

54
Q

The conditions under which it is appropriate to assume a random variable has a binomial distribution

A

There are n independent trials