Year 1 - Statistics Flashcards

1
Q

1.1 What is a census?

A

A census observes or measures every member of a population

2
Q

1.1 What is a sample?

A

A selection of observations taken from a subset of the population which is used to find out information about the population as a whole.

3
Q

1.1 What are the advantages and disadvantages of a census?

A

Adv- Should give a completely accurate result

Disadv- Time-consuming, expensive, cannot be used when the testing process destroys the item, hard to process because of the large quantity of data

4
Q

1.1 What are the advantages and disadvantages of a sample?

A

Adv- time-efficient, fewer people have to respond, less data to process

Disadv- Not as accurate as a census, sample may not be large enough to give information about sub-groups of the population

5
Q

1.2 What are the three methods of random sampling?

A

Simple random sampling - every member of the population (and every possible sample of a given size) has an equal chance of being selected

Systematic sampling - required elements chosen at regular intervals from an ordered list

Stratified sampling - population is divided into mutually exclusive groups (e.g. males & females) and a random sample is taken from each
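
The three methods can be sketched in Python as below. This is only an illustration: the function names, the numbered population 1 to 100, and the strata "A" and "B" are made up for the example, and the stratified version simply rounds each stratum's share, so the sizes may not always add up to exactly n.

```python
import random

def simple_random_sample(population, n, rng):
    # Every possible sample of size n is equally likely.
    return rng.sample(population, n)

def systematic_sample(ordered_list, n, rng):
    # Take every k-th element, starting at a random point in the first k.
    k = len(ordered_list) // n
    start = rng.randrange(k)
    return [ordered_list[start + i * k] for i in range(n)]

def stratified_sample(strata, n, rng):
    # strata maps each group name to the list of its members.
    # Each group contributes in proportion to its share of the population.
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        size = round(n * len(members) / total)
        sample[name] = rng.sample(members, size)
    return sample

rng = random.Random(1)  # fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical sampling frame
srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strata = {"A": list(range(60)), "B": list(range(60, 100))}  # hypothetical groups
strat = stratified_sample(strata, 10, rng)
```

With 60 members in group A and 40 in group B, a stratified sample of 10 takes 6 from A and 4 from B.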

6
Q

1.2 What are the advantages and disadvantages of simple random sampling?

A

Adv- No bias, easy & cheap to do for small samples, each sampling unit has an equal chance

Disadv- Not suitable when the population or sample is large as it is time-consuming, sampling frame is needed

7
Q

1.2 What are the advantages and disadvantages of systematic sampling?

A

Adv- Simple & quick, suitable for large samples & populations

Disadv- Sampling frame needed, can introduce bias if sampling frame is not random

8
Q

1.2 What are the advantages and disadvantages of stratified sampling?

A

Adv- Accurately reflects population structure, guarantees proportional representation of groups

Disadv- Population must be clearly classified into distinct groups (strata), selection within each stratum suffers from the same disadvantages as simple random sampling

9
Q

1.3 What is quota sampling?

A

A researcher selects a sample that reflects the characteristics of the whole population

10
Q

1.3 What is opportunity sampling?

A

Taking the sample from people who are available at the time of the study and who fit the criteria of the study

11
Q

1.3 What are the advantages and disadvantages of quota sampling?

A

Adv- Allows a small sample to be representative of the population, no sampling frame, quick, easy, cheap, easy comparison between different groups

Disadv- Non-random sampling can introduce bias, population must be divided into groups which can be costly or inaccurate, increasing the scope of the study increases the number of groups which adds time and expense

12
Q

1.3 What are the advantages and disadvantages of opportunity sampling?

A

Adv- Easy, cheap

Disadv - Unlikely to be representative, highly dependent on individual researcher

13
Q

1.4 What is the difference between qualitative and quantitative data?

A

Qualitative - non-numerical observations

Quantitative - numerical observations

14
Q

1.4 What is the difference between discrete and continuous data?

A

Discrete - A variable that can only take specific values in a range e.g. shoe size

Continuous - A variable that can take any value in a range e.g. time

15
Q

1.5 What are the 8 weather station locations in the large data set?

A

Leuchars, Leeming, Heathrow, Hurn, Camborne, Beijing, Jacksonville, Perth

16
Q

1.5 What are the following measured in? Daily mean temp, daily total rainfall, daily total sunshine, daily mean wind direction and windspeed, daily max gust, daily max relative humidity, daily cloud cover, daily mean visibility, daily mean pressure

A

Daily mean temp - degrees Celsius (1dp)
Daily total rainfall - mm (1dp)
Daily total sunshine - hours, to the nearest tenth of an hour
Daily mean wind direction - Cardinal directions
Daily mean windspeed - Knots (1kn = 1.15mph)
Daily max gust - knots
Daily max relative humidity - percentage of air saturation (%)
Daily cloud cover - oktas (eighths of the sky covered)
Daily mean visibility - Decametres (Dm)
Daily mean pressure - Hectopascals (hPa)

17
Q

1.5 What time periods are used in the Large Data Set?

A

May-October 1987 & 2015

18
Q

2.1 What is the formula you can use to calculate the mean from a set of data?

A

x̄ = (Σx)/n where x bar is the mean, x is each data value, and n is the number of data values

19
Q

2.1 What is the formula you can use to calculate the mean from a frequency table?

A

x̄ = (Σxf)/(Σf) where x bar is the mean, x is each data value, and f is each frequency
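
As a quick check of the formula, here is a minimal Python sketch. The function name and the small table of values and frequencies are invented for illustration.

```python
def mean_from_frequency_table(values, frequencies):
    # Sum of x*f divided by the sum of f.
    sum_xf = sum(x * f for x, f in zip(values, frequencies))
    sum_f = sum(frequencies)
    return sum_xf / sum_f

values = [1, 2, 3, 4]       # hypothetical data values x
frequencies = [2, 3, 4, 1]  # hypothetical frequencies f
mean = mean_from_frequency_table(values, frequencies)
```

Here Σxf = 2 + 6 + 12 + 4 = 24 and Σf = 10, so the mean is 2.4.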

20
Q

2.2 How do you find the upper and lower quartiles for discrete data?

A

LQ: divide n by 4. If this is a whole number, the LQ is halfway between this data point and the one above. If it is a decimal, round up and pick that data point.

UQ: find 3/4 of n. If this is a whole number, the UQ is halfway between this data point and the one above. If it is a decimal, round up and pick that data point.
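
The rule can be written as a short Python sketch. The function name and the two example lists are made up for illustration, and the data must already be sorted in ascending order.

```python
def quartile(sorted_data, k):
    # k = 1 gives the lower quartile, k = 3 gives the upper quartile.
    n = len(sorted_data)
    if (k * n) % 4 == 0:
        # Whole number: halfway between this data point and the one above.
        p = k * n // 4  # position counted from 1
        return (sorted_data[p - 1] + sorted_data[p]) / 2
    # Decimal: round the position up and take that data point.
    p = (k * n + 3) // 4
    return sorted_data[p - 1]

even_data = [2, 4, 5, 7, 8, 9, 11, 13]  # n = 8, so n/4 = 2 and 3n/4 = 6
odd_data = [2, 4, 5, 7, 8, 9, 11]       # n = 7, so n/4 = 1.75 and 3n/4 = 5.25
lq_even, uq_even = quartile(even_data, 1), quartile(even_data, 3)
lq_odd, uq_odd = quartile(odd_data, 1), quartile(odd_data, 3)
```

For the 8 values, the LQ is halfway between the 2nd and 3rd values (4.5) and the UQ is halfway between the 6th and 7th (10). For the 7 values, the positions round up to the 2nd and 6th values, giving 4 and 9.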

21
Q

2.2 What is interpolation used for and how do you do it?

A

Used to find the median, quartiles, or percentiles of a grouped frequency table, assuming data values are distributed evenly within each class

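Median = LB + ((n-a)/(b-a) x class width) where LB is the lower boundary of the class containing the median, n is the position of the median (half the total frequency), a is the cumulative frequency at the lower class boundary, and b is the cumulative frequency at the upper class boundary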
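
A minimal Python sketch of interpolation, assuming each class is given as a (lower boundary, upper boundary) pair. The function name and the three-class table are invented for the example.

```python
def interpolate(class_bounds, frequencies, position):
    # Walk through the classes until the cumulative frequency reaches
    # the required position, then interpolate inside that class.
    cumulative = 0  # cumulative frequency at the lower class boundary (a)
    for (lower, upper), f in zip(class_bounds, frequencies):
        if f > 0 and cumulative + f >= position:
            # (position - a) / (b - a) x class width, where b - a = f
            return lower + (position - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("position is larger than the total frequency")

class_bounds = [(0, 10), (10, 20), (20, 30)]  # hypothetical classes
frequencies = [5, 10, 5]                      # hypothetical frequencies
total = sum(frequencies)
median = interpolate(class_bounds, frequencies, total / 2)
lower_quartile = interpolate(class_bounds, frequencies, total / 4)
upper_quartile = interpolate(class_bounds, frequencies, 3 * total / 4)
```

The median is at position 10, which lies in the 10 to 20 class (cumulative frequency 5 to 15), so the estimate is 10 + (10-5)/(15-5) x 10 = 15.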

22
Q

2.3 What is the range, IQR, and interpercentile range?

A

Range - difference between largest and smallest values

IQR - difference between upper and lower quartiles

Interpercentile range - difference between two given percentiles

23
Q

2.4 Give the formula for variance

A

((Σx^2)/n)-((Σx)/n)^2

24
Q

2.4 Give the formula for standard deviation

A

sqrt(((Σx^2)/n)-((Σx)/n)^2)
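
Both formulas can be checked with a short Python sketch ("mean of the squares minus the square of the mean"). The function names and the data list are made up for illustration.

```python
import math

def variance(data):
    n = len(data)
    mean_of_squares = sum(x * x for x in data) / n  # (Σx^2)/n
    mean = sum(data) / n                            # (Σx)/n
    return mean_of_squares - mean ** 2

def standard_deviation(data):
    # Standard deviation is the square root of the variance.
    return math.sqrt(variance(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
var = variance(data)
sd = standard_deviation(data)
```

Here Σx = 40 and Σx^2 = 232 with n = 8, so the variance is 29 - 25 = 4 and the standard deviation is 2.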

25
Q

2.4 What is the formula for variance and standard deviation in a frequency table?

A

Variance:
((Σfx^2)/Σf)-((Σfx)/Σf)^2

Standard deviation:
sqrt(((Σfx^2)/Σf)-((Σfx)/Σf)^2)
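
The frequency table version works the same way, as in this minimal Python sketch. The function names and the small table are invented for the example.

```python
import math

def variance_from_table(values, frequencies):
    sum_f = sum(frequencies)
    sum_fx = sum(f * x for x, f in zip(values, frequencies))
    sum_fx2 = sum(f * x * x for x, f in zip(values, frequencies))
    # (Σfx^2)/Σf minus ((Σfx)/Σf)^2
    return sum_fx2 / sum_f - (sum_fx / sum_f) ** 2

def standard_deviation_from_table(values, frequencies):
    return math.sqrt(variance_from_table(values, frequencies))

values = [1, 2, 3]       # hypothetical data values x
frequencies = [1, 2, 1]  # hypothetical frequencies f
table_var = variance_from_table(values, frequencies)
table_sd = standard_deviation_from_table(values, frequencies)
```

Here Σf = 4, Σfx = 8 and Σfx^2 = 18, so the variance is 4.5 - 4 = 0.5 and the standard deviation is about 0.707.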

26
Q

3.2 What are the 5 aspects of a box plot?

A

Lowest value, Lower Quartile (Q1), Median (Q2), Upper Quartile (Q3), Highest value. The range and interquartile range can be read off from these.

27
Q

3.3 Describe how to plot a cumulative frequency graph

A
  • Add another column to the frequency table labelled ‘cumulative frequency’
  • Plot cumulative frequency on the y-axis and the measurement on the x-axis
  • Plot the first point at the lower bound of the first class with a cumulative frequency of 0
  • Plot each cumulative frequency at the upper bound of its class
  • Join up points with a curve
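
The points to plot can be worked out with a short Python sketch, assuming each class is given as a (lower bound, upper bound) pair. The function name and the table are made up for illustration, and drawing the curve itself is left out.

```python
from itertools import accumulate

def cumulative_frequency_points(class_bounds, frequencies):
    # Running total of the frequencies gives the cumulative frequency column.
    running_totals = list(accumulate(frequencies))
    # Start with zero cumulative frequency at the lower bound of the first class.
    points = [(class_bounds[0][0], 0)]
    # Each cumulative frequency is plotted at the upper bound of its class.
    points += [(upper, cf) for (_, upper), cf in zip(class_bounds, running_totals)]
    return points

class_bounds = [(0, 10), (10, 20), (20, 30)]  # hypothetical classes
frequencies = [5, 10, 5]                      # hypothetical frequencies
points = cumulative_frequency_points(class_bounds, frequencies)
```

This gives the points (0, 0), (10, 5), (20, 15) and (30, 20).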
28
Q

3.4 What kind of data is a histogram used to present?

A

Continuous data

29
Q

3.4 What is the formula for frequency density?

A

Frequency density = frequency/class width
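
A minimal Python sketch of the calculation for a table with unequal class widths. The function name and the table are invented for the example.

```python
def frequency_densities(class_bounds, frequencies):
    # Frequency density = frequency / class width for each class.
    return [f / (upper - lower)
            for (lower, upper), f in zip(class_bounds, frequencies)]

class_bounds = [(0, 5), (5, 15), (15, 20)]  # hypothetical classes
frequencies = [10, 30, 5]                   # hypothetical frequencies
densities = frequency_densities(class_bounds, frequencies)

# Bar area = frequency density x class width, which recovers the frequency.
areas = [d * (upper - lower) for d, (lower, upper) in zip(densities, class_bounds)]
```

The densities are 2, 3 and 1, and multiplying each by its class width gives back the frequencies 10, 30 and 5.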

30
Q

3.4 How do you draw a frequency polygon?

A

Join up the middle of the top of each bar of a histogram with straight lines

31
Q

3.4 What is the relationship between the area of each bar in a histogram and the frequency?

A

Area of each bar is proportional to the frequency