Stats Flashcards
(39 cards)
How does Quota sampling work
Name one advantage and one disadvantage
Take a certain number from each category according to the size of each group in the population
Adv: all categories are represented
Disad: Not random so can lead to bias
How does stratified sampling work
One advantage
One disadvantage
Take a random sample from each stratum, with the number taken from each stratum proportional to that stratum's size in the population
Adv : sample accurately reflects the population , selection is random
Disad : time consuming , depends on sampling frame available
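The proportional allocation described in this card can be sketched in Python. This is only an illustration: the function name stratified_sample, the year groups and their sizes are hypothetical and not part of the card.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """strata maps each stratum name to a list of its members."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation: (stratum size / population size) * sample size.
        k = round(sample_size * len(members) / population_size)
        # Selection inside each stratum is random and without replacement.
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 60 pupils in Year 12 and 40 in Year 13.
strata = {
    "Year 12": list(range(1, 61)),
    "Year 13": list(range(61, 101)),
}
sample = stratified_sample(strata, sample_size=10, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
# prints {'Year 12': 6, 'Year 13': 4}
```

With less convenient numbers the rounded counts may not add up to the intended sample size, so one count sometimes has to be adjusted by hand.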
how does systematic sampling work
one advantage
one disadvantage
number every piece of data in the population, then use a random number generator to pick a starting point and select every nth piece of data from there
adv: random so less likely to lead to bias
disadv : need sampling frame
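A minimal Python sketch of the procedure in this card, assuming the sampling frame is simply the numbers 1 to 100 and a sample of 10 is wanted. The function name systematic_sample is made up for the illustration.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    rng = random.Random(seed)
    # n = population size // sample size, so every nth item is taken.
    step = len(population) // sample_size
    # Random starting point within the first interval.
    start = rng.randrange(step)
    return population[start::step][:sample_size]

# Hypothetical sampling frame: items numbered 1 to 100.
population = list(range(1, 101))
chosen = systematic_sample(population, 10, seed=3)
print(chosen)
```

Only the starting point is random. Once it is fixed, every other member of the sample is determined by the step.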
how does opportunity sampling work
one advantage
one disadvantage
pick the data as it becomes available
adv: easy and quick (cheap)
disadv : not random , can lead to bias
how does simple random sampling work
one advantage
one disadvantage
number every piece of data in the population, use a random number generator to pick the numbers in the sample and keep going until you have your sample
Adv : random and less likely to be biased, each piece of data has an equal chance of being picked
disadv : requires a sampling frame
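In Python the same idea is a single call to random.sample, which picks without replacement so no item is chosen twice. The frame of 200 numbered items and the sample size of 20 are assumptions made for the example.

```python
import random

rng = random.Random(42)                  # fixed seed so the run is repeatable
sampling_frame = list(range(1, 201))     # hypothetical frame: items numbered 1 to 200
simple_sample = rng.sample(sampling_frame, 20)  # every item equally likely to be picked
print(sorted(simple_sample))
```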
if you have outliers which value of the average and which measure of the spread would be best to use
the median as the mean is distorted by extreme values
interquartile range as this is not affected by outliers - represents the middle 50%
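A short Python check of this card, using a made-up data set with and without one extreme value. The numbers are hypothetical, and statistics.quantiles uses its own quartile convention, which may differ slightly from the one taught in a given textbook.

```python
import statistics

def summary(values):
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
        "sd": statistics.pstdev(values),  # population standard deviation
    }

data = [12, 14, 15, 15, 16, 17, 18]   # hypothetical values
with_outlier = data + [95]            # same data plus one extreme value

before = summary(data)
after = summary(with_outlier)
for key in before:
    print(key, round(before[key], 2), "->", round(after[key], 2))
```

The mean and standard deviation shift a lot when the outlier is added, while the median and interquartile range barely move.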
which value of the average and measure of the spread is most accurate and why
the mean and standard deviation as both measures include every value
is the explanatory variable the x or y values
x values
is the response variable the x or y values
y values
if a line of regression y = 17.0 + 15.4x represents the relationship between the percentage (x%) of cocoa solids and the price (y pence) of different chocolate, interpret the value 15.4
for every extra 1% of cocoa solids the chocolate contains, the price increases by 15.4 pence on average
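The interpretation can be checked numerically. The sketch below takes the gradient to be 15.4, the value the card asks about, and the intercept to be 17.0. The function name predicted_price and the cocoa percentages 40 and 41 are chosen only for illustration.

```python
def predicted_price(cocoa_percent, intercept=17.0, gradient=15.4):
    """Price in pence predicted by the regression line y = intercept + gradient * x."""
    return intercept + gradient * cocoa_percent

# Raising the cocoa content by one percentage point changes the prediction by the gradient.
increase = predicted_price(41) - predicted_price(40)
print(round(increase, 1))  # prints 15.4
```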
if the relationship between the variables is p on n, is the linear regression line
p = an +b
or
n = ap + b
p = an + b
where data is coded what is the mean affected by
addition / subtraction
multiplication / division
continuous data ….
can take any value within a range
where data is coded what is the standard deviation affected by
multiplication / division
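Both coding cards can be checked with a small Python example. It assumes a coding of the form y = (x - a) / b with made-up values a = 100 and b = 5, and made-up data.

```python
import statistics

x = [105, 110, 120, 125, 140]   # hypothetical raw data
a, b = 100, 5                   # hypothetical coding: y = (x - a) / b
y = [(value - a) / b for value in x]

mean_x, sd_x = statistics.mean(x), statistics.pstdev(x)
mean_y, sd_y = statistics.mean(y), statistics.pstdev(y)

# The mean follows the whole coding: subtract a, then divide by b.
print(mean_y, (mean_x - a) / b)
# The standard deviation ignores the subtraction and is only divided by b.
print(sd_y, sd_x / b)
```

Subtracting a shifts every value by the same amount, so the spread is unchanged. Only the division by b rescales it.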
discrete data is ….
data that can only take specific values e.g shoe size
in large data set which UK locations are on the coast (windy )
north to south
Leuchars, Hurn, Camborne
in the large data set which worldwide locations are on the coast
Jacksonville and Perth
in the large data set what is the daily maximum gust measured in and give a definition of what this means
Knots
1 knot = 1.15 mph
in the large data set what are the only 3 categories of data where the data is continuous
daily mean rainfall
daily hours of sunshine
daily max temperature
helpful histogram formula
area = k x frequency
what is the definition in words of the standard deviation
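a measure of how spread out the data is: roughly the typical distance of the values from the mean (the square root of the mean of the squared distances from the mean)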
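A worked calculation in Python, using a small made-up data set, shows how the standard deviation is built up and how it compares with the plain average distance from the mean.

```python
from math import sqrt

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
n = len(values)

mean = sum(values) / n                               # 40 / 8 = 5
variance = sum((v - mean) ** 2 for v in values) / n  # 32 / 8 = 4
sd = sqrt(variance)                                  # 2

# For comparison: the plain average of the distances from the mean.
mean_abs_distance = sum(abs(v - mean) for v in values) / n  # 12 / 8 = 1.5

print(mean, sd, mean_abs_distance)  # prints 5.0 2.0 1.5
```

The two measures are close but not equal, because squaring gives extra weight to values far from the mean.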
for a hypothesis test that uses the binomial distribution what are the null and alternative hypotheses for a two-tailed test
H0 : p = p0 (where p0 is the value of the probability being tested)
H1 : p ≠ p0
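A sketch of a two-tailed binomial test in Python. All of the numbers are assumptions for illustration: the null hypothesis is taken as p = 0.5, with 20 trials, 15 observed successes and a 5% significance level. In a two-tailed test the significance level is split equally between the two tails.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Hypothetical test: H0: p = 0.5, H1: p != 0.5.
n, p0, observed, alpha = 20, 0.5, 15, 0.05

# 15 is above the expected value n * p0 = 10, so the upper tail is the relevant one.
upper_tail = 1 - binom_cdf(observed - 1, n, p0)   # P(X >= 15)

# Two-tailed: compare the tail probability with half the significance level.
reject_h0 = upper_tail < alpha / 2
print(round(upper_tail, 4), reject_h0)  # prints 0.0207 True
```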
for a hypothesis test that is testing positive correlation what are the null and alternative hypotheses
H0 : ρ (rho) = 0
H1 : ρ > 0
for a hypothesis test that is testing to see if the mean has decreased, what are the null and alternative hypotheses
H0 : μ = μ0 (where μ0 is the original value of the mean)
H1 : μ < μ0