Statistics Year 1 Flashcards
population
the whole set of items that are of interest
census
+ It should give a completely accurate result
- Time consuming and expensive
- Cannot be used when the testing process destroys the item
- Hard to process a large quantity of data
sample
+ Less time consuming and expensive than a census
+ Fewer people have to respond
+ Less data to process than in a census
- The data may not be as accurate
- The sample may not be large enough to give information about small subgroups of the population
population
census
sampling frame
sampling units
average heights
population : everyone who has ever walked into the school
census : must find everyone who has walked into the school - impossible!
sampling frame : a practical list from which you can pick people to survey, e.g. a list of all current students and teachers; a narrowed-down focus that is the best approximation of the population we can get
sampling units : the individual students and teachers on the list
bias
sample doesn’t represent population fairly
random sampling
every sampling unit has an equal chance of being picked
+ Free of bias
+ Easy and cheap to implement for small populations and small samples
+ Each sampling unit has a known and equal chance of selection
- Not suitable when the population size or the sample size is large
- A sampling frame is needed
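A minimal sketch of simple random sampling, assuming the sampling frame is just a Python list. The frame of 30 numbered students and the function name are made up for illustration.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size distinct units; every unit is equally likely to be chosen."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    return rng.sample(sampling_frame, sample_size)

# Hypothetical sampling frame: 30 numbered students
frame = [f"student_{i}" for i in range(1, 31)]
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

This is the same idea as numbering every unit in the frame and using a random number generator to pick which numbers go in the sample.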
systematic sampling
the required elements are chosen at regular intervals from an ordered list
+ Simple and quick to use
+ Suitable for large samples and large populations
- People who are picked may not want to take part in the survey
- A sampling frame is needed
- It can introduce bias if the sampling frame is not random
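A rough sketch of systematic sampling, assuming the usual approach of taking every kth unit with k = population size ÷ sample size and a random starting point among the first k units. The list of 100 numbered units is hypothetical.

```python
import random

def systematic_sample(ordered_list, sample_size, seed=None):
    """Take every k-th unit from an ordered list, starting at a random point."""
    k = len(ordered_list) // sample_size   # interval between chosen units
    rng = random.Random(seed)
    start = rng.randrange(k)               # random start among the first k units
    return [ordered_list[start + i * k] for i in range(sample_size)]

# Hypothetical ordered list: units numbered 1 to 100
ordered_frame = list(range(1, 101))
every_fifth = systematic_sample(ordered_frame, 20, seed=0)
print(every_fifth)
```

Here k = 100 ÷ 20 = 5, so the sample is every 5th unit after the random start.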
stratified sampling
the population is divided into mutually exclusive strata (males and females, for example) and a random sample is taken from each.
+ Sample accurately reflects the population structure
+ Guarantees proportional representation of groups within a population
- Population must be clearly classified into distinct strata
- Selection within each stratum suffers from the same disadvantages as simple random sampling
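A minimal sketch of stratified sampling, assuming proportional allocation: number sampled from a stratum = (stratum size ÷ population size) × overall sample size. The strata of 60 males and 40 females are invented for illustration.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Randomly sample from each stratum in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    chosen = {}
    for name, members in strata.items():
        # Proportional share of the sample; rounding can shift the total by one
        n_from_stratum = round(len(members) / population_size * sample_size)
        chosen[name] = rng.sample(members, n_from_stratum)
    return chosen

# Hypothetical population: 60 males and 40 females
strata = {
    "male": [f"m{i}" for i in range(60)],
    "female": [f"f{i}" for i in range(40)],
}
chosen = stratified_sample(strata, 10, seed=0)
print({name: len(units) for name, units in chosen.items()})
```

With a sample of 10, the 60:40 split gives 6 males and 4 females, each group picked by simple random sampling.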
quota sampling
an interviewer or researcher selects a sample that reflects the characteristics of the whole population.
+ Allows a small sample to still be representative of the population
+ No sampling frame required
+ Quick, easy and inexpensive
+ Allows for easy comparison between different groups within a population
- Non-random sampling can introduce bias
- Population must be divided into groups, which can be costly or inaccurate
- Increasing scope of study increases number of groups, which adds time and expense
- Non-responses are not recorded as such
difference between quota and stratified
q: you meet the people and select them yourself, with no sampling frame involved, and allocate each person to the appropriate quota
s: if you want 5 tall people, you randomly pick 5 from a list of all the tall people
opportunity sampling
+Easy to carry out
+ Inexpensive
- Unlikely to provide a representative sample
- Highly dependent on individual researcher
continuous variable
A variable that can take any value in a given range is a continuous variable.
For example, time can take any value, e.g. 2 seconds, 2.1 seconds, 2.01 seconds, etc.
e.g. foot size
discrete
A variable that can take only specific values in a given range is a discrete variable.
For example, the number of girls in a family is a discrete variable as you can’t have 2.65 girls in a family
e.g. shoe size, which goes up in halves
mode/ modal class
The mode or modal class is the value or class that occurs most often.
median
The median is the middle value when the data values are put in order.
For data given in a frequency table, the mean can be calculated using the formula
x̄ = ∑xf / ∑f
mean calculated
x̄ = ∑x / n
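A short sketch of both mean formulas, x̄ = ∑x / n and x̄ = ∑xf / ∑f. The frequency table (number of girls in each of 20 families) is made-up data for illustration.

```python
def mean(values):
    """x-bar = sum of x divided by n."""
    return sum(values) / len(values)

def mean_from_frequency_table(values, frequencies):
    """x-bar = sum of (x * f) divided by sum of f."""
    total_xf = sum(x * f for x, f in zip(values, frequencies))
    return total_xf / sum(frequencies)

# Hypothetical table: number of girls (x) and number of families (f)
girls = [0, 1, 2, 3]
families = [4, 10, 5, 1]
print(mean_from_frequency_table(girls, families))  # sum xf = 23, sum f = 20
```

Writing out every value separately (four 0s, ten 1s, five 2s, one 3) and using x̄ = ∑x / n gives the same answer, which is why the frequency formula works.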
how to find the quartiles
Q1 = 1/4 × n
Q2 = 1/2 × n
Q3 = 3/4 × n
the result gives the position in the ordered data where the lower quartile (Q1), median (Q2) or upper quartile (Q3) lies
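A sketch of finding quartiles from a list of discrete data. It assumes one common textbook convention, which is not stated in these notes: if the position is a whole number, take the midpoint of that value and the next one; otherwise round the position up. Other conventions exist and can give slightly different answers.

```python
import math

def value_at_position(ordered, position):
    """Return the data value at a 1-based position (assumed textbook rule)."""
    if position == int(position):
        p = int(position)
        # Whole number: halfway between this value and the one above
        return (ordered[p - 1] + ordered[p]) / 2
    # Not a whole number: round up and take that value
    return ordered[math.ceil(position) - 1]

def quartiles(data):
    """Return (Q1, Q2, Q3) using positions n/4, n/2 and 3n/4."""
    ordered = sorted(data)
    n = len(ordered)
    return tuple(value_at_position(ordered, n * frac) for frac in (0.25, 0.5, 0.75))

print(quartiles([1, 3, 4, 6, 7, 9, 10, 12]))  # n = 8, positions 2, 4, 6
print(quartiles([1, 3, 4, 6, 7, 9, 10]))      # n = 7, positions 1.75, 3.5, 5.25
```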
interpolation
if the class containing the value is 34-36
number line = 33.5 ............ 36.5 (the class boundaries)
interpolation steps
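- find the position of the value you're interested in (e.g. n/2 for the median)
- find the class it lies in, using the cumulative frequencies
- draw a number line for that class, rounding the first value down and the second up (e.g. 34-36 becomes 33.5 to 36.5)
- on the bottom, write the cumulative frequency at the start and at the end of the class
- find the fraction of the way along the number line where the position lies
- multiply the fraction by the class width (the difference between the number line values) and add the result to the lower boundary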
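The steps above can be sketched as a single calculation. The function name and the cumulative frequencies (20 before the class, 50 at the end of it, so n = 70 and the median is at position 35) are assumptions for illustration; only the 33.5 to 36.5 boundaries come from the notes.

```python
def interpolate(position, lower_boundary, upper_boundary, cf_before, cf_end):
    """Estimate a value inside a class, assuming data is evenly spread in it."""
    # Fraction of the way through the class, measured in cumulative frequency
    fraction = (position - cf_before) / (cf_end - cf_before)
    class_width = upper_boundary - lower_boundary
    return lower_boundary + fraction * class_width

# Class 34-36 has boundaries 33.5 and 36.5 (hypothetical frequencies)
median_estimate = interpolate(35, 33.5, 36.5, cf_before=20, cf_end=50)
print(median_estimate)
```

Here the position is (35 - 20) / (50 - 20) = 1/2 of the way through the class, so the estimate is 33.5 + 1/2 × 3.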
interpolation assumes the data is
evenly spaced out within each class
spread
dispersion
variance and standard deviation
take account of all pieces of data