Statistics Flashcards
population
A collection of all items.
statistic
A function of the sample that contains no unknown quantities.
raw data
Information that can be obtained from a population.
disadvantages of a census
- time consuming and expensive
- cannot be used when the testing process destroys the item
- hard to process a large quantity of data
advantage of a census
It should give a completely accurate result.
census
Observes or measures every member of a population.
sample
Observations taken from a subset of the population.
advantages of a sample
- less time consuming and expensive than a census
- fewer people have to respond
- less data to process than in a census
disadvantages of a sample
- data may not be as accurate as a census
- sample may not be large enough to give info about small subgroups of the population
sampling units
Individual units of a population.
sampling frame
A list formed from individually named/numbered sampling units of a population.
3 sampling techniques when there is access to the whole population
- simple random
- stratified
- systematic
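Never Get In Cold Soup (steps of simple random sampling)
Number each sampling unit
Generate random numbers
Ignore repeats
Continue until the sample size is reached
Select the corresponding sampling units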
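The five steps can be sketched in Python. This is a minimal illustration, assuming the mnemonic stands for: number the units, generate random numbers, ignore repeats, continue until the sample is full, select the matching units. The function name `simple_random_sample` and the `seed` parameter are hypothetical, chosen only for this example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from a sampling frame, one random number at a time."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)
    # Number: label every sampling unit 1..N.
    numbered = {i + 1: unit for i, unit in enumerate(frame)}
    chosen = []
    # Continue: keep going until n different numbers have been kept.
    while len(chosen) < n:
        k = rng.randint(1, len(frame))  # Generate: random integer from 1 to N inclusive
        if k in chosen:                 # Ignore: skip any repeated number
            continue
        chosen.append(k)
    # Select: pick the units that match the chosen numbers.
    return [numbered[k] for k in chosen]

frame = ["Ann", "Ben", "Cat", "Dev", "Eve", "Fay"]
print(simple_random_sample(frame, 3, seed=1))
```

In practice `random.sample(frame, n)` does the same job in one call; the loop is spelled out here only to mirror the five steps.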
random integer generator in calculator
1 –> PROB –> RAND –> RanInt#(lower, upper, total)
stratified sampling
Samples taken from each group in proportion to their size.
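A short sketch of the proportional calculation: number sampled from a stratum = (stratum size / population size) × sample size. The function name `stratified_allocation` and the year-group figures are made up for illustration.

```python
def stratified_allocation(stratum_sizes, sample_size):
    """Return how many units to sample from each stratum, in proportion to its size."""
    total = sum(stratum_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in stratum_sizes.items()
    }

# Hypothetical population: 120 students in Year 12 and 80 in Year 13, sample of 50.
print(stratified_allocation({"Year 12": 120, "Year 13": 80}, 50))
# {'Year 12': 30, 'Year 13': 20}
```

When the proportions do not come out as whole numbers, the rounded values may not add up to the intended sample size exactly, so the totals should be checked by hand.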
systematic sampling
Members of the population are chosen at regular intervals.
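A minimal sketch of choosing at regular intervals, assuming the usual approach of taking every k-th unit from an ordered sampling frame, where k = population size ÷ sample size and the starting point is picked at random from the first k units. The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every k-th unit from the frame after a random start within the first k."""
    if not 0 < n <= len(frame):
        raise ValueError("sample size must be between 1 and the size of the frame")
    rng = random.Random(seed)
    k = len(frame) // n       # sampling interval
    start = rng.randrange(k)  # random starting index from 0 to k - 1
    return [frame[start + i * k] for i in range(n)]

# Units numbered 1..100, sample of 10: the interval k is 10.
print(systematic_sample(list(range(1, 101)), 10, seed=0))
```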
advantages of simple random sampling
- free of bias
- easy and cheap to implement for small samples
- each sampling unit has a known and equal chance of selection
disadvantages of simple random sampling
- not suitable when sample size is large
- a sampling frame is needed
advantages of systematic sampling
- simple and quick to use
disadvantages of systematic sampling
- can introduce bias if sampling frame is not random
advantages of stratified sampling
- sample accurately reflects the population structure
- guarantees proportional representation of groups within a population
disadvantages of stratified sampling
- population must be clearly classified into distinct strata
- selection within each stratum suffers the same disadvantages as simple random sampling
3 sampling techniques when there is not access to whole population
- opportunity
- quota
- cluster
opportunity sampling
Takes samples from members of the population you have access to until you have a sample of the desired size.
quota sampling
You decide in advance how many members of each group to sample, then fill each quota using non-random (opportunity) sampling.
cluster sampling
The population is split into clusters; some clusters are selected at random, and a random sample is then taken from each selected cluster.
advantages of opportunity sampling
- easy to carry out
- inexpensive
disadvantages of opportunity sampling
- unlikely to provide a representative sample
- highly dependent on individual researcher
advantages of quota sampling
- allows a small sample to still be representative of the population
- no sampling frame required
- quick, easy and inexpensive
- allows for easy comparison between different groups within a population
disadvantages of quota sampling
- non-random sampling can introduce bias
- population must be divided into groups, which can be costly or inaccurate
- non-responses are not recorded as such
advantages of cluster sampling
- easy to carry out
- inexpensive (if few clusters)
disadvantages of cluster sampling
- bias is more likely (size of cluster changes probability)
- only useful when population can be naturally divided into easily identifiable clusters
variance formula in terms of standard deviation
variance = standard deviation²
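A quick numerical check of this relationship, using Python's standard `statistics` module with a made-up data set. The population versions (`pvariance`, `pstdev`) are used here as an assumption; the same relationship holds for the sample versions (`variance`, `stdev`).

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5

sd = statistics.pstdev(data)           # population standard deviation
variance = statistics.pvariance(data)  # population variance

# The variance should equal the standard deviation squared.
print(sd, variance, math.isclose(variance, sd ** 2))
```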
Independent variable on a scatter graph
X axis
Dependent variable on a scatter graph
Y axis
causal relationship
If a change in one variable causes a change in the other.
one reason why a conclusion drawn from a scatter graph may not be valid
There may be a 3rd variable that affects the data.
mutually exclusive
When events have no outcome in common.
P(A∪B) for mutually exclusive events
P(A) + P(B)
independent events
When one event has no effect on another.
discrete data
Data that takes values which change in steps.
random variable
A variable whose value is determined by chance.
discrete uniform distribution
All probabilities are the same.
criteria for binomial distribution
- fixed number of trials
- 2 possible outcomes
- fixed probability
- trials are independent of each other
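When all four criteria hold, X ~ B(n, p) and P(X = r) = C(n, r) × p^r × (1 − p)^(n − r). A small sketch of that formula; the function name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): n fixed independent trials, success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

# Probability of exactly 2 heads in 4 tosses of a fair coin.
print(binomial_pmf(2, 4, 0.5))  # 0.375
```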
probability that a random continuous variable takes a specific value
0
features of a normal distribution graph
- bell shaped
- symmetrical about the mean
total area under a normal distribution graph
1
what are μ and σ in X~N(μ, σ²)
μ is the mean and σ is the standard deviation (so σ² is the variance)
most of the distribution in a normal distribution should be within…
3 standard deviations from the mean
standard normal distribution notation
Z~N(0, 1)
formula linking X~N(μ, σ²) to the standard normal Z
X = μ + Zσ, equivalently Z = (X − μ) / σ
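A small sketch of converting in both directions, with a check that probabilities agree before and after standardising. The helper names `to_z` and `to_x` and the numbers (μ = 100, σ = 15) are made up for illustration; `NormalDist` is from Python's standard `statistics` module.

```python
from statistics import NormalDist

def to_z(x, mu, sigma):
    """Standardise: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def to_x(z, mu, sigma):
    """Convert back: X = mu + Z * sigma."""
    return mu + z * sigma

mu, sigma = 100, 15
z = to_z(115, mu, sigma)    # 115 is one standard deviation above the mean
print(z)                    # 1.0
print(to_x(z, mu, sigma))   # 115.0

# P(X < 115) for X ~ N(100, 15²) should match P(Z < 1) for Z ~ N(0, 1).
print(NormalDist(mu, sigma).cdf(115), NormalDist().cdf(z))
```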
standard deviation of the distribution of the sample mean
population standard deviation / √sample size, i.e. σ/√n
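normal distribution hypothesis test: conclusion if the probability is above the significance level
do not reject H₀ (there is insufficient evidence to reject it)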
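A minimal sketch of a one-tailed test for the mean of a normal distribution, comparing a probability with the significance level. It assumes a known population standard deviation σ and the alternative hypothesis μ > μ₀, so the sample mean is modelled as N(μ₀, σ²/n) under H₀. The function name `upper_tail_test` and all the numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_test(sample_mean, mu0, sigma, n, sig_level=0.05):
    """Test H0: mu = mu0 against H1: mu > mu0 using the sample mean."""
    se = sigma / sqrt(n)  # standard deviation of the sample mean
    # Probability of a sample mean at least this large if H0 is true.
    p_value = 1 - NormalDist(mu0, se).cdf(sample_mean)
    reject_h0 = p_value < sig_level
    return p_value, reject_h0

# Hypothetical data: H0 says mu = 50, sigma = 5, sample of 25.
print(upper_tail_test(51, 50, 5, 25))  # probability above 5%: do not reject H0
print(upper_tail_test(52, 50, 5, 25))  # probability below 5%: reject H0
```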
Give one reason why a certain sample should not be used to conduct a hypothesis test.
the sample is not random