Statistics 1 Flashcards
Define population
Whole set of items of interest
Define census
Observation/measure of every member of the population
Name of the numerical measures obtained from a census (describing the whole population)?
Parameters
Define sample
Selection of observations taken from a subset of the population, used to discover information about the population as a whole
Name of the numerical measures obtained from a sample?
Statistics
Advantage of a census?
-Completely accurate result obtained (i.e. everyone’s views recorded), giving a true measure of the population.
Disadvantages of a census?
-Time-consuming, labour-intensive and expensive
-Hard to contact the whole population, if applicable.
-Cannot be used when testing involves the destruction of the item
-Large quantity of data is hard to process
Advantages of a sample?
-Less time-consuming, less labour-intensive and cheaper
-If applicable, easier to contact the sample than the whole population
-Fewer people required to respond
-Less data to be processed
Disadvantages of a sample?
-Data could be inaccurate
-Sample may not be large enough to give information about small sub-groups of the population
Relationship between sample size and the validity of conclusions drawn from the processed data?
A larger sample usually increases the validity of the conclusions drawn from the processed data.
Unless using non-random sampling, requirement of sample?
To be random.
What does the size of a sample depend on?
-Accuracy required
-Resources available
Why is a larger sample typically more accurate?
A larger proportion of the data is examined, so the sample is more likely to be representative of the population.
If population is very varied (heterogeneous)?
Size of sample required would be larger than that of a uniform (homogeneous) population.
Different samples can…
Lead to different conclusions due to the natural variation of a population.
Define sampling units.
The individual units of a population available for sampling.
Define sampling frame.
Where sampling units are individually named/numbered to form a list.
Criteria (generally) for representative sampling?
-Usage of random sampling method
-Typically, large sample size.
What is a biased sample?
One that does not accurately reflect the population, and perhaps favours one part of the population over another.
How can you assess if a sample could be biased?
-Sample excludes people (based on age/gender/interests (e.g. a survey about sweets taken outside a sweet shop) or habits (e.g. a survey about sport taken at a sports centre) etc.)
-A small sample is more likely to be biased.
If a sample is biased, what then can occur?
A sample unrepresentative of a population can lead to a sampling error.
A statement claims that, on the whole/on average, the data = x. Use the data to agree/disagree with the statement.
Steps
-Mean of data?
-Median of data?
-Presence of anomalies?
-Thus, is the mean or the median the better measure?
(mean is affected by anomalies, median is not)
-Hence, validity of data…
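The steps above can be sketched in Python. This is only an illustration: the data set and the anomaly (95) are made up, and the standard `statistics` module is assumed to be an acceptable way to compute the two measures.

```python
from statistics import mean, median

# Hypothetical data set: five typical values plus one anomaly (95).
data = [12, 14, 15, 15, 16, 95]

# Both measures with the anomaly included.
with_anomaly = (mean(data), median(data))

# Remove the anomaly and recompute both measures.
cleaned = [x for x in data if x != 95]
without_anomaly = (mean(cleaned), median(cleaned))

print("mean, median with anomaly:   ", with_anomaly)
print("mean, median without anomaly:", without_anomaly)
```

In this made-up data the median stays at 15 either way, while the mean moves a long way once the anomaly is removed, which is why the median may be the better measure when anomalies are present.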
Define random sampling.
Where every member of the population has an equal chance of being selected for sampling (each sampling unit chosen by chance for sampling).
Thus, the sample performed under the methods of random sampling should be…
More representative of the population.
Benefit of random sampling as a whole?
It helps to remove bias from the sampling.
What are the 3 types of random sampling?
-Simple random
-Systematic
-Stratified
Describe the method of simple random sampling.
-A sampling frame is required.
-Use the random number function of a calculator, or “lottery sampling”, to select the units.
-Lottery sampling is where the members of the sampling frame are placed in a hat/other appropriate container, and the required number of “tickets” is then drawn from it.
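A minimal sketch of the method above, assuming a numbered sampling frame of 50 members (the names `sampling_frame` and `simple_random_sample` are hypothetical). Python's `random.Random.sample` plays the role of the hat: it draws without replacement, so no member can be picked twice.

```python
import random

# Hypothetical sampling frame: 50 numbered members of a population.
sampling_frame = [f"member_{i}" for i in range(1, 51)]

def simple_random_sample(frame, size, seed=None):
    """Draw `size` distinct units from the frame, each equally likely."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(frame, size)

sample = simple_random_sample(sampling_frame, 5, seed=1)
print(sample)
```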
What are the advantages of simple random sampling?
-Free of bias
-Easy and cheap for small populations and small samples
-Each sampling unit has known and equal chance of selection
Disadvantages of simple random sampling?
-Not suitable for a large population/sample size
-Requirement of a sampling frame.
-Only random if sampling frame is random.
Describe the method of systematic sampling.
-Required elements are chosen at regular intervals from the sampling frame.
-The interval, x, is decided by population size ÷ required sample size.
-The 1st unit, n, should be chosen at random from units 1 to x; the succeeding units are then chosen at regular intervals (n + x, n + 2x etc.) from the sampling frame.
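The method above could be sketched as follows, assuming the frame is a numbered list and that its length divides roughly evenly by the sample size. The function name `systematic_sample` and the 100-unit frame are hypothetical.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Pick a random start in the first interval, then every x-th unit."""
    interval = len(frame) // sample_size   # x = population size / sample size
    rng = random.Random(seed)
    start = rng.randint(1, interval)       # n: random position from 1 to x
    # Positions n, n + x, n + 2x, ... (converted to 0-based list indices).
    return [frame[start - 1 + i * interval] for i in range(sample_size)]

# Hypothetical sampling frame: units numbered 1 to 100.
frame = list(range(1, 101))
picked = systematic_sample(frame, 10, seed=3)
print(picked)
```

With 100 units and a sample of 10, the interval is 10, so the selected units are always 10 apart and only the starting unit is random.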
Describe the advantages of systematic sampling.
-Simple and quick
-Suitable for large populations and sample sizes.
Describe the disadvantages of systematic sampling.
-Requirement of sampling frame
-If the 1st person chosen is not randomised, bias can be introduced into the sampling.
-Only random if sampling frame random
Possible limitation of systematic sampling?
Patterns could occur by chance in the selected data, so that it is not representative of all sub-groups of the population.
Describe the method of stratified sampling.
-Population is divided into mutually exclusive strata, and a random sample is taken from each stratum.
-The PROPORTION of each stratum in the sample should equal its proportion in the population.
What is the equation that decides the number sampled from each stratum, so that its proportion of the sample equals its proportion of the overall population?
No. sampled in a stratum = (no. in stratum ÷ no. in overall population) × overall sample size
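A worked example of the equation above, using made-up year-group sizes (the names `stratum_sample_sizes` and `year_groups` are hypothetical). The results are rounded to whole people, so in less tidy cases the rounded numbers may not add up exactly to the overall sample size.

```python
def stratum_sample_sizes(strata_sizes, total_sample_size):
    """Number to sample from each stratum, in proportion to its size."""
    population = sum(strata_sizes.values())
    return {
        name: round(size * total_sample_size / population)
        for name, size in strata_sizes.items()
    }

# Hypothetical population of 300 split into three strata.
year_groups = {"Year 7": 120, "Year 8": 100, "Year 9": 80}
allocation = stratum_sample_sizes(year_groups, 30)
print(allocation)  # {'Year 7': 12, 'Year 8': 10, 'Year 9': 8}
```

For example, Year 7 makes up 120 ÷ 300 of the population, so it gets 120 ÷ 300 × 30 = 12 of the 30 places in the sample.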
What are the advantages of stratified sampling?
-Sample accurately reflects the population structure
-Guarantees proportional representation of groups within the population
What are the disadvantages of stratified sampling?
-Population must be classified into distinct strata.
-Selection process within each stratum is not suitable for large population/sample sizes
-Requirement of sampling frame.
-Only random if sampling frame random
Chance of being selected in a stratified sample.
-Assumed that each member has equal chance of being selected due to system of random sampling.
Then:
chance = number of groups selected × (1 ÷ number of groups in the study).
What are the 2 types of non-random sampling?
-Quota sampling
-Opportunity/Convenience sampling.
Describe quota sampling.
-Interviewer/researcher selects a sample to try to reflect the characteristics of a population.
-Population divided into groups according to the given characteristic, with the size of each group determining the proportion of the sample that will have that specific characteristic.
-Interviewer meets people, assesses which group they belong to, and subsequently allocates them to the appropriate quota
-This continues until all the quotas are filled.
What occurs if a person refuses to be interviewed/person fits into quota already filled?
They are simply ignored and the researcher/interviewer moves on to the next person.
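The quota procedure, including ignoring refusals and people whose quota is already full, can be sketched as below. Everything here is hypothetical: the function `quota_sample`, the two age groups, the quota sizes and the list of passers-by are invented for illustration.

```python
def quota_sample(people, quotas):
    """people: (name, group, willing) tuples in the order they are met."""
    remaining = dict(quotas)  # places still unfilled in each quota
    selected = []
    for name, group, willing in people:
        if all(count == 0 for count in remaining.values()):
            break  # every quota is filled, so sampling stops
        if not willing or remaining.get(group, 0) == 0:
            continue  # refusal or full quota: ignore and move on
        selected.append((name, group))
        remaining[group] -= 1
    return selected

# Hypothetical passers-by and quotas of 2 people per age group.
passers_by = [
    ("A", "under 30", True), ("B", "under 30", False),
    ("C", "30 and over", True), ("D", "under 30", True),
    ("E", "under 30", True), ("F", "30 and over", True),
    ("G", "30 and over", True),
]
chosen = quota_sample(passers_by, {"under 30": 2, "30 and over": 2})
print(chosen)
```

Here B is ignored for refusing, E is ignored because the "under 30" quota is already full, and G is never approached because all quotas are filled by then.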
What are the advantages of quota sampling?
-Allows a small sample to still be representative of the population so field work can be done quickly.
-Does not require a sampling frame.
-Quick, easy to administer, inexpensive.
-Allows for easy comparison between the different groups of a population.
What are the disadvantages of quota sampling?
-As a non-random method relying on the judgement of the interviewer, it can introduce bias
-Population must be divided into groups, which can be costly or inaccurate.
-Increasing the scope of the study increases the number of groups, hence increasing the time and expense.
-Non-responses are not recorded as such.
-Not possible to estimate sampling errors
(due to lack of randomness)
-Difficulties of defining controls e.g. social class
What is the method of opportunity sampling?
-Taking the sample from people who are available at the time of sampling (e.g. the first n people seen) and who fit the criteria being researched.
Advantages of opportunity sampling.
-Easy to carry out
-Inexpensive.
Disadvantages of opportunity sampling.
-Unlikely that the sampling is representative of the population
-Is highly dependent on the individual researcher
How can non-random sampling data be made more representative?
Depends on the context of the question: try to make the sample less biased and more representative. Usually, increasing the sample size is a valid answer, as is 1 way of removing bias (e.g. preventing the exclusion of certain people).
What is qualitative data/variables?
Variables/data associated with non-numerical (i.e. categorical) observations, being descriptive.
Quantitative?
Variables/data associated with numerical observations, being numerical.