Biostatistics 1 Flashcards
Why is it important to study statistics?
Statistics enables us to use information from a sample of people to better understand populations.
How does studying statistics benefit critical thinking and analytical skills?
Studying statistics helps develop critical thinking and analytical skills.
Why is the ability to read and evaluate journal articles important for informed information use?
Being able to read and evaluate journal articles makes one an informed user of information.
What would a population be like if all its members were identical?
If all members of a population were identical, the population would be homogeneous (no variation in characteristics).
What is the reality of population characteristics in terms of variation?
In reality, all populations are heterogeneous, meaning their members vary considerably.
Why is it necessary to observe many individuals in a population study?
It is necessary to observe many individuals to capture the range of characteristics present in the population.
What does using data to make an inference (to say something) about a population involve?
Using data involves making an inference with confidence about a whole population based on the study of only a sample.
What is the population in a statistical study?
The population is the entire group that you want to talk about.
What is a sampling frame in the context of statistics?
A sampling frame is the eligible population that you intend to sample.
What is a sample in a statistical study?
A sample is the subset of the population that you actually observe in the data.
Why is it important to have a sample that represents the population well?
We want a sample that is a good representation of all the characteristics in the population to make accurate inferences.
What is a representative sample?
A representative sample is a sample that has characteristics that are similar to the overall population.
What is random error in statistics?
Random error is a statistical error that occurs when a selected sample does not represent the entire population.
What happens when the results found in a sample do not represent the entire population?
It indicates the presence of random error or sampling error.
Why is it called “random error” or “sampling error”?
It is called “random error” or “sampling error” because it assumes the error is just by chance due to using a sample instead of the whole population.
How can we reduce sampling error?
We can reduce sampling error by increasing the sample size.
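The effect of sample size on sampling error can be illustrated with a small simulation. This is only a sketch: the population (10,000 people, 30% with the characteristic), the two sample sizes, and the helper name `mean_abs_error` are all assumptions made up for the illustration.

```python
import random


def mean_abs_error(population, n, repeats, rng):
    """Average absolute gap between the sample proportion and the true proportion."""
    true_p = sum(population) / len(population)
    total = 0.0
    for _ in range(repeats):
        sample = rng.sample(population, n)  # simple random sample, no replacement
        total += abs(sum(sample) / n - true_p)
    return total / repeats


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1 = has the characteristic, 0 = does not.
population = [1] * 3000 + [0] * 7000

small = mean_abs_error(population, n=50, repeats=200, rng=rng)
large = mean_abs_error(population, n=2000, repeats=200, rng=rng)

print(f"n=50:   average sampling error {small:.3f}")
print(f"n=2000: average sampling error {large:.3f}")
```

The larger samples land closer to the true proportion on average, which is the sense in which a bigger sample reduces sampling error.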
What is another method to reduce sampling error besides increasing sample size?
Selecting a representative sample helps reduce sampling error.
Why don’t researchers use the entire population in a study?
Using the entire population, or conducting a census, is very seldom done in survey research due to its impracticality.
What is a census in the context of survey research?
A census is a study that includes data about every member of the defined target population.
What are the two main ways samples can be drawn?
Samples can be drawn using probability/random sampling or non-probability sampling.
What is probability/random sampling?
Probability/random sampling is when each member of the population has a known probability of being selected and selection is not controlled by the investigators.
What is non-probability sampling?
Non-probability sampling is when samples are chosen based on personal judgment or convenience.
How should researchers decide which sampling method to use?
Researchers should weigh what fits best for their research question, budget, and timeframe.
What is random sampling?
Random sampling is a method where each member of the population has an equal chance of being selected.
How does random sampling benefit the sampling process?
It removes conscious and unconscious sampling bias.
What does random sampling allow researchers to do in terms of population parameters?
It permits the estimation of population parameters, allowing for statistical inference.
How are units selected in random sampling?
Units are selected randomly, in an unpredictable manner with no pattern.
Who controls the sampling process in random sampling?
The researcher controls the sampling process but not who gets selected.
How does random sampling reduce researcher bias?
It reduces researcher bias in the selection process by ensuring the selection is random and not influenced by the researcher’s preferences.
Types of Random sampling
- simple random sampling
- systematic random sampling
- stratified random sampling
- cluster random sampling
What is the first step in simple random sampling?
Assign each member of the population a number.
What is the second step in simple random sampling?
Decide on the sample size (n) that you need.
How do you select the members in simple random sampling?
Use a hat, random number table, or random number generator to select your “n” members.
If the population is 10,000 and the sample size is 400, what is the likelihood of being selected?
The likelihood of being selected is 400/10,000, which is 0.04 or 4%.
(each member of the population has an equal chance of being selected)
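The steps for simple random sampling can be sketched in a few lines, using the population of 10,000 and sample size of 400 from the card above. The ID numbering and the fixed seed are assumptions for the illustration; a random number generator stands in for the hat or random number table.

```python
import random

N = 10_000  # population size
n = 400     # sample size you decided on

rng = random.Random(1)  # fixed seed so the draw is repeatable

# Step 1: assign each member of the population a number.
population_ids = list(range(1, N + 1))

# Step 3: let the random number generator pick n members, without replacement.
sample_ids = rng.sample(population_ids, n)

# Every member has the same chance of being selected.
selection_probability = n / N

print(len(sample_ids), "members selected")
print(f"chance of selection: {selection_probability:.2%}")
```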
What is a key benefit of simple random sampling?
Each member of the population has an equal chance of being selected, leading to an accurate representation of the population.
What is another advantage of simple random sampling?
It is easy to use.
What is a disadvantage of simple random sampling if the sampling frame is large?
This method can be impractical if the sampling frame is large.
What is a potential drawback regarding minority subgroups in simple random sampling?
Minority subgroups of interest in a population may not be present in the sample.
What is the first step in systematic random sampling?
Assign each member of the population a number.
What is the second step in systematic random sampling?
Decide on the sample size (n) that you want or need.
What is the third step in systematic random sampling?
Calculate the selection interval (k = population size N divided by sample size n).
How do you determine the starting point in systematic random sampling?
Randomly select a starting point within the first interval using a random number table or random number generator.
What do you do after selecting the starting point in systematic random sampling?
Include every kth member after the starting point, where k is the selection interval, until the sample size is achieved.
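The systematic sampling steps can be sketched as follows. The function name `systematic_sample`, the population of 10,000 numbered members, and the sample size of 400 are assumptions for the illustration. The selection interval is written as k and computed as the population size divided by the sample size, rounded down.

```python
import random


def systematic_sample(population, n, rng):
    """Pick a random start in the first interval, then take every k-th member."""
    k = len(population) // n   # selection interval
    start = rng.randrange(k)   # random starting position within the first interval
    return [population[start + i * k] for i in range(n)]


rng = random.Random(5)                   # fixed seed so the draw is repeatable
population_ids = list(range(1, 10_001))  # step 1: number every member
sample_ids = systematic_sample(population_ids, 400, rng)

print(sample_ids[:5])  # first few selected IDs, spaced k = 25 apart
```

Only one random number is needed (the starting point), which is why this approach is quicker to carry out by hand than drawing every member at random.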
How does systematic random sampling compare to simple random sampling in terms of results?
In practice it usually gives results comparable to simple random sampling, provided the list has no repeating pattern, and it is more efficient to carry out.
What is a stratum in stratified random sampling?
A stratum is a segment/sub-group of the population that shares at least one common characteristic (e.g., birth years, gender).
How are characteristics distributed within and across strata in stratified random sampling?
The characteristic will be homogeneous within each stratum but heterogeneous across the strata.
Can you give an example of strata in a population?
One group could consist only of women, and another group could consist only of men.
Why would we use stratified random sampling if sub-groups may differ with regard to the measurement being made?
To ensure these sub-groups are adequately represented in the final sample.
What is the main advantage of stratified random sampling?
It helps ensure we draw a sample that is representative of the population.
How are participants sampled within each stratum in stratified random sampling?
Participants are sampled from within each stratum using simple or systematic random sampling.
What is proportional stratified sampling?
Proportional stratified sampling is when the sample size in each stratum is proportional to the size of that stratum in the population.
In a population of 1,000 people, if 20% are current smokers, 30% are previous smokers, and 50% are never smokers, what would a proportional sample of 100 people look like?
A proportional sample would include:
20 current smokers
30 previous smokers
50 never smokers
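The smoking example can be worked through in code. This sketch assumes made-up member labels and uses simple random sampling within each stratum; rounding the allocation works cleanly here, but with other stratum sizes the rounded counts might not add up exactly to the target sample size.

```python
import random

rng = random.Random(3)  # fixed seed so the draw is repeatable

# Population of 1,000 grouped by smoking status (the strata).
strata = {
    "current": [f"current_{i}" for i in range(200)],    # 20%
    "previous": [f"previous_{i}" for i in range(300)],  # 30%
    "never": [f"never_{i}" for i in range(500)],        # 50%
}
N = sum(len(members) for members in strata.values())
n = 100  # total sample size

sample = {}
for name, members in strata.items():
    # Each stratum gets a share of the sample equal to its share of the population.
    n_stratum = round(n * len(members) / N)
    sample[name] = rng.sample(members, n_stratum)

sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)  # {'current': 20, 'previous': 30, 'never': 50}
```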
What is disproportional stratified sampling?
Disproportional stratified sampling is when the sample size in each stratum is NOT proportional to the stratum size in the population.
What is a cluster in cluster random sampling?
A naturally occurring structure with a mixed aggregate of members of the population, with each member appearing in only one cluster.
Can you give examples of clusters?
Schools, suburbs, hospitals, or clinics.
How does cluster random sampling differ from stratified sampling?
In cluster sampling, there is homogeneity across clusters but heterogeneity within each cluster, whereas stratified sampling has homogeneity within strata and heterogeneity between strata.
What does it mean that clusters all look like each other?
All clusters are similar in type (e.g., all are primary care clinics), but the members within each cluster are as diverse as the population as a whole.
Why is only a subset of clusters needed to represent the population?
Because each cluster is a microcosm of the entire population, representing its diversity.
When is cluster random sampling an efficient strategy?
When the population is spread over a large geographic area.
What is the first step in cluster random sampling?
Develop or obtain a list of clusters.
What is the second step in cluster random sampling?
Draw a random sample of clusters.
What are the two options for sampling within selected clusters?
- Include everyone in each selected cluster.
- Draw a random sample of people from within each selected cluster.
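The cluster sampling steps, including both options for sampling within the selected clusters, can be sketched as below. The sampling frame of 20 clinics with 50 patients each, the number of clinics drawn, and the within-clinic sample size are all hypothetical values chosen for the illustration.

```python
import random

rng = random.Random(7)  # fixed seed so the draw is repeatable

# Step 1: list of clusters (hypothetical: 20 clinics, 50 patients each).
clinics = {
    f"clinic_{c:02d}": [f"clinic_{c:02d}_patient_{i:02d}" for i in range(50)]
    for c in range(20)
}

# Step 2: draw a random sample of clusters.
chosen = rng.sample(sorted(clinics), 5)

# Option 1: include everyone in each selected cluster.
one_stage = [person for name in chosen for person in clinics[name]]

# Option 2: draw a random sample of people within each selected cluster.
two_stage = [
    person for name in chosen for person in rng.sample(clinics[name], 10)
]

print(chosen)
print(len(one_stage), "people with option 1;", len(two_stage), "with option 2")
```

Note that the unselected clinics contribute nobody to the sample, unlike stratified sampling, where every stratum is represented.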
What is the desired internal relationship in stratified sampling?
Subjects in the same stratum are similar to one another regarding the stratifying factor (homogeneous).
What is the desired external relationship in stratified sampling?
Each stratum is different from other strata.
How are subjects included in the sample in stratified sampling?
All strata are represented in the sample.
What is the desired internal relationship in cluster sampling?
Subjects in the same cluster are different from one another regarding the factor of interest (heterogeneous).
What is the desired external relationship in cluster sampling?
Each cluster is similar to other clusters.
How are subjects included in the sample in cluster sampling?
Only a subset of clusters is included in the sample.
Types of non-random sampling methods
- Convenience sampling
- Purposive sampling
- Snowball sampling
- Volunteer sampling
What is convenience sampling?
Selecting participants who are close at hand, readily available, or convenient.
What is purposive sampling?
Researchers select participants because they have specific characteristics of interest.
When is purposive sampling often used?
It is often used in qualitative research when researchers are particularly interested in insights from certain types of people.
What is snowball sampling?
Starting with one or two eligible participants and then asking them to refer others to participate in the study.
When is snowball sampling useful?
It is useful for accessing difficult-to-reach or hidden populations.
What is volunteer sampling?
Participants self-select to participate in the study, often in response to an advertisement or call for participants.
Is non-random sampling generalizable to the broader population?
No, non-random sampling methods are not generally considered to be representative of the broader population.
When might non-random sampling methods be the only option?
Depending on the research question and population of interest, non-random sampling methods may be the only feasible option.
Why is estimating sample size important?
Estimating sample size ensures that the study has enough statistical power to detect meaningful and important results.
What are the consequences of having too small a sample size?
Too small a sample size may result in inconclusive or imprecise results because meaningful effects may not be detected.
What are the consequences of having too large a sample size?
Too large a sample size can lead to detecting statistically significant differences that are not practically important, wasting resources.
What factors should be considered when determining sample size?
Feasibility factors such as time, cost, and availability of participants should be considered, although they should not dictate sample size.
What should you consider before calculating sample size?
1. Consider the research question and study design
2. Planned sampling strategy
3. Consider the type of outcome measure
4. What is the required level of precision?
* What is your acceptable margin of error? How much error can be tolerated?
5. What level of confidence do you want to use?
* 95% is typical but definitely not mandatory
6. What are the expected population parameters?
* Is there an existing survey in a similar population? What does the literature say? What can you expect to find?
* Do you need to be able to detect a 2% difference between two groups? Would a 5 or 10% difference be more meaningful in the real world?
How is the minimum sample size estimated for a prevalence study?
The minimum sample size n for estimating a single proportion (prevalence) can be calculated using the formula n = z² × p × (1 − p) / d², where:
p is the anticipated population proportion (prevalence).
d is the desired precision or margin of error.
z is the critical value from the standard normal distribution corresponding to the desired confidence level (e.g., z = 1.96 for a 95% confidence level).
What does p represent in this formula?
p represents the expected proportion of the population exhibiting the characteristic of interest (prevalence).
What does d represent in this formula?
d represents the desired precision or margin of error around the estimated prevalence.
Why is z=1.96 used in the formula?
z = 1.96 is the critical value of the standard normal distribution for a 95% confidence level, so using it gives a 95% confidence interval around the estimated prevalence.
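A minimal sketch of the calculation, assuming the standard single-proportion formula n = z² × p × (1 − p) / d² with p, d, and z as defined on the cards above. The function name `sample_size_for_proportion` and the example inputs are made up for the illustration, and the result is rounded up because you cannot recruit a fraction of a person.

```python
import math


def sample_size_for_proportion(p, d, z=1.96):
    """Minimum n to estimate a prevalence p to within a margin of error d.

    p: anticipated prevalence (0 to 1)
    d: desired margin of error (same scale as p)
    z: critical value for the confidence level (1.96 for 95%)
    """
    return math.ceil(z**2 * p * (1 - p) / d**2)


# Anticipated prevalence of 50%, margin of error of 5 percentage points, 95% confidence.
n_half = sample_size_for_proportion(p=0.5, d=0.05)

# Anticipated prevalence of 20%, same precision and confidence.
n_fifth = sample_size_for_proportion(p=0.2, d=0.05)

print(n_half, n_fifth)
```

With p = 0.5 the formula gives 384.16, which rounds up to 385. Because p × (1 − p) is largest at p = 0.5, using 0.5 gives the most conservative (largest) sample size when the prevalence is unknown.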