Final Exam Material Flashcards
What is the definition of statistics? What are the two classifications?
Procedures for collecting, analyzing, interpreting, and presenting data
1) Descriptive Statistics
2) Inferential Statistics
What is the definition of descriptive statistics? An example?
Used when measuring characteristics of a group without intending to generalize beyond the group
Example: Mean ± SD: age, BMI, height
What is the definition of inferential statistics? An example?
Used when making generalizations or inferences from a smaller group (sample) to a larger group (population)
Examples: t-test or ANOVA; correlations: Pearson’s Correlation Coefficient, Simple Linear Regression, & Multiple Regression
What is the definition of a population? What is the value representing a characteristic of the population?
A large group to which the results of a study conducted on a sample from the group may be generalized
Value: parameter
What is the definition of a sample? What is the value representing a characteristic of the sample?
Individuals from the population who actually participate in the research -> a representative subset of the population
Value: statistic
What is a sampling frame?
A list of all those within a population who can be sampled (sample is taken from this)
What are the steps in sampling?
- Define the population by specifying the criteria (inclusion and exclusion) for selecting participants
- Develop the plan for sample selection
- Determine the sample size (e.g., with a power analysis)
What are the options for sample selection?
Random Sampling
- Simple Random Sampling
- Systematic Random Sampling
- Cluster Random Sampling
- Stratified Random Sampling
Non-Random Sampling
- Convenience (Volunteer) Sampling
What is the definition, advantage, and disadvantage of simple random sampling?
Every person in the population has an equal chance of being selected for the sample
- Advantage: unbiased, representative sample
- Disadvantage: Need a good sampling frame
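A minimal sketch of simple random sampling, assuming a hypothetical sampling frame of 100 ID numbers (the frame, seed, and sample size are made up for illustration):

```python
import random

# Hypothetical sampling frame: ID numbers for 100 people in the population.
sampling_frame = list(range(1, 101))

rng = random.Random(42)  # fixed seed so the draw is reproducible

# Draw 10 IDs without replacement; every ID has the same chance of selection.
sample = rng.sample(sampling_frame, k=10)
print(sorted(sample))
```

Note that the draw depends entirely on the frame: anyone missing from the list can never be selected, which is why a good sampling frame is needed.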
What is the definition, advantage, and disadvantage of systematic random sampling?
Every nth person from sampling frame is selected to participate
- Advantage: easier to administer than simple random sampling
- Disadvantage: may be biased if pattern in population
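A minimal sketch of systematic random sampling, assuming a hypothetical frame of 100 IDs and a random starting point within the first interval (the helper name `systematic_sample` is illustrative, not from the course material):

```python
import random

def systematic_sample(frame, n, rng):
    """Select every k-th member of the frame after a random start."""
    k = len(frame) // n        # sampling interval (the "nth person")
    start = rng.randrange(k)   # random starting position in the first interval
    return frame[start::k][:n]

# Hypothetical sampling frame: ID numbers 1..100.
sampling_frame = list(range(1, 101))
sample = systematic_sample(sampling_frame, n=10, rng=random.Random(0))
print(sample)
```

If the frame is ordered in a repeating pattern whose period matches the interval, the same kind of person is picked every time, which is the bias noted above.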
What is the definition, advantage, and disadvantage of cluster random sampling?
Specific clusters (groups) are randomly selected out of all possible clusters
- Advantages: time- and cost-effective (convenient and practical), large samples
- Disadvantages: May be biased/non-representative if clusters are different from each other
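A minimal sketch of cluster random sampling, assuming four hypothetical hospitals as clusters (names and patient IDs are invented for illustration):

```python
import random

# Hypothetical clusters: each hospital (cluster) maps to its patient IDs.
clusters = {
    "Hospital A": [101, 102, 103],
    "Hospital B": [201, 202],
    "Hospital C": [301, 302, 303, 304],
    "Hospital D": [401, 402, 403],
}

rng = random.Random(1)  # fixed seed so the draw is reproducible

# Randomly select whole clusters, not individuals.
chosen_clusters = rng.sample(sorted(clusters), k=2)

# Everyone in a chosen cluster enters the sample.
sample = [pid for name in chosen_clusters for pid in clusters[name]]
print(chosen_clusters, sample)
```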
What is the definition, advantage, and disadvantage of stratified random sampling?
Members of the sampling frame are divided into subgroups (strata)
- Random sample of participants then selected from each stratum
- Advantages: Good representative sample: captures key characteristics of the population
- Disadvantages: Difficult to administer, need detailed information about the population
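A minimal sketch of stratified random sampling, assuming two hypothetical strata (60 women, 40 men) and proportional allocation at 10% per stratum; the helper name and the allocation rule are illustrative assumptions:

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the given fraction from each stratum."""
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)   # proportional allocation
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical sampling frame already divided into strata by sex.
strata = {
    "women": list(range(1, 61)),    # 60 IDs
    "men": list(range(61, 101)),    # 40 IDs
}

sample = stratified_sample(strata, fraction=0.1, rng=random.Random(7))
print({name: len(ids) for name, ids in sample.items()})
```

The sample keeps the 60:40 split of the frame, which is how stratification captures a key characteristic of the population.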
What is the definition, advantage, and disadvantage of convenience sampling?
Investigators recruit easily available individuals who meet the criteria until the desired sample size is reached
- Advantage: very convenient
- Disadvantage: sample may be biased, self-selection bias
What type of sample selection is a survey of admissions of individuals with Type II diabetes to hospitals in the greater Dayton area?
Cluster
What type of sample selection is a study of risk factors for progression of osteoarthritis (assuming more women affected than men)?
Stratified
What type of sample selection is a study of rural access to health care in the different regions of Ohio?
Cluster
What type of sample selection is a survey of satisfaction and retention among PAs in Ohio?
Simple or systematic
What type of sample selection is a study of the effectiveness of resistance training on functional outcomes in patients with chronic obstructive pulmonary disease?
Convenience
What is the difference between random selection vs random assignment?
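RS: how participants are chosen from the population: more representative sample, more generalizable results (external validity)
RA: how participants are allocated to groups: more equivalent groups = good internal validity, better for causality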
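A minimal sketch of random assignment, assuming 20 hypothetical participants who have already been selected and are split into two equal groups (the IDs and group names are illustrative):

```python
import random

# Hypothetical participants who are already in the study.
participants = [f"P{i:02d}" for i in range(1, 21)]

rng = random.Random(3)        # fixed seed so the allocation is reproducible
shuffled = participants[:]    # copy, so the original order is untouched
rng.shuffle(shuffled)

# Random assignment: the first half goes to treatment, the rest to control.
half = len(shuffled) // 2
treatment, control = shuffled[:half], shuffled[half:]
print(treatment)
print(control)
```

Random selection would be the earlier step of drawing these 20 people from a sampling frame; random assignment only decides which group each one joins.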
What are the types of frequency distributions?
- Normal distribution: symmetrical bell-shaped curve
- Non-Normal distribution: highest frequencies of scores do not fall centrally but are shifted towards positive or negative extremes (positively, negatively skewed: where the tail is)
- Kurtosis: a vertical shift in the normal curve; the middle of the curve is elevated or flattened
What are the two aspects of descriptive statistics?
Measures of central tendency: mean, median, mode
Measures of variability: range, standard deviation, variance, standard error of the mean (SEM)
What are the measures of central tendency?
Extent to which values cluster in a data distribution plot
- Mean: the average of all the numbers
- Median: the middle number in the ordered list
- Mode: most frequently occurring number
Normal distributions: use mean
Skewed distributions: use median
What is the arrangement of the measures of central tendency in skewed distributions?
Positively: mode < median < mean
Negatively: mean < median < mode
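A small illustration of these orderings, using made-up scores (one long right tail, and its mirror image for the left tail):

```python
import statistics

# Made-up scores with one high outlier, giving a long right (positive) tail.
positively_skewed = [1, 2, 2, 3, 4, 5, 18]
# Mirror image of the same scores, giving a long left (negative) tail.
negatively_skewed = [20 - x for x in positively_skewed]

def central_tendency(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )

pos_mean, pos_median, pos_mode = central_tendency(positively_skewed)
neg_mean, neg_median, neg_mode = central_tendency(negatively_skewed)

print(pos_mode, pos_median, pos_mean)   # mode < median < mean
print(neg_mean, neg_median, neg_mode)   # mean < median < mode
```

The mean is pulled toward the tail by the extreme score while the median barely moves, which is why the median is preferred for skewed distributions.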
What are the measures of variability? (Elaborate on the first two)
How the scores vary; how they are dispersed around the measures of central tendency
- Range: the difference between the highest and lowest scores (e.g., age range)
- Standard deviation: a numerical indicator of the spread of values within a data set
- Variance
- Standard Error of the Mean
What could a large standard deviation around the mean indicate?
- wide spread of data
- presence of outliers
- sample size too small
What is variance?
The square of the standard deviation; used to calculate many other statistics
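A short sketch of range, standard deviation, and variance on hypothetical ages, assuming the sample (n - 1) formulas that Python's `statistics` module uses:

```python
import math
import statistics

# Hypothetical ages (years) for a small sample; values are illustrative only.
ages = [22, 25, 27, 30, 31]

age_range = max(ages) - min(ages)      # highest score minus lowest score
sd = statistics.stdev(ages)            # sample standard deviation (n - 1)
variance = statistics.variance(ages)   # sample variance (n - 1)

print(f"range = {age_range}, SD = {sd:.2f}, variance = {variance:.2f}")

# Variance is the square of the standard deviation.
assert math.isclose(sd ** 2, variance)
```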
What is the standard error of the mean? What makes it smaller and what does this mean?
An estimate of the expected difference between the sample mean and the population mean (SEM = SD / √n)
The smaller the standard deviation and the larger the sample size, the smaller the SEM
The smaller the SEM the greater the confidence that the sample mean accurately represents the population mean.
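A worked example of the SEM formula with made-up values, showing how the standard deviation and sample size each affect it (the function name `sem` is illustrative):

```python
import math

def sem(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

baseline = sem(sd=10, n=25)     # 10 / 5  = 2.0
larger_n = sem(sd=10, n=100)    # 10 / 10 = 1.0 (larger sample, smaller SEM)
smaller_sd = sem(sd=5, n=25)    # 5 / 5   = 1.0 (smaller SD, smaller SEM)

print(baseline, larger_n, smaller_sd)
```

Because n sits under a square root, the sample size has to be quadrupled to cut the SEM in half.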
What is the sampling error?
The difference between a calculated sample mean and the (unknown) population mean
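A minimal simulation of sampling error, assuming a hypothetical population whose mean is known (in a real study it is not, so the sampling error can only be estimated):

```python
import random
import statistics

# Hypothetical population of 100 scores; here its mean is known by construction.
population = list(range(1, 101))
population_mean = statistics.mean(population)   # 50.5

rng = random.Random(5)  # fixed seed so the draw is reproducible
sample = rng.sample(population, k=20)
sample_mean = statistics.mean(sample)

# Sampling error: how far this sample's mean falls from the population mean.
sampling_error = sample_mean - population_mean
print(f"sample mean = {sample_mean}, sampling error = {sampling_error}")
```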
Which hypothesis is typically tested in inferential statistics?
The null hypothesis
If the level of significance is 0.05 what does this mean for the chance that the study results are erroneous? What does it mean for the researcher’s confidence?
5% chance of the study results being in error, i.e., of rejecting the null hypothesis when it is actually true (Type I error)
95% confident that a difference declared significant is real and not a Type I error
When is an alpha level of 0.01 or 0.001 selected?
Medical interventions, pharmaceutical companies
What is the p value?
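The probability value, calculated by the computer: the probability of obtaining results at least as extreme as those observed if the null hypothesis were true (loosely, the probability of the results being due to chance). A measure of the strength of evidence against the null hypothesis.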
Low: strong evidence against null
High: weak evidence against null
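One way to see what a p value measures is an exact permutation test, sketched below on made-up scores. This is a swapped-in technique chosen because it needs only the standard library; it is not the t-test or ANOVA named in these cards. It counts how often a random relabeling of the groups gives a mean difference at least as large as the one observed.

```python
from itertools import combinations

# Made-up outcome scores for two groups of five participants.
treatment = [12, 14, 15, 16, 18]
control = [8, 9, 10, 11, 13]

def mean(values):
    return sum(values) / len(values)

observed = abs(mean(treatment) - mean(control))
pooled = treatment + control

extreme = 0   # relabelings at least as extreme as the observed difference
total = 0     # all possible relabelings
for idx in combinations(range(len(pooled)), len(treatment)):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    # Small tolerance guards against floating-point rounding in the means.
    if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
        extreme += 1

# Two-sided p value: share of relabelings as extreme as what was observed.
p_value = extreme / total
print(f"p = {p_value:.4f}")
```

Here only 4 of the 252 possible relabelings are as extreme as the observed split, so p is about 0.016: strong evidence against the null hypothesis at the 0.05 level.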
In clinical studies what are the concerns of type I and type II errors?
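Type I: patients receive a treatment that does not actually work (a true null hypothesis was rejected)
Type II: an effective treatment is rejected as ineffective and is no longer available (a false null hypothesis was not rejected)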
What is the β level typically?
0.2 (20%)
What is the most common reason for a type II error?
small sample size causing low power
What is the trade-off between the level of significance and type I and II errors?
Lower alpha decreases risk of Type I error and increases risk of Type II.
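A rough sketch of this trade-off and of the sample-size effect from the previous card, assuming a one-sided, one-sample z-test with a hypothetical standardized effect size of 0.5. The test type, effect size, and sample sizes are illustrative assumptions, not from the course material.

```python
import math
from statistics import NormalDist

def beta(effect_size, n, alpha):
    """Type II error rate (1 - power) for a one-sided, one-sample z-test."""
    z_crit = NormalDist().inv_cdf(1 - alpha)   # cutoff set by alpha
    power = 1 - NormalDist().cdf(z_crit - effect_size * math.sqrt(n))
    return 1 - power

# Same study (effect size 0.5, n = 25), two different alpha levels.
beta_alpha_05 = beta(effect_size=0.5, n=25, alpha=0.05)
beta_alpha_01 = beta(effect_size=0.5, n=25, alpha=0.01)

# Same alpha, smaller sample.
beta_small_n = beta(effect_size=0.5, n=10, alpha=0.05)

print(f"alpha=0.05, n=25: beta = {beta_alpha_05:.2f}")
print(f"alpha=0.01, n=25: beta = {beta_alpha_01:.2f}")
print(f"alpha=0.05, n=10: beta = {beta_small_n:.2f}")
```

Under these assumptions, lowering alpha from 0.05 to 0.01 raises β, and so does shrinking the sample from 25 to 10.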