Statistical Inference Flashcards
What is a two-tailed alternative hypothesis?
simply expect a difference to exist, without stating its direction (groups A and B will differ)
What is the null hypothesis H0?
there will be no difference
What is a one-tailed alternative hypothesis?
expect a difference and state in which direction
e.g., group A will do better than group B
What is the significance level (alpha)?
criterion used to decide whether to accept/reject H0
What does a significance level (alpha) of 0.05 signify?
- minimum standard established by scientific community
- when you find sufficient evidence to reject the null you can be 95% confident that there truly is a difference in the data because of the experimental manipulation
- but accept that, when H0 is true, results this extreme occur by chance alone 5% of the time
What does an alpha of 0.01 (significance level) signify?
- stricter level of significance
- when you find sufficient evidence to reject the null you can be 99% confident that there truly is a difference in the data because of the experimental manipulation
- but accept that, when H0 is true, results this extreme occur by chance alone 1% of the time
What is the study flow?
- estimate number of subjects needed to get reliable answer
- obtain sample(s) and assign to conditions
- collect data
- calculate basic summary statistics (central tendency and dispersion)
- choose statistical test based on types of variables and types of questions being asked
- apply the STATISTICAL TEST and obtain test statistic
- COMPARE test statistic to theoretical sampling distribution derived for the particular test you are using with a particular alpha-value as your criterion
- obtain a P-VALUE = the probability of getting a result at least as extreme as the one observed if H0 is correct (alpha is the threshold for p below which you reject H0, accepting the risk that H0 is actually correct)
- -> p < 0.05 indicates statistical significance
- decide to accept or reject H0
- derive conclusion that answers hypothesis
What are two types of Decision Errors?
- Type I (alpha)
- Type II (beta)
What is the study flow (includes early steps)?
- state H1 (alternative hypothesis)
- define population and variables
- identify outcome variables
- state H0, null hypothesis
- declare significance level (alpha)
- estimate number of subjects needed to get reliable answer
- obtain sample(s)
- collect data
- calculate summary stats (central tendency and dispersion)
- choose statistical test based on types of variables and types of questions being asked
- apply stat test and obtain test stat
- compare test stat to theoretical sampling distribution
- obtain p-value
- decide to accept/reject H0
- derive conclusion
What is Type I (alpha) error?
reject H0 when it’s true (more serious of two errors)
say something happened when it just happened by chance
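The idea of a Type I error can be illustrated with a small simulation. This is a minimal sketch, not a standard procedure: the function name `type_i_error_rate`, the sample size, the number of trials, and the use of a z-test with a known standard deviation of 1 are all assumptions made for illustration. Both groups are drawn from the same population, so H0 is true and every rejection is a Type I error.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Fraction of simulated experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    # Two-tailed cutoff for the chosen alpha (about 1.96 when alpha = 0.05)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Both groups come from the SAME population (mean 0, sd 1), so H0 is true
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        # z statistic with known sd = 1: difference in means / standard error
        z = (mean(a) - mean(b)) / (2 / n) ** 0.5
        if abs(z) > z_crit:
            rejections += 1  # rejected a true H0: a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(rate)
```

The printed rate should land close to the chosen alpha, which is what "accepting a 5% chance of a false positive" means in practice.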
What is Type II (beta) error?
accept H0 when it’s false (less serious)
How can errors be minimized?
- by good design
- sufficient power
- but error cannot be eliminated
What is the t-test used for?
when comparing means for two samples
What is the unpaired t-test?
- typically have control and treatment/experimental groups, each with different subjects
e.g., one group of hypertensive patients gets a new drug (treatment group) and the other gets sugar pills (control/placebo group)
- has less power than the paired t-test
What are some characteristics of the t-test?
- want to determine if the difference between means for each of two groups occurred because of the treatment or chance
- -> H0: no significant difference between means
- -> H1: can be either 1 (treatment group will have a higher mean score than control group) or 2-tailed (there is a significant difference between means)
- t-test calculation basically takes the difference between the 2 means and divides it by the standard error of the difference (computed from the variances and sample sizes of the two groups) - this takes into account the central tendency of the 2 groups and an estimate of the dispersion of the data
- formula yields single value called t-statistic
- compare calculated t-statistic with theoretical sampling distribution for the t-distribution (tables are found in statistics books or online) to decide if accept/reject H0
- -> need alpha value and degrees of freedom (number of subjects minus number of parameters (2))
- -> if |t-stat| > table value, reject H0; if |t-stat| < table value, accept H0
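The calculation described above can be sketched in a few lines. This is an illustration under stated assumptions: it uses the pooled-variance (equal variances) form of the unpaired t-test, which matches the "number of subjects minus 2" degrees of freedom; the function name `unpaired_t` and the two small data sets are hypothetical; the critical value 2.306 is the two-tailed table entry for alpha = 0.05 and 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2  # number of subjects minus 2 estimated means
    # Weighted average of the two sample variances
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two means
    se_diff = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / se_diff
    return t, df

# Hypothetical scores, invented for this example
control = [1, 2, 3, 4, 5]
treatment = [3, 4, 5, 6, 7]

t_stat, df = unpaired_t(treatment, control)
T_CRIT = 2.306  # table value: two-tailed, alpha = 0.05, df = 8
reject_h0 = abs(t_stat) > T_CRIT
print(t_stat, df, reject_h0)
```

With these numbers the t statistic is smaller than the table value, so H0 is not rejected even though the sample means differ by 2.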
What is the trade-off between the two types of error?
as you decrease the probability of making a Type I Error you increase the probability of making a Type II Error
*as Type I is more serious, most people fix the Type I error rate (typically at 0.05)
When you reject H0, but H0 is true, what type of error is this?
Type I
When you accept H0 but H0 is not true, what type of error is this?
Type II
What does a confidence limit attempt to do?
- capture population parameters
- range of values around the mean (or other measure of central tendency) within which you are X% sure the population value falls, using confidence levels/limits
- this is used because sample statistics only estimate population statistics, can’t actually get population statistics
What do confidence limits depend on?
- standard deviation of sample data (smaller yields narrower margin of error)
- sample size (larger yields narrower margin of error)
- level of confidence desired (95% or 99%)
- -> 95% tighter (narrower margin of error) than 99% (to be more certain, 99% needs bigger margin of error)
- formula to calculate confidence limits depends on type of data and test
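The three dependencies listed above can be seen in one common formula. This sketch assumes the normal-approximation margin of error for a mean, z * sd / sqrt(n), which is only one of the possible formulas; the function name `margin_of_error` and the example numbers are hypothetical.

```python
from math import sqrt

Z_95 = 1.96   # z value for 95% confidence (normal approximation)
Z_99 = 2.576  # z value for 99% confidence

def margin_of_error(sd, n, z=Z_95):
    """Half-width of the confidence interval for a mean."""
    return z * sd / sqrt(n)

base = margin_of_error(sd=10, n=100)                    # reference case
smaller_sd = margin_of_error(sd=5, n=100)               # less spread in the data
larger_n = margin_of_error(sd=10, n=400)                # more subjects
more_confident = margin_of_error(sd=10, n=100, z=Z_99)  # 99% instead of 95%

print(base, smaller_sd, larger_n, more_confident)
```

A smaller standard deviation or a larger sample narrows the margin, while asking for 99% confidence widens it.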
What is statistical power?
Power (1 - beta) = probability that you correctly reject H0 when H1 is true
- typically set at 80%: power of 80% means that when H0 is truly false and there is a true treatment/experimental effect, a significant difference will be detected 80% of the time
- increasing power reduces the probability of making a Type II error
Increasing power reduces the probability of making which type of error?
Type II
What are methods of increasing power?
- raise alpha to 0.1 (a more lenient criterion): easier to say a difference is significant - generally not accepted, as 0.05 is the minimum standard
- 1-tailed instead of 2-tailed H1 - sometimes feasible
- increase effect size (differences between means) - can’t control
- decrease variance/standard deviation - can’t control
- increase sample size = best and most effective method
- can work backwards before starting experiment and calculate what sample size would give you sufficient power
- -> need significance level (alpha)
- -> estimated std dev (from literature)
- -> estimated effect size (from literature)
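Working backwards to a sample size might look like the sketch below. It assumes a two-group comparison of means, a two-tailed test, and the normal-approximation formula n = 2 * ((z_alpha + z_beta) * sd / effect)^2 per group; the function name `sample_size_per_group` and the input values are hypothetical, and an exact t-based calculation would give a slightly larger answer.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect `effect` with the given power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed significance cutoff
    z_beta = NormalDist().inv_cdf(power)           # cutoff matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

# Estimated std dev and effect size would normally come from the literature
n_needed = sample_size_per_group(effect=5, sd=10)
n_big_effect = sample_size_per_group(effect=10, sd=10)
print(n_needed, n_big_effect)
```

The comparison shows why effect size matters: doubling the expected difference cuts the required sample to roughly a quarter.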
What is the best way to increase power?
increase sample size
What are some characteristics of a paired t-test?
- typically have same subjects in both groups
for example, in a pre-post (before and after) design
e.g., single group of hypertensive patients has BP measured before going on any drug, go on the drug for 6 weeks then measure blood pressure again and compare pre and post levels
*takes into account that same subjects measured twice and thus there are correlations or common relationships between the two sets of data
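A paired t-test works on the within-subject differences rather than on the two sets of raw scores. The sketch below assumes that standard form (mean difference divided by its standard error); the function name `paired_t` and the blood pressure readings are invented for illustration, and 2.776 is the two-tailed table value for alpha = 0.05 with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """t statistic and degrees of freedom for paired (pre/post) measurements."""
    # One difference per subject, which removes between-subject variability
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical systolic BP for 5 patients before and after 6 weeks on the drug
bp_before = [150, 160, 155, 165, 170]
bp_after = [140, 155, 150, 155, 160]

t_stat, df = paired_t(bp_before, bp_after)
T_CRIT = 2.776  # table value: two-tailed, alpha = 0.05, df = 4
reject_h0 = abs(t_stat) > T_CRIT
print(round(t_stat, 3), df, reject_h0)
```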
What is ANOVA used for?
comparing 3+ means
*analysis of variance
What does the ANOVA do?
- compares variability between the treatment/experimental groups (between-group variance) to variability of individual subjects within each group (within-group variance)
- assumes data normally distributed and similar variances
- F-statistic examined for significance using F-distribution
- H0 = 3+ means do not differ; H1 = 3+ means differ
- -> if reject null all you can say is the means differ - cannot say exactly which ones differ
- -> need to conduct follow-up tests that then compare the means to determine which ones differ
- numerous variations of ANOVA depending on study design and variance relationships
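The F-statistic for the simplest case, a one-way ANOVA with independent groups, can be sketched as the ratio of between-group to within-group variance. The function name `one_way_anova_f` and the data are hypothetical; 5.14 is the table value of the F-distribution for alpha = 0.05 with 2 and 6 degrees of freedom.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic with its between-group and within-group degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: how far each subject is from its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Three hypothetical treatment groups with 3 subjects each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
F_CRIT = 5.14  # table value: alpha = 0.05, df = (2, 6)
reject_h0 = f_stat > F_CRIT
print(f_stat, df_b, df_w, reject_h0)
```

Rejecting H0 here only says that the three means are not all equal; a follow-up test is still needed to find which ones differ.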
What is the Chi-Squared or Goodness of Fit test used for?
- do observed (collected) data fit the expected pattern (chance), or are there trends in the distribution?
positive correlation
0 < r ≤ 1
- score high on 1 variable and score high on the other
- score low on 1 variable and score low on the other
- positive slope when plot the data
- 1 = perfect correlation (can’t exceed 1)
negative correlation
-1 ≤ r < 0
- score high on 1 variable and score low on the other
- negative slope when plot the data
- -1.0 = perfect negative correlation (can’t go below -1)
no correlation relates to what correlation coefficient?
r = 0
- no correlation or no linear relationship (other relationships exist, but correlation only measures linear)
What are some characteristics of correlation and correlation coefficient?
- correlation does NOT equal causation
- interpreting r values
|r| ≤ 0.29: small correlation, weak relationship
|r| 0.3 - 0.49: medium correlation/relationship
|r| 0.5 - 1.0: large correlation, strong relationship
- calculate the Spearman r if have ranked data
What is the one-variable chi-square (X^2) test used for?
- used with typical survey questions where the subject picks from a set of pre-set categorical answers
e.g., how much do you agree with the statement “compared to 5 years ago I take better care of myself”: strongly disagree, disagree, neutral, agree, strongly agree
- -> with 20 subjects, by chance would expect 4 people to answer in each category
- -> do the observed responses differ significantly from 4 in each: H0 = no, H1 = yes
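The survey example can be worked through as a goodness-of-fit calculation: sum (observed - expected)^2 / expected over the categories. The observed counts below are invented for illustration, the function name `chi_square_goodness_of_fit` is hypothetical, and 9.488 is the chi-square table value for alpha = 0.05 with 4 degrees of freedom (5 categories minus 1).

```python
def chi_square_goodness_of_fit(observed, expected):
    """Chi-square statistic comparing observed counts with expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical answers from 20 subjects across the 5 response categories
observed = [2, 3, 4, 5, 6]
expected = [4, 4, 4, 4, 4]  # 20 subjects / 5 categories if answers were by chance

chi2 = chi_square_goodness_of_fit(observed, expected)
CHI2_CRIT = 9.488  # table value: alpha = 0.05, df = 4
reject_h0 = chi2 > CHI2_CRIT
print(chi2, reject_h0)
```

Here the statistic falls well below the table value, so these responses do not differ significantly from 4 in each category.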
What is coefficient of determination (r^2)?
gives the proportion of the variance in 1 variable explained by the other
e.g., if correlation between height and weight = 0.80, then r^2 = 0.64, meaning 64% of the variance in weight is explained by height and 36% is explained by other variables
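To see where r and r^2 come from, the sketch below computes the Pearson r from raw data and then squares it. The function name `pearson_r` and the five data pairs are hypothetical; they are not height and weight measurements from any study.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    # Sum of cross-products of deviations from each mean
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # proportion of variance in y explained by x
print(round(r, 3), round(r_squared, 2))
```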
What is a multi-variable X^2 test used for?
- proportions/frequencies/percents of observed categorical values for 2+ groups (minimum = 2x2) in 2+ conditions
- -> is there a difference in the proportion of males vs females benefitting from low dose aspirin in terms?
- -> calculating expected values more complicated than single-variable situation but still finding “goodness of fit” between expected and observed
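For a table with 2+ groups, the expected count in each cell is (row total * column total) / grand total, and the statistic is computed as in the one-variable case. The sketch below assumes that standard test of independence; the function name `chi_square_independence` and the 2x2 counts are invented, and 3.841 is the chi-square table value for alpha = 0.05 with 1 degree of freedom.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if group and outcome were unrelated
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = males, females; columns = benefited, did not benefit
table = [[10, 20],
         [20, 10]]
chi2, df = chi_square_independence(table)
CHI2_CRIT = 3.841  # table value: alpha = 0.05, df = 1
reject_h0 = chi2 > CHI2_CRIT
print(round(chi2, 3), df, reject_h0)
```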
What is Fisher’s Exact Test?
- version of Chi-square test used when the outcome of interest occurs infrequently, so the data are “lopsided” and one variable has too few counts (e.g., does alcohol reduce the rate of cardiac disease?)
- -> formula to get probabilities is complicated but still finding what would have occurred by chance compared to observed
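The probabilities come from counting how many ways each possible 2x2 table could arise with the row and column totals held fixed (the hypergeometric distribution). The sketch below is one common two-sided version, which adds up the probabilities of every table no more likely than the observed one; the function name `fisher_exact_two_sided` and the counts are hypothetical.

```python
from fractions import Fraction
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_prob(k):
        # Probability that the top-left cell equals k, given fixed margins
        return Fraction(comb(row1, k) * comb(row2, col1 - k), comb(total, col1))

    observed = table_prob(a)
    lowest = max(0, col1 - row2)   # smallest possible top-left count
    highest = min(row1, col1)      # largest possible top-left count
    # Add up every table that is as unlikely as, or less likely than, the observed one
    return sum(p for p in map(table_prob, range(lowest, highest + 1)) if p <= observed)

# Hypothetical small-count table: rows = exposed, unexposed; columns = disease, no disease
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(p_value, float(p_value))
```

Using `Fraction` keeps the probabilities exact, which avoids rounding problems when deciding whether a table is "as unlikely as" the observed one.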
In what studies is an Odds Ratio (OR) used?
used in case-control study
*case-control study = group of cases (those with disease) and controls (those without) assembled and exposure histories ascertained to compute measures of association between exposure and risk
What is a case-control study?
group of cases (those with disease) and controls (those without) assembled and exposure histories ascertained to compute measures of association between exposure and risk
What are the characteristics of an Odds Ratio?
- used in case-control studies
- looking at retrospective data for the most part
- outcome = lung cancer or no lung cancer; history = cigarette exposure or not
- odds = probability of exposure to cigarettes / probability of no exposure to cigarettes
- calculate for cancer and no cancer groups
- Odds Ratio (OR) = odds in cancer group / odds in no cancer group
- OR > 1 = association exists between exposure and disease (farther from 1 = stronger association)
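The lung cancer example can be turned into a short calculation. The function name `odds_ratio` and all four counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls

# Hypothetical case-control counts
# lung cancer group: 80 smokers, 20 non-smokers -> odds of exposure = 4
# no cancer group:   40 smokers, 60 non-smokers -> odds of exposure = 0.67
or_value = odds_ratio(80, 20, 40, 60)
print(round(or_value, 2))
```

With these counts the odds of cigarette exposure are about 6 times higher in the cancer group than in the no cancer group.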
Odds vs. Odds Ratio?
Odds are easier to report to public
Odds Ratio is a way of normalizing things towards 1
What is Relative Risk used for?
used more in cohort studies than in case-control studies (which use the OR)
What are characteristics of Relative Risk?
- outcome = probability that particular event will happen over time
- can only be determined prospectively
- RR = incidence of disease in exposed/incidence of disease in non-exposed
- RR > 1 = risk exists (farther from 1 = greater risk)
- used more in cohort studies than case control
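The RR formula above amounts to dividing two incidence proportions. The function name `relative_risk` and the cohort counts are hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence of disease in the exposed divided by incidence in the non-exposed."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed / incidence_unexposed

# Hypothetical cohort followed over time
# exposed:     30 of 100 develop the disease
# non-exposed: 10 of 100 develop the disease
rr = relative_risk(30, 100, 10, 100)
print(round(rr, 2))
```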
RR vs OR?
RR: prospective
OR: retrospective
What is correlation?
examines strength and direction of relationship between 2 variables
–> can extend to 3+ variables using multiple correlation
What is the correlation coefficient (r)?
measure used to express extent or strength of relationship; often referred to as Pearson r
- positive correlation: 0 < r ≤ 1; score high on 1 variable and score high on the other; score low on 1 variable and score low on the other; positive slope when you plot the data; 1.0 = perfect correlation
What is regression (r)?
using correlation in models of prediction
- if linear relationship exists between 2 variables can use that to calculate equation of line that best represents relationship, then use to predict what one variable (weight) would be if know value for other (height)
- can use multiple regression techniques with 3+ variables
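Finding the line that best represents a linear relationship is usually done with least squares, and the sketch below assumes that method for a single predictor. The function name `fit_line` and the data are hypothetical; `x` stands in for the known variable (such as height) and `y` for the one being predicted (such as weight).

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx  # the fitted line passes through the two means
    return slope, intercept

# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
predicted = intercept + slope * 6  # predict y for a new x value of 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```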
What is inferential statistics?
estimating parameters of a population from a sample
Example:
sample mean for BMI = 20.4
calculated 95% confidence margin (on either side of the mean) = 3.5
What is the confidence interval? What does it mean?
16.9 - 23.9
We can be 95% certain the population mean falls between these limits
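The limits in this example follow from simple arithmetic, taking the 3.5 as the margin on either side of the sample mean. The function name `confidence_interval` is hypothetical.

```python
def confidence_interval(sample_mean, margin):
    """Lower and upper confidence limits around a sample mean."""
    return sample_mean - margin, sample_mean + margin

lower, upper = confidence_interval(20.4, 3.5)
print(round(lower, 1), round(upper, 1))
```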
What are three possible inferential tests for categorical data?
- Chi-square
- Fisher’s Exact
- OR vs RR
If have ranked data, what type of correlation coefficient do you find?
Spearman r