Correlation, Regression, Sample Size, & Power Flashcards
Correlation and regression
Advantages: maintain continuity of data, model one variable as a function of the other
Disadvantages: measures only linear relationships, useful only when both variables are continuous
2 x 2 tables
Advantages: ease of interpretation, no distributional assumptions, can easily stratify by other variables, can calculate OR or RR
Disadvantages: a continuous variable must be dichotomized at an arbitrary cutoff, which loses information
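As a rough illustration of how OR and RR come out of a 2 x 2 table, here is a minimal Python sketch. The cell labels a, b, c, d, the function name, and the counts are assumptions made for this example, not part of the original cards.

```python
def odds_ratio_and_relative_risk(a, b, c, d):
    """Compute OR and RR from a 2 x 2 table.

    Assumed layout (hypothetical labels):
                   outcome+   outcome-
      exposed         a          b
      unexposed       c          d
    """
    risk_exposed = a / (a + b)        # proportion with the outcome among exposed
    risk_unexposed = c / (c + d)      # proportion with the outcome among unexposed
    relative_risk = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)    # cross-product ratio
    return odds_ratio, relative_risk


# Made-up counts, for illustration only
odds_ratio, relative_risk = odds_ratio_and_relative_risk(20, 80, 10, 90)
print(odds_ratio, relative_risk)
```

With these made-up counts the OR is 2.25 and the RR is 2.0; the two differ because the outcome is not rare in this table.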
2-way scatter plots
The first step in analyzing association between 2 continuous variables X and Y
Pearson correlation coefficient (r)
Measures the strength and direction of the linear relationship between two continuous variables (X and Y). Ranges from -1 (perfect negative correlation) to 1 (perfect positive correlation); r = 0 indicates no linear relationship
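A minimal Python sketch of the Pearson coefficient, using only the standard library. The function name and the toy data are assumptions chosen for illustration; the second data set shows that a perfectly determined but non-linear relationship can still give r = 0.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
y_line = [2 * xi + 1 for xi in x]       # exactly linear in x
y_curve = [(xi - 3) ** 2 for xi in x]   # symmetric U-shape around x = 3

r_line = pearson_r(x, y_line)
r_curve = pearson_r(x, y_curve)
print(r_line, r_curve)
```

Here r_line is 1 (up to floating-point rounding) and r_curve is 0, even though y_curve is completely determined by x.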
R^2 coefficient of determination
R^2 is the proportion of the total variability in Y that can be explained by the linear association between Y and X. In simple linear regression, R^2 equals the square of the Pearson correlation coefficient r
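The "proportion of variability explained" can be sketched in Python by fitting a least-squares line and comparing residual variation to total variation. The function name and the data values below are invented for illustration.

```python
import math


def r_squared_and_r(x, y):
    """Fit y = intercept + slope * x by least squares; return (R^2, r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares: variability in Y left unexplained by the line
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = syy  # total variability in Y around its mean

    r2 = 1 - ss_res / ss_tot
    r = sxy / math.sqrt(sxx * syy)
    return r2, r


# Made-up data, roughly linear
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r2, r = r_squared_and_r(x, y)
print(r2, r ** 2)
```

For a single predictor the two printed values agree up to rounding, since R^2 is then the square of r.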
Type I error
Incorrectly rejecting H0 when H0 is true; the probability of a type I error (α) is also called the significance level
Type II error
Incorrectly failing to reject H0 when H1 is true; the probability of a type II error is denoted β
Power
Power is the probability that we reject H0 when H1 is true; power = 1 - β, where β is the probability of a type II error
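One way to see how power depends on sample size is a one-sided, one-sample z-test with known standard deviation. This setting is an assumption made for the sketch, as are the function name and the parameter values. `NormalDist` is from the Python standard library (3.8+).

```python
from statistics import NormalDist


def z_test_power(delta, sigma, n, alpha=0.05):
    """Power of a one-sided z-test of H0: mu = mu0 against H1: mu = mu0 + delta.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection cutoff under H0
    shift = delta * n ** 0.5 / sigma         # mean of the z statistic under H1
    return 1 - std_normal.cdf(z_crit - shift)


# Illustrative values: true difference of 0.5 standard deviations
power_n25 = z_test_power(delta=0.5, sigma=1.0, n=25)
power_n50 = z_test_power(delta=0.5, sigma=1.0, n=50)
power_null = z_test_power(delta=0.0, sigma=1.0, n=25)
print(power_n25, power_n50, power_null)
```

Under these assumptions, power is roughly 0.80 at n = 25 and rises with n. When delta = 0 (H0 true), the rejection probability equals alpha, which is the type I error rate.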