Lecture 6: Statistical Testing II Flashcards
What is the ANOVA test (Analysis of variance)?
A technique used to compare means among three or more independent populations with one test.
Each individual falls into a group described by a categorical grouping variable, and for each individual we measure some continuous outcome.
What question does the P value from ANOVA answer?
Are there any differences in mean among the groups studied?
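A minimal sketch of the F statistic behind a one-way ANOVA, using only the Python standard library. The function name `one_way_anova_f` and the blood pressure readings are made up for illustration. The P value is then read from the F distribution with the two degrees of freedom; a library routine such as `scipy.stats.f_oneway` would typically be used for that step.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individuals around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical systolic blood pressure readings (mmHg), not real data.
ex_smokers = [120, 125, 130]
current_smokers = [135, 140, 145]
passive_smokers = [125, 130, 135]

f_stat, df_b, df_w = one_way_anova_f(
    [ex_smokers, current_smokers, passive_smokers]
)
print(f_stat, df_b, df_w)
```

A large F means the group means differ by more than the spread within groups would suggest, which is what a small P value reflects.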
Why use ANOVA?
It is similar to the t-test in that it compares means
ANOVA is better suited than the t-test when there are three or more groups:
–Robust design
–Increases statistical power in comparison to the t-test
The kind of questions that can be answered using ANOVA?
- Is there a difference in the mean blood pressure for three different types of smokers (ex-smoker, current smoker, passive smoker)?
- A study that compares students' test performance following various teaching techniques (online, tutorials, lecture-based)
- Is there a difference between the average number of times articles are shared on social
What is Linear regression?
- Assumes a linear relationship between one or more variables and an outcome
- Looks at the extent to which a change in one (or many) variables (X) is uniquely associated with an outcome (Y)
What is Linear regression used for?
Estimating the relationship between two continuous variables
Continuous outcomes
What is the simple linear regression equation?
y = mx + c
- y = outcome or dependent variable
- c = intercept of the line on the y axis
- m = slope (how much y changes with every unit increase in x)
- x = independent variable
Explain the application of simple linear regression
Example: The linear relationship between the age of a driver and the maximum distance at which a highway sign was legible
Regression equation: y = a + bx
- y = distance (dependent variable)
- a = intercept of the line on the y axis
- b = slope (how much y changes with every unit increase in x)
- x = age (independent variable)
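A rough sketch of how a and b in y = a + bx could be estimated, assuming ordinary least squares (the usual fitting method, though the card does not name one). The ages and distances are invented and perfectly linear so the result is easy to check by hand; `fit_line` is a hypothetical helper name.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope: change in y per unit increase in x
    a = y_bar - b * x_bar    # intercept: value of y when x is 0
    return a, b

# Hypothetical data: driver age (years) and legibility distance (feet).
ages = [20, 30, 40, 50, 60]
distances = [560, 530, 500, 470, 440]

a, b = fit_line(ages, distances)
predicted_45 = a + b * 45    # predicted distance for a 45-year-old driver
print(a, b, predicted_45)
```

With this made-up data the slope is negative: each extra year of age is associated with a shorter legibility distance.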
What is the difference between correlation and simple linear regression?
Simple linear regression is similar to correlation in that the purpose is to measure to what extent there is a linear relationship between two variables.
- The major difference between the two is that:
- Correlation: provides information on the direction and strength of the relationship
- Linear regression: allows us to “predict” the value of the dependent variable based upon the values of one or more independent variables.
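The contrast can be sketched on one small data set: the correlation coefficient r is a single unitless number for direction and strength, while the fitted line yields a prediction for a new x. This assumes Pearson's r and a least-squares line, neither of which the card names explicitly; the data and helper names are made up.

```python
from math import sqrt
from statistics import mean

def sums_of_squares(xs, ys):
    """Return (sxx, syy, sxy), the centred sums of squares and products."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxx, syy, sxy

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sxx, syy, sxy = sums_of_squares(xs, ys)

# Correlation: direction (sign) and strength (magnitude between 0 and 1).
r = sxy / sqrt(sxx * syy)

# Regression: an equation that can be used to predict y from x.
slope = sxy / sxx
intercept = mean(ys) - slope * mean(xs)
predicted_at_6 = intercept + slope * 6

print(r, slope, intercept, predicted_at_6)
```

The two are linked: the slope equals r multiplied by the ratio of the spread of y to the spread of x, so they always share the same sign.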
What is Simple linear regression?
Simple linear regression is a statistical method that generates an equation to summarize and study the relationship between two (quantitative) variables, where one variable has an impact on the other.
•X: the predictor or independent variable.
•Y: the response or dependent variable.
•Simple linear regression concerns the study of only one independent (predictor) variable.
What is Multiple linear regression?
Multiple linear regression is a regression model that contains more than one regressor variable.
Multiple linear regression is a statistical method that generates an equation to summarize and study relationships between multiple (quantitative) variables, where each variable has a unique impact on the outcome.
•More commonly used than simple linear regression
y = β_0 + β_1 x_1 + β_2 x_2 + β_3 x_3 + …
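As an illustration of where the β coefficients could come from, here is a small sketch that fits two predictors by least squares via the normal equations, in plain Python. The fitting method, the helper names `solve` and `fit_multiple`, and the data are all assumptions for illustration; in practice a statistics package would be used. The outcome is generated exactly from β_0 = 1, β_1 = 2, β_2 = 3 so the fit can be checked.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting. a is a list of rows, b a list of numbers."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_multiple(rows, ys):
    """Least-squares betas for y = b0 + b1*x1 + b2*x2 + ..."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical predictors (x1, x2) and an outcome built from known betas.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

betas = fit_multiple(rows, ys)
print(betas)  # approximately [1, 2, 3], up to floating-point rounding
```

Each β_i is the change in y for a one-unit increase in x_i with the other predictors held fixed, which is the "unique impact" the card refers to.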
What is Logistic regression?
Similar to linear regression, but…
Outcome is categorical or binary/dichotomous
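For a binary outcome, the standard logistic model passes the same kind of linear predictor through the logistic (sigmoid) function so the result is a probability between 0 and 1. The sketch below assumes one predictor and uses invented coefficients; `predicted_probability`, `b0` and `b1` are hypothetical names, not taken from the lecture.

```python
from math import exp

def predicted_probability(x, b0, b1):
    """Probability that the binary outcome is 1 for a given x."""
    linear = b0 + b1 * x               # same form as linear regression
    return 1.0 / (1.0 + exp(-linear))  # squashed into the range (0, 1)

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -4.0, 0.1

p_20 = predicted_probability(20, b0, b1)
p_40 = predicted_probability(40, b0, b1)  # linear part is 0 here
p_60 = predicted_probability(60, b0, b1)
print(p_20, p_40, p_60)
```

When the linear part is 0 the predicted probability is exactly 0.5, and with a positive b1 the probability rises as x increases.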
What is Survival analysis?
Survival analysis is used when the dependent variable is the time to some event.
What are the types of event in a survival analysis?
An event could be death, recovery, relapse, reoffending, etc.
What does Survival analysis account for?
Time to event
Loss to follow-up
Differences in follow up time
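One common way to handle all three points is the Kaplan-Meier estimator. The lecture cards do not name it, so treat this as an assumed example rather than the course's method. Individuals lost to follow-up are "censored": they count as at risk until they leave, but not as events. The data and the function name `kaplan_meier` are made up.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an event.

    times[i]  : follow-up time for individual i
    events[i] : True if the event was observed, False if censored
                (for example, lost to follow-up).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events at time t
        leaving = 0    # everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up times (months); False marks a censored individual.
times = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]

curve = kaplan_meier(times, events)
print(curve)
```

Censored individuals shrink the at-risk group without lowering the survival estimate, which is how unequal follow-up times are accommodated.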