IA2 - Exam Flashcards
Bivariate Data
define explanatory variable
- also known as independent variable
- used to explain or predict value of response variable
Bivariate Data
define response variable
- also called dependent variable
- changes in response to the explanatory variable
Bivariate Data
P(event) = ?
P(event) = (number of successful outcomes) / (total number of outcomes), when all outcomes are equally likely
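Worked example (a minimal sketch, not from the original cards): the formula applied to rolling an even number on a fair six-sided die. The function name `probability` and the die scenario are assumptions chosen for illustration.

```python
from fractions import Fraction

def probability(successful_outcomes, total_outcomes):
    """P(event) = successful outcomes / total outcomes (equally likely outcomes assumed)."""
    return Fraction(successful_outcomes, total_outcomes)

# Hypothetical scenario: even numbers on a fair die are 2, 4, 6 (3 of the 6 outcomes).
p_even = probability(3, 6)
print(p_even, float(p_even))
```

Using `Fraction` keeps the answer exact, which matches how probabilities are usually written in working.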
Bivariate Data
how can we tell if there is an association based on percentages?
- if the percentages are very different, there IS an association
- if they are similar, there is NO association
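Worked example (a sketch with made-up counts): turning a two-way table into row percentages so the groups can be compared. The table `counts`, its categories, and the helper `row_percentages` are hypothetical and only illustrate the comparison described on this card.

```python
# Hypothetical two-way table: rows are the explanatory variable (age group),
# columns are the response variable (plays sport or not).
counts = {
    "under 30": {"plays sport": 45, "no sport": 15},
    "30 and over": {"plays sport": 20, "no sport": 60},
}

def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())
        result[group] = {category: 100 * n / total for category, n in row.items()}
    return result

percentages = row_percentages(counts)
for group, row in percentages.items():
    print(group, row)
```

Here 75% of the under-30 group play sport compared with 25% of the other group. The percentages are very different, so this made-up data suggests an association.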
Bivariate Data
What are the 6 features of a scatterplot?
- explanatory variable: x-axis
- response variable: y-axis
- title, axis label (units)
- Arrows
- use ‘lightning bolt’ to show not starting at 0
- use an appropriate scale
Bivariate Data
what are the 2 types of Form (type) used to describe patterns/associations?
- linear
- non-linear
Bivariate Data
what are the 2 types of direction used to describe patterns/associations?
- positive
- negative
Bivariate Data
what are the 5 types of strength used to describe patterns/associations?
- no correlation
- weak
- moderate
- strong
- perfect
Bivariate Data
define Pearson’s correlation coefficient (r)
- does not tell you whether there is an association
- instead assumes the association is linear
- measures its strength and direction
Bivariate Data
how can you tell direction and strength from correlation coefficient?
- direction = sign of r (positive or negative)
- strength = size of r ignoring the sign (closer to 1 = stronger, closer to 0 = weaker)
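Worked example (a sketch using made-up data): calculating Pearson's r by hand-style formula, then reading direction from the sign and strength from the size. The data lists `hours` and `marks` and the function name `pearson_r` are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied (explanatory) and test mark (response).
hours = [1, 2, 3, 4, 5]
marks = [52, 60, 61, 70, 77]

r = pearson_r(hours, marks)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
strength = abs(r)  # size of r ignoring the sign
print(f"r = {r:.3f}, direction = {direction}, strength value = {strength:.3f}")
```

For this data r is positive and close to 1, so the association would be described as strong and positive.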
Bivariate Data
define coefficient of determination (r squared)
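- R^2 tells us what proportion of the variation in the response variable is explained by its linear association with the explanatory variable
- i.e. if R^2 = 0.82, then 82% of the variation in the response variable is explained by the explanatory variable. The other 18% is due to other factors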
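Worked example (a sketch, assuming a straight-line fit with one explanatory variable, where R^2 is simply r squared): converting a correlation coefficient into a percentage. The value r = -0.9 and the function name are made up for illustration.

```python
def coefficient_of_determination(r):
    """Square Pearson's r to get R^2 (one explanatory variable, linear fit assumed)."""
    return r ** 2

r = -0.9  # hypothetical correlation coefficient
r_squared = coefficient_of_determination(r)
percent_explained = round(100 * r_squared)
print(f"R^2 = {r_squared:.2f}, so about {percent_explained}% of the variation is explained")
```

Note that squaring removes the sign, so R^2 shows strength only and never direction.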
Bivariate Data
define least squares regression line
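- the line of best fit that makes the sum of the squared residuals as small as possible
- a residual tells us how far a data point is from the line of best fit, measured vertically (residual = actual y value - predicted y value)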
Bivariate Data
how do you know if your residual is + or -?
- data points above the line of best fit have a positive residual
- data points below the line of best fit have a negative residual
- sum of residuals = 0 in a least squares line of best fit
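Worked example (a sketch using made-up data): fitting a least squares line, then checking the sign of each residual and that the residuals sum to 0. The data and the function name `least_squares_line` are hypothetical, and the slope and intercept use the standard least squares formulas.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied (x) and test mark (y).
xs = [1, 2, 3, 4, 5]
ys = [52, 60, 61, 70, 77]

slope, intercept = least_squares_line(xs, ys)

# residual = actual y - predicted y
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

for x, y, res in zip(xs, ys, residuals):
    position = "above" if res > 0 else "below" if res < 0 else "on"
    print(f"({x}, {y}) is {position} the line, residual = {res:+.1f}")

print("sum of residuals:", round(sum(residuals), 10))
```

For this data the line is y = 46 + 6x. The point (2, 60) sits above the line with residual +2, the point (3, 61) sits below it with residual -3, and the residuals add to 0.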
Bivariate Data
what are the assumptions of using a LSRL?
- numerical data
- linear association
- no clear outliers