Section C.2: Regression analysis Flashcards
What is regression analysis?
Regression analysis is the creation of a “predictive model”: a formula that lets you predict the value of a dependent (response) variable from one or more independent (predictor) variables.
Describe the AIM of linear regression analysis
The aim of linear regression analysis is to establish a linear relationship, expressed as a mathematical formula such as y = mx + c, between the predictor variables (IVs) and the response variable (DV), so that the formula can be used to estimate the value of the response variable when only the predictors’ values are known.
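As a minimal sketch of that aim, the slope m and intercept c can be estimated by ordinary least squares. The function name `fit_line` and the hours/scores data are made up for illustration and do not come from the flashcards.

```python
def fit_line(xs, ys):
    """Estimate slope m and intercept c of y = m*x + c by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


# Illustrative data only: hours studied (predictor, IV) and test score (response, DV).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

m, c = fit_line(hours, scores)
# Use the fitted formula to estimate the response for an unseen predictor value.
predicted_score = m * 6 + c
print(f"y = {m:.2f}x + {c:.2f}; predicted score for 6 hours: {predicted_score:.1f}")
```

Once m and c are known, only the predictor value is needed to produce an estimate, which is the point of the model.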
The predictor variable is
Independent or dependent?
y-axis or x-axis?
The predictor variable is the independent variable plotted on the x-axis.
The response variable is
Independent or dependent?
y-axis or x-axis?
The response variable is the dependent variable, and is plotted on the y-axis.
What is the dependent variable?
The dependent variable is the variable that may be affected by the independent variable, and is plotted on the y-axis.
What is the independent variable?
The independent variable is the variable that may affect the dependent variable/s, and is plotted on the x-axis.
What is a residual in regression analysis?
A residual is the difference between the actual value of the response variable and the predicted value generated by the model for any given point.
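A short sketch of that definition, assuming a hypothetical already-fitted model y = 2x + 1 and made-up data points:

```python
def residuals(xs, ys, m, c):
    """Return actual minus predicted for each point, given the model y = m*x + c."""
    return [y - (m * x + c) for x, y in zip(xs, ys)]


# Illustrative values only; the model coefficients m=2.0, c=1.0 are assumed.
xs = [1, 2, 3]
ys = [3.5, 4.5, 7.0]

res = residuals(xs, ys, m=2.0, c=1.0)
print(res)  # [0.5, -0.5, 0.0]
```

A positive residual means the model under-predicted that point, a negative residual means it over-predicted.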
What is the R-Squared value in a regression model?
The R-Squared value (or coefficient of determination) is a statistical measure in a regression model that gives the proportion of variance in the dependent variable that is explained by the independent variable(s), i.e. how well the regression model fits the data (goodness of fit).
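One common way to compute this is R-Squared = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the actual values from their mean. The sketch below assumes that formula; the data values are invented for illustration.

```python
def r_squared(actual, predicted):
    """Compute R-Squared as 1 - SS_res / SS_tot."""
    mean_y = sum(actual) / len(actual)
    # Variation left unexplained by the model.
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # Total variation in the response variable.
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Illustrative values only.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

r2 = r_squared(actual, predicted)
print(f"R-Squared: {r2:.2f}")  # R-Squared: 0.95
```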
What characteristics must data have to be fit for linear regression analysis?
Numerical, normal, outlier-free, and correlated.
What is a correlation coefficient in regression analysis?
A correlation coefficient is a statistical measure of the strength and direction of the linear relationship between two variables, ranging from -1 to 1.
High correlation is any correlation coefficient x < -0.5 or x > 0.5
Low correlation is any correlation coefficient -0.2 < x < 0.2
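As a hedged illustration, the sketch below computes the Pearson correlation coefficient, which is the usual choice for linear dependence (the flashcard itself does not name a specific coefficient). The function name and data are assumptions for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative data: one perfectly increasing and one perfectly decreasing relationship.
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_up, r_down)
```

Values near 1 or -1 indicate a strong linear relationship (positive or negative), and values near 0 indicate a weak one.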
The R-Squared value shows how much of the variation in the data is explained by the model. What R-Squared value is considered good for prediction?
As a rule of thumb, an R-Squared value above 0.7 (70%) is considered good for prediction.
What is the line of best fit in a regression model?
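The line of best fit, or regression line, is the straight line that minimises the sum of squared residuals across the data points. Its slope and intercept define the linear model, and it represents the expected value of the response variable as predicted by the model.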
What assumptions are made in multiple linear regression? (4)
Assumptions of multiple linear regression include:
1. Normality
2. No outliers
3. Constant variance
4. Independence
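What is multiple linear regression?
Multiple linear regression is linear regression with two or more independent (predictor) variables and one dependent (response) variable.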
What are the two types of linear regression? (2)
Simple linear regression (1 IV, 1 DV)
Multiple linear regression (2+ IVs, 1 DV)
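To make the difference between the two types concrete, here is a rough sketch of multiple linear regression with two predictors, fitted by solving the least-squares normal equations in plain Python. The helper names `solve` and `fit_multiple` are hypothetical, and the data is generated from the assumed formula y = 1 + 2*x1 + 3*x2 so the recovered coefficients can be checked by eye.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_multiple(rows, ys):
    """Fit y = b0 + b1*x1 + b2*x2 + ... and return [b0, b1, b2, ...]."""
    # Prepend a constant 1 to each row so the first coefficient is the intercept.
    design = [[1.0, *row] for row in rows]
    k = len(design[0])
    # Normal equations: (X^T X) b = X^T y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Illustrative data: two predictors (2 IVs) and one response (1 DV).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coeffs = fit_multiple(rows, ys)
print([round(b, 6) for b in coeffs])
```

With a single predictor the same function reduces to simple linear regression, returning the intercept and slope of the line of best fit.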