Correlation and Regression Analysis Flashcards
What does correlation measure in statistics?
Correlation measures the strength and direction of the relationship between two variables.
Explain the difference between positive and negative correlation.
Positive correlation means that as one variable increases, the other also tends to increase, while negative correlation means that as one variable increases, the other tends to decrease.
How is the strength of correlation determined?
The strength of correlation is determined by the absolute value of the correlation coefficient: values closer to 1 indicate a stronger relationship, and values closer to 0 indicate a weaker one.
What does a correlation coefficient of 0 indicate?
A correlation coefficient of 0 indicates no linear relationship between the variables; a nonlinear relationship may still exist.
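As an illustration, the sketch below computes the Pearson coefficient for made-up data in which y depends entirely on x (y = x squared) yet the coefficient comes out at zero. The helper name `pearson_r` and the data are assumptions for this example only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]          # made-up data, symmetric around zero
ys = [x ** 2 for x in xs]       # y is fully determined by x, but not linearly

r = pearson_r(xs, ys)
print(r)  # zero linear correlation despite a perfect nonlinear relationship
```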
What is the range of values for the Pearson correlation coefficient?
The Pearson correlation coefficient ranges from -1 to 1, inclusive.
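A minimal sketch of how the Pearson coefficient can be computed from its definition (covariance divided by the product of the standard deviations), showing both ends of the range. The function name `pearson_r` and the data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]           # made-up data
scores_up = [2, 4, 6, 8, 10]      # rises with hours: perfect positive correlation
scores_down = [10, 8, 6, 4, 2]    # falls with hours: perfect negative correlation

r_up = pearson_r(hours, scores_up)
r_down = pearson_r(hours, scores_down)
print(r_up, r_down)
```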
When should you use Spearman’s rank correlation coefficient instead of Pearson’s?
Spearman’s rank correlation coefficient is used when the relationship between the variables is monotonic but not linear, when the data are ordinal, or when outliers would distort Pearson’s coefficient.
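One way to see the difference is to compute Spearman's coefficient as the Pearson coefficient of the ranks. The sketch below does this for made-up data following y = x cubed, which is monotonic but not linear. The helper names `pearson_r`, `ranks`, and `spearman_rho` are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]   # monotonic but curved (made-up data)

pearson = pearson_r(xs, ys)
spearman = spearman_rho(xs, ys)
print(pearson, spearman)    # Pearson falls short of 1; Spearman reaches 1
```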
Describe the process of linear regression analysis.
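Linear regression analysis involves fitting a straight line (or, with several independent variables, a plane) to the data points, usually by minimizing the sum of squared residuals, to model the relationship between a dependent variable and one or more independent variables.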
What is the difference between simple linear regression and multiple linear regression?
Simple linear regression involves one independent variable, while multiple linear regression involves two or more independent variables.
How do you interpret the slope coefficient in regression analysis?
The slope coefficient represents the expected change in the dependent variable for a one-unit increase in the independent variable, holding any other independent variables constant.
What does the intercept term represent in a regression equation?
The intercept term represents the value of the dependent variable when all independent variables are set to zero.
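The slope and intercept of a simple linear regression can be sketched with the ordinary least squares formulas: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The function name `fit_line` and the data below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [0, 1, 2, 3, 4]                 # made-up data
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Here the slope of about 1.97 means y rises by roughly 1.97 for each one-unit increase in x, and the intercept of about 1.06 is the predicted y at x = 0.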
What assumptions must be met for regression analysis?
Assumptions include linearity, independence of errors, homoscedasticity, and normality of errors.
What is the purpose of residual analysis in regression?
Residual analysis involves examining the differences between observed and predicted values to check the model’s assumptions and assess how well it fits.
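A minimal sketch of computing residuals (observed minus predicted) for a least squares line, using made-up data and a hypothetical `fit_line` helper. With an intercept in the model, least squares residuals sum to zero, so it is their pattern, not their total, that is worth inspecting.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = [0, 1, 2, 3, 4]                 # made-up data
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
# Residual = observed value minus the value predicted by the line.
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

for x, e in zip(xs, residuals):
    print(f"x={x}: residual={e:+.2f}")
```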
What are influential points in regression analysis?
Influential points are data points that have a large impact on the regression coefficients and may significantly alter the results.
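One informal way to gauge influence is to refit the model without the suspect point and compare the coefficients. The sketch below uses made-up data in which a single point at an extreme x value reverses the sign of the slope; `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data: five points on the line y = x, plus one far-off point (20, 0).
xs = [1, 2, 3, 4, 5, 20]
ys = [1, 2, 3, 4, 5, 0]

slope_with, _ = fit_line(xs, ys)
slope_without, _ = fit_line(xs[:-1], ys[:-1])   # drop the last point and refit

print(f"slope with the point:    {slope_with:.3f}")
print(f"slope without the point: {slope_without:.3f}")
```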
How do you assess the goodness of fit in regression analysis?
Goodness of fit is assessed using measures like R-squared, which indicates the proportion of variance in the dependent variable explained by the independent variables.
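R-squared can be sketched as 1 minus the ratio of the residual sum of squares to the total sum of squares. The helper names `fit_line` and `r_squared` and the data are assumptions for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys):
    """Proportion of the variance in ys explained by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    # Variation left unexplained by the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [0, 1, 2, 3, 4]                 # made-up data
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

r2 = r_squared(xs, ys)
print(f"R-squared = {r2:.4f}")
```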
What is multicollinearity, and why is it problematic in regression?
Multicollinearity occurs when independent variables in a regression model are highly correlated with one another, which inflates the standard errors of the regression coefficients and makes their estimates unreliable.
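A common diagnostic is the variance inflation factor (VIF). For a model with exactly two predictors it reduces to 1 / (1 - r^2), where r is the correlation between them. The sketch below assumes that two-predictor case; the data and helper names are made up, and the threshold of 10 is only a frequently quoted rule of thumb.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF for either predictor in a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Made-up predictors: x2 is almost exactly twice x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

vif = vif_two_predictors(x1, x2)
print(f"VIF = {vif:.1f}")   # far above 10, suggesting severe multicollinearity
```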
Explain the concept of homoscedasticity in regression analysis.
Homoscedasticity means that the variance of the errors is constant across all levels of the independent variables.
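As a rough, informal check (not a formal test), the spread of the residuals can be compared between the lower and upper halves of the x range. In the made-up data below the scatter around the line grows with x, so the assumption is visibly violated. `fit_line` and `mean_square` are hypothetical helpers.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_square(values):
    """Average of the squared values, used here as a measure of spread."""
    return sum(v ** 2 for v in values) / len(values)

# Made-up data, sorted by x: small scatter for low x, large scatter for high x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 1.9, 3.1, 3.9, 6.0, 5.0, 8.0, 7.0]

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

half = len(xs) // 2
low_spread = mean_square(residuals[:half])    # residual spread for small x
high_spread = mean_square(residuals[half:])   # residual spread for large x
print(f"low-x spread:  {low_spread:.3f}")
print(f"high-x spread: {high_spread:.3f}")
```

Under homoscedasticity the two figures would be of similar size; a large gap like the one here points to heteroscedasticity.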