Correlation and Regression Flashcards
Pearson Bivariate Correlation Coefficient
Define and Calculate
A number that shows how two things are related in a straight line (the strength and direction of the linear relationship between two continuous variables).
Calculate: divide the covariance of the two variables by the product of their standard deviations.
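The calculation above can be sketched in Python; this is a minimal illustration using population covariance and standard deviations, with made-up data:

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))   # perfectly linear: r ≈ 1.0
print(pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))   # perfectly inverse: r ≈ -1.0
```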
Pearson Bivariate Correlation Coefficient
range and interpretation
The coefficient ranges from -1 to +1, where -1 indicates a perfect negative linear relationship, +1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship.
Plotting Pearson
Scatter plot: the plot commonly used to assess the linearity of the relationship between two variables.
Adding a trend line helps determine whether the relationship is linear or nonlinear.
The more tightly the points cluster around the trend line, the stronger the correlation.
Z-scores
Definition
Z scores are a way to standardize data points by showing how many standard deviations they are from the mean.
a z score of +2 indicates that a data point is two standard deviations above the mean, while a z score of -1 indicates that a data point is one standard deviation below the mean.
Standardizing with Z Scores
Meaning, procedure, and purpose
Standardizing allows fair comparison of data on a common scale.
Meaning: It transforms data to have a mean of 0 and a standard deviation of 1, converting raw values into z-scores for fair comparison.
Procedure: Subtract the mean and divide by the standard deviation.
Purpose: Makes different data comparable by putting them on the same scale.
Why Standardize and Meaning
Standardizing allows fair comparison of data on a common scale.
It transforms data to have a mean of 0 and a standard deviation of 1.
Calculate Z-scores
Calculation:
(X- μ) / σ
X is the raw score, μ is the mean of the distribution, and σ is the standard deviation.
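The formula (X − μ) / σ can be sketched directly; the data and function name here are made up for illustration, and `statistics.pstdev` gives the population standard deviation:

```python
import statistics

def z_score(x, mu, sigma):
    """How many standard deviations x lies from the mean."""
    return (x - mu) / sigma

data = [50, 60, 70, 80, 90]
mu = statistics.mean(data)       # 70
sigma = statistics.pstdev(data)  # population SD, about 14.14
print(z_score(90, mu, sigma))    # about +1.41: 90 is ~1.4 SDs above the mean
```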
Standard Deviation
Define
A measure of the amount of variation or dispersion in a set of values
It’s a measure of how spread out numbers are in a set
Interpretation of Standard Deviation
A low standard deviation indicates that the data points tend to be close to the mean, while a high standard deviation indicates that the data points are spread out over a wider range of values.
Why is it good for data to cluster around the mean?
Reliability, comparability, and predictability
Reliability: There are fewer extreme values or outliers that could skew the interpretation of the data.
Predictability: it makes it easier to predict future outcomes or estimate probabilities. This is because there is less uncertainty or variability in the data.
Comparability: When data points are spread out, it can be challenging to compare different groups or datasets, but when they are close to the mean, comparisons become more straightforward.
Calculate Standard Deviation
the square root of the variance
Variance
Define
A measure of how spread out or dispersed the values in a data set are from the mean.
Calculate the Variance
taking the average of the squared differences between each data point and the mean
interpretation of the variance
A larger variance indicates greater variability or dispersion in the data set, while a smaller variance suggests that the data points are closer to the mean.
e.g. If the variance of a set of test scores is 25, the average squared deviation from the mean is 25; the standard deviation is therefore √25 = 5 points.
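Variance and standard deviation can be computed in a few lines; this sketch uses population variance and invented scores:

```python
import math

def variance(data):
    """Population variance: the average squared deviation from the mean."""
    mu = sum(data) / len(data)
    return sum((x - mu) ** 2 for x in data) / len(data)

scores = [2, 4, 4, 4, 5, 5, 7, 9]
var = variance(scores)
sd = math.sqrt(var)   # standard deviation is the square root of the variance
print(var, sd)        # 4.0 2.0
```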
Using Z-Scores to Identify and Deal with Outliers
Standardization: Transforming data into z-scores with a mean of 0 and standard deviation of 1.
Thresholds: Outliers defined as z-scores beyond a certain threshold (e.g., z > 2 or z < -2).
Comparability: Allows fair comparison of outliers across datasets.
Data Cleaning: Outliers identified using z-scores can be examined for errors or significance.
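The threshold rule above can be sketched as a small filter; the data and the |z| > 2 cutoff are illustrative assumptions:

```python
import statistics

def flag_outliers(data, threshold=2.0):
    """Return values whose |z-score| exceeds the threshold."""
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)
    return [x for x in data if abs((x - mu) / sigma) > threshold]

readings = [10, 12, 11, 13, 12, 11, 40]
print(flag_outliers(readings))  # only the extreme value 40 crosses |z| > 2
```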
Correlation Coefficient (r)
Measures the strength and direction of the linear relationship between two variables.
Range: Between -1 and +1, where -1 indicates a perfect negative linear relationship, +1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship.
Interpretation of correlation coefficient
Magnitude: It shows how strong the relationship is. If |r| is closer to 1, the relationship is stronger. If it’s closer to 0, the relationship is weaker.
e.g. Correlation Coefficient (-0.42): Indicates a moderate negative linear relationship between the variables being studied.
Significance Level (p-value)
It indicates the probability that the observed result (or more extreme) occurred by random chance, assuming the null hypothesis is true.
Helps judge whether an observed effect is likely genuine or due to random chance.
Interpretation of p-value
lower p-value suggests stronger evidence against the null hypothesis, indicating that the observed result is unlikely to be due to chance
Typically, a significance level of 0.05 (or 5%) is used. If the p-value is less than this threshold, the result is considered statistically significant
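One way to estimate a p-value for a correlation without distribution tables is a permutation test: shuffle one variable many times and count how often a pairing as extreme as the observed one arises by chance. This sketch (function names and data invented) illustrates the idea:

```python
import math
import random

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: fraction of random pairings with |r| at least as extreme."""
    rng = random.Random(seed)
    r_obs = abs(pearson_r(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson_r(x, y_perm)) >= r_obs - 1e-12:
            hits += 1
    return hits / n_perm

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 3, 3, 5, 6, 6, 8, 9]      # strong upward trend
print(permutation_p_value(x, y))  # well below 0.05 -> statistically significant
```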
Cohen’s Rules of Thumb for Magnitude of Correlation
Definition: Guidelines for interpreting the strength of correlation coefficients.
Small: Magnitude around 0.10.
Moderate: Magnitude around 0.30.
Large: Magnitude around 0.50.
-0.42 = moderate negative correlation
Correlation Coefficient (0.15)
Since 0.15 is closer to 0.10, it would be considered a small correlation according to Cohen’s rules of thumb
Correlation Coefficient (-0.60)
Since the magnitude of -0.60 (i.e., 0.60) exceeds 0.50, it would be considered a large negative correlation.
Correlation Coefficient (0.35)
Since 0.35 falls between 0.30 and 0.50, it would likely be considered a moderate positive correlation
Correlation Coefficient (-0.25)
Since -0.25 falls between -0.10 and -0.30, it would likely be considered a small negative correlation.
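The worked examples above follow a simple rule that can be coded up; the function name and the exact boundary handling (>= at each cutoff) are my assumptions, since Cohen's values are rough benchmarks rather than hard thresholds:

```python
def cohen_label(r):
    """Rough size label per Cohen's benchmarks (0.10 small, 0.30 moderate, 0.50 large)."""
    m = abs(r)
    if m >= 0.50:
        size = "large"
    elif m >= 0.30:
        size = "moderate"
    elif m >= 0.10:
        size = "small"
    else:
        size = "negligible"
    return f"{size} {'negative' if r < 0 else 'positive'}"

for r in (-0.42, 0.15, -0.60, 0.35, -0.25):
    print(r, cohen_label(r))
```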
Simple Regression Model
A statistical technique used to model the relationship between one predictor variable and one outcome variable.
To understand how changes in the predictor variable are associated with changes in the outcome variable.
Simple Regression
Examines how changes in one variable (predictor) are associated with changes in another variable (outcome).
Used to predict the value of the outcome variable based on the value of the predictor variable.
Provides an equation that describes the relationship and allows for making predictions.
Can suggest causality if appropriate conditions are met, as it implies a directional relationship between variables
Regression Equation
Y = bX + c + e
Regression Equation Explained
Y represents the outcome variable (e.g., test scores).
X represents the predictor variable (e.g., study time).
b is the regression coefficient, which indicates the change in Y for a one-unit change in X.
c is the intercept, representing the value of Y when X is zero
e is the error term, representing the difference between the observed Y and the predicted Y based on the regression equation.
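The coefficients b and c in Y = bX + c + e can be estimated by least squares; this sketch uses the standard closed-form formulas, with hypothetical study-time data:

```python
def fit_simple_regression(x, y):
    """Least-squares estimates of slope b and intercept c in Y = bX + c + e."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - mx) * (t - my) for a, t in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    c = my - b * mx
    return b, c

study_time = [1, 2, 3, 4, 5]       # hypothetical hours studied
test_score = [52, 55, 61, 64, 68]  # hypothetical scores
b, c = fit_simple_regression(study_time, test_score)
print(round(b, 2), round(c, 2))    # each extra hour predicts ~4.1 more points
```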
Regression Model in an example: attention span = b(screen time) + c + e
what is b, c and e
The outcome variable is “attention span”.
The predictor variable is “screen time”.
b is the regression coefficient representing the effect of screen time on attention span.
c is the intercept representing the value of attention span when screen time is zero.
e is the error term representing unexplained variability in attention span.
R-squared
It shows how well the predictor variable(s) explain the outcome variable in a regression model.
Range of R-squared
Between 0 and 1, where 0 means the predictor explains none of the outcome variation and 1 means it explains all of it.
What does an R-squared value of 0.40 mean?
40% of the variation in the outcome is explained by the predictor(s).
R-squared Relation to Correlation Coefficient
R-squared equals the square of the correlation coefficient.
Example: If the correlation coefficient is -0.42, the R-squared would be
(−0.42)^2 = 0.1764
17.64%
Multiple regression
Statistical technique used to examine the relationship between one dependent variable and two or more independent variables
Assumptions include linearity, independence of errors, homoscedasticity (constant error variance), and normality of residuals.
How do you interpret the regression coefficients in multiple regression?
Regression coefficients represent the change in the dependent variable for a one-unit change in the predictor variable, holding other variables constant.
How do you assess the overall fit of a multiple regression model
The overall fit can be assessed using measures such as R-squared and adjusted R-squared, which indicate the proportion of variance explained by the model.
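Multiple regression coefficients can be estimated by solving the normal equations (XᵀX)b = Xᵀy; this is a sketch in plain Python (function names are mine, and the data is fabricated to fit y = 2 + 3·x1 + 0.5·x2 exactly so the recovered coefficients are easy to check):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_multiple_regression(X_rows, y):
    """OLS for y = b0 + b1*x1 + b2*x2 + ... via the normal equations (X'X)b = X'y."""
    rows = [[1.0] + list(r) for r in X_rows]  # prepend an intercept column
    p = len(rows[0])
    XtX = [[sum(row[i] * row[j] for row in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)

X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [2 + 3 * x1 + 0.5 * x2 for x1, x2 in X]  # fabricated, exactly linear
print([round(b, 3) for b in fit_multiple_regression(X, y)])  # [2.0, 3.0, 0.5]
```

Each returned coefficient is the change in y for a one-unit change in that predictor, holding the others constant.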
Importance of Checking Scatterplot before Reporting Correlation Coefficient
To visually inspect data and validate assumptions before interpreting correlation coefficient.
Allows visual inspection of assumptions.
Helps detect linearity and outliers.
Ensures accurate interpretation of correlation.
Before reporting a correlation coefficient between two variables, examining a scatterplot allows you to identify any nonlinear relationships or outlier points that may affect the interpretation of the correlation.
Third Variable Problem in Correlation
The presence of a third variable that influences both variables being correlated, leading to a spurious or misleading correlation.
How to mitigate Third Variable Problem in Correlation
Control for or consider potential third variables to accurately interpret the correlation between two variables. Use techniques like partial correlation or regression analysis to account for the influence of third variables.
Partial Correlation
A statistical technique used to assess the relationship between two variables while controlling for the effects of one or more additional variables.
Calculates the correlation coefficient between two variables after statistically removing the influence of one or more covariates.
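The standard first-order formula, r_xy·z = (r_xy − r_xz·r_yz) / √((1 − r_xz²)(1 − r_yz²)), can be sketched directly; the data below is fabricated so that z drives both x and y, making the raw correlation spurious:

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def partial_r(x, y, z):
    """Correlation of x and y after removing the linear influence of z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

z = [1, 2, 3, 4, 5]
x = [1.1, 1.8, 3.0, 4.2, 4.9]   # roughly z plus noise
y = [2.2, 3.9, 5.8, 7.9, 10.2]  # roughly 2z plus noise
print(round(pearson_r(x, y), 2))     # high raw correlation, ~0.99
print(abs(round(partial_r(x, y, z), 2)))  # near zero once z is controlled for
```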
Regression Analysis
A statistical method that examines the relationship between one dependent variable and one or more independent variables.
Identifies how changes in the independent variables are associated with changes in the dependent variable.
Fits a regression model to the data, estimating the coefficients that represent the strength and direction of the relationships between variables.
Difference between Positive and Negative Correlation
Positive: Both variables increase together (e.g., height and weight).
Negative: One variable increases, the other decreases (e.g., alcohol consumption and memory recall).
Checking for Influence of Outliers on Correlation Coefficient
1) Plot the data using a scatterplot to visually identify outliers.
2) Calculate the correlation coefficient with and without outliers to observe changes in its magnitude.
3) Conduct sensitivity analyses by removing outliers and re-calculating the correlation coefficient.
If the correlation coefficient changes substantially after removing outliers, it suggests that outliers may have influenced the correlation.
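Step 2 above (computing r with and without the outlier) can be sketched directly; the data is invented, with one bad point appended to otherwise perfectly linear data:

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(round(pearson_r(x, y), 2))                # 1.0 without the outlier
print(round(pearson_r(x + [6], y + [-20]), 2))  # ~-0.44: one bad point flips the sign
```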
Cohen’s Criteria for Small, Medium, and Large Correlation Coefficients
Small: 0.1
Medium: 0.3
Large: 0.5
Interpretation of a Confidence Interval for a Correlation Coefficient
If the confidence interval includes zero, the correlation coefficient is not statistically significant at the specified confidence level.
If the confidence interval does not include zero, the correlation coefficient is statistically significant at the specified confidence level.
When we say the interval “includes zero,” we mean zero lies inside the interval; when it excludes zero, the entire interval falls on one side of zero.
Confidence Interval
When we calculate a correlation coefficient between two variables, we also calculate something called a confidence interval.
The confidence interval tells us a range of values within which we are reasonably confident the true correlation lies.
A Correlation coefficient of 0.50 with a 95% confidence interval of (0.30, 0.70)
Correlation coefficient of 0.50 with a 95% confidence interval of (0.30, 0.70) indicates that we are 95% confident that the true correlation lies between 0.30 and 0.70.
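One common way to compute such an interval is the Fisher z-transformation; this sketch assumes a 95% level (critical value 1.96) and a hypothetical sample size of 100:

```python
import math

def fisher_ci_95(r, n):
    """Approximate 95% CI for a correlation via the Fisher z-transformation."""
    z = math.atanh(r)              # map r onto an approximately normal scale
    se = 1 / math.sqrt(n - 3)      # standard error on the z-scale
    lo, hi = z - 1.96 * se, z + 1.96 * se
    return math.tanh(lo), math.tanh(hi)  # map the bounds back to the r-scale

lo, hi = fisher_ci_95(0.50, n=100)
print(round(lo, 2), round(hi, 2))  # roughly 0.34 0.63 for this sample size
```

Because the interval excludes zero, this correlation would be statistically significant at the 5% level.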
X and Y
Components of Linear Regression Model
Dependent Variable (Y): The variable being predicted or explained by the independent variables.
Independent Variables (X): The variables used to predict or explain changes in the dependent variable.
Regression Coefficients (β)
Components of Linear Regression Model
The parameters representing the relationship between each independent variable and the dependent variable.
Intercept (β₀)
Components of Linear Regression Model
The constant term in the regression equation, representing the value of the dependent variable when all independent variables are zero.
Residuals (ε)
Components of Linear Regression Model
The differences between the observed and predicted values of the dependent variable.
Error Term (ε)
Components of Linear Regression Model
Represents the variability in the dependent variable that cannot be explained by the independent variables.
Correlation Matrix
A table that shows the correlation coefficients between multiple variables in a dataset.
Square matrix where each row and column represents a variable.
The cells contain correlation coefficients, showing the strength and direction of relationships between variables.
Interpretation of Correlation Matrix
Values range from -1 to 1.
Positive values indicate a positive correlation (variables move in the same direction).
Negative values indicate a negative correlation (variables move in opposite directions).
Values closer to 1 or -1 represent stronger correlations, while values closer to 0 represent weaker correlations.
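A correlation matrix is just the pairwise Pearson correlation applied to every pair of variables; a minimal sketch with fabricated columns (B doubles A, C moves opposite to A):

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def correlation_matrix(columns):
    """Square, symmetric matrix of pairwise Pearson correlations."""
    return [[pearson_r(a, b) for b in columns] for a in columns]

cols = [
    [1, 2, 3, 4],  # variable A
    [2, 4, 6, 8],  # variable B: A doubled, so r = 1
    [8, 6, 4, 2],  # variable C: moves opposite to A, so r = -1
]
for row in correlation_matrix(cols):
    print([round(v, 2) for v in row])  # diagonal is always 1.0
```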
Coefficients
Slope (b) and intercept (c)
Slope (b): Represents the change in the dependent variable for a one-unit change in the independent variable.
Intercept (c): Represents the predicted value of the dependent variable when all independent variables are zero.