Lecture 10 Flashcards
what does correlation describe?
an association between variables. when one variable changes, so does the other
what does causation mean?
that changes in one variable bring about changes in the other; there is a cause-and-effect relationship between the variables
what is the correlation coefficient?
a number between -1 and +1 that summarizes how closely the points in a scatter plot cluster around an imaginary straight line drawn through them, and in which direction that line slopes.
what does Pearson’s correlation coefficient measure?
the strength of the linear association between two variables
a Pearson’s correlation coefficient value of -1 indicates?
perfect negative correlation (all the points exactly on a line)
a Pearson’s correlation coefficient value of 0 indicates?
no linear association
a Pearson’s correlation coefficient value of +1 indicates?
perfect positive correlation (all the points exactly on a line)
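The three landmark values above can be checked with a minimal sketch of Pearson's coefficient, written with the standard library only. The function name `pearson_r` and the sample data are illustrative, not from the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]               # made-up data
print(pearson_r(x, [2, 4, 6, 8, 10]))  # points on a rising line: 1.0
print(pearson_r(x, [10, 8, 6, 4, 2]))  # points on a falling line: -1.0
print(pearson_r(x, [4, 1, 0, 1, 4]))   # symmetric U shape: 0.0
```

The last case shows why 0 is best read as "no linear association": the U-shaped data are clearly related, yet r is 0.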
when can we run Spearman’s test?
when the variables are not normally distributed;
when the sample size is small (<20)
is Spearman’s test a parametric measure?
no, it is non-parametric
what is a similarity between Pearson’s and Spearman’s tests?
the coefficient varies from -1 to +1
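One common way to compute Spearman's coefficient is to replace each value by its rank and then apply Pearson's formula to the ranks. The sketch below assumes that approach, with tied values sharing the average of their ranks. The helper names and data are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]      # monotonic but not linear (made-up data)
print(spearman_rho(x, y))    # ranks agree perfectly
print(pearson_r(x, y))       # below 1 because the relationship is curved
```

Because only the ordering of the values matters, a monotonic but curved relationship still reaches the top of the -1 to +1 range.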
define covariance
Covariance measures how two variables move with respect to each other and is an extension of the concept of variance (which tells about how a single variable varies). It can take any value from -∞ to +∞
what does a positive number signify in covariance?
it signifies positive covariance, and denotes that a direct relationship exists
what does a negative number signify in covariance?
negative covariance, which denotes an inverse relationship between two variables
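is covariance good for interpreting the magnitude of the relationship?
no, it is only good for defining the relationship type (direct or inverse), because its size depends on the units in which the variables are measured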
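A short sketch of why the size of a covariance is hard to interpret: changing the units of one variable rescales the covariance while the relationship itself stays the same. It assumes the sample covariance (divisor n - 1). The height and weight figures are made up for illustration.

```python
def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: the same heights expressed in metres and in centimetres.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
heights_cm = [h * 100 for h in heights_m]
weights_kg = [50, 58, 66, 71, 80]

cov_m = sample_covariance(heights_m, weights_kg)
cov_cm = sample_covariance(heights_cm, weights_kg)

print(cov_m > 0, cov_cm > 0)   # the sign (direct relationship) is unchanged
print(cov_cm / cov_m)          # the magnitude grows by the unit factor, about 100
```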
what is one property of a covariance matrix?
it is always symmetric with the variances on its diagonal and the covariances off-diagonal
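The symmetry follows from cov(X, Y) = cov(Y, X), and the diagonal holds variances because cov(X, X) is the variance of X. A minimal sketch, assuming the sample covariance with divisor n - 1 and three made-up variables:

```python
def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def covariance_matrix(columns):
    """Build the k x k covariance matrix for a list of k equal-length variables."""
    k = len(columns)
    return [[sample_covariance(columns[i], columns[j]) for j in range(k)]
            for i in range(k)]

# Hypothetical data: three variables measured on four observations.
a = [1, 2, 3, 4]
b = [2, 1, 4, 3]
c = [5, 3, 2, 0]

matrix = covariance_matrix([a, b, c])
for row in matrix:
    print(row)

# Entry (i, j) equals entry (j, i); entry (i, i) is the variance of variable i.
is_symmetric = all(matrix[i][j] == matrix[j][i]
                   for i in range(3) for j in range(3))
print(is_symmetric)
```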
define linear regression
the prediction of the value of one characteristic from knowledge of another. there must be two variables, and their relationship is modeled as a straight line
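The card does not say how the straight line is chosen. The sketch below assumes ordinary least squares, the usual choice, where the slope is the sum of cross-deviations divided by the sum of squared x-deviations. `fit_line` and `predict` are illustrative names.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x must vary to fit a line")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the means
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted value of the dependent variable for a given x."""
    return intercept + slope * x

slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])  # made-up data on y = 1 + 2x
print(slope, intercept)              # 2.0 1.0
print(predict(4, slope, intercept))  # 9.0
```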
how many independent variables are included in multiple regression?
more than one independent variable is included in the prediction equation
define dependent variable
the variable whose values you want to predict
define independent variable
the variable you use to predict the dependent variable
define coefficient of determination
a statistical measurement (R²) of the proportion of the variation in one variable that can be explained by variation in a second variable, when predicting the outcome of a given event
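A minimal sketch of one standard way to compute it, assuming a least-squares line with a single predictor: R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the observed values from their mean. The function name and data are illustrative.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of the observed values around their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 5, 4, 5]
print(round(r_squared(xs, ys), 3))  # 0.6: about 60% of the variation in y is explained
```

With a single predictor this value equals the square of Pearson's correlation coefficient.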
define residual
The difference between the actual, observed value and the value predicted by the regression equation (observed minus predicted)
define outlier
In linear regression, an outlier is an observation with a large residual. In other words, it is an observation whose dependent-variable value is unusual given its values on the predictor variables. An outlier may indicate a sample peculiarity, a data entry error, or some other problem
define leverage
Leverage is a measure of how far an observation's value on an independent variable deviates from that variable's mean. High-leverage points can have a large effect on the estimates of the regression coefficients
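For a regression with one predictor and an intercept, leverage has a closed form: h_i = 1/n + (x_i - mean)² / Σ(x_j - mean)². The sketch below assumes that simple case. The data are made up so that the last point sits far from the others.

```python
def leverages(xs):
    """Leverage of each observation in a one-predictor regression with intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x must vary")
    # 1/n is the baseline; the second term grows with distance from the mean.
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]

xs = [1, 2, 3, 4, 10]   # hypothetical predictor values; 10 is far from the mean of 4
h = leverages(xs)
print([round(v, 2) for v in h])   # the last observation has by far the largest leverage
print(round(sum(h), 6))           # leverages sum to the number of fitted parameters, 2
```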
define influence
An observation is said to be influential if removing the observation substantially changes the estimate of the regression coefficients. Influence can be thought of as the product of leverage and outlierness
define Cook’s distance
A measure of an observation's influence that combines the information in its leverage and its residual.
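A sketch of the usual formula for a one-predictor least-squares fit, stated here as an assumption since the card gives none: D_i = e_i² / (p · s²) · h_i / (1 - h_i)², where e_i is the residual, h_i the leverage, p = 2 the number of fitted parameters, and s² = SS_res / (n - p). The data are made up.

```python
def cooks_distances(xs, ys):
    """Cook's distance for each observation in a one-predictor regression."""
    n = len(xs)
    p = 2  # fitted parameters: intercept and slope
    if n != len(ys) or n <= p:
        raise ValueError("need equal-length sequences with more than 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals: observed minus predicted.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)   # residual variance estimate
    lev = [1 / n + (x - mean_x) ** 2 / sxx for x in xs]
    return [(e ** 2 / (p * s2)) * (h / (1 - h) ** 2)
            for e, h in zip(residuals, lev)]

# Hypothetical data: the last point is far out in x and off the trend of the rest.
xs = [1, 2, 3, 4, 10]
ys = [1.1, 1.9, 3.2, 3.9, 2.0]
d = cooks_distances(xs, ys)
print([round(v, 3) for v in d])
print(d.index(max(d)))   # 4: the high-leverage point dominates
```

In this example the fourth observation has the largest residual, yet the fifth has by far the largest Cook's distance because its leverage is so high.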