Correlation and Regression Flashcards
Correlation
Correlation is a statistical relationship between variables in which a change in one variable is associated with a change in another variable.
Analysis of _______________ of two or more variables refers to __________________
Analysis of the covariance of two or more variables refers to correlation
Types of correlation
1.) Positive correlation
2.) Negative correlation
3.) Linear and non-linear
4.) Partial and total
What does positive correlation indicate?
A positive correlation means that as one variable increases, the other also increases and as one decreases, the other decreases
What does negative correlation indicate?
A negative correlation means that as one variable increases, the other variable decreases, and vice versa
How to denote correlation coefficient
γ (gamma)
What is the correlation coefficient? (definition)
It is a numerical measure that quantifies the strength and direction of the linear relationship between two variables
Range of correlation coefficient
[-1,1]
What does γ = 0 indicate?
γ = 0 indicates no linear relationship between the two variables
or, equivalently,
covariance(x,y) = 0
Example of real world positive correlation
Height and Weight
Example of real world negative correlation
Price and demand (as price rises, quantity demanded typically falls)
Value of γ for perfect positive correlation
1
Value of γ for perfect negative correlation
-1
Value of γ for partial positive correlation
0 < γ < 1
Value of γ for partial negative correlation
-1 < γ < 0
Value of γ for no correlation
0
What is scatter plot used for
To visualize linear correlation between two variables
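Scatter diagram of perfect positive correlation
All points lie exactly on a straight line that rises from left to right
Scatter diagram of perfect negative correlation
All points lie exactly on a straight line that falls from left to right
Scatter diagram of partial positive correlation
Points scatter around a rising line without all lying on it
Scatter diagram of partial negative correlation
Points scatter around a falling line without all lying on it
Scatter diagram of no correlation
Points are scattered with no upward or downward trend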
Linear correlation
Changes in one variable are associated with proportional changes in the other, either positively or negatively.
Non-linear correlation
Changes in one variable are not associated with proportional changes in the other.
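A minimal Python sketch of the linear / non-linear distinction. The data and the helper name `pearson` are made up for illustration, and the helper uses the standard population form Cov(X,Y) / (σ_x · σ_y), which is an assumption since the deck does not spell the formula out. It shows that a perfectly linear relationship gives γ = 1, while y = x² on symmetric x values gives γ = 0 even though y is fully determined by x.

```python
import math

def pearson(xs, ys):
    """Pearson coefficient Cov(X, Y) / (sd(X) * sd(Y)), population form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)

xs = [-2, -1, 0, 1, 2]                # illustrative values, symmetric about 0
linear = [3 * x + 1 for x in xs]      # proportional change: linear relationship
quadratic = [x * x for x in xs]       # y depends on x, but not linearly

gamma_linear = pearson(xs, linear)
gamma_quadratic = pearson(xs, quadratic)
print(gamma_linear, gamma_quadratic)
```

So γ = 0 rules out a linear relationship only; a scatter diagram is still needed to spot a curved one.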
Methods of studying correlation
1.) Graphic methods
* a.) Scatter diagram
* b.) Simple graphs
2.) Mathematical methods
* a.) Pearson’s coefficient of correlation (γ)
* b.) Spearman’s rank coefficient of correlation (ρ)
Coefficient of correlation
Pearson Coefficient of correlation
Second property of coefficient of correlation
Third property of coefficient of correlation
Fourth property of coefficient of correlation
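If X and Y are orthogonal, what can be said about their correlation?
Orthogonal means E(XY) = 0. If at least one of X, Y also has zero mean, then Cov(X,Y) = E(XY) - E(X).E(Y) = 0 and so γ = 0. Note that γ = 0 means uncorrelated, which is weaker than statistically independent.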
Cov(X,Y)
E(XY) - E(X).E(Y)
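As a quick check of the identity above, the following Python sketch computes the covariance two ways: with the shortcut E(XY) - E(X).E(Y), and directly as the mean product of deviations from the means. The height and weight numbers are invented for illustration, the function names are hypothetical, and the data are treated as the whole population (divide by n), which is an assumption.

```python
def mean(values):
    """Arithmetic mean, used here as the expectation E(.) over the data."""
    return sum(values) / len(values)

def cov_shortcut(xs, ys):
    """Cov(X, Y) = E(XY) - E(X).E(Y)."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

def cov_definition(xs, ys):
    """Cov(X, Y) = E[(X - E(X)) (Y - E(Y))]."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

heights = [150, 160, 170, 180, 190]   # made-up data, cm
weights = [52, 58, 67, 75, 83]        # made-up data, kg

cov_a = cov_shortcut(heights, weights)
cov_b = cov_definition(heights, weights)
print(cov_a, cov_b)  # the two values agree; a positive value matches a positive correlation
```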
Definition of rank correlation coefficient
This rank-based method is useful for qualitative characteristics such as morality, character, or beauty. It is based on the ranks given to the observations.
Full name of rank correlation coefficient
Spearman’s rank correlation coefficient
How to denote rank correlation coefficient
ρ
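Formula of rank correlation coefficient
ρ = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the two ranks of an observation and n is the number of observations (no tied ranks)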
Range of rank correlation coefficient
ρ ∈ [-1,1]
Second property of rank correlation coefficient
If ρ = 1, there is complete agreement in the order of the ranks, and the ranks run in the same direction
Third property of rank correlation coefficient
If ρ = -1, there is complete disagreement in the order of the ranks, and the ranks run in opposite directions
If ρ = 1, then
there is complete agreement in the order of the ranks, and the ranks run in the same direction
If ρ = -1, then
there is complete disagreement in the order of the ranks, and the ranks run in opposite directions
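The two extreme cases above can be checked numerically. This Python sketch assumes the usual no-ties formula ρ = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the two ranks of an observation; the function name `spearman_rho` and the judges' rankings are hypothetical.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho from two rank lists of equal length, assuming no tied ranks."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))

judge_1 = [1, 2, 3, 4, 5]                             # made-up ranking of five contestants

rho_same = spearman_rho(judge_1, [1, 2, 3, 4, 5])     # identical order
rho_reverse = spearman_rho(judge_1, [5, 4, 3, 2, 1])  # exactly reversed order
rho_mixed = spearman_rho(judge_1, [2, 1, 4, 3, 5])    # two neighbouring pairs swapped
print(rho_same, rho_reverse, rho_mixed)
```

Identical rankings give ρ = 1, reversed rankings give ρ = -1, and partial agreement falls strictly between the two.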
Regression
The approximate (average) relationship between two random variables X and Y is called regression.
The method of estimating the unknown value of one variable from the known value of the other variable is called regression.
Types of regression
Linear regression
Non-linear regression
Line of regression (definition and another name)
The line describing the average relationship between two variables is known as the line of regression or the estimating line.
What can we calculate using the regression coefficients
We can calculate the coefficient of correlation using the regression coefficients.
Explain error and also related to it (Linear regression of X and Y)
Statistical interpretation (regression line) (2)
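Regression line equation of X on Y is
x - x̄ = b_xy (y - ȳ), where x̄ and ȳ are the means of x and y
Regression line equation of Y on X is
y - ȳ = b_yx (x - x̄), where x̄ and ȳ are the means of x and y
Regression coefficient X on Y is
b_xy = γ σ_x / σ_y = Cov(X,Y) / σ_y²
Regression coefficient Y on X is
b_yx = γ σ_y / σ_x = Cov(X,Y) / σ_x²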
Correlation coefficient in terms of regression coefficients
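The correlation coefficient is the geometric mean of the two regression coefficients: γ = ±√(b_xy · b_yx), where b_xy and b_yx are the regression coefficients of X on Y and of Y on X, and γ takes their common sign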
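Relationship between correlation coefficient and regression coefficients
Since arithmetic mean ≥ geometric mean, the arithmetic mean of the two regression coefficients is at least as large in magnitude as the correlation coefficient: |b_xy + b_yx| / 2 ≥ |γ|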
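A small numerical check of the two cards above, as a Python sketch. It assumes the standard definitions b_yx = Cov(X,Y) / σ_x² and b_xy = Cov(X,Y) / σ_y² (population form); the data and the function name are invented for illustration.

```python
import math

def regression_summary(xs, ys):
    """Return (b_yx, b_xy, gamma) using population variances and covariance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    b_yx = cov / var_x                       # regression coefficient of Y on X
    b_xy = cov / var_y                       # regression coefficient of X on Y
    gamma = cov / math.sqrt(var_x * var_y)   # Pearson correlation coefficient
    return b_yx, b_xy, gamma

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 5, 4, 5]

b_yx, b_xy, gamma = regression_summary(xs, ys)

# Geometric mean of the regression coefficients, carrying their common sign.
geometric_mean = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
arithmetic_mean = (b_yx + b_xy) / 2
print(gamma, geometric_mean, arithmetic_mean)
```

Here the geometric mean reproduces γ, and the arithmetic mean is no smaller than |γ|.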
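Probable error
P.E. = 0.6745 × (1 - γ²) / √n, where n is the number of pairs of observations
Standard Error of coefficient of correlation
S.E. = (1 - γ²) / √n, where n is the number of pairs of observations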
Point of intersection of regression lines
(mean of x, mean of y)
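If there are two regression lines, how do you decide which one is y on x and which one is x on y?
Assume one assignment and compute both regression coefficients. Their product equals γ², so it must not exceed 1; if it does, the assignment is the other way round.
Important note of regression lines
If two regression lines of y on x and x on y are respectively a₁x+b₁y+c₁=0 and a₂x+b₂y+c₂=0 then
|a₁b₂| ≤ |a₂b₁|, because the product of the regression coefficients, a₁b₂ / (a₂b₁), equals γ² ≤ 1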
Angle between two regression lines
tan θ = ((1 - γ²) / |γ|) · σ_x σ_y / (σ_x² + σ_y²), where θ is the acute angle between the lines
Two special points about angle of regression and correlation coefficient
If γ=0 then θ=π/2
If γ=±1 then θ=0 or π
Sign of the regression coefficients and the correlation coefficient
Both regression coefficients and the correlation coefficient always have the same sign
Regression line y on x passes through
(mean of x, mean of y)
Regression line x on y passes through
(mean of x, mean of y)
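The last cards can be illustrated with a short Python sketch that builds both regression lines from made-up data and solves for their intersection. It assumes the standard forms y = a1 + b_yx·x and x = a2 + b_xy·y with b_yx = Cov(X,Y) / σ_x² and b_xy = Cov(X,Y) / σ_y²; all variable names are hypothetical, and the lines are assumed not to coincide (γ² ≠ 1).

```python
xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
var_x = sum((x - mean_x) ** 2 for x in xs) / n
var_y = sum((y - mean_y) ** 2 for y in ys) / n

b_yx = cov / var_x              # slope of the line of y on x
b_xy = cov / var_y              # slope (in y) of the line of x on y
a1 = mean_y - b_yx * mean_x     # intercept of y = a1 + b_yx * x
a2 = mean_x - b_xy * mean_y     # intercept of x = a2 + b_xy * y

# Substitute the first line into the second and solve for x.
# Valid only when b_xy * b_yx != 1, i.e. the two lines are distinct.
x_cross = (a2 + b_xy * a1) / (1 - b_xy * b_yx)
y_cross = a1 + b_yx * x_cross
print((x_cross, y_cross), (mean_x, mean_y))
```

The intersection coincides with (mean of x, mean of y), which is why both lines pass through that point.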