Regression 2 Flashcards
Is Pearson’s r sensitive to outliers? Justify
Pearson’s r is sensitive to outliers, which can have a great impact on
the line of best fit and the Pearson correlation coefficient, leading to
very difficult conclusions regarding your data. Therefore, it is best if
there are no outliers or they are kept to a minimum. Fortunately, you
can use Stata to detect possible outliers using scatterplots.
What is Regression Analysis?
When we make a distribution in which there is an involvement of more
than one variable, then such an analysis is called Regression Analysis. It
generally focuses on finding or rather predicting the value of the
variable that is dependent on the other.
Therefore, if we use a simple linear regression model where y depends
on x, then the regression line of y on x is:
y = a + bx, (where a- y-intercept and b-slope)
Regression Coefficient
The two constants a and b are regression parameters. Furthermore, we denote
the variable b as byx and we term it as regression coefficient of y on x.
* Also, we can have one more definition for the regression line of y on x. We can
call it the best fit as the result comes from least squares. This method is the most
suitable for finding the value of y on x i.e. the value of a dependent variable on an
independent variable.
Least Squares Method
The least square method is the process of finding
the best-fitting curve or line of best fit for a set of
data points by reducing the sum of the squares of
the offsets (residual part) of the points from the
curve.
This process is termed as regression analysis
Contingency Tables
A contingency table provides a way of portraying data that can facilitate
calculating probabilities. The table helps in determining conditional probabilities
quite easily. The table displays sample values in relation to two different variables
that may be dependent or contingent on one another