Extra Notes Flashcards
How do we report the percentile rank?
Do not use %
Just say, e.g., The percentile rank of the student who obtained a raw score of 172, is 80.
What is the rule of linear transformations?
- If you add or subtract a constant from each value in a distribution,
– The mean is increased/decreased by that constant – The standard deviation is unchanged
• If you multiply or divide each value in a distribution by a constant,
– The mean is multiplied/divided by that constant
– The standard deviation is multiplied/divided by that constant
What does it mean for a non-linear relationship to have a Pearson r value of 0?
This does not mean there is no relationship between the variables, just that there is no linear relationship!
What is the effect of outliers?
An outlier can have a large effect on the Pearson r, and on the “line of best fit”
It pulls the line over towards the outlier.
What is the effect of range restriction?
Not looking at the whole range of data may lead to a weaker relationship/r value
Before calculating a correlation coefficient, consider whether the ranges of the two variables are sufficient to show their true relationship.
How do you define Pearson r?
Pearson r is a measure of the extent to which paired scores occupy the same or opposite positions within their own distributions.
How do you interpret Pearson r values?
Equal to 0
No relationship
Between 0 and .10
Trivial
Between .10 and .30
Small to medium
Between .30 and .50
Medium to large
Greater than .50
Large to very large
What’s one thing to notice when calculating regression Y?
Regression constants (ay and by) should be reported to 4 decimal places. This helps retain accuracy in your final answer when using the equation to predict Y.
What does Pearson r tell us?
Pearson r tells us how helpful the regression line will be in predicting Yi given Xi. Forrvaluesin between 0 and 1, the regression line will produce moderate errors
Pearson r also tells us something about how much of the variability in Y is accounted for by (the variability in) X.
What is r^ 2?
proportion of the variability of Y accounted for by X
In a sample of students, height and weight are correlated with r = .65. What percentage of the variability in weight is accounted for by height in this sample?
r^2 = 42.25
Pearson r is used when X and Y are both measured on what scales?
Interval or ratio scales
What are other types of correlation coefficients are used?
If the relationship is curvilinear, the correlation coefficient eta (η) can be used to describe the strength of the relationship
Spearman rank order correlation coefficient rho (rs)
one or both variables are measured on an ordinal scale
biserial correlation coefficient (rb)
one of the variables is interval or ratio and the other is dichotomous
phi coefficient (Φ)
both variables are dichotomous
What is the least-squares regression line?
The least-squares regression line is the prediction line that minimizes the total error of prediction, according to the least-squares criterion of
∑(Y −Y ‘)2
Y’ = byX + aY
report ay and by to 4 decimal places.
Limitations of linear regression
Only appropriate:
– for linear relationships
– when the sample you used to calculate the regression line is representative of the sample you want to make predictions about
– within the range of the original variables
What is the standard error of estimate? SEE
The standard error of estimate (SEE) tells us how much error can we expect on average when we use the regression line.
Example report: We can expect about 68% of actual GPAs to will fall within 0.43 points of the prediction
What is homoscedasticity?
variability in Y stays constant across X values
How much has our prediction improved by adding a new predictor variable?
You could answer this by looking at:
– how much the standard error of estimate has decreased after adding the new predictor
– how much the proportion of variability accounted for has increased after adding the new predictor
“Total proportion of variability accounted for” (R2) is the most common measure of a regression model’s “goodness of fit” to the data