Chapter One Flashcards
What is data mining?
Sorting through large amounts of information looking for anything that is relevant.
Which is riskier: interpolation or extrapolation?
Interpolation is less risky than extrapolation
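A minimal sketch of the distinction, assuming interpolation means predicting within the observed range of x values and extrapolation means predicting outside it. The function name `prediction_type` and the sample figures are hypothetical, chosen only for illustration.

```python
def prediction_type(x_new, observed_xs):
    """Classify a prediction by whether x_new lies inside the observed x range."""
    if min(observed_xs) <= x_new <= max(observed_xs):
        return "interpolation"  # inside the data: supported by observations
    return "extrapolation"      # outside the data: the relationship may not hold

# Hypothetical observed activity levels (illustrative data only)
observed_xs = [100, 150, 200, 250, 300]

inside = prediction_type(220, observed_xs)
outside = prediction_type(500, observed_xs)
```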
Does strong correlation prove cause and effect?
No, it does not
When is least squares regression applicable?
Only to linear relationships
What is plotted on the y-axis?
Dependent variable
In the regression equation y = a + bx, what is b?
The slope (gradient) of the line
How are regression coefficients calculated?
Using the least squares method
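As a rough illustration, the least squares coefficients for a line y = a + bx can be computed from the usual summation formulas. The function name `least_squares` and the data are assumptions made for this sketch, not part of the original cards.

```python
def least_squares(xs, ys):
    """Return (a, b) for the fitted line y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: b = (n*Sxy - Sx*Sy) / (n*Sx2 - Sx^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: a = mean(y) - b * mean(x)
    a = (sum_y - b * sum_x) / n
    return a, b

# Illustrative data lying exactly on y = 1 + 2x
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = least_squares(xs, ys)
```

With these points the fit recovers an intercept of 1 and a slope of 2, since the data lie exactly on that line.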
In a positively skewed distribution, what holds true?
Mean > median > mode
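A small check of this ordering, using Python's standard `statistics` module on a made-up data set with a long right-hand tail. The numbers are illustrative only.

```python
import statistics

# Made-up positively skewed data: most values are small, a few are large
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

mean = statistics.mean(data)      # pulled upwards by the large values
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

ordering_holds = mean > median > mode
```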
What is the mean a measure of?
Location
Is the IQR affected by extremes?
No, it is unaffected by extremes
Is the range a function of extremes?
Yes
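The contrast between the two measures of spread can be sketched as follows. The helper names `iqr` and `value_range` are hypothetical, the data are made up, and `statistics.quantiles` (Python 3.8+) uses one of several possible quartile conventions, so exact quartile values may differ from a textbook method.

```python
import statistics

def iqr(values):
    """Interquartile range: upper quartile minus lower quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

base = [2, 4, 6, 8, 10, 12, 14]
with_outlier = [2, 4, 6, 8, 10, 12, 140]  # largest value replaced by an extreme

iqr_unchanged = iqr(base) == iqr(with_outlier)
range_changed = value_range(with_outlier) > value_range(base)
```

Replacing the largest value with an extreme one leaves the IQR the same but inflates the range.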
If the plot of two variables X and Y results in a horizontal line, what does this tell us?
They have zero correlation: the value of Y does not change regardless of the value of X, so Y is independent of X
Why might interpolated results from a line of best fit not be accurate?
Because the line of best fit is only an estimate; if the points are widely scattered around it, predictions read from the line will be unreliable
Why does regression analysis not highlight spurious relationships between two variables?
Because it only measures how well the data fit a line; it does not in itself prove causality
With a positively skewed distribution…
The mean is most distorted by extreme values
What does dividing the variance by the standard deviation for a sample give?
The standard deviation, as the variance is the standard deviation squared
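A quick numerical check of this, using the sample variance and sample standard deviation from Python's `statistics` module. The sample values are made up for illustration.

```python
import math
import statistics

# Made-up sample (illustrative data only)
sample = [4, 8, 6, 5, 3, 7]

variance = statistics.variance(sample)  # sample variance (n - 1 divisor)
std_dev = statistics.stdev(sample)      # square root of the sample variance

# variance / std_dev = std_dev^2 / std_dev = std_dev
quotient = variance / std_dev
matches = math.isclose(quotient, std_dev)
```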
Describe R-squared
- R-squared expresses the goodness of fit of a regression model
- It’s the correlation coefficient squared, so it lies on a scale of 0 to 1
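As a sketch of the second point, R-squared for a simple two-variable regression can be obtained by squaring Pearson's correlation coefficient. The function name `pearson_r` and the data are assumptions made for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data with a moderate positive relationship
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)  # lies between -1 and +1
r_squared = r ** 2     # lies between 0 and 1
```

Here r is about 0.77 and R-squared is 0.6, which is commonly read as 60% of the variation in y being explained by the variation in x.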