Chapter 10 - Re-expressing Data: Get It Straight! Flashcards
Re-expression
We do this to data when we apply a logarithm, square root, reciprocal, or some other mathematical operation to all values of a variable.
Ladder of Powers
This places in order the effects that many re-expressions have on the data.
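A minimal sketch of walking down the ladder, assuming the commonly used rungs (powers 2, 1, 1/2, "0" for the logarithm, -1/2, and -1) and a hypothetical right-skewed sample. The `skewness` helper and the data are illustrative assumptions, not part of the flashcards.

```python
import math

# Commonly used rungs of the Ladder of Powers (assumed here): (power, label, function).
# Power 0 is held by the logarithm; negative signs keep the ordering of values intact.
LADDER = [
    (2, "square", lambda y: y ** 2),
    (1, "raw data", lambda y: y),
    (0.5, "square root", lambda y: math.sqrt(y)),
    (0, "logarithm", lambda y: math.log10(y)),
    (-0.5, "negative reciprocal root", lambda y: -1 / math.sqrt(y)),
    (-1, "negative reciprocal", lambda y: -1 / y),
]


def skewness(values):
    """Moment-based skewness: positive means a longer right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample (all values positive).
sample = [1, 2, 2, 3, 4, 6, 9, 15, 30, 80]

skew_by_power = {}
for power, label, transform in LADDER:
    skew_by_power[power] = skewness([transform(y) for y in sample])
    print(f"power {power:>4}: {label:<26} skewness = {skew_by_power[power]:.2f}")
```

For this sample, squaring stretches the right tail further, while the logarithm pulls it in, which is the sense in which the ladder orders the effects of the re-expressions.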
What is meant by re-expressing data?
Making the data more suitable for analysis.
One of the goals of re-expressing data is to make the distribution appear more symmetric. Why is this advantageous?
It is easier to summarize the center of a symmetric distribution, and a roughly symmetric, unimodal distribution may be close enough to a Normal model that you can use the 68-95-99.7 rule.
Another goal of re-expressing data is to make the spread of several groups more alike. Why is this advantageous?
Groups that share a common spread are easier to compare.
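A small sketch of this idea, using three hypothetical groups whose spread grows with their level. The group names and values are made up for illustration; the comparison uses the ratio of the largest to the smallest standard deviation.

```python
import math
import statistics

# Hypothetical groups: each is the previous one multiplied by 10,
# so the spread grows along with the center.
groups = {
    "A": [10, 12, 15, 18, 20],
    "B": [100, 120, 150, 180, 200],
    "C": [1000, 1200, 1500, 1800, 2000],
}


def spread_ratio(data):
    """Largest group standard deviation divided by the smallest."""
    sds = [statistics.stdev(values) for values in data.values()]
    return max(sds) / min(sds)


logged = {name: [math.log10(v) for v in values] for name, values in groups.items()}

raw_ratio = spread_ratio(groups)
log_ratio = spread_ratio(logged)
print(f"raw spread ratio: {raw_ratio:.1f}")
print(f"log spread ratio: {log_ratio:.3f}")
```

After taking logs, the groups differ only in center, so side-by-side boxplots of the re-expressed values would be much easier to compare.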
Why is it advantageous to make the form of a scatterplot more nearly linear?
A linear form makes the scatterplot easier to describe, and once the relationship is straight you can fit a linear model to it.
What type of data often benefits from re-expression by squaring values?
Unimodal distributions skewed to the left.
What type of data often benefits from re-expression by taking the square root of values?
Counted data.
What type of data often benefits from re-expression by taking the logarithm of values?
Data that cannot be negative, especially values that grow by percentage rates. When in doubt, start with logs, then look at the residual plot to decide whether to move up or down the ladder of powers.
What type of data often benefits from re-expression by taking the reciprocal of values?
Data measured as rates, that is, ratios of two quantities (for example, miles per hour).
If your data contain zeroes, what must you do before re-expressing using logarithms or reciprocals? Explain.
Try adding a small constant to all values before finding logs or reciprocals. The logarithm of 0 and the reciprocal of 0 are both undefined, so any zeros must be shifted to positive values first.
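A minimal sketch of the zero problem and the shift that works around it. The helper name `log_with_offset`, the offset of 0.5, and the counts are all assumptions chosen for illustration; the right constant depends on the data.

```python
import math


def log_with_offset(values, offset=0.5):
    """Add a small constant to every value, then take base-10 logs."""
    if any(v + offset <= 0 for v in values):
        raise ValueError("offset too small: shifted values must be positive")
    return [math.log10(v + offset) for v in values]


counts = [0, 1, 3, 7, 20]  # hypothetical counted data containing a zero

# Taking the log of a raw zero fails outright.
try:
    math.log10(counts[0])
    zero_log_failed = False
except ValueError:
    zero_log_failed = True

shifted_logs = log_with_offset(counts)
print("log10(0) failed:", zero_log_failed)
print([round(v, 3) for v in shifted_logs])
```

The same constant is added to every value, so the order of the data is preserved.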
If a scatterplot of the x-values vs. the logarithm of the y-values appears to be linear, what type of relationship is there between the original x- and y-values?
Exponential.
Rewrite the exponential relationship above as an equation.
y = ab^x, which is equivalent to the linear form log y = log a + (log b) x.
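A short sketch of recovering an exponential model y = ab^x by fitting a least-squares line to log y against x. The `fit_line` helper and the data (generated exactly from y = 3 * 2^x) are hypothetical, chosen so the recovered constants can be checked.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


xs = [0, 1, 2, 3, 4, 5]
ys = [3 * 2 ** x for x in xs]  # hypothetical data following y = 3 * 2^x

# Fit log10(y) = intercept + slope * x, then undo the logs.
intercept, slope = fit_line(xs, [math.log10(y) for y in ys])
a_hat = 10 ** intercept  # intercept estimates log a
b_hat = 10 ** slope      # slope estimates log b
print(f"y is approximately {a_hat:.3f} * {b_hat:.3f}^x")
```

Only y is re-expressed here; x stays on its original scale.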
If a scatterplot of the logarithm of the x-values vs. the logarithm of the y-values appears to be linear, what type of relationship is there between the original x- and y-values?
Power
Rewrite the power relationship above as an equation.
y = ax^b, which is equivalent to the linear form log y = log a + b log x.
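A short sketch of recovering a power model y = ax^b by fitting a least-squares line to log y against log x. The `fit_line` helper and the data (generated exactly from y = 5 * x^1.5) are hypothetical, chosen so the recovered constants can be checked.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


xs = [1, 2, 4, 8, 16]
ys = [5 * x ** 1.5 for x in xs]  # hypothetical data following y = 5 * x^1.5

# Fit log10(y) = intercept + slope * log10(x); both variables are re-expressed.
intercept, slope = fit_line(
    [math.log10(x) for x in xs],
    [math.log10(y) for y in ys],
)
a_hat = 10 ** intercept  # intercept estimates log a
b_hat = slope            # slope is the exponent b directly
print(f"y is approximately {a_hat:.3f} * x^{b_hat:.3f}")
```

Unlike the exponential case, the slope of the log-log line is the exponent itself, so only the intercept needs to be converted back.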