Reading Quiz 4 Flashcards
logarithmic (exponential) transformation
if (x, y) display approximately exponential shape then graph of (x, logy) will display approximately linear shape.
steps in logarithmic transformation
- graph original data set
- plot ordered pairs (x, logy). shape should be approximately linear
- find linear regression equation for logy in terms of x. answer calculator gives is of form logy-hat = ax + b. check correlation coefficient and residual plot to verify that equation is fairly good fit for data
- take antilogarithm of both sides of equation to solve for y-hat
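The steps above can be sketched in Python. This is a minimal sketch, not a calculator procedure from the reading: the data are made up from y = 3 * 2^x, and `fit_line` and `predict` are hypothetical helper names. It uses the common (base 10) log.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for ys in terms of xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data generated from y = 3 * 2**x (illustration only).
xs = [0, 1, 2, 3, 4, 5]
ys = [3 * 2 ** x for x in xs]

# Regress log10(y) on x, giving logy-hat = a*x + b.
log_ys = [math.log10(y) for y in ys]
a, b = fit_line(xs, log_ys)

# Take the antilog of both sides: y-hat = 10**(a*x + b) = 10**b * (10**a)**x.
def predict(x):
    return 10 ** (a * x + b)

print(10 ** b, 10 ** a)  # recovered initial value and growth factor
```

Because the made-up data are exactly exponential, the recovered initial value and growth factor come out very close to 3 and 2. Real data would still need the correlation and residual plot check.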
power transformation
if ordered pairs (x, y) display approximate power function shape then graph of ordered pairs (logx, logy) will display approximately linear shape.
steps in power transformation
- graph original data set
- plot ordered pairs (logx, logy). shape should be approximately linear
- find linear regression equation for logy in terms of logx. answer calculator gives is of form logy-hat = a + b(logx). check correlation coefficient and residual plot
- take antilogarithm of both sides to solve for y-hat
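A minimal Python sketch of the power transformation steps above. The data are made up from y = 2 * x^1.5, and `fit_line` and `predict` are hypothetical helper names, not anything from the reading.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for ys in terms of xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data generated from y = 2 * x**1.5 (illustration only).
# x starts at 1 because log(0) is undefined.
xs = [1, 2, 3, 4, 5, 6]
ys = [2 * x ** 1.5 for x in xs]

# Regress log10(y) on log10(x), giving logy-hat = a + b*logx.
log_xs = [math.log10(x) for x in xs]
log_ys = [math.log10(y) for y in ys]
b, a = fit_line(log_xs, log_ys)  # b is the slope, a is the intercept

# Take the antilog of both sides: y-hat = 10**a * x**b.
def predict(x):
    return 10 ** a * x ** b

print(10 ** a, b)  # recovered coefficient and power
```

The slope of the log-log line is the power and the antilog of the intercept is the coefficient, so here they come out very close to 1.5 and 2.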
important notes
- when explanatory variable is years, transform date to years since a chosen starting year so values are smaller
- if function resembles power function then reasonable that point (0,0) should lie on graph
- can use any type of logarithm in log transformation
- extrapolation is use of regression line for prediction outside of range of values of explanatory variable x
- interpolation is use of regression line for prediction inside range of values of x (more trustworthy)
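To illustrate the note that any type of logarithm works, here is a small Python sketch that fits the same made-up data with common logs and with natural logs and compares the back-transformed predictions. The data values and the helper name `fit_line` are assumptions for illustration only.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for ys in terms of xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data with a roughly exponential shape (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 8.2, 15.8, 32.5]

# Fit with common logs, then undo with a power of 10.
a10, b10 = fit_line(xs, [math.log10(y) for y in ys])
pred_log10 = 10 ** (a10 * 6 + b10)

# Fit with natural logs, then undo with exp.
ae, be = fit_line(xs, [math.log(y) for y in ys])
pred_ln = math.exp(ae * 6 + be)

# x = 6 is outside the range of the data, so this prediction is an extrapolation.
print(pred_log10, pred_ln)
```

The two predictions agree up to rounding, because natural logs are just common logs multiplied by a constant. The only requirement is to undo the transformation with the matching antilog.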
important notes cont
- association does not imply causation
- a lurking variable is a variable that has an important effect on relationship among variables but is not included among the variables studied
- a confounding variable is a lurking variable that affects only the response variable but creates a situation where it’s impossible to determine whether the effect on the response variable is caused by the explanatory variable, the confounding variable, or neither
two way table
organizes data about two categorical variables
often used to summarize large amounts of data by grouping outcomes into categories
row variable
labels the rows that run across the table
column variable
labels the columns that run down the table
marginal distributions
row totals and column totals in a two way table give the marginal distributions of the two individual variables
conditional distribution
look at only one column (or one row) at a time
find each entry in column as percent of column total
this gives conditional distribution of row variable for that column; repeat for each column in the table
how to describe association between row and column variables when column variable is explanatory
compare conditional distributions of row variable for each value of column variable
how to describe association between row and column variables when row variable is explanatory
compare conditional distributions of column variable for each value of row variable
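A short Python sketch of marginal and conditional distributions for a two way table. The counts and the labels ("yes", "no", "group A", "group B") are made up for illustration.

```python
# Made-up counts (illustration only).
# Outer keys are values of the row variable, inner keys are values of the column variable.
table = {
    "yes": {"group A": 30, "group B": 10},
    "no":  {"group A": 20, "group B": 40},
}
columns = ["group A", "group B"]

# Marginal distributions: row totals and column totals.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}
col_totals = {col: sum(table[row][col] for row in table) for col in columns}

# Conditional distribution of the row variable for each column:
# each entry as a percent of its column total.
conditional = {
    col: {row: 100 * table[row][col] / col_totals[col] for row in table}
    for col in columns
}

for col in columns:
    print(col, conditional[col])
```

Here the column variable plays the explanatory role, so the comparison is between the two columns: 60% "yes" in group A against 20% "yes" in group B suggests an association in this made-up table.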
simpson’s paradox
a comparison between two variables that holds for each individual value of a third variable can be reversed when the data for all values of the third variable are combined
example of effect of lurking variables on observed association
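A small Python sketch of the reversal, using made-up (successes, trials) counts. The options "A" and "B" and the two groups are hypothetical; the groups stand in for values of the third (lurking) variable.

```python
# Made-up (successes, trials) counts for two options within two groups.
data = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (40, 100), "B": (3, 10)},
}

def rate(successes, trials):
    return successes / trials

# Within each group, A has the higher success rate (80% vs 70%, 40% vs 30%).
a_wins_each_group = all(rate(*g["A"]) > rate(*g["B"]) for g in data.values())

# Combine the groups by adding successes and trials.
def combined(option):
    successes = sum(g[option][0] for g in data.values())
    trials = sum(g[option][1] for g in data.values())
    return successes / trials

# Combined, A is 48/110 and B is 73/110, so the comparison reverses.
reversed_when_combined = combined("A") < combined("B")
print(a_wins_each_group, reversed_when_combined)
```

The reversal happens because most of A's trials fall in the group with low success rates while most of B's fall in the group with high success rates.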
common response
effect of lurking variables can operate through common response if changes in both explanatory and response variables are caused by changes in lurking variable
confounding
cannot distinguish between two variables’ effects on the response variable
best evidence that association is due to causation
comes from experiment in which explanatory variable is directly changed and other influences on response are controlled
NEED TO DO MORE READING QUESTIONS
OKAY
- True or False: if we have a curvilinear function, and we want to straighten it out to make a linear function, we can’t do that by multiplying or dividing by constants or adding or subtracting constants (i.e. by using linear transformations).
true
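One way to see why, sketched in Python with made-up data from y = 2^x: multiplying by a positive constant and subtracting a constant leaves the correlation with x unchanged, so the curve is no straighter, while a log transformation makes the relationship linear. The helper name `correlation` and the constants 5 and 7 are arbitrary choices for illustration.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up curved data generated from y = 2**x (illustration only).
xs = [0, 1, 2, 3, 4, 5]
ys = [2 ** x for x in xs]

r_original = correlation(xs, ys)
r_linear = correlation(xs, [5 * y - 7 for y in ys])       # linear transformation
r_log = correlation(xs, [math.log10(y) for y in ys])      # logarithmic transformation

print(r_original, r_linear, r_log)
```

The first two values match and stay below 1, while the log-transformed value is essentially 1. A negative multiplier would flip the sign of r but still would not straighten the curve.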