Correlation and Linear Regression Flashcards

1
Q

scatterplots

A

graphical summary of the relationship between two quantitative variables, plotted as (x, y) pairs

a straight line fitted through the plotted points is called a regression line

must determine which variable is the

  1. explanatory (predictor, independent) variable (x)
  2. response (dependent) variable (y)

the regression line describes how the response variable changes as the explanatory variable changes

  • the process of fitting a line through the data means drawing a line that comes as close as possible to the points
  • equation of this line will be given by: y = a + bx

a = intercept coefficient

b = slope coefficient
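A minimal sketch of how a and b could be computed, assuming "as close as possible" means the usual least-squares criterion (the card does not name the method). The function name `fit_line` and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope coefficient
    a = mean_y - b * mean_x  # intercept coefficient
    return a, b


# Hypothetical data: x = hours studied, y = test score
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

a, b = fit_line(hours, scores)
print(f"y = {a:.2f} + {b:.2f}x")
```

Here b is the predicted change in y for a one-unit increase in x, and a is the predicted y when x = 0.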

2
Q

linear correlation coefficient (r)

A

measures the strength and direction of a linear relationship between two quantitative variables

  • range of r: -1 ≤ r ≤ 1
  • sign of r indicates direction
  • magnitude of r indicates strength of the linear relationship between the 2 variables
  • r = 0 indicates no linear relationship; r close to 0 means a weak linear relationship
  • positive r: y tends to increase as x increases; negative r: y tends to decrease as x increases
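
The properties above can be checked numerically. This is a sketch using the standard Pearson formula; the function name `pearson_r` and all data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # points on an increasing line
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # points on a decreasing line

# A perfect curved (quadratic) pattern that has no linear trend
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_pos, r_neg, r_curve)
```

The last case is a reminder that r only measures linear association: y depends entirely on x there, yet r is 0.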
3
Q

regression line for prediction

A

after fitting the regression equation, use it to predict the value of y for a given value of x (even for values of x that were not in the original sample data)

  • using a regression line to predict outside the range of the data is called extrapolation and should be avoided
  • we only know about the relationship for the observed range of x values
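
One way to make the extrapolation warning concrete is a prediction helper that refuses x values outside the observed range. This is only a sketch: the function `predict`, the coefficients, and the range are hypothetical, and raising an error is one design choice among several (a warning would also work).

```python
def predict(a, b, x, x_min, x_max):
    """Predict y = a + b*x, but only inside the observed x range."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x = {x} is outside the observed range [{x_min}, {x_max}]; "
            "predicting here would be extrapolation"
        )
    return a + b * x


# Hypothetical fitted line y = 47.7 + 4.1x, from data with x between 1 and 5
A, B = 47.7, 4.1
X_MIN, X_MAX = 1, 5

# x = 3.5 was not in the sample but lies inside the observed range
y_hat = predict(A, B, 3.5, X_MIN, X_MAX)
print(round(y_hat, 2))

# x = 10 lies outside the observed range, so the helper refuses
try:
    predict(A, B, 10, X_MIN, X_MAX)
except ValueError as err:
    print(err)
```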
4
Q

outliers

A

observations that lie outside the overall pattern of other observations

5
Q

influential points

A

observations that, if removed, would considerably change the correlation or the regression line
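
A rough way to check for influence is to refit the line with and without the suspect point and compare. The sketch below assumes a least-squares fit; the helper `slope_intercept` and the data are made up, with the last point placed far from the others in x.

```python
def slope_intercept(xs, ys):
    """Least-squares slope b and intercept a for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return b, mean_y - b * mean_x


# Hypothetical data: five points on y = 1 + x, plus one distant point
xs = [1, 2, 3, 4, 5, 15]
ys = [2, 3, 4, 5, 6, 1]

slope_with, _ = slope_intercept(xs, ys)
slope_without, _ = slope_intercept(xs[:-1], ys[:-1])  # drop the last point

print(f"slope with point:    {slope_with:.3f}")
print(f"slope without point: {slope_without:.3f}")
```

In this made-up data, removing a single observation flips the slope from negative to positive, which is what makes that point influential.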

6
Q

best method to establish a cause-and-effect relationship

A

best method: manipulate the explanatory variable in an experiment and control for other variables

when an experiment is not possible, the case for causation is stronger if you can

  • show a strong and consistent association
  • have an alleged cause preceding the effect in time
  • have an alleged cause that is plausible

ex) studies of lung cancer among people who did or did not smoke

  • observational studies were done, since individuals had decided to become smokers before the data were collected
  • it was once argued that people who choose to smoke may be more susceptible to lung cancer for other reasons
  • further evidence has shown a strong link between the two

strong and consistent association: people who smoke more often, or for a longer period, get lung cancer more often, but people who stop reduce their risk

cause precedes effect: lung cancer develops after years of smoking and was rare among women until women began to smoke

plausible cause: experimental (non-observational) animal studies have shown that tar from cigarettes does cause cancer
