Statistics: Regressions and associations Flashcards

1
Q

What does b represent in the regression line equation ŷ = a + bx?

A

The slope: the amount by which the predicted y changes when x increases by one unit.

2
Q

What is the function of the absolute value of the slope?

A

It describes the magnitude of the change in ŷ for a one-unit change in x. The larger the absolute value, the steeper the line.

3
Q

What is meant by prediction errors? Give another name for them.

A

The difference between the actual y value and the predicted y value. These are also known as residuals.

4
Q

When is a residual positive?

A

When the actual y is larger than the predicted y
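
As a small illustration of the sign convention, the sketch below computes residuals for a hypothetical line ŷ = 1 + 2x and two made-up points. The line, the data, and the function names are assumptions chosen for illustration, not part of the original cards.

```python
def predict(x, a=1.0, b=2.0):
    """Predicted y from a hypothetical line y-hat = a + b*x."""
    return a + b * x


def residual(x, y):
    """Residual = actual y minus predicted y."""
    return y - predict(x)


# Point above the line: actual 4, predicted 3, so the residual is positive.
above = residual(1, 4)
# Point below the line: actual 4, predicted 5, so the residual is negative.
below = residual(2, 4)

print(above, below)
```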

5
Q

What is meant by the least squares method?

A

Among all possible lines, it chooses the regression line with the smallest residual sum of squares.
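
A minimal sketch of the idea, assuming the usual closed-form solution for one explanatory variable (slope = Sxy / Sxx, intercept = ȳ - slope · x̄). The data are made up and the function names are illustrative. The fitted line's residual sum of squares is compared with that of a few slightly nudged lines.

```python
def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def rss(xs, ys, a, b):
    """Residual sum of squares for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
best = rss(xs, ys, a, b)

# Nudging the intercept and slope away from the fit should only increase the RSS.
nudged = [rss(xs, ys, a + da, b + db) for da in (-0.1, 0.1) for db in (-0.1, 0.1)]

print(a, b, best, min(nudged))
```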

6
Q

Apart from making the errors as small as possible, name two characteristics of the regression line.

A
  • Has both positive and negative residuals, which sum to zero
  • Passes through the point of means (x̄, ȳ)
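
Both characteristics can be checked numerically. The sketch below fits a least squares line to made-up data (assuming the standard formulas slope = Sxy / Sxx and intercept = ȳ - slope · x̄), then looks at the residuals and at the prediction at x̄.

```python
# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares slope and intercept.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

# Residuals: actual y minus predicted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
residual_sum = sum(residuals)

# The line's prediction at the mean of x should equal the mean of y.
prediction_at_mean = a + b * x_bar

print(residuals, residual_sum, prediction_at_mean, y_bar)
```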

7
Q

Why can't we just use the slope to measure the strength of an association?

A

The slope depends on the units of measurement of the variables, so its size says nothing about how strong the association is.
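
To see the unit dependence concretely, the sketch below uses made-up heights and weights and fits the same data twice, once with height in centimetres and once in metres. The variable names and numbers are illustrative assumptions.

```python
from math import sqrt


def slope(xs, ys):
    """Least squares slope of y on x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def correlation(xs, ys):
    """Pearson correlation r."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data: the same heights expressed in two different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 74, 85]
heights_m = [h / 100 for h in heights_cm]

slope_cm = slope(heights_cm, weights_kg)  # kg per cm
slope_m = slope(heights_m, weights_kg)    # kg per m, 100 times larger
r_cm = correlation(heights_cm, weights_kg)
r_m = correlation(heights_m, weights_kg)

print(slope_cm, slope_m, r_cm, r_m)
```

The slope changes by a factor of 100 while the correlation stays the same, which is why the correlation is the unit-free measure.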

8
Q

Name two connections between correlations and regressions

A
  • They are both appropriate when the relationship between two quantitative variables can be approximated by a straight line
  • The correlation and the slope of the regression line have the same sign. If one is positive, so is the other one.
9
Q

Name three key differences between regression and correlation

A

  • Regression requires identifying the explanatory and response variables, because swapping them gives a different line; the correlation is the same either way.
  • The slope of the regression line depends on the units of measurement; the correlation does not.
  • The correlation always falls between -1 and 1, while the slope can take any real value.
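
The role of the explanatory and response variables can be illustrated by swapping them. In the sketch below (made-up data, illustrative function names), regressing y on x and regressing x on y give different slopes, while the correlation is identical in both directions.

```python
from math import sqrt


def slope(xs, ys):
    """Least squares slope when ys is the response and xs is explanatory."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def correlation(xs, ys):
    """Pearson correlation r, symmetric in its two arguments."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope_y_on_x = slope(xs, ys)  # y is the response
slope_x_on_y = slope(ys, xs)  # roles swapped: x is the response
r_xy = correlation(xs, ys)
r_yx = correlation(ys, xs)

print(slope_y_on_x, slope_x_on_y, r_xy, r_yx)
```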

10
Q

What is the typical way to interpret r^2?

A

The proportion of the variation in the y-values that is accounted for by the linear relationship of y with x.
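
A quick numerical check of this interpretation, using made-up data: for a least squares line with one explanatory variable, the squared correlation equals 1 - RSS / TSS, where RSS is the residual sum of squares and TSS is the total sum of squares of y around its mean. The variable names are illustrative.

```python
from math import sqrt

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

# Least squares line and correlation.
b = sxy / sxx
a = y_bar - b * x_bar
r = sxy / sqrt(sxx * syy)

# Variation left over after using the line, versus total variation in y.
rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
tss = syy

r_squared_from_r = r ** 2
r_squared_from_fit = 1 - rss / tss

print(r_squared_from_r, r_squared_from_fit)
```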

11
Q

Name 4 pitfalls of association analysis

A

Extrapolation (using a regression line to predict y values for x values outside of the range of data)
Influential outliers (regression outliers with extreme x values)
Inferring causation from an association
Lurking variables

12
Q

What two conditions are required for an outlier to be influential?

A
  • Its x value is relatively low or high compared to the rest of the data
  • Falls far from the trend that the rest of the data follows (regression outlier)
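
The sketch below illustrates why both conditions matter, using made-up data. The same off-trend y value is added once at a typical x and once at an extreme x, and the least squares slope is refitted each time.

```python
def slope(xs, ys):
    """Least squares slope of y on x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Made-up baseline: five points lying exactly on y = x.
base_x = [1, 2, 3, 4, 5]
base_y = [1, 2, 3, 4, 5]

slope_base = slope(base_x, base_y)

# Regression outlier at a typical x (x = 3, the mean of the baseline x values).
slope_typical_x = slope(base_x + [3], base_y + [0])

# Regression outlier at an extreme x (x = 20, far above the other x values).
slope_extreme_x = slope(base_x + [20], base_y + [0])

print(slope_base, slope_typical_x, slope_extreme_x)
```

In this example the outlier at a typical x leaves the slope at 1, while the one at an extreme x turns the slope negative.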
13
Q

What are predictions of the future using time series data called?

A

Forecasts

14
Q

What does it mean for a statistic to be non-resistant?

A

It is prone to distortion by outliers

15
Q

What is meant by a lurking variable?

A

A variable, usually unobserved, that influences the association between the two variables of primary interest

16
Q

What is Simpson's paradox?

A

When a trend appears in several different groups of data but disappears or reverses when these groups are combined
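
A minimal numerical sketch of a reversal, using made-up data for two hypothetical groups. Within each group the least squares slope is positive, but the slope fitted to the combined data is negative.

```python
def slope(xs, ys):
    """Least squares slope of y on x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Made-up data: group A has low x and high y, group B has high x and low y.
group_a_x, group_a_y = [1, 2, 3], [6, 7, 8]
group_b_x, group_b_y = [6, 7, 8], [1, 2, 3]

slope_a = slope(group_a_x, group_a_y)  # upward trend within group A
slope_b = slope(group_b_x, group_b_y)  # upward trend within group B

# Combining the groups ignores group membership and reverses the trend.
slope_all = slope(group_a_x + group_b_x, group_a_y + group_b_y)

print(slope_a, slope_b, slope_all)
```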

17
Q

When does confounding occur?

A

When two explanatory variables are both associated with a response variable but are also associated with each other

18
Q

What's the difference between a lurking variable and a confounding variable?

A

A lurking variable is not measured in the study, whereas a confounding variable is.