Chapter 10 - Relationships between Numeric Variables: Regression and Correlation Flashcards

1
Q

What to look for in a scatter plot:

A

Trend (pattern), scatter, outliers, strength of the relationship, association, groupings

2
Q

Association

A

A pattern that connects two (or more) variables. This pattern would be unlikely to be generated purely by chance. Conversely, there is no relationship when learning the value of one variable tells you nothing new about the likely value of the other.

3
Q

Correlation

A

The strength and direction of the relationship between two numeric variables.

4
Q

Positive correlation

A

The values of one variable tend to increase as the values of the other variable increase.

5
Q

Negative correlation

A

The values of one variable tend to decrease as the values of the other variable increase.

6
Q

Correlation coefficient

A

A number between -1 and 1 calculated so that the number represents the strength and direction of the linear relationship between two numeric variables.
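As a rough illustration, the sketch below computes the usual (Pearson) correlation coefficient from scratch. The function name `pearson_r` and the data are made up for this example; the two response lists are chosen to lie exactly on a rising and a falling line, so the results should come out at (or within rounding of) 1 and -1.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
x_values = [1, 2, 3, 4, 5]
rising = [3, 5, 7, 9, 11]    # exactly 2*x + 1: perfect positive linear trend
falling = [10, 8, 6, 4, 2]   # exactly 12 - 2*x: perfect negative linear trend

print(pearson_r(x_values, rising))
print(pearson_r(x_values, falling))
```

The sign of the result gives the direction of the linear relationship, and its distance from 0 gives the strength.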

7
Q

A correlation coefficient of 1

A

Indicates a perfect linear relationship with positive slope.

8
Q

A correlation coefficient of -1

A

Indicates a perfect linear relationship with negative slope.

9
Q

Direction

A

A descriptor of whether an association between two variables is positive or negative.

10
Q

Explanatory variable

A

A variable that we want to use to try to explain or predict the behaviour of a response variable, or just to investigate whether this might be possible.

11
Q

Extrapolate

A

To estimate the value of one variable based on knowing the value of the other variable, where the known value is outside the range of values of that variable for the data on which the estimation is based.
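One way to picture this is a simple range check before predicting. The helper name `is_extrapolation` and the observed values below are hypothetical, chosen only to illustrate the definition.

```python
def is_extrapolation(x_new, observed_xs):
    """True when x_new lies outside the range of the observed explanatory values."""
    return x_new < min(observed_xs) or x_new > max(observed_xs)

# Hypothetical explanatory values used to fit a line.
observed_xs = [12, 15, 19, 23, 30]

print(is_extrapolation(20, observed_xs))  # inside the range 12 to 30
print(is_extrapolation(45, observed_xs))  # beyond the largest observed value
```

A prediction at a value flagged by this check relies on the trend continuing where there is no data to support it.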

12
Q

Least squares regression line

A

A line used to represent a linear trend between the two numeric variables displayed in a scatter plot where the line is chosen to minimise the sum of the squares of the residuals.
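The following is a minimal sketch of how such a line could be found, using the standard closed-form formulas for the slope and intercept. The function names and the data are invented for illustration; the second helper computes the quantity the line minimises.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line for the given data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared differences between observed and fitted response values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, intercept, slope)
# Any other line, such as this slightly steeper one, gives a larger sum.
other = sum_squared_residuals(xs, ys, intercept, slope + 0.1)
print(intercept, slope, best, other)
```

Comparing `best` with `other` shows the defining property: nudging the slope or the intercept away from the fitted values increases the sum of squared residuals.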

13
Q

(Simple) linear regression

A

A procedure used when one numeric variable (explanatory) is used to predict or explain the behaviour of a second numeric variable (response variable) and the overall pattern between the two variables can be represented by a line.

14
Q

Linear trend

A

The overall pattern between the two numeric variables displayed in a scatter plot when that pattern can be represented by a line.

15
Q

Outlier(s)

A

Value(s) that lie so far away from the bulk of the data that they look odd and make us wonder "is that a mistake?"

16
Q

Prediction interval

A

A range of values that is likely to contain the Y-value for an individual with a specified value of X.

17
Q

Regression line

A

A line used to represent a linear trend between the two numeric variables displayed in a scatter plot (Alternative name: fitted line)

18
Q

Residual

A

The difference between an observed value of the response variable and the value of the response variable predicted from the regression line, i.e., observed value - predicted (fitted) value
(Alternative name: prediction error)
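As a small sketch of the "observed value - predicted value" calculation, the code below computes residuals for made-up data against a line with an assumed intercept of 2.2 and slope of 0.6 (which happens to be the least squares line for these points). The function name `residuals` is hypothetical.

```python
def residuals(xs, ys, intercept, slope):
    """Observed response minus the response predicted by the line, for each point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data and an assumed fitted line, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
res = residuals(xs, ys, intercept=2.2, slope=0.6)

for x, y, r in zip(xs, ys, res):
    # A positive residual means the point lies above the line, negative below.
    print(x, y, round(r, 2))
```

For a least squares line the residuals sum to zero (up to rounding), so points above the line are balanced by points below it.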

19
Q

Scatter

A

In a scatter plot, the extent to which the values of the response variable deviate from the trend.

20
Q

Trend

A

The overall pattern between the two numeric variables displayed in a scatter plot.

21
Q

y-intercept

A

The value of Y at the point on a line where X = 0. In regression this is often a meaningless value, for example when X = 0 lies outside the range of the observed data.