Correlation Flashcards

1
Q

bivariate data

A

data made up of pairs of values for two variables

2
Q

independent variable

A

x-axis (explanatory variable)

3
Q

dependent variable

A

y-axis (response variable)

4
Q

types of correlation

A

strong negative, weak negative, strong positive, weak positive

5
Q

causal relationship

A

when a change in one variable causes a change in the other

6
Q

comment on the claim that hotter countries have less rainfall

A

the graph does not support the statement that hotter countries have less rainfall

7
Q

“describe and interpret the correlation between 2 variables”

A

there is a positive/negative correlation, since as … increases/decreases, … increases/decreases

8
Q

there is a weak negative correlation between internet speed and house value. Danyal draws a causal conclusion from this; suggest why he may be wrong

A

there may be a third variable that influences both house value and internet speed, e.g. distance from built-up areas

9
Q

outlier formula

A

upper: Q3 + 1.5 × IQR
lower: Q1 - 1.5 × IQR
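
A minimal Python sketch of this rule. The function names and the sample numbers are made up for illustration, and the quartiles are passed in as given rather than computed, since quartile conventions differ between courses and libraries.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) outlier boundaries from the 1.5 x IQR rule."""
    iqr = q3 - q1  # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the values that lie outside the lower and upper boundaries."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Hypothetical data set with Q1 = 10 and Q3 = 20 (so IQR = 10).
lower, upper = outlier_fences(10, 20)
outliers = find_outliers([2, 12, 15, 18, 40], 10, 20)
print(lower, upper, outliers)
```

With these made-up quartiles the boundaries are -5 and 35, so only the value 40 is flagged.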

10
Q

give a reason why you might exclude an anomaly, and a reason why you might include it

A

exclude: the anomaly is an outlier and not representative of the data
include: the “anomaly” is a genuine part of the data distribution, so it should be kept

11
Q

what kind of correlation is this?

and what does it show?

A

weak negative (overall downward trend)

a causal relationship between the two variables

12
Q
A
13
Q

type of line of best fit

A

least squares regression line

the regression line that minimises the sum of the squares of the vertical distances of the data points from the line

D = vertical distance of a data point (x, y) from the line

it minimises the value of D1^2 + D2^2 + D3^2 + …
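
A short Python sketch of the least squares idea, assuming the standard formulas b = Sxy / Sxx and a = (mean of y) - b × (mean of x). The names Sxy and Sxx, the function names and the data are not from the cards and are used only for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares regression line y = a + bx."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sxy and Sxx: sums of products and squares of deviations from the means.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_mean - b * x_mean
    return a, b


def sum_of_squares(xs, ys, a, b):
    """Return D1^2 + D2^2 + ..., where D is the vertical distance to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
best = sum_of_squares(xs, ys, a, b)
# Any other line, e.g. one with a slightly different gradient, gives a larger sum.
worse = sum_of_squares(xs, ys, a, b + 0.1)
print(a, b, best, worse)
```

Nudging a or b away from the fitted values always increases the sum of squared distances, which is what "least squares" means.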

14
Q

formula for regression line

A

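y = a + bx

the order of the variables is important

the regression line of y on x is different from the regression line of x on y

the coefficient b is the change in y for each unit change in x; for example, if b is negative, the data are negatively correlated

and vice versa: if b is positive, the data are positively correlated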

15
Q

w = windspeed (knots)

g = gust (knots)

give an interpretation of the value of the gradient of this regression line

A

just say what the gradient does, in context

e.g. if the windspeed increases by 10 knots, the daily maximum gust increases by 18 knots

16
Q

justify the use of a linear regression line

A

because the correlation suggests a linear relationship

17
Q

when should you use a linear regression line?

A

to estimate values of the dependent variable that are within the range of the given data (interpolation)

18
Q

a value is inside the range of the data; should linear regression be used?

A

the value is within the range of the data, so the linear regression estimate is more likely to be accurate

if it is outside the range of the data (extrapolation), the linear regression estimate is less likely to be accurate
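
The range check can be sketched in Python as below. The function name and the sample data are hypothetical, and the check only labels an estimate as interpolation or extrapolation; it does not measure how accurate the estimate is.

```python
def estimate_kind(x_value, xs):
    """Label an estimate by whether x_value lies inside the observed x range."""
    if min(xs) <= x_value <= max(xs):
        return "interpolation"  # inside the data: estimate more likely to be accurate
    return "extrapolation"      # outside the data: estimate less likely to be accurate


# Hypothetical observed x values.
observed_x = [3, 5, 8, 12, 15]
print(estimate_kind(10, observed_x))
print(estimate_kind(25, observed_x))
```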

19
Q

regression equation y = 2 + 6x

a man wants to estimate x from y (x is the independent variable, y is the dependent variable)

suggest why this is not appropriate

A

x is the independent variable, and you should only make predictions for the dependent variable, so you should not use this model to predict a value of x for a given value of y

instead, you need to use the regression line of x on y
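
To see why the two lines differ, here is a small Python sketch using hypothetical data (not the data behind y = 2 + 6x, which the card does not give). The helper `fit_line` is an illustrative name; it fits outputs on inputs by least squares, so swapping the arguments gives the x on y line.

```python
def fit_line(inputs, outputs):
    """Least squares line: outputs = intercept + gradient * inputs."""
    n = len(inputs)
    in_mean = sum(inputs) / n
    out_mean = sum(outputs) / n
    s_io = sum((i - in_mean) * (o - out_mean) for i, o in zip(inputs, outputs))
    s_ii = sum((i - in_mean) ** 2 for i in inputs)
    gradient = s_io / s_ii
    return out_mean - gradient * in_mean, gradient


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)  # y on x: y = a + bx
c, d = fit_line(ys, xs)  # x on y: x = c + dy

y_value = 5
x_from_rearranging = (y_value - a) / b  # misusing the y on x line
x_from_x_on_y = c + d * y_value         # using the x on y line
print(x_from_rearranging, x_from_x_on_y)
```

For this made-up data the rearranged y on x line and the x on y line give different estimates of x for the same y, which is why the order of the variables matters.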

20
Q

common regression line question

comment on the reliability of the estimated values of x and y (y is outside the range of the data)

A

x is reliable as it is within the range of the data

y is not reliable as it is outside the range of the data

21
Q
A

There are two key problems with Helen’s statement: First, 10 coats of paint is very far outside our range of given data, and we cannot assume that this linear relationship continues as we extrapolate, so using the regression line is not necessarily valid. Second, even if we accept the extrapolation as valid, a gradient of 1.45 means that, for every extra coat of paint, the protection will increase by 1.45 years. Therefore, if 10 coats of paint are applied, the protection will be 14.5 years longer than if no paint were applied. Helen has, however, forgotten to include the constant 2.93 years, which is the weather resistance if no paint were applied. After 10 coats of paint the protection will last approximately 2.93 + 14.5 = 17.43 years.

comment that the value is outside the range of the data, so extrapolating with this regression equation may not be valid

comment on the fact that the constant has not been included
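
The arithmetic in this answer can be checked with a few lines of Python. The constant 2.93 and gradient 1.45 come from the answer above; the function and variable names are made up for illustration.

```python
INTERCEPT = 2.93  # years of protection with no coats of paint (from the answer)
GRADIENT = 1.45   # extra years of protection per coat of paint (from the answer)


def protection_years(coats):
    """Estimate protection in years from the regression line (a + b * coats)."""
    return INTERCEPT + GRADIENT * coats


increase_only = GRADIENT * 10        # the figure obtained without the constant
full_estimate = protection_years(10)  # includes the constant
print(round(increase_only, 2), round(full_estimate, 2))
```

Both figures still rely on extrapolating to 10 coats, so even the full estimate should be treated with caution.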

22
Q

note

A

if a weak negative correlation is present, the gradient of the regression line should be negative

23
Q

the equation of the regression line for houses…

y = 900 + 5x

x = number of bedrooms

a person says that if there are no bedrooms, the price of the house will be 900

why is this unreasonable?

A

This is not a reasonable statement as there are unlikely to be any houses with no bedrooms, so she is extrapolating outside of the range of data, where the linear relationship is unlikely to continue.

24
Q

the regression equation should be used to give a value for…

v= h + 100x

A

v given h

(this is an example)