IA2 - Exam Flashcards

1
Q

Bivariate Data

define explanatory variable

A
  • also known as independent variable
  • used to explain or predict value of response variable
2
Q

Bivariate Data

define response variable

A
  • also called dependent variable
  • changes in response to the explanatory variable
3
Q

Bivariate Data

P(event) = ?

A

P(event) = (number of successful outcomes) / (total number of outcomes)

4
Q

Bivariate Data

how can we tell if there is an association based on percentages?

A
  • if the percentages are very different, there IS an association
  • if they are similar there is NO association
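A minimal sketch of the percentage comparison, assuming a two-way table of counts. The table, its category names, and the helper `row_percentages` are hypothetical and only illustrate the idea:

```python
# Hypothetical two-way table of counts (made-up numbers for illustration).
counts = {
    "Year 11": {"plays sport": 30, "no sport": 20},
    "Year 12": {"plays sport": 12, "no sport": 28},
}

def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())
        result[group] = {k: 100 * v / total for k, v in row.items()}
    return result

percentages = row_percentages(counts)
for group, row in percentages.items():
    print(group, {k: round(v, 1) for k, v in row.items()})
```

Here 60% of Year 11 play sport compared with 30% of Year 12. The percentages are very different, which suggests an association between year level and playing sport in this made-up data.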
5
Q

Bivariate Data

What are the 6 features of a scatterplot?

A
  • Explanatory variable: x - axis
  • response variable: y - axis
  • title, axis label (units)
  • Arrows
  • use ‘lightning bolt’ to show not starting at 0
  • use an appropriate scale
6
Q

Bivariate Data

what are the 2 types of Form (type) used to describe patterns/associations?

A
  1. linear
  2. non-linear
7
Q

Bivariate Data

what are the 2 types of direction used to describe patterns/associations?

A
  • positive
  • negative
8
Q

Bivariate Data

what are the 5 types of strength used to describe patterns/associations?

A
  • no correlation
  • weak
  • moderate
  • strong
  • perfect
9
Q

Bivariate Data

define Pearson’s correlation coefficient

A
  • does not tell if there is an association
  • instead assumes there is a linear association
  • gives a measurement of its strength and direction
10
Q

Bivariate Data

how can you tell direction and strength from correlation coefficient?

A

direction = sign (positive or negative)
strength = size of the value, ignoring the sign (the closer to 1 or -1, the stronger the association)
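As a rough illustration, the correlation coefficient can be computed by hand and its sign read off as the direction. The data (`hours`, `marks`) and the helper `pearson_r` are hypothetical:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient from two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: hours studied (explanatory) and test mark (response).
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]

r = pearson_r(hours, marks)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(round(r, 3), direction)
```

The sign of `r` gives the direction, and how close `r` is to 1 or -1 gives the strength.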

11
Q

Bivariate Data

define coefficient of determination (r squared)

A

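R^2 tells us the proportion of the variation in the response variable that is explained by the variation in the explanatory variable
- i.e. if R^2 = 0.82, then 82% of the variation in the response variable is explained by the variation in the explanatory variable. The other 18% is due to other factors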
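A small sketch of the arithmetic, assuming a linear association so that R^2 is just the square of the correlation coefficient r. The value of `r` and the helper `percent_explained` are hypothetical:

```python
def percent_explained(r):
    """Return r squared as a percentage of variation explained."""
    return 100 * r ** 2

r = -0.9  # hypothetical correlation coefficient
explained = percent_explained(r)
unexplained = 100 - explained
print(f"{explained:.0f}% explained, {unexplained:.0f}% due to other factors")
```

Squaring removes the sign, so R^2 says nothing about direction, only about how much of the variation in the response variable is accounted for.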

12
Q

Bivariate Data

define least squares regression line

A

line of best fit that makes the sum of the squared residuals as small as possible
- a residual tells us how far (vertically) a data point is from the line of best fit

13
Q

Bivariate Data

how do you know if your residual is + or -?

A
  • data points above the line of best fit have a positive residual
  • data points below the line of best fit have a negative residual
  • sum of residuals = 0 in a least squares line of best fit
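The points above can be checked with a short sketch that fits a least squares line and then computes each residual as actual minus predicted. The data and the helper `least_squares_line` are hypothetical:

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [52, 55, 61, 64, 70]

a, b = least_squares_line(xs, ys)

# residual = actual y - predicted y; positive means the point is above the line.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
for x, res in zip(xs, residuals):
    print(x, "above" if res > 0 else "below", round(res, 2))
print("sum of residuals:", round(sum(residuals), 6))
```

Points above the line give positive residuals, points below give negative ones, and the residuals add to zero up to floating-point rounding.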
14
Q

Bivariate Data

what are the assumptions of using a LSRL?

A
  • numerical data
  • linear association
  • No clear outliers
15
Q

Bivariate Data

what is the equation of LSRL?

A

refer to photo
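
The photo is not reproduced here. For reference, the standard textbook form of the least squares regression line is given below. The symbols are an assumption and may differ from those in the photo: $a$ is the y-intercept, $b$ is the slope, $r$ is the correlation coefficient, and $s_x$, $s_y$ are the standard deviations of the explanatory and response variables.

```latex
\hat{y} = a + bx, \qquad b = r\,\frac{s_y}{s_x}, \qquad a = \bar{y} - b\,\bar{x}
```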

16
Q

Bivariate Data

how do you find LSRL using calculator?

A

refer to photo

17
Q

Bivariate Data

what is the formula for calculating residual values?

A
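  • residual = actual value - predicted value (predicted value comes from the LSRL)
  • a residual plot graphs these residuals, so it is based on the same LSRL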
18
Q

Bivariate Data

how do you know if residual plots are linear or non-linear?

A
  • roughly even, random scatter of points above and below the zero line (residual = 0) = linear
  • if there is some sort of pattern (e.g. a curve) = non-linear
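A rough sketch of what a pattern in the residuals looks like, assuming a straight line is fitted to data that is actually curved. The data and the helper `least_squares_line` are hypothetical:

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up curved data: y = x squared.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Record whether each point sits above (+) or below (-) the fitted line.
signs = ["+" if res > 0 else "-" for res in residuals]
print(signs)
```

The signs run positive, then negative, then positive again instead of being randomly mixed. That curved pattern in the residuals suggests a linear model is not appropriate for this data.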
19
Q

Bivariate Data

Recall and explain three reasons why causation may not be present?

A
  1. common response
    - when 2 variables are associated because they are both strongly associated with a common third variable
  2. confounding variables
    - when there are at least two possible causal explanations for the observed association, but we have no way of knowing their separate effects. The effects of the two possible explanatory variables are said to be confounded because there is no way of knowing which is the actual cause of the association
  3. coincidence
    - when it is impossible to identify any feasible confounding variable to explain a particular association
    - i.e. happens by chance
20
Q

how do we describe trends in time series plots?

A

Ignore fluctuations and describe the overall trend of the plot
- positive (upward)
- negative (downward)
- constant
can have multiple trends in the one plot

21
Q

features of cycles

A

repeated patterns
usually over a period longer than a year

22
Q

describe seasonal fluctuations

A
  • seasonal factors: time of day, day of week, month of year, quarter of year, season of year (winter etc.)
    - quarter = Jan-Mar, Apr-Jun, Jul-Sep, Oct-Dec
  • peaks and troughs consistently occur after the same time interval; e.g. ice-cream sales peak in warmer months and drop away in cooler months
23
Q

describe outliers

A
  • one-off unanticipated events; can be difficult to recognise, especially if data is irregular or seasonal
    • including an outlier may be detrimental for forecasting (predicting)
    • possible outliers should be investigated before being ‘eliminated’ from data