Quiz 2 Flashcards

1
Q

As X increases by one unit, Y increases by?

A

The slope, B1

2
Q

Simple linear regression model in words

A

response = predictor + error

3
Q

What is a signal

A

Predictor

4
Q

What is noise

A

Error

5
Q

Formal statistical model:

A

response = intercept + slope(p) + error (where p = predictor variable)
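
A minimal Python sketch of fitting this model, using simulated data (all names and numbers here are invented for illustration):

  import numpy as np

  # Hypothetical data generated from Y = intercept + slope*p + error
  rng = np.random.default_rng(1)
  p = rng.uniform(0, 10, 50)                  # predictor variable
  y = 2.0 + 0.5 * p + rng.normal(0, 1, 50)    # response with noise

  # np.polyfit returns coefficients highest power first: [slope, intercept]
  slope, intercept = np.polyfit(p, y, deg=1)
  print(f"intercept ~ {intercept:.2f}, slope ~ {slope:.2f}")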

6
Q

Describe the linear model when pages is the response variable and words is the predictor

A

pages = words + error, or more formally: pages = intercept + slope(words) + error

7
Q

Multiple linear regression model

A

response = predictor 1 + predictor 2 + error
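
A sketch of fitting this model by ordinary least squares in Python (simulated data, numpy only; all values are made up):

  import numpy as np

  rng = np.random.default_rng(2)
  n = 100
  x1 = rng.normal(size=n)                      # predictor 1
  x2 = rng.normal(size=n)                      # predictor 2
  y = 1.0 + 0.8 * x1 - 0.3 * x2 + rng.normal(0, 0.5, n)

  # Design matrix with a leading column of ones for the intercept
  X = np.column_stack([np.ones(n), x1, x2])
  coef, *_ = np.linalg.lstsq(X, y, rcond=None)
  print("B0, B1, B2 =", np.round(coef, 2))     # ~ [1.0, 0.8, -0.3]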

8
Q

Simple linear regression is:

A

Linear regression with one continuous response variable Y and ONE continuous predictor variable X

9
Q

Multiple linear regression is:

A

Linear regression with one continuous response variable Y, and MORE THAN ONE continuous predictor.

10
Q

What are the basic assumptions of linear regression

A

Linearity, normally distributed residuals, and homogeneous variances

11
Q

How does B1 quantify different things in simple vs. multiple regression?

A

The effect of X1 on Y controls for the effect of X2: it isolates the influence of X1 independent of X2 by estimating B1 while holding X2 constant.
This does not allow X2 to interfere when assessing the effect of X1.
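
One way to see this is to simulate correlated predictors and compare the simple slope with the partial slope; a sketch with invented data (numpy only):

  import numpy as np

  rng = np.random.default_rng(3)
  n = 500
  x1 = rng.normal(size=n)
  x2 = 0.9 * x1 + 0.3 * rng.normal(size=n)        # X2 correlated with X1
  y = 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)    # true effects: 1.0 and 2.0

  # Simple regression: b1 absorbs part of X2's effect on Y
  b1_simple = np.polyfit(x1, y, 1)[0]

  # Multiple regression: b1 is a partial slope, holding X2 constant
  X = np.column_stack([np.ones(n), x1, x2])
  b1_partial = np.linalg.lstsq(X, y, rcond=None)[0][1]

  print(f"simple b1 ~ {b1_simple:.2f}")    # inflated, ~2.8
  print(f"partial b1 ~ {b1_partial:.2f}")  # close to the true 1.0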

12
Q

Explain what B1 is in a multiple regression model

A

For every additional unit of X1 (the predictor), Y (the response) increases by B1, holding X2 constant.

13
Q

Main difference between b1 in simple linear regression and multiple regression

A

In simple linear regression, b1 is the regression slope, while in multiple regression b1 and b2 are partial regression slopes.

14
Q

What is the 2nd complication in multiple linear regression?

A

Multiple predictors can interact in their effect on the response variable.

15
Q

What is the regression model for interaction (the multiplicative model)?

A

response = B0 + B1X1 + B2X2 + B3(X1 × X2) + error
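
A minimal sketch of fitting the multiplicative model: the interaction enters as an extra design-matrix column holding the product X1*X2 (simulated data, numpy only):

  import numpy as np

  rng = np.random.default_rng(4)
  n = 200
  x1 = rng.normal(size=n)
  x2 = rng.normal(size=n)
  # True model includes a multiplicative interaction term B3*(X1*X2)
  y = 0.5 + 1.0 * x1 + 2.0 * x2 + 1.5 * (x1 * x2) + rng.normal(size=n)

  X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
  b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]
  print(f"B3 (interaction) ~ {b3:.2f}")   # should land near 1.5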

16
Q

What is the third complication in multiple regression models?

A

Predictor variables can themselves be correlated.

17
Q

What are the assumptions of multiple regression models?

A
  1. Linear relationship between predictor and response variables
  2. Equal variance of residuals around the regression line
  3. Normally distributed residuals
  4. Predictors should not be strongly correlated (i.e. no collinearity)
18
Q

How do you detect collinearity?

A
  1. Think about which predictor variables are likely to be collinear before building the model
  2. Plot predictor variables against each other
  3. Calculate the TOLERANCE associated with each predictor.
19
Q

Tolerance

A

Tolerance = 1 - R², where R² comes from regressing that predictor on all the other predictors.
Lower tolerance is bad.
Tolerance < 0.1 is really bad.

20
Q

VIF

A

Variance inflation factor
VIF = 1/tolerance
Higher VIF is bad
VIF >10 is really bad.
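
A sketch of computing tolerance and VIF by hand, following the definitions above (numpy only; the data are simulated so that one predictor is nearly collinear):

  import numpy as np

  def tolerance_and_vif(X):
      """Regress each predictor on the rest; tolerance = 1 - R^2, VIF = 1/tolerance."""
      n, p = X.shape
      out = []
      for j in range(p):
          others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
          fitted = others @ np.linalg.lstsq(others, X[:, j], rcond=None)[0]
          ss_res = np.sum((X[:, j] - fitted) ** 2)
          ss_tot = np.sum((X[:, j] - X[:, j].mean()) ** 2)
          tol = 1 - (1 - ss_res / ss_tot)   # tolerance = 1 - R^2
          out.append((tol, 1 / tol))
      return out

  rng = np.random.default_rng(5)
  x1 = rng.normal(size=100)
  x2 = x1 + 0.1 * rng.normal(size=100)    # nearly collinear with x1: expect VIF >> 10
  x3 = rng.normal(size=100)
  for tol, vif in tolerance_and_vif(np.column_stack([x1, x2, x3])):
      print(f"tolerance = {tol:.3f}, VIF = {vif:.1f}")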

21
Q

Method 1 of writing multiple linear regression

A
22
Q

Method 2 of writing multiple linear regression

A
23
Q

Types of linear models

A

Simple
Y = B0 + B1X1 + error
(One continuous predictor variable)
Multiple linear
Y = B0 + B1X1 + B2X2 + error
(More than one continuous predictor variable)
ANOVA model
Y = B0 + B1X1a + B2X1b + error
(One or more categorical predictor variables that have more than one level [e.g. a and b])
ANCOVA model
Y = B0 + B1X1a + B2X1b + B3X2 + error
(One or more categorical predictor variables that have more than one level AND one or more continuous predictor variables)
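
If statsmodels is available, all four model types can be written with one formula interface; a hedged sketch with made-up data (the column names are hypothetical):

  import numpy as np
  import pandas as pd
  import statsmodels.formula.api as smf

  rng = np.random.default_rng(6)
  n = 90
  df = pd.DataFrame({
      "x1": rng.normal(size=n),
      "x2": rng.normal(size=n),
      "group": rng.choice(["a", "b", "c"], size=n),   # categorical predictor
  })
  df["y"] = df.x1 + df.x2 + (df.group == "b") * 1.5 + rng.normal(size=n)

  simple   = smf.ols("y ~ x1", data=df).fit()             # simple linear
  multiple = smf.ols("y ~ x1 + x2", data=df).fit()        # multiple linear
  anova    = smf.ols("y ~ C(group)", data=df).fit()       # ANOVA model
  ancova   = smf.ols("y ~ C(group) + x2", data=df).fit()  # ANCOVA model
  print(ancova.params)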

24
Q

Linear statistical model with one categorical predictor variable

A

Yij = u + B1Xaij + B2Xbij + B3Xcij + errorij
Where:
j represents a single observation from a single organism and i represents the level of the predictor
u = mean of all observations across all levels of all factors
B1 = difference between the mean of level 'a' and u
B2 = difference between the mean of level 'b' and u
B3 = difference between the mean of level 'c' and u
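
A small worked sketch of this parameterization: compute the grand mean u, then each level's B as its group mean minus u (the numbers are invented):

  import numpy as np

  # Hypothetical observations for three levels a, b, c of one factor
  groups = {"a": np.array([4.0, 5.0, 6.0]),
            "b": np.array([7.0, 8.0, 9.0]),
            "c": np.array([1.0, 2.0, 3.0])}

  u = np.concatenate(list(groups.values())).mean()  # grand mean of all observations
  for level, obs in groups.items():
      effect = obs.mean() - u                       # B = level mean minus grand mean
      print(f"level {level}: mean = {obs.mean():.1f}, effect = {effect:+.1f}")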

25
Q

What is the sum of all the squared deviations (SS) in analysis of variance (ANOVA)?

A

Σij (Yij - Ȳj)²
Where Yij = one observation i within group j
Ȳj = mean of all observations in group j

26
Q

DFresidual = ?

A

n-k
where n = total number of observations (true replicates)
k = number of levels within the predictor variable (number of groups or factor levels)

27
Q

MS =?

A

MS = SS / df
Average squared deviation of the data from the group means

28
Q

Write an anova table

A

Source of var | SS | df | MS | F-value
groups | SSgroups | k - 1 | SS/df | MSgroups / MSresidual (signal/noise)
residuals | SSresidual | n - k | SS/df |
total | SStotal | n - 1 | |
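
A sketch of filling in that table by hand for a one-way ANOVA (invented data with k = 3 groups and n = 9 observations):

  import numpy as np

  groups = [np.array([4.0, 5.0, 6.0]),
            np.array([7.0, 8.0, 9.0]),
            np.array([1.0, 2.0, 3.0])]
  allobs = np.concatenate(groups)
  n, k = len(allobs), len(groups)
  grand = allobs.mean()

  # SSgroups: group means' squared deviations from the grand mean (weighted by group size)
  ss_groups = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
  # SSresidual: observations' squared deviations from their own group mean
  ss_resid = sum(((g - g.mean()) ** 2).sum() for g in groups)

  df_groups, df_resid = k - 1, n - k
  ms_groups, ms_resid = ss_groups / df_groups, ss_resid / df_resid
  F = ms_groups / ms_resid   # signal / noise
  print(f"groups:    SS={ss_groups:.0f} df={df_groups} MS={ms_groups:.1f} F={F:.1f}")
  print(f"residuals: SS={ss_resid:.0f} df={df_resid} MS={ms_resid:.1f}")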

29
Q

What is an observational study

A

Cannot isolate causal drivers from the effects of potentially confounding variables.
(A confounder is a variable that influences both the dependent variable and the independent variable.)

30
Q

What is an experimental study

A

Can potentially isolate causal drivers from the effects of confounding variables.

31
Q

Lurking variables examples

A

Lurking variables can make experiments useless, and their influence can only be neutralized with good experimental design.
Z → X → Y (Z can influence Y through X)
Z → Y (Z can directly influence Y)
The key strength of experiments is that they allow us to explicitly ISOLATE the effect of X on Y without the lurking variable Z interfering with the experiment.

32
Q

Ways to neutralize lurking variables

A
  1. Replication
  2. Randomized design
  3. Blocking
33
Q

What are the benefits of replication?

A

Any causal relationship between variables may actually be driven by lurking variables we are unaware of.
This is also called the sampling effect; small samples are more vulnerable, so we increase replication to minimize interference from lurking variables.
It is essential to replicate the correct thing and avoid pseudoreplication, as pseudoreplication inflates the F-ratio, which decreases the p-value, which increases the chance we incorrectly reject the null hypothesis.

34
Q

What are the benefits of randomization?

A

Reducing bias: Randomization helps to reduce the impact of selection bias and confounding variables, which can affect the validity and generalizability of study results.
Improving statistical power: Randomization helps to increase the statistical power of a study, which refers to the ability of a study to detect a true effect if it exists.

35
Q

What are the benefits of blocking?

A