Quiz 2 Flashcards
As X increases by one unit, Y increases by?
The slope, B1
Simple linear regression model in words
response = predictor + error
What is a signal
Predictor
What is noise
Error
Formal statistical model:
response = intercept + slope(p) + error (where p = the predictor variable)
Describe the linear model when pages is the response variable and words is the predictor
pages = words + error, or more formally: pages = B0 + B1(words) + error
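A minimal sketch of fitting this model in Python, assuming the statsmodels library; the page and word counts below are made up for illustration:
```python
# Minimal sketch: fitting pages = B0 + B1(words) + error by ordinary
# least squares. The data below are hypothetical.
import numpy as np
import statsmodels.api as sm

words = np.array([50000, 80000, 120000, 60000, 95000])  # predictor X
pages = np.array([180, 290, 450, 220, 340])             # response Y

X = sm.add_constant(words)      # adds the intercept column (B0)
fit = sm.OLS(pages, X).fit()    # OLS estimates B0 and B1
print(fit.params)               # [B0, B1]: intercept and slope
```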
Multiple linear regression model
response = predictor 1 + predictor 2 + error
Simple linear regression is:
Linear regression with one continuous response variable Y and ONE continuous predictor variable X
Multiple linear regression is:
Linear regression with one continuous response variable Y, and MORE THAN ONE continuous predictor.
What are the basic assumptions of linear regression?
A linear relationship, with normally distributed residuals that have homogeneous variances
How does B1 quantify different things in simple versus multiple regression?
In multiple regression, the effect of X1 on Y controls for the effect of X2: B1 isolates the influence of X1 independent of X2, because it is estimated holding X2 constant.
This does not allow X2 to interfere when assessing the effect of X1.
Explain what B1 is in a multiple regression model
For every one-unit increase in X1 (the predictor), Y (the response) increases by b1, holding X2 constant.
Main difference between b1 in linear regression and multiple regression
b1 in simple linear regression is the regression slope, while in multiple regression b1 and b2 are partial regression slopes
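A short sketch of the difference, assuming statsmodels and made-up data: when X1 and X2 are correlated, the simple slope of X1 absorbs part of X2's effect, while the partial slope holds X2 constant.
```python
# Minimal sketch: simple slope vs partial regression slope (hypothetical data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
x1 = rng.normal(size=300)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=300)   # x2 correlated with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=300)
df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})

simple = smf.ols("y ~ x1", data=df).fit()          # slope absorbs x2's effect
multiple = smf.ols("y ~ x1 + x2", data=df).fit()   # b1 holds x2 constant
print(simple.params["x1"], multiple.params["x1"])  # roughly 4.4 vs 2.0
```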
What is the second complication in multiple linear regression?
Multiple predictors can interact in their effect on the response variable.
What is the regression model for interaction? Multiplicative model
response = B0 + B1X1 + B2X2 + B3(X1 × X2) + error
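A hedged sketch of fitting the multiplicative model, assuming statsmodels' formula interface and simulated data; 'x1 * x2' expands to x1 + x2 plus the x1:x2 interaction term:
```python
# Minimal sketch: response = B0 + B1X1 + B2X2 + B3(X1 × X2) + error.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 2 + 1.5 * df.x1 - 0.5 * df.x2 + 0.8 * df.x1 * df.x2 + rng.normal(size=100)

fit = smf.ols("y ~ x1 * x2", data=df).fit()  # x1 * x2 -> x1 + x2 + x1:x2
print(fit.params)  # Intercept (B0), x1 (B1), x2 (B2), x1:x2 (B3)
```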
What is the third complication in multiple regression models?
Predictor variables can themselves be correlated
What are the assumptions of multiple regression models?
- Linear relationship between the predictors and the response variable
- Equal variance of residuals around the regression line
- Normally distributed residuals
- Predictors should not be strongly correlated (i.e. no collinearity)
How do you detect collinearity?
- Think about which predictor variables are likely to be collinear before building model
- Plot predictor variables against each other
- Calculate the TOLERANCE associated with each predictor.
Tolerance
Tolerance = 1 - R² from regressing that predictor on all the other predictors.
Lower tolerance is bad.
Tolerance < 0.1 is really bad
VIF
Variance inflation factor
VIF = 1/tolerance
Higher VIF is bad
VIF >10 is really bad.
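A minimal sketch of both diagnostics in Python, assuming statsmodels (which provides variance_inflation_factor) and deliberately collinear simulated predictors:
```python
# Minimal sketch: tolerance and VIF per predictor. Tolerance = 1 - R²
# from regressing one predictor on the others; VIF = 1 / tolerance.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + rng.normal(scale=0.5, size=200)  # deliberately collinear with x1
x3 = rng.normal(size=200)                        # independent of the others

X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))
for i, name in enumerate(X.columns):
    if name == "const":
        continue  # skip the intercept column
    vif = variance_inflation_factor(X.values, i)
    print(f"{name}: VIF = {vif:.2f}, tolerance = {1 / vif:.2f}")  # VIF > 10 is really bad
```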
Method 1 of writing multiple linear regression
Method 2 of writing multiple linear regression
Types of linear models
Simple
Y = B0 + B1X1 + error
Multiple linear
Y = B0 + B1X1 + B2X2 + error
(More than one continuous predictor variable)
ANOVA model
Y = B0 + B1X1a + B2X1b + error
(One or more categorical predictor variables that have more than one level [eg. a and b])
ANCOVA model
Y = B0 + B1X1a + B2X1b + B3X2 + error
(One or more categorical predictor variables that have more than one level AND one or more continuous predictor variables)
Linear statistical model with one categorical predictor variable
Yij = u + B1xaij + B2xbij + B3xcij + errorij
Where:
j represents a single observation (from a single organism) and i represents the level of the predictor
u = Mean of all observations across all levels of all factors
B1 = difference between the mean of ‘a’ (level) and ‘u’ (mean)
B2 = difference between the mean of ‘b’ (level) and ‘u’ (mean)
B3 = difference between the mean of ‘c’ (level) and ‘u’ (mean)
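A hedged sketch of this model in Python, assuming statsmodels with sum-to-zero (effects) coding so the intercept estimates u and each coefficient estimates a level's deviation from it; the data and level names are made up:
```python
# Minimal sketch: one categorical predictor with levels a, b, c.
# Sum (effects) coding: intercept ~ u (grand mean), coefficients ~ B1, B2
# (the deviation for the last level is minus their sum). Hypothetical data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "level": np.repeat(["a", "b", "c"], 20),
    "y": np.concatenate([rng.normal(10, 2, 20),
                         rng.normal(12, 2, 20),
                         rng.normal(9, 2, 20)]),
})

fit = smf.ols("y ~ C(level, Sum)", data=df).fit()
print(fit.params)  # Intercept ~ u; C(level, Sum)[S.a] ~ B1; [S.b] ~ B2
```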
What is the sum of all the squared deviations (SS) in analysis of variance (ANOVA)?
SS = Σij (Yij - Ȳj)²
Where Yij = one observation i within group j
Ȳj = mean of all observations in group j
DFresidual = ?
n-k
where n = total number of observations (true replicates)
k = number of levels within the predictor variable (number of groups or factor levels)
MS =?
MS = SS / df
The average squared deviation of the data from the group means
Write out an ANOVA table
Source of variation | SS | df | MS | F-value
Groups | SSgroups | k - 1 | SSgroups / (k - 1) | MSgroups / MSresidual (signal/noise)
Residuals | SSresidual | n - k | SSresidual / (n - k) |
Total | SStotal | n - 1 | |
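A minimal sketch that builds this table by hand and cross-checks it against statsmodels' anova_lm; the three groups are simulated:
```python
# Minimal sketch: SS, df, MS, and F for a one-way ANOVA (hypothetical data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(3)
groups = {"a": rng.normal(10, 2, 15),
          "b": rng.normal(13, 2, 15),
          "c": rng.normal(11, 2, 15)}
y = np.concatenate(list(groups.values()))
n, k = y.size, len(groups)
grand_mean = y.mean()

# SSgroups: squared deviations of the group means from the grand mean
ss_groups = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups.values())
# SSresidual: squared deviations of observations from their group mean
ss_resid = sum(((g - g.mean()) ** 2).sum() for g in groups.values())

ms_groups = ss_groups / (k - 1)     # MS = SS / df
ms_resid = ss_resid / (n - k)       # DFresidual = n - k
print("F =", ms_groups / ms_resid)  # signal / noise

# Cross-check against statsmodels
df = pd.DataFrame({"y": y, "g": np.repeat(list(groups), 15)})
print(anova_lm(smf.ols("y ~ g", data=df).fit()))
```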
What is an observational study
Cannot isolate causal drivers from the effects of potentially confounding variables.
(A confounder is a variable that influences both the dependent variable and the independent variable)
What is an experimental study
Can potentially isolate causal drivers from the effects of confounding variables.
Lurking variables examples
Lurking variables can make experiments useless and their influence can only be neutralized with good experimental design.
Z -> X -> Y (Z can influence Y through X)
Z -> Y (Z can directly influence Y)
The key strength of experiments is that they allow us to explicitly ISOLATE the effect of X on Y without the lurking variable Z interfering with the experiment.
Ways to neutralize lurking variables
- Replication
- Randomized design
- Blocking
What are the benefits of replication?
Any apparent causal relationship between variables may actually be caused by lurking variables we are unaware of.
This is also called the sampling effect; small samples are more vulnerable, so we increase replication to minimize the influence of lurking variables.
It is essential to replicate the correct thing and avoid pseudoreplication, as pseudoreplication inflates the F-ratio, which decreases the p-value, which increases the chance we incorrectly reject the null hypothesis.
What are the benefits of randomization?
Reducing bias: Randomization helps to reduce the impact of selection bias and confounding variables, which can affect the validity and generalizability of study results.
Improving statistical power: Randomization helps to increase the statistical power of a study, which refers to the ability of a study to detect a true effect if it exists.
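A minimal sketch of a completely randomized design in Python; the unit IDs and treatment names are hypothetical:
```python
# Minimal sketch: randomly assigning experimental units to treatments.
import numpy as np

rng = np.random.default_rng(42)
units = np.arange(24)                                  # 24 experimental units
treatments = np.repeat(["control", "low", "high"], 8)  # balanced design
rng.shuffle(treatments)                                # randomize the assignment

for unit, trt in zip(units, treatments):
    print(f"unit {unit:2d} -> {trt}")
```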
What are the benefits of blocking?