Week 3 Flashcards

1
Q

β0

A

intercept

2
Q

β1

A

regression coefficient

3
Q

If you know the values for β0 and β1, then you can find….

A

a straight line that describes the linear relationship between x and y.

4
Q

β0 is the value of y when…

A

x equals 0.

5
Q

β1 is the amount of change in y…

A

when x is increased by 1 unit.
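A minimal sketch of how the two coefficients are read off a line. The values β0 = 2 and β1 = 0.5 and the helper `predict` are made up for illustration; they do not come from the course material.

```python
def predict(x, b0, b1):
    """Return the predicted y on the straight line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = 2.0, 0.5

# The intercept is the predicted y when x equals 0.
intercept_check = predict(0, b0, b1)

# The slope is the change in predicted y when x increases by 1 unit.
slope_check = predict(4, b0, b1) - predict(3, b0, b1)
```

Any two x values that are 1 unit apart give the same difference, which is why a single number β1 describes the whole line.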

6
Q

ANOVA and regression are both part of the same

A

General Linear Model (GLM).

ANOVA is a special case of regression where the IVs are categorical or ordinal.

7
Q

What can simple linear regression be used as?

A

As a descriptive technique.

It can also be used for statistical inference.

8
Q

What does using simple linear regression for statistical inference involve?

A
  • involves statistical modelling
  • involves thinking about the true population model.
  • involves hypothesis testing
  • uses the sample regression coefficient to make inferences about the population regression coefficient.
9
Q

What does simple linear regression involve the mechanics of?

A

Fitting a line to data

Minimization problem in math

10
Q

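Does simple linear regression, when used as a descriptive technique, involve statistical modelling?

A
  • No. As a descriptive technique it does not involve statistical modelling; modelling is needed only for statistical inference.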
11
Q

Predictor variable

A

x is the predictor

12
Q

Criterion variable

A

y is the criterion

13
Q

The goal of the simple linear regression is …

A

to find the best-fitting straight line that describes the relationship between x and y.

14
Q

What method is used to determine the best-fitting line?

A

Use the least squares method

15
Q

What does the least squares method involve?

A

The least squares method involves calculating the sum of squared residuals (SSresidual). Let ei represent the residual for each participant.

16
Q

Criterion for the “best fitting line”

A

The line that minimizes the SSresidual.

17
Q

residuals

A

ei = yi - ŷi

The residual is the difference between the observed yi and the predicted ŷi.

18
Q

Computing SS residual

A

Compute each residual: observed - predicted

Square the residuals

then sum them up
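The three steps above can be sketched as follows. The small data set and the helper name `ss_residual` are made up for illustration. For these data the least-squares line works out to ŷ = 2.2 + 0.6x, and a second, arbitrary line is included only to show that it gives a larger SSresidual.

```python
def ss_residual(x, y, b0, b1):
    """Sum of squared residuals for the line y-hat = b0 + b1 * x."""
    # Step 1: residual = observed - predicted, for each participant.
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Steps 2 and 3: square each residual, then sum them up.
    return sum(e ** 2 for e in residuals)

# Hypothetical scores for five participants.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

ss_least_squares = ss_residual(x, y, b0=2.2, b1=0.6)  # least-squares line for these data
ss_other_line = ss_residual(x, y, b0=2.0, b1=0.7)     # an arbitrary competing line
```

Here `ss_least_squares` is smaller than `ss_other_line`, which is what "best fitting" means under the least squares criterion.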

19
Q

rxy

A

correlation between x and y

20
Q

sx, sy

A

standard deviation for x, y

21
Q

sxy

A

covariance of x and y

22
Q

In the least squares method, we minimize the sum of the squared vertical distances between the observed and predicted values to find the __________________

A

“best fitting line”.

23
Q

There are other criteria for finding the best fitting lines.

A

  • minimize the sum of the squared horizontal distances.
  • minimize the sum of the squared perpendicular distances.
  • minimize the sum of the absolute vertical distances.

24
Q

Based on the equation for β0, what is the predicted value of the criterion variable y when the predictor variable x is at its mean? In other words, for the regression equation

ŷ = β0 + β1x,

what is ŷ when x = x̄, given that β0 = ȳ - β1x̄?

A

ŷ = β0 + β1x̄ = (ȳ - β1x̄) + β1x̄ = ȳ.

This shows that the regression line always passes through the point (x̄, ȳ).

25
Q

For the simple linear regression, based on the equation for β1, do you think the regression coefficient and correlation coefficient will always have the same sign? (Hint: think about the range of possible values for sy and sx).

A

β1 = rxy(sy/sx)

Recall that the standard deviation is always positive (for a variable that is not constant). Therefore, sy/sx is always positive. This means the regression coefficient and the correlation coefficient will have the same sign in simple linear regression.

Positive β1: direct relationship between x and y and positive rxy.

Negative β1: inverse relationship between x and y and negative rxy
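A minimal sketch of the formulas β1 = rxy(sy/sx) and β0 = ȳ - β1x̄, using only the Python standard library. The data and the function name `fit_line` are made up for illustration.

```python
from statistics import mean, stdev

def fit_line(x, y):
    """Return (b0, b1, r_xy) using b1 = r_xy * (s_y / s_x) and b0 = y_bar - b1 * x_bar."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations, always positive here
    # Sample covariance s_xy, then the correlation r_xy = s_xy / (s_x * s_y).
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    r_xy = s_xy / (s_x * s_y)
    b1 = r_xy * (s_y / s_x)
    b0 = y_bar - b1 * x_bar
    return b0, b1, r_xy

# Hypothetical scores for five participants.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, r_xy = fit_line(x, y)

# Because s_y / s_x is positive, b1 and r_xy share the same sign.
same_sign = (b1 > 0) == (r_xy > 0)
```

Swapping in data with a downward trend would make both `b1` and `r_xy` negative, so `same_sign` would still hold.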

26
Q

Simple Linear Regression with Standardized Variables

A

We standardize x and y and denote the standardized variables as zx and zy, respectively, so that

z̄x = 0, szx = 1, z̄y = 0, szy = 1.

When we conduct the simple linear regression analysis with zx and zy, the regression coefficient becomes

β1 = rxy(szy/szx) = rxy

27
Q

Therefore, for the simple linear regression, the regression coefficient β1 for the standardized variables is the…

A

correlation coefficient rxy.

28
Q

If we use the standardized variables zx and zy to conduct the regression analysis, what will the intercept be?

A

β0 = 0

for simple linear regression, when we have standardized variables, the regression line always passes through the origin and its slope (regression coefficient) is the correlation coefficient.
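A quick numerical check of this result, as a sketch with made-up data. The helper names `standardize` and `least_squares` are hypothetical, and the slope is computed with the usual least-squares formula Σ(x - x̄)(y - ȳ) / Σ(x - x̄)².

```python
from statistics import mean, stdev

def standardize(values):
    """Convert raw scores to z-scores (mean 0, standard deviation 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def least_squares(x, y):
    """Return (b0, b1) for the least-squares line through the points (x, y)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical scores for five participants.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
zx, zy = standardize(x), standardize(y)

# Correlation computed from z-scores: r_xy = sum(zx * zy) / (n - 1).
r_xy = sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

b0_z, b1_z = least_squares(zx, zy)  # intercept near 0, slope equal to r_xy
```

Up to floating-point rounding, `b0_z` is 0 and `b1_z` matches `r_xy`.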

29
Q

If we want to do statistical inference (e.g., test hypotheses), we need…

A

statistical modelling

involves making assumptions about the true population model.

30
Q

The population regression model is the regression equation found via the least-squares method using

A

population data.

31
Q

The sample regression model is the regression equation found via the least-squares method using

A

sample data.

32
Q

xi

A

score on the predictor for ith participant

considered a constant across repeated studies in the classical regression analysis.

For experimental designs, the predictor is fixed by the experimenter.

33
Q

µyi|xi:

A

the predicted score on the criterion variable for the ith participant using the population regression model.

also the long-run average of the observed criterion variable yi conditioning on the value of xi across repeated studies.

34
Q

In the population regression model, the difference between the observed score and the predicted score on the criterion variable is called the:

A

error term ϵi.

35
Q

yi:

A

the score on the criterion variable for the ith participant.

36
Q

An important assumption of the error term ϵi

A

ϵi ~ N(0, σ²)

37
Q

Population Model – Assumption for the Error Term

Based on probability theory, this assumption implies that…

A

yi ~ N(β0 + β1xi, σ²)

38
Q

Population Model – Assumptions for the Error Term

A
  1. Normality: ϵi and yi are normally distributed.
  2. Linearity: Because the mean of ϵi is 0, the predicted score in the population is a linear function of the predictor: µyi|xi = β0 + β1xi.
  3. Constant Variance (a.k.a. homoscedasticity): Var(ϵi) = σ² is constant across participants.
  4. Independence: ϵi is not related to ϵj when i and j represent different participants.
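One way to picture these assumptions is to simulate the population model for a single fixed predictor value. This is only a sketch: the population values β0 = 1, β1 = 0.5, σ = 2, the seed, and the helper `draw_y` are all made up for illustration.

```python
import random

# Hypothetical population values, chosen only for illustration.
BETA0, BETA1, SIGMA = 1.0, 0.5, 2.0

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def draw_y(x_i):
    """Draw one observed score: y_i = beta0 + beta1 * x_i + error, error ~ N(0, sigma^2)."""
    return BETA0 + BETA1 * x_i + rng.gauss(0, SIGMA)

# Repeat the "study" many times with the predictor held fixed at x_i = 4.
x_i = 4
draws = [draw_y(x_i) for _ in range(20000)]

long_run_mean = sum(draws) / len(draws)  # long-run average of y_i given x_i
conditional_mean = BETA0 + BETA1 * x_i   # the population prediction for x_i
```

With this many draws, `long_run_mean` lands close to `conditional_mean`, which illustrates the linearity assumption. Each draw is independent and uses the same σ, which mirrors the independence and constant-variance assumptions.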
39
Q

β̂0

A

estimate of the population β0.

Note: the hat accent means the “sample estimate” of a certain parameter.

40
Q

β̂1

A

estimate of the population β1.

41
Q

ŷi

A

predicted value of the criterion variable for the ith participant based on the sample regression model.

42
Q

residual

A

In the sample regression model, the difference between the observed score and the predicted score on the criterion variable

43
Q

yi

A

the score on the criterion variable for the ith participant.

44
Q

ŷi = β̂0 + β̂1xi

A

the predicted score on the criterion variable for the ith participant using the sample regression model.

45
Q

For the sample regression model, the sample regression line is obtained via the least-squares method by

A

minimizing the sum of squared residuals: SSresidual = Σ ei² = Σ (yi - ŷi)².

46
Q

For simple linear regression (or multiple regression in general), you can conduct a hypothesis test….

A

  • for the intercept
  • for each of the regression coefficients
  • for the overall regression model taking into account all predictors.

47
Q

For simple linear regression, since we only have one predictor x1, the hypothesis test for the overall regression model is equivalent to

A

the hypothesis test for the regression coefficient β1 for x1.

48
Q

A significant result on a simple linear regression

A

indicates the observed criterion variable y can be significantly predicted or explained by the predictor x.

49
Q

To conduct a hypothesis test regarding the population regression coefficient, we need to figure out the…

A

sampling distribution of the sample regression coefficient.

That is, the distribution of the sample regression coefficient over repeated studies.

50
Q

µβ̂:

A

mean of sample regression coefficients over repeated samples.

51
Q

β1

A

population regression coefficient

52
Q

σ²β̂

A

variance of sample regression coefficients over repeated samples.

53
Q

σ²

A

population error variance

54
Q

s²x

A

variance of the predictor x

55
Q

Replacing σ with its sample estimate results in a t-statistic that follows the

A

t-distribution.

56
Q

σ̂

A

the sample estimate of the population error standard deviation σ (so σ̂² estimates the population error variance σ²).

57
Q

Assuming H0 : β1 = 0 is true, over repeated studies, the sampling distribution of the t-statistic is

A

t ~ t(n - 2), a t-distribution with n - 2 degrees of freedom.

58
Q

With the standard error formula, we can also find the 95% CI for the regression coefficient:

A

95% CI = β̂1 ± tcrit × SE(β̂1)
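A sketch of how the t-statistic and the 95% CI could be computed by hand. It assumes the standard textbook formulas σ̂² = SSresidual / (n - 2) and SE(β̂1) = σ̂ / √((n - 1)s²x), which the cards do not spell out. The data and the function name `slope_inference` are made up, and the critical value 3.182 is the two-tailed 5% cutoff for a t-distribution with 3 degrees of freedom, taken from a t-table.

```python
from math import sqrt

def slope_inference(x, y, t_crit):
    """Return (b1, se_b1, t_stat, ci) for the slope of a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)  # equals (n - 1) * s2x
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    # Estimate the error standard deviation from the residuals (n - 2 degrees of freedom).
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = sqrt(ss_res / (n - 2))
    se_b1 = sigma_hat / sqrt(sxx)
    t_stat = b1 / se_b1  # test statistic under H0: beta1 = 0
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return b1, se_b1, t_stat, ci

# Hypothetical scores for five participants, so df = n - 2 = 3.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_CRIT_DF3 = 3.182  # two-tailed 5% critical value for t(3), from a t-table

b1, se_b1, t_stat, ci = slope_inference(x, y, T_CRIT_DF3)
```

For these data the absolute t-statistic is below the critical value and the interval contains 0; the two statements are equivalent ways of saying the slope is not significant at the 5% level.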

59
Q

Interpret non-significant p-value

A

A p-value of 0.0702 means that, assuming H0 : β1 = 0 is true, the probability of obtaining a sample t-statistic at least as extreme as the one we obtained (i.e., t = 2.088) is 0.0702.

60
Q

Correct interpretation of confidence interval

A

Over repeated studies, 95% of the CIs contain the population regression coefficient β1.
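The repeated-studies interpretation can be illustrated with a small simulation. This is a sketch under made-up assumptions: population values β0 = 1, β1 = 0.5, σ = 1, ten fixed predictor values, 2000 simulated studies, and the critical value 2.306 for t(8) taken from a t-table. The helper `slope_ci` is hypothetical.

```python
import random
from math import sqrt

def slope_ci(x, y, t_crit):
    """Return the (lower, upper) confidence limits for the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = sqrt(ss_res / (n - 2) / sxx)
    return b1 - t_crit * se_b1, b1 + t_crit * se_b1

# Hypothetical population model and design.
BETA0, BETA1, SIGMA = 1.0, 0.5, 1.0
x = list(range(1, 11))  # predictor values held fixed across studies (n = 10)
T_CRIT_DF8 = 2.306      # two-tailed 5% critical value for t(8), from a t-table

rng = random.Random(42)  # fixed seed so the sketch is reproducible
n_studies = 2000
hits = 0
for _ in range(n_studies):
    # Each study draws new errors, so each study yields a different interval.
    y = [BETA0 + BETA1 * xi + rng.gauss(0, SIGMA) for xi in x]
    lower, upper = slope_ci(x, y, T_CRIT_DF8)
    hits += lower <= BETA1 <= upper

coverage = hits / n_studies  # proportion of intervals that contain the true slope
```

The value of `coverage` should come out near 0.95. Any single interval either contains β1 or it does not; the 95% describes the procedure across studies.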