Lecture 5 Flashcards

1
Q

is it possible that your population mean is not within your sample confidence interval?

A

yes.

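a 95% confidence interval is built so that about 95% of the intervals from repeated samples capture the population mean, so any single interval can miss it. in general, you would be highly unlikely to know the true population mean, so you cannot tell whether your interval is one of the misses.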
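
A small simulation can make this concrete. The sketch below is illustrative only: the population values (mean 100, sd 15), the sample size of 25 and the function name `simulate_coverage` are assumptions, not values from the lecture. It draws many samples, builds a t-based 95% interval from each, and counts how often the interval contains the known population mean.

```python
import random
import statistics

N = 25          # assumed sample size
T_CRIT = 2.064  # two-sided 95% critical value of t with N - 1 = 24 df (only valid for N = 25)

def simulate_coverage(n_samples=2000, mu=100.0, sigma=15.0, seed=1):
    """Return the proportion of 95% intervals that contain the population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(N)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / N ** 0.5  # estimated standard error
        lower, upper = mean - T_CRIT * se, mean + T_CRIT * se
        if lower <= mu <= upper:
            hits += 1
    return hits / n_samples

coverage = simulate_coverage()
print(f"proportion of intervals containing the population mean: {coverage:.3f}")
```

The proportion should land close to 0.95 but below 1, which is the point of the card: some intervals do not contain the population mean.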

2
Q

is the width of the confidence intervals of all samples in a population the same everywhere?

A

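the width of the interval may not be the same for all samples from a population, because the population standard deviation is usually unknown and has to be estimated from each sample, and that estimate differs from sample to sample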
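
As a sketch, assuming the interval in question is the usual t-based interval for a mean (the symbols below are notation chosen here, not taken from the lecture), the width depends on the sample standard deviation s, which changes from sample to sample:

```latex
\bar{x} \pm t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}},
\qquad
\text{width} = 2\,t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
```

Two samples of the same size n share the same critical value t, so any difference in width comes from s.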

3
Q

why is the regression coefficient for each IV in multiple regression not the same as the one in simple regression?

A

because in multiple regression, the correlation among the IVs in their relationship to the DV is partialled out (removed)

in the multiple regression equation, each IV’s regression coefficient reflects its relationship with the DV independent of the other IVs (ie: holding the other IVs constant)

4
Q

what does intercept indicate in a regression model?

A

it is the predicted value of the DV when an individual scores 0 on every IV

a score of 0 cancels out the regression coefficient term for each IV, leaving only the intercept
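
Written out for a model with two IVs (the symbols b0, b1, b2, X1, X2 are notation assumed here for illustration), the cancellation looks like this:

```latex
\hat{Y} = b_0 + b_1 X_1 + b_2 X_2
\quad\Longrightarrow\quad
\hat{Y}\,\big|_{X_1 = 0,\; X_2 = 0} = b_0 + b_1 \cdot 0 + b_2 \cdot 0 = b_0
```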

5
Q

what does it mean when you use an unbiased interval estimator

A

means that the actual coverage rate will equal the nominal rate (eg: 95%) over the long run

6
Q

what does it mean when you use a biased interval estimator

A

means that the actual coverage rate would be SMALLER or LARGER than the nominal rate over the long run (eg: 89% or 96%)

7
Q

what does it mean when you use a consistent interval estimator

A

means that the actual coverage will get closer to the nominal rate (eg: 95%) over the long run as the sample size gets larger

8
Q

what does the nominal level of coverage rate refer to

A

it refers to the 95% in the description of a confidence interval

9
Q

what does multiple R-squared = 0.47 mean

A

means that 47% of the variation in the DV is explained by IV 1 and IV 2 together

10
Q

if sample size is 100 and there are 2 IVs, what is the (denominator) df?

A

97

df = sample size - numerator df (no. of IVs) - 1
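
The rule above can be written as a one-line function. This is a minimal sketch; the function name `residual_df` is hypothetical, and the answer of 97 corresponds to a model with 2 IVs.

```python
def residual_df(n, n_ivs):
    """Denominator (residual) df: sample size minus number of IVs minus 1."""
    if n <= n_ivs + 1:
        raise ValueError("sample size must exceed the number of IVs plus 1")
    return n - n_ivs - 1

print(residual_df(100, 2))  # 97
```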

11
Q

4 things we need in order to calculate a confidence interval on R2

A
  1. estimated value (of R2)
  2. numerator df (no of IVs)
  3. denominator df (n - no of IVs -1)
  4. desired level of confidence (0.95)
12
Q

what does it mean if your R2 and adjusted R2 are different

A

means that the observed R2 is a biased estimate of the population R2 (it tends to be too large)

adjusted R2 is not unbiased either, but it is less biased than the observed R2

13
Q

what affects the possible difference between R2 and adjusted R2?

A
  • the no. of IVs (eg: 2 IVs vs 20 IVs –> many IVs = greater obs r2, adj r2 would be smaller)
  • the sample size (smaller sample = obs r2 larger, adj r2 smaller)

many IVs + small sample size = larger difference between obs r2 and adj r2
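
A short sketch shows how the gap behaves. It assumes the commonly used adjustment 1 - (1 - R2)(n - 1)/(n - k - 1), which may not be the exact formula used in the lecture; the function name `adjusted_r2` and the sample sizes are illustrative.

```python
def adjusted_r2(r2, n, n_ivs):
    """Common adjusted R-squared: shrinks R2 according to n and the number of IVs."""
    return 1 - (1 - r2) * (n - 1) / (n - n_ivs - 1)

# Same observed R2 of 0.47 under different sample sizes and numbers of IVs.
for n, k in [(100, 2), (100, 20), (30, 2), (30, 20)]:
    adj = adjusted_r2(0.47, n, k)
    print(f"n={n:3d}, IVs={k:2d}: adjusted R2={adj:.3f}, gap={0.47 - adj:.3f}")
```

The gap is smallest with few IVs and a large sample, and largest with many IVs and a small sample. Under this formula the adjusted value can even fall below 0 in the extreme case.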

14
Q

2 ways to make regression coefficient directly comparable

A
  • use the standardised partial regression coefficient
  • use the squared semi-partial correlation

15
Q

how does standardised partial regression coefficient work

A
  • everything is expressed in standard deviation units
  • all observed scores are transformed to z scores (standardised to have a mean of 0 and an sd of 1)
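
The z-score transformation can be sketched in a few lines. The scores below are made up for illustration and the function name `z_scores` is hypothetical.

```python
import statistics

def z_scores(values):
    """Convert observed scores to z scores: subtract the mean, divide by the sd."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample sd (n - 1 in the denominator)
    return [(v - mean) / sd for v in values]

scores = [12.0, 15.0, 9.0, 20.0, 14.0]  # made-up observed scores
z = z_scores(scores)
print([round(v, 2) for v in z])
print("mean:", round(statistics.fmean(z), 6), "sd:", round(statistics.stdev(z), 6))
```

Fitting the regression on z scores of every variable gives the standardised coefficients. Equivalently, an unstandardised coefficient can be rescaled by multiplying it by sd(IV) / sd(DV).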

16
Q

how do you interpret standardised regression coefficient?

A

same as reporting an unstandardised regression coefficient, but you need to acknowledge what a one-unit change represents (one standard deviation unit)

17
Q

what does it mean when you have standardised regression coefficient of 0.74 in IV1

A

means that 1 sd increase in IV1 would result in 0.74 sd increase in DV, holding constant IV2

18
Q

what does it mean when you have standardised regression coefficient of -0.13 in IV2

A

means that for a 1 sd increase in IV2, there would be a decrease of 0.13 sd in DV, holding constant IV1

19
Q

what is semi-partial correlation

A

semipartial correlation is the correlation between the DV and the part of IV1 that is left after its correlation with IV2 has been removed (IV2 is removed from IV1 only, not from the DV)

20
Q

why is squared semi-partial correlation useful?

A
  • it corresponds directly to R2 (both are proportions of variation in the DV)
  • it indicates the unique proportion of variation in the DV explained by each IV
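
For the two-IV case, the link to R2 can be checked numerically from the three pairwise correlations. This is a sketch using standard two-predictor formulas; the correlation values and the function name `squared_semipartials` are made up for illustration.

```python
import math

def squared_semipartials(r_y1, r_y2, r_12):
    """Return (sr1 squared, sr2 squared, R squared) for a model with two IVs."""
    denom = 1 - r_12 ** 2  # zero if the two IVs are perfectly correlated
    sr1 = (r_y1 - r_y2 * r_12) / math.sqrt(denom)
    sr2 = (r_y2 - r_y1 * r_12) / math.sqrt(denom)
    r2 = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / denom
    return sr1 ** 2, sr2 ** 2, r2

# Assumed correlations: DV with IV1, DV with IV2, IV1 with IV2.
sr1_sq, sr2_sq, r2 = squared_semipartials(r_y1=0.60, r_y2=0.40, r_12=0.30)
print(f"R2={r2:.3f}, unique to IV1={sr1_sq:.3f}, unique to IV2={sr2_sq:.3f}")
```

Each squared semi-partial correlation equals the drop in R2 when that IV is removed from the model, which is why it can be read as the IV's unique contribution.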

21
Q

assumption of linear regression

A
  • independence of observations
  • linearity
  • constant variance of residuals (homoscedasticity)
  • normality (of residuals)
22
Q

what does independence of observations mean?

A

one person’s scores are independent of the others’

23
Q

what does linearity mean?

A

scores on the dv are an additive linear function of scores on the set of ivs

24
Q

what does homoscedasticity mean?

A

variance of residuals is the same at every score on each iv

25
Q

what does normality mean?

A

residuals are well-modelled by a normal distribution

qqplot –> points should fall within the ‘window’ (the band around the reference line)

26
Q

how to make sure independence of observation is met?

A
  • don’t duplicate scores (to make a bigger sample)
  • it is met as long as responses on one variable do not determine the responses on the other

27
Q

scatter plot with LOESS line –> how to tell if non linearity is present?

A

check whether the red dotted (LOESS) line is straight or not: a clearly curved line suggests non-linearity

28
Q

in multiple regression equation, each coefficient for each IV is called ……..

A

partial regression coefficient

29
Q

how do you get values of intercept and regression coefficients in multiple linear regression equation?

A

by the method of Ordinary Least Squares (whereby the intercept and regression coefficients are estimated in such a way as to minimise the sum of squared residuals)
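
A minimal sketch of least squares for exactly two IVs, written in plain Python by solving the normal equations on mean-centred scores. The data are made up and the function name `fit_two_iv_ols` is hypothetical; in practice a statistics package would do this.

```python
import statistics

def fit_two_iv_ols(x1, x2, y):
    """Return (b0, b1, b2) that minimise the sum of squared residuals."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 ** 2  # zero if the two IVs are perfectly collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Made-up data for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 4.9, 5.2, 7.8, 8.1, 10.9]

b0, b1, b2 = fit_two_iv_ols(x1, x2, y)
predicted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]
residuals = [obs - pred for obs, pred in zip(y, predicted)]  # observed minus predicted
ss_res = sum(r * r for r in residuals)
ss_tot = sum((v - statistics.fmean(y)) ** 2 for v in y)
r_squared = 1 - ss_res / ss_tot
print(f"b0={b0:.3f}, b1={b1:.3f}, b2={b2:.3f}, R2={r_squared:.3f}")
print("residuals:", [round(r, 3) for r in residuals])
```

The residuals come out with mixed signs and sum to zero (up to rounding), and nudging any coefficient away from the fitted value makes the sum of squared residuals larger.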

30
Q

What is a residual score in a linear regression model?

A

That part of the observed scores on the dependent variable not being explained by the regression model.

31
Q

which summary characteristics are of most interest to us in analysing data using a linear regression model with two independent variables

A
  • The overall strength of prediction of the model (indicated by the size of the R-squared statistic)
  • The relative strength of prediction of each independent variable (indicated by the standardised regression coefficient)
32
Q

Why might prediction be viewed as indicating an asymmetric relationship?

A

one variable (the DV) is defined to have a different function and role in the relationship from the other variables (the IVs): the IVs are used to predict the DV, not the other way round

33
Q

what defines an important deviation score that is central to multiple linear regression?

A

The predicted Y score deviating from the observed Y score.

34
Q

residual scores can be either positive or negative in value, true or false?

A

true

35
Q

what is R-squared statistic

A

It is a measure of the strength of prediction in the regression model.
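
One common way to write the statistic, using notation assumed here rather than taken from the lecture (y_i observed, ŷ_i predicted, ȳ the mean of the DV):

```latex
R^2 = 1 - \frac{SS_{\text{residual}}}{SS_{\text{total}}}
    = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
```

When the model predicts no better than the mean of the DV, the two sums are equal and R2 is 0. When every residual is 0, R2 is 1.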

36
Q

what does R-squared value of 0 mean

A

no prediction (the IVs explain none of the variation in the DV)

37
Q

What is the difference between an unstandardised partial regression coefficient and a standardised partial regression coefficient?

A

A standardised partial regression coefficient is estimated in a multiple linear regression model using z scores rather than observed scores.

  • NOT simple linear regression
38
Q

If a standardised regression coefficient is ‒0.25, what does it mean?

A

a decrease in IV1 of 1 sd is associated with an increase of 0.25 sd in the DV (equivalently, a 1 sd increase in IV1 goes with a 0.25 sd decrease in the DV), holding constant IV2

39
Q

Why is using a straight line function different to displaying a two-dimensional scatterplot of real data ?

A

Real data would (almost certainly) not form a straight line of individual data values.

If real data formed a straight line in a two-dimensional scatterplot, this would imply that the values of the X variable perfectly predict the values of the Y variable.