Sabinas lecture 1 to 7 Flashcards

1
Q

Variance & SD

A

Variance (σ² or V) = the degree to which a variable ‘varies’ around its mean.
V = Σ(X − X̄)² / (N − 1) = SS/df
SD = √V (the square root of the variance; it is in the same units as the variable, so easier to interpret)
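
A minimal Python sketch (not from the lecture; the scores are made up) of V = SS/df with the N − 1 denominator, and SD as its square root:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # hypothetical scores

ss = np.sum((x - x.mean()) ** 2)   # sum of squares, SS
df = len(x) - 1                    # degrees of freedom, N - 1
variance = ss / df                 # V = SS/df
sd = np.sqrt(variance)             # SD = square root of the variance

print(variance, sd)
print(np.var(x, ddof=1), np.std(x, ddof=1))  # the same values via NumPy's ddof=1
```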

2
Q

Covariance

A

CoV = the degree to which two variables ‘vary’ simultaneously or co-vary
Note: the variance of a variable is… its covariance with itself.
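
A small illustration (hypothetical data, not from the lecture) of covariance, including the note above that a variable’s covariance with itself is its variance:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
print(cov_xy)                                 # covariance of X and Y
print(np.cov(x, y)[0, 1])                     # same value from NumPy's covariance matrix
print(np.cov(x, x)[0, 1], np.var(x, ddof=1))  # cov(X, X) equals var(X)
```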

3
Q

Correlation

A

The degree of linear relationship between two variables; essentially, it is a standardised covariance.
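
A sketch (same hypothetical data as above) of correlation as a standardised covariance, r = CoV(X, Y) / (SD_X × SD_Y):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

cov_xy = np.cov(x, y)[0, 1]
r = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))
print(r)
print(np.corrcoef(x, y)[0, 1])  # the same value from NumPy's correlation matrix
```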

4
Q

Continuous versus discrete variables

A

Variables can be continuous (able to take any value within a range) or discrete/categorical (able to take only a limited set of distinct values).

5
Q

Regression sum of squares

A

The regression sum of squares is about what we can predict; (1 − R²) is the proportion we cannot predict.
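
A sketch (made-up data) of the split into SSregression (what we can predict) and SSresidual, with R² = SSregression / SStotal and (1 − R²) as the unpredicted proportion:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.3, 6.9])

b, a = np.polyfit(x, y, 1)            # least-squares slope and intercept
y_hat = a + b * x                     # predicted values Y'

ss_total = np.sum((y - y.mean()) ** 2)
ss_regression = np.sum((y_hat - y.mean()) ** 2)
ss_residual = np.sum((y - y_hat) ** 2)

r_squared = ss_regression / ss_total
print(r_squared, 1 - r_squared)       # predicted vs. unpredicted proportions
```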

6
Q

how to work out t

A

t = b / SEb
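
A sketch with hypothetical numbers: t is the coefficient divided by its standard error, evaluated against the residual df (N − k − 1):

```python
from scipy import stats

b = 0.42       # unstandardised coefficient (made up)
se_b = 0.15    # its standard error (made up)
n, k = 50, 2   # sample size and number of predictors (made up)

t = b / se_b
df_residual = n - k - 1
p = 2 * stats.t.sf(abs(t), df_residual)  # two-tailed p-value
print(t, p)
```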

7
Q

df Residual

A

The residual df goes with the part of the DV we cannot predict: df residual = N − k − 1, where k is the number of predictors.
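
A tiny worked example (hypothetical N and k) of how the df split up:

```python
n = 100  # sample size (made up)
k = 3    # number of predictors (made up)

df_regression = k          # regression df = number of IVs
df_residual = n - k - 1    # N - k - 1
df_total = n - 1
print(df_regression, df_residual, df_total)  # 3, 96, 99
```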

8
Q

Confidence Intervals (CI)

A

b is an estimate of the population parameter. Ultimately, we want to know the true value of the regression coefficient. The CI helps to illustrate this idea (i.e., if we conducted this research 100 times, roughly XX% of those intervals would contain the true (yet unknown) slope).

9
Q

how to use CIs

A

If the range includes 0, then we conclude that the finding is NOT statistically significant, and vice versa.
› We can also use the CI to test whether the slope is different from a particular value (e.g., whether this slope is different from the one found in previous studies).
› SPSS does not calculate CI automatically

The CI is, in a sense, our parameter range: if I performed the experiment 100 times, this is the range I would expect b to fall in.
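
A sketch (hypothetical coefficient, SE, N and k) of building a 95% CI as b ± t_crit × SEb and using it in the way described above:

```python
from scipy import stats

b, se_b = 0.42, 0.15   # made-up coefficient and standard error
n, k = 50, 2           # made-up sample size and number of predictors

t_crit = stats.t.ppf(0.975, n - k - 1)           # two-tailed 95% critical value
lower, upper = b - t_crit * se_b, b + t_crit * se_b
print(lower, upper)
print("0 inside CI?", lower <= 0 <= upper)       # True => not significant at .05
print("1.5 inside CI?", lower <= 1.5 <= upper)   # testing a non-zero null value
```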

10
Q

Converting from b to β [in italics!]

A
β = b × (SDx / SDy)
b = β × (SDy / SDx) = β × √(Vy / Vx)
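
A sketch (made-up slope and SDs) of converting between b and β:

```python
import numpy as np

b = 0.8                 # made-up unstandardised slope
sd_x, sd_y = 2.0, 5.0   # made-up standard deviations of X and Y

beta = b * (sd_x / sd_y)                   # beta = b * (SD_X / SD_Y)
b_back = beta * (sd_y / sd_x)              # back to b
print(beta, b_back)
print(beta * np.sqrt(sd_y**2 / sd_x**2))   # the variance form gives the same b
```
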
11
Q

If B is equal to zero

A

If b = 0, the variable adds nothing to the equation and is not important, but it will still be featured in the regression equation (DO NOT TAKE IT OUT).
The most common null hypothesis is that b = 0: the slope is not different from zero, and nothing systematic is happening.

The null value does not have to be zero. For example, is the new slope different from 1.5 or not? If the CI includes that value, it is fine as a null hypothesis.

12
Q

MR advantages

A

Can use both categorical and continuous independent variables
› Can easily incorporate multiple independent variables
› Is appropriate for the analysis of experimental or nonexperimental research

13
Q

Factors Affecting the Results of the Regression Equation

A
› Sample size (N)
› The amount of scatter of the points around the regression line [indexed by Σ(Y − Y′)², i.e. SSresidual]. Other things being equal, the smaller SSresidual, the larger SSregression, and hence the larger the F-ratio.
› The range of values in the X variable, indexed by Σ(X − X̄)²

14
Q

Assumptions Underlying MR (only a glimpse now)

A

Dependent variable is a linear function of the IVs
- a violation can be overlooked if one selects only extreme cases of X… selecting only extreme cases can ‘force’ the regression to appear linear, even if the relationship might be curvilinear across the X values. Bad practice…
› Each observation is drawn independently
› Errors are normally distributed
› The mean of errors is = 0
› Errors are not correlated with each other, nor with the IV
› Homoscedasticity of variance
- Variance of errors is not a function of IVs
- The variance of errors at all values of X is constant, meaning that it is the same at all levels of IV

15
Q

reg df

A

The number of IVs (k)

16
Q

do you report the non significant parts in regression conclusion?

A

YES

17
Q

decimal places for b

A

Three decimal places (e.g., .003).

18
Q

what happens when you shorten (restrict) the range of the sample on the line graph?

A

b stays the same, but β changes: the distribution is different, so the SDs change.

19
Q

why is β the same as rY2 when the two IVs don’t correlate?

A

Because there is no overlap between the IVs in the Venn diagram: β = rY2 when r12 = 0.

20
Q

assumptions of error

A

We assume that errors are normally distributed, independent, and have constant variance.

21
Q

regression line

A

The regression line weights the IVs differentially so that prediction is optimised and the sum of the squared errors of prediction is minimised. That is, the sum of squared values for each residual term is smaller than for any other possible straight line, hence the term least squares.

22
Q

β way of writing conclusions

A

Standard scores or standard deviations, not standard units.

23
Q

what is a different metric

A

A different metric includes a different scale of the same dimension: cm is a different metric from hours, but cm is also a different metric from metres. The metrics must be exactly the same to compare b’s directly; otherwise use β.

24
Q

when is something not a common cause

A

The a, b and c paths are equivalent to β’s, where the DV is VarY, regressed on variables X1 and X2.
› If VarX1 has no effect on Y (b = 0), but it has an effect on X2, then:
- it is not a common cause
- βYX2 = rYX2 = c
- c does not change with the inclusion or exclusion of X1
› OR: if VarX1 has no effect on X2 (a = 0), but it has an effect on Y, then:
- it is not a common cause
- βYX2 = rYX2 = c
- c does not change with the inclusion or exclusion of X1

25
Q

importance of r2

A

For explanation, a high R² is less important than proper variable selection.

R² should be within the expected range
- Explaining 25% of the variance may be surprisingly high for some questions, low for others
› A high (?) R² is important for prediction
› “Human freedom may then rest in the error term”

26
Q

Indirect Effects

A

The regression weight for Parent Education changed because a mediating variable (Previous Achievement) was included in the model.
› A portion of the direct effect from the first regression is now indirect (e.g., paths d and a)
› Mediating variables do not have to be included to interpret regression coefficients as effects
› However, this type of regression only focuses on direct effects.

27
Q

what is the mean when you standardise something?

A

Every time you standardise a variable, its mean will be (very close to) zero, like a z-score distribution.
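
A quick check with made-up data that standardising gives a mean of (essentially) zero and an SD of 1, like a z-score distribution:

```python
import numpy as np

x = np.array([3.0, 7.0, 8.0, 5.0, 12.0, 9.0])
z = (x - x.mean()) / x.std(ddof=1)
print(z.mean(), z.std(ddof=1))  # ~0 (within rounding error) and 1
```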

28
Q

what do you look at when you have intercorrelations output?

A

- Correlations that our IVs have with our DV
- Correlations that our IVs have with each other

29
Q

including a common cause

A

Prevents inflating the other variables’ b’s and β’s.

30
Q

df for change statistics in sequential

A

Always 1, because only one variable is added at a time.

31
Q

order of entry

A

The variable entered first has the most opportunity to capture the highest proportion of variance; the one entered last has a tiny ∆R².

32
Q

total effects

A

The direct effect plus e × d (the indirect-effect paths multiplied together).
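
A tiny worked example (hypothetical path coefficients): total effect = direct effect + the indirect-effect paths multiplied together:

```python
direct = 0.30  # made-up direct path from the IV to the DV
e = 0.50       # made-up path from the IV to the mediator
d = 0.40       # made-up path from the mediator to the DV

indirect = e * d
total = direct + indirect
print(indirect, total)  # 0.2 and 0.5
```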

33
Q

importance measure better than ∆R²

A

√∆R²

34
Q

Unique Variance

A

Some researchers add each variable last in a sequential regression to determine its “unique” effect/variance
› Can get the same information in simultaneous regression, requesting semipartial (part) correlations
› Square the part correlations to determine unique variance
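
A sketch for the two-predictor case (made-up correlations): the squared semipartial (part) correlation of an IV equals its unique variance, i.e. the ∆R² it would get if entered last:

```python
import numpy as np

r_y1, r_y2, r_12 = 0.50, 0.40, 0.30   # hypothetical correlations

# part correlation of X1 with Y (X2 partialled out of X1 only)
sr1 = (r_y1 - r_y2 * r_12) / np.sqrt(1 - r_12**2)

# R^2 with both IVs minus R^2 with X2 alone gives the same unique variance
r2_full = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
print(sr1**2, r2_full - r_y2**2)      # both ~0.159
```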

35
Q

what to do with stepwise

A

A large N and cross-validation are necessary.

36
Q

interactions and curves

A

Test for interactions by sequentially adding a cross-product term to the regression
› Test for curves in the regression plane by sequentially adding powers of variables (e.g., variable²)
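
A sketch (simulated data, not from the lecture) of building the cross-product and power terms that get added sequentially, and looking at the ∆R² for each step:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
y = 1.0 + 0.5 * x1 + 0.3 * x2 + 0.4 * x1 * x2 + rng.normal(size=100)

def r_squared(predictors, y):
    """R^2 from a least-squares fit of y on the given predictor columns."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coefs
    return 1 - residuals.var() / y.var()

r2_main = r_squared([x1, x2], y)                    # main effects only
r2_inter = r_squared([x1, x2, x1 * x2], y)          # add the cross-product term
r2_curve = r_squared([x1, x2, x1 * x2, x1**2], y)   # add a power (squared) term
print(r2_inter - r2_main, r2_curve - r2_inter)      # DeltaR^2 at each step
```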

37
Q

purpose sequential

A

Is a variable (or block of variables) important for an outcome?
- Does a variable explain/predict variance beyond that explained by other influences?
- Does a variable explain/predict unique variance in an outcome?
- Test for statistical significance of interactions and curves
- Does a variable aid in predicting some criterion?

38
Q

What to Interpret in sequential

A

Magnitude/importance: √∆R²

Statistical significance: the test of ∆R² (the F change statistic)
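
A sketch (hypothetical numbers) of how the significance of ∆R² is tested with the F-change statistic, F = (∆R² / df_change) / ((1 − R²_full) / (N − k_full − 1)):

```python
from scipy import stats

r2_reduced, r2_full = 0.20, 0.26   # made-up R^2 before and after adding one IV
n, k_full = 120, 4                 # made-up sample size and total number of predictors
df_change = 1                      # one variable added at a time (see the earlier df card)

delta_r2 = r2_full - r2_reduced
df_resid = n - k_full - 1
f_change = (delta_r2 / df_change) / ((1 - r2_full) / df_resid)
p = stats.f.sf(f_change, df_change, df_resid)
print(f_change, p)
```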

39
Q

when to use sequential

A

Useful for explanation when guided by theory
› Useful for testing interactions & curves
› Estimates total effects in implied model
› More ‘similar’ to ANOVA method (?)

be careful with order

40
Q

alternatives to Stepwise

A

Simultaneous regression
› Sequential regression (final equation)
› Study correlations between IVs. If some are highly
intercorrelated, consider combining them in a composite.
› SEM (…?…)