QDA2 Flashcards

1
Q

Variance

A

statistical measurement (1) of the spread (2) between numbers in a data set (3)
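
A minimal sketch of the calculation, using made-up numbers (the data set below is an assumption for illustration only). It shows variance as the average squared distance from the mean, in both the population and the sample version.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)

# Population variance: average squared distance from the mean.
population_variance = sum((x - mean) ** 2 for x in scores) / len(scores)

# Sample variance divides by n - 1 instead of n.
sample_variance = statistics.variance(scores)

print(population_variance)
print(sample_variance)
```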

2
Q

Why (2) do we use PCA

A
  1. go from a large number of variables to a smaller number of variables
  2. helps to show which variables are strongly related/separated
3
Q

Data requirements PCA

A

1) quantitative variables (normally distributed, based on correlation)
2) large numbers of observations
3) strong correlation amongst variables (>0.3)

4
Q

What does the b stand for in the equation for PCA?

A

component loadings (= how strongly each variable relates to the component)

5
Q

difference between common and unique variance

A

common variance is variance that is shared with other variables in the data set; unique variance is variance that is specific to one variable.

6
Q

communality

A

proportion of common variance that is present in a variable; the higher the communality, the more common variance
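
A small sketch of how a communality is usually obtained, assuming standardized variables and uncorrelated (orthogonal) components. The loadings are invented for illustration.

```python
# Hypothetical loadings of one variable on two retained components.
loadings = [0.8, 0.3]

# Communality: sum of the squared loadings across the retained components.
communality = sum(b ** 2 for b in loadings)

# A standardized variable has total variance 1, so the remainder
# is the variance not captured by the retained components.
unique_variance = 1 - communality

print(round(communality, 2))
print(round(unique_variance, 2))
```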

7
Q

what does an eigenvalue (EV) > 1 mean?

A

the component explains more variance than an individual variable
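
Why 1 is the cut-off can be sketched as follows, assuming the PCA is run on standardized variables, so each variable contributes a variance of 1 and the eigenvalues sum to the number of variables. The eigenvalues are hypothetical.

```python
# Hypothetical eigenvalues from a PCA on 5 standardized variables.
eigenvalues = [2.4, 1.3, 0.7, 0.4, 0.2]
n_variables = len(eigenvalues)  # the eigenvalues sum to the number of variables

# Keep only components that explain more than one variable's worth of variance.
retained = [ev for ev in eigenvalues if ev > 1]

# Share of the total variance explained by each retained component.
proportion_explained = [ev / n_variables for ev in retained]

print(retained)
print(proportion_explained)
```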

8
Q

what does the Kaiser-Meyer-Olkin (KMO) measure do?

A

compares observed correlations with partial correlations.
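
A rough sketch of that comparison for three variables, where each partial correlation controls for the one remaining variable. The correlations are invented, and the helper name `partial_corr` is hypothetical.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after controlling for a third variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical observed correlations among three variables.
r12, r13, r23 = 0.6, 0.5, 0.4

observed = [r12, r13, r23]
partial = [
    partial_corr(r12, r13, r23),  # variables 1 and 2, controlling for 3
    partial_corr(r13, r12, r23),  # variables 1 and 3, controlling for 2
    partial_corr(r23, r12, r13),  # variables 2 and 3, controlling for 1
]

sum_r2 = sum(r ** 2 for r in observed)
sum_p2 = sum(p ** 2 for p in partial)

# KMO: squared correlations relative to squared correlations plus squared partials.
kmo = sum_r2 / (sum_r2 + sum_p2)
print(round(kmo, 3))
```

Small partial correlations relative to the observed correlations push the value toward 1.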

9
Q

what's the conclusion for Kaiser-Meyer-Olkin

A

KMO = x and x > 0.5, so there is more shared variance than unique variance.

10
Q

what's the conclusion for Bartlett's test

A

p = x and x < 0.05, so significant; this means that the correlations between items differ from zero. H0 = all variables are uncorrelated, so we can reject H0.
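
A sketch of the test statistic behind this conclusion, using the standard chi-square approximation for Bartlett's test of sphericity. The correlation matrix and sample size are hypothetical, and instead of a p-value the statistic is compared with the 5% critical value for 3 degrees of freedom (7.815).

```python
import math

# Hypothetical correlation matrix for p = 3 variables and n = 100 observations.
R = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
n, p = 100, 3

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

det = det3(R)  # equals 1 only when all variables are uncorrelated

# Bartlett's chi-square approximation and its degrees of freedom.
chi_square = -(n - 1 - (2 * p + 5) / 6) * math.log(det)
df = p * (p - 1) // 2

critical_value = 7.815  # chi-square critical value for df = 3, alpha = 0.05
reject_h0 = chi_square > critical_value
print(round(chi_square, 2), df, reject_h0)
```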

11
Q

what is component rotation

A

redistributes explained variance over the components and makes component loadings more extreme. This makes it easier to assign variables to components and to interpret them.

12
Q

what is the conclusion of R²

A

the model explains x% of the variance in the OV
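
Where the x% comes from can be sketched like this, with invented observed and predicted values of the OV.

```python
# Hypothetical observed OV values and the model's predictions.
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]

mean_observed = sum(observed) / len(observed)

# Total variation in the OV and the part the model fails to explain.
ss_total = sum((y - mean_observed) ** 2 for y in observed)
ss_residual = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))

r_squared = 1 - ss_residual / ss_total
print(f"The model explains {r_squared:.0%} of the variance in the OV")
```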

13
Q

when it comes to the follow-up tests, if the variances are not homogeneous, what do we use then?

A

Welch’s test

14
Q

what's the disadvantage of post hoc tests

A

artificial inflation of alpha
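
How quickly alpha inflates can be illustrated with the familywise error rate, under the simplifying assumption that the pairwise tests are independent. The number of groups is hypothetical.

```python
from math import comb

alpha = 0.05   # error rate per single comparison
n_groups = 4   # hypothetical number of groups

# Every pair of groups is compared once.
n_comparisons = comb(n_groups, 2)

# Chance of at least one false positive across all comparisons,
# assuming the comparisons are independent.
familywise_error = 1 - (1 - alpha) ** n_comparisons

print(n_comparisons)
print(round(familywise_error, 3))
```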

15
Q

What do you do if an item loads on an unexpected component

A

you do nothing; apparently the items correlate more strongly with a latent factor that is not the one they were expected to load on. This doesn't make the analysis unreliable; it basically indicates that it is good that you did the analysis.

16
Q

What does Tom have to do after component rotation, before he goes further with his analysis?

A

Recode the negatively loaded items
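
One common way to do such a recode is reverse-coding, sketched here under the assumption that the items are scored on a 1-5 scale. The scale and the responses are hypothetical.

```python
# Assumed answer scale of the item (hypothetical 1-5 Likert scale).
scale_min, scale_max = 1, 5

# Hypothetical responses on an item that loads negatively.
responses = [1, 2, 5, 4, 3]

# Reverse-code: the lowest score becomes the highest and vice versa.
recoded = [scale_min + scale_max - r for r in responses]

print(recoded)
```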

17
Q

5 criteria ANOVA

A

1) categorical PV, quantitative OV
2) residuals are normally distributed
3) homogeneity of variances
4) mutually exclusive groups (no overlap)
5) equal sample sizes

18
Q

what's the H0 of ANOVA

A

H0: µ1 = µ2 = µ3 (all group means are equal)
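
A minimal sketch of the F statistic that tests this H0 in a one-way ANOVA, with three invented groups. A large F means the group means differ more than the spread within the groups would suggest.

```python
# Hypothetical OV scores for three groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)

# Variation of the group means around the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Variation of the scores around their own group mean.
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_statistic = (ss_between / df_between) / (ss_within / df_within)
print(f_statistic)
```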

19
Q

categorical vs quantitative

A

Categorical: nominal, ordinal
Quantitative: ratio, discrete, interval (temperature), continuous (weight, age)

20
Q

make table and show criteria

A

Do they sum to zero horizontally? Do they sum to zero vertically (orthogonality)? Why is there orthogonality: because groups that are used together in one part cannot be used again, otherwise alpha inflates.
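
Read as a table of contrast weights (this reading, the table orientation and the weights themselves are assumptions), the two checks could look like this: the weights within each contrast sum to zero, and the products of the weights across contrasts sum to zero.

```python
# Hypothetical contrast weights for three groups.
contrast_1 = [-2, 1, 1]  # group 1 versus groups 2 and 3 together
contrast_2 = [0, -1, 1]  # group 2 versus group 3

# Check 1: the weights within each contrast sum to zero.
weights_sum_to_zero = sum(contrast_1) == 0 and sum(contrast_2) == 0

# Check 2 (orthogonality): the products of the weights sum to zero.
products = [a * b for a, b in zip(contrast_1, contrast_2)]
orthogonal = sum(products) == 0

print(weights_sum_to_zero, products, orthogonal)
```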

21
Q

if they ask which PV has the strongest effect on the OV, we use

A

partial eta squared
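
A short sketch of the comparison, using invented sums of squares for two PVs (the names `PV_A` and `PV_B` are placeholders). Partial eta squared is the effect's sum of squares divided by the effect plus error sums of squares.

```python
# Hypothetical sums of squares from an ANOVA table.
ss_error = 60.0
ss_effects = {"PV_A": 30.0, "PV_B": 10.0}

# Partial eta squared per PV: SS_effect / (SS_effect + SS_error).
partial_eta_squared = {
    name: ss / (ss + ss_error) for name, ss in ss_effects.items()
}

# The PV with the largest value has the strongest effect on the OV.
strongest = max(partial_eta_squared, key=partial_eta_squared.get)
print(partial_eta_squared, strongest)
```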

22
Q

main effect on direction and position

A

direction: core PV, position: legend

23
Q

when you don't have the p-value for the t statistic nor the CI and you need to check whether a PV will have an impact on the OV, what do you do?

A

if t = 2 or t > 2 (in absolute value), it has a significant impact; otherwise it is too low to have an impact
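
The rule of thumb can be sketched as follows, assuming the t statistic is the coefficient divided by its standard error and that the sample is large enough for 2 to approximate the 5% critical value. The coefficient and standard error are made up.

```python
# Hypothetical regression output for one PV.
coefficient = 0.50
standard_error = 0.20

# t statistic: how many standard errors the coefficient is away from zero.
t_statistic = coefficient / standard_error

# Rule of thumb: an absolute t of about 2 or more counts as significant.
significant = abs(t_statistic) >= 2

print(t_statistic, significant)
```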

24
Q

use two statistics to conclude if the second model is better than the first

A

F change with its p-value, and the R²/adjusted R² value

25
Q

what is multicollinearity and what's the problem with it?

A

two or more PVs in a multiple regression model are highly correlated, which affects the significance tests of the coefficients

26
Q

why do we use dummy variables

A

If you enter Education as a PV, SPSS will treat the nominal categories as numeric, meaning it will estimate a linear
effect for this PV, which makes no sense for nominal PVs.
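
What dummy coding does instead can be sketched like this. The education categories, the choice of reference category and the `edu_` column names are all hypothetical.

```python
# Hypothetical nominal PV with three categories.
education = ["primary", "secondary", "tertiary", "secondary"]

# One category serves as the reference and gets no dummy of its own.
reference = "primary"
categories = sorted(set(education) - {reference})

# One 0/1 dummy per remaining category, so no linear order is imposed.
dummies = [
    {f"edu_{c}": int(value == c) for c in categories} for value in education
]

for value, row in zip(education, dummies):
    print(value, row)
```

Each dummy's coefficient then compares its category with the reference category.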

27
Q

why is the F test used

A

The additional explanatory power of model 2
compared to model 1 is formally tested in the F-change test.
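
A sketch of the F-change statistic for two nested regression models. The R² values, numbers of PVs and sample size are invented for illustration.

```python
# Hypothetical results for two nested regression models.
n = 105                 # number of observations
r2_model1, k1 = 0.30, 2  # R squared and number of PVs in model 1
r2_model2, k2 = 0.40, 4  # R squared and number of PVs in model 2

# Extra variance explained per added PV ...
numerator = (r2_model2 - r2_model1) / (k2 - k1)

# ... relative to the unexplained variance per residual degree of freedom.
denominator = (1 - r2_model2) / (n - k2 - 1)

f_change = numerator / denominator
df1, df2 = k2 - k1, n - k2 - 1

print(round(f_change, 2), df1, df2)
```

The p-value of this F with (df1, df2) degrees of freedom tells whether model 2 explains significantly more than model 1.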
