QDA2 Flashcards
Variance
statistical measurement (1) of the spread (2) between numbers in a data set (3)
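A quick worked example may help here. The sketch below uses made-up scores and computes the variance by hand (average squared distance from the mean), next to the population variance from Python's statistics module.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, only for illustration

mean = statistics.mean(scores)

# Sample variance: summed squared deviations from the mean, divided by n - 1.
sample_var = sum((x - mean) ** 2 for x in scores) / (len(scores) - 1)

# Population variance: same sum, divided by n.
population_var = statistics.pvariance(scores)

print(sample_var, population_var)
```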
Why (2) do we use PCA
- go from large number of variables to smaller number of variables
- helps to show which variables are strongly related/separated
Data requirements PCA
1) quantitative variables (normally distributed, based on correlation)
2) large numbers of observations
3) strong correlation amongst variables (>0.3)
What does the b stand for in the equation for PCA
component loadings (= how strongly each variable relates to the component)
difference between common and unique variance
common variance is variance that is shared among other variables in the data set and unique variance is variance that is specific to a variable.
communality
proportion of common variance that is present in a variable; the higher the communality, the more common variance
what does EV > 1 mean
component explains more variance than an individual variable
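Why the cut-off is 1: with standardized variables, each variable contributes a variance of 1, so the eigenvalues add up to the number of variables. A minimal sketch for the simplest case, two variables with a hypothetical correlation r, where the eigenvalues of the correlation matrix are known to be 1 + |r| and 1 - |r|:

```python
r = 0.6  # hypothetical correlation between two standardized variables

# Eigenvalues of the 2x2 correlation matrix [[1, r], [r, 1]].
eigenvalues = [1 + abs(r), 1 - abs(r)]

# The eigenvalues sum to the number of variables (here 2).
total_variance = sum(eigenvalues)

# Keep only components that explain more than one variable's worth of variance.
retained = [ev for ev in eigenvalues if ev > 1]

print(eigenvalues, total_variance, retained)
```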
what does the Kaiser-Meyer-Olkin (KMO) test measure?
compares observed correlations with partial correlations.
what's the conclusion for Kaiser-Meyer-Olkin
KMO = x and x > 0.5, so there is more shared variance than unique variance.
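A rough sketch of how the overall KMO value can be computed, assuming only three items with made-up correlations. With three items, the partial correlation of a pair is simply the correlation after controlling for the third item. The correlations and helper names below are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical observed correlations between three items (0, 1, 2).
r = {(0, 1): 0.6, (0, 2): 0.5, (1, 2): 0.4}

def corr(i, j):
    return r[(min(i, j), max(i, j))]

def partial(i, j, k):
    """Correlation between items i and j after controlling for item k."""
    num = corr(i, j) - corr(i, k) * corr(j, k)
    den = math.sqrt((1 - corr(i, k) ** 2) * (1 - corr(j, k) ** 2))
    return num / den

pairs = list(combinations(range(3), 2))
sum_r2 = sum(corr(i, j) ** 2 for i, j in pairs)
sum_p2 = sum(partial(i, j, ({0, 1, 2} - {i, j}).pop()) ** 2 for i, j in pairs)

# KMO: squared correlations relative to squared correlations plus squared partials.
kmo = sum_r2 / (sum_r2 + sum_p2)
print(round(kmo, 3))
```

A value above 0.5 means the observed correlations outweigh the partial correlations.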
what's the conclusion for Bartlett's test
p = x, so significant; this means that the correlations between items differ from zero. H0 = all variables are uncorrelated, so we can reject H0.
what is component rotation
redistributes the explained variance over the components and makes the component loadings more extreme. This makes it easier to assign items to components and to interpret them.
what is the conclusion of R2
the model explains x% of the variance in the OV
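For reference, R2 is one minus the unexplained share of the variance in the OV. A small sketch with made-up observed and predicted values (the predictions are hypothetical, not fitted here):

```python
observed = [3.0, 5.0, 7.0, 10.0]   # made-up OV scores
predicted = [3.5, 5.0, 6.5, 10.0]  # hypothetical model predictions

mean_obs = sum(observed) / len(observed)

# Variation the model fails to explain.
ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
# Total variation in the OV around its mean.
ss_total = sum((y - mean_obs) ** 2 for y in observed)

r_squared = 1 - ss_residual / ss_total
print(f"the model explains {r_squared:.1%} of the variance in the OV")
```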
when it comes to the follow-up tests, if the variances are not homogeneous, what do we use then?
Welch’s test
what's the disadvantage of post hoc tests
artificial inflation of alpha
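To see how fast alpha inflates, here is a minimal sketch assuming k independent comparisons, each tested at the same alpha. The function name is made up for illustration.

```python
def familywise_error(alpha, k):
    """Chance of at least one false positive across k independent tests."""
    return 1 - (1 - alpha) ** k

for k in (1, 3, 6, 10):
    print(k, round(familywise_error(0.05, k), 3))
```

With three groups there are already three pairwise comparisons, so the overall error rate is clearly above 0.05 unless a correction is applied.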
What do you do if an item loads on an unexpected component
you do nothing; apparently the item correlates more strongly with a latent factor that is not the one it was expected to load on. This doesn't make the analysis unreliable; it basically indicates that it is good that you did the analysis.
What does Tom have to do after component rotation, before he goes further with his analysis?
Recode the negatively loaded factors
5 criteria ANOVA
1) categorical PV, quantitative OV
2) residuals are normally distributed
3) homogeneity of variances
4) mutually exclusive groups (no overlap)
5) EQUAL SAMPLE SIZES
what's the H0 of ANOVA
H0: µ1 = µ2 = µ3
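How this H0 gets tested: the F statistic compares the variance between the group means with the variance within the groups. A by-hand sketch with three made-up groups:

```python
from statistics import mean

# Made-up scores for three groups.
groups = {"g1": [4, 5, 6], "g2": [6, 7, 8], "g3": [9, 10, 11]}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Between-groups: how far each group mean is from the grand mean.
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
# Within-groups: how far each score is from its own group mean.
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f_stat)
```

A large F means the group means differ more than the within-group noise would suggest, which is evidence against H0.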
categorical vs quantitative
Categorical: nominal, ordinal
Quantitative: ratio, discrete, interval (temperature), continuous (weight, age)
make a contrast table and show the criteria
do the weights sum to zero horizontally? do they sum to zero vertically (orthogonality)? Why is orthogonality needed: once groups have been used together in one contrast, you cannot compare them again, otherwise you get inflation of alpha
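A small check along these lines, under one reading of the card: "horizontally" is taken to mean that each contrast's weights sum to zero, and "vertically" that the column-wise products of two contrasts sum to zero. The weights are hypothetical.

```python
from itertools import combinations

# Hypothetical contrast table: one row per contrast, one column per group.
contrasts = [
    [2, -1, -1],  # group 1 vs groups 2 and 3
    [0, 1, -1],   # group 2 vs group 3
]

# Horizontal check: each row of weights sums to zero.
rows_sum_to_zero = all(sum(row) == 0 for row in contrasts)

# Vertical check: multiply weights per column, then sum; zero means orthogonal.
orthogonal = all(
    sum(a * b for a, b in zip(c1, c2)) == 0
    for c1, c2 in combinations(contrasts, 2)
)

print(rows_sum_to_zero, orthogonal)
```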
if they ask which PV has the strongest effect on the OV, we use
partial eta squared
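Partial eta squared is the effect's sum of squares divided by that sum plus the error sum of squares. A sketch with made-up sums of squares for two hypothetical PVs:

```python
# Hypothetical sums of squares from an ANOVA table.
ss_effect = {"pv_a": 40.0, "pv_b": 10.0}
ss_error = 50.0

# Partial eta squared per PV: SS_effect / (SS_effect + SS_error).
partial_eta_sq = {name: ss / (ss + ss_error) for name, ss in ss_effect.items()}

# The PV with the largest value has the strongest effect on the OV.
strongest = max(partial_eta_sq, key=partial_eta_sq.get)
print(partial_eta_sq, strongest)
```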
main effect on direction and position
direction: core PV, position: legend
when you don't have the p value for the t statistic nor the CI and you need to check if a PV will have an impact on the OV, what do you do?
look at the t statistic: if it is 2 or greater (in absolute value), the PV has a significant impact; otherwise it is too low to have an impact
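The t statistic itself is the coefficient divided by its standard error, so the rule of thumb can be sketched as below. The function names and numbers are made up; the cut-off of 2 only approximates the 5% level in reasonably large samples.

```python
def t_statistic(b, se):
    """t value of a regression coefficient: estimate divided by its standard error."""
    return b / se

def likely_significant(b, se, cutoff=2.0):
    # Rule of thumb: |t| of about 2 or more suggests a significant effect.
    return abs(t_statistic(b, se)) >= cutoff

print(likely_significant(-0.9, 0.3))  # |t| is about 3
print(likely_significant(0.4, 0.3))   # |t| is about 1.33
```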
use two statistics to conclude if the second model is better than the first
F change with its p value, and the R square / adjusted R square value
what is multicollinearity and what's the problem with it?
two or more PVs in a multiple regression model are highly correlated, which affects the significance testing of the coefficients
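One common way to put a number on this is the variance inflation factor (VIF), which is not mentioned on this card and is added here only as an illustration. A sketch for the simplest case of two PVs, where the R2 of one PV regressed on the other is just their squared correlation:

```python
def vif_two_predictors(r):
    """VIF for a model with exactly two PVs that correlate r with each other."""
    r_squared = r ** 2  # R2 of one PV predicted from the other
    return 1 / (1 - r_squared)

for r in (0.0, 0.5, 0.9):
    print(r, round(vif_two_predictors(r), 2))
```

The higher the VIF, the more the standard errors of the coefficients are inflated, which makes their t tests less trustworthy.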
why do we use dummy variables
If you enter Education as a PV, SPSS will treat the nominal categories as numeric, meaning it will estimate a linear effect for this PV, which makes no sense for nominal PVs.
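What dummy coding looks like in practice, as a plain Python sketch rather than SPSS syntax. The category labels and the choice of reference category are made up.

```python
# Hypothetical nominal PV with three categories.
education = ["primary", "secondary", "tertiary", "secondary"]

reference = "primary"  # reference category gets no dummy of its own
levels = sorted(set(education) - {reference})

# One 0/1 variable per non-reference category.
dummies = [
    {f"edu_{level}": int(value == level) for level in levels}
    for value in education
]

for value, row in zip(education, dummies):
    print(value, row)
```

Each dummy then gets its own coefficient, which is the difference from the reference category instead of a forced linear trend.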
F test why used
The additional explanatory power of model 2 compared to model 1 is formally tested in the F-change test.
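The F-change statistic can be sketched from the two R2 values, assuming model 1 is nested in model 2. The function name and all numbers below are made up for illustration.

```python
def f_change(r2_model1, r2_model2, k1, k2, n):
    """F statistic for the R2 gain of model 2 (k2 PVs) over nested model 1 (k1 PVs)."""
    # Gain in explained variance per added PV.
    gain_per_pv = (r2_model2 - r2_model1) / (k2 - k1)
    # Unexplained variance of model 2 per residual degree of freedom.
    unexplained = (1 - r2_model2) / (n - k2 - 1)
    return gain_per_pv / unexplained

f = f_change(r2_model1=0.30, r2_model2=0.40, k1=2, k2=4, n=100)
print(round(f, 2))
```

The p value belonging to this F (with k2 - k1 and n - k2 - 1 degrees of freedom) tells you whether the added PVs improve the model significantly.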