Exam Example questions Flashcards
Why should care be taken before using accuracy as a predictor in multiple linear regression?
Accuracy is a proportion bounded between 0 and 1, so it is non-linear and violates the assumptions of MLR; a linear model can therefore give rise to impossible values (below 0 or above 1). If you are using it in a regression, use logistic regression; if you are using it in a test of differences, apply a transformation first, e.g. arcsine.
How, with a single predictor variable, can you perform curvilinear regression?
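With a single predictor, you can perform curvilinear regression by restricting the range of curves you fit, for example by entering powers of the predictor (such as X squared) as additional terms. Note that the curve could be offset from zero.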
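One common way to fit a curve with a single predictor, assumed here because the card does not name a method, is to add a squared term of that predictor. The sketch below uses made-up data (the names `x` and `y` and all numbers are hypothetical) whose turning point sits at x = 5 rather than at zero, which is the "offset" case.

```python
import numpy as np

# Hypothetical data: a U-shaped relationship whose minimum is offset from zero (at x = 5).
x = np.arange(0.0, 11.0)
y = 2.0 + 3.0 * (x - 5.0) ** 2

# Fit y = b2*x^2 + b1*x + b0; both x and x^2 come from the same single predictor.
b2, b1, b0 = np.polyfit(x, y, deg=2)

# Location of the turning point of the fitted curve.
vertex = -b1 / (2.0 * b2)
print(b2, b1, b0, vertex)
```

The linear term b1 is what lets the curve sit away from zero; a model with only the squared term would force the turning point to x = 0.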
What does Mauchly’s test evaluate and when is it used?
Mauchly’s test is used to test the assumption of sphericity (homogeneity of covariance).
Sphericity is an assumption of within-subjects ANOVA, similar to homogeneity of variance, but it refers to difference scores (i.e. the difference between participants’ scores in one condition compared with another). A within-subjects ANOVA assumes that the variance of these difference scores is equivalent across all pairs of conditions.
If Mauchly’s test is significant (sig. < .05), sphericity cannot be assumed, so consult a corrected row of results (e.g. Greenhouse-Geisser). If it is not significant (sig. > .05), you can assume sphericity, so consult the sphericity assumed row of results.
In a 1-way ANOVA, what is the expected value of the variance ratio F when the null hypothesis is true and why?
F = MSbtw / MSerr = (Var cond + Var error) / Var error
If the null hypothesis H0 is true, then the means of all groups or conditions are the same, meaning that Var cond = 0.
F = MSbtw / MSerr = (0 + Var error) / Var error = 1
That is, MSbtw = MSerr, so the expected value of F is (approximately) 1.
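A minimal simulation sketch of this result, assuming equal group sizes and normally distributed scores. The group count, sample size, and number of simulations are arbitrary illustrative choices, and `f_ratio` is a hypothetical helper written for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 20        # number of groups and participants per group (hypothetical)
n_sims = 5000       # number of simulated experiments

def f_ratio(groups):
    """Variance ratio F = MS_between / MS_error for equal-sized groups."""
    k_groups, n_per = groups.shape
    grand_mean = groups.mean()
    group_means = groups.mean(axis=1)
    ms_between = n_per * np.sum((group_means - grand_mean) ** 2) / (k_groups - 1)
    ms_error = np.sum((groups - group_means[:, None]) ** 2) / (k_groups * (n_per - 1))
    return ms_between / ms_error

# Every group is drawn from the same population, so H0 is true by construction.
f_values = np.array([f_ratio(rng.normal(0.0, 1.0, size=(k, n)))
                     for _ in range(n_sims)])
mean_f = f_values.mean()
print(mean_f)
```

Across many simulated experiments the average F should sit close to 1, because the between-groups mean square then estimates nothing but error variance.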
What precautions should you take when comparing differences between means in a 1-way ANOVA and why?
Make sure the assumptions are met:
interval/ratio data,
normally distributed data,
homogeneity of variance across conditions,
homogeneity of covariance (if within-subjects ANOVA).
If you select an appropriate covariate for your analysis, why will it increase the power of your analysis? Why might an inappropriate choice decrease the power of your analysis?
If you select an appropriate covariate for your analysis, it ‘removes’ unwanted variability in the dependent variable: some of the DV variability is attributed to the covariate and hence removed from the error sums of squares, which increases power.
If the covariate is unrelated to the DV, it reduces the degrees of freedom in the error term (1 df per covariate) with no change in the sums of squares, so power decreases.
If you add a second covariate that is strongly related to (‘co-linear’ with) the first covariate:
it costs a degree of freedom, but does not reduce the error sums of squares;
it reduces power compared to the single covariate.
If the covariate is related to the condition, things get worse:
the covariate reduces the between-condition sums of squares;
the covariate makes little change to the error sums of squares and costs a df.
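The effect on the error term can be sketched with ordinary least squares on simulated data. Everything here is hypothetical (the effect sizes, the variable names `good_cov` and `junk_cov`, and the helper `error_ss_and_df`); it only illustrates how the error sums of squares and error df change when a covariate is added.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
group = np.repeat([0.0, 1.0], n // 2)     # two conditions, dummy coded
good_cov = rng.normal(size=n)             # covariate related to the DV
junk_cov = rng.normal(size=n)             # covariate unrelated to the DV
dv = 0.5 * group + 1.0 * good_cov + rng.normal(size=n)

def error_ss_and_df(predictors):
    """Residual sum of squares and error df for an OLS fit with an intercept."""
    X = np.column_stack([np.ones(n)] + predictors)
    beta, *_ = np.linalg.lstsq(X, dv, rcond=None)
    resid = dv - X @ beta
    return float(resid @ resid), n - X.shape[1]

ss_none, df_none = error_ss_and_df([group])
ss_good, df_good = error_ss_and_df([group, good_cov])
ss_junk, df_junk = error_ss_and_df([group, junk_cov])

print("no covariate:       ", ss_none, df_none)
print("related covariate:  ", ss_good, df_good)
print("unrelated covariate:", ss_junk, df_junk)
```

Both covariates cost one error df, but only the related one removes a substantial part of the error sums of squares.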
What is the typical null hypothesis associated with a contingency table?
H0: rows and columns are independent.
Expected frequencies are calculated assuming independence of rows and columns.
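As a small worked example of what independence implies, the expected count in each cell is (row total x column total) / grand total. The 2x2 table below is made up for illustration.

```python
import numpy as np

# Hypothetical 2x2 contingency table of observed counts.
observed = np.array([[30.0, 10.0],
                     [20.0, 40.0]])

row_totals = observed.sum(axis=1, keepdims=True)   # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 2)
n_total = observed.sum()

# Expected counts if rows and columns are independent (H0).
expected = row_totals @ col_totals / n_total

# Pearson chi-square statistic comparing observed with expected counts.
chi_square = np.sum((observed - expected) ** 2 / expected)
print(expected)
print(chi_square)
```

The larger the chi-square statistic, the further the observed counts are from what independence would predict.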
What shape line does a logistic regression equation give and why is it a useful shape?
Logistic regression is used with a binary outcome and a continuous IV.
Using normal (linear) regression to predict a binary outcome is not adequate, because linear regression can give impossible values (p < 0 or p > 1) and the residuals would be expected to depend on the X value.
Logistic regression gives an S-shaped (sigmoid) curve, which restricts the predicted value to the range (0..1), while the input can take any value.
The logit is the logarithm of the odds; in other words, logistic regression models the (log) odds as a function of the predictors.
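A short sketch of the two functions involved, assuming one predictor. The coefficients `b0` and `b1` are hypothetical, chosen only to show that the predicted probability stays between 0 and 1 whatever the input.

```python
import math

def logistic(z):
    """Map log-odds z to a probability; the result lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Logarithm of the odds p / (1 - p); the inverse of logistic()."""
    return math.log(p / (1.0 - p))

b0, b1 = -2.0, 0.8   # hypothetical intercept and slope on the log-odds scale

for x in (-10, 0, 2.5, 10):
    p = logistic(b0 + b1 * x)
    print(x, round(p, 4))
```

The model is linear on the logit scale, and the logistic function bends that straight line into the S-shaped curve on the probability scale.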
What sorts of clusters are difficult to detect accurately using k-means clusters?
With k-means clustering, you specify k, the number of clusters, and the initial centroids influence the result. Clusters of very unequal size are difficult to detect: if there is one large group (much larger than the other groups in the data), you might end up dividing the large cluster and combining the small ones.
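The problem can be reproduced with a toy one-dimensional example. This is a bare-bones version of the standard k-means loop written for illustration, not any particular package's implementation; the data, the starting centroids, and the helper names `kmeans_1d` and `within_ss` are all hypothetical.

```python
import numpy as np

def kmeans_1d(x, init_centroids, n_iter=50):
    """Basic k-means on 1-D data: assign to nearest centroid, then update means."""
    centroids = np.asarray(init_centroids, dtype=float)
    for _ in range(n_iter):
        dist = np.abs(x[:, None] - centroids[None, :])
        labels = dist.argmin(axis=1)
        new = np.array([x[labels == j].mean() if np.any(labels == j) else centroids[j]
                        for j in range(len(centroids))])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

def within_ss(x, labels, centroids):
    """Total within-cluster sum of squares."""
    return float(np.sum((x - centroids[labels]) ** 2))

# One large group (ten points, 0..9) and two small groups (two points each).
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 30, 31], dtype=float)

# Starting centroids all inside the large group versus one per true group.
labels_bad, cent_bad = kmeans_1d(x, [0.0, 5.0, 9.0])
labels_good, cent_good = kmeans_1d(x, [5.0, 20.0, 30.0])

print(labels_bad, within_ss(x, labels_bad, cent_bad))
print(labels_good, within_ss(x, labels_good, cent_good))
```

With the first set of starting centroids the large group is split in two and the two small groups are merged; the algorithm stops there even though the second solution has a smaller within-cluster sum of squares.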
What is the usual method of post-hoc analysis for a significant effect in MANOVA?
After you get a significant MANOVA, you run separate ANOVAs on the individual DVs.
1) You should correct the alpha level of the individual ANOVAs using the Bonferroni or Sidak correction.
2) If none of the individual ANOVAs is significant, this indicates that more complex combinations of DVs are needed.
Correlated DVs may result in several significant ANOVAs that all reflect the same underlying cause.
While running ANOVAs on individual DVs makes sense statistically as a post-hoc test, you have to be able to justify why you used MANOVA in the first place; this is easy from a stats perspective but hard from a theoretical/substantive psychological perspective.
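The two alpha corrections mentioned in point 1 can be written out as follows. The function names are hypothetical, and the example assumes three follow-up ANOVAs with a familywise alpha of .05.

```python
def bonferroni_alpha(alpha, m):
    """Per-test alpha under the Bonferroni correction for m tests."""
    return alpha / m

def sidak_alpha(alpha, m):
    """Per-test alpha under the Sidak correction for m tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

m = 3  # hypothetical number of follow-up ANOVAs, one per DV
print(bonferroni_alpha(0.05, m))
print(sidak_alpha(0.05, m))
```

The Sidak threshold is always slightly less strict than the Bonferroni one, but for a small number of tests the two are nearly identical.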
In discriminant function analysis (DFA), how is group membership of each case or response set determined?
DFA uses the scores to try to predict group membership.
Each case is assigned to the group whose centroid it is closest to: DFA uses new DV scores that minimise the distance of each case from its own group centroid compared with its distance to the centroids of the other groups.
i.e. take the raw dependent variables --> do the MANOVA trick and rescore --> ask how far each datum is from the centroids.
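The final "how far is each datum from the centroids" step might look like the sketch below. It assumes the discriminant scores have already been computed (the values here are invented) and uses plain Euclidean distance for simplicity; statistical packages may instead use other distance measures or posterior probabilities, so treat this as an illustration of the idea only.

```python
import numpy as np

# Hypothetical discriminant scores: rows are cases, columns are discriminant functions.
scores = np.array([[-2.0,  0.1], [-1.8, -0.2], [-2.2,  0.0],
                   [ 2.1,  0.3], [ 1.9, -0.1], [ 2.0,  0.2]])
groups = np.array([0, 0, 0, 1, 1, 1])   # known group membership of each case

# Centroid of each group = mean discriminant score of its members.
centroids = np.array([scores[groups == g].mean(axis=0) for g in np.unique(groups)])

def classify(case):
    """Return the index of the group whose centroid is nearest to the case."""
    dist = np.linalg.norm(centroids - case, axis=1)
    return int(dist.argmin())

predicted = np.array([classify(c) for c in scores])
print(predicted)
print(classify(np.array([1.5, 0.0])))   # a new, unlabelled case
```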
Why does it make intuitive sense to use a cut of eigenvalues > 1 when determining which components to use in principal component analysis (PCA)?
A cut-off of eigenvalues > 1:
as the eigenvalues are the variability of normalised data (where each variable contributes a variance of 1), this is saying ‘use the components that explain more than their share of the variability’.
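A quick numerical check of the "share" argument, using simulated data. The data-generating choices (six variables, three of which share a common factor) are arbitrary. The eigenvalues are taken from the correlation matrix, which is what PCA on standardised data works with.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 300, 6

# Hypothetical data: three variables driven by one latent factor, three pure noise.
latent = rng.normal(size=(n, 1))
data = np.hstack([latent + 0.5 * rng.normal(size=(n, 3)),
                  rng.normal(size=(n, 3))])

# Correlation matrix = covariance matrix of the standardised variables.
corr = np.corrcoef(data, rowvar=False)

# Eigenvalues sorted from largest to smallest.
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

# Keep the components whose eigenvalue exceeds 1.
n_keep = int(np.sum(eigenvalues > 1.0))
print(eigenvalues, eigenvalues.sum(), n_keep)
```

The eigenvalues sum to the number of variables, so their average is 1; a component with an eigenvalue above 1 therefore accounts for more variability than a single original variable does.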
What are the basic steps in evaluating the significance of a mediating variable?
- Only look for mediation because of theoretical considerations; create variables that may correlate with or cause each other.
e.g., WHO suggests that there could be a relationship between physical impairment (I), activity limitation (A), and participation in normal social functioning (P). If there are 3 variables, we want to compare the mediation pathway (I --> A --> P) with the direct pathways (I --> P, A --> P).
- Check there is something to test for. Run the regression. If it is non-significant, stop.
e.g., Does I predict P? Standardised coeff = 0.378, p ... P 0.098* (0.378**)
So there is evidence that A is a mediator.
Both legs of the I-A-P pathway are significant. When the mediator is included, the I-P pathway is significantly reduced (assessed using the Sobel test).
As the I-P pathway is still significant, the mediation is partial.
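The Sobel test mentioned above can be sketched as follows. Here a is the unstandardised coefficient for the first leg (I predicting A), b is the coefficient for the second leg (A predicting P with I in the model), and se_a and se_b are their standard errors. The numbers passed in are invented for illustration and are not the values from the example above.

```python
import math

def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic for the indirect effect a*b, with a two-sided p-value."""
    z = (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical coefficients and standard errors.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(z, p)
```

A significant z indicates that the indirect pathway through the mediator carries a reliable part of the effect.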