Exam revision questions Flashcards

1
Q

What are the assumptions for t-tests, chi-squared tests, ANOVA and regression?

A

t-tests: normality (for independent samples, within each group), independence of observations and, for a Student independent samples t-test, equal variance across groups.
Chi-squared: expected frequencies of at least 5 per cell and independence of observations.
ANOVA: normally distributed residuals and equal variance across groups.
Regression: linearity, normality of residuals, no high-influence points and low collinearity.
2
Q

What is the effect size measure for a chi-squared test of independence?

A

Cramer’s V.

3
Q

What is the difference between a one-sample and a two-sample (independent or paired) t-test?

A

A one-sample t-test compares a sample mean to a fixed value, such as a known population mean.
A two-sample t-test compares two sample means.

4
Q

If the assumption of normality is violated, what can you do if you originally wanted to do a one-sample t-test?

A

You can do a Wilcoxon signed-rank test. Wilcoxon tests compare data by ranks rather than by actual values.

5
Q

What does a p-value represent?

Can we say the null hypothesis is true or false?

A

A p-value tells us the probability, if the null hypothesis were true, of observing a test statistic at least as extreme as ours.

We cannot claim that the null or the alternative hypothesis is true based on a p-value; we can only use it as evidence for deciding whether to reject the null.

6
Q

What is the standard deviation of the sampling distribution of the mean?

A

The standard error of the mean, or SEM = sample sd/square root of the sample size.
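As a minimal Python sketch of this formula (the sample data are made up for illustration):

```python
import math
import statistics

def sem(data):
    """Standard error of the mean: sample SD / sqrt(n)."""
    return statistics.stdev(data) / math.sqrt(len(data))

scores = [1, 2, 3, 4, 5]      # illustrative sample
print(round(sem(scores), 4))  # sqrt(2.5)/sqrt(5) = sqrt(0.5) ≈ 0.7071
```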

7
Q

What factors influence power?
What is power?

A

Power is the probability of rejecting the null hypothesis when the null hypothesis is actually false. It is measured by 1 - beta, where beta is the type II error rate.

Power is dependent on sample size, alpha and effect size.

8
Q

Can you say that p is the probability that the null hypothesis is true?

A

No!
The p-value is how likely you are to see data at least as extreme as yours IF the null were true.

9
Q

What are the assumptions of a chi-squared test?

A
  1. Large expected frequencies - at least 5 in each cell.
    If violated, use a Fisher Exact Test.
  2. Independence of data - no one contributed more than one piece of data.
    If violated, use a McNemar test.
10
Q

What is the common effect size used when doing chi-squared tests?

A

Cramer’s V.

The higher Cramer’s V, the stronger the evidence that the variables are associated rather than independent.
Think chi-square test for association or independence.
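A minimal Python sketch of how Cramer’s V falls out of a chi-squared statistic, V = sqrt(chi2 / (n * (min(rows, cols) - 1))), using an invented 2x2 contingency table:

```python
import math

def cramers_v(table):
    """Cramer's V from a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # independence model
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))  # smaller table dimension
    return math.sqrt(chi2 / (n * (k - 1)))

table = [[10, 20], [20, 10]]       # invented 2x2 counts
print(round(cramers_v(table), 3))  # 0.333
```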

11
Q

By definition do all z-scores have a mean of 0 and standard deviation of 1?

A

Yes.
A z-score is computed as (value - mean) / SD, so by construction a set of z-scores has a mean of 0 and a standard deviation of 1.
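A quick Python sketch showing that standardised scores always come out with mean 0 and SD 1 (the data are invented):

```python
import statistics

def z_scores(data):
    """Standardise: subtract the mean, divide by the sample SD."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return [(x - mean) / sd for x in data]

z = z_scores([10, 20, 30, 40, 50])
# By construction the z-scores have mean 0 and SD 1:
print(abs(statistics.mean(z)) < 1e-9, abs(statistics.stdev(z) - 1) < 1e-9)
```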

12
Q

How do we come up with the t-distribution? Even if we know the population mean we do not know the standard deviation…

A

The t-distribution arises because we must estimate the population SD from the sample: it effectively averages over the plausible values of the population SD. As N increases, our estimate of the population SD becomes more accurate, and the t-distribution becomes tighter and closer to normal.

13
Q

As the t-distribution is dependent on the degrees of freedom, is it true that whether a t-statistic is significant depends on the sample size?

A

Yes.

A given value for a t-stat may be significant for a sample of 20, but not for a sample of 10.

14
Q

Do the degrees of freedom used in a Welch independent-samples t-test take into account just how different the variance is within each group/sample?

A

Yes.

15
Q

What type of t-test assumes homogeneity of variance?
What type of t-test takes into account the different standard deviations of the samples?

A

Student independent samples t-test.

Welch independent samples t-test.

16
Q

What are the assumptions made for an independent samples t-test?

A
  1. The distribution of each sample is normal.
  2. Data are independent.
  3. Variances of the samples are the same if using a Student independent samples t-test. If violated, which it often is, use a Welch independent samples t-test.
17
Q

When do we use a paired-samples t-test?

A

When we are interested in the difference scores, not just whether two means ARE different.
Examples would be repeated measures designs, such as pre- and post-treatment. Also, if there is a common object in each group, such as two people giving ratings for the same set of hats, where we want to know whether their mean ratings of the hats overall differ.

18
Q

One of the assumptions of all t-tests is that the data in each group are normally distributed. How do we test this both qualitatively and quantitatively?

A

QQ plots can be used to qualitatively check this.

Shapiro-Wilk tests can be used to quantitatively assess this.

19
Q

What does a Shapiro-Wilk test with a W less than 1 and a p < .05 suggest?

A

That the data are not normally distributed.

20
Q

What is the null hypothesis for a Shapiro-Wilk test?

A

That the data are distributed normally.

21
Q

Is it true that Shapiro-Wilk tests will often be significant, i.e. imply non-normality, even if the data are normally distributed?

What can we do to check normality if we have a large sample size and the Shapiro-Wilk test is coming out significant?

A

Yes.
If the sample size is over 40-50 then the Shapiro-Wilk test is likely to be significant even for small departures from normality.

We can look at QQ plots and histograms to assess whether there is a normal distribution of data.

22
Q

For an independent-samples t-test you need to check the normality of each group. When checking the assumption of normality for a paired-samples t-test, what group/s are we testing?

A

The difference variable. This is because the test we are doing is essentially a one-sample t-test on the difference variable.

23
Q

What are non-parametric tests?

A

Statistical tests that do not make assumptions about the distribution of data.

24
Q

What are some limitations with non-parametric tests?

A

They are not as powerful, i.e. they have higher type II error rates.

25
Q

If the assumption of normality was violated, what is a non-parametric test we can use instead of a t-test?

A

A Wilcoxon test.
The test stat is W.
Essentially it measures how many times values from one group are larger than values from the other.
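This pair-counting idea can be sketched in Python as the Mann-Whitney U statistic, which is equivalent to the rank-sum W up to a constant (the two groups are invented for illustration):

```python
def u_statistic(group_a, group_b):
    """Count pairs where a value in group_a beats one in group_b (ties count 0.5)."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

print(u_statistic([5, 6, 7], [1, 2, 3]))  # 9.0: every a beats every b
```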

26
Q

The effect size used for t-tests that meet the assumption of normality is Cohen’s d. However, this effect size relies on a normal distribution of the data. What effect size can we use if we do a non-parametric test, such as a Wilcoxon?

A

Wilcoxon effect size r, which has a similar interpretation as Cohen’s d.

27
Q

What influences the size of the t-statistic?

A
  1. How different the means are (obviously).
  2. The degrees of freedom/ sample size.
  3. The variance of the two samples. Increased variance decreases t.
28
Q

What is the difference between a one-way ANOVA and a two-way ANOVA?

A

A one-way ANOVA has one grouping variable (factor); a two-way ANOVA has two grouping variables and can also include their interaction.
29
Q

In an ANOVA, does the sum of squares between measure the squared difference between each group mean and the grand mean, taking into account the sample size of each group?

A

Yes.

30
Q

Why is the total variability in an ANOVA, SSb + SSw, not our test stat?

A

Because this does not tell us whether there are multiple populations or not. We are interested in the relationship between the variability within groups and the variability between groups, which indicates whether we are observing multiple populations.

Hence, the test stat for ANOVA takes into account the ratio between SSb and SSw.

31
Q

In an ANOVA, what are the degrees of freedom for the between groups variability?

What are the degrees of freedom for the within groups variability?

A

G-1, where G is number of groups.

N-G, where N is combined sample size and G is number of groups.

32
Q

What is the test statistic for ANOVA?

A

F = MSb/MSw, where MSb = SSb/(G-1) and MSw = SSw/(N-G).
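The F ratio above as a minimal Python sketch, with invented sums of squares:

```python
def anova_f(ss_between, ss_within, n_groups, n_total):
    """F = MSb / MSw, with df_between = G-1 and df_within = N-G."""
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    return ms_between / ms_within

# Invented values: SSb = 20 across 3 groups, SSw = 60 with N = 33.
print(anova_f(20, 60, 3, 33))  # MSb = 10, MSw = 2, so F = 5.0
```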

33
Q

What does a larger F indicate in an ANOVA?

A

That the means of the different groups are likely to be significantly different.

F is larger when the between groups variation is larger than the within groups variation.

34
Q

What would we expect the F stat in an ANOVA to look like if the null hypothesis was true?

A

Small - around 1. If the null were true, the means of the different groups would be similar, so the between-groups sum of squares would be low and MSb would be about the same size as MSw.

35
Q

What is the effect size for ANOVA?

A

Eta-squared.
36
Q

What does an eta-squared value of one indicate?

What does an eta-squared value of zero indicate?

A

An eta-squared of one means the between-groups variance accounts for all of the total variance, i.e. they are the same; knowing which group an observation is in is all you need to know its value.

An eta-squared of zero means the total variance arises entirely from within-group variance; it is unlikely we are looking at different populations, and knowing which group an observation is in tells us nothing about its value.

37
Q

Do eta-squared values range from zero to one?

A

Yes.

38
Q

What does eta-squared tell us?

A

The proportion of the total variance explained by the grouping variable.

e.g. an eta-squared of 0.5 suggests that 50% of the variance in the dependent or outcome variable is explained by the predictor variable or grouping variable.
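As a sketch in Python, eta-squared is SSb divided by the total sum of squares (the numbers are invented):

```python
def eta_squared(ss_between, ss_within):
    """Proportion of total variability explained by the grouping variable."""
    return ss_between / (ss_between + ss_within)

print(eta_squared(20, 60))  # 0.25: the grouping variable explains 25%
```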

39
Q

What is a family-wise type I error rate?

What do we want this to be?

A

The type I error rate associated with multiple tests, such as multiple t-tests done alongside an ANOVA. In other words, it is the probability of obtaining at least one type I error across multiple tests.

We want the family-wise type I error rate to be 5%.

40
Q

What are some ways we can adjust the p-values of individual tests when we are doing multiple tests that contribute to the same analysis?

A

Bonferroni correction.

Holm correction.

41
Q

What are some limitations with the Bonferroni correction?

A

The Bonferroni correction is done by multiplying each p-value by the number of tests.
This is a very conservative approach and leads to a large loss of power, i.e. a high type II error rate, potentially discarding interesting and important information.

42
Q

What is the Holm correction and why is it a preferable correction to use when doing multiple tests than Bonferroni correction?

A

The Holm correction has the same type I error rate, but lower type II error rate.

It works by multiplying the lowest p-value by the number of tests, then the next lowest p-value by (the number of tests - 1), and so on until it reaches a p-value it cannot reject.
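The step-down procedure can be sketched in pure Python as adjusted p-values (the p-values below are invented):

```python
def holm_adjust(pvals):
    """Holm step-down adjusted p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        adj = min(1.0, (m - rank) * pvals[idx])  # multiplier shrinks: m, m-1, ...
        running_max = max(running_max, adj)      # keep adjusted p-values monotone
        adjusted[idx] = running_max
    return adjusted

print(holm_adjust([0.01, 0.04, 0.03]))  # roughly [0.03, 0.06, 0.06]
```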

43
Q

When reporting post-hoc t-tests for ANOVA, is it enough to just mention the p-values and not the t-stat or df?

A

Yes.
But you also need to report which correction was applied.

44
Q

What are the assumptions for an ANOVA?

A
  1. That the residuals are normally distributed.
  2. There is equal variance across groups.
45
Q

One of the assumptions for an ANOVA is that the residuals are normally distributed.
What are the residuals?

A

The residuals are the within-groups variation. In other words, they are the differences between the individual data points and their group mean.
These need to be normally distributed. It does not matter whether the variable itself is normally distributed or not.

46
Q

If we are wanting to do an ANOVA and the residuals are not normally distributed, what is another test we can do?

A

Kruskal-Wallis test. This does an ANOVA on ranked data as opposed to the actual data, similar to a Wilcoxon test.

47
Q

What is the effect size for a Kruskal-Wallis test?

A

It is just a Kruskal-Wallis effect size, interpreted like eta-squared: a value of 0.23 says that the grouping variable accounts for 23% of the variance in the outcome variable.

48
Q

One of the assumptions for doing an ANOVA, is that there is equal variance across groups.
How do we check this?

A

By using Levene’s test.
This test checks whether the variances of each group are equal.
It yields an F-stat and a p-value. If the p-value is < .05 then there is not equal variance across groups.
We can then use a Welch one-way ANOVA.

49
Q

Is it true that two-way ANOVAs can only be done on balanced designs that have the same number of data points in each cell?

A

Yes.

50
Q

Why do two-way ANOVAs generate different results than a one-way ANOVA?

A

The residuals are different. A two-way ANOVA takes into account the variation associated with two grouping variables.

51
Q

Can you compare multiple means even if there are two grouping variables?

A

Yes.
You use a two-way ANOVA.

52
Q

When would we use a two-way ANOVA?

A

When we wanted to see how the mean of a quantitative dependent variable varied according to the levels of two categorical variables.
e.g. if you wanted to see how the amount of food harvested varied depending on both the type of land the food was grown on and the type of food being grown.

53
Q

When we do a two-way ANOVA do we get multiple F stats?

A

Yes.
We get an F stat for each factor or grouping variable and for an interaction term, if we include this.
This tells us whether a factor has a significant effect on the dependent or outcome variable when we take into account the other factor’s influence on the outcome variable.
We can have two significant main effects (one from each grouping variable), or one, or none.

54
Q

How are the F-stats calculated for two-way ANOVA?

A

Each factor or grouping variable has an F stat calculated for it:
F = MSb/MSr,
where MSb = (sum of squares between for the factor)/(G-1)
and MSr = (sum of squares of residuals)/(N-R-C+1), with R and C the numbers of levels of the two factors.
The residuals are the variance that is not accounted for once both factors have been taken into account.
In other words, the residuals reflect how much variation there is in the outcome variable after removing the variation associated with the two factors.

55
Q

In a two-way ANOVA what is meant by an interaction effect?

A

An interaction term tells us whether the effect of one grouping variable is dependent on the other grouping variable.

56
Q

When we include an interaction term in a two-way ANOVA, how are the residuals different to the residuals in a two-way ANOVA with no interaction term?

A

The residuals when we include an interaction term will tend to be smaller because they are reflective of the variance that is not accounted for by the two factors AND the interaction of these factors.

57
Q

What does the partial eta squared tell us in a two-way ANOVA?

A

It tells us the effect of one factor if we assume all other factors are zero. Not very useful.
If you wanted to know this, you would do a one-way ANOVA.

58
Q

What do we use if we want to measure a correlation between two continuous numeric variables?

A

Pearson’s correlation if the relationship is linear.

Spearman’s correlation if the relationship is monotonic but non-linear.

59
Q

How is a Spearman’s correlation calculated?

A

It converts all data to ranks and then does a correlation on the ranked data.

60
Q

How is a regression line fitted?

A

On the basis of the least squares principle: the summed squared deviations between the predicted Y values and the actual Y values are made as small as possible.
A regression line aims to minimise residual sums of squares (analogous to within groups sums of squares in ANOVA).
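A minimal Python sketch of fitting a line by least squares, using points deliberately placed on y = 2x + 1 so the fit recovers them exactly:

```python
def least_squares_line(xs, ys):
    """Slope and intercept minimising the residual sum of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Points lying exactly on y = 2x + 1:
print(least_squares_line([1, 2, 3], [3, 5, 7]))  # (2.0, 1.0)
```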

61
Q

When we have one numeric outcome variable and two predictor variables, what kind of test do we do and how is the relationship modelled?

A

Multiple regression.
The model is a plane of best fit.

62
Q

When we have one numeric continuous outcome and two numeric continuous predictor variables AND an interaction between those predictor variables how is the relationship modelled and how do we interpret the interaction?

A

A multiple regression is done taking into account the interaction term.
The model is a curved plane of best fit.
Interactions tell us that we cannot understand the relationship between one predictor variable and the outcome variable unless we know the value of the other predictor variable.

63
Q

When we have interaction term in a multiple regression with two predictor variables, what governs the relationship?

A

The sign of the interaction term.
If the sign of the interaction term is positive, then when both predictors are negative or positive (i.e. they have the same sign) then the outcome variable will be high.
When the two predictor variables have opposite signs then the outcome variable is low.

64
Q

If a multiple regression model has a negative interaction term, then what does it mean for the outcome variable when the two predictors have the same sign?

A

The outcome variable is low.
It will be high if the two variables have opposite signs.

65
Q

What is the test statistic for analysing the significance of a regression model?

A

F stat.
Analysis is analogous to ANOVA.
The two values that are used for a regression F stat are:
1. the model sum of squares, or SSm
2. the residual sum of squares, SSr

66
Q

What is the regression equivalent to between groups variance in an ANOVA?

A

Model sum of squares.
The model sum of squares looks at how different the regression line/plane predictions are compared to the mean of the outcome variable.
The degrees of freedom are the number of predictors.

The steeper the slope the stronger the relationship between predictors and outcome variable and the larger the model sum of squares will be.

67
Q

What is the regression equivalent to the within groups variance/sum of squares in ANOVA?

A

The residual sum of squares, which is the difference between the data and the regression line/plane/curve predictions.
The larger the summed difference between the actual data and the predictions of the regression model the larger SSr will be and the less likely F will be significant.

68
Q

In multiple regression how can we tell which predictor variable has more of an influence on the outcome variable when the predictor variables are measured on different scales?

A

We convert data to z-scores and then run the analysis. The coefficients we get are called standardised coefficients. They allow us to compare the impact of each predictor on the outcome variable to each other.

69
Q

How do we determine which predictor variables have a significant influence or relationship with the outcome variable?

A

We do a t-test on the slope for each predictor against the null hypothesis, which states that the slope is zero.

70
Q

What are the assumptions made with a linear regression (multiple or single predictor)?

A
  1. Linearity.
  2. Normality of residuals.
  3. No high-influence points.
  4. Low collinearity.
71
Q

How do we determine whether there is a linear relationship between our model and the outcome or our predictor variables and our outcome?

A

We can plot the residuals against the values predicted by the model. If there is an even spread of residuals across the predicted values then we can assume linearity.
We can look at the model as a whole, or at the residuals against the values of each predictor.
A Tukey test will have a p-value that indicates whether the model is linear or not. p <.05 indicates that the model is not linear.

72
Q

What is the difference between an outlier, a high leverage point, and a high influence point?

A

An outlier has a large residual, but does not influence the model much.

A high leverage point is quite far away from the rest of the data and does not have a large residual, but it does influence or leverage the predicted model somewhat.

A high influence point is one that has a large residual and high leverage. Including a high influence point significantly alters the regression model.

73
Q

How are high-influence points identified?

A

Using Cook’s distance.
Cook’s distance takes into account both a given data point’s residual and its leverage on the model (measured by hat values).
A common cutoff for concern is 2k/N, where k is the number of coefficients - don’t forget the intercept! - and N is the sample size.

74
Q

What is collinearity?

A

Collinearity refers to whether predictor variables are correlated with each other.
The more correlated they are the more uncertainty there is around the coefficients in a regression model.

75
Q

How do we measure how much the collinearity between predictor variables is influencing the confidence intervals for the coefficients of the model?

A

Using a Variance Inflation Factor (VIF).
The square root of the VIF tells us roughly how much bigger the confidence interval becomes when we add this predictor, given its collinearity with the other predictor variables.
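Assuming the standard definition VIF = 1/(1 - R²), where R² comes from regressing one predictor on the remaining predictors, a minimal Python sketch:

```python
import math

def vif(r_squared):
    """Variance Inflation Factor from the R² of regressing one predictor on the others."""
    return 1.0 / (1.0 - r_squared)

v = vif(0.75)             # this predictor is fairly collinear with the others
print(v, math.sqrt(v))    # VIF = 4.0, so the CI is roughly 2x wider
```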

76
Q

What values of VIF indicate that the inclusion of a predictor is jeopardising our certainty in a model too much due to collinearity?

A

VIF values greater than 2 or 3.

77
Q

When trying to figure out which model to choose that best describes a relationship between predictors and an outcome, should you always choose the model that explains the most variance?

A

No.
This is because, in general, the model that explains the most variance would be the model that has the most predictors.

78
Q

How can we measure and penalise for model complexity of multiple regression models?

A

AIC and BIC.
Lower values indicate better model choice.

79
Q

What is a systematic review?

A

A research paper that predefines the exclusion/inclusion criteria for papers, search strategies, and how information from the papers will be coded.
These are a response to narrative reviews of research papers that are too vague and subjective with their inclusion or exclusion of certain research papers.
There is still subjectivity to these reviews.

80
Q

Are systematic reviews done before a meta-analysis?

A

Yes.

81
Q

Is there a lot of critique on meta-analyses?

A

Yes.
This is because they are a statistical evaluation based on the statistical evaluations done by lots of other research papers.

82
Q

What is one of the reasons that meta-analyses generate more significant results and tighter confidence intervals than individual studies?

A

They effectively have much larger sample sizes, because they pool the sample sizes of multiple studies. Larger sample sizes increase the likelihood of generating significant results.

83
Q

What are some reasons ‘vote counting’ is bad?

A
  1. Vote counting takes non-significant results as evidence for no effect. This is not what non-significant results show.
84
Q

Can meta-analyses be good?

A

Yes.
They allow for research to be collated and interesting, even life-saving, conclusions to be arrived at, e.g. streptokinase as a treatment for heart attack example.

85
Q

What is the file-drawer problem or publication bias?

A

There are many studies that do not get published because they do not show significant results. This means that when significant results are published they will be received with more weight than they may represent, as there has been prior evidence that would diminish this confidence. This is especially pertinent in meta-analyses, as these analyses generally can only work with published works and so cannot take other statistical analysis and research findings into account when doing a meta-analysis.

86
Q

In general, can a test be valid and not reliable?

A

No.
But not all tests that are reliable are valid.
A test is valid if it is reliable AND measures the construct it purports to measure.
Reliability refers to a test generating the same result when re-administered; that is, it consistently generates the same results.

87
Q

What is Classical Test Theory?

A

Proposed by Spearman in the early 20th century.
A theory of reliability: every observed score is the sum of a true score and an error.

88
Q

What are the four assumptions of Classical Test Theory?

A

All observed values, say for psych assessment, are made up of the true score and an error (endogenous and/or exogenous).

Classical Test Theory assumes the following:
1. The expected error is zero.
2. Errors do not correlate with each other.
3. Errors do not correlate with true scores.
4. Expected value of the test is equal to the true score.

89
Q

What is a simple calculation for reliability?

A

Signal/(Signal + Noise)

We want the signal to be strong enough to dominate, or at least significantly reduce the influence of, the noise or error.

90
Q

Is it true that we cannot calculate the true reliability of a test?

A

Yes.
This is because true reliability relies on population data that we do not have. We therefore have to estimate the reliability.

91
Q

What is the reliability value if we have perfect reliability?

A

1.

92
Q

What are some of the ways we can estimate reliability?

A
  1. Test-retest reliability.
  2. Alternate forms reliability.
  3. Split-half reliability
  4. Cronbach’s alpha.
93
Q

What is one of the most common measures used to estimate reliability and what is one of the main criticisms for it?

A

Cronbach’s alpha.
Generates a very conservative estimate for reliability.

94
Q

How do we calculate the confidence interval around our predicted true score?

A

We calculate the standard error of estimation and then multiply it by 1.96. The predicted true score plus or minus this value is the 95% confidence interval for our predicted true score.
Essentially this says that we are 95% confident that the client’s true score falls within this range.

95
Q

What are some of the ways to increase correlation between test scores and constructs?

A
  1. Increase the relationship between the psychological construct and the test.
  2. Remove sources of inconsistency in test administration and interpretation.
  3. Increase the number of items on the test.
96
Q

What does the Spearman-Brown Prophecy formula tell us?

A

It tells us how much the reliability of our test will change if we change the number of questions on our test.
Reliability will increase if we increase the number of questions on our test and decrease if we decrease the number of questions on our test.
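A minimal Python sketch of the Spearman-Brown prophecy formula, new reliability = nr / (1 + (n - 1)r), where n is the factor by which the test length changes (starting reliability invented):

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

print(round(spearman_brown(0.6, 2), 4))    # doubling the test: 0.75
print(round(spearman_brown(0.6, 0.5), 4))  # halving it: 0.4286
```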

97
Q

What is one of the most influential ways validity is defined?

A

“the degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of the test.”

98
Q

What are the three types of validity we discussed in lecture?

A
  1. Criterion validity - e.g. does the test correlate with other gold-standard measures of the construct?
  2. Content validity - does the test cover the whole domain of the construct?
  3. Construct validity - does the test actually assess the construct of interest? Can be evaluated by looking at the correlations between the test and other related constructs, such as a new anxiety test and a well-established test for stress.
99
Q

Which type of validity is most important, according to current view?

A

Construct validity.

100
Q

What is test sensitivity?

A

A test’s ability to correctly identify positive cases, that is to correctly identify those with a certain disease.

True positives/ (true positives + false negatives)

101
Q

What is test specificity?

A

The ability of a test to correctly identify negative cases.

True negatives/(true negatives + false positives)
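Both formulas as a minimal Python sketch, with invented screening counts:

```python
def sensitivity(tp, fn):
    """Proportion of actual positives correctly identified."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of actual negatives correctly identified."""
    return tn / (tn + fp)

# Invented results: 80 true positives, 20 false negatives,
# 90 true negatives, 10 false positives.
print(sensitivity(80, 20))  # 0.8
print(specificity(90, 10))  # 0.9
```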

102
Q

Is monotrait-monomethod correlation the same as reliability?

A

Yes.

103
Q

Is NHST just one approach to statistical analysis done by frequentists?

A

Yes.

104
Q

Can frequentist say anything about a single event, such as a meteor hitting the earth and causing the extinction of the dinosaurs?

A

No. But Bayesians can.

105
Q

What’s a Bayes factor?

A

The Bayes factor tells us the relative probability of seeing the data given our two hypotheses.
Probability of seeing data given alternative hypothesis is true/Probability of observing data given null hypothesis is true.

106
Q

Four types of validity discussed in seminar two?

A
  1. Construct Validity.
  2. External validity.
  3. Internal validity.
  4. Statistical conclusion validity.