Exam revision questions Flashcards

1
Q

What are the assumptions for t-tests, chi-squared tests, ANOVA and regression?

A

t-tests: normality (for independent samples, within each group), independence of observations and, for a Student independent samples t-test, equal variance across groups.
Chi-squared: expected frequencies of at least 5 per cell and independence of observations.
ANOVA: normally distributed residuals and equal variance across groups.
Regression: linearity, normality of residuals, no high-influence points and low collinearity.
2
Q

What is the effect size measure for a chi-squared test of independence?

A

Cramer’s V.

3
Q

What is the difference between a one-sample and a two-sample (independent or paired) t-test?

A

A one-sample t-test compares a sample mean to a fixed value, such as a known population mean.
A two-sample t-test compares two sample means.

4
Q

If the assumption of normality is violated, what can you do if you originally wanted to do a one-sample t-test?

A

You can do a Wilcoxon signed-rank test. Wilcoxon tests compare data by ranks rather than by actual values.

5
Q

What does a p-value represent?

Can we say the null hypothesis is true or false?

A

A p-value tells us the probability, if the null hypothesis were true, of observing a test statistic at least as extreme as ours.

We cannot claim that the null or the alternative hypothesis is true based on a p-value; we can only use it as evidence for deciding whether to reject the null.

6
Q

What is the standard deviation of the sampling distribution of the mean?

A

The standard error of the mean, or SEM = sample sd/square root of the sample size.
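As a minimal Python sketch of this formula (the sample data are made up for illustration):

```python
import math
import statistics

def sem(data):
    """Standard error of the mean: sample SD / sqrt(n)."""
    return statistics.stdev(data) / math.sqrt(len(data))

scores = [1, 2, 3, 4, 5]      # illustrative sample
print(round(sem(scores), 4))  # sqrt(2.5)/sqrt(5) = sqrt(0.5) ≈ 0.7071
```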

7
Q

What factors influence power?
What is power?

A

Power is the probability of rejecting the null hypothesis when the null hypothesis is actually false. It is measured by 1 - beta, where beta is the type II error rate.

Power is dependent on sample size, alpha and effect size.

8
Q

Can you say that p is the probability that the null hypothesis is true?

A

No!
The p-value is how likely you are to see data at least as extreme as yours IF the null were true.

9
Q

What are the assumptions of a chi-squared test?

A
  1. Large expected frequencies - at least 5 in each cell.
    If violated, use a Fisher Exact Test.
  2. Independence of data - no one contributed more than one piece of data.
    If violated, use a McNemar test.
10
Q

What is the common effect size used when doing chi-squared tests?

A

Cramer’s V.

The higher Cramer’s V, the stronger the evidence that the variables are associated rather than independent.
Think chi-square test for association or independence.
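A minimal Python sketch of how Cramer’s V falls out of a chi-squared statistic, V = sqrt(chi2 / (n * (min(rows, cols) - 1))), using an invented 2x2 contingency table:

```python
import math

def cramers_v(table):
    """Cramer's V from a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # independence model
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))  # smaller table dimension
    return math.sqrt(chi2 / (n * (k - 1)))

table = [[10, 20], [20, 10]]       # invented 2x2 counts
print(round(cramers_v(table), 3))  # 0.333
```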

11
Q

By definition do all z-scores have a mean of 0 and standard deviation of 1?

A

Yes.
A z-score is computed as (value - mean) / SD, so by construction a set of z-scores has a mean of 0 and a standard deviation of 1.
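A quick Python sketch showing that standardised scores always come out with mean 0 and SD 1 (the data are invented):

```python
import statistics

def z_scores(data):
    """Standardise: subtract the mean, divide by the sample SD."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return [(x - mean) / sd for x in data]

z = z_scores([10, 20, 30, 40, 50])
# By construction the z-scores have mean 0 and SD 1:
print(abs(statistics.mean(z)) < 1e-9, abs(statistics.stdev(z) - 1) < 1e-9)
```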

12
Q

How do we come up with the t-distribution? Even if we know the population mean we do not know the standard deviation…

A

The t-distribution arises because we must estimate the population SD from the sample: it effectively averages over the plausible values of the population SD. As N increases, our estimate of the population SD becomes more accurate, and the t-distribution becomes tighter and closer to normal.

13
Q

As the t-distribution is dependent on the degrees of freedom, is it true that whether a t-statistic is significant depends on the sample size?

A

Yes.

A given value for a t-stat may be significant for a sample of 20, but not for a sample of 10.

14
Q

Do the degrees of freedom used in a Welch independent-samples t-test take into account just how different the variance is within each group/sample?

A

Yes.

15
Q

What type of t-test assumes homogeneity of variance?
What type of t-test takes into account the different standard deviations of the samples?

A

Student independent samples t-test.

Welch independent samples t-test.

16
Q

What are the assumptions made for an independent samples t-test?

A
  1. The distribution of each sample is normal.
  2. Data are independent.
  3. Variances of the samples are the same if using a Student independent samples t-test. If violated, which it often is, use a Welch independent samples t-test.
17
Q

When do we use a paired-samples t-test?

A

When we are interested in the difference scores, not just whether two means ARE different.
Examples would be repeated measures designs, such as pre- and post-treatment. Also, if there is a common object in each group, such as two people giving ratings for the same set of hats, where we want to know whether their mean ratings of the hats overall differ.

18
Q

One of the assumptions of all t-tests is that the data in each group are normally distributed. How do we test this both qualitatively and quantitatively?

A

QQ plots can be used to qualitatively check this.

Shapiro-Wilk tests can be used to quantitatively assess this.

19
Q

What does a Shapiro-Wilk test with a W less than 1 and a p < .05 suggest?

A

That the data are not normally distributed.

20
Q

What is the null hypothesis for a Shapiro-Wilk test?

A

That the data are distributed normally.

21
Q

Is it true that Shapiro-Wilk tests will often be significant, i.e. imply non-normality, even if the data are normally distributed?

What can we do to check normality if we have a large sample size and the Shapiro-Wilk test is coming out significant?

A

Yes.
If the sample size is over 40-50 then the Shapiro-Wilk test is likely to be significant even for small departures from normality.

We can look at QQ plots and histograms to assess whether there is a normal distribution of data.

22
Q

For an independent-samples t-test you need to check the normality of each group. When checking the assumption of normality for a paired-samples t-test, what group/s are we testing?

A

The difference variable. This is because the test we are doing is essentially a one-sample t-test on the difference variable.

23
Q

What are non-parametric tests?

A

Statistical tests that do not make assumptions about the distribution of data.

24
Q

What are some limitations with non-parametric tests?

A

They are not as powerful, i.e. they have higher type II error rates.

25
Q

If the assumption of normality was violated, what is a non-parametric test we can use instead of a t-test?

A

A Wilcoxon test.
The test stat is W.
Essentially it measures how many times values from one group are larger than values from the other.
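This pair-counting idea can be sketched in Python as the Mann-Whitney U statistic, which is equivalent to the rank-sum W up to a constant (the two groups are invented for illustration):

```python
def u_statistic(group_a, group_b):
    """Count pairs where a value in group_a beats one in group_b (ties count 0.5)."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

print(u_statistic([5, 6, 7], [1, 2, 3]))  # 9.0: every a beats every b
```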

26
Q

The effect size used for t-tests that meet the assumption of normality is Cohen’s d. However, this effect size relies on a normal distribution of the data. What effect size can we use if we do a non-parametric test, such as a Wilcoxon?

A

Wilcoxon effect size r, which has a similar interpretation as Cohen’s d.

27
Q

What influences the size of the t-statistic?

A
  1. How different the means are (obviously).
  2. The degrees of freedom/ sample size.
  3. The variance of the two samples. Increased variance decreases t.
28
Q

What is the difference between a one-way ANOVA and a two-way ANOVA?

A

A one-way ANOVA has one grouping variable (factor); a two-way ANOVA has two grouping variables and can also include their interaction.
29
Q

In an ANOVA, does the sum of squares between measure the squared difference between each group mean and the grand mean, taking into account the sample size of each group?

A

Yes.

30
Q

Why is the total variability in an ANOVA, SSb + SSw, not our test stat?

A

Because this does not tell us whether there are multiple populations or not. We are interested in the relationship between the variability within groups and the variability between groups, which indicates whether we are observing multiple populations.

Hence, the test stat for ANOVA takes into account the ratio between SSb and SSw.

31
Q

In an ANOVA, what are the degrees of freedom for the between groups variability?

What are the degrees of freedom for the within groups variability?

A

G-1, where G is number of groups.

N-G, where N is combined sample size and G is number of groups.

32
Q

What is the test statistic for ANOVA?

A

F = MSb/MSw, where MSb = SSb/(G-1) and MSw = SSw/(N-G).
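The F ratio above as a minimal Python sketch, with invented sums of squares:

```python
def anova_f(ss_between, ss_within, n_groups, n_total):
    """F = MSb / MSw, with df_between = G-1 and df_within = N-G."""
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    return ms_between / ms_within

# Invented values: SSb = 20 across 3 groups, SSw = 60 with N = 33.
print(anova_f(20, 60, 3, 33))  # MSb = 10, MSw = 2, so F = 5.0
```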

33
Q

What does a larger F indicate in an ANOVA?

A

That the means of the different groups are likely to be significantly different.

F is larger when the between groups variation is larger than the within groups variation.

34
Q

What would we expect the F stat in an ANOVA to look like if the null hypothesis was true?

A

Small - around 1. If the null were true, the means of the different groups would be similar, so the between-groups sum of squares would be low and MSb would be about the same size as MSw.

35
Q

What is the effect size for ANOVA?

A

Eta-squared.
36
Q

What does an eta-squared value of one indicate?

What does an eta-squared value of zero indicate?

A

An eta-squared of one means the between-groups variance accounts for all of the total variance, i.e. they are the same; knowing which group an observation is in is all you need to know its value.

An eta-squared of zero means the total variance arises entirely from within-group variance; it is unlikely we are looking at different populations, and knowing which group an observation is in tells us nothing about its value.

37
Q

Do eta-squared values range from zero to one?

A

Yes.

38
Q

What does eta-squared tell us?

A

The proportion of the total variance explained by the grouping variable.

e.g. an eta-squared of 0.5 suggests that 50% of the variance in the dependent or outcome variable is explained by the predictor variable or grouping variable.
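As a sketch in Python, eta-squared is SSb divided by the total sum of squares (the numbers are invented):

```python
def eta_squared(ss_between, ss_within):
    """Proportion of total variability explained by the grouping variable."""
    return ss_between / (ss_between + ss_within)

print(eta_squared(20, 60))  # 0.25: the grouping variable explains 25%
```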

39
Q

What is a family-wise type I error rate?

What do we want this to be?

A

The type I error rate associated with multiple tests, such as multiple t-tests done alongside an ANOVA. In other words, it is the probability of obtaining at least one type I error across multiple tests.

We want the family-wise type I error rate to be 5%.

40
Q

What are some ways we can adjust the p-values of individual tests when we are doing multiple tests that contribute to the same analysis?

A

Bonferroni correction.

Holm correction.

41
Q

What are some limitations with the Bonferroni correction?

A

The Bonferroni correction is done by multiplying each p-value by the number of tests.
This is a very conservative approach and leads to a large loss of power, i.e. a high type II error rate, potentially discarding interesting and important information.

42
Q

What is the Holm correction and why is it a preferable correction to use when doing multiple tests than Bonferroni correction?

A

The Holm correction has the same type I error rate, but lower type II error rate.

It works by multiplying the lowest p-value by the number of tests, then the next lowest p-value by (the number of tests - 1), and so on until it reaches a p-value it cannot reject.
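The step-down procedure can be sketched in pure Python as adjusted p-values (the p-values below are invented):

```python
def holm_adjust(pvals):
    """Holm step-down adjusted p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        adj = min(1.0, (m - rank) * pvals[idx])  # multiplier shrinks: m, m-1, ...
        running_max = max(running_max, adj)      # keep adjusted p-values monotone
        adjusted[idx] = running_max
    return adjusted

print(holm_adjust([0.01, 0.04, 0.03]))  # roughly [0.03, 0.06, 0.06]
```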

43
Q

When reporting post-hoc t-tests for ANOVA, is it enough to just mention the p-values and not the t-stat or df?

A

Yes.
But you also need to report which correction was applied.

44
Q

What are the assumptions for an ANOVA?

A
  1. That the residuals are normally distributed.
  2. There is equal variance across groups.
45
Q

One of the assumptions for an ANOVA is that the residuals are normally distributed.
What are the residuals?

A

The residuals are the within-groups variation. In other words, they are the differences between the individual data points and their group mean.
These need to be normally distributed. It does not matter whether the variable itself is normally distributed or not.

46
Q

If we are wanting to do an ANOVA and the residuals are not normally distributed, what is another test we can do?

A

Kruskal-Wallis test. This does an ANOVA on ranked data as opposed to the actual data, similar to a Wilcoxon test.

47
Q

What is the effect size for a Kruskal-Wallis test?

A

It is just a Kruskal-Wallis effect size, interpreted like eta-squared: a value of 0.23 says that the grouping variable accounts for 23% of the variance in the outcome variable.

48
Q

One of the assumptions for doing an ANOVA, is that there is equal variance across groups.
How do we check this?

A

By using Levene’s test.
This test checks whether the variances of each group are equal.
It yields an F-stat and a p-value. If the p-value is < .05 then there is not equal variance across groups.
We can then use a Welch one-way ANOVA.

49
Q

Is it true that two-way ANOVAs can only be done on balanced designs that have the same number of data points in each cell?

A

Yes.

50
Q

Why do two-way ANOVAs generate different results than a one-way ANOVA?

A

The residuals are different. A two-way ANOVA takes into account the variation associated with two grouping variables.

51
Q

Can you compare multiple means even if there are two grouping variables?

A

Yes.
You use a two-way ANOVA.

52
Q

When would we use a two-way ANOVA?

A

When we wanted to see how the mean of a quantitative dependent variable varied according to the levels of two categorical variables.
e.g. if you wanted to see how the amount of food harvested varied depending on both the type of land the food was grown on and the type of food being grown.

53
Q

When we do a two-way ANOVA do we get multiple F stats?

A

Yes.
We get an F stat for each factor or grouping variable and for an interaction term, if we include this.
This tells us whether a factor has a significant effect on the dependent or outcome variable when we take into account the other factor’s influence on the outcome variable.
We can have two significant main effects (one from each grouping variable), or one, or none.

54
Q

How are the F-stats calculated for two-way ANOVA?

A

Each factor or grouping variable has an F stat calculated for it:
F = MSb/MSr,
where MSb = (sum of squares between for the factor)/(G-1)
and MSr = (sum of squares of residuals)/(N-R-C+1), with R and C the numbers of levels of the two factors.
The residuals are the variance that is not accounted for once both factors have been taken into account.
In other words, the residuals reflect how much variation there is in the outcome variable after removing the variation associated with the two factors.

55
Q

In a two-way ANOVA what is meant by an interaction effect?

A

An interaction term tells us whether the effect of one grouping variable is dependent on the other grouping variable.

56
Q

When we include an interaction term in a two-way ANOVA, how are the residuals different to the residuals in a two-way ANOVA with no interaction term?

A

The residuals when we include an interaction term will tend to be smaller because they are reflective of the variance that is not accounted for by the two factors AND the interaction of these factors.

57
Q

What does the partial eta squared tell us in a two-way ANOVA?

A

It tells us the effect of one factor if we assume all other factors are zero. Not very useful.
If you wanted to know this, you would do a one-way ANOVA.

58
Q

What do we use if we want to measure a correlation between two continuous numeric variables?

A

Pearson’s correlation if the relationship is linear.

Spearman’s correlation if the relationship is monotonic but non-linear.

59
Q

How is a Spearman’s correlation calculated?

A

It converts all data to ranks and then does a correlation on the ranked data.

60
Q

How is a regression line fitted?

A

On the basis of the least squares principle: the summed squared deviations between the predicted Y values and the actual Y values are made as small as possible.
A regression line aims to minimise residual sums of squares (analogous to within groups sums of squares in ANOVA).
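A minimal Python sketch of fitting a line by least squares, using points deliberately placed on y = 2x + 1 so the fit recovers them exactly:

```python
def least_squares_line(xs, ys):
    """Slope and intercept minimising the residual sum of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Points lying exactly on y = 2x + 1:
print(least_squares_line([1, 2, 3], [3, 5, 7]))  # (2.0, 1.0)
```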

61
Q

When we have one numeric outcome variable and two predictor variables, what kind of test do we do and how is the relationship modelled?

A

Multiple regression.
The model is a plane of best fit.

62
Q

When we have one numeric continuous outcome and two numeric continuous predictor variables AND an interaction between those predictor variables how is the relationship modelled and how do we interpret the interaction?

A

A multiple regression is done taking into account the interaction term.
The model is a curved plane of best fit.
Interactions tell us that we cannot understand the relationship between one predictor variable and the outcome variable unless we know the value of the other predictor variable.

63
Q

When we have interaction term in a multiple regression with two predictor variables, what governs the relationship?

A

The sign of the interaction term.
If the sign of the interaction term is positive, then when both predictors are negative or positive (i.e. they have the same sign) then the outcome variable will be high.
When the two predictor variables have opposite signs then the outcome variable is low.

64
Q

If a multiple regression model has a negative interaction term, then what does it mean for the outcome variable when the two predictors have the same sign?

A

The outcome variable is low.
It will be high if the two variables have opposite signs.

65
Q

What is the test statistic for analysing the significance of a regression model?

A

F stat.
Analysis is analogous to ANOVA.
The two values that are used for a regression F stat are:
1. the model sum of squares, or SSm
2. the residual sum of squares, SSr

66
Q

What is the regression equivalent to between groups variance in an ANOVA?

A

Model sum of squares.
The model sum of squares looks at how different the regression line/plane predictions are compared to the mean of the outcome variable.
The degrees of freedom are the number of predictors.

The steeper the slope the stronger the relationship between predictors and outcome variable and the larger the model sum of squares will be.

67
Q

What is the regression equivalent to the within groups variance/sum of squares in ANOVA?

A

The residual sum of squares, which is the difference between the data and the regression line/plane/curve predictions.
The larger the summed difference between the actual data and the predictions of the regression model the larger SSr will be and the less likely F will be significant.

68
Q

In multiple regression how can we tell which predictor variable has more of an influence on the outcome variable when the predictor variables are measured on different scales?

A

We convert data to z-scores and then run the analysis. The coefficients we get are called standardised coefficients. They allow us to compare the impact of each predictor on the outcome variable to each other.

69
Q

How do we determine which predictor variables have a significant influence or relationship with the outcome variable?

A

We do a t-test on the slope for each predictor against the null hypothesis, which states that the slope is zero.

70
Q

What are the assumptions made with a linear regression (multiple or single predictor)?

A
  1. Linearity.
  2. Normality of residuals.
  3. No high-influence points.
  4. Low collinearity.
71
Q

How do we determine whether there is a linear relationship between our model and the outcome or our predictor variables and our outcome?

A

We can plot the residuals against the values predicted by the model. If there is an even spread of residuals across the predicted values then we can assume linearity.
We can look at the model as a whole, or at the residuals against the values of each predictor.
A Tukey test will have a p-value that indicates whether the model is linear or not. p <.05 indicates that the model is not linear.

72
Q

What is the difference between an outlier, a high leverage point, and a high influence point?

A

An outlier has a large residual, but does not influence the model much.

A high leverage point is quite far away from the rest of the data and does not have a large residual, but it does influence or leverage the predicted model somewhat.

A high influence point is one that has a large residual and high leverage. Including a high influence point significantly alters the regression model.

73
Q

How are high-influence points identified?

A

Using Cook’s distance.
Cook’s distance takes into account both a given data point’s residual and its leverage on the model (measured by hat values).
A common cutoff for concern is 2k/N, where k is the number of coefficients - don’t forget the intercept! - and N is the sample size.

74
Q

What is collinearity?

A

Collinearity refers to whether predictor variables are correlated with each other.
The more correlated they are the more uncertainty there is around the coefficients in a regression model.

75
Q

How do we measure how much the collinearity between predictor variables is influencing the confidence intervals for the coefficients of the model?

A

Using a Variance Inflation Factor (VIF).
The square root of the VIF tells us roughly how much bigger the confidence interval becomes when we add this predictor, given its collinearity with the other predictor variables.
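Assuming the standard definition VIF = 1/(1 - R²), where R² comes from regressing one predictor on the remaining predictors, a minimal Python sketch:

```python
import math

def vif(r_squared):
    """Variance Inflation Factor from the R² of regressing one predictor on the others."""
    return 1.0 / (1.0 - r_squared)

v = vif(0.75)             # this predictor is fairly collinear with the others
print(v, math.sqrt(v))    # VIF = 4.0, so the CI is roughly 2x wider
```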

76
Q

What values of VIF indicate that the inclusion of a predictor is jeopardising our certainty in a model too much due to collinearity?

A

VIF values greater than 2 or 3.

77
Q

When trying to figure out which model to choose that best describes a relationship between predictors and an outcome, should you always choose the model that explains the most variance?

A

No.
This is because, in general, the model that explains the most variance would be the model that has the most predictors.

78
Q

How can we measure and penalise for model complexity of multiple regression models?

A

AIC and BIC.
Lower values indicate better model choice.

79
Q

What is a systematic review?

A

A research paper that predefines the exclusion/inclusion criteria for papers, search strategies, and how information from the papers will be coded.
These are a response to narrative reviews of research papers that are too vague and subjective with their inclusion or exclusion of certain research papers.
There is still subjectivity to these reviews.

80
Q

Are systematic reviews done before a meta-analysis?

A

Yes.

81
Q

Is there a lot of critique on meta-analyses?

A

Yes.
This is because they are a statistical evaluation based on the statistical evaluations done by lots of other research papers.

82
Q

What is one of the reasons that meta-analyses generate more significant results and tighter confidence intervals than individual studies?

A

They effectively have much larger sample sizes, because they pool the sample sizes of multiple studies. Larger sample sizes increase the likelihood of generating significant results.

83
Q

What are some reasons ‘vote counting’ is bad?

A
  1. Vote counting takes non-significant results as evidence for no effect. This is not what non-significant results show.
84
Q

Can meta-analyses be good?

A

Yes.
They allow for research to be collated and interesting, even life-saving, conclusions to be arrived at, e.g. streptokinase as a treatment for heart attack example.

85
Q

What is the file-drawer problem or publication bias?

A

There are many studies that do not get published because they do not show significant results. This means that when significant results are published they will be received with more weight than they may represent, as there has been prior evidence that would diminish this confidence. This is especially pertinent in meta-analyses, as these analyses generally can only work with published works and so cannot take other statistical analysis and research findings into account when doing a meta-analysis.

86
Q

In general, can a test be valid and not reliable?

A

No.
But not all tests that are reliable are valid.
A test is valid if it is reliable AND measures the construct it purports to measure.
Reliability refers to a test generating the same result when re-administered; that is, it consistently generates the same results.

87
Q

What is Classical Test Theory?

A

Proposed by Spearman in the early 20th century.
A theory of reliability: every observed score is the sum of a true score and an error.

88
Q

What are the four assumptions of Classical Test Theory?

A

All observed values, say for psych assessment, are made up of the true score and an error (endogenous and/or exogenous).

Classical Test Theory assumes the following:
1. The expected error is zero.
2. Errors do not correlate with each other.
3. Errors do not correlate with true scores.
4. Expected value of the test is equal to the true score.

89
Q

What is a simple calculation for reliability?

A

Signal/(Signal + Noise)

We want the signal to be strong enough to dominate, or at least significantly reduce the influence of, the noise or error.

90
Q

Is it true that we cannot calculate the true reliability of a test?

A

Yes.
This is because true reliability relies on population data that we do not have. We therefore have to estimate the reliability.

91
Q

What is the reliability value if we have perfect reliability?

A

1.

92
Q

What are some of the ways we can estimate reliability?

A
  1. Test-retest reliability.
  2. Alternate forms reliability.
  3. Split-half reliability
  4. Cronbach’s alpha.
93
Q

What is one of the most common measures used to estimate reliability and what is one of the main criticisms for it?

A

Cronbach’s alpha.
Generates a very conservative estimate for reliability.

94
Q

How do we calculate the confidence interval around our predicted true score?

A

We calculate the standard error of estimation and then multiply it by 1.96. The predicted true score plus or minus this value is the 95% confidence interval for our predicted true score.
Essentially this says that we are 95% confident that the client’s true score falls within this range.

95
Q

What are some of the ways to increase correlation between test scores and constructs?

A
  1. Increase the relationship between the psychological construct and the test.
  2. Remove sources of inconsistency in test administration and interpretation.
  3. Increase the number of items on the test.
96
Q

What does the Spearman-Brown Prophecy formula tell us?

A

It tells us how much the reliability of our test will change if we change the number of questions on our test.
Reliability will increase if we increase the number of questions on our test and decrease if we decrease the number of questions on our test.
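A minimal Python sketch of the Spearman-Brown prophecy formula, new reliability = nr / (1 + (n - 1)r), where n is the factor by which the test length changes (starting reliability invented):

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

print(round(spearman_brown(0.6, 2), 4))    # doubling the test: 0.75
print(round(spearman_brown(0.6, 0.5), 4))  # halving it: 0.4286
```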

97
Q

What is one of the most influential ways validity is defined?

A

“the degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of the test.”

98
Q

What are the three types of validity we discussed in lecture?

A
  1. Criterion validity - e.g. does the test correlate with other gold-standard measures of the construct?
  2. Content validity - does the test cover the whole domain of the construct?
  3. Construct validity - does the test actually assess the construct of interest? Can be evaluated by looking at the correlations between the test and other related constructs, such as a new anxiety test and a well-established test for stress.
99
Q

Which type of validity is most important, according to current view?

A

Construct validity.

100
Q

What is test sensitivity?

A

A test’s ability to correctly identify positive cases, that is to correctly identify those with a certain disease.

True positives/ (true positives + false negatives)

101
Q

What is test specificity?

A

The ability of a test to correctly identify negative cases.

True negatives/(true negatives + false positives)
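Both formulas as a minimal Python sketch, with invented screening counts:

```python
def sensitivity(tp, fn):
    """Proportion of actual positives correctly identified."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of actual negatives correctly identified."""
    return tn / (tn + fp)

# Invented results: 80 true positives, 20 false negatives,
# 90 true negatives, 10 false positives.
print(sensitivity(80, 20))  # 0.8
print(specificity(90, 10))  # 0.9
```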

102
Q

Is monotrait-monomethod correlation the same as reliability?

A

Yes.

103
Q

Is NHST just one approach to statistical analysis done by frequentists?

A

Yes.

104
Q

Can frequentist say anything about a single event, such as a meteor hitting the earth and causing the extinction of the dinosaurs?

A

No. But Bayesians can.

105
Q

What’s a Bayes factor?

A

The Bayes factor tells us the relative probability of seeing the data given our two hypotheses.
Probability of seeing data given alternative hypothesis is true/Probability of observing data given null hypothesis is true.

106
Q

Four types of validity discussed in seminar two?

A
  1. Construct Validity.
  2. External validity.
  3. Internal validity.
  4. Statistical conclusion validity.