Quantitative Revision Flashcards

1
Q

In what circumstances would you perform a simple linear regression test?

A

To determine if there are linear relationships/associations between ratio/interval variables i.e. X and Y

Enable prediction of the values of Y (DV) from the values of X (IV)

2
Q

What assumptions must be met in order for you to use the simple linear regression test with your data?

A

Ratio/interval data

Linear relationship between X and Y

Data are randomly sampled

No outliers amongst data

Residuals must be approximately normally distributed

3
Q

What would be appropriate null and alternative hypotheses for the simple linear regression test?

Non-directional (two-tailed)
Directional (one-tailed)

A

H0: There is no linear relationship between X and Y.
H1: There is a linear relationship between X and Y.

H0: There is no positive linear relationship between X and Y.
H1: There is a positive linear relationship between X and Y.

4
Q

Describe what the results mean for a simple linear regression test./
Interpret the results
Write-up of conclusion and results

A

Standardized coefficient
r: strength of the relationship between X and Y (from 0, no linear relationship, to ±1, the strongest)
Beta: predicted effect on Y if X increases by 1 SD –> e.g. when X increases by 1 SD, Y is predicted to increase by .85 SDs. Useful where there are multiple IVs (in multiple regression)

r^2: represents the proportion of variability in Y that can be explained by X

Unstandardized coefficient

b: For every 1-unit increase in X, Y is predicted to change by b units
a: only interpret this if it makes sense/there is meaning/it is useful in knowing the value of Y when X = 0

significance (sig., i.e. p-value): tells us the significance of the
association between X and Y
effect of X on Y
The statistical significance associated with the IV (e.g. height) matters
IGNORE the statistical significance associated with the constant

Be sure to answer in terms of the question and its scenario

5
Q

In what circumstances would you perform a Pearson’s (r) correlation test?

A

To determine the (strength and direction of an) association between 2 variables i.e. X and Y, where neither is categorical and both are continuous outcomes:
ratio/interval (parametric) e.g. weight (kg)
ordinal scale (non-parametric equivalent) e.g. world ranking No. 1, No. 5 etc.

Parametric data

6
Q

What assumptions must be met in order for you to use the Pearson’s (r) correlation test with your data?

A

X and Y must be ratio/interval

Linear association between X and Y (scatterplot)

The association must show homogeneity of variance (scatterplot), where the data points are evenly distributed along the regression line

Data for X and Y should follow a normal distribution (histogram, box plot, normal probability Q-Q plot, skewness and kurtosis z-scores, mean = median)

No outliers (scatterplot, box plot)

Ideally, should only be used with a sample of n >= 100
[For smaller sample sizes, there is a risk that one or two extreme data points ‘drive’ the association]

7
Q

What would be appropriate null and alternative hypotheses for the Pearson’s (r) correlation test?

Non-directional (two-tailed)
Directional (one-tailed)

A

H0: There is no association between X and Y.
H1: There is an association between X and Y.

H0: There is no positive association between X and Y.
H1: There is a positive association between X and Y.

8
Q

Describe what the results mean for a Pearson’s (r) correlation test./
Interpret the results
Write-up of conclusion and results

A

The results show a significant/non-significant (significance) weak/strong (strength) negative/positive (direction) correlation between X and Y

r: represents the strength of the relationship/association between X and Y

sig. (i.e. p-value): tells us the significance of the association between X and Y

r^2: represents the proportion of variability in Y that can be explained by X
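
As a rough illustration of where r and r^2 come from, the sketch below computes Pearson's r by hand from the sums of squares. The data are made up for the example, and the function name pearson_r is an assumption rather than anything from SPSS or the course notes.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from the deviation sums (hypothetical helper)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}")
```

The sign of r gives the direction, its size gives the strength, and r^2 is the share of variability in Y explained by X. The p-value is not computed here; it would normally be read from the software output.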

9
Q

In what circumstances would you perform a Spearman’s (rho) test?

Spearman’s rho calculates the ranked scores for each variable and considers the association between the ranks

A

To determine the (strength and direction of an) association between the ranks of X and Y, where neither X nor Y is nominal (i.e. ordinal data, or ratio/interval data)

Non-parametric data i.e. parametric assumptions have been violated/breached
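
A minimal sketch of the idea that Spearman's rho is Pearson's r applied to the ranks, assuming tied values share their average rank. The helper names and the data are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    return [sum(w < v for w in values) + (sum(w == v for w in values) + 1) / 2
            for v in values]

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    # Rank each variable separately, then correlate the ranks
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up data: Y rises with X, but along a curve rather than a line
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 25]

print("Pearson r on raw scores:", round(pearson_r(x, y), 3))
print("Spearman rho on ranks:  ", round(spearman_rho(x, y), 3))
```

Because the made-up relationship is monotonic but not linear, rho reaches 1 while Pearson's r on the raw scores stays below 1.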

10
Q

What assumptions must be met in order for you to use the Spearman’s (rho) test with your data?

A

X and Y must be at least ordinal (ordinal, interval or ratio)

Association between the ranks of X and Y does not need to be linear but it must be monotonic (i.e. does not change direction) (scatterplot)

The association must show homogeneity of variance (scatterplot), where the data points are evenly distributed along the regression line

Only appropriate where n (sample size) is at least 20

11
Q

What would be appropriate null and alternative hypotheses for the Spearman’s (rho) test?

A

H0: There is no association between the ranks of X and Y.
H1: There is an association between the ranks of X and Y.

H0: There is no positive association between the ranks of X and Y.
H1: There is a positive association between the ranks of X and Y.

12
Q

Describe what the results mean for a Spearman’s (rho) test.

A

The results show a significant strong positive correlation between the ranks of X and Y

13
Q

In what circumstances would you perform a Kendall’s (tau) test?

A

To determine the (strength and direction of an) association between the ranks of X and Y, where neither X nor Y is nominal (i.e. ordinal data, or ratio/interval data)

Non-parametric data (data is not normally distributed) i.e. parametric assumptions have been violated/breached

Useful with small data set n < 20

Can deal with a large number of tied ranks in the data
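
One way to see how Kendall's tau copes with ties is to count concordant and discordant pairs directly. The sketch below assumes the tau-b variant, which adjusts the denominator for tied pairs; the function name and data are hypothetical.

```python
import math
from itertools import combinations

def kendall_tau_b(xs, ys):
    """Kendall's tau-b from pair counts (hypothetical helper)."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                      # tied on both: ignored
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1               # pair ordered the same way on X and Y
        else:
            discordant += 1               # pair ordered in opposite ways
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt(
        (untied + tied_x_only) * (untied + tied_y_only))

# Made-up small sample (n < 20)
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau_b(x, y)
print("tau-b =", tau)
```

With no ties, as in this made-up sample, tau-b reduces to (concordant - discordant) divided by the total number of pairs.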

14
Q

What assumptions must be met in order for you to use the Kendall’s (tau) test with your data?

A

Both variables must be at least ordinal (ordinal, interval or ratio)

Association between the ranks of X and Y does not need to be linear but it must be monotonic (i.e. does not change direction) (scatterplot)

The association must show homogeneity of variance (scatterplot), where the data points are evenly distributed along the regression line

Only useful where n < 20

15
Q

What would be appropriate null and alternative hypotheses for the Kendall’s (tau) test?

A

H0: There is no association between the ranks of X and Y.
H1: There is an association between the ranks of X and Y.

H0: There is no positive association between the ranks of X and Y.
H1: There is a positive association between the ranks of X and Y.

16
Q

Describe what the results mean for a Kendall’s (tau) test.

A

The results show a non-significant weak negative correlation between the ranks of X and Y

17
Q

In what circumstances would you perform a multidimensional Chi-Square test?

A

Relationship/association between variables (Test of association)

Variables are both categorical i.e. nominal

Independent research design (no subject/participant appears in more than one group)

[Compare the observed and expected counts i.e. Test for differences where samples are independent]

18
Q

What assumptions must be met in order for you to use the multidimensional Chi-Square test with your data?

A

Randomly sampled

Variables must be categorical i.e. nominal

Independent measures

Counts (actual numbers), not percentages

No calculated expected value < 1

No more than 20% of expected values < 5

Solution = collect more data, collapse categories, or use an exact test (SPSS)
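
The two expected-count rules above can be checked mechanically. This is a sketch under the usual definition of an expected count (row total x column total / n); the function names and the tables are made up for illustration.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def expected_counts_ok(table):
    """True if no expected value < 1 and no more than 20% are < 5."""
    cells = [e for row in expected_counts(table) for e in row]
    none_below_1 = all(e >= 1 for e in cells)
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return none_below_1 and share_below_5 <= 0.20

# Hypothetical counts: rows = 3 sports, columns = normal weight / overweight
big_table = [[30, 10], [25, 15], [20, 20]]
small_table = [[3, 1], [2, 4]]

print(expected_counts(big_table))        # every expected count is 25 or 15
print(expected_counts_ok(big_table))
print(expected_counts_ok(small_table))   # all expected counts fall below 5
```

If the check fails, the remedies listed on the card apply: collect more data, collapse categories, or use an exact test.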

19
Q

What would be appropriate null and alternative hypotheses for the multidimensional Chi-Square test?

Research question: Does the proportion of athletes who are normal weight or overweight differ by sport?

A

(H0): In the population, the three sports do not differ in the proportions who are normal and overweight.

(H1): In the population, the three sports do differ in the proportions who are normal and overweight.

20
Q

Describe what the results mean for a multidimensional Chi-Square test./
Interpret the results
Write-up of conclusion and results

A

Method
A Chi-square test was performed to test the H0 that the 3 sports do not differ in the proportions who are normal and overweight

Results
There was a difference between the proportion of those athletes who are normal and those who are overweight in the 3 sports (Field, Netball and Rowing), Chi-Square statistic = … (df = …, n = …), p = …

Basically:
Method: Test was performed to test the H0
Results: 
Conclusion/result
Chi-Square statistic
df
n
p-value
21
Q

In what circumstances would you perform a McNemar’s (Chi-Square) test?

A

Relationship/association between variables (Test of association)

Variables are both nominal

Repeated measures design with two dichotomous variables

[Test for differences where samples are paired]

22
Q

What assumptions must be met in order for you to use the McNemar’s test with your data?

A

Randomly sampled

Dependent/repeated measures

DV and IV must be
dichotomous
of only 2 categories each

Variables must be categorical i.e. nominal

Counts (actual numbers), not percentages

No calculated expected value < 1

No more than 20% of expected values < 5

Solution = collect more data, collapse categories, or use an exact test (SPSS)

23
Q

What would be appropriate null and alternative hypotheses for the McNemar’s test?

Research question: To investigate the number of correct identifications of the writer’s sex by their handwriting style

49 Psychology students were asked to write using their normal handwriting and then asked to write imitating the handwriting of the opposite sex
Students recruited a participant to judge the handwriting of both samples and identify the sex (repeated measures)

IV: handwriting style
DV: participant’s judgement of handwriter’s sex

A

H0: There will be no difference in the number of correct identifications of the writer’s sex from the 2 handwriting samples.

H1: There will be a difference in the number of correct identifications of the writer’s sex from the two handwriting samples.

24
Q

Describe what the results mean for a McNemar’s test.

A

Method
A McNemar’s Chi-Square test was performed to test the H0 that there will be no difference in the number of correct identifications of the writer’s sex from the two handwriting samples

Results
There is a significant difference in the number of correct judgements between the two conditions of handwriting style (n = …, exact p = …)
Of the 49 participants, ‘..’ correctly identified the handwriter’s sex for normal writing. Of the ‘…’ who were incorrect for the normal handwriting, ‘…’ correctly identified the handwriter’s sex for the opposite-sex imitation
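
The exact p-value reported for McNemar's test depends only on the discordant pairs, i.e. judges who were correct for one handwriting sample but not the other. The sketch below assumes the standard exact (binomial) form of the test; the counts b and c are invented and are not the results of the handwriting study.

```python
from math import comb

def mcnemar_exact_p(b, c):
    """Two-sided exact p-value from the two discordant counts.

    b = correct for sample 1 only, c = correct for sample 2 only.
    Under H0 each discordant pair is equally likely to fall either way.
    """
    n = b + c
    smaller = min(b, c)
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical discordant counts, for illustration only
p = mcnemar_exact_p(2, 10)
print(f"exact p = {p:.4f}")
```

Pairs where the judge was right both times or wrong both times carry no information about a change between conditions, which is why they do not enter the calculation.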

25
Q

In what circumstances would you perform an independent samples design t-test?

A

Parametric data

Independent (i.e. different) data/groups/samples

To compare means - compare sample mean to another sample mean
i.e. to compare differences between groups (mean)

e.g. Intervention and control group –> study participant is in one group only

[independent data: data that comes from different (independent) groups of people]

26
Q

What assumptions must be met in order for you to use the independent t-test with your data?

A

Dependent variable is ratio/interval

Measurements in condition 1 are independent of measurements in condition 2

For n < or equal to 30 –> distribution of DV data for each group (X and Y) should not be badly skewed i.e. should follow a normal distribution
(Can use CLT to help explain, if we still remember)

Homogeneity of variance:
The variance of the DV data for the two groups should not be very different
A problematic difference in variances is indicated by a significant Levene’s Test
If significant, interpret the p-value associated with ‘equal variances not assumed’
If non-significant, interpret the p-value associated with ‘equal variances assumed’
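
The two output rows mentioned above correspond to two ways of computing the t statistic. The sketch below assumes the usual pooled-variance formula for ‘equal variances assumed’ and the Welch formula for ‘equal variances not assumed’; the function names and data are made up, and Levene's test itself is not implemented.

```python
import math
from statistics import mean, variance

def t_equal_variances(x, y):
    """Pooled-variance t statistic and its df (equal variances assumed)."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def t_unequal_variances(x, y):
    """Welch t statistic and its approximate df (equal variances not assumed)."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Made-up scores for two independent groups
intervention = [5, 7, 9, 11, 13]
control = [4, 5, 6, 7, 8]

t_pooled, df_pooled = t_equal_variances(intervention, control)
t_welch, df_welch = t_unequal_variances(intervention, control)
print(f"equal variances assumed:     t = {t_pooled:.3f}, df = {df_pooled}")
print(f"equal variances not assumed: t = {t_welch:.3f}, df = {df_welch:.2f}")
```

With equal group sizes the two t values coincide and only the degrees of freedom differ, so the choice of row mainly changes the p-value.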

27
Q

What would be appropriate null and alternative hypotheses for the independent t-test?

two-tailed
one-tailed

A

H0: There is no difference between the population means of X and Y.
H1: There is a difference between the population means of X and Y.

H0: The population mean of X is not > the population mean of Y.
H1: The population mean of X > population mean of Y.

28
Q

Describe what the results mean for an independent t-test.

A

p < or equal to 0.05 or 0.01 –>
There is a significant difference between the population means of X and Y

or

The population mean of X is significantly > population mean of Y

29
Q

In what circumstances would you perform a paired design t-test?

A

Parametric data

Dependent/paired (i.e. same) data/groups/samples

To compare means - compare sample mean to another sample mean
i.e. to compare differences within groups (mean)

e.g. pre-test post-test study
Data collected from an/the same individual at different points in time/under different conditions
Compare differences in outcome between time 1 & 2 or condition 1 & 2 (mean)

[dependent/paired data: data that comes from one group of individuals]

30
Q

What assumptions must be met in order for you to use the paired t-test with your data?

A

Dependent variable is ratio/interval

Observations not independent
Each measurement in Condition/Time 1 has a match in Condition/Time 2

For n < or equal to 30 –> distribution of differences between X and Y (i.e. X - Y) should not be badly skewed i.e. should follow a normal distribution
(Can use CLT to help explain, if we still remember)

Homogeneity of variance does not need to be checked here, because the test is run on the single set of paired differences

31
Q

What would be appropriate null and alternative hypotheses for the paired t-test?

two-tailed
one-tailed

A

H0: No difference in the means before and after.
H1: A difference in the means before and after.

H0: Mean after < or equal to mean before.
H1: Mean after > mean before.

or

H0: Mean difference = 0.
H1: Mean difference is not = 0.

H0: Mean difference is not positive.
H1: Mean difference is positive.

32
Q

Describe what the results mean for a paired t-test.

A

p < or equal to 0.05 or 0.01 –>
Significant difference between the means before and after

or

Mean after is significantly > mean before

33
Q

In what circumstances would you perform a Mann Whitney U test?

A

Non-parametric data:
Ordinal scale DV
Ratio/interval DV that does not meet parametric assumptions
(Sample sizes are small and normality is questionable
Data contain outliers that, because of their magnitude, distort the mean values and affect the outcome of the comparison)

Independent (i.e. different) data/groups/samples

To compare mean ranks/medians - compare one sample median to another sample median
i.e. to compare differences between groups (median)

e.g. Intervention and control group –> study participant is in one group only

[To test the H0 that
2 samples come from the same population (i.e. have the same median)
against the alternative that observations in one sample tend to be > than observations in the other]

34
Q

What assumptions must be met in order for you to use the MWU test with your data?

A

Independent data/samples

Data distributions of X and Y are the same shape

Not too many ties in ranks of data

[Data values are assigned ranks relative to both samples combined]
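
To make the ranking step concrete, the sketch below ranks both samples together and derives the U statistic from the rank sum of the first sample, using the standard formula U1 = R1 - n1(n1 + 1)/2. Function names and data are hypothetical, and no p-value is computed.

```python
def rank_in(pool, value):
    """1-based rank of value within pool; ties share the average rank."""
    below = sum(w < value for w in pool)
    equal = sum(w == value for w in pool)
    return below + (equal + 1) / 2

def mann_whitney_u(x, y):
    """Return (U1, U2) for two independent samples."""
    pool = x + y                                  # rank both samples combined
    rank_sum_x = sum(rank_in(pool, v) for v in x)
    u1 = rank_sum_x - len(x) * (len(x) + 1) / 2
    u2 = len(x) * len(y) - u1
    return u1, u2

# Made-up scores for two independent groups
group_x = [3, 5, 8, 12]
group_y = [1, 2, 4, 6, 7]

u1, u2 = mann_whitney_u(group_x, group_y)
print("U1 =", u1, "U2 =", u2, "U =", min(u1, u2))
```

U1 equals the number of (x, y) pairs in which the X value is the larger one, so a U1 far from half of n1 x n2 suggests one group tends to score higher.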

35
Q

What would be appropriate null and alternative hypotheses for the MWU test?

Two-tailed
One-tailed

A

H0: There is no difference between the population medians of X and Y.
H1: There is a difference between the population medians of X and Y.

H0: The population median of X is not > the population median of Y.
H1: The population median of X > population median of Y.

36
Q

Describe what the results mean for a MWU test.

A

p < or equal to 0.05 or 0.01 –>
There is a significant difference between the population medians of X and Y

or

The population median of X is significantly > population median of Y

37
Q

In what circumstances would you perform a Wilcoxon signed rank test?

[A Wilcoxon signed rank test:
Measures the differences between each variable
Compares paired data
Is used when you cannot justify a normality assumption for the differences
Very simple –> counts the number of differences that are positive (+) and those that are negative (-) and makes a decision based on these counts]

A

Non-parametric data

Dependent/paired (i.e. same) data/groups/samples

To compare medians - compare one sample median to another sample median
i.e. to compare differences within groups (median)

e.g. pre-test post-test study
Data collected from an/the same individual at different points in time/under different conditions
Compare differences in the ranks of the outcome between time 1 & 2 or condition 1 & 2 (median)

38
Q

What assumptions must be met in order for you to use Wilcoxon test with your data?

A

Paired/dependent data/samples

Non-categorical data

39
Q

What would be appropriate null and alternative hypotheses for the Wilcoxon test?

A

H0: No difference in the medians before and after.
H1: A difference in the medians before and after.

H0: Median after < or equal to median before.
H1: Median after > median before.

or

H0: Median difference = 0.
H1: Median difference is not = 0.

H0: Median difference is not positive.
H1: Median difference is positive.

40
Q

Describe what the results mean for a Wilcoxon test.

A

p < or equal to 0.05 or 0.01 –>
Significant difference between the medians before and after

or

Median after is significantly > median before

41
Q

What is a type I error?

A

False positive

Incorrectly rejecting the H0 when it is actually true

Saying that there is a difference when in reality/actually there is no difference

e.g. Telling a man that he is pregnant

42
Q

What is a type II error?

A

False negative

Incorrectly failing to reject (i.e. accepting) the H0 when it is actually false

Saying that there is no difference when in reality/actually there is a difference

e.g. Telling a pregnant woman that she is not pregnant (when it is so obvious that she is!)

43
Q

What is the common structure of all statistical tests?/What are the 7 steps of hypothesis testing?

A

Set H0 and H1

Establish alpha i.e. level of significance

Determine p-value

Reject or fail to reject H0

OR

Define study question and choose an inferential test

Set hypotheses

Select/establish level of significance i.e. alpha = 0.05

EDA and assess test assumptions to see if they are met/satisfied

Go ahead and run the test

Obtain p-value

Decide whether to reject or fail to reject H0 + conclusion, interpretation and write-up of results

44
Q

What is the benefit of using a paired t-test over an independent t-test?

A

Independent t-test gives rise to more random error because the control group might, by chance, be very different from the treatment group

Variation is limited in paired t-test as each person is their own control
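
A small numerical illustration of why pairing helps, assuming made-up before/after scores for five people. The same data are analysed twice: once as paired differences and once (incorrectly, for comparison only) as if the two columns came from independent groups, using the unequal-variances form of the t statistic. Function names are hypothetical.

```python
import math
from statistics import mean, variance

def paired_t(before, after):
    """t statistic computed on the within-person differences."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / math.sqrt(variance(diffs) / len(diffs))

def independent_t(x, y):
    """t statistic treating the two columns as unrelated groups."""
    standard_error = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / standard_error

# Made-up scores: people differ a lot from each other, but each improves a little
before = [70, 82, 65, 90, 75]
after = [73, 86, 67, 95, 78]

t_paired = paired_t(before, after)
t_indep = independent_t(before, after)
print(f"paired t = {t_paired:.2f}, independent t = {t_indep:.2f}")
```

The mean difference is the same in both analyses, but the paired version removes the large between-person variation from the error term, so its t statistic is much larger.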

45
Q

What are residuals?

A

= Actual - predicted value of Y

Difference between the actual value of Y (points) and the predicted value of Y (line)

An observable estimate of the unobservable statistical error
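
A short sketch of computing residuals for a fitted line, using the common convention residual = observed Y minus predicted Y. The intercept, slope, and data points are invented for the example.

```python
def residuals(xs, ys, a, b):
    """Observed Y minus the Y predicted by the line Y = a + b*X."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical fitted line and data points
a, b = 47.7, 4.1
xs = [1, 2, 3]
ys = [52, 55, 61]

res = residuals(xs, ys, a, b)
print([round(e, 1) for e in res])
```

A positive residual means the point lies above the line, a negative one means it lies below.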

46
Q

What is the simple linear regression equation?

A

Y = a + bX
i.e. DV = constant + coefficient x (IV)

a: constant or intercept

b: coefficient or slope of the line associated with this independent variable
As X increases by 1 unit, Y increases by b units
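
To show where a and b come from, here is a minimal least-squares sketch. It assumes the standard formulas b = Sxy / Sxx and a = mean(Y) - b x mean(X); the function name and the data (hours of training and a test score) are made up.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b*X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx                 # slope
    a = mean_y - b * mean_x       # intercept (value of Y when X = 0)
    return a, b

# Made-up data for illustration only
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

a, b = fit_line(hours, score)
print(f"Y = {a:.1f} + {b:.1f} * X")
print("Predicted Y at X = 6:", round(a + b * 6, 1))
```

Here each extra unit of X is predicted to add b units to Y, and the fitted equation can be used to predict Y for a new value of X.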

47
Q

What does r^2 = 0.8 mean?

A

80% of variability in Y is explained by X

*Note: In an exam, interpret the Adjusted R Square (if it is given) as it is more accurate
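
For reference, the adjusted value is usually obtained from r^2 with the formula 1 - (1 - r^2)(n - 1)/(n - k - 1), where n is the sample size and k the number of IVs. The sample size below is an assumption chosen only to make the arithmetic visible.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R Square for n observations and k independent variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# r^2 = 0.8 from the card; n = 20 and k = 1 are hypothetical
adj = adjusted_r_squared(0.8, n=20, k=1)
print(f"Adjusted R Square = {adj:.3f}")
```

The adjustment always pulls the value down slightly, and more so for small samples or many IVs.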

48
Q

What is the assumption that all inferential tests make about the sample?

A

The sample is randomly sampled from the population

49
Q

What is heteroscedasticity?

A

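Unequal spread (variance) of the residuals across the values of X, i.e. homogeneity of variance is breached

Data points fan out rather than lying evenly along the regression line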

50
Q

How do we obtain the p-value for one-tailed test (directional) from the p-value of/for two-tailed test (non-directional)?

A

p-value for one-tailed test = Half the p-value for two-tailed test (provided the effect is in the predicted direction)
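
The halving rule can be written as a tiny helper. This sketch assumes a symmetric test statistic (such as t or z); the function name and the example p-value are hypothetical.

```python
def one_tailed_p(p_two_tailed, effect_in_predicted_direction):
    """Convert a two-tailed p-value to a one-tailed one (symmetric statistics)."""
    half = p_two_tailed / 2
    # If the effect goes the opposite way, the one-tailed p is large, not small
    return half if effect_in_predicted_direction else 1 - half

print(one_tailed_p(0.08, True))
print(one_tailed_p(0.08, False))
```

This is why a result can be non-significant two-tailed (p = .08) yet significant one-tailed (p = .04), but only when the direction was specified in advance and the data agree with it.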

51
Q

What is the difference between one-tailed and two-tailed tests with regard to rejecting the H0?

A

Two-tailed tests are non-directional. We would reject H0 if we found a positive or negative association or difference etc.

One-tailed tests are directional. We only reject H0 if the association or difference etc. is in the direction that we specified/expected

52
Q

What does the multidimensional Chi-Square test compare?

A

Compares observed frequencies in our sample with the frequencies we would expect if there were no relationship at all between the two variables in the population that the sample was drawn from

53
Q

What is the formula for Chi-square?/How do we obtain a Chi-square statistic?

A

Chi-Square = SUM((O - E)^2 / E)

O: observed count
E: expected count

For each cell, apply the formula (O-E)^2/E
Then sum up all the cells to get the Chi-Square statistic
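
The cell-by-cell calculation can be sketched as follows. Expected counts are taken as row total x column total / n, and the degrees of freedom as (rows - 1)(columns - 1). The table of counts is invented for illustration and the function name is hypothetical.

```python
def chi_square(table):
    """Return (chi-square statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected   # (O - E)^2 / E
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows = 3 sports, columns = normal weight / overweight
observed = [[30, 10], [25, 15], [20, 20]]

statistic, df = chi_square(observed)
print(f"Chi-Square = {statistic:.2f}, df = {df}")
```

Only the inner cells of the table are passed in; the marginal totals are recomputed inside the function and do not count towards df.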

54
Q

What is (the concept of) degrees of freedom?

How do we calculate it?

A

The more categories there are in the IV and DV, the more chance there is of the analysis being affected by sampling error

(No. of categories in the row variable minus 1) x (No. of categories in the column variable minus 1)
i.e. (rows-1)(columns-1)
EXCLUDE marginal cells!

55
Q

From the study done by Chris Gratton and Ian Jones on Research methods for Sports Studies (2008), what are the 4 purposes of data analysis?

A

Describe

Compare

Examine similarities

Examine differences

56
Q

What are the aims of Descriptive statistics?

A

Check for errors and outliers

Describe and summarise the data

Spread of the data

Ensure appropriate analysis

Data parametric or non-parametric?

57
Q

Ways of summarising interval/ratio data

A

Measure of Central Tendency
mean
median
mode

Measure of Dispersion
range
SD
variance

Normal curve, skewness, kurtosis
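
The measures listed above are all available in Python's standard statistics module. A brief sketch on made-up interval/ratio data (skewness and kurtosis are not part of that module and are left out):

```python
import statistics

# Made-up sample for illustration only
data = [2, 4, 4, 5, 7, 9, 11]

summary = {
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
    "range": max(data) - min(data),
    "variance": statistics.variance(data),   # sample variance (n - 1)
    "sd": statistics.stdev(data),            # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this sample the mean sits above the median, which hints at a slight positive skew.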

58
Q

What do parametric tests assume about the characteristics of the sample in terms of its distribution?

A

Data is drawn from a normally distributed population (i.e. data is not skewed)

Have the same variance or spread on the variables being measured

59
Q

What assumptions do non-parametric tests make about the characteristics of the sample in terms of its distribution?

A

Do not make any assumption about the shape of the distribution (e.g. normality is not required)

60
Q

What is p-value?

A

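Probability of obtaining a result at least as extreme as the one observed if H0 is true (it is not the probability that H0 is true)

i.e. the probability that a difference as large as the one found would occur by chance alone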

61
Q

When do we use non-parametric tests?

A

When assumptions of parametric tests are not met (i.e. breached)
level of measurement (e.g., interval or ratio data)
normal distribution
homogeneity of variances across groups

Not always possible to correct for problems with the distribution of a data set (i.e. data transformation) –> have to use non-parametric tests:
Make fewer assumptions about the type of data on which they can be used
Many of these tests use “ranked” data

62
Q

What is alpha/level of significance?

A

The probability of making a Type I error that we are willing to tolerate

Alpha level of .05 (5%): decide to reject H0 and accept H1 when the p-value is no more than .05 –>
up to 5% chance that you are wrong in concluding that there is a difference (making a Type I error) when there actually isn’t (false positive)
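
The meaning of alpha can be checked by simulation: if H0 is true and we test at alpha = .05, roughly 5% of samples should still lead to rejection. The sketch below is a simplification that swaps in a z-test with a known SD of 1 so that only the standard library is needed; the function names, sample size, number of trials, and seed are arbitrary assumptions.

```python
import math
import random
from statistics import NormalDist

def two_tailed_p(sample, mu0=0.0, sigma=1.0):
    """Two-tailed p-value of a z-test for the mean (sigma assumed known)."""
    z = (sum(sample) / len(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def type_one_error_rate(alpha=0.05, trials=2000, n=25, seed=1):
    """Share of samples drawn under a true H0 that are (wrongly) rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]   # H0 is true: mean = 0
        if two_tailed_p(sample) <= alpha:
            rejections += 1                                 # a Type I error
    return rejections / trials

rate = type_one_error_rate()
print(f"Share of true-H0 samples rejected at alpha = .05: {rate:.3f}")
```

The printed share should land close to .05, which is the false-positive rate accepted when alpha was set.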