Research Methods - Quant Flashcards

1
Q

You want to determine if there is an association between the measures. Assuming the data is normally distributed what statistical test would you use?

A

Pearson’s

2
Q

You want to determine if there is an association between the measures. The data is not normally distributed, what statistical test would you use?

A

Spearman’s

3
Q

Categories for determining correlation coefficient? (Cohen)

A
Small = -0.3 to 0.3
Medium = -0.5 to -0.3 or 0.3 to 0.5
Strong = -0.9 to -0.5 or 0.5 to 0.9
Very strong = -1.0 to -0.9 or 0.9 to 1.0
4
Q

What is meant by covariance?

A
  • A measure of the joint variability of two variables. If two variables are related, then when one variable deviates from its mean we would expect the other variable to deviate from its mean in a similar way, i.e. they covary.
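A minimal sketch of this idea, assuming the sample (n - 1) form of covariance. The function name and the data are invented for illustration and are not part of the original card.

```python
def sample_covariance(xs, ys):
    """Sample covariance: summed cross-products of deviations, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Each product is positive when both values sit on the same side of their means.
    cross_products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(cross_products) / (len(xs) - 1)


# Hypothetical data: y tends to rise when x rises, so the covariance is positive.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov_xy = sample_covariance(x, y)
print(cov_xy)  # 1.5
```

A positive value means the two variables tend to deviate from their means in the same direction; a negative value means they tend to deviate in opposite directions.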
5
Q

What type of association is shown in a graph with a linear +ve line, followed by a plateau then another linear +ve line?

A

Positive monotonic relationship

6
Q

You wish to determine the relationship between X and Y. State the null hypothesis using statistical symbols.

A

H0: ρ = 0

7
Q

When examining the association between two variables, which of the following should you never do:
A – plot the data to determine if the relationship is linear
B – extrapolate the relationship beyond the range of the data
C – imply a causal relationship between variables
D – A and B
E – B and C

A

B and C

8
Q

Example of how a relationship may arise between two variables?

A
  • x causes y or y causes x
  • time
  • coincidence
  • confounding variables
  • correlation induced by a third (confounding) variable
9
Q

How can you control for variation between subjects (not comparing like for like)?

A

try and make groups as homogeneous as possible (try avoiding any bias/confounding factors)

10
Q

how can you control for too many output variables being measured (complex study design)?

A

reduce the number of measured output variables

11
Q

how can you control for too many interventions being used?

A

employ one intervention at a time and include washout periods

12
Q

which methods are best for minimising bias?

A

using a cross-over study or a randomised controlled trial (RCT)

13
Q

Types of blinding (with definitions)

A

Single blind
- participants unaware of intervention group (e.g. not aware if receiving drug or placebo)
Double Blind
- participant and researcher both unaware (assigned by third party)

14
Q

(what test) You have measured stature for a group of male and female students and want to determine if there is a difference between the two groups. You have established that the data is normally distributed.

A

Independent t test

15
Q

You have measured resting blood pressure and percentage body fat in a group of sedentary middle managers. You want to establish if there is a relationship between the two measures. Assuming the data is normally distributed what statistical test would you use?

A

Pearson’s correlation

16
Q

You have conducted research into the effect of beetroot juice on 20m sprint performance. A group of sprinters were each tested twice: before ingestion of beetroot juice and 1 hour after ingestion of beetroot juice. You want to evaluate if the ingestion of beetroot juice has affected the 20m sprint performance. What parametric statistical test would you use to find out if the beetroot juice had an effect?

A

Paired/Dependent t test

17
Q

You have measured weight and standing long jump distance in a group of 12-14 year old girls. You want to establish if there is a relationship between the two measures. You establish the data is not normally distributed, what is the most appropriate non-parametric statistical test?

A

Spearman’s correlation

18
Q

It is important to consider gender differences in anthropometric measurements especially during childhood because children are still growing. Recent research suggests that on average boys are shorter, lighter and have smaller waist circumference measurements compared to girls the same age. You establish the data is not normally distributed, what is the most appropriate non-parametric statistical test?

A

Mann-Whitney

19
Q

You wish to establish if there is a difference in the average height of the children between four different schools? What statistical test is most appropriate to answer this question?

A

One-way ANOVA

20
Q

A school had percentage body fat measurements taken using two techniques (BIA and DXA). You wish to establish if there is a difference in the average percentage body fat between the different techniques. Assuming the assumptions for parametric tests are met, what is the appropriate statistical test?

A

Paired t test

21
Q

A group of cardiac patients are participants in an exercise-based cardiac rehabilitation programme and have volunteered to be assessed just before they started the programme, 3 months later and at the end of the programme. What is the appropriate parametric test to assess whether resting heart rate has changed significantly at any of the three test points?

A

Repeated measures ANOVA

22
Q

You wish to establish if waist circumference is a significant predictor of 20m sprint speed in a sample of children. What is the appropriate statistical test?

A

Simple linear regression

23
Q

You have collected leg power data using a Wingate cycle test in a group of male cyclists. They each performed two tests: one following consumption of water and one following consumption of a carbohydrate drink. You want to establish if there is a difference between the leg power scores. You establish the data is not normally distributed, what is the most appropriate non-parametric statistical test?

A

Wilcoxon

24
Q

Method to calculate the variance, i.e. variation from the mean (6 steps)

A

1) calculate the mean (x̅)
2) calculate the individual data point deviations from the mean (x-x̅)
3) square the deviation from the mean (x-x̅)^2
4) sum the squared deviations ∑(x-x̅)^2
5) calculate (n-1)
6) divide the sum of the squared deviations by (n-1)
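The six steps above can be sketched in code as follows. The function name and the scores are invented for illustration; each comment points back to the step it implements.

```python
def sample_variance(data):
    """Sample variance, following the six steps one at a time."""
    mean = sum(data) / len(data)               # 1) calculate the mean
    deviations = [x - mean for x in data]      # 2) deviation of each point from the mean
    squared = [d ** 2 for d in deviations]     # 3) square each deviation
    sum_of_squares = sum(squared)              # 4) sum the squared deviations
    degrees_of_freedom = len(data) - 1         # 5) calculate n - 1
    return sum_of_squares / degrees_of_freedom  # 6) divide the sum by n - 1


# Hypothetical scores: mean 6, squared deviations 4, 4, 0, 1, 1 (sum 10).
scores = [4, 8, 6, 5, 7]
variance = sample_variance(scores)
print(variance)  # 2.5
```

Taking the square root of the result gives the standard deviation in the original units of measurement.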

25
Q

if data is negatively skewed, which direction is the data skewed towards?

A

Left

26
Q

if data is positively skewed, which direction is the data skewed towards?

A

Right

27
Q

If data is normally distributed, you can calculate the probability of randomly selecting a value within a given range. Where are you most likely to select a value from?

A

from the middle and not the extremities, around the mean

28
Q

with data, method of determining if you can accept or reject the null hypothesis?

A
  • firstly need to determine null distribution
  • see where observed value fits in this distribution
  • if it lies in the tails (far left/right), there is a good chance the observed value comes from an alternative distribution
  • so you reject the null hypothesis
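One way to see this logic in action is an exact permutation test, sketched below. This is a swapped-in technique chosen because it builds the null distribution explicitly; it is not the t test or ANOVA covered on the other cards. The function name and the data are invented for illustration.

```python
from itertools import combinations


def exact_permutation_p(group_a, group_b):
    """Two-sided p value for a difference in means, enumerating every relabelling."""
    pooled = group_a + group_b
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    relabellings = 0
    # Under the null hypothesis the group labels are arbitrary, so every
    # split of the pooled scores is equally likely: this is the null distribution.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        relabellings += 1
        # Count splits at least as far into the tails as the observed value.
        if difference >= observed - 1e-12:
            extreme += 1
    return extreme / relabellings


# Hypothetical scores for two small independent groups.
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
p_value = exact_permutation_p(a, b)
print(round(p_value, 4))  # 0.0286
```

Only 2 of the 70 possible splits are as extreme as the observed one, so the observed difference lies in the tails of the null distribution and, at the 0.05 level, the null hypothesis would be rejected.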
29
Q

when conducting an independent t test, what are the assumptions that the data has to meet? (x7)

A
  • DV that is continuous (interval or ratio level)
  • IV that is categorical (two groups)
  • Independent samples/groups (i.e. independence of observations)
  • random sample of data from the population
  • normal distribution (approx.) of the DV for each group
  • no outliers
  • homogeneity of variances (i.e. variances approx. equal across all groups)
30
Q

when conducting a dependent (paired) t test, what are the assumptions that the data has to meet? (x5)

A
  • DV that is continuous (interval or ratio level)
  • related samples/groups (i.e. dependent observations)
  • random sample of data from population
  • normal distribution (approx.) of the differences between the paired values (the differences need to be normally distributed)
  • no outliers in the difference between the two related groups
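The assumptions above concern the differences because the paired test is computed on them. A minimal sketch of the t statistic, assuming the usual formula t = mean difference ÷ (SD of differences ÷ √n); the function name and the data are invented for illustration.

```python
import math


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # The test works on one difference per participant.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = math.sqrt(var_diff / n)
    return mean_diff / standard_error, n - 1


# Hypothetical scores for four participants tested twice.
before = [10, 12, 9, 11]
after = [12, 15, 10, 13]
t_stat, df = paired_t(before, after)
print(round(t_stat, 3), df)  # 4.899 3
```

The t statistic would then be compared against the t distribution with n - 1 degrees of freedom to obtain a p value.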
31
Q

what does a monotonic function do to variables?

A
  • a function that moves in one direction only, either increasing or decreasing, and need not be linear; either:
    1) as the value of one variable increases, so does the other variable
    2) as the value of one variable increases, the other variable value decreases
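To illustrate the difference between monotonic and linear, the sketch below compares Pearson's r with Spearman's rho (Pearson's r calculated on the ranks) for a curved but always-increasing relationship. The function names and the data are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1 for the smallest value; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranked data."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical data: y always increases with x, but along a curve (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))  # 0.943 1.0
```

Spearman's rho is exactly 1 because the ordering is perfectly preserved, while Pearson's r falls below 1 because the points do not lie on a straight line.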
32
Q

what would you expect to happen to one variable if another related variable deviated from the mean?

A

you would expect the variable to deviate from the mean in a similar way that the other variable deviated

33
Q

how to gain standard deviation units from deviations from the mean

A

deviation from the mean divided by the SD (giving us the deviation in SD units)
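A minimal sketch of this conversion, assuming the sample SD (n - 1) is used. The function name and the heights are invented for illustration.

```python
import math


def z_scores(data):
    """Convert each score to SD units: (score - mean) / SD."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return [(x - mean) / sd for x in data]


# Hypothetical heights in cm: mean 170, SD 10.
heights_cm = [160, 170, 180]
z = z_scores(heights_cm)
print(z)  # [-1.0, 0.0, 1.0]
```

A score of 160 cm is therefore one SD below the mean, and 180 cm is one SD above it.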

34
Q

what is the Model Sum of Squares (SSm) (theory of ANOVA)

A

how much variability is due to the experimental manipulation

35
Q

what is the Residual Sum of Squares (SSr) (theory of ANOVA)

A

how much variability is due to individual differences in performance

36
Q

how does ANOVA compare variances

A

using the F ratio (F ratio = MSm/MSr, where each mean square is the sum of squares, SSm or SSr, divided by its degrees of freedom)

37
Q

what does it mean if the F ratio is < 1

A

the result cannot be significant, because the unexplained (residual) variance is larger than the variance explained by the model (noise > signal)

38
Q

Assumptions needed to run an ANOVA test (x5) (although if not met, won’t alter the F value sig.)

A
  • normal distribution in the population (ANOVA is robust and not sig. affected by violations of this)
  • homogeneity of variance
  • scores in various groups are independent
  • data measured at the interval or ratio level
  • the largest of the sample SD’s should not be greater than twice the smallest of the sample SD’s
39
Q

according to O’Donoghue (2012), when testing normality of data, when should a Kolmogorov-Smirnov test be used and when should you use a Shapiro-Wilk test instead?

A

Use the Kolmogorov-Smirnov test when there are ≥ 50 data points

Use the Shapiro-Wilk test when there are < 50 data points

40
Q

what test is used to test homogeneity of variances and how to understand the results (think of p value)

A
  • Levene’s test
  • if the test result is p > 0.05, we can assume homogeneity of variance
  • if the result is p ≤ 0.05, we should consider transforming the data or using the non-parametric equivalent
41
Q

when do post hoc tests need to be used?

A
  • when the F ratio comes out at a sig. level (p ≤ 0.05)
  • these tests reveal which of the possible pairs of data sets are sig. different
42
Q

You wish to establish if there is a difference in the average 50 metre sprint speed between rugby, football, cricket and hockey teams. What is the most appropriate parametric statistical test to answer this question?

A

One-way ANOVA

43
Q

How does one-way ANOVA determine whether there are significant differences between groups?

A
  • It calculates how much of the variance (variability) in the data can be explained by our model (experimental manipulation) and how much remains unexplained (residual).
  • The ratio between these is the F statistic
  • This is assessed against a critical value based on the degrees of freedom.
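The calculation described above can be sketched as follows. Note that each sum of squares is divided by its degrees of freedom to give a mean square before the ratio is taken. The function name and the data are invented for illustration.

```python
def one_way_anova_f(groups):
    """Return (F ratio, model df, residual df) for independent groups."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)

    ss_model = 0.0  # variability explained by group membership
    ss_resid = 0.0  # variability left unexplained within groups
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_model += len(group) * (group_mean - grand_mean) ** 2
        ss_resid += sum((x - group_mean) ** 2 for x in group)

    df_model = len(groups) - 1
    df_resid = len(all_scores) - len(groups)
    ms_model = ss_model / df_model
    ms_resid = ss_resid / df_resid
    return ms_model / ms_resid, df_model, df_resid


# Hypothetical scores for three independent groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_model, df_resid = one_way_anova_f(groups)
print(round(f_ratio, 2), df_model, df_resid)  # 13.0 2 6
```

Here the explained variance is 13 times the unexplained variance; the F ratio would then be compared with the critical value for 2 and 6 degrees of freedom.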
44
Q

How does the Bonferroni method control the familywise error rate?

A

By recalculating the critical alpha value as: α/n (where n is the number of pairwise comparisons)
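A minimal sketch of the adjustment, assuming every pair of groups is compared once. The function name is invented for illustration.

```python
def bonferroni_alpha(alpha, n_groups):
    """Return (adjusted alpha, number of pairwise comparisons)."""
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    # Number of distinct pairs among n_groups groups.
    n_comparisons = n_groups * (n_groups - 1) // 2
    return alpha / n_comparisons, n_comparisons


# Four groups give six pairwise comparisons.
adjusted_alpha, n_pairs = bonferroni_alpha(0.05, 4)
print(n_pairs, round(adjusted_alpha, 4))  # 6 0.0083
```

Each pairwise comparison must then reach p ≤ 0.0083, rather than p ≤ 0.05, to be treated as significant.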

45
Q

Why is it important to plot the data before undertaking simple linear regression?

A

To check if the data is actually linear/represents a straight line

46
Q

You have undertaken a simple linear regression to predict total body fat from the sum of five skinfold measurements and found a regression coefficient of 5.3. Explain what this means.

A

For every one-unit increase in the sum of five skinfolds (x), predicted total body fat (y) increases by 5.3 units

47
Q

State three ways to determine how good a fit a regression line is to the data.

A
  • Coefficient of determination
  • F Ratio
  • Hypothesis testing of model coefficient
48
Q

What does the coefficient of determination tell you (R^2)?

A

The percentage of the variation in the dependent variable that can be accounted for (explained) by the regression model variables.

49
Q

What is the intercept?

A
  • the point at which the regression line crosses the y-axis
  • the value of y when x = 0

50
Q

What is the regression coefficient?

A
  • the gradient/slope of the regression line
  • change in y with a one unit change in x
  • can be +ve or -ve
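The intercept, the regression coefficient and the coefficient of determination can be sketched together with ordinary least squares. The function name and the data are invented for illustration.

```python
def simple_linear_regression(xs, ys):
    """Return (slope, intercept, r_squared) from ordinary least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)

    slope = sxy / sxx                    # change in y per one unit change in x
    intercept = mean_y - slope * mean_x  # value of y when x = 0

    # R squared: proportion of the total variation explained by the line.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_resid / ss_total
    return slope, intercept, r_squared


# Hypothetical predictor (x) and outcome (y) values.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r_squared = simple_linear_regression(x, y)
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))  # 0.6 2.2 0.6
```

In this example y is predicted to rise by 0.6 for each one unit increase in x, the line crosses the y-axis at 2.2, and 60% of the variation in y is explained by x.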
51
Q

In an SPSS output, where would you find the intercept?

A

next to the constant (top line) underneath unstandardised coefficients

52
Q

In an SPSS output, where would you find the regression coefficient?

A

next to the predictor variable (bottom line) underneath unstandardised coefficients

53
Q

What does the adjusted R squared value tell you? (e.g. adjusted R square = 0.361)

A
  • how much of the variability can be explained by the model (SSm) compared to how much cannot be explained by the model (SSr)
  • e.g. 36.1% of variation in the … is accounted for by the ∑…
  • OR 63.9% of variation is unexplained
54
Q

When using ‘Hypothesis testing of model coefficients’, if the gradient of the slope is equal to zero, what does this mean?

A

There is no relationship between the variables

55
Q

During multiple linear regression with categorical data (rather than continuous), how are B2 and X2 interpreted?

A
  • B2 still interpreted as the difference in Y for one unit difference in X
  • However, X2 is coded as either 0 or 1, so a one unit difference represents switching from one category to the other
56
Q

What are the benefits (advantages) of using repeated measures research designs?

A

Improved sensitivity
- reduced unsystematic variance and more sensitive to experimental effects
Economy
- fewer participants needed

57
Q

Assumptions for repeated measures ANOVA to assume Sphericity

A
  • homogeneity of variance
  • homogeneity of covariance (equal or similar correlations between the groups of values for the DV)

58
Q

When performing a sphericity assessment (Mauchly’s test), what should you do if p > 0.05?

A

if p > 0.05, then sphericity is met and the ANOVA score is read from the “Sphericity assumed” row

59
Q

When performing a sphericity assessment (Mauchly’s test), what should you do if p < 0.05?

A
  • if p < 0.05, sphericity is violated and a corrected score must be read from either the “Greenhouse-Geisser” or the “Huynh-Feldt” rows
  • this adjusts the degrees of freedom to reduce the risk of making a type I error
60
Q

when looking at the interaction effects, what to do if there is no interaction effect?

A

post-hoc tests can be applied to each individual factor as a whole

61
Q

when looking at the interaction effects, what to do if there is a significant interaction?

A

post-hoc tests need to be applied to a factor for each level of the factor it interacts with

62
Q

Define the terms ‘mode’ and ‘median’

A

Mode - the most frequently occurring event/value in a data set
Median - the point which has half the values in a sample above and half below, i.e. the middle term in a data set that has been rank-ordered

63
Q

Identify the following statistical symbols

H1 & s^2

A

s^2 - sample variance

H1 - alternative hypothesis (the alternative to the null hypothesis)

64
Q

Describe the difference between ordinal and ratio data? Please provide examples of each.

A

Ordinal - can be rank-ordered, however the distances between ranks have no meaning, e.g. race positions, army ranks
Ratio - the distance between units has meaning, it is possible to construct meaningful fractions due to true absolute zero e.g. height, weight

65
Q

Provide 2 measures of variability

A

Any two of: range, standard deviation, variance

66
Q

Calculate the variance for the following data stating the equation used and showing all workings;
x: 2 3 8 3 4
Deviation from the mean (x - x̅): -2 -1 4 -1 0
Deviation squared (x - x̅)^2: 4 1 16 1 0

A
(equation) s^2 = ∑(x - x̅)^2 / (n - 1)
x̅ = 20 ÷ 5 = 4
Sum of squared deviations ∑(x - x̅)^2 = 22
Divide by n - 1 = 5 - 1 = 4
s^2 = 22 ÷ 4 = 5.5
67
Q

Outline the difference between descriptive and inferential statistics

A

Descriptive - used to organize, summarise and describe measures of a sample. No prediction or inferences are made regarding population parameters.

Inferential - used to infer or predict population parameters from sample measures, used to test hypotheses

68
Q

Approximately what percentage of the area of a normal distribution falls within ±3 standard deviations of the mean?

A

99.7%
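This figure can be checked with the error function from Python's standard library, since the area within ±k standard deviations of the mean of a normal distribution equals erf(k/√2). The function name is invented for illustration.

```python
import math


def coverage_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(coverage_within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The same calculation gives the other two figures usually quoted alongside it: about 68% within ±1 SD and about 95% within ±2 SDs.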

69
Q

Probability as a number can lie between which two values?

A

0 and 1

70
Q

A sample of 10 year old school children have had percentage body fat measurements taken using bio-electrical impedance analysis. You wish to establish if there is a difference in the average percentage body fat between boys and girls. Assuming assumptions for parametric tests are met, what is the appropriate statistical test?

A

Independent t test

71
Q

Define what is meant by a P value of 0.01.

A

If the null hypothesis is true, there is a 1% probability of obtaining a result at least as extreme as the one observed

72
Q

When would you use a paired t-test?

A

Looking for a difference in the same group at two time points

73
Q

You have undertaken a Pearson’s correlation between X and Y and found an r value of -0.67. What does this correlation coefficient tell you about the relationship between X and Y?

A

It is a strong negative relationship

74
Q

Identify the following statistical symbols
μ
σ
σ^2

A

μ - population mean
σ - population SD
σ^2 - population variance

75
Q

Identify the following statistical symbols
N
n

A

N - number of sampling units in a population

n - number of sampling units in a sample

76
Q

Identify the following statistical symbols
z
H0
x̅

A

z - SD unit of a normal curve
H0 - null hypothesis
x̅ - sample mean