Quantitative research methods Flashcards

1
Q

What’s P hacking?

A

Manipulating data or analyses until you obtain a significant p value (e.g. selectively reporting outcomes or trying multiple tests)

2
Q

What’s a proxy measure?

A

An indirect measure used when you can't measure the variable of interest directly, so you measure related variables instead

3
Q

What’s Harking?

A

Hypothesising after the results are known

4
Q

What’s publication bias?

A

Studies that don't find a significant difference are hard to get published, which skews the literature towards positive results

5
Q

What’s a type 1 error?

A

If you find a significant difference where none should exist

Incorrect rejection of a true null hypothesis

6
Q

What's the Nuremberg code?

A

Informed consent is essential

Research should be based on prior animal work

Risks should be justified by the benefits

Research should be conducted by qualified scientists

Physical and mental suffering should be avoided

Research that could result in death or disabling injury shouldn't be done

7
Q

What's a type II error?

A

When there is a real difference but you fail to find it (failing to reject a false null hypothesis)

8
Q

What’s a cross sectional design?

A

Comparing different groups' performance at a single time point

9
Q

What are longitudinal studies?

A

Comparing the same group's performance at different time points

10
Q

What’s observational research?

A

Correlation
Linear regression
Multiple regression

Useful for establishing relationships between variables, but it is difficult to infer whether there is an actual cause-and-effect relationship

11
Q

How to establish cause and effect?

A

The dependent variable should vary only in response to changes in the independent variable; everything else must be held constant or controlled

12
Q

What does nominal mean?

A

Numbers used to distinguish amongst objects without quantitative value

13
Q

What does ordinal mean?

A

Numbers used only to place objects in order

14
Q

What does interval mean?

A

Scale on which equal intervals between objects represent equal differences (no true zero)

e.g. Celsius

15
Q

What does ratio mean?

A

Scale with a true zero point, so ratios are meaningful. Ratio scales are often common physical measures such as length, volume and time.

16
Q

What’s Quasi experimental design?

A

Treatment group compared to a control group (static group comparison)

No random assignment, so it is difficult to ensure baseline equivalence

17
Q

True experimental gold standard approach?

A

Random assignment for treatment and control group

18
Q

Effective research design?

A

Maximise systematic variance (driven by the independent variables)

Minimise error (random) variance

Identify and control for confounding variables

19
Q

How to Maximise systematic variance?

A

Proper manipulation of experimental conditions will ensure high variability of independent variables.

20
Q

How to Minimise error (random) variance?

A

Reducing the part of the variability that is caused by measurement error.

21
Q

What are nuisance variables?

A

Variables that produce undesired variation in the dependent variable

Fixed by:

Conduct experiment in a controlled environment

Larger samples – randomly assign your subjects to different conditions

With small samples match your subjects on all
demographic variables across conditions

22
Q

What are placebo and demand effects?

A

Some portion of effect due to the participant’s belief in the intervention

Participants want to please the experimenter.

Controlled by:

Good control conditions

Keeping the subjects 'blind'

Keeping the purpose of the study hidden from the participant

If possible, disguising the independent variable

Sometimes this is difficult to balance with the ethics of participant recruitment and informed consent

23
Q

What’s central tendency?

A

Describes measures of the centre of a distribution:

Mean, median, mode

mean = average value (often the best measure, as it uses all values)

median = middle value

mode = most frequent recurring value

24
Q

Advantages and disadvantages of the mean average?

A
Disadvantages:
• Can be influenced by extreme values.
• Is affected by skewed distributions.
• Can only be used with interval or ratio data.
Advantages:
• Uses every value in the data set.
• Tends to be very stable across different samples (enabling us to compare samples).
25
What does bimodal mean?
When there are 2 values for the mode; multimodal is when there are more than 2
26
Advantages and disadvantages of the mode?
Advantages:
* Easy to determine
* Not affected by extreme values
Disadvantages:
* Ignores most of the values in a data set
* Can be influenced by a small number of values in the data set
27
Advantages and disadvantages of median?
Advantages:
* Not affected by extreme values
* Not affected by skewed distributions (where the majority of the data is at one end of the scale)
* Can be used with ordinal, interval and ratio data
Disadvantages:
* Ignores most of the values in a data set
28
What's range?
Largest value - smallest value. Greatly affected by extreme values.
29
How to work out lower quartile, second quartile and upper quartile?
Find the median of the data set (the second quartile), then find the medians of the values below and above it (the lower and upper quartiles)
30
Interquartile range?
Upper quartile - lower quartile. Good because it is not affected by extreme values, but you are discarding half your data set.
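The median-of-halves method on these cards can be sketched in Python with made-up data (note that software such as NumPy uses interpolation-based quartile definitions, which can give slightly different answers):

```python
# Hypothetical worked example of the quartile method described above:
# find the median (Q2), then the medians of the lower and upper halves.

def median(values):
    """Middle value of a sorted list (mean of the middle two if even)."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    s = sorted(values)
    n = len(s)
    lower = s[: n // 2]           # values below the median
    upper = s[(n + 1) // 2 :]     # values above the median
    return median(lower), median(s), median(upper)

data = [1, 2, 3, 4, 5, 6, 7]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3, q3 - q1)  # 2 4 6 4  (interquartile range = 4)
```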
31
How to calculate sum of squared errors?
Find the deviance of each value from the mean (how far each value is from the mean), square each deviance, then sum them all up
32
What's population variance?
The sum of squared errors divided by the number of values (N); used when you have data for the whole population
33
What's sample variance?
The sum of squared errors divided by the degrees of freedom (n - 1); used when estimating the population variance from a sample
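The three cards above (sum of squared errors, population variance, sample variance) can be sketched in Python with made-up data, cross-checked against the standard library:

```python
# A minimal sketch: sum of squared errors, population variance
# (divide by N) and sample variance (divide by n - 1).
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)               # 5.0
ss = sum((x - mean) ** 2 for x in data)    # sum of squared errors: 32.0
pop_var = ss / len(data)                   # population variance: 4.0
samp_var = ss / (len(data) - 1)            # sample variance: ~4.571

# Cross-check against Python's statistics module
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(samp_var, statistics.variance(data))
print(ss, pop_var, round(samp_var, 3))  # 32.0 4.0 4.571
```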
34
Square root of variance = ?
Standard deviation
35
SD values?
For a normal distribution, 68% of values fall within 1 SD of the mean, 95% within 2 SD and 99.7% within 3 SD
36
Standard error =?
standard deviation / square root of the sample size
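The two cards above (SD as the square root of the variance, and standard error as SD divided by the square root of n) in a small made-up example:

```python
# Sketch: standard deviation and standard error from a sample.
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n
samp_var = sum((x - mean) ** 2 for x in data) / (n - 1)

sd = math.sqrt(samp_var)    # standard deviation = sqrt(variance)
se = sd / math.sqrt(n)      # standard error = SD / sqrt(n)

print(round(sd, 3), round(se, 3))  # 2.138 0.756
```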
37
Normal distribution?
Bell curve shape
38
What's positive skew?
The tallest bars are at the lower end of the scale (the tail points towards the higher end)
39
What's negative skew?
Tallest bars at the higher end of the scale (the tail points towards the lower end)
40
What's kurtosis?
How pointy the distribution is.
A distribution with a lot of values in the tails (a heavy-tailed distribution) is usually pointy; this is called a leptokurtic distribution and is said to have positive kurtosis.
A distribution that is thin in the tails (has light tails) is usually flatter than normal; this is called a platykurtic distribution and is said to have negative kurtosis.
If a data set has a normal distribution we call it a mesokurtic distribution.
41
How to know if a distribution is skewed or kurtotic?
If the skewness statistic is more than twice its standard error, the distribution is skewed. The same rule applies for kurtosis.
42
Test to see if population is normal?
Kolmogorov-Smirnov test and Shapiro-Wilk test. If the result is non-significant (p > 0.05) the data are normal; if significant, they are not. Shapiro-Wilk is better for small sample sizes.
43
What can you do if the data is not normal?
Remove outliers - use a stem and leaf plot. This is done with a standard-deviation or percentage-based rule. Perform a non-parametric version of the statistical tests you want to do. Collect more data - larger samples are more likely to be normally distributed.
44
Independent t test?
Compares 2 independent groups. A large t value (above 1) suggests the groups are different; below 1 suggests they are similar.
t = (mean 1 - mean 2) / sqrt(SEM1^2 + SEM2^2)
where SEM is the standard error of the mean for each sample
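The formula on this card, t = (mean1 - mean2) / sqrt(SEM1² + SEM2²), sketched with made-up groups:

```python
# Sketch of the independent t-test formula using the SEMs of two samples.
import math

def sem(sample):
    """Standard error of the mean: sample SD / sqrt(n)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return sd / math.sqrt(n)

group1 = [10, 12, 11, 13, 12, 14]
group2 = [8, 9, 10, 9, 8, 10]

m1 = sum(group1) / len(group1)
m2 = sum(group2) / len(group2)
t = (m1 - m2) / math.sqrt(sem(group1) ** 2 + sem(group2) ** 2)
print(round(t, 2))  # well above 1, suggesting the groups differ
```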
45
Sample t test?
One group compared across 2 conditions. A large t value (above 1) suggests the conditions differ; below 1 suggests they are similar.
46
Assumptions for a t test?
* Data is continuous (either interval or ratio data).
* Both groups are randomly drawn from the population (they are independent from each other).
* The data for each group is normally distributed.
* There is homogeneity of variance between the samples; in other words, each of the samples comes from a population with the same variance.
47
How to measure homogeneity?
Levene's test. If p is more than 0.05, equal variances are assumed (use the top line of the SPSS output); if below 0.05, equal variances are not assumed (use the bottom line).
48
For an independent t test you need to report?
The t value
Degrees of freedom
The exact significance value (p)
Mean and standard deviation/standard error
Mean difference and confidence interval range
Findings of Levene's test
e.g. Levene's Test for Equality of Variances revealed that there was homogeneity of variance (p = 0.18). Therefore, an independent t-test was run on the data with a 95% confidence interval (CI) for the mean difference. It was found that creativity in the first-time (42.15 ± 8.38) contestants was significantly higher than in returning (37.94 ± 7.41) contestants (t(66) = 2.20, p = 0.03), with a difference of 4.21 (95% CI, 0.38 to 8.03).
49
Equation for t value of a paired t test?
t = d(bar) / (SD / sqrt(n))
d(bar) = the mean of the differences between each individual's scores in the two test conditions
SD / sqrt(n) = an estimate of the variability of mean differences between scores in the population
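The paired formula on this card, t = d̄ / (SD / √n), sketched with made-up before/after scores:

```python
# Sketch of the paired t-test: work on the per-participant differences.
import math

before = [5, 6, 7, 5, 6, 8]
after = [7, 8, 8, 6, 9, 10]

diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
d_bar = sum(diffs) / n                                   # mean difference
sd = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
t = d_bar / (sd / math.sqrt(n))                          # d-bar / (SD / sqrt(n))
print(round(t, 2))
```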
50
Assumptions for a paired t test?
* The data is continuous (interval or ratio data)
* The differences between the samples are normally distributed
51
Reporting the output of a paired t test?
* The t value (t)
* The degrees of freedom (df)
* The exact significance value (p)
* The mean and standard deviation/standard error for each group
* The mean difference and the confidence interval range
The format of the test result is: t(df) = t-statistic, p = significance value.
e.g. Participants were less mischievous when not wearing the invisibility cloak (3.75 ± 1.91) than when wearing the cloak (5.00 ± 1.65). There was a statistically significant reduction in mischief of -1.25 (95% CI -1.97, -0.53) when participants weren't wearing the invisibility cloak, t(11) = -3.80, p = 0.01.
52
How is one tailed different to 2 tailed?
One-tailed means you only test in one direction (e.g. is it bigger than, or is it smaller than); two-tailed tests both directions. You should almost always use two-tailed tests.
53
Method for calculating the variance for ANOVA? (I think it's the same for everything)
Work out the mean of the data set. Subtract the mean from each value in the data set to get the 'distance' of each value from the mean (deviance). Square each deviance so they are all positive. Add them together to get the sum of squares. Divide by the degrees of freedom (n - 1) to get the sample variance, or divide the sum of squares by the number of values in the data set to get the population variance. (The easiest way to do this is to create a table.)
54
Good way to examine variance?
Looking at error bars on a graph - show the standard deviation which is the square root of the variance
55
What other bars can be shown on graph?
Standard error bars - the standard deviation divided by the square root of the sample size. The standard error is smaller than the standard deviation and accounts for the size of the data set (because n is in the equation).
56
When to use standard deviation as your error bars?
If your assumptions of normality are met and you are interested in exploring the spread and variability of the data, then the standard deviation is the more useful value. Essentially, the standard deviation tells us the range within which we expect to observe values.
57
When to use standard error as your error bars?
If you are interested in the precision of the mean you have calculated, or in comparing and testing differences between your mean and the mean of another data set, then the standard error is more useful. The standard error gives us information about the range in which a statistic is expected to vary.
58
What's the null hypothesis? (H0)
always that there is no difference between groups/conditions and that any difference between the values of the data are caused by random noise and are nothing to do with the intervention. Your experiment will always be seeking to disprove the null hypothesis.
59
What's the alternative hypotheses? (H1)
Your alternative hypothesis (experimental hypothesis) is always that there is a difference or a relationship between the groups or conditions in your experiment and that any differences between the data sets are real and not caused by a random effect. This is essentially what you think is likely to happen in your experiment based on previous literature.
60
What's a type 1 error?
When we reject a null hypothesis that is actually true. The level of type 1 error we are willing to risk is called our alpha (α). Typically we set our alpha to 0.05, meaning type 1 errors will occur 5% of the time.
61
Why is a p value of <0.05 significant?
a good trade-off between not being incorrect too often while still having a chance of finding something which is actually there
62
What's experimental power?
Experimental power is your ability to detect an effect of a certain size. If your study is underpowered you will be unlikely to be able to detect small effects. This is a problem because in reality most effects you are likely to examine are probably pretty small, given that you are examining slight adaptations of established paradigms. The easiest way to ensure you have enough power is to test plenty of people. There is no such thing as ‘too much power’!
63
What is a type 2 error?
A type 2 error is the failure to reject a false null hypothesis. In other words, a type 2 error often causes us to conclude that an effect or relationship doesn't exist when in fact it does. The level of type 2 error we are willing to risk is called our beta (β). If we power our study to 0.8 (the common value used), our beta is 0.2 and we have a 20% chance of a type 2 error.
Can occur due to:
o Too small a sample to detect an effect of a certain size
o Inadequate variability within the independent variable
o Measurement error
o Nuisance variables
64
What are null findings?
In isolation, p values are unable to demonstrate that there are no differences between conditions. A p value of >0.05 does not allow you to accept the null hypothesis, tell you a manipulation has no effect, or tell you that the scores in the two conditions are the same as one another. In other words, a lack of a significant p value does not enable you to conclude that your intervention had no effect. Under the null hypothesis, p values are uniformly distributed: if there is no difference between your groups, you are just as likely to get a p value of 0.001 as you are to get a p value of 0.97.
65
Sources of measurement error?
* Inadequate measurement instruments.
* Response error from participants.
* Contextual factors which impair/interact with the measurement instrument's ability to measure accurately or the subject's ability to perform normally (e.g. construction work outside the lab).
66
Benefits of null findings?
They enable us to provide evidence which contradicts the current literature base They also enable us to replicate previously described effects (the ability to replicate a result is very important in science).
67
What's the family wise error?
The familywise error rate is the rate of type 1 errors across a set of tests. It is usually 5%; however, it increases with the number of tests conducted.
Familywise error = 1 - (1 - α)^n
n = number of tests
α = alpha (usually 0.05)
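The formula on this card, 1 - (1 - α)^n, evaluated for a few numbers of tests:

```python
# Quick check of the familywise error formula for several test counts.
alpha = 0.05
for n in (1, 5, 10, 20):
    fwe = 1 - (1 - alpha) ** n
    print(n, round(fwe, 3))
# With 10 tests, the chance of at least one type 1 error is ~40%,
# even though each individual test uses alpha = 0.05.
```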
68
How to account for increased error rate?
* Bonferroni
* Tukey's HSD
* Holm
* Scheffe
However, all of these corrections reduce your ability to detect a true effect, i.e. they lower your experimental power and increase your chance of making a type 2 error.
69
Describe the ANOVA test?
It's the analysis of variance. It compares the type of variance you want (effect variance) with the type of variance you do not want (error variance). This ratio of wanted (effect) to unwanted (error) variance is called the F-ratio.
70
Different word for variance?
Mean square
71
Describe one way independent anova?
The alternative to running multiple independent t-tests is to run a one-way independent ANOVA. This test allows you to compare as many groups as you want in a single test without inflating the familywise error rate or reducing your experimental power. The one-way independent ANOVA examines the ratio of the variance between your groups to the variance within your groups: MSbetween / MSwithin. Significant effects occur when you have at least four times as much variance between your groups as you do within your groups. You also have to test whether the data are normal and test for homogeneity of variance.
72
Assumptions for a one way independent anova?
* The samples are unrelated to one another (independent).
* The data is interval/ratio scale.
73
Describe the F ratio?
When the treatment/manipulation creates little variability in scores, the ANOVA is expected to produce an F statistic close to 1. A larger F statistic may indicate a significant difference (depending on the degrees of freedom). The F value can never be negative and is almost never less than 1. The ratio is calculated in the following way:
F = (effect + error) / error
To see if the F ratio is significant you need a critical-value table with the degrees of freedom:
Between-groups df = g - 1 (where g is the number of groups)
Within-groups df = N - g (where N is the number of scores in the entire study)
Use these to find the critical value; if the F ratio is larger, it is significant at the 0.05 level.
74
What's error variance?
The error variance is also known as measurement error. Measurement error is always present because variables cannot be measured perfectly. Another factor that causes error variance is individual differences, since scores naturally vary from person to person. For the one-way independent ANOVA, the error variance (MSE) always occurs within each group (MSwithin).
75
How to actually work out the F ratio?
Determine the mean of each group.
Subtract the group mean from each individual value in that group.
Square those values.
Add all the squared distances together to get the sum of squares within (SSwithin).
Now find the between-groups sum of squares:
Calculate the sum of the means.
Square each mean and add the squares together.
Then use the equation SSm = sum of (means^2) - ((sum of the means)^2 / number of groups).
Then multiply this number by the number of people in each group.
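The steps above can be sketched with made-up groups, here using the equivalent "n per group times squared distance of each group mean from the grand mean" form for the between-groups sum of squares:

```python
# Worked sketch of a one-way independent ANOVA F ratio by hand.
groups = [
    [1, 2, 3],   # group means: 2, 3, 7; grand mean: 4
    [2, 3, 4],
    [6, 7, 8],
]

def mean(xs):
    return sum(xs) / len(xs)

grand_mean = mean([x for g in groups for x in g])

# Sum of squares within each group
ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)

# Sum of squares between groups
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

g_count = len(groups)
n_total = sum(len(g) for g in groups)
df_between = g_count - 1          # g - 1
df_within = n_total - g_count     # N - g

f_ratio = (ss_between / df_between) / (ss_within / df_within)
eta_sq = ss_between / (ss_between + ss_within)  # effect size
print(f_ratio, round(eta_sq, 3))  # 21.0 0.875
```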
76
Describe effect size?
= SSbetween / (SSbetween + SSwithin)
This is the proportion of variability in the dependent variable explained by the independent variable
77
What's a post Hoc test?
Following a one-way independent ANOVA you can perform pairwise post hoc comparisons to determine which pairs of means are different from one another. Normally you select the Bonferroni correction in SPSS.
78
Different types of ANOVA?
One independent variable:
Between subjects (different participants)
Repeated measures / within subjects (same participants)
More than one independent variable: two-way, three-way, four-way
79
Essentially what do p values show?
They only tell us the probability of obtaining this result by chance (if the null hypothesis were true)
80
What's effect size?
Tells us how large the effect of our experimental manipulation was, and allows standardisation. In ANOVA it's partial eta squared: the proportion of variability in the DV explained by the IV, = SSbetween / (SSbetween + SSerror). But it is biased.
0.01 = small effect
0.06 = medium effect
0.14 = large effect
81
Describe the Bonferroni test more?
The most conservative correction. Directly counteracts inflated familywise error by calculating a new alpha: divide the original alpha (0.05) by the number of tests conducted. Some argue it overcorrects for inflated type 1 error and sacrifices too much power. Avoid it when you have a lot of conditions.
82
Describe the TUKEY HSD?
Honest significant difference. Calculates a new critical value. Less conservative than Bonferroni.
83
Describe LSD test?
Least significant difference. Makes no attempt to correct for multiple comparisons; equivalent to performing multiple t-tests on the data. Not good.
84
Describe Games-Howell test?
Used when you have unequal variances between groups
85
When do we want to use repeated measures anova?
You have to recruit fewer subjects for the same experimental power, and you account for individual differences
86
When you can't use repeated measures ANOVA?
Gender
Special populations
When undertaking condition 1 makes undertaking condition 2 a waste of time
If the study is too long or too boring
87
Is the F ratio the same for repeated measures ANOVA?
Yes. It still requires interval/ratio data and normally distributed data, plus sphericity - the variances of the differences between each pair of conditions should be equal. Sphericity is tested via Mauchly's test, which runs automatically when we run a repeated measures ANOVA in SPSS.
88
What happens if I violate sphericity?
Adjust the degrees of freedom, so the p value associated with the F ratio is larger. Done via the Greenhouse-Geisser correction.
89
How to calculate repeated measure ANOVA by hand?
The F value represents the ratio of wanted (effect or model) variance to unwanted (error or residual) variance. The difference from the independent ANOVA is that all variance is within participants.
First calculate the mean for each condition, the mean for each participant, and the variance of each participant's scores across the conditions.
Getting the F:
Step 1 - Calculate the total variance (SSt):
Calculate the grand mean (every single value in the study accounted for).
Calculate the grand variance (subtract the grand mean from each value in the data set, square these distances, sum them, then divide by the degrees of freedom).
Calculate the df (N - 1).
Then SSt = grand variance x df.
Step 2 - Calculate the within-participant variance (SSw):
Take each participant's variance, multiply it by the number of conditions minus 1, and sum across participants to get SSw.
Step 3 - Calculate the effect (model) variance (SSm):
SSm = sum over conditions of (number of participants per condition x (mean of that condition - the grand mean)^2).
Step 4 - Calculate the error (residual) variance:
SSr = SSw - SSm.
Working out the degrees of freedom (n = number of participants, k = number of conditions):
dft = kn - 1
dfw = n(k - 1)
dfm = k - 1
dfr = (n - 1)(k - 1)
The ones from steps 3 and 4 are used to see if F is significant.
Calculate the mean squares effect variance: MSm = SSm / dfm.
Then calculate the mean squares error variance: MSr = SSr / dfr.
Then calculate the F ratio: F = MSm / MSr.
Then look in the table with dfm and dfr; if our F value is larger than the critical value, it is significant.
Report in the same way as a one-way ANOVA: F(df) = F value, p value, effect size, mean square error. Also report Mauchly's test of sphericity and which correction, if any, was applied.
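The repeated-measures steps above, sketched with a made-up data matrix (rows = participants, columns = conditions); the per-participant sums of squares here are equivalent to the "participant variance times (k - 1)" step:

```python
# Sketch of a repeated-measures ANOVA F ratio by hand.
scores = [
    [1, 3, 5],   # participant 1
    [2, 3, 7],   # participant 2
    [3, 3, 9],   # participant 3
]
n = len(scores)          # number of participants
k = len(scores[0])       # number of conditions

def mean(xs):
    return sum(xs) / len(xs)

grand_mean = mean([x for row in scores for x in row])
cond_means = [mean([row[c] for row in scores]) for c in range(k)]

# Step 2: within-participant sum of squares
ss_w = sum(sum((x - mean(row)) ** 2 for x in row) for row in scores)

# Step 3: effect (model) sum of squares
ss_m = sum(n * (cm - grand_mean) ** 2 for cm in cond_means)

# Step 4: residual sum of squares, then the F ratio
ss_r = ss_w - ss_m
df_m = k - 1
df_r = (n - 1) * (k - 1)
f_ratio = (ss_m / df_m) / (ss_r / df_r)
print(ss_w, ss_m, ss_r, f_ratio)  # 46.0 42.0 4.0 21.0
```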
90
Assumptions for factorial ANOVA?
Interval/ratio scale (like one-way)
Data normally distributed (like one-way)
Sphericity - the assumption that the variances of the differences between conditions are equal
91
In the exam what SPSS questions will be asked?
How to interpret the SPSS output of a RM ANOVA and write up results of the ANOVA in APA style
92
How to report APA style? (I think)
If Mauchly's test was failed, report the Greenhouse-Geisser corrected degrees of freedom and p values. An example for an independent variable: F(1.6, 11.2) = 3.79, p = .06, np^2 = 0.35, MSE = 13.71, i.e. F(Greenhouse-Geisser degrees of freedom, error Greenhouse-Geisser degrees of freedom) = F value, p = significance value, np^2 = partial eta squared, MSE = mean square error. Only report pairwise comparisons (e.g. Bonferroni) if you have a significant ANOVA.
93
Example of how you would report a non significant repeated measures anova?
A one-way repeated-measures ANOVA was conducted with the dose of protein as the independent variable and amount of exercise as the dependent variable. Mauchly's test suggested that sphericity had been violated, so the Greenhouse-Geisser correction was used for the ANOVA. The effect of dose was not significant at the p < .05 level, F(1.6, 11.2) = 3.79, p =.06, 𝜂_𝑝^2 = .35, MSE = 13.71. In summary, we found no evidence that the type of protein taken had an effect on exercise performance.
94
Example of how to report a significant repeated measures ANOVA?
A one-way repeated-measures ANOVA was conducted with the time of season as the independent variable and agility as the dependent variable. Mauchly's test suggested that sphericity had not been violated, so no correction was used for the ANOVA. The effect of time of season was significant at the p < .05 level, F(2, 14) = 4.16, p =.04, 𝜂_𝑝^2 = .37, MSE = 8.69. However, Bonferroni corrected pairwise comparisons revealed no differences between preseason (M = 9.88, SD = 7.10), midseason (M =12.13, SD = 7.79) and postseason (M = 7.88, SD = 4.36).
95
What does a factorial anova allow?
Lets you look at how the variables interact with one another
96
Terminology of a factorial ANOVA?
Independent variable = what you have manipulated (ANOVA requires 3 or more different groups or conditions)
Factorial designs = allow you to manipulate several variables at the same time
In the context of factorial ANOVA, your independent variables are known as factors
Each condition within a factor is known as a level
97
Describe factors and levels more?
Some factors have a fixed number of levels (conditions), e.g. gender has 2. Other factors can have as many levels as you want, as you can be as precise as you want, e.g. age or height.
98
Benefit of factorial designs?
Factorial designs allow us to look at how variables interact with one another. Interactions show that the effects of one IV might depend on the effects of another, and are often more interesting than the IVs by themselves.
99
What is 'manipulate multiple between-subject variables' analysed by?
Independent factorial ANOVA
100
What is manipulate 'multiple within-subject variables' analysed by?
Repeated measures factorial ANOVA
101
What is 'manipulate a combination of both' analysed by ?
Mixed factorial ANOVA
102
For a factorial ANOVA the amount of independent variables changes the?
Name, e.g. 2-way ANOVA
103
How to write out a factorial ANOVA?
3 x 2 independent factorial ANOVA. This shows that factor 1 has 3 levels and factor 2 has 2 levels.
104
What is the main effect of a mixed factorial ANOVA?
When you find a change in your DV due to one of your IVs
105
What is an interaction of a mixed factorial ANOVA?
When the size of this change (main effect) depends on one of the other factors. Represented by the difference in marginal means being greater than 1.
When looking at plotted data: lines at different heights or sloped lines represent an effect, but parallel lines show no interaction.
An interaction can also be found by looking at the values and seeing if something increases more with the other variable added than it does by itself, i.e. if the effect of the intervention is greater than the sum of its parts.
106
If you need to interpret SPSS outputs of a mixed factorial ANOVA, look at the seminar slides from week 7.
ok
107
What's correlation?
Tells us how big a relationship is and in what direction
108
What's regression?
Tells us if one variable can predict another variable. Quantifies what proportion of the variance in the dependent variable can be explained by the variance in the independent variable.
109
What is the line of best fit?
Line that minimises the sum of squares of the residuals (residual = vertical distance between line and the dot)
110
What's the correlation coefficient (r)?
A number between -1 and 1 that shows how linearly related two variables are. Reported to 2 d.p.
111
What does R2 (regression value) tell you?
If r = 0.55 then r^2 = 0.3, so 30% of the variance in the dependent variable is explained by the independent variable and the other 70% is unexplained
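Computing r (and hence r²) for made-up data with the standard Pearson formula:

```python
# Sketch: Pearson correlation coefficient r and the regression value r^2.
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = sum(x) / len(x), sum(y) / len(y)
num = sum((a - mx) * (b - my) for a, b in zip(x, y))
den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                sum((b - my) ** 2 for b in y))
r = num / den
print(round(r, 2), round(r ** 2, 2))  # 0.77 0.6
```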
112
Regression equation we use now?
Y = b0 + b1X1
Y is the dependent variable
X1 is the independent variable
b0 is the intercept on the Y axis
b1 is the gradient or regression coefficient
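Fitting Y = b0 + b1X1 by least squares on made-up data, using the closed-form solution (b1 = covariance of X and Y divided by the variance of X):

```python
# Sketch: simple linear regression coefficients from first principles.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]          # exactly y = 1 + 2x

mx = sum(x) / len(x)
my = sum(y) / len(y)
b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
      / sum((a - mx) ** 2 for a in x))   # gradient (regression coefficient)
b0 = my - b1 * mx                        # intercept on the Y axis

print(b0, b1)             # 1.0 2.0
print(b0 + b1 * 5)        # predicted Y for X = 5 -> 11.0
```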
113
What does Std error of the estimate (SEE = SD of residuals) predict?
How accurately our prediction equation predicts. Multiply the SEE by 1.96 to get a range within which there is a 95% chance that the real value will fall. Write it out like this: 95% of the real ability ratings will fall within +/- 10.1 units of the prediction line (for example).
114
How to report correlation example?
There was a significant, moderate-to-strong, negative correlation between body fat% in childhood and CV health in adulthood (r = -0.55, p = 0.03). Higher levels of body fat were related to lower CVhealth
115
How to report regression example?
Childhood fat% explained 30% of the variance in adult CVhealth. For every 1unit increase in childhood fat%, adult CVhealth decreases by 0.42 units (R^2 = 0.3, b1 = -0.42, p= 0.03)
116
Multiple regression equation?
Y = b0 + b1X1 + b2X2 ... Same as before you just have more independent variables and hence more regression coefficients
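A sketch of multiple regression Y = b0 + b1X1 + b2X2 on made-up data, solving the normal equations (X'X)b = X'y with a small hand-rolled Gaussian elimination (real analyses would use a statistics package):

```python
# Sketch: multiple regression coefficients via the normal equations.
def solve(a, b):
    """Solve the linear system a*x = b (naive Gaussian elimination)."""
    m = [row[:] + [v] for row, v in zip(a, b)]
    size = len(m)
    for i in range(size):
        pivot = max(range(i, size), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, size):
            factor = m[r][i] / m[i][i]
            m[r] = [v - factor * w for v, w in zip(m[r], m[i])]
    x = [0.0] * size
    for i in reversed(range(size)):
        x[i] = (m[i][-1] - sum(m[i][j] * x[j]
                               for j in range(i + 1, size))) / m[i][i]
    return x

x1 = [1, 2, 3, 4]
x2 = [2, 1, 4, 3]
y = [9, 8, 19, 18]        # exactly y = 1 + 2*x1 + 3*x2

rows = [[1.0, a, b] for a, b in zip(x1, x2)]    # design matrix with intercept
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(3)]

b0, b1, b2 = solve(xtx, xty)
print(round(b0, 6), round(b1, 6), round(b2, 6))  # 1.0 2.0 3.0
```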
117
In multiple regression, what does R squared show now?
How much variance is explained by all the independent variables together
118
How to find out which independent variable in multiple regression is most important?
The one with the most significant p value
119
How to report multiple regression test?
The 'dribbling', 'set shot', 'vertical jump' and 'drive and lay-up' tests were each significantly associated with the coach-assessed ability rating when analysed separately. When all four tests were entered into a model at the same time, together they explained 74% of the variation in the coach-assessed ability rating (R2 = 0.74, p < 0.001). However, only the 'drive and lay-up' test was significant in this multiple regression model (p < 0.001), explaining 70% of the variation on its own. None of the other three tests could explain a significant amount of the unexplained variance. The 'drive and lay-up' test is the only one they need. Only the things that are significant go into the regression equation.
120
Most of the exam content is on multiple regression, which is explained below.
ok
121
What does multiple regression do again?
Identifies which of a set of independent variables predict the variance in a dependent variable
122
Null hypotheses for a multiple regression?
(Basketball example) Null hypothesis: the set of basketball tests will not explain significant variance in ability ratings. Alternative hypothesis: the set of basketball tests will explain significant variance in ability ratings.
123
What is stepwise multiple regression?
SPSS selects which variables are entered. SPSS identifies the predictor variable (IV) which explains the most variance in the DV and puts that in the model. Then, the IV which explains the most of the remaining unexplained variance is added to the model, provided the amount it explains is statistically significant. This process is repeated until there are no IVs left that would explain further variance. IVs in the model can be taken out if they become clearly no longer significant (p > 0.1) due to the addition of new variables. Gives you an R squared value for each step of adding variables.
Problems: data driven, not theory driven; researchers choose a list of variables they THINK might be predictive.
124
What is Hierarchical multiple regression?
Experimenter decides the order in which variables are entered
125
What is forced entry multiple regression?
All predictors are entered simultaneously. Problem: this does not determine the unique variance each IV adds to the model. Remove all variables with a p value greater than 0.1 and then re-run the regression
126
Model answer for a stepwise regression?
Thigh skinfold explained 92.1% of the variation in DXAfat%. Adding calf skinfold to the model explained a further 1.3% of the variation (p=0.005, i.e. p<0.01). Adding iliac crest skinfold explained a further 1.0% (p=0.008, i.e. p<0.01). The final step of the model-building process, adding gender, explained a further 1.4% (p=0.001, i.e. p<0.01). Therefore the final model explained 95.9% of the variation in DXAfat%
127
What happens if you violate assumptions?
Violating assumptions risks a type 1 or type 2 error, which is why the assumption checks need to be done
128
Multiple regression assumptions?
No multicollinearity between IVs in the model (bottom right box): the largest VIF should be below 10; if two IVs have VIFs above 5, remove one of them; the average VIF should be about 1. IVs must also not be highly correlated with each other (above r=0.8 or 0.9). Independence of residuals: the Durbin-Watson statistic ranges from 0 to 4, with 2 meaning the errors are uncorrelated, so values less than 1 or greater than 3 are problematic. Homoscedasticity of residuals: dots randomly and evenly dispersed around 0; you don't want funnelling or fanning. Linearity of residuals: should lie roughly along the horizontal line. Normality of residuals: assessed visually using a histogram; mean = 0, SD = 1
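The collinearity check can be sketched for the two-IV case, where each predictor's VIF reduces to 1 / (1 - r^2), with r the Pearson correlation between the two IVs (illustrative data, not from the flashcards).

```python
# Collinearity check sketch for a two-predictor model.

def pearson_r(x, y):
    """Pearson correlation of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

iv1 = [2.0, 4.0, 5.0, 7.0, 9.0, 11.0]
iv2 = [1.0, 3.0, 6.0, 6.5, 10.0, 10.5]   # closely tracks iv1
r = pearson_r(iv1, iv2)
vif = 1 / (1 - r ** 2)
print(round(r, 3), round(vif, 2))
# Flag per the card's rules of thumb: VIF above 10, or r above ~0.8-0.9
if vif > 10 or abs(r) > 0.8:
    print("collinearity concern: consider removing one IV")
```

With more than two IVs the same formula applies, but R^2 comes from regressing each IV on all the others rather than from a single correlation.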
129
Casewise diagnostics?
A useful tool to identify outliers and potentially influential cases. In a normal sample, 95% of cases should have standardised residuals within +/-2, and only about 1 in 1000 should be beyond +/-3
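The check can be sketched by standardising a set of residuals and flagging any case beyond +/-2 SDs (the residual values here are made up for illustration).

```python
# Casewise-diagnostics sketch: flag cases with standardised residuals
# outside +/-2 (hypothetical residuals from some fitted model).

residuals = [0.4, -1.1, 0.2, 2.9, -0.5, 0.8, -0.3, -1.6, 0.1, -0.9]
n = len(residuals)
mean = sum(residuals) / n
sd = (sum((r - mean) ** 2 for r in residuals) / (n - 1)) ** 0.5
standardised = [(r - mean) / sd for r in residuals]
outliers = [i for i, z in enumerate(standardised) if abs(z) > 2]
print(outliers)   # [3] -> the fourth case is beyond +/-2
```

A flagged case is a candidate outlier to inspect, not something to delete automatically.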
130
What's a parametric test?
Tests in which the distributions of the variables being assessed, or of the residuals produced by the model, are assumed to fit a probability distribution (e.g. a normal distribution). ANOVA assumes the data in each group are normally distributed. Multiple regression assumes that the residuals are normally distributed
131
What are non parametric tests?
Do not rely on assumptions such as a normal distribution or homogeneity of variance. Two situations in which they are required: when the sample size is small (n<15) and outcome data collected on continuous scales are not normally distributed; and when data are not measured on a continuous scale but on either an ordinal or a nominal scale
132
What is nominal data?
People are grouped into mutually exclusive categories which are not in an order
133
What is ordinal data?
People are grouped into mutually exclusive groups which can be ordered/ranked. Ranks people in order but does not indicate how much better one score is than another
134
What is continuous data?
The distance/interval between two measurements is meaningful and equal anywhere on the scale
135
Non Parametric equivalent for Pearsons correlation?
Spearman's rank order correlation
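Spearman's rho is just Pearson's r applied to ranks, which can be sketched directly (average ranks are used for ties; data are illustrative).

```python
# Spearman's rank-order correlation sketch: rank each variable,
# then take Pearson's r of the ranks.

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average rank for the tied run
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# A perfectly monotonic pair gives rho = 1 even if the relationship
# is not linear in the raw values.
print(spearman([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))          # 1.0
print(round(spearman([1, 2, 3, 4, 5], [3, 1, 4, 2, 5]), 2))     # 0.5
```

Because only ranks are used, the statistic suits ordinal data and is robust to non-normal continuous data, which is why it is the non-parametric counterpart of Pearson's correlation.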
136
Non Parametric equivalent for Independent t-test?
Mann-Whitney U test
137
Non Parametric equivalent for paired t-test?
Wilcoxon signed-rank test
138
Non Parametric equivalent for One-way ANOVA?
Kruskal-Wallis ANOVA
139
Non Parametric equivalent for repeated measures ANOVA?
Friedman's ANOVA
140
Assumptions of chi-square test?
Each person contributes to only one cell of the contingency table, so you cannot use a chi-square test on a repeated measures design. The expected frequencies should be greater than 5; if an expected frequency is below 5, the result is a loss of statistical power, and the test may fail to detect a genuine difference
141
How to do the CHI-SQUARE test?
X^2 = sum of ((O - E)^2 / E), calculated for each category and summed. Degrees of freedom = number of categories - 1. Look up the critical value in a chi-square table for that df at a significance level of 0.05, then compare: if the critical value is larger than the calculated X^2, the result is non-significant
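The steps on this card can be sketched as a goodness-of-fit test with hypothetical counts: compute X^2 = sum((O - E)^2 / E), take df = categories - 1, and compare with the tabled 0.05 critical value (5.991 for df = 2).

```python
# Chi-square goodness-of-fit sketch following the card's steps.

observed = [20, 30, 25]                              # hypothetical counts
total = sum(observed)
expected = [total / len(observed)] * len(observed)   # equal split under H0

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
critical = 5.991          # chi-square table value for df=2, alpha=0.05
significant = chi_sq > critical
print(chi_sq, significant)   # 2.0 False -> retain the null hypothesis
```

Because the calculated X^2 (2.0) is smaller than the critical value, the observed counts do not differ significantly from an equal split.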
142
How to report a chi square test when it has 2 variables?
Assumption: the minimum expected frequency is 14.44, which is greater than 5, so the assumption is met. Results (significance of X^2): X^2 = 25.356, p<0.001, statistically significant, so reject the null hypothesis. Leadership style and performance outcome are associated: 26.3% (10/38) of athletes who had a democratic coach won, whereas 70.4% (114/162) of those who had an autocratic coach won. Interpretation/conclusion: a much greater percentage of athletes won under an autocratic coach than under a democratic coach
143
go through the PowerPoint slides eventually to interpret where you get the info from in SPSS
ok