Practice Questions 2 Flashcards
a) What is the difference between a statistic and a parameter?
A parameter is a numerical characteristic of a population while a statistic is a numerical characteristic of a sample.
b) What is meant by sampling variability in a statistical study?
Sampling variability refers to the fact that statistics, such as the sample mean, would take different values if the random sampling process were repeated. We therefore need to account for sampling variability when drawing any conclusions from our data.
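A minimal simulation sketch of sampling variability. The population (the integers 1 to 1000), the sample size and the number of repeats are hypothetical values chosen only for illustration, and `sample_means` is a helper name introduced here.

```python
import random
import statistics

def sample_means(population, sample_size, n_repeats, seed=0):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_repeats)
    ]

# Hypothetical population: the integers 1..1000.
population = list(range(1, 1001))

# The parameter is a fixed characteristic of the whole population.
parameter = statistics.mean(population)

# Each sample mean is a statistic; it changes from sample to sample.
means = sample_means(population, sample_size=30, n_repeats=200)

print("population mean (parameter):", parameter)
print("first five sample means (statistics):", [round(m, 1) for m in means[:5]])
print("SD of the sample means:", round(statistics.stdev(means), 2))
```

The parameter stays fixed while the sample means differ from one sample to the next; the spread of those means is the sampling variability.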
c) What is meant by the term double-blind in an experiment?
In a blind experiment the subjects do not know what treatment they are receiving. In a double-blind experiment the experimenter also does not know what treatment each subject is receiving.
State the assumptions that have to be made when performing a linear regression.
Linear relationship between the response and the explanatory variable;
Constant variability of the residuals;
Normally distributed residuals.
Discuss the assumption of constant error variance with reference to the residuals plot shown above.
Most of the points in the plot are grouped in a single clump.
- The variance seems fairly constant for the 26 participants in the main group. It is difficult to assess for the outlying values.
What is the null hypothesis for the regression analysis?
Null hypothesis: The slope of the linear association is 0
P-value: p < .001
Conclusion: Very strong evidence against the null hypothesis, suggesting that there is an association between 6MWD and Age.
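A rough sketch of the test statistic behind this hypothesis test: the estimated slope divided by its standard error. The age and 6MWD values below are made up for illustration and are not the study data, and `slope_t_statistic` is a hypothetical helper name. The p-value would come from comparing t with a t distribution on n - 2 degrees of freedom, which is not computed here.

```python
import math

def slope_t_statistic(x, y):
    """Return (slope, standard error, t statistic) for a simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope, using n - 2 residual degrees of freedom.
    se = math.sqrt(ss_res / (n - 2) / sxx)
    return slope, se, slope / se

# Made-up illustrative data (not from the study).
age = [30, 40, 50, 60, 70]
six_mwd = [620, 600, 560, 545, 500]

slope, se, t = slope_t_statistic(age, six_mwd)
print("slope:", round(slope, 3), "SE:", round(se, 3), "t:", round(t, 2))
```

A t statistic far from 0 corresponds to a small p-value, that is, strong evidence against a slope of 0.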
e) What is the R-squared value for this linear model and how would you interpret it?
R-squared is the proportion of the total variability in the response that is explained by the model, so it indicates how well the model fits the data.
R-squared is 0.485, suggesting that about 49% of the variability in 6MWD values is explained by the age of the participant.
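A short sketch of how R-squared can be computed as 1 minus the residual sum of squares over the total sum of squares. The data are made up for illustration (they are not the study data and will not reproduce 0.485), and `r_squared` is a hypothetical helper name.

```python
def r_squared(x, y):
    """Return R-squared for a least-squares line fitted to (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variability left unexplained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variability of the response around its mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Made-up illustrative data (not from the study).
age = [30, 40, 50, 60, 70]
six_mwd = [620, 600, 560, 545, 500]

print("R-squared:", round(r_squared(age, six_mwd), 3))
```

A value near 1 means the line accounts for most of the variability in the response; a value near 0 means it accounts for very little.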
Overall, do you think this is a good model for predicting expected 6MWD? If not, then how would you suggest the predictive capabilities might be improved?
- The points are scattered widely rather than following a clear line.
No, the linear relationship seems spurious, likely just a result of trying to fit a line through the outlying points in the scatterplot. If the researchers are interested in the full range of ages 25-75 then they should include more subjects in the 20-40 and 60-75 age ranges to determine whether there is a genuine linear relationship.
Levene’s test
Null hypothesis: The variances within the three groups are all equal
P-value: p = .093
Conclusion: No evidence to reject the null hypothesis
What conclusion would you draw from the results of Levene’s test in relation to the assumptions of the ANOVA (p<0.001) and why?
It is reasonable to assume that the variances within the three groups are all equal.
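A sketch of the statistic behind Levene's test, assuming the version based on absolute deviations from each group mean (some software uses the group median instead). The groups below are made up for illustration, and `levene_w` is a hypothetical helper name. The p-value would come from comparing W with an F distribution on (k - 1, N - k) degrees of freedom, which is not computed here.

```python
def levene_w(groups):
    """Return Levene's W statistic using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z values: absolute deviation of each observation from its own group mean.
    z_groups = []
    for g in groups:
        centre = sum(g) / len(g)
        z_groups.append([abs(v - centre) for v in g])
    z_means = [sum(z) / len(z) for z in z_groups]
    z_grand = sum(sum(z) for z in z_groups) / n_total
    # Between-group and within-group variability of the Z values.
    between = sum(len(z) * (m - z_grand) ** 2 for z, m in zip(z_groups, z_means))
    within = sum((v - m) ** 2 for z, m in zip(z_groups, z_means) for v in z)
    return (n_total - k) / (k - 1) * between / within

# Made-up illustrative groups (not from the study).
equal_spread = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
unequal_spread = [[1, 2, 3, 4], [10, 20, 30, 40], [5, 5.5, 6, 6.5]]

print("W, equal spreads:", levene_w(equal_spread))
print("W, unequal spreads:", round(levene_w(unequal_spread), 2))
```

W is close to 0 when the groups have similar spreads and grows as the spreads differ, so a small W (large p-value) is consistent with the equal-variance assumption.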
c) What other assumptions should we make before performing ANOVA?
The groups are independent of each other and the observations within each group are Normally distributed.