What is numerical data?
A.K.A. numerical variable
Continuous - any number value e.g. height, weight, mental health symptom score
Discrete - only specific values (e.g. integers) such as counts, years, number of cars on a street (a decimal value wouldn't make sense)
What is categorical data?
Values which represent groups or categories
Can be:
Ordinal - categories in which the order is important e.g. S, M, L shirts, birth order, grade
Nominal - the order isn’t significant e.g. Postcode, gender, eye colour
Can continuous data be changed into categorical data?
Is this recommended? Why/why not?
Give an example of when it might be appropriate to change the data?
Yes, it can be. No, it is generally not recommended, because information is lost and there is less variability (variability is good: it allows us to choose appropriate statistical analyses).
It may be appropriate if there is a good clinical reason, e.g. the CES-D scale for depression provides a cut-off score (16+) for clinical risk of depression, so it is okay to group scores into “high CES-D” and “low CES-D”.
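A minimal sketch of the CES-D example above, to show what is lost when a continuous score is turned into two categories. The scores are made up for illustration, and the function name `ces_d_group` is hypothetical; only the 16+ cut-off comes from the notes.

```python
CUTOFF = 16  # CES-D cut-off quoted in the notes (16+ = clinical risk)

# Hypothetical CES-D scores, invented for illustration only
scores = [3, 9, 15, 16, 22, 31, 40]


def ces_d_group(score, cutoff=CUTOFF):
    """Label a score "high CES-D" if it is at or above the cut-off, else "low CES-D"."""
    return "high CES-D" if score >= cutoff else "low CES-D"


groups = [ces_d_group(s) for s in scores]
print(groups)

# Variability before and after: seven distinct scores collapse into two labels,
# so a score of 16 and a score of 40 can no longer be told apart.
print(len(set(scores)), "distinct scores ->", len(set(groups)), "categories")
```

The second print line makes the trade-off concrete: the grouping is easy to interpret clinically, but the differences between scores within each group are no longer available for analysis.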
What are some elements to consider when choosing the most appropriate statistical test/analysis to run?
In hypothesis testing
- depends on the null hypothesis and the alternative hypothesis
Test statistic - a value which describes how far the observed group difference / relationship between variables departs from the null hypothesis of no relationship / no difference
Compute p-values - a number describing how likely it is that data at least as extreme as yours would have occurred by random chance if the null hypothesis were true (it is not the probability that the null hypothesis is true).
(threshold set beforehand, e.g. 0.05)
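One way to see what a p-value means is an exact permutation test, sketched below. This is an illustrative technique chosen for the example, not a test named in these notes. The two groups and their scores are made up, and `permutation_p_value` is a hypothetical helper. The test statistic is the absolute difference in group means, and the p-value is the share of all possible regroupings that give a difference at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical scores for two groups, invented for illustration only
group_a = [12, 15, 14, 16]
group_b = [9, 10, 11, 13]

ALPHA = 0.05  # threshold set beforehand


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))  # test statistic for the real grouping
    as_extreme = 0
    total = 0
    # Under the null hypothesis the group labels are arbitrary,
    # so try every way of splitting the pooled scores into two groups.
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        new_a = [pooled[i] for i in chosen]
        new_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if abs(mean(new_a) - mean(new_b)) >= observed:
            as_extreme += 1
        total += 1
    return as_extreme / total


p = permutation_p_value(group_a, group_b)
print("p-value:", p)
print("reject the null hypothesis:", p < ALPHA)
```

With groups this small every regrouping can be listed exhaustively; larger samples would normally use a random subset of regroupings or a standard parametric test instead.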
What is the null hypothesis?
There is no relationship/ no group difference
What is the alternative hypothesis?
A relationship exists between the variables / there are group differences
How do you select the most appropriate statistical test/analysis?
___ is always the explanatory, predictor or independent variable
“x” is always the explanatory, predictor or independent variable
(right hand side of the equation)
____ is always the outcome or dependent variable
“y” is always the outcome or dependent variable
(left hand side of the equation)
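To make the roles of x and y concrete, here is a small sketch that fits a straight line, y = intercept + slope * x, by ordinary least squares. The data are made up so that y = 1 + 2x exactly, and `fit_line` is a hypothetical helper written for this example.

```python
from statistics import mean

x = [1, 2, 3, 4, 5]    # explanatory / predictor / independent variable (hypothetical data)
y = [3, 5, 7, 9, 11]   # outcome / dependent variable (hypothetical data)


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = co-variation of x and y divided by the variation in x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(x, y)
print(f"y = {intercept} + {slope} * x")
```

Swapping the two lists would answer a different question (predicting x from y), which is why deciding which variable is the outcome comes before choosing the analysis.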
What are some questions to ask in relation to the y = x equation?
(important for linear regression)
How might a “group differences” question be phrased?
How might a “relationship/association” question be phrased?
both continuous
How is causality inferred?
by the research method and design
What is the gold standard method for inferring causality?
Randomised controlled trials
True or false: statistical tests test causality.
not necessarily - they only test your hypothesis/ proposed relationships between variables of interest
If a test does not infer causality, how must we interpret results?