Week 2 - Research design and appropriate analyses Flashcards
What is numerical data?
A.K.A. numerical variable
Continuous - any numeric value e.g. height, weight, mental health symptom score
Discrete - only specific values (e.g. integers) such as counts, years, number of cars on a street (a decimal wouldn’t make sense)
What is categorical data?
Values which represent groups or categories
Can be:
Ordinal - categories in which the order is important e.g. S, M, L shirts, birth order, grade
Nominal - categories in which the order isn’t significant e.g. postcode, gender, eye colour
Can continuous data be changed into categorical data?
Is this recommended? Why/why not?
Give an example of when it might be appropriate to change the data?
Yes, it can be. No, it is not recommended, because information is lost and there is less variability (variability is good and it allows us to choose appropriate statistical analyses).
It might be appropriate if there is a good clinical reason, e.g. the CES-D scale for depression provides a cut-off score (16+) for clinical risk of depression, so it’s okay to group participants into “high CES-D” and “low CES-D” scores
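A minimal sketch of the grouping described above. The cut-off of 16 comes from the card; the function name and the scores are hypothetical and only there for illustration.

```python
CES_D_CUTOFF = 16  # cut-off from the card: 16+ indicates clinical risk


def ces_d_group(score, cutoff=CES_D_CUTOFF):
    """Return 'high CES-D' for scores at or above the cut-off, else 'low CES-D'."""
    return "high CES-D" if score >= cutoff else "low CES-D"


scores = [3, 15, 16, 27, 41]  # hypothetical CES-D total scores
groups = [ces_d_group(s) for s in scores]

for score, group in zip(scores, groups):
    print(score, group)
```

Note how 3 and 15 end up in the same group, as do 16 and 41: this is the loss of variability the card warns about.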
What are some elements to consider when choosing the most appropriate statistical test/analysis to run?
In hypothesis testing:
- Null and alternative hypotheses - what is actually being tested
- Test statistic - a value which describes the degree to which the group difference/ relationship between variables departs from the null hypothesis of no relationship/ no difference
- P value - the probability of obtaining data at least as extreme as yours if the null hypothesis were true (it is not the probability that the null hypothesis is true)
(compared against a threshold set beforehand e.g. 0.05)
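To make the test statistic and p value concrete, here is a small sketch using an exact permutation test. This is one possible technique, not one the cards name; the data, the variable names and the choice of the mean difference as the test statistic are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean

group_a = [5, 6, 7, 8]  # hypothetical scores, group A
group_b = [1, 2, 3, 4]  # hypothetical scores, group B
alpha = 0.05            # threshold chosen before looking at the data


def mean_difference(a, b):
    """Test statistic: difference between the two group means."""
    return mean(a) - mean(b)


observed = mean_difference(group_a, group_b)

# Under the null hypothesis the group labels are interchangeable,
# so we look at every possible way of splitting the pooled scores.
pooled = group_a + group_b
indices = range(len(pooled))
extreme = 0
total = 0
for chosen in combinations(indices, len(group_a)):
    a = [pooled[i] for i in chosen]
    b = [pooled[i] for i in indices if i not in chosen]
    # Count splits at least as extreme as what we observed (two-sided).
    if abs(mean_difference(a, b)) >= abs(observed):
        extreme += 1
    total += 1

p_value = extreme / total
print("observed difference:", observed)
print("p value:", p_value)
print("below threshold:", p_value < alpha)
```

The p value here is simply the share of label shufflings that give a difference at least as large as the observed one, which mirrors the "at least as extreme, assuming the null" definition.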
What is the null hypothesis?
There is no relationship/ no group difference
What is the alternative hypothesis?
A relationship exists between the variables/ there are group differences
How do you select the most appropriate statistical test/analysis?
- Does the data meet the ASSUMPTIONS for parametric tests?
- Type of VARIABLES e.g. continuous or categorical
- RESEARCH DESIGN e.g. repeated measures, within subjects
- How many LEVELS are in the variables? e.g. gender may have 3 levels (male, female, non-binary)
- How many PREDICTOR and OUTCOME variables are there?
___ is always the explanatory, predictor or independent variable
“x” is always the explanatory, predictor or independent variable
(right hand side of the equation)
____ is always the outcome or dependent variable
“y” is always the outcome or dependent variable
(left hand side of the equation)
What are some questions to ask in relation to the y = x equation?
- does x predict y?
- does x explain y?
- do changes in x result in changes to y?
- are there group (x) differences in y?
(important for linear regression)
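As a rough illustration of "does x predict y?", the sketch below fits a straight line y = intercept + slope * x by least squares. The data are hypothetical and the variable names are assumptions, not part of the original cards.

```python
from statistics import mean

x = [1, 2, 3, 4, 5]  # hypothetical predictor (independent variable)
y = [2, 4, 5, 4, 5]  # hypothetical outcome (dependent variable)

x_bar = mean(x)
y_bar = mean(y)

# Least-squares slope: how much y changes, on average, per one-unit change in x.
slope = (
    sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    / sum((xi - x_bar) ** 2 for xi in x)
)
# Intercept: predicted y when x is 0.
intercept = y_bar - slope * x_bar

print("slope:", slope)
print("intercept:", intercept)
```

A slope near zero would suggest x tells us little about y in this sample; whether a non-zero slope is statistically significant is a separate question for a hypothesis test.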
How might a “group differences” question be phrased?
- is there a difference in Y among X? (e.g. difference in morning preference among female participants compared to male participants)
- are there X differences in Y?
- are there differences in group (X) means (of Y)?
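The "differences in group (X) means (of Y)" phrasing can be sketched as follows, using the morning-preference example from the card. The scores and the record layout are hypothetical.

```python
from statistics import mean

# Hypothetical (group, morning-preference score) records.
records = [
    ("female", 14), ("male", 11),
    ("female", 16), ("male", 9),
    ("female", 15), ("male", 13),
]

# Collect the Y scores under each level of the grouping variable X.
scores_by_group = {}
for group, score in records:
    scores_by_group.setdefault(group, []).append(score)

group_means = {group: mean(values) for group, values in scores_by_group.items()}
difference = group_means["female"] - group_means["male"]

print(group_means)
print("difference in means:", difference)
```

This only describes the sample; a statistical test is still needed to judge whether the difference is larger than chance would produce.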
How might a “relationship/association” question be phrased?
(when both variables are continuous)
- is there an association between X and Y?
- Does X predict Y?
- Is there a relationship/ correlation/ association/ link between X and Y?
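For two continuous variables, one common way to quantify the association is Pearson's correlation coefficient. The sketch below computes it by hand; the function name, variable names and data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


hours_sleep = [5, 6, 7, 8, 9]         # hypothetical X
symptom_score = [20, 18, 15, 11, 10]  # hypothetical Y

r = pearson_r(hours_sleep, symptom_score)
print("r =", round(r, 3))
```

Values of r close to -1 or +1 indicate a strong linear association and values close to 0 a weak one; r on its own says nothing about causation.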
How is causality inferred?
By the research method and design
- it is difficult to prove causation
- “third variable problem” - a third variable (confounder) might be responsible for the correlation between two other variables
- in cross-sectional studies everything is measured at a single time point, so outcomes cannot be compared before and after
What is the gold standard method for inferring causality?
Randomised controlled trials (RCTs)
True or false: statistical tests test causality.
Not necessarily - they only test your hypothesis/ proposed relationships between variables of interest
- in the results, report the p value, and don’t infer causality if it is not warranted