Me, Myself and I: Lecture 7 - Multiple Testing Flashcards
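What is a regression analysis used for?
A regression analysis estimates how much of the variation in an outcome is explained by a predictor. For example, we can use it to determine how much of the variation in height is explained by sex (the R² value).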
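To make the R² idea concrete, here is a minimal sketch that regresses height on sex and reports the proportion of variation explained. The data values and the helper name `r_squared` are hypothetical, chosen only for illustration.

```python
# Hypothetical toy data: sex coded 0 = female, 1 = male; height in cm.
sex = [0, 0, 0, 0, 1, 1, 1, 1]
height = [160.0, 165.0, 162.0, 168.0, 175.0, 180.0, 172.0, 178.0]


def r_squared(x, y):
    """R^2 of a simple least-squares regression of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    # With one predictor, R^2 equals the squared correlation between x and y.
    return (sxy ** 2) / (sxx * syy)


r2 = r_squared(sex, height)
print(f"Proportion of height variation explained by sex: {r2:.2f}")
```

An R² near 1 would mean sex explains almost all of the variation in height in this sample; a value near 0 would mean it explains almost none.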
What is a multivariate model used for?
To test whether an apparent association is really explained by another, underlying variable.
E.g. when we include both football and sex in the model, the effect of football decreases.
Football isn't really telling us anything about height; it is simply associated with sex.
If an individual plays a lot of football, they are likely to be male, and males are on average taller than females.
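The football example can be sketched numerically. The data below are hypothetical and deliberately constructed so that football has no effect on height within either sex. The variable names and the closed-form two-predictor least-squares solution are illustrative choices, not something taken from the lecture.

```python
# Hypothetical toy data, constructed so that football has no effect on height
# within either sex. sex: 0 = female, 1 = male; football: hours per week.
sex = [0, 0, 0, 0, 1, 1, 1, 1]
football = [0, 2, 1, 1, 3, 5, 4, 4]
height = [164.0, 164.0, 162.0, 166.0, 176.0, 176.0, 174.0, 178.0]


def centred(values):
    """Subtract the mean so the intercept drops out of the calculation."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


f, s, h = centred(football), centred(sex), centred(height)

# Model 1: height ~ football (football looks strongly associated with height).
football_only = dot(f, h) / dot(f, f)

# Model 2: height ~ football + sex (two-predictor least squares, solved
# directly from the normal equations).
det = dot(f, f) * dot(s, s) - dot(f, s) ** 2
football_adjusted = (dot(s, s) * dot(f, h) - dot(f, s) * dot(s, h)) / det
sex_adjusted = (dot(f, f) * dot(s, h) - dot(f, s) * dot(f, h)) / det

print(f"football effect, football alone:   {football_only:.2f} cm per hour")
print(f"football effect, adjusted for sex: {football_adjusted:.2f} cm per hour")
print(f"sex effect, adjusted for football: {sex_adjusted:.2f} cm")
```

In this constructed sample the football coefficient shrinks once sex is in the model, which is the pattern described above: football was only standing in for sex.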
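What does multiple testing allow us to identify?
When we test many variables, the associations we find will include:
-Some real biological modifiers
-Some co-variables (variables that are merely associated with a real modifier)
-Some chance associations (false positives)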
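A short simulation shows why chance associations are unavoidable when many tests are run. It is a sketch under stated assumptions: a significance threshold of 0.05, 1000 independent tests, and no real effect anywhere. It relies on the fact that p-values are uniformly distributed when the null hypothesis is true. The constants and variable names are hypothetical.

```python
import random

ALPHA = 0.05    # assumed significance threshold
N_TESTS = 1000  # hypothetical number of variables tested

rng = random.Random(0)  # fixed seed so the run is repeatable

# Under the null hypothesis a p-value is uniformly distributed on [0, 1],
# so each test has probability ALPHA of looking "significant" by chance.
p_values = [rng.random() for _ in range(N_TESTS)]

chance_hits = sum(p < ALPHA for p in p_values)
expected_hits = ALPHA * N_TESTS

# Probability of at least one false positive across independent tests.
family_wise_error = 1 - (1 - ALPHA) ** N_TESTS

print(f"'significant' by chance: {chance_hits} (expected about {expected_hits:.0f})")
print(f"P(at least one false positive) = {family_wise_error:.3f}")
```

Even with no real effects at all, roughly 5% of the tests pass the threshold, so a list of significant results from many tests always needs to be read with the number of tests in mind.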
What is cherry picking?
Cherry picking means presenting only our positive results and ignoring the other findings.
We should avoid it because it misrepresents the data.
We should be open and transparent about all the data.
What is the ideal experiment?
-Define your hypothesis
-Design an experiment to collect the data
-Design your statistical testing plan in advance
-Conduct the experiment
-Analyse the data (using predefined plan)
-Publish the results (positive or negative)
-[Conduct any post-hoc analyses]
-Formulate new hypotheses
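Why shouldn’t we be trawling for positive results?
Trawling through many tests until something is significant tends to turn up chance associations rather than real effects. The publication system is a large part of the problem: positive results are published far more often than negative results, so there is pressure to find a significant result.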
What is cross-sectional data?
Cross-sectional data refers to data collected at a single point in time or over a short period, capturing a snapshot of a particular population, phenomenon, or variable set.
What is longitudinal data?
Longitudinal data refers to data collected over a period of time, tracking the same subjects across multiple time points. This type of data allows for the analysis of changes, trends, and dynamics within a population or system.
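The difference between the two data types can be sketched with a small set of records. The subjects, years, and measurements are hypothetical, as are the variable names.

```python
# Hypothetical records: (subject_id, year, height_cm)
records = [
    ("A", 2020, 150.0), ("A", 2021, 155.0), ("A", 2022, 158.0),
    ("B", 2020, 140.0), ("B", 2021, 146.0), ("B", 2022, 151.0),
]

# Cross-sectional view: a snapshot of every subject at a single time point.
snapshot_2021 = {subject: h for subject, year, h in records if year == 2021}

# Longitudinal view: the same subjects followed across several time points.
trajectories = {}
for subject, year, h in sorted(records, key=lambda r: (r[0], r[1])):
    trajectories.setdefault(subject, []).append((year, h))

# Only the longitudinal view lets us measure change within each subject.
change = {s: obs[-1][1] - obs[0][1] for s, obs in trajectories.items()}

print("2021 snapshot:", snapshot_2021)
print("growth 2020 to 2022:", change)
```

The snapshot can compare subjects with each other at one moment, while the trajectories show how each subject changes over time.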