Data and Graphical Summaries Flashcards
Observational studies
The investigators have no control over the subjects or quantities of interest; they are just observers.
The investigators cannot use randomisation for allocation into groups.
association
One thing is linked to another.
Relatively easy to establish (compared with causation).
Does association prove causation?
No. Association may suggest causation, but it does not prove it.
Precautions in observational studies
(I) It is very difficult to establish causation
(II) Observational studies with a confounding variable can lead to Simpson’s Paradox
(III) Historical controls can make time a confounding variable
Control
A subject who did not get the treatment.
Controlling for confounders
Trying to reduce the influence of confounding variables.
contemporaneous control
The control group and the treatment group are studied in the same time period.
causation
The relationship between cause and effect.
Historical control
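A control group taken from an earlier time period than the treatment group.
Time can then be a confounding variable, which is why contemporaneous controls are preferred.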
Simpson’s paradox
The association between a pair of variables (X, Y) reverses sign upon conditioning on a third variable Z, regardless of the value taken by Z.
Sometimes there is a clear trend in individual groups of data that reverses when the groups are pooled together.
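A minimal Python sketch of the reversal described above. The counts are illustrative and not part of these flashcards; the names `counts`, `rate`, `within` and `pooled` are hypothetical. Each entry is (successes, trials) for treatments A and B within one level of Z.

```python
# Illustrative counts (an assumption, not from the flashcards):
# (successes, trials) for treatments A and B within each level of Z.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, trials):
    """Proportion of successes."""
    return successes / trials


# Within each level of Z, is A's success rate higher than B's?
within = {z: rate(*g["A"]) > rate(*g["B"]) for z, g in counts.items()}

# Pool the groups by summing successes and trials for each treatment.
pooled = {
    t: (
        sum(counts[z][t][0] for z in counts),
        sum(counts[z][t][1] for z in counts),
    )
    for t in ("A", "B")
}
pooled_a_better = rate(*pooled["A"]) > rate(*pooled["B"])

for z, g in counts.items():
    print(z, round(rate(*g["A"]), 3), round(rate(*g["B"]), 3))
print("pooled", round(rate(*pooled["A"]), 3), round(rate(*pooled["B"]), 3))
print("A better within every group:", all(within.values()))
print("A better when pooled:", pooled_a_better)
```

In these counts A has the higher rate in each group, yet B has the higher rate once the groups are pooled, because the two treatments were applied to the groups in very different proportions.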
IDA - Initial data analysis
First general look at the data, without formally answering the research questions
Helps you see whether the data can answer research questions
Identify the data’s main qualities
Suggest the population from which a sample derives
IDA process
Data background: checking quality and integrity of the data
Data structure: what info has been collected?
Data wrangling: scraping, cleaning, tidying, reshaping, splitting, combining
Data summaries: graphical and numerical
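The four steps above could be sketched in Python as follows. This is only a rough illustration using the standard library; the data set, its column names (`id`, `group`, `height_cm`) and all values are made up, and a real analysis would add graphical summaries as well.

```python
import csv
import io
import statistics

# Hypothetical raw data; column names and values are made up for illustration.
raw = """id,group,height_cm
1,treatment,170
2,control,165
3,treatment,
4,control,180
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Data background: check quality and integrity, here by counting missing values.
missing = sum(1 for r in rows if r["height_cm"] == "")

# Data structure: which variables were collected, and for how many subjects?
variables = list(rows[0].keys())
n_subjects = len(rows)

# Data wrangling: drop incomplete rows and convert text to numbers.
clean = [
    dict(r, height_cm=float(r["height_cm"]))
    for r in rows
    if r["height_cm"] != ""
]

# Data summaries: numerical summaries of one variable.
heights = [r["height_cm"] for r in clean]
summary = {
    "mean": statistics.mean(heights),
    "median": statistics.median(heights),
}

print(missing, variables, n_subjects, summary)
```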
Variables
A variable measures or describes some attribute of the subjects.
Data with p variables is said to have dimension p
Graphical summaries
Should best highlight the features of the data
To some extent we use trial and error
Big data
Refers to massive amounts of data being collected
High dimensional: more variables than subjects
Requires more complex visualisations and more complicated machinery
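As a small illustration of dimension and the "high dimensional" condition, the Python sketch below counts subjects (rows) and variables (columns) in a made-up data set and checks whether there are more variables than subjects. The values and the names `data`, `n`, `p` and `high_dimensional` are assumptions for illustration only.

```python
# Hypothetical data set: each row is a subject, each column is a variable.
data = [
    [5.1, 3.5, 1.4, 0.2, 7.0],
    [4.9, 3.0, 1.4, 0.2, 6.4],
    [6.3, 3.3, 6.0, 2.5, 5.8],
]

n = len(data)     # number of subjects
p = len(data[0])  # number of variables, i.e. the dimension of the data

# High dimensional in the sense used above: more variables than subjects.
high_dimensional = p > n

print(n, p, high_dimensional)
```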