Introduction Class 2 Flashcards
What are the three main phases needed before a prediction model can be used in clinical practice?
- Development (7 steps):
  - Research question and initial data inspection
  - Coding of predictors
  - Model specification
  - Model estimation
  - Evaluation of model performance
  - Internal validation
  - Model presentation
- External validation (completely new data set)
- Impact assessment (clinical usefulness)
(Steyerberg et al. 2014. Eur Heart J. 2014 Aug 1;35(29):1925-31.)
Prediction models are relevant to many questions in clinical medicine, public health, and epidemiology.
What are some examples?
Public health:
Identifying target populations for preventive interventions (e.g. QRISK and QDiabetes)
Clinical practice:
Therapeutic decision making: Should a treatment start? Which treatment is the best? How intense should it be (e.g. drug dose)?
Management decision:
Do we need more hospital beds? How cost-effective will a treatment be?
Providing realistic expectations of the course of the disease for patients and their relatives
Medical research:
In experimental trials (RCTs) predictive baseline characteristics can help to include or stratify patients and improve statistical analyses, e.g. stratification by biomarkers
Prediction research asks a different question than explanatory medical research.
What is this?
How can we reliably predict outcomes of individuals?
-> Prediction models predict outcomes for individuals.
Theoretical models are not necessary and causal interpretations are not of interest
What is the aim of prediction modelling?
To find a model with an appropriate subset of predictor variables, which shows good generalizability: good prediction of future observations
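A minimal sketch of why generalizability has to be judged on new observations. The simulated data, the assumed true risks and the deliberately overfitted 1-nearest-neighbour rule are all illustrative assumptions, not part of the course material:

```python
import random

def simulate(n, rng):
    """Simulated patients: one predictor x in [0, 1] and a binary outcome y.
    Assumption: the true risk is 0.2 below x = 0.5 and 0.8 above it."""
    data = []
    for _ in range(n):
        x = rng.random()
        risk = 0.8 if x > 0.5 else 0.2
        y = 1 if rng.random() < risk else 0
        data.append((x, y))
    return data

def predict_1nn(train, x):
    """Predict the outcome of the closest training patient (pure memorisation)."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]

def error_rate(train, data):
    """Share of patients in `data` whose outcome is predicted wrongly."""
    wrong = sum(1 for x, y in data if predict_1nn(train, x) != y)
    return wrong / len(data)

rng = random.Random(1)
train = simulate(100, rng)     # data used to "fit" the rule
future = simulate(1000, rng)   # new observations from the same population

apparent_error = error_rate(train, train)   # performance on the development data
future_error = error_rate(train, future)    # performance on future observations
print(apparent_error, future_error)
```

The rule reproduces its own development data perfectly, so the apparent error says nothing about how well it predicts future patients; only the error on new observations reflects generalizability.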
Often many predictors are available.
What is machine learning used for?
To analyse large numbers of predictors to get a reliable prediction for a person!
P values and confidence intervals are of no interest in machine learning.
True or false?
TRUE
Prediction modelling aims to make average predictions
True or false
FALSE
Prediction modelling aims to make individual predictions
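To illustrate the difference between an individual and an average prediction, here is a small sketch using a logistic model. The coefficients and the three patients are hypothetical and chosen only for illustration:

```python
import math

# Hypothetical coefficients of a logistic prediction model (illustration only).
INTERCEPT = -5.0
COEF_AGE = 0.06      # per year of age
COEF_SMOKER = 0.7    # 1 = current smoker, 0 = non-smoker

def predicted_risk(age, smoker):
    """Predicted probability of the outcome for one individual."""
    linear_predictor = INTERCEPT + COEF_AGE * age + COEF_SMOKER * smoker
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Three hypothetical patients: (age in years, smoker yes/no).
patients = [(40, 0), (55, 1), (70, 1)]

individual_risks = [predicted_risk(age, smoker) for age, smoker in patients]
average_risk = sum(individual_risks) / len(individual_risks)

for (age, smoker), risk in zip(patients, individual_risks):
    print(f"age {age}, smoker {smoker}: predicted risk {risk:.2f}")
print(f"average risk in this group: {average_risk:.2f}")
```

The average risk is a single number for the whole group, whereas the model gives each patient a risk that depends on his or her own predictor values.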
What is the difference between explanatory and predictive research?
Explanatory research:
Applies statistical models, such as regression, to test causal hypotheses using a priori theoretical models.
Typically, explanatory research is interested in the “average” response of a population.
Causal interpretation is the ultimate aim
Prediction research asks a different question:
- What is the likelihood of individual events or outcomes? Prediction models predict outcomes for individuals.
- Theoretical models are not necessary and causal interpretation is not the main interest (“black box”)
What are prediction models optimized for?
Predicting new or future observations. In explanatory research, by contrast, minimizing the bias (the difference between the estimated and the true population parameter) is the key criterion for selecting the best model
Why is there tension between explanatory and predictive modelling?
The best explanatory model may differ from the best predictive model (Sober 2006).
What are problems with analysing big data?
With a small number of variables, standard statistical methods can be applied
However, in times of big data they are no longer sufficient
Often the number of potential predictors (p) is large compared to the sample size (n): the p ≫ n problem
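One way to see why many predictors and few patients are a problem: the more pure-noise predictors are screened, the stronger the best chance correlation with the outcome looks. The sketch below uses simulated random data with an assumed sample size of 20; all names are illustrative:

```python
import random

def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)
n = 20  # small sample size (assumption for this illustration)

# Outcome and predictors are independent random noise: no predictor is truly related.
outcome = [rng.gauss(0, 1) for _ in range(n)]
predictors = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(1000)]

def strongest_noise_correlation(p):
    """Largest absolute correlation with the outcome among the first p predictors."""
    return max(abs(correlation(column, outcome)) for column in predictors[:p])

for p in (5, 50, 1000):
    print(p, round(strongest_noise_correlation(p), 2))
```

Because the strongest chance association can only grow as more predictors are screened, selecting predictors by their apparent association becomes increasingly misleading when p is much larger than n.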
What is a byte?
A unit of data that is eight binary digits long and used in computers to represent a character such as a letter, number or typographic symbol.
1981: Intel 8088 PC had 640 000 bytes (640 kbyte) memory
What adds to the issue of analysing big data?
The volume of data is (still) increasing exponentially!
From 130 exabytes (1 exabyte = 10^18 or 1 000 000 000 000 000 000 bytes of data) in 2005 to an estimated 44 zettabytes (1 zettabyte = 10^21 bytes) in 2020
Equivalent to a stack of DVDs reaching halfway from Earth to Mars, or as many digital bytes as there are stars
But less than 1% is analysed
(source: http://www.idc.com)
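As a quick worked calculation, the two figures above can be turned into a growth factor and an average yearly growth rate. This sketch assumes constant exponential growth between 2005 and 2020, which is a simplification:

```python
import math

EXABYTE = 10**18     # bytes
ZETTABYTE = 10**21   # bytes

volume_2005 = 130 * EXABYTE
volume_2020 = 44 * ZETTABYTE
years = 2020 - 2005

# Total growth over the period and the implied constant yearly growth factor.
growth_factor = volume_2020 / volume_2005
annual_growth = growth_factor ** (1 / years)
doubling_time = math.log(2) / math.log(annual_growth)

print(f"growth factor 2005-2020: {growth_factor:.0f}x")
print(f"average growth per year: {(annual_growth - 1) * 100:.0f}%")
print(f"doubling time: {doubling_time:.1f} years")
```

Under these figures the volume grows roughly 340-fold in 15 years, which corresponds to a little under 50% growth per year and a doubling time of somewhat less than two years.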
Why is multimorbidity prediction important?
Diseases are caused by a combination of genetic, environmental, and lifestyle factors.
What are three types of data?
Structured data (10%): SQL database format
Semi-structured data (10%): XML, tables, Excel files
Unstructured data (80% of all data): text and multimedia data, including emails, patient records (often handwritten), social media, audio, photos, webpages, presentations, documents, satellite data, streaming data from sensors (wearables), social network data ….
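The three types can be illustrated by storing the same blood pressure reading in three ways. This is a minimal sketch; the table layout, the XML tags and the clinical note are invented for illustration:

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Structured: fixed columns in an SQL table, queried directly.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE patient (id INTEGER, age INTEGER, systolic_bp INTEGER)")
con.execute("INSERT INTO patient VALUES (1, 63, 150)")
bp_structured = con.execute(
    "SELECT systolic_bp FROM patient WHERE id = 1"
).fetchone()[0]

# Semi-structured: tagged fields (XML), no fixed table schema required.
xml_record = "<patient id='1'><age>63</age><systolic_bp>150</systolic_bp></patient>"
bp_semi_structured = int(ET.fromstring(xml_record).findtext("systolic_bp"))

# Unstructured: free text; the value has to be extracted, here with a simple pattern.
note = "63-year-old patient, BP today 150/90, complains of headaches."
match = re.search(r"BP\D*(\d+)/\d+", note)
bp_unstructured = int(match.group(1)) if match else None

print(bp_structured, bp_semi_structured, bp_unstructured)
```

The value is the same in all three cases, but the effort needed to retrieve it grows from a plain query to tag parsing to text extraction, which is fragile because free text follows no fixed pattern.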