Lecture 0 - Introduction Flashcards
Data can be collected in various ways:
* Cross-sectionally
* Prospectively
* Retrospectively
Explain these terms.
- Cross-sectionally: data collected at one point in time.
- Prospectively: subjects are followed forward in time, with measurements at baseline and at later time points.
- Retrospectively: the outcome has already been assessed and the study looks back in time to find determinants of the outcome.
Types of data are:
* Binary
* Categorical
* Continuous
* Time-to-event (i.e. survival data)
Explain what time-to-event data is.
Time-to-event data is the time until a specific event occurs, such as time to death, time to recurrence after treatment, or time to employment.
Data can be summarized numerically. Measures of central tendency are:
* mean
* median
* mode
Explain these measures.
- mean: the average, i.e. the sum of all values divided by the number of values.
- median: the middle value when data is ranked from low to high.
- mode: the value (‘score’) that occurs most frequently.
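The three measures can be illustrated with a minimal sketch using Python's standard `statistics` module. The `scores` list is made-up data, chosen only for illustration.

```python
import statistics

# Hypothetical scores, made up for illustration only.
scores = [3, 5, 5, 6, 8, 9]

mean_score = statistics.mean(scores)      # sum of values / number of values
median_score = statistics.median(scores)  # middle of the ranked data; with an even
                                          # count, the average of the two middle values
mode_score = statistics.mode(scores)      # most frequently occurring value

print(mean_score, median_score, mode_score)
```

Here the mean is 6, the median is 5.5 (the average of 5 and 6) and the mode is 5.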
Data are often assumed to be normally distributed, but an observed distribution can take different shapes. The distribution curve can be:
* Symmetrical
* Positively skewed
* Negatively skewed
Explain these terms and how the mean, median and mode relate to each other in each case. Also give an example of data that is typically symmetrical, positively skewed or negatively skewed.
- Symmetrical: the left and right halves of the distribution mirror each other, so the mean is equal to the median (and, for a single-peaked distribution such as the normal distribution, also to the mode). Example: height.
- Positively skewed: the distribution has a long tail to the right. Here, there is a high frequency of low ‘scores’ and a low frequency of high ‘scores’. Typically mean > median > mode. Example: house prices or income.
- Negatively skewed: the distribution has a long tail to the left. Here, there is a low frequency of low ‘scores’ and a high frequency of high ‘scores’. Typically mode > median > mean. Example: retirement age.
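The ordering of mean, median and mode under skew can be checked with a small sketch. Both data sets and the helper `summarize` are hypothetical, constructed only to show the pattern.

```python
import statistics

def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.mode(values))

# Hypothetical data, made up for illustration only.
right_skewed = [1, 2, 2, 2, 3, 4, 5, 6, 11]    # long tail of high values
left_skewed = [1, 6, 7, 8, 9, 10, 10, 10, 11]  # long tail of low values

pos_mean, pos_median, pos_mode = summarize(right_skewed)  # mean 4 > median 3 > mode 2
neg_mean, neg_median, neg_mode = summarize(left_skewed)   # mode 10 > median 9 > mean 8

print(pos_mean, pos_median, pos_mode)
print(neg_mean, neg_median, neg_mode)
```

The few extreme values pull the mean towards the tail, while the median and mode stay closer to the bulk of the data.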
The extent of variability (i.e. measures of spread) within a data set can be calculated with:
* Standard deviation (SD)
* Variance
* Range
* Interquartile range (IQR)
Describe how variance, range and IQR are calculated.
- Variance: SD^2
- Range: maximum - minimum
- IQR: Q3 - Q1
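A minimal sketch of these calculations with Python's `statistics` module follows. The `data` list is hypothetical. Note that software packages use slightly different conventions for quartiles, so Q1 and Q3 (and therefore the IQR) may differ a little between tools; this sketch relies on the default method of `statistics.quantiles`.

```python
import statistics

# Hypothetical measurements, made up for illustration only.
data = [2, 4, 4, 5, 7, 9, 11]

sd = statistics.stdev(data)           # sample standard deviation (divides by n - 1)
variance = statistics.variance(data)  # sample variance, equal to SD squared
data_range = max(data) - min(data)    # maximum - minimum

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(sd, variance, data_range, iqr)
```

For this data set the variance is 10, the range is 9 and the IQR is 9 - 4 = 5.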
What information can you find in a boxplot?
- The minimum
- Q1
- The median
- Q3
- The maximum
Which values are reported in medical articles when:
* data is symmetrically distributed
* data is skewed
- Symmetrically distributed data: mean and SD
- Skewed data: median and IQR
What is the central limit theorem?
The distribution of sample means approximates a normal distribution as the sample size gets larger, regardless of the population’s distribution. Sample sizes equal to or greater than 30 are often considered sufficient for the central limit theorem to hold.
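The theorem can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be an exponential distribution with mean 1 and SD 1 (strongly right-skewed), and the number of repeated samples is an arbitrary choice.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

SAMPLE_SIZE = 30   # the rule-of-thumb sample size
N_SAMPLES = 2000   # number of repeated samples (arbitrary choice)

# Draw many samples from the skewed population and keep each sample mean.
sample_means = [
    statistics.mean([rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

mean_of_means = statistics.mean(sample_means)  # should be close to the population mean (1)
sd_of_means = statistics.stdev(sample_means)   # should be close to SD / sqrt(n)
expected_se = 1.0 / math.sqrt(SAMPLE_SIZE)

print(mean_of_means, sd_of_means, expected_se)
```

Although the individual observations are skewed, the sample means cluster around the population mean with a spread close to the population SD divided by the square root of the sample size.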
What is the Pearson correlation coefficient?
A measure of the linear association between two continuous variables. It is a number between -1 and +1 that expresses the strength and direction of that association.
What does it mean when the Pearson correlation coefficient is:
* +1
* 0
* -1
- +1: perfect positive linear association
- 0: no linear association (a non-linear association may still exist)
- -1: perfect negative linear association
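The three cases can be reproduced with a short sketch. The function `pearson_r` is an illustrative hand-written implementation of the standard formula (covariance divided by the product of the standard deviations), and all data are made up.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [3, 5, 7, 9, 11])      # y = 2x + 1, perfect positive line
r_down = pearson_r(x, [-1, -2, -3, -4, -5])  # y = -x, perfect negative line
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x^2, no linear trend

print(r_up, r_down, r_curve)
```

The last case gives a coefficient of 0 even though y is completely determined by x, which shows that the coefficient only captures linear association.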
What is the goal of inferential statistics?
To draw conclusions beyond your data sample, i.e. about the population, with the use of effect sizes, confidence intervals and hypothesis testing.
Which measure is used to construct the confidence interval for a mean or proportion?
The standard error, which quantifies the uncertainty of an observed estimate or effect.
What do the following terms describe:
* Sensitivity
* Specificity
* Positive predictive value (PPV)
* Negative predictive value (NPV)
- Sensitivity: the probability that the test is positive in a subject who truly has the condition (true positives / all subjects with the condition).
- Specificity: the probability that the test is negative in a subject who truly does not have the condition (true negatives / all subjects without the condition).
- Positive predictive value (PPV): the proportion of positive test results that are true positives.
- Negative predictive value (NPV): the proportion of negative test results that are true negatives.
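All four measures follow from the counts in a 2x2 table of test result against true status. The sketch below uses hypothetical counts (they do not come from a real study) and an illustrative helper, `diagnostic_measures`.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among those with the condition
        "specificity": tn / (tn + fp),  # negatives among those without the condition
        "ppv": tp / (tp + fp),          # true positives among positive tests
        "npv": tn / (tn + fn),          # true negatives among negative tests
    }

# Hypothetical counts, made up for illustration only.
measures = diagnostic_measures(tp=34, fp=34, fn=11, tn=136)

for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity use the column totals (true status) as denominator, whereas PPV and NPV use the row totals (test result), which is why the predictive values depend on how common the condition is in the sample.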
Let’s say that the sensitivity of a specific test is 0.756 and the positive predictive value is 50.0%.
Describe how you would calculate the 95% CIs for the sensitivity and PPV.
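- Sensitivity: 0.756 +/- 1.96 x square root(0.756 x 0.244 / n), where n is the number of subjects who truly have the condition
- PPV: 0.5 +/- 1.96 x square root(0.5 x 0.5 / n), where n is the number of subjects with a positive test result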
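The calculation can be sketched in Python as a normal-approximation (Wald) interval, p +/- 1.96 x SE. The flashcard does not give the group sizes, so the values of n below (45 subjects with the condition, 68 positive tests) are assumptions made purely for illustration, as is the helper name `wald_ci`.

```python
import math

def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion p based on n subjects."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se

# Hypothetical denominators: n is NOT given in the flashcard.
n_with_condition = 45  # denominator for sensitivity (assumed)
n_test_positive = 68   # denominator for PPV (assumed)

sens_low, sens_high = wald_ci(0.756, n_with_condition)
ppv_low, ppv_high = wald_ci(0.5, n_test_positive)

print(f"Sensitivity 95% CI: {sens_low:.3f} to {sens_high:.3f}")
print(f"PPV 95% CI: {ppv_low:.3f} to {ppv_high:.3f}")
```

With these assumed group sizes the intervals are roughly 0.63 to 0.88 for the sensitivity and 0.38 to 0.62 for the PPV. The approximation becomes less reliable for small n or for proportions close to 0 or 1.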