2 - Biostatistics 1 - Basic Principles Flashcards

1
Q

Statistics

A

Encompasses methods of collecting, summarizing, analyzing & drawing conclusions from data

2
Q

Biostatistics

A

The application of statistics to medical, biological and public health data

3
Q

Descriptive Statistics

A

A means of organizing and summarizing observations

4
Q

Statistical Inference

A

A process of drawing conclusions about a population from a sample

5
Q

Population

A

A collection of all subjects of interest

6
Q

Sample

A

A representative subset of the population that can be studied

7
Q

Types of Sample

A

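Random (every member of the population has an equal chance of selection, e.g., every 10th person from a randomly ordered list)

Convenience (whoever is easiest to reach, e.g., one cluster of people all together)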

8
Q

Parameter

A

A numerical value that describes the population (e.g., the population mean); usually unknown

9
Q

Statistic

A

A numerical value calculated from the sample (e.g., the sample mean); used to estimate a parameter

10
Q

Variable

A

A characteristic or condition of an observation that can take on different values

11
Q

Dependent Variable

A

Outcome, variable of interest

12
Q

Independent Variable

A

Exposure, predictor variables

13
Q

Types of Categorical (Qualitative) Data

A

Nominal

Ordinal

14
Q

Nominal Data

A

Values fall into categories or classes that are mutually exclusive and are not ordered
Dichotomous/Binary - Only Two Possible Categories (Dead/Alive)
Multiple Categories - (Race, Blood Type)

15
Q

Ordinal Data

A

Values fall into categories or classes where order matters (Disease stage, satisfaction level)

16
Q

2 Types of Numerical (Quantitative) Data

A

Discrete

Continuous

17
Q

Discrete Data

A

Data has a numerical value that takes only certain whole number values (# of kids in a family)

18
Q

Continuous

A

Data has a numerical value that can have any value in a continuum (height, weight, time)

19
Q

Frequency Distributions - Data Representations

A

Categorical Data - Pie Charts, Bar Charts
Continuous Data - Histogram
Continuous Data - Box Plot

20
Q

Unimodal Frequency Distribution

A

One Peak

21
Q

Bimodal Frequency Distribution

A

Two Peaks

22
Q

Right Skew Frequency Distribution

A

Tail to the right (more low values than high values)

23
Q

Left Skew Frequency Distribution

A

Tail to the left (more high values than low values)

24
Q

Central Tendency Descriptors

A

Mean
Median
Mode

25
Mean
Average
Pro - Uses all data values
Con - Distorted by outliers and skewed data
26
Median
Middle value of the ordered data set
Pro - Not distorted by outliers or skewed data
Con - Ignores most of the information
27
Mode
Most frequently occurring value
Pro - Easily determined for categorical data
Con - Ignores most of the information
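The three central tendency descriptors can be compared on one small data set. The sketch below uses made-up lengths of hospital stay (the `stays` list is a hypothetical example, not data from the deck) with one outlier, to show how the mean is pulled toward the outlier while the median and mode are not.

```python
import statistics

# Hypothetical lengths of hospital stay in days; 40 is an outlier.
stays = [2, 3, 3, 4, 5, 6, 40]

mean_stay = statistics.mean(stays)      # uses every value, so 40 pulls it up
median_stay = statistics.median(stays)  # middle value of the ordered data
mode_stay = statistics.mode(stays)      # most frequently occurring value

print(mean_stay, median_stay, mode_stay)  # 9 4 3
```

Six of the seven stays are shorter than the mean of 9 days, which is why the median is usually preferred for skewed data.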
28
Spread
Measures that describe the variability or dispersion of the data
Range
IQR
Variance
Standard Deviation
29
Range
Difference between largest and smallest values
Pro - Easily determined
Con - Distorted by outliers
30
IQR (Inter-Quartile Range)
Difference between the 25th and 75th percentiles
Pro - Unaffected by outliers; appropriate for skewed data
Con - Ignores values outside the middle 50% of the data
31
Variance
The average of the squared deviations from the mean (each deviation is squared before averaging)
32
Standard Deviation
Square root of the variance; roughly the average deviation of the observations from the mean
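The four measures of spread can be computed side by side. This is a minimal sketch on a made-up data set (`values` is hypothetical). It uses the population formulas (`pvariance`, `pstdev`), which divide by n; the sample versions (`variance`, `stdev`) divide by n - 1 instead. Quartile conventions differ between textbooks and software, so the IQR here reflects the default method of Python's `statistics.quantiles`.

```python
import statistics

# Hypothetical observations with mean 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)     # largest minus smallest

q1, _, q3 = statistics.quantiles(values, n=4)  # 25th, 50th, 75th percentiles
iqr = q3 - q1                               # spread of the middle 50%

variance = statistics.pvariance(values)     # mean of squared deviations
std_dev = statistics.pstdev(values)         # square root of the variance

print(value_range, iqr, variance, std_dev)
```

For this data the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the standard deviation is 2.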
33
Inferential Statistics
The process of drawing conclusions about a population from a sample. Starts with a Null Hypothesis and an Alternative Hypothesis
34
Null Hypothesis (H0)
Assumes no effect in the population
35
Alternative Hypothesis (H1)
Assumes effect in the population
36
Steps for Hypothesis Testing in Inferential Statistics
Assume the Null Hypothesis to be true
Collect data from the sample to try to disprove the Null Hypothesis
Either reject H0 (if there is convincing/strong evidence against it) or fail to reject H0
37
Type 1 Error
Reject the null when the null is actually true | Probability - α
38
Type 2 Error
Fail to reject the null when the null is false | Probability - β
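A simulation can make the meaning of α concrete. The sketch below is an illustration under stated assumptions, not part of the deck: it repeatedly draws samples from a population where the null hypothesis really is true (mean 0, known standard deviation 1), runs a two-sided z-test each time, and counts how often H0 is wrongly rejected. The function name `simulate_type1_rate` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist

def simulate_type1_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Fraction of trials that reject H0 even though H0 is true."""
    rng = random.Random(seed)                       # seeded for repeatability
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cutoff
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the population mean really is 0.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5            # z statistic, sd known = 1
        if abs(z) > critical:
            rejections += 1                         # a Type 1 error
    return rejections / trials

rate = simulate_type1_rate()
print(rate)
```

The printed rate should land close to the chosen α of 0.05, since α is by definition the long-run proportion of true nulls that get rejected.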
39
Power
The probability of rejecting H0 when it is false (not committing a Type 2 error) = 1 - β
Aim for 100%, settle for 80 - 90%
40
Factors influencing power
Sample Size
Variability
Effect of Interest
Significance Level
41
How does Sample Size influence Power?
Power increases with larger samples
42
How does Variability influence Power?
Power increases as variability decreases
43
How does Effect of Interest influence Power?
Power increases with larger effect size
44
How does Significance Level influence Power?
Power increases with larger α
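The four influences on power can be checked numerically. The sketch below assumes the simplest possible setting, a one-sided, one-sample z-test with known standard deviation, where power has a closed form. The function `z_test_power` and the baseline numbers are hypothetical choices for illustration; other tests have different power formulas but follow the same directions.

```python
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha):
    """Power of a one-sided, one-sample z-test with known sd (illustrative)."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)   # rejection cutoff under H0
    shift = effect * n ** 0.5 / sd    # how far the true effect moves z
    return 1 - z.cdf(critical - shift)

base = z_test_power(effect=5, sd=15, n=50, alpha=0.05)

# Change one factor at a time and compare with the baseline.
more_subjects = z_test_power(effect=5, sd=15, n=100, alpha=0.05)
less_variability = z_test_power(effect=5, sd=10, n=50, alpha=0.05)
bigger_effect = z_test_power(effect=8, sd=15, n=50, alpha=0.05)
larger_alpha = z_test_power(effect=5, sd=15, n=50, alpha=0.10)

print(base, more_subjects, less_variability, bigger_effect, larger_alpha)
```

Each of the four variations returns a higher power than the baseline, matching the cards above.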
45
α
The chance of Type 1 Error we are willing to accept, decided prior to collecting data
Typically α = 0.05
Using a smaller α will increase your β
46
P-Value
The probability of obtaining our results or something more extreme given that the null hypothesis is true
47
P < α
Reject H0 and conclude that the results are statistically significant at the α level
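As a worked example of computing a p-value and comparing it with α, the sketch below runs a two-sided, one-sample z-test. The scenario (a sample of 36 people with mean systolic pressure 126, a null mean of 120, and a known standard deviation of 15) and the function name `two_sided_p_value` are made up for illustration.

```python
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, sd, n):
    """P(result at least this extreme | H0 true) for a z-test with known sd."""
    standard_error = sd / n ** 0.5
    z = (sample_mean - null_mean) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))   # both tails

alpha = 0.05   # chosen before looking at the data
p = two_sided_p_value(sample_mean=126, null_mean=120, sd=15, n=36)
reject_h0 = p < alpha

print(round(p, 4), reject_h0)
```

Here the standard error is 15 / 6 = 2.5, so z = 2.4 and the p-value is about 0.016, which is below 0.05 and leads to rejecting H0.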
48
Confidence Interval
Estimated range of values likely to include the population parameter
Reported as: point estimate, 95% CI (lower limit, upper limit)
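A confidence interval for a mean can be built from the point estimate and its standard error. The sketch below assumes a normal approximation with known standard deviation; the function `mean_confidence_interval` and the input numbers (sample mean 126, standard deviation 15, n = 36) are hypothetical.

```python
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    margin = z * sd / n ** 0.5                  # z times the standard error
    return sample_mean - margin, sample_mean + margin

lower, upper = mean_confidence_interval(sample_mean=126, sd=15, n=36)
print(f"126, 95% CI ({lower:.1f}, {upper:.1f})")
```

The interval is roughly 121.1 to 130.9. A null value of 120 falls outside it, which corresponds to statistical significance at α = 0.05, and the interval also shows the size and direction of the effect.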
49
What do P-Values tell you about?
Statistical Significance
50
What do Confidence Intervals tell you?
Statistical Significance + Information about Size and Direction of the effect.
51
Statistical Significance
95% Confidence Interval does not include the null value
A very small difference that is not clinically meaningful can reach statistical significance if the sample size is large enough
52
Clinical Significance
Effect Estimate is above the threshold for clinical relevance
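The difference between statistical and clinical significance can be shown with one small check. The sketch below is a hypothetical illustration: `classify_effect`, the blood-pressure numbers, and the 5 mmHg relevance threshold are all assumptions, and a normal-approximation 95% CI is used.

```python
from statistics import NormalDist

def classify_effect(estimate, standard_error, null_value, clinical_threshold):
    """Return (statistically_significant, clinically_significant)."""
    z = NormalDist().inv_cdf(0.975)             # 95% two-sided interval
    lower = estimate - z * standard_error
    upper = estimate + z * standard_error
    # Statistically significant: the 95% CI excludes the null value.
    statistically_significant = not (lower <= null_value <= upper)
    # Clinically significant: the estimate exceeds the relevance threshold.
    clinically_significant = abs(estimate - null_value) > clinical_threshold
    return statistically_significant, clinically_significant

# Very large trial: a 0.5 mmHg drop measured very precisely.
large_trial = classify_effect(0.5, 0.1, null_value=0, clinical_threshold=5)

# Smaller trial: an 8 mmHg drop measured less precisely.
small_trial = classify_effect(8.0, 2.0, null_value=0, clinical_threshold=5)

print(large_trial, small_trial)  # (True, False) (True, True)
```

The large trial reaches statistical significance only because its standard error is tiny; the 0.5 mmHg effect is still far below the assumed threshold for clinical relevance.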