Unit 3 Flashcards
What are the 3 attributes of study variables?
Order / Magnitude
Consistency of scale / equal distances
Rational absolute zero
What are the 3 levels of data measurement?
Nominal
Ordinal
Interval/ Ratio
Nominal
Dichotomous/ Binary
Non-ranked (non-ordered)
Named categories
- categorical data
- can be more than 2 categories
- ex: 2 genders, 2 age groups
No order/ magnitude
No consistency of scale or equal distances
Nominal variables are simply labeled variables without quantitative characteristics
Examples of Nominal variables
What is your gender?
- Male or Female
What is your hair color?
- brown vs black vs blonde vs grey vs other
Education level (if made binary)
Smoking vs non-smoking
Ordinal
Ordered and order-able
Rank-able categories
Non-equal distance between ranges
- technically can be equal and unequal
Unitless
Yes order/ magnitude
No consistency of scale or equal distances
- No units or scales
- No even spacing between them
Data is collected in categories and can be ordered
Examples of Ordinal variables
Pain Scales
- patient decides what each value means
Strongly agree > somewhat agree > Neither > somewhat disagree > strongly disagree
SES
- unitless, broken into categories
Interval/ Ratio
Order/ magnitude
Equal Distances
- has units (e.g., cm, m/s, mg/dL)
- equal spaces between scales
Interval
- Arbitrary 0 value
- 0 doesn't mean absence
- Can be 0 or negative values
Ratio
- Absolute 0 value
- 0 means absence of measurement value
- No negative values
- Ex: physiological parameters (blood pressure, blood sugar)
Examples of Interval/ Ratio variables
Living siblings and personal age
Height in cm
Speed in m/s
LDL in mg/ dL
Mean
Average value
Median
Middle value
Mode
Most common value
This is the most useful measurement for descriptive statistics
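A minimal Python sketch of the three measures of central tendency from the cards above. The readings are made-up values for illustration only, and the standard-library statistics module is assumed.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), invented for this example
readings = [118, 120, 120, 125, 131, 140, 162]

mean_value = statistics.mean(readings)      # average value
median_value = statistics.median(readings)  # middle value of the sorted data
mode_value = statistics.mode(readings)      # most common value

print("mean:", round(mean_value, 1))
print("median:", median_value)
print("mode:", mode_value)
```

In this sample the single high reading (162) pulls the mean above the median, while the median and mode are unaffected by it.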
What is descriptive statistics?
Tells us about our population
Describes our population
Range
Maximum - minimum
Interquartile Range
Q3 = 75th percentile (cutoff for the top 25%)
Q1 = 25th percentile (cutoff for the bottom 25%)
Middle 50% = Q3 - Q1
- represents the 25% above and the 25% below the median
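A short sketch of the range and interquartile range in Python. The data are hypothetical, and the "inclusive" quartile method is just one of several conventions, so other software may give slightly different Q1 and Q3 values.

```python
import statistics

# Hypothetical measurements, invented for this example
values = [2, 4, 4, 5, 7, 9, 10, 12, 15]

# Range = maximum - minimum
data_range = max(values) - min(values)

# Quartile cut points; method="inclusive" treats the data as the whole population
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Interquartile range = Q3 - Q1, the spread of the middle 50%
iqr = q3 - q1

print("range:", data_range)
print("Q1:", q1, "median:", q2, "Q3:", q3)
print("IQR:", iqr)
```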
Variance
The average of the squared differences between each individual measurement value and the group's mean
Describes the spread of data
Variance from the mean
Standard Deviation
Square root of variance
Restores units of mean
Describes spread of data
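The variance and standard deviation definitions above can be sketched step by step as follows. The values are hypothetical, and this version divides by n (population variance) to match "the average of the squared differences"; sample variance divides by n - 1 instead, which is what statistics.variance computes.

```python
import math
import statistics

# Hypothetical measurements, invented for this example
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = statistics.mean(values)

# Squared difference between each value and the group's mean
squared_diffs = [(x - mean_value) ** 2 for x in values]

# Variance = average of the squared differences (population form, divides by n)
variance = sum(squared_diffs) / len(values)

# Standard deviation = square root of variance, back in the units of the mean
std_dev = math.sqrt(variance)

print("mean:", mean_value)
print("variance:", variance)
print("standard deviation:", std_dev)
```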
Normal Distribution
Symmetrical
Mean and median are (almost) equal
Equal dispersion of curve (tails) to both sides of mean
Statistical tests useful for normally distributed data are known as _____
Parametric Tests
Required assumptions of interval/ ratio data for proper selection of parametric tests
- Normal distribution
- Equal variances
- use Levene’s Test
- Randomly derived and independent
Levene’s Test
Test used to check whether groups have equal variances (it does not test whether data are normally distributed)
Used to assess if the variances are different between groups
Null Hypothesis: group variances are equal
Tries to show that there is a difference between groups
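As a rough illustration of what Levene's Test measures, the sketch below computes the W statistic by hand using absolute deviations from each group's mean. The function name levene_w and the data are hypothetical, and no p value is computed here; in practice W is compared against an F distribution, and a library routine such as scipy.stats.levene would normally be used instead.

```python
import statistics

def levene_w(*groups):
    """Levene's W statistic (mean-centered form) for two or more groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of each value from its own group's mean
    deviations = []
    for g in groups:
        center = statistics.fmean(g)
        deviations.append([abs(y - center) for y in g])

    group_means = [statistics.fmean(d) for d in deviations]
    grand_mean = sum(sum(d) for d in deviations) / n_total

    # Spread of the deviations between groups vs. within groups
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for d, m in zip(deviations, group_means) for v in d)

    return (n_total - k) / (k - 1) * between / within

# Hypothetical groups with the same mean but very different spread
narrow = [10, 11, 12, 13, 14]
wide = [2, 7, 12, 17, 22]

w_different = levene_w(narrow, wide)
w_equal = levene_w([1, 2, 3], [11, 12, 13])

print("W, unequal spread:", round(w_different, 2))
print("W, equal spread:", w_equal)
```

A larger W is stronger evidence against the null hypothesis of equal variances; groups with identical spread give W = 0.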
How to handle interval data that is not normally distributed
- Use a statistical test that does not require the data to be normally distributed
- non-parametric tests
- step down and run ordinal test
- Transform data to a standardized value
- hope that the transformation allows data to be normally distributed
- z score or log transformation
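A minimal sketch of the two transformations named above, using hypothetical right-skewed values. Note that a z score only re-centers and re-scales the data (mean 0, standard deviation 1), while a log transformation compresses the long right tail.

```python
import math
import statistics

# Hypothetical right-skewed measurements, invented for this example
values = [1, 10, 100, 1000, 10000]

# Log transformation: pulls in large values
log_values = [math.log10(x) for x in values]

# z score: (value - mean) / standard deviation
mean_value = statistics.fmean(values)
std_dev = statistics.pstdev(values)
z_scores = [(x - mean_value) / std_dev for x in values]

print("raw mean vs median:", mean_value, statistics.median(values))
print("log mean vs median:", statistics.fmean(log_values), statistics.median(log_values))
print("z scores:", [round(z, 2) for z in z_scores])
```

On the raw scale the mean is far above the median; after the log transformation the two are essentially equal.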
Positively Skewed
Asymmetric distribution with one tail longer than the other
Mean > median
- mean is higher than median
Tail points to the right
Negatively Skewed
Asymmetric distribution with one tail longer than the other
Mean < median
- mean is lower than median
Tail points to the left
What effect do outliers have on skewness?
Outliers pull the tails out farther
Contributes to skewness
Skewness
A measure of the asymmetry of a distribution
Perfectly normal distribution is symmetric
- skewness = 0
Negative skewness = negatively skewed data
Positive skewness = positively skewed data
Kurtosis
A measure of the extent to which observations cluster around the mean
- how peaked a value is
Kurtosis of normal distribution curve = 0 (when reported as excess kurtosis)
Positive kurtosis = more cluster (around a number)
- closer to a positive value
Negative kurtosis = less cluster (around a number)
- closer to a negative value
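The skewness and kurtosis cards above can be illustrated with moment-based formulas. This is a sketch under assumptions: the function names and data are hypothetical, population (divide by n) moments are used, and kurtosis is reported as excess kurtosis (minus 3) so that a normal curve scores 0, matching the card. Statistical packages may apply small-sample corrections and give slightly different numbers.

```python
import statistics

def skewness(values):
    """Average cubed z score: 0 if symmetric, > 0 right tail, < 0 left tail."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - m) / sd) ** 3 for x in values) / len(values)

def excess_kurtosis(values):
    """Average fourth-power z score minus 3, so a normal curve gives 0."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - m) / sd) ** 4 for x in values) / len(values) - 3

# Hypothetical samples, invented for this example
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 12]      # one high outlier: mean > median
left_tailed = [-12, -3, -2, -2, -1]  # one low outlier: mean < median
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]  # tightly clustered around the mean

print("skewness:", skewness(symmetric), skewness(right_tailed), skewness(left_tailed))
print("excess kurtosis:", excess_kurtosis(symmetric), excess_kurtosis(peaked))
```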
68%
Within 1 standard deviation of the mean (normal distribution)
95%
Within 2 standard deviations of the mean (normal distribution)
99.7%
Within 3 standard deviations of the mean (normal distribution)
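The 68 / 95 / 99.7 percentages can be checked against the standard normal curve with Python's statistics.NormalDist. The helper name share_within is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, "SD:", round(share_within(k) * 100, 1), "%")

# The exact cutoff that captures the middle 95% is slightly below 2
cutoff_95 = standard_normal.inv_cdf(0.975)
print("95% cutoff:", round(cutoff_95, 2))
```

The percentages on the cards are rounded; the 95% figure corresponds more precisely to about 1.96 standard deviations.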
Null Hypothesis
Research perspective which states there will be no true difference between the groups being compared
Most conservative and most commonly utilized
At end of study, either need to accept or reject the null
Can take on the superiority, noninferiority, and equivalency perspectives
Alternative Hypothesis
Research perspective which states there will be a true difference between the groups being compared
Type 1 error
Alpha
Not accepting the null hypothesis when it is actually true and you should have accepted it
Rejecting the null hypothesis when you shouldn’t have
Ex: Telling a man he is pregnant
Type 2 error
Beta
Accepting the null hypothesis when it is actually false and you should not have accepted it
Not rejecting the null hypothesis when you should have
Ex: Telling a pregnant woman she is not pregnant
P value
Probability value (compared against alpha)
The probability of getting, due to chance alone, a test statistic value as extreme as or more extreme than the one actually observed if the groups were similar (not different)
Represents your chances of being wrong
If p < 0.05, risk of experiencing a type 1 error is acceptably low
T/F: If p value is lower than the pre-selected alpha (5% or 0.05), it is statistically significant
True
Do you accept or reject the null hypothesis if p value < alpha?
REJECT
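One way to see what "due to chance alone" means is an exact permutation test: shuffle the group labels every possible way and count how often the difference is at least as extreme as the one observed. This is only an illustrative sketch with made-up data, not the test any particular study would use.

```python
from itertools import combinations
from statistics import fmean

# Hypothetical outcome scores, invented for this example
control = [1, 2, 3, 4]
treated = [5, 6, 7, 8]
alpha = 0.05

pooled = control + treated
observed = abs(fmean(treated) - fmean(control))

extreme = 0
total = 0
# Every way of relabeling 4 of the 8 values as "treated"
for chosen in combinations(range(len(pooled)), len(treated)):
    group_a = [pooled[i] for i in chosen]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    total += 1
    # Small tolerance guards against floating-point rounding
    if abs(fmean(group_a) - fmean(group_b)) >= observed - 1e-12:
        extreme += 1

p_value = extreme / total
reject_null = p_value < alpha

print("p value:", round(p_value, 3))
print("reject null:", reject_null)
```

Only 2 of the 70 possible relabelings are as extreme as the observed split, so p is about 0.029, which is below alpha and the null hypothesis is rejected.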
What is the interpretation of a p value?
The probability of making a type 1 error if the null hypothesis is rejected
If the data is statistically significant and there are 3+ groups, the p value tells you what?
Tells you that there is at least 1 difference present
Guaranteed difference between control and most extreme value
Lowest and highest value represent the difference
At baseline, do we want groups to be equal?
Yes
At start of study/ baseline, p value should ideally be close to 1.0
Want p values to start above 0.05
Power
The statistical ability of a study to detect a true difference if one truly exists between group comparisons and therefore the level of accuracy in correctly accepting/ not accepting the null hypothesis
If there is truly a difference between groups, study has high power
Studies are set up to have 80% power
When lose people, lose power
What is the mathematical representation of power?
1 - beta
= 1 - type 2 error rate
We allow a type 2 error rate of 20%
- accept a 20% risk of missing a true difference
1 represents the total probability (100%); sample size affects power by changing beta
The _____ people the study has, the _____ the power
More
Higher
What are the common elements utilized in determining sample size of a study?
Minimum difference between groups deemed significant
- the smaller the difference between groups necessary to be considered significant, the greater the sample size needed
Expected variation of measurement
Alpha and beta error rates and confidence interval
How does sample size affect power?
The larger the sample size, the greater the likelihood of detecting a difference if one truly exists
Increases power
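The link between sample size, minimum difference, variation, alpha, and power can be sketched with a normal approximation for comparing two group means. The function approx_power and all numbers below are hypothetical; real sample size calculations depend on the planned statistical test and are usually done with dedicated software.

```python
from statistics import NormalDist

def approx_power(n_per_group, difference, sd, alpha=0.05):
    """Approximate power (1 - beta) for a two-sided, two-group comparison of means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)             # cutoff set by alpha
    standard_error = sd * (2 / n_per_group) ** 0.5
    # Chance the observed difference clears the cutoff if the true difference is real
    return z.cdf(abs(difference) / standard_error - z_crit)

# Hypothetical scenario: true difference of 5 units, standard deviation of 10
for n in (20, 40, 64, 100):
    print(n, "per group -> power", round(approx_power(n, 5, 10), 2))

# A smaller minimum difference needs more people for the same power
print("difference 2.5, n = 64:", round(approx_power(64, 2.5, 10), 2))
```

In this made-up scenario, roughly 64 people per group are needed to reach the usual 80% power, and losing participants pushes power back down.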
What is the number one way you can ensure your study has power?
To show the difference between groups if a difference is really present
When is it okay for p values to be non-statistically significant?
Start of study
Levene’s Test
What is a caveat of p values?
They do not tell us about spread/ dispersion
Confidence Intervals
Precision measurement
CIs around a difference between groups help the reader understand where the true value may lie
Calculated at an a priori percentage of confidence that statistically includes the real (yet unknown) difference or relationship being compared
What advantage do confidence intervals have over p values?
Tells us about statistical significance and spread
What are confidence intervals based on?
Variation in sample
Sample size
If you don’t use the same directional word when interpreting CI, is the data statistically significant?
NO
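If CI values for a ratio (e.g., odds ratio, relative risk) cross 1.0, data is not statistically significant; for a difference between groups, the value to watch is 0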
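A rough sketch of a 95% confidence interval for a difference between two group means, showing the "same directional word" idea: if both limits fall on the same side of the no-difference value, the result is statistically significant. For a difference in means the no-difference value is 0 (for ratios it is 1.0). The function name and data are hypothetical, and a normal approximation is assumed; with samples this small a t-based interval would be somewhat wider.

```python
from statistics import NormalDist, fmean, stdev

def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Approximate CI for mean(group_a) - mean(group_b) using a normal cutoff."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    difference = fmean(group_a) - fmean(group_b)
    # Standard error grows with variation and shrinks with sample size
    standard_error = (stdev(group_a) ** 2 / len(group_a)
                      + stdev(group_b) ** 2 / len(group_b)) ** 0.5
    return difference - z_crit * standard_error, difference + z_crit * standard_error

# Hypothetical systolic blood pressure values (mmHg), invented for this example
treated = [110, 112, 115, 117, 121]
control = [130, 133, 135, 138, 139]
similar_a = [120, 125, 130, 135, 140]
similar_b = [118, 124, 129, 136, 141]

low, high = mean_difference_ci(treated, control)
print("treated vs control:", round(low, 1), "to", round(high, 1))  # both limits lower

low2, high2 = mean_difference_ci(similar_a, similar_b)
print("similar groups:", round(low2, 1), "to", round(high2, 1))    # crosses 0
```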