Introduction to Biostats (Lecture 2) Flashcards
What is nominal data?
- data can be placed into specific categories when order does not matter
- only a few values exist
- mutually exclusive
- “yes or no” - event either occurred or it did not occur
What is the difference between nominal and ordinal data?
ordinal data - has narrowly defined categories with few possible values in which the ranking-order matters
nominal data - data placed in specific categories when order does not matter
What kind of data would “male vs. female” be?
nominal data
Name type of data for: patient died or did not die
nominal data
Type of data used for: RASS Scale in ICU Sedation
Ordinal
What are the two types of discrete/non-parametric data?
- nominal
- ordinal
Nominal and ordinal data are types of ________ data, which is a type of ____ data
discrete/non-parametric data; qualitative data
Characteristics of discrete/non-parametric data:
- “categorical” variables
- usually have a few possible values
- can be counted but limited mathematical manipulation
Scales, counts, and dichotomous outcomes (e.g., mortality) are examples of _______
discrete (non-parametric) data
What are the two types of continuous/parametric data?
- interval
- ratio
Characteristics of continuous/parametric data:
- May have an infinite number of values
- exist on some defined scale
- More mathematical manipulation possible
Weight, temperature, and age are examples of _____
continuous/parametric data
which type of data lacks a defined and meaningful zero point?
interval data
What is the difference between interval and ratio data?
interval data lacks a defined and meaningful zero point, whereas ratio data has a meaningful minimum or zero on the scale
- interval data can also take negative values (e.g., temperature), whereas ratio data cannot (e.g., weight, drug concentration, heart rate, blood pressure)
what is absolute zero?
a true zero point, where zero means none of the measured quantity is present (exists in ratio data)
Type of data for: Celsius or Fahrenheit temperature (degrees)
interval
data where measurements are on a defined scale and the differences between ranked values are meaningful
interval and ratio data
type of data that can be compared
interval and ratio
weight, drug concentration, heart rate, and blood pressure are examples of what type of data?
ratio data
List and describe the 3 measures of central tendency:
- mean - the average of all values
- median - value in the middle of numerical order
- mode - value that appears most often
Which measure of central tendency is best when describing interval and ratio data?
mean
Which central tendency is best when outliers are present?
median
Which central tendency is best for non-numerical qualities?
mode
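A minimal sketch of the three measures, using Python's standard `statistics` module. The heart-rate readings and blood types are made-up values for illustration, not data from the lecture. The single high reading shows why the median is preferred when outliers are present.

```python
import statistics

# Hypothetical heart-rate readings (beats per minute); 140 is an outlier.
heart_rates = [60, 62, 62, 65, 70, 140]

mean_hr = statistics.mean(heart_rates)      # average of all values
median_hr = statistics.median(heart_rates)  # middle value in numerical order
mode_hr = statistics.mode(heart_rates)      # value that appears most often

print(mean_hr)    # 76.5 (pulled upward by the outlier)
print(median_hr)  # 63.5 (barely affected by the outlier)
print(mode_hr)    # 62

# The mode also works for non-numerical (nominal) data.
blood_types = ["O", "A", "O", "B"]  # hypothetical categories
mode_blood_type = statistics.mode(blood_types)
print(mode_blood_type)  # O
```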
What are the 4 measures of distribution/spread?
- range
- interquartile range
- variance
- standard deviation
What is the purpose of measures of spread/distribution?
they are used to describe how data are spread to provide information about the variability in the distribution
the difference between the lowest and highest value in the data is _________
range
What is interquartile range?
the difference between the 25th and 75th percentile (median is the 50th percentile)
What is variance?
How far the values of a variable lie from the mean (calculated as the average of the squared differences from the mean)
What is preferred over variance?
standard deviation
What is the standard deviation?
the spread of the observations, expressed in the same units as the original data (the square root of the variance)
provides insight into the dispersion of data points around the mean
Larger standard deviation indicates ____ in data
greater variability
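The four measures of spread can be sketched with Python's standard `statistics` module. The weights are made-up values, and the sample (n - 1) formulas are an assumption, since the cards do not say whether sample or population formulas are intended. Different software also uses slightly different quartile methods, so the interquartile range may differ a little between tools.

```python
import statistics

# Hypothetical patient weights in kilograms.
weights_kg = [60, 65, 70, 75, 80]

# Range: difference between the highest and lowest value.
data_range = max(weights_kg) - min(weights_kg)

# Interquartile range: 75th percentile minus 25th percentile.
# statistics.quantiles with n=4 returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(weights_kg, n=4)
iqr = q3 - q1

# Sample variance (in kg squared) and sample standard deviation (in kg).
variance = statistics.variance(weights_kg)
std_dev = statistics.stdev(weights_kg)

print(data_range)  # 20
print(iqr)         # 15.0
print(variance)    # 62.5
print(round(std_dev, 2))
```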
____ indicates the data is not evenly distributed around the mean
skewness
What does “skewness” appear as on a graph?
data are more concentrated to either the right or left of the mean value (Left skewed/right skewed) and there is a longer tail on one side
What does normal distribution appear as when plotted?
bell curve - distribution is symmetrical around the mean; mean, median and mode are equal
What kinds of data does normal distribution apply to?
continuous - interval or ratio data
What are the two specific distributions of a bell curve?
- z-distribution
- t-distribution
What are the 6 ways that results are displayed graphically?
- Boxplot (Box and whisker plot)
- Stem and leaf plots
- bar charts/histograms
- pie chart
- scatterplot
- frequency table
A boxplot represents a set of _____ data
continuous
What is an independent variable?
what is being manipulated (e.g., a new blood pressure medication)
- not influenced by other variables
What is the dependent variable?
what is being measured (e.g., blood pressure reduction)
- outcome of interest (a study typically has more than one)
- value predicted by independent variables
What are control variables?
what is held constant (e.g., age, sex, ethnicity)
What are confounding variables?
a variable that impacts the dependent variable but is not controlled by the researcher (e.g., genetic predispositions)
What is “the range to provide the precision of an estimate with an acceptable level of error”?
interval estimate/confidence interval
range of values that is believed to encompass the actual population value
confidence interval
For a ____ between two values, if the % confidence interval does not cross 0, then the result is _______
difference; statistically significant
For a ___ between two values, if the % confidence interval does not cross 1, then the result is _______
ratio; statistically significant
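The two confidence-interval rules above can be sketched as one check: a result is statistically significant when the interval excludes the "no effect" value (0 for a difference, 1 for a ratio). The function name and the interval values are hypothetical, chosen only for illustration.

```python
def excludes_null_value(ci_lower, ci_upper, null_value):
    """Return True when the confidence interval does not cross null_value."""
    return not (ci_lower <= null_value <= ci_upper)

# Hypothetical confidence interval for a DIFFERENCE (mmHg): compare against 0.
difference_ci = (-8.2, -1.5)
difference_significant = excludes_null_value(*difference_ci, null_value=0)

# Hypothetical confidence interval for a RATIO (relative risk): compare against 1.
ratio_ci = (0.85, 1.30)
ratio_significant = excludes_null_value(*ratio_ci, null_value=1)

print(difference_significant)  # True: the interval does not cross 0
print(ratio_significant)       # False: the interval crosses 1
```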
What is the difference between a null and alternative hypothesis?
null hypothesis is what you are trying to disprove (the assertion that no difference exists between groups)
alternative hypothesis is the desired result/what you want to see (proving that a difference between groups does exist)
What is tested in hypothesis testing?
only the null hypothesis
In hypothesis testing, conclusions are made with respect to the _____.
null hypothesis
What does it mean to “reject a null hypothesis”?
a difference was found
What does it mean to “fail to reject” the null hypothesis?
found no difference
What are the steps to hypothesis testing?
- formation of alternative hypothesis (Ha) from clinical question
- then null hypothesis (H0) is formulated
- hypothesis testing of only the null hypothesis
- conclusions are made with respect to the null hypothesis
T/F It is only possible to reject or fail to reject the null hypothesis
T
What are the errors that hypothesis testing is prone to making?
- type I error - alpha - conventionally set at a 5% error rate - rejecting the null when the null is true (saying there IS a difference when there is NOT a difference)
- type II error - beta - conventionally set at a 20% error rate - failing to reject the null when the null is false (saying there is NO difference when a difference does exist)
Probability of committing one of the errors ____when repeating the analysis
increases
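One way to see why the error probability grows: if each analysis has a type I error rate of alpha, the chance of at least one false positive across several analyses is 1 - (1 - alpha)^k. This sketch assumes the k tests are independent, which is a simplification not stated in the cards, and the function name is hypothetical.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one type I error across independent tests."""
    return 1 - (1 - alpha) ** num_tests

# With alpha = 0.05, the chance of a false positive climbs with each test.
for k in (1, 5, 10):
    print(k, round(familywise_error_rate(0.05, k), 3))
```

Under this assumption, one test carries a 5% risk, five tests about 23%, and ten tests about 40%.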
What is “power”?
the ability to find a difference if one exists
power = ___ - ____
1 - β (β = type II error rate)
What is the power if type II error rate is 20%?
power of 80% (% chance that a difference will be found if it exists)
The higher the power, the ____ the probability of type II error
lower
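The relationship between power and type II error is simple arithmetic, sketched below. The function name is hypothetical.

```python
def power_from_beta(beta):
    """Power is 1 minus the type II error rate (beta)."""
    if not 0 <= beta <= 1:
        raise ValueError("beta must be a probability between 0 and 1")
    return 1 - beta

# A 20% type II error rate corresponds to 80% power.
power_at_20 = power_from_beta(0.20)
# A lower type II error rate gives higher power.
power_at_10 = power_from_beta(0.10)

print(round(power_at_20, 2), round(power_at_10, 2))
```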
what is p-value?
the probability of seeing a difference at least as large as the one observed, assuming the null hypothesis is true (no difference exists); often taught as the probability of committing a type I error
what does p=0 indicate?
no likelihood that a difference seen is due to chance
what does p=1 indicate?
observed difference is definitely due to chance
The smaller the p-value, the ____ the evidence for rejecting the null hypothesis
stronger
What does p-value not tell us?
- does not tell us the probability of the null hypothesis being true
- it does not indicate the “size of effect”
- does not tell us whether something is practically or clinically significant
What does statistical significance refer to?
the results of an analysis (Is the difference due to random variation?)
When can you conclude that the results are statistically significant?
- when there is enough evidence to reject the null hypothesis
- p-value < α (reject null and accept alternative) - this states that the probability the observed difference is due to chance is < α
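The decision rule above can be sketched as a small function. The default alpha of 0.05 is the conventional threshold, used here as an assumption, and the function name is hypothetical. Only two outcomes are possible, which matches the "reject or fail to reject" cards.

```python
def hypothesis_decision(p_value, alpha=0.05):
    """Apply the rule: reject the null hypothesis only when p < alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(hypothesis_decision(0.03))  # reject the null hypothesis
print(hypothesis_decision(0.20))  # fail to reject the null hypothesis
```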
T/F results either are or are not statistically significant, there is no in between or gray area
T
What is clinical significance?
the practical significance of findings (will the results cause a change in practice?)
NNT
number needed to treat (involved with risk/benefit measurement for clinical significance)
NNH
number needed to harm (involved with risk/benefit measurement for clinical significance)
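The cards do not give the formulas, so the sketch below uses the standard definitions as an assumption: NNT = 1 / absolute risk reduction and NNH = 1 / absolute risk increase. Rounding NNT up and NNH down is a common teaching convention, also assumed here. The event counts and function names are hypothetical. `Fraction` is used to avoid floating-point rounding surprises.

```python
import math
from fractions import Fraction

def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """NNT = 1 / absolute risk reduction, rounded up (assumed convention)."""
    arr = Fraction(control_events, control_n) - Fraction(treated_events, treated_n)
    if arr <= 0:
        raise ValueError("treatment did not reduce risk; NNT is undefined")
    return math.ceil(1 / arr)

def number_needed_to_harm(treated_harms, treated_n, control_harms, control_n):
    """NNH = 1 / absolute risk increase, rounded down (assumed convention)."""
    ari = Fraction(treated_harms, treated_n) - Fraction(control_harms, control_n)
    if ari <= 0:
        raise ValueError("treatment did not increase harm; NNH is undefined")
    return math.floor(1 / ari)

# Hypothetical trial: 20/100 events with control vs. 15/100 with treatment.
nnt = number_needed_to_treat(20, 100, 15, 100)
# Hypothetical adverse effect: 8/100 with treatment vs. 5/100 with control.
nnh = number_needed_to_harm(8, 100, 5, 100)

print(nnt)  # 20
print(nnh)  # 33
```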