Biostatistics Flashcards
What is biostatistics and what is its purpose
The collection and analysis of data (statistics), applied specifically to understanding the effects of a drug or medical procedure on people and animals
It is used to interpret medical and pharmacy journals and helps us answer clinical questions from patients and providers. Ex: on a question, we should be able to determine whether a drug is appropriate for a patient based on whether they meet the inclusion/exclusion criteria for a study. Ex: consider relative risk
What is a study manuscript
A description of the research completed with the results
What is peer review
A researcher sends their manuscript to a journal, and the editor sends it to experts in the field for review. The experts assess the research design, the methods, the value of the results, the conclusion, how well it is written, and whether it is appropriate/fitting for the journal. Reviewers decide whether to accept it (usually with revisions) or to reject it.
List the steps to publication
Research Question
Design the Study
Enroll the Subjects
Collect the Data
Analyze the Data
Publish the Data
What is continuous data and the two types
Data (usually numerical) that has a logical order with values that continuously increase or decrease by the same/a measurable amount
Two types are ratio data and interval data
What is ratio data
Continuous type of data with equal differences between values and there IS a meaningful zero. Ex: age, height, BP, weight. Zero blood pressure is meaningful because the pt would be dead
What is interval data
Continuous type of data with equal differences between values but there is NO MEANINGFUL ZERO
Ex: Celsius and Fahrenheit scales. A temperature of zero doesn’t mean no temperature; it is just a point on the scale (it just means it’s cold)
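A small sketch of why the meaningful zero matters, using made-up numbers. The helper name c_to_f and the kg-to-lb factor are assumptions for illustration. With interval data (temperature), "twice as much" changes when the units change; with ratio data (weight), it does not.

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval data: hypothetical temperatures of 10 and 20 degrees Celsius
ratio_c = 20.0 / 10.0                  # 2.0, looks like "twice as hot"
ratio_f = c_to_f(20.0) / c_to_f(10.0)  # 68 / 50 = 1.36, not "twice"

# Ratio data: hypothetical weights of 40 kg and 80 kg
KG_TO_LB = 2.20462                     # approximate conversion factor
ratio_kg = 80.0 / 40.0
ratio_lb = (80.0 * KG_TO_LB) / (40.0 * KG_TO_LB)

print(ratio_c, ratio_f)    # the ratio depends on the scale, so it is not meaningful
print(ratio_kg, ratio_lb)  # the ratio is the same in both units
```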
What is discrete data and the two types
Categorical data
Two types: nominal and ordinal data
What is nominal data
Data that goes into arbitrary categories (names) with no order, like male vs female, ethnicity, marital status, mortality. Yes/no data is a common example.
What is ordinal data
It is ranked and in logical order, such as a pain scale or NYHA Functional Class, but the categories do not increase by the same amount (a pain score of 4 is higher than 2, but that doesn’t mean it is twice the pain)
What are the measures of central tendency and which data types is each preferred for
Mean (preferred for continuous data that is normally distributed)
Median (preferred for ordinal data or continuous data that is skewed)
Mode (preferred for nominal data)
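As a rough illustration of the three measures above, here is a sketch using Python's standard statistics module. All values and variable names are hypothetical, invented only to show one data type per measure.

```python
import statistics

# Hypothetical values, invented for illustration only
systolic_bp = [118, 120, 122, 124, 126]   # continuous, roughly symmetric
pain_scores = [1, 2, 2, 3, 9]             # ordinal (0-10 pain scale)
marital_status = ["single", "married", "married", "divorced"]  # nominal

mean_bp = statistics.mean(systolic_bp)         # average of the values
median_pain = statistics.median(pain_scores)   # middle value when sorted
mode_status = statistics.mode(marital_status)  # most frequent category

print(mean_bp, median_pain, mode_status)
```

A mean would not make sense for marital_status, and the single pain score of 9 pulls the mean of pain_scores above the median, which is why the median is the safer summary there.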
Describe standard deviation
How spread out the data is around the mean, reported as the mean +/- a certain amount (the SD). For normally distributed data:
About 68% of the data will fall within 1 SD of the mean
About 95% of the data will fall within 2 SD of the mean
About 99.7% of the data will fall within 3 SD of the mean
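The percentages above can be checked against a standard normal curve. This is a minimal sketch, assuming Python 3.8+ for statistics.NormalDist; the lab values and the helper name interval are hypothetical.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical lab values, invented for illustration only
values = [4.1, 4.5, 4.8, 5.0, 5.2, 5.5, 5.9]

m = mean(values)
sd = stdev(values)  # sample standard deviation (divides by n - 1)

def interval(k):
    """Return the range covering mean +/- k standard deviations."""
    return (m - k * sd, m + k * sd)

# Share of a normal distribution lying within k SD of the mean
std_normal = NormalDist()  # mean 0, SD 1
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k in (1, 2, 3):
    low, high = interval(k)
    print(f"within {k} SD: {low:.2f} to {high:.2f}, about {coverage[k]:.1%} if normal")
```

The coverage figures only apply when the data is normally distributed; the interval itself can be computed for any data set.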
What is the range
The highest value - the lowest value
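A one-line sketch of the calculation, using hypothetical glucose readings:

```python
# Hypothetical glucose readings, invented for illustration only
glucose = [98, 110, 125, 140, 162]

# Range = highest value - lowest value
data_range = max(glucose) - min(glucose)
print(data_range)  # 162 - 98 = 64
```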
What is the mode
The value that occurs most frequently
What is a gaussian or “normal” distribution vs skewed data
A normal distribution is a symmetrical bell curve, usually seen in continuous data with large sample sizes.
68% of the values fall within 1 SD of the mean and 95% of the values fall within 2 SD of the mean. You can use the mean (preferred), median, or mode to describe the middle, since all three fall at the same point.
Data is “skewed” (not normally distributed) when the sample size is small or there are outliers in the data. With a small number of values, an outlier has a large impact on the mean, so the median (preferred) is a better indicator of central tendency. The graph skews in the direction of the outliers. The median is also used to describe the middle for ordinal data.
Negative skew = left skew
Positive skew = right skew
(skew refers to the direction of the tail of the data, not the hump)
The distortion of central tendency caused by outliers can be reduced by collecting more values
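To make the effect of an outlier concrete, here is a small sketch with hypothetical length-of-stay values. It compares the mean and median before and after one large value is added.

```python
import statistics

# Hypothetical hospital lengths of stay in days, invented for illustration
stays = [2, 3, 3, 4, 5]
stays_with_outlier = stays + [40]  # one unusually long stay

mean_before = statistics.mean(stays)                   # 3.4
median_before = statistics.median(stays)               # 3
mean_after = statistics.mean(stays_with_outlier)       # 9.5
median_after = statistics.median(stays_with_outlier)   # 3.5

print(mean_before, median_before)
print(mean_after, median_after)
```

The single high value pulls the mean from 3.4 to 9.5 while the median barely moves. Because the outlier sits on the high side, the tail points right, so this is a positive (right) skew.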
What is the independent variable
The variable that is changed/manipulated by the researcher
What is the dependent variable
The outcome that is measured