Biostatistics II Flashcards
check on learning
a variable is also known as _____?
data
any measurement that can have different values across individuals (e.g., height, weight, age, gender)
variable
what are the roles of a variable?
predictor variable (independent variable/explanatory variable)
outcome variable (dependent variable/response variable)
confounding variable
Intervention/treatment/therapy is an example of what type of variable?
predictor variable
disease status is an example of what type of variable?
outcome variable
demographic information like age and gender are examples of what type of variable?
confounding variable
confounding variable is treated as what kind of variable in statistical modeling?
predictor variables
data that are measured in a numerical scale and are expressed as numbers
Quantitative
data that are described by words rather than numbers
Qualitative
Systolic and diastolic blood pressure, height, weight
this is what type of data?
quantitative data
open ended questions in surveys, eye color, Likert scale questions
this is what type of data?
qualitative data
quantitative data has these two types of classification?
discrete or continuous
interval or ratio
what are the categorical variables?
nominal
dichotomous
ordinal
what type of categorical variable has the following characteristics?
Labels
No order information
Nominal
what type of categorical variable has the following characteristics?
Only two possible labels
No order information
Dichotomous (a special case of Nominal)
what type of categorical variable has the following characteristics?
Order (rank) information is maintained
Differences/ratios do not make sense
Ordinal
blood type and occupation are what type of categorical variable?
nominal
Test result: Positive/Negative
is what type of categorical variable?
Dichotomous (a special case of Nominal)
Likert scale questions
Academic tier
are what type of categorical variable?
ordinal
what are the two types of quantitative variables? what are their subsets?
discrete and continuous
interval and ratio
this type of quantitative variable has the following characteristics?
Possible values are integers
Have a countable number of values
Discrete
this type of quantitative variable has the following characteristics?
Can have any value in a range
Continuous
Number of patients who were exposed to a risk factor
Number of patients who visited ER during last weekend
is an example of what type of quantitative variable?
discrete
Amount of time between meal being served and onset of gastro-intestinal symptoms
Infant mortality rate
is an example of what type of quantitative variable?
continuous
discrete and continuous variables are each subdivided into which two types?
interval and ratio
is a variable with the following characteristics interval or ratio?
No absolute zero exists
Differences make sense but not ratios
Only permits addition and subtraction
Interval
is a variable with the following characteristics interval or ratio?
Zero represents a true absence
Ratios make sense
Ratio
this example:
Temperature in Celsius/Fahrenheit scale
IQ score (most standardized test scores in psychology)
is interval or ratio?
interval
this example:
Temperature in Kelvin scale
Salary in $
Weight
Risk and proportion
is interval or ratio?
ratio
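The interval/ratio distinction can be checked with a little arithmetic. The sketch below is an illustration added to these cards, with made-up temperatures and a hypothetical helper name: differences agree between Celsius and Kelvin, but a ratio only has meaning on the Kelvin scale, where zero is a true absence.

```python
def celsius_to_kelvin(temp_c: float) -> float:
    """Convert a Celsius temperature to Kelvin."""
    return temp_c + 273.15

# Two made-up temperatures for illustration
cool_c, warm_c = 10.0, 20.0
cool_k, warm_k = celsius_to_kelvin(cool_c), celsius_to_kelvin(warm_c)

# Differences make sense on both scales and agree with each other
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# The Celsius ratio is 2.0, but 20 degrees C is not "twice as hot" as 10 degrees C;
# on the Kelvin (ratio) scale the two temperatures differ by only a few percent
ratio_c = warm_c / cool_c
ratio_k = warm_k / cool_k

print(diff_c, diff_k, ratio_c, ratio_k)
```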
what should we consider when determining the type of a variable?
the research context and how it was measured
Example: Age by nature is continuous and ratio, but it may be of other types:
Age as ordinal: Age group 0-17, 18-60, 60+
Age as discrete: age at the last birthday, age on the day of medical school interview
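The age example can be sketched in Python. This is a minimal illustration rather than part of the original cards: the function names are hypothetical, and because the groups 18-60 and 60+ overlap at 60 as written, the sketch assumes an age of exactly 60 at the last birthday belongs to 18-60.

```python
import math

def age_last_birthday(exact_age: float) -> int:
    """Discrete version: whole years completed at the last birthday."""
    return math.floor(exact_age)

def age_group(exact_age: float) -> str:
    """Ordinal version: one of three ordered age groups."""
    years = age_last_birthday(exact_age)
    if years <= 17:
        return "0-17"
    if years <= 60:  # assumption: 60 is placed in the 18-60 group
        return "18-60"
    return "60+"

exact_age = 34.7  # continuous, ratio-scale measurement (made-up value)
print(exact_age, age_last_birthday(exact_age), age_group(exact_age))
```

The same underlying quantity ends up as a continuous, a discrete, or an ordinal variable depending on how it is recorded.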
what are the tools used in descriptive statistics?
numerical measurements
graphical tools
which type of descriptive statistics tool do these examples belong to?
Measures of central tendency
Measures of dispersion
numerical measurements
Tools to choose depend on the type of variables
which type of descriptive statistics tool do these examples belong to?
Frequency table
Bar chart/Pie chart
Histogram/Box plot
Scatterplot
graphical tools
Tools to choose depend on the type of variables
what are the measures of central tendency?
mean
median
mode
this measure of central tendency is an average of all values?
mean
this measure of central tendency is the middle value of the ordered dataset?
median
this measure of central tendency is the most frequently occurring value?
mode
this measure of central tendency has the following characteristics:
Not robust to extreme values
Makes most sense for quantitative variables
mean
this measure of central tendency has the following characteristics:
Robust to extreme values
Makes most sense for quantitative variables
Can also be used for ordinal variables
median
this measure of central tendency has the following characteristics:
Can be used to describe both categorical and quantitative variables
Can have more than one mode
mode
which of the measures of central tendency is applicable to a categorical variable such as gender (e.g., female)?
mode
which of the measures of central tendency are applicable to age?
mean and median
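A quick way to see these properties is to compute all three measures on a small sample. The data below are made up for illustration, and the sketch uses Python's standard `statistics` module (`multimode` requires Python 3.8 or later).

```python
import statistics

ages = [23, 25, 25, 31, 90]        # made-up ages; 90 is an extreme value
sexes = ["F", "M", "F", "F", "M"]  # made-up categorical data

mean_age = statistics.mean(ages)        # pulled upward by the extreme value
median_age = statistics.median(ages)    # middle of the ordered data, robust
modes_age = statistics.multimode(ages)  # list, since there can be several modes
modes_sex = statistics.multimode(sexes) # the mode also works for categories

print(mean_age, median_age, modes_age, modes_sex)
```

The single value of 90 moves the mean well above the median, while a mean or median of the categorical column would not be meaningful at all.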
what are the four measures of dispersion?
variance
standard deviation
range
interquartile range
what is variance?
The average squared deviation of the values from the mean (how close the values are to the mean)
what is standard deviation?
Square root of variance
what is range?
Max value - min value
what is interquartile range?
Q3 - Q1
these measures of dispersion have the following characteristics:
Not robust to extreme values
Makes most sense for quantitative variables
variance
standard deviation
range
this measure of dispersion has the following characteristics:
Robust to extreme values
Makes most sense for quantitative variables
Interquartile Range
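The four measures of dispersion can be computed side by side as a check on the definitions. This is a sketch on made-up data using the standard `statistics` module. It assumes the sample variance (divisor n - 1) and the module's default quartile method; other software may use a different quartile convention and give slightly different Q1 and Q3.

```python
import math
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # made-up sample

variance = statistics.variance(data)  # sample variance, divisor n - 1
std_dev = statistics.stdev(data)      # square root of the variance
value_range = max(data) - min(data)   # max value - min value

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(variance, std_dev, value_range, iqr)
```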
A table listing the possible values for a variable, together with their (relative) frequencies.
frequency table
this type of table is used:
for categorical variables
to count the number of times the observations fall into a certain category
relative frequency = frequency/total
frequency table
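The relative frequency formula can be sketched as a small frequency table. The blood-type observations are made up for illustration; `collections.Counter` is from the Python standard library.

```python
from collections import Counter

# Made-up observations of a categorical variable
blood_types = ["A", "O", "O", "B", "A", "O", "AB", "O"]

frequency = Counter(blood_types)  # count per category
total = sum(frequency.values())

# relative frequency = frequency / total
relative_frequency = {
    category: count / total for category, count in frequency.items()
}

for category, count in frequency.most_common():
    print(f"{category:>2}  {count}  {relative_frequency[category]:.3f}")
```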
a circular diagram divided into segments, each representing a category of the variable.
Pie Chart
a type of graph for presenting categorical variable in such a way that each observation can fall into one and only one category of the variable.
Bar Chart
a graphical representation of the frequency distribution of a quantitative variable.
Histogram
the significance of a histogram is to allow the inspection of the data for its underlying distribution, outliers, skewness, etc.
Histogram
what do the following steps construct?
Step 1: split the data into intervals, called bins, and tabulate the frequency in each bin.
Step 2: use the table from step 1 to construct the histogram.
histogram
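The two steps can be sketched without a plotting library. Everything here is an assumption made for illustration: the data, the bin width, the helper name `bin_counts`, and the choice of half-open bins [lower, upper).

```python
def bin_counts(values, start, width, n_bins):
    """Step 1: count how many values fall in each bin [lower, upper)."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:  # values outside the bins are ignored
            counts[index] += 1
    return counts

# Made-up data: hours between a meal and onset of symptoms
hours = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 6.0, 7.5]
start, width, n_bins = 0, 2, 4

counts = bin_counts(hours, start, width, n_bins)

# Step 2: draw one bar per bin from the frequency table
for i, count in enumerate(counts):
    lower = start + i * width
    print(f"[{lower}, {lower + width})  {'#' * count}")
```

The bar lengths show the shape of the distribution; changing the bin width changes the picture, so it is worth trying more than one.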
this type of plot is used to present the distribution of a variable measured on a numerical scale (discrete or continuous).
why?
Box plot
It allows easy inspection of potential outliers and group comparisons (side-by-side box plot).
the following is a guideline to use when it comes to potential outliers?
Any point below the lower extreme/fence (Q1 - 1.5 × IQR)
Any point above the upper extreme/fence (Q3 + 1.5 × IQR)
IQR = Q3 - Q1
Criteria for potential outliers (marked as individual points on the plot)
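The fence rule can be sketched as follows. The data and the function name `outlier_fences` are made up for illustration, and the quartiles come from the default method of `statistics.quantiles`, which may differ slightly from the convention used by other software.

```python
import statistics

def outlier_fences(values):
    """Return (lower fence, upper fence) using the 1.5 x IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [2, 4, 4, 5, 7, 9, 11, 40]  # made-up sample with one extreme value

lower_fence, upper_fence = outlier_fences(data)

# Points outside the fences are only *potential* outliers
potential_outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(lower_fence, upper_fence, potential_outliers)
```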
what is the significance of a box plot?
Side-by-side box plot allows visualization of group comparisons.
it is usually drawn before working out a correlation or fitting a regression line
pattern
direction
strength
anomalies
these are indications for a _____ (what type of plot)?
scatterplot
this describes which feature to look for in a scatterplot?
unusual features such as outliers or clusters of points
anomalies
this type of anomaly consists of points that deviate from the overall pattern
outliers
this type of anomaly consists of groups of points separated from one another
clusters
how do we define the scatterplot trends?
strong, positive, linear
no overall trend
strong, quadratic
clustered
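One reason to draw the scatterplot before working out a correlation is that a single number can hide the pattern. The sketch below is an added illustration, not part of the original cards. It assumes the Pearson correlation coefficient as the measure of correlation, written out by hand in a hypothetical helper `pearson_r`, and uses made-up data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: sign gives direction, magnitude gives linear strength."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]
linear_ys = [2 * x + 5 for x in xs]  # strong, positive, linear
quadratic_ys = [x ** 2 for x in xs]  # strong, quadratic

r_linear = pearson_r(xs, linear_ys)
r_quadratic = pearson_r(xs, quadratic_ys)

print(r_linear, r_quadratic)
```

The quadratic data follow a perfect curve, yet the coefficient comes out at zero because it only measures linear association; the plot reveals what the number misses.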