Statistics Flashcards
What kind of data uses whole numbers that are mutually exclusive (e.g. infected vs. not infected)?
Discrete
What kind of data contains information that can be measured on a continuum or scale, has numeric values between the minimum and maximum values, and requires a process of measuring rather than counting?
Continuous
Simplest or crudest level of measurement. Categories are used to classify observations into mutually exclusive groups or classes
Nominal scale
Observations are ranked so that each category is distinct and stands in some definite relationship to each of the other categories.
Ordinal scale
When data meet all the requirements for ordinal data and the exact distance between any two observations on the scale is known.
Interval Scale
sum of values / # of observations
mean
the point at which 50% of the values fall below a middle value and 50% of the values occur above the middle value
Median
middle value of an odd numbered set of values
median
(sum of the 2 middle values) / 2 in an even-numbered set of values
median
Observation that occurs most frequently in a set of data
mode
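A short sketch of the three measures of central tendency defined above, using Python's standard `statistics` module. The data values are made up for illustration.

```python
import statistics

# Hypothetical set of six observations (an even-numbered set).
values = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(values)      # sum of values / number of observations
median = statistics.median(values)  # even count: average of the 2 middle values (3 and 5)
mode = statistics.mode(values)      # observation that occurs most frequently

print(mean, median, mode)
```

With an odd-numbered set, `statistics.median` returns the single middle value instead of an average.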
1 SD will contain ___% of the measurements
68%
2 SD will contain ___% of the measurements
95%
3 SD will contain __% of the measurements
99.7%
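The three percentages above can be checked against the normal curve itself. This is a minimal sketch using `statistics.NormalDist` from the standard library; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The flashcard values (68%, 95%, 99.7%) are the usual rounded forms of these results.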
Positive Skew means
Mean > Median
Normal distribution means
mean, median and mode are equal
Negative skew means
Mean < Median
Mesokurtosis
Normal bell shape
Leptokurtosis
More peaked shape curve
Platykurtosis
flatter shaped curve
Basic formula for all rates
(X/Y) * K
Indicates the risk of disease in a population over a period of time
Incidence
K (constant used to transform equations into uniform quantity)
Chosen so that the smallest calculated rate has at least one digit to the left of the decimal point
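A minimal sketch of the basic rate formula (X/Y) * K and of why K is used. The counts and the choice of patient-days as the denominator are hypothetical.

```python
def rate(events, population, k=1000):
    """Basic rate formula: (X / Y) * K."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * k

# Hypothetical example: 4 new infections over 2,500 patient-days.
raw_proportion = 4 / 2500          # 0.0016, awkward to read and compare
per_1000 = rate(4, 2500, k=1000)   # about 1.6 infections per 1,000 patient-days

print(raw_proportion, per_1000)
```

Choosing K = 1,000 here moves the result to at least one digit left of the decimal point, which is the purpose of the constant.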
Incidence Rate
Equals the number of new cases of a disease in a specified time period, divided by the population at risk during that period, multiplied by K
Prevalence Rate
Equals the number of existing cases of disease at a specified interval or point in time, divided by the population at that time, multiplied by K
The proportion of persons in a population with a particular disease or attribute at specific point in time.
Prevalence
summary measure that compares HAI rates over time among one or more groups of patients to that of a standard population
Standardized Infection Ratio (SIR)
Proportion of persons at risk who become infected over an entire period of exposure
Attack Rate
Attack Rate
(# of new cases / population at risk for the same time period) * 100
Measure of the frequency of death in a defined population, during a specified time (usually a year)
Mortality Rate
Mortality Rate
(# of deaths / estimated population) * K
K = 1,000 for crude rates; K = 100,000 for cause-specific rates
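A sketch of the mortality rate calculation with the two constants named above. The town size and death counts are invented for illustration.

```python
CRUDE_K = 1_000              # crude rates: per 1,000 population
CAUSE_SPECIFIC_K = 100_000   # cause-specific rates: per 100,000 population

def mortality_rate(deaths, population, k):
    """Deaths divided by the estimated population, scaled by the constant K."""
    return deaths / population * k

# Hypothetical population of 50,000 with 400 deaths in a year,
# 15 of which are attributed to one specific cause.
crude = mortality_rate(400, 50_000, CRUDE_K)
cause_specific = mortality_rate(15, 50_000, CAUSE_SPECIFIC_K)

print(f"crude: {crude:.1f} per 1,000")
print(f"cause-specific: {cause_specific:.1f} per 100,000")
```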
Difference in rate of a condition between an exposed population and an unexposed population.
Attributable risk
Attributable Risk Formula
Incidence in exposed - incidence in unexposed
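The attributable risk formula can be sketched as below. The function name and the cohort counts are hypothetical.

```python
def attributable_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Incidence in the exposed group minus incidence in the unexposed group."""
    incidence_exposed = cases_exposed / n_exposed
    incidence_unexposed = cases_unexposed / n_unexposed
    return incidence_exposed - incidence_unexposed

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the condition.
ar = attributable_risk(30, 100, 10, 100)
print(f"attributable risk: {ar:.2f}")  # 0.30 - 0.10
```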
Odds ratio formula
Draw Table and (AD)/(BC)
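A sketch of the odds ratio, assuming the conventional 2x2 layout (the card itself does not spell out the cell labels): A = exposed with disease, B = exposed without disease, C = unexposed with disease, D = unexposed without disease.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (A*D) / (B*C) for a 2x2 table.

    Assumed layout:
                  disease   no disease
      exposed        a          b
      unexposed      c          d
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when B or C is zero")
    return (a * d) / (b * c)

# Hypothetical table: 30/70 among the exposed, 10/90 among the unexposed.
print(round(odds_ratio(30, 70, 10, 90), 2))
```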
Probability of not having a disease given a negative screening test result in the screened population
Negative predictive value
Probability of having the disease given a positive screening test result in the screened population
Positive predictive value
The higher the prevalence of the disease, PPV ____________ & NPV ________________
PPV increases, NPV decreases
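The effect of prevalence on the predictive values can be seen by holding the test fixed and varying only prevalence. This sketch assumes a hypothetical test with 90% sensitivity and 90% specificity; the function name is made up for the example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)   # P(no disease | negative test)
    return ppv, npv

# Same hypothetical test, three different prevalence levels.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

As prevalence rises, the printed PPV climbs and the NPV falls, matching the card.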
Relative risk equation
Draw table
(A/R1) / (C/R2)
If R (relative risk) equals 1…
There is no significant association
If R < 1
There is a negative association
If R > 1…
There is a positive association
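A sketch of the relative risk calculation and its interpretation. It assumes R1 and R2 are the row totals of the 2x2 table (R1 = A + B for the exposed row, R2 = C + D for the unexposed row), which the card does not state explicitly; the function names are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Relative risk (A/R1) / (C/R2), assuming R1 = A + B and R2 = C + D."""
    r1 = a + b   # total exposed
    r2 = c + d   # total unexposed
    return (a / r1) / (c / r2)

def describe(rr):
    """Map a relative risk to the direction of association."""
    if rr > 1:
        return "positive association"
    if rr < 1:
        return "negative association"
    return "no association"

# Hypothetical table: 30 of 100 exposed and 10 of 100 unexposed become ill.
rr = relative_risk(30, 70, 10, 90)
print(f"RR = {rr:.2f}: {describe(rr)}")
```

The exact comparison with 1 is a simplification for the sketch; it says nothing about statistical significance.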
The probability of committing a Type I error is referred to as
Significance level
This type of error means rejecting the null hypothesis when it is true and attributing significance where there is none
Type I (alpha)
This type of error means accepting the null hypothesis when it is false or not attributing significance when it exists
Type II (Beta)
You can reduce a Type I error by…
Decreasing the size of the rejection region by keeping the alpha level very small (0.05, 0.01)
You can reduce Type II errors by
Increasing the sampling size
The p value in statistical test results indicates
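The probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true (compared against alpha, the accepted probability of a Type I error)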
What parametric test checks whether the means of two sample groups differ, for a sample size > 30 with a normal distribution?
Z test
What parametric test is used when the sample size is < 30?
T test
T/F: Non-parametric tests (such as Chi-square) make no assumptions about the distribution of population values.
True
The probability that a test correctly identifies patients without disease as negative
Specificity
The probability that a test correctly identifies as positive patients who have the disease
sensitivity
Sensitivity Equation
TP/(TP + FN)
Specificity Equation
TN/(TN + FP)
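A sketch tying the two definitions to counts: sensitivity is the share of patients with the disease who test positive, TP/(TP + FN), and specificity is the share of patients without the disease who test negative, TN/(TN + FP). The screening counts are hypothetical.

```python
def sensitivity(tp, fn):
    """Share of patients WITH the disease that the test flags as positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of patients WITHOUT the disease that the test reports as negative."""
    return tn / (tn + fp)

# Hypothetical screening results:
# 100 patients with disease -> 90 true positives, 10 false negatives
# 200 patients without disease -> 180 true negatives, 20 false positives
print(f"sensitivity: {sensitivity(90, 10):.2f}")
print(f"specificity: {specificity(180, 20):.2f}")
```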
Chi-square Equation
Sum of (O - E)^2 / E over all cells (O = observed count, E = expected count)
This non-parametric type of test is used for medium to large samples and tests the association between two classifications of a set of counts or frequencies.
Chi-square
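A sketch of the chi-square statistic for a table of counts, summing (O - E)^2 / E over every cell. It assumes the usual expected count for a test of association, E = (row total * column total) / grand total, which the card does not state. The table is hypothetical, and the sketch stops at the statistic without looking up a p value.

```python
def chi_square(observed):
    """Chi-square statistic for a table of counts given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if the two classifications were independent.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic

# Hypothetical 2x2 table: rows = exposed / unexposed, columns = infected / not infected.
table = [[30, 70], [10, 90]]
chi2 = chi_square(table)
print(chi2)  # expected counts are 20, 80, 20, 80, so the sum is 5 + 1.25 + 5 + 1.25 = 12.5
```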
This non-parametric test is used in place of the chi-square when the sample size is < 20
Fisher’s Exact
The _____ of a test is its ability to detect a specified difference (e.g. the probability of rejecting the null hypothesis when it is false)
Power
Power of a hypothesis is affected by these 3 factors
- Sample size: the bigger the sample size, the greater the power
- Significance level: the higher the significance level (alpha), the greater the power
- The greater the difference between the “true” value of the parameter and the value specified in the null hypothesis, the greater the power of the test.
What kind of graph shows a frequency distribution with values of the variable on the x-axis & the number of observations on the y-axis; data points are plotted at the midpoints of the intervals and are connected with a straight line?
Frequency polygon
The proportion of cases attributable (and avoidable) to this exposure in relation to all cases.
Attributable Risk Percent (ARP)
Attributable Risk Percent Calculation
((relative risk - 1) / relative risk) * 100
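A sketch of the attributable risk percent, assuming the standard form ((RR - 1) / RR) * 100, where RR is the relative risk. The function name and the input value are hypothetical.

```python
def attributable_risk_percent(relative_risk):
    """Percent of cases among the exposed attributable to the exposure."""
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    return (relative_risk - 1) / relative_risk * 100

# Hypothetical relative risk of 3: the exposed are three times as likely to become ill.
arp = attributable_risk_percent(3.0)
print(f"ARP: {arp:.1f}%")
```

A relative risk of 1 gives an ARP of 0%, consistent with no association between exposure and disease.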
Precision of the relative risk is related to…
the power of a study.
This type of chart uses calculated upper & lower limits over time
control charts
This type of chart is useful in infection control (IC) because more than one error per patient can be taken into account
U Charts
This type of chart provides a range of expected variation about a mean and the upper & lower limits beyond which the process is considered out of control
control charts
This type of chart is useful in conveying changes in rates over time & identifying points in time when rates are outside the expected range
control charts
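A sketch of how upper and lower limits might be calculated for a U chart. The cards give no formula, so this assumes the conventional 3-sigma limits, center line ± 3 * sqrt(center line / n), where n is the exposure in each period. The monthly counts and patient-day figures are invented.

```python
import math

def u_chart_limits(events, exposure):
    """Return the center line and per-period (lower, upper) control limits.

    Assumes conventional 3-sigma U chart limits; this is an illustration,
    not a formula taken from the flashcards.
    """
    u_bar = sum(events) / sum(exposure)   # overall rate = center line
    limits = []
    for n in exposure:
        half_width = 3 * math.sqrt(u_bar / n)
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, limits

# Hypothetical monthly data: infection counts and exposure in thousands of patient-days.
events = [4, 6, 3, 12, 5]
exposure = [2.0, 2.5, 1.8, 2.2, 2.4]

u_bar, limits = u_chart_limits(events, exposure)
for count, n, (lower, upper) in zip(events, exposure, limits):
    observed = count / n
    status = "in control" if lower <= observed <= upper else "out of control"
    print(f"rate {observed:.2f} (limits {lower:.2f} to {upper:.2f}): {status}")
```

Because exposure differs from month to month, each month gets its own pair of limits around the same center line.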