Exam 3 Flashcards
Probability
Deals with the relative likelihood that a certain event will or will not occur, relative to some other events
Probability is a synonym of
proportion
Occurrence of an event is just as likely as it is unlikely at probability
0.5
Probability values assigned to each experimental outcome must be between
0 and 1
The sum of all experimental outcome probabilities must be
1
Outcome space/sample space
All possible outcomes.
Mutually exclusive and exhaustive
Complement of an event
The complement of event A is the event consisting of all sample points that are not in A
Probability of intersect (joint)
Probability of both A and B occurring at the same time
Two events A and B that are not mutually exclusive
Probability of conjoint (union)
The probability of either A or B (or both) occurring
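The complement, intersection, and union cards can be checked on a small example. This is a minimal sketch assuming one roll of a fair six-sided die; the events A and B and the helper `prob` are made up for illustration.

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event), len(sample_space))

p_complement_A = 1 - prob(A)                    # P(not A)
p_intersection = prob(A & B)                    # P(A and B), both occur
p_union = prob(A) + prob(B) - p_intersection    # P(A or B), addition rule

print(p_complement_A, p_intersection, p_union)
```

Because A and B are not mutually exclusive, the intersection is subtracted once so the shared outcomes are not counted twice.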
Conditional probability
p(A|B)
The probability of an outcome, given that a certain value is already known for a second variable.
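As a rough illustration of p(A|B) = p(A and B) / p(B), the sketch below uses invented counts for smoking status and disease. The numbers and labels are assumptions, not data from the course.

```python
from fractions import Fraction

# Hypothetical counts of people by smoking status and disease status.
counts = {
    ("smoker", "disease"): 30,
    ("smoker", "no disease"): 70,
    ("non-smoker", "disease"): 10,
    ("non-smoker", "no disease"): 90,
}
total = sum(counts.values())

# P(B): probability of being a smoker.
p_smoker = Fraction(
    counts[("smoker", "disease")] + counts[("smoker", "no disease")], total
)

# P(A and B): probability of being a smoker with the disease.
p_disease_and_smoker = Fraction(counts[("smoker", "disease")], total)

# P(A|B): probability of disease given that the person is a smoker.
p_disease_given_smoker = p_disease_and_smoker / p_smoker

print(p_disease_given_smoker)
```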
Non-parametric Chi-square test
Involves discrete variables (categorical data). It does not require assumptions of homogeneity or normality
Nominal (yes/no) data
Evaluates the difference between the frequencies actually observed in the sample and the frequencies expected
Df of chi squared
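n-1 for goodness of fit, where n is the number of categories
(r-1)(c-1) for test of independence, where r is the number of rows and c is the number of columns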
Chi squared test, if the calculated value is less than the critical value what happens?
We fail to reject the null hypothesis.
No significant difference
Chi square test of independence
To determine if the two discrete variables are independent of each other or if an association exists.
Attempting to find out if one variable predicts another variable.
Data is displayed in a contingency table of rows and columns
Chi square, what do we do when the calculated statistic is greater than the critical value?
Reject the null hypothesis
There is a significant difference
Assumptions of chi square test of independence
There must be at least one observation in every cell, no empty cells
The expected value for each cell must equal or be larger than 5
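The test of independence and its assumptions can be sketched in plain Python. The 2x2 table below is invented, and 3.841 is the standard table critical value for 1 degree of freedom at alpha = 0.05; treat this as an illustration, not a replacement for a statistics package.

```python
# Hypothetical 2x2 contingency table of observed counts (rows x columns).
observed = [[20, 30],
            [30, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = (row total * column total) / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Assumption check: every expected count should be at least 5.
assumption_ok = all(e >= 5 for row in expected for e in row)

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (number of rows - 1) * (number of columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Critical value for df = 1 at alpha = 0.05.
critical_value = 3.841
reject_null = chi_square > critical_value

print(chi_square, df, assumption_ok, reject_null)
```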
Yates' Chi Square for conservative estimation
Used in certain situations when testing for independence on a contingency table.
Produces a smaller numerator and a more conservative estimate for the chi square statistic; harder to reject null
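To see why the correction is more conservative, the sketch below computes the statistic with and without subtracting 0.5 from each |observed - expected| on an invented 2x2 table. It assumes every |observed - expected| is larger than 0.5, and the function name `chi_square` is a hypothetical helper.

```python
# Hypothetical 2x2 contingency table of observed counts.
observed = [[20, 30],
            [30, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(observed, expected, yates=False):
    """Chi-square statistic; with yates=True, 0.5 is subtracted from each
    |observed - expected| before squaring, shrinking the numerator."""
    correction = 0.5 if yates else 0.0
    return sum(
        (abs(o - e) - correction) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

critical_value = 3.841  # df = 1, alpha = 0.05

uncorrected = chi_square(observed, expected)
corrected = chi_square(observed, expected, yates=True)

print(uncorrected, uncorrected > critical_value)
print(corrected, corrected > critical_value)
```

With these made-up counts the uncorrected statistic exceeds the critical value while the corrected one does not, which is the "harder to reject null" behavior the card describes.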
Fisher's Exact Test for less data
Used when data for a chi-square test of independence is reduced to 2x2 and the expected values are still too small or a cell contains a zero
Can be non-parametric median test or probability for multiple tests
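One common way to get a Fisher's exact p-value for a 2x2 table is to add up hypergeometric probabilities with the row and column totals held fixed. The sketch below uses invented counts and the "tables no more likely than the observed one" definition of a two-sided p-value, which is one convention among several.

```python
from fractions import Fraction
from math import comb

# Hypothetical 2x2 table [[a, b], [c, d]] with small counts.
a, b, c, d = 3, 1, 1, 3

row1, row2 = a + b, c + d
col1 = a + c
n = row1 + row2

def table_probability(x):
    """Hypergeometric probability of x in the top-left cell
    when the row and column totals are held fixed."""
    return Fraction(comb(row1, x) * comb(row2, col1 - x), comb(n, col1))

# Values the top-left cell can take given the fixed totals.
lowest = max(0, col1 - row2)
highest = min(row1, col1)

p_observed = table_probability(a)

# Two-sided p-value: sum over every table no more likely than the observed one.
p_value = sum(
    table_probability(x)
    for x in range(lowest, highest + 1)
    if table_probability(x) <= p_observed
)

print(float(p_value))
```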
McNemar's Test
Used to evaluate the relationship or independence of paired discrete variables (analogous to the paired t-test)
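A minimal sketch of the uncorrected McNemar statistic, assuming invented before/after counts for the same patients. Only the discordant pairs enter the calculation, and the result is compared with the chi-square critical value for 1 degree of freedom.

```python
# Hypothetical paired yes/no data: the same patients measured twice.
# Only discordant pairs (the answer changed) enter the statistic.
yes_then_no = 15   # pairs that went from yes to no
no_then_yes = 5    # pairs that went from no to yes

# McNemar statistic (no continuity correction): (b - c)^2 / (b + c).
mcnemar_statistic = (yes_then_no - no_then_yes) ** 2 / (yes_then_no + no_then_yes)

critical_value = 3.841  # chi-square, df = 1, alpha = 0.05
reject_null = mcnemar_statistic > critical_value

print(mcnemar_statistic, reject_null)
```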
When to use OR vs RR?
ORs are more commonly reported in case-control studies, cohort studies, and clinical trials.
The OR can be used as an estimate of the RR.
If the outcome of interest is rare, OR ≈ RR.
Otherwise the OR either overestimates the RR (when OR > 1) or underestimates it (when OR < 1).
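The rare-outcome rule can be checked numerically. The sketch below computes RR and OR for two invented cohorts, one with a rare outcome and one with a common outcome; the function name and all counts are assumptions for illustration.

```python
def risk_ratio_and_odds_ratio(exposed_events, exposed_total,
                              unexposed_events, unexposed_total):
    """Return (RR, OR) from event counts in an exposed and an unexposed group."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    rr = risk_exposed / risk_unexposed

    odds_exposed = exposed_events / (exposed_total - exposed_events)
    odds_unexposed = unexposed_events / (unexposed_total - unexposed_events)
    return rr, odds_exposed / odds_unexposed

# Rare outcome: 2% vs 1% risk, so OR stays close to RR.
rr_rare, or_rare = risk_ratio_and_odds_ratio(20, 1000, 10, 1000)

# Common outcome: 60% vs 30% risk, so OR overestimates RR.
rr_common, or_common = risk_ratio_and_odds_ratio(600, 1000, 300, 1000)

print(rr_rare, or_rare)
print(rr_common, or_common)
```

Both cohorts have the same RR, but only the rare-outcome cohort has an OR close to it.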
NNT
The number of patients needed to treat with the specified therapy in order for one patient to benefit.
Inverse of ARR
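As a quick worked example of NNT = 1 / ARR, the sketch below assumes invented event rates of 20% in the control group and 15% in the treated group, and rounds up to a whole patient, which is the usual convention.

```python
from fractions import Fraction
from math import ceil

# Hypothetical event rates (proportion of patients with the bad outcome).
control_event_rate = Fraction(20, 100)
treated_event_rate = Fraction(15, 100)

# Absolute risk reduction: difference in event rates between the groups.
arr = control_event_rate - treated_event_rate

# Number needed to treat: inverse of ARR, rounded up to a whole patient.
nnt = ceil(1 / arr)

print(float(arr), nnt)
```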
Logistic regression
An extension of multiple regression methods for use where the dependent variable is dichotomous (dead/alive)
Determines the predicted probability of the outcome based on the combo of predictor values
Simple logistic regression
Logistic regression with one independent and one dichotomous dependent variable
How do you solve the problem of probabilities not being linear?
Transformation to convert into a linear expression
How do you transform a probability into a linear expression?
1.) Convert probability to odds
2.) Take the natural log of the odds
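The two steps above can be sketched as a pair of small functions. The function names are hypothetical; the second one reverses the transformation to get back to a probability.

```python
from math import exp, log

def probability_to_log_odds(p):
    """Step 1: convert probability to odds. Step 2: take the natural log."""
    odds = p / (1 - p)
    return log(odds)

def log_odds_to_probability(log_odds):
    """Reverse the transformation to recover the probability."""
    return 1 / (1 + exp(-log_odds))

p = 0.8                                   # odds of 4 to 1
log_odds = probability_to_log_odds(p)     # natural log of 4
recovered = log_odds_to_probability(log_odds)

print(log_odds, recovered)
```

A probability of 0.5 corresponds to odds of 1 and log odds of 0, so the log-odds scale is centered on "just as likely as unlikely".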
b1 regular regression
For a one-unit change in x we expect an average change in Y of b1 units, holding all other variables constant
b1 logistic regression
For a one-unit change in x, we expect the predicted odds to change on average by a multiplicative factor of e^b1, holding all else constant
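To illustrate the multiplicative interpretation, the sketch below assumes a simple logistic model with invented coefficients b0 = -2.0 and b1 = 0.7 and compares the predicted odds at x = 2 and x = 3.

```python
from math import exp

# Hypothetical coefficients from a simple logistic regression.
b0, b1 = -2.0, 0.7

def predicted_odds(x):
    """Predicted odds: e raised to the linear predictor b0 + b1 * x."""
    return exp(b0 + b1 * x)

def predicted_probability(x):
    """Predicted probability recovered from the odds."""
    odds = predicted_odds(x)
    return odds / (1 + odds)

# Ratio of the odds after and before a one-unit increase in x.
odds_ratio = predicted_odds(3) / predicted_odds(2)

print(odds_ratio, exp(b1))
print(predicted_probability(2), predicted_probability(3))
```

The ratio of odds equals e^b1 for any one-unit step in x, while the change in predicted probability depends on where x starts.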
Cox proportional hazard regression
Accounts for the effects of continuous and discrete predictor variables on the dependent variable, which can include censored time-until-event data
To compare survival in two or more levels of an independent variable adjusting for multiple covariates
What is research?
A systematic approach to find answers to questions
Applied vs Basic research
Applied- solving problems
Basic- understanding problems
Cross-sectional study
It takes place at a single point in time
Longitudinal study
Takes place over time. You have at least 2 waves
Exploratory Research
Investigating or discovering something that was previously unknown
Zika, Ebola, swine flu
Descriptive Research
A systematic attempt to symbolize the obvious relationships that are found among the natural phenomena under study
Association between variables, frequency of occurrence
Explanatory research
To explain (show causal relations between variables) and predict relations
Discovers the answer to the question
Reveals gaps in our understanding
Research problem
A question demanding a settlement
Expresses a relationship between 2 or more variables, usually in the form of a question, should imply a method of empirical testing
Research hypothesis
A conjectural statement, a tentative proposition about the relation between two or more variables. It is a prediction from theory under test
Bias of systematic error
Causes some type of constant error in the measurement with a system
Bias of random errors
Chance errors
Unpredictable and will vary in sign (+ or -)
Selection bias
Certain characteristics make potential observations more or less likely to be included in the study
It occurs when confounding factors are unevenly distributed between experimental and control groups.
Also called cherry picking
Funding bias
It is possible that there is a different quality of research between industry funding and trials without external funding.
Financial interest may bias interpretation
Publication bias
A selective publication of trials with certain results may lead to an exaggeration of effects
Reliability
A collection of factors and judgments that, when taken together, are a measure of reproducibility. The consistency of measures
Validity
Refers to the fact that the data represents a true measurement
A valid piece of data describes or measures what it is supposed to represent
Conclusion validity
Is there a relationship between the cause and effect?
Means the conclusion we reach about the relationship is reasonable.
Internal validity
Is the relationship causal?
Example- whether the program causes the outcome
Construct validity
Can we generalize to constructs?
If there is a causal relationship in a study, did you measure the outcomes you wanted to measure?
Can you claim that the research program reflected your intended construct of the program?
External validity
Can we generalize to other persons, places, times?
Descriptive data analysis
Range, mean, SD, percentage, CI
Inferential statistics
Parametric- t-test, ANOVA
Non-parametric- chi square, logistic regression