Hypothesis testing Flashcards
What is the domain / study sample?
The population in which the question is answered
What is the exposure/ determinant?
The causes and other factors that influence the occurrence of disease and other health-related events
What is the ‘Outcome’?
The endpoint of interest
What is the definition of Data?
Information, such as facts or numbers collected together to be examined
What is the definition of Variables?
The attributes that vary among the individuals or units being studied
What are nominal characteristics?
Characteristics with no ordering of the categories
What are ordinal characteristics?
Characteristics with a clear ordering of the categories
e.g. scores of relative abundance
What is discrete data?
Data that only takes integer values
e.g. number of leaves
What is continuous data?
Data that can take almost any numeric value within a range and can be divided into finer and finer levels
What are descriptive statistics?
Summarising data by using tables, diagrams and summary measures
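As a small illustration of "summary measures", the sketch below computes a few of them with Python's standard `statistics` module. The leaf counts are made-up values used only for the example.

```python
import statistics

# Hypothetical sample: number of leaves counted on eight plants.
leaves = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "n": len(leaves),
    "mean": statistics.mean(leaves),
    "median": statistics.median(leaves),
    "sample_sd": statistics.stdev(leaves),  # sample SD, divides by n - 1
    "min": min(leaves),
    "max": max(leaves),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```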
What are inferential statistics?
Trying to reach conclusions that apply to the entire population, based on data from a sample
What is a confidence interval?
An estimated interval within which an unknown parameter may plausibly lie
Specifically, a 95% CI is the result of a procedure that, in 95% of the cases in which its assumptions are correct, will contain the true parameter value.
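The "procedure" reading of a 95% CI can be checked by simulation. The sketch below is only an illustration under simplifying assumptions: the population is normal, its standard deviation is treated as known (so a z-interval applies), and the true mean, sample size and number of simulated studies are arbitrary values chosen for the example.

```python
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN = 10.0  # hypothetical population mean (unknown in a real study)
SIGMA = 2.0       # population standard deviation, assumed known here
N = 30            # sample size of each simulated study
STUDIES = 2000    # number of simulated studies

z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval


def ci_95(sample):
    """95% z-interval for the mean, assuming a known SIGMA."""
    centre = sum(sample) / len(sample)
    half_width = z * SIGMA / len(sample) ** 0.5
    return centre - half_width, centre + half_width


covered = 0
for _ in range(STUDIES):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    low, high = ci_95(sample)
    if low <= TRUE_MEAN <= high:
        covered += 1

coverage = covered / STUDIES
print(f"Intervals containing the true mean: {coverage:.1%}")
```

The printed share should land close to 95%. Any single interval either contains the true mean or it does not; the 95% describes the procedure over many repetitions.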
What is a null hypothesis?
The hypothesis (denoted by H0) that there is no difference between the populations under investigation
What is hypothesis testing?
The formal procedure for deciding whether to reject or fail to reject a statistical hypothesis
What is the alternative hypothesis?
The hypothesis that is contrary to the null hypothesis, denoted by H1
What is the p-value?
The probability of getting an effect at least as extreme as the one observed, if the null hypothesis is true
What is the effect of a small p-value?
The smaller the p-value, the stronger the evidence against the null hypothesis
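One way to make the p-value definition concrete is a permutation test: if the null hypothesis is true, the group labels are interchangeable, so we can shuffle them and count how often the shuffled difference in means is at least as large as the observed one. This is a minimal sketch with made-up measurements; the function name `permutation_p_value` is hypothetical, and a permutation test is just one illustrative choice of method.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical measurements for two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4]
group_b = [4.1, 4.4, 4.0, 4.8, 4.2, 4.6]


def permutation_p_value(a, b, n_permutations=2000):
    """Two-sided permutation p-value for a difference in means."""
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        random.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


p_value = permutation_p_value(group_a, group_b)
print(f"Estimated p-value: {p_value:.4f}")
```

Because the two made-up groups do not overlap at all, very few shuffles reproduce a difference this large, so the estimated p-value is small.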
What is a t-test for?
Comparing the means of two groups,
provided that the two samples are independent, the variables are normally distributed, and the groups have approximately the same variance
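For illustration, the pooled-variance (Student's) t statistic that matches the assumptions above can be computed by hand. The data and the function name `pooled_t_statistic` are hypothetical. Turning the statistic into a p-value needs the t distribution, which Python's standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) is one common option.

```python
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t statistic for two independent samples with equal variances."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical measurements for two independent groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.4, 4.0, 4.8, 4.2]

t_stat, df = pooled_t_statistic(group_a, group_b)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```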
What is an ANOVA test for?
Comparing the means of more than two groups
provided that the samples are independent and the variables are normally distributed
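As a sketch of what a one-way ANOVA computes, the F statistic is the ratio of between-group to within-group variability. The function name `one_way_anova_f` and the data are hypothetical; converting F to a p-value needs the F distribution, which is outside Python's standard library.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three independent groups.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```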
What is a chi-squared test for?
Comparing two proportions,
provided that the sample size is large and each individual is represented only once
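A minimal sketch of the chi-squared test for a 2x2 table of counts follows (no continuity correction). The function name `chi_squared_2x2` and the counts are hypothetical. With one degree of freedom, the chi-squared statistic is the square of a standard normal variable, which is why the p-value can be obtained from the normal distribution in the standard library.

```python
from statistics import NormalDist


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two proportions were equal.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # 1 degree of freedom: chi2 = Z**2, so use the two-sided normal tail.
    p_value = 2 * (1 - NormalDist().cdf(chi2 ** 0.5))
    return chi2, p_value


# Hypothetical counts: rows are groups, columns are outcome yes / no.
table = [[30, 10], [10, 30]]
chi2, p_value = chi_squared_2x2(table)
print(f"chi-squared = {chi2:.2f}, p = {p_value:.2g}")
```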
What is a power analysis?
A calculation that helps you determine the minimum sample size your study needs to detect an effect of a given size
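As a rough sketch, the sample size per group for comparing two means can be approximated with the normal distribution: n = 2 * ((z for alpha + z for power) * sigma / delta)^2. The function name `sample_size_per_group` and its parameters are hypothetical, and the formula assumes two equal-sized groups, a two-sided test and a known common standard deviation; dedicated software based on the t distribution gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per group to detect a mean difference `delta`.

    delta: smallest difference in means worth detecting
    sigma: assumed common standard deviation
    alpha: two-sided significance level (type I error rate)
    power: desired probability of detecting the difference
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return ceil(n)  # round up to a whole participant


n_needed = sample_size_per_group(delta=0.5, sigma=1.0)
print(f"Participants needed per group: {n_needed}")
```

With these example inputs (a difference of half a standard deviation, 5% significance, 80% power) the approximation gives 63 per group. Halving the difference to be detected roughly quadruples the required sample size.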
What does statistics involve, broadly?
- Collection and analysis of data
- Summarising information in order to aid understanding
- Interpretation of analyses and drawing conclusions from the data
What are the two different types of data?
basic classification
- Categorical/ Qualitative
- Numerical/ Quantitative
How would you define categorical/ qualitative data?
They describe a characteristic that cannot be easily measured but can be observed subjectively
How would you describe numerical or quantitative data?
They describe a measurable quality on a well-defined scale
What are some different ways you can calculate confidence intervals?
Depends on what you are measuring
* means
* proportions
* difference in means
* relative risk
* odds ratio
* regression coefficients
under the general assumption that the data come from a normal distribution
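As one example from the list above, a confidence interval for a proportion can be sketched with the normal approximation (often called the Wald interval). The function name `wald_ci_proportion` and the counts are hypothetical; the approximation is only reasonable for large samples and proportions not too close to 0 or 1.

```python
from statistics import NormalDist


def wald_ci_proportion(successes, n, level=0.95):
    """Normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p_hat = successes / n
    standard_error = (p_hat * (1 - p_hat) / n) ** 0.5
    # Clip to [0, 1] because a proportion cannot fall outside that range.
    low = max(0.0, p_hat - z * standard_error)
    high = min(1.0, p_hat + z * standard_error)
    return low, high


# Hypothetical result: 50 of 100 participants had the outcome.
low, high = wald_ci_proportion(50, 100)
print(f"95% CI for the proportion: {low:.3f} to {high:.3f}")
```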
What does the p-value summarise?
The p-value summarises the strength of the evidence against the null hypothesis
What are six principles that address the misconceptions and misuse of the p-value?
- p-values can indicate how incompatible the data are with a specified statistical model.
- p-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
- Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
- Proper inference requires full reporting and transparency.
- A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
- By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
Why is power analysis important?
If the study does not have enough ‘power’, it may not be able to detect a true ‘effect’, making it a waste of time and money, and unethical
What is a type I error?
A false positive
i.e. incorrectly rejecting a true null hypothesis (a positive result)
What is a type II error?
A false negative
i.e. incorrectly retaining a false null hypothesis
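Both error types can be illustrated by simulation. The sketch below assumes a two-sided z-test of H0: mean = 0 with a known standard deviation of 1; the significance level, sample size, true effect of 0.5 and number of simulated studies are arbitrary values chosen for the example, and `rejects_null` is a hypothetical helper.

```python
import random
from statistics import NormalDist

random.seed(42)  # fixed seed so the run is repeatable

ALPHA = 0.05    # significance level
N = 25          # sample size of each simulated study
STUDIES = 4000  # number of simulated studies per scenario
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)


def rejects_null(true_mean):
    """Simulate one study and test H0: mean = 0 (known SD of 1)."""
    sample = [random.gauss(true_mean, 1.0) for _ in range(N)]
    z = (sum(sample) / N) * N ** 0.5  # sample mean divided by its standard error
    return abs(z) > z_crit


# Null hypothesis true: every rejection is a false positive (type I error).
type_i_rate = sum(rejects_null(0.0) for _ in range(STUDIES)) / STUDIES

# Null hypothesis false (true mean 0.5): every non-rejection is a
# false negative (type II error).
type_ii_rate = sum(not rejects_null(0.5) for _ in range(STUDIES)) / STUDIES

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

The type I error rate should sit near the chosen significance level of 0.05, while the type II error rate depends on the true effect and the sample size; one minus the type II error rate is the power of the study.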