Advanced Analysis and Hypothesis Tests Flashcards
What is a t-distribution?
Similar to the standard Normal distribution, but it is a family of curves whose shape depends on the degrees of freedom.
What is hypothesis testing?
Using data to “weigh up the evidence” and using that evidence to decide whether to reject a pre-defined statement (the null hypothesis)
What are the five steps of hypothesis testing?
- State the null hypothesis
- Calculate the appropriate test statistic
- Obtain a P value for the test statistic
- Decide whether to reject the null hypothesis based on the P value
- State the conclusion in terms of the original research question
What is the null hypothesis?
- A statement about the value of a population parameter or the difference between groups
- usually the negation of the research hypothesis
- usually “the effect/association of interest is zero”
What is the alternative hypothesis?
- Opposite of the null hypothesis
- Usually related to the research question
How do we calculate the test statistic?
Test statistic = (observed value - hypothesised value) / standard error
What is the relationship between the test statistic and the null hypothesis?
The bigger the test statistic (+/-), the more evidence there is against the null hypothesis. The value of the test statistic is used to decide whether to reject the null hypothesis.
What is the goal of estimation?
We want to estimate the population parameter based on the sample statistic.
• The sample must therefore be representative of the population
How is estimation different from hypothesis testing?
Hypothesis testing is concerned with using the data to ‘weigh up the evidence’ and decide whether or not to reject a pre-specified statement (the null hypothesis), whereas estimation gives us a ‘best estimate’ for the population value along with a range of likely values (a confidence interval).
What is the definition of a population parameter?
A measurable characteristic of the population (e.g. mean = μ, proportion = π, standard deviation = σ). Values obtained from a sample are estimates of the population parameters.
What are sample statistics?
Sample statistics are estimates of results that would have been obtained had the whole population been studied
What are the two different kinds of estimation?
Point estimation and interval estimation
What is a confidence interval?
A range of values within which we have confidence that the true population value lies. It quantifies uncertainty and indicates the precision of our sample statistic.
What is a point estimate?
A single value used as the best estimate of a population parameter, e.g. a sample mean. It is just one value and doesn’t take into account that this value would change from sample to sample.
When does the width of the CI increase?
When:
- the sample size is small
- there is a lot of variability in the data
- the level of confidence increases (e.g. from 95% to 99%)
When do we use the t-distribution?
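When the population standard deviation is unknown and has to be estimated from the sample. This matters most when the sample size is small, say under 30; for larger samples the t-distribution is very close to the Normal distribution.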
What is the formula for the t statistic?
t = (x̄ - μ) / (s/√n)
where:
- x̄ is the sample mean
- μ is the (hypothesised) population mean
- s is the sample standard deviation
- n is the sample size
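As a rough sketch, the t formula can be computed from raw data with the Python standard library. The helper name `one_sample_t` and the sample values are hypothetical; `statistics.stdev` uses the n - 1 denominator, which is the sample standard deviation s in the formula.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for H0: population mean = mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)   # sample mean
    s = statistics.stdev(sample)      # sample SD (n - 1 denominator)
    se = s / math.sqrt(n)             # standard error of the mean
    return (x_bar - mu0) / se, n - 1


# Hypothetical measurements, tested against a hypothesised mean of 5.0.
sample = [4.8, 5.1, 5.6, 5.3, 4.9, 5.5]
t, df = one_sample_t(sample, 5.0)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

The resulting t value would then be compared with the t-distribution on n - 1 degrees of freedom.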
What impacts the width of a CI?
- Precision of the estimate (s.e.)
- Level of confidence (multiplier)
What are poor and high precision and how do they relate to the concept of a CI?
• Poor precision (large SE): wide interval
• High precision (small SE): narrow interval
• As sample size increases, the standard error (SE) decreases, which leads to greater precision and narrower intervals
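The effect of sample size and confidence level on interval width can be checked numerically. The sketch below assumes a confidence interval for a mean of the form estimate ± multiplier × SE, with the multiplier taken from the standard Normal distribution (a large-sample simplification; for small samples the multiplier would come from the t-distribution). The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def ci_width(sd, n, confidence=0.95):
    """Width of a Normal-based confidence interval for a mean."""
    # Two-sided multiplier, e.g. about 1.96 for 95% confidence.
    multiplier = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = sd / math.sqrt(n)  # standard error of the mean
    return 2 * multiplier * se


# Same variability (SD = 10), different sample sizes and confidence levels.
print(round(ci_width(10, 25), 2))         # small sample: wider interval
print(round(ci_width(10, 100), 2))        # larger sample: narrower interval
print(round(ci_width(10, 100, 0.99), 2))  # higher confidence: wider again
```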
The larger the confidence, the…
…wider the interval
The narrower the interval, the…
…lower the confidence
Can you use a CI for a proportion?
Yes. Binomial proportions do not follow the Normal distribution, but:
• If the sample size is greater than 30 and 0.1 < p < 0.9, the Normal approximation is reasonable and we can use our standard formula for the 95% confidence interval: p ± 1.96 × SE(p)
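A minimal sketch of this interval, assuming the usual standard error for a proportion, SE(p) = √(p(1 - p)/n), which the card does not spell out. The function name and the counts are hypothetical.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # assumed standard error of a proportion
    return p - z * se, p + z * se


# Hypothetical data: 40 events among 100 people (n > 30 and 0.1 < p < 0.9).
low, high = proportion_ci(40, 100)
print(f"95% CI: {low:.3f} to {high:.3f}")
```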
What is the chi-squared test?
The chi-squared test of association (for categorical data) compares two attributes in a sample of data to determine whether there is any relationship between them
What would be the null hypothesis in the context of using the chi-squared test?
H0: there is no association between the classifications of the two attributes under investigation
What is the chi-squared test based on?
The difference between the observed and expected frequencies
What happens within the chi-squared test?
o Under the null hypothesis the test statistic follows the Chi-squared distribution
o The value of the test statistic is then compared with the appropriate Chi-squared distribution (first proposed by Pearson)
o The greater the differences between the observed and expected frequencies, the larger the Chi-squared statistic and the more evidence that the two variables are associated
How do you calculate the expected frequencies in a 2x2 table for a chi-squared test?
Expected freq. = (relevant row total × relevant column total)/ total sample size
How do you calculate the chi-squared statistic for a 2x2 table?
The chi-squared value is obtained by calculating:
(observed - expected)² / expected
for each of the four cells in the contingency table and then summing them.
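The two calculations (expected frequencies, then the summed statistic) can be sketched as follows. The table values and function names are hypothetical, and no continuity correction is applied here.

```python
def expected_frequencies(table):
    """Expected count per cell = row total * column total / overall total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_squared(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 table: rows = exposed / unexposed, columns = outcome yes / no.
table = [[30, 20], [10, 40]]
expected = expected_frequencies(table)
statistic = chi_squared(table)
print(expected)
print(round(statistic, 3))
```

The same functions work unchanged for larger r x c tables, because the expected-frequency rule is identical.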
How would you then either reject the null hypothesis or fail to reject the null hypothesis using the chi-squared test?
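Compare the Chi-squared test statistic with the tabulated critical values of the Chi-squared distribution for the appropriate degrees of freedom (1 for a 2x2 table) and the chosen significance level. The larger the test statistic, the smaller the P value and the more evidence against the null hypothesis. If the test statistic exceeds the critical value (e.g. 3.84 at the 5% level with 1 degree of freedom), reject the null hypothesis; otherwise, fail to reject it.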
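For a 2x2 table (1 degree of freedom) the upper-tail P value can be computed without a lookup table, because a chi-squared variable with 1 degree of freedom is the square of a standard Normal variable. The sketch below relies on that identity and only applies to 1 degree of freedom; the function names and the 5% threshold are illustrative choices.

```python
import math


def chi2_p_value_df1(statistic):
    """Upper-tail P value for a chi-squared statistic with 1 degree of freedom."""
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for Z standard Normal.
    return math.erfc(math.sqrt(statistic / 2))


def decide(statistic, alpha=0.05):
    """Reject H0 when the P value falls below the significance level."""
    p = chi2_p_value_df1(statistic)
    return "reject H0" if p < alpha else "fail to reject H0"


print(round(chi2_p_value_df1(3.841), 3))  # close to the 5% critical value
print(decide(16.67))
print(decide(1.0))
```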
What is Yates’ correction for a 2x2 table?
- When the number of events/sample is low, a continuity correction is usually made by subtracting 0.5 from the absolute difference between the observed and expected frequency in each cell before squaring. This correction is referred to as Yates’ continuity correction
- It is intended for use with ‘small’ samples i.e. total sample size <40 or expected numbers are small (cell frequency <5)
o The correction reduces the value of Chi-square and prevents overestimation of statistical significance for small data sets
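A small sketch comparing the uncorrected and Yates-corrected statistics on a hypothetical small table. It assumes the usual form of the correction, (|observed - expected| - 0.5)² / expected; clamping the adjusted difference at zero is a convention assumed here, not something the card states.

```python
def chi_squared_2x2(table, yates=False):
    """Chi-squared statistic for a 2x2 table, optionally Yates-corrected."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Assumed convention: never let the corrected difference go below 0.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    return statistic


# Hypothetical small sample (total of 20 observations).
small = [[8, 2], [3, 7]]
uncorrected = chi_squared_2x2(small)
corrected = chi_squared_2x2(small, yates=True)
print(round(uncorrected, 3), round(corrected, 3))
```

The corrected value is smaller, which is how the correction guards against overstating significance in small data sets.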
What is Fisher’s exact test, and when is it used?
o Fisher’s exact test is used to compare two proportions when the numbers in the 2 x 2 table are very small (i.e. an expected frequency of less than 5)
o For the Chi-squared test to be valid, most cells should have an expected frequency of more than 5 and the total sample size should be approximately 40 or more
Can the chi-squared statistic be used for larger contingency tables?
Yes!
- Larger tables are called r x c tables, where r denotes the number of rows in the table and c the number of columns.
- The calculation for the expected frequencies then becomes: Expected number = (column total x row total) / overall total
What is the chi-squared test for linear trend, and when is it appropriate?
o Appropriate for ordered categorical (ordinal) exposure variables (e.g. lifetime partners, age group, cholesterol levels).
o Not appropriate for variables in which there is no natural order, e.g. marital status, ethnic group, country of residence.
o The χ² test for trend is a more sensitive test that assesses whether there is an increasing (or decreasing) trend in the proportions over the exposure categories.
What does the chi-squared test assume about its observations?
That they are independent
What test do you use for categorical variables/observations which are NOT independent?
McNemar’s test. This is appropriate for paired data, such as matched pairs in a case-control study, before-and-after measurements, or comparisons between 2 observers (e.g. 2 radiographers using x-rays to diagnose TB)
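The card does not give a formula, so the following is a sketch of the standard uncorrected McNemar statistic, which uses only the discordant pairs: b pairs where only the first rating is positive and c pairs where only the second is positive. The labels b and c, the function name, and the counts are assumptions for illustration.

```python
def mcnemar_statistic(b, c):
    """Uncorrected McNemar statistic: (b - c)^2 / (b + c), on 1 degree of freedom."""
    if b + c == 0:
        raise ValueError("no discordant pairs, the statistic is undefined")
    return (b - c) ** 2 / (b + c)


# Hypothetical paired readings: 15 films positive for reader 1 only,
# 5 films positive for reader 2 only.
statistic = mcnemar_statistic(15, 5)
print(statistic)
```

Pairs on which both ratings agree do not enter the calculation at all.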
What are some examples of continuous data?
weight, age, blood pressure, antibody levels
What do you need to check for continuous data?
The shape of the frequency distribution - this indicates what summary measures should be used on the data
What are some examples of how continuous data is displayed?
Histogram, scatter-plot, line plot, box plot
What are some recommendations for how continuous data can be summarized?
- For normally distributed data: Mean and SD
- For non-normal data: Median and interquartile range (25th to 75th percentile)
When is it appropriate to use Student’s T-Test?
For the comparison of means
When is it appropriate to use a one-tailed t-test?
Only when the consequences of missing an effect in the untested direction are negligible.
o Inappropriate use: imagine you have developed a new drug that you believe is an improvement over an existing drug, so you opt for a one-tailed test. You therefore fail to test for the possibility that the new drug is less effective than the existing drug. The consequences in this example are extreme, but they illustrate the danger of inappropriate use of a one-tailed test.
o Appropriate use: imagine you have a new drug which is cheaper than the existing drug and, you believe, no less effective. You do not care if it is more effective. You only wish to show that it is not less effective. In this scenario, a one-tailed test would be appropriate (the consequences of not testing for an effect in the other direction are negligible and the choice is ethical)
What are paired t-tests based on?
o A paired t-test is based on differences within each subject
o Each subject acts as their own control
o Measurements on the same subject are not independent
o Measurements on different subjects are independent
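Because the test is based on within-subject differences, a paired t statistic is simply a one-sample t statistic on those differences, tested against zero. A minimal sketch with hypothetical before/after readings and a hypothetical helper name:

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, degrees of freedom) for the mean within-subject difference."""
    if len(before) != len(after):
        raise ValueError("paired data need one 'after' value per 'before' value")
    diffs = [a - b for b, a in zip(before, after)]  # each subject is their own control
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1


# Hypothetical blood pressure readings on the same five subjects.
before = [140, 152, 138, 147, 160]
after = [135, 150, 136, 140, 155]
t, df = paired_t(before, after)
print(f"t = {t:.3f} on {df} degrees of freedom")
```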
What are the underlying assumptions of t-tests?
o The sample means of the populations being compared should follow normal distributions. Fortunately, by the central limit theorem, this will be approximately true if you have enough data.
o The data used should either be sampled independently or fully paired (for a paired test).
o In the original formulation of Student’s t-test, the variances of the populations being compared should be equal. However, modern statistical software allows for unequal variances (in R, the default option for t.test is “var.equal=FALSE”, which allows for unequal variances).
What if you are comparing more than two means? Which test would you use?
ANOVA (analysis of variance)
What is one-way ANOVA used for?
o One-way ANOVA is used to compare the mean of a numerical outcome variable in the groups defined by an exposure level with two or more categories.
o It is called one-way as the exposure groups are classified by just one variable.
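The card does not show the calculation, so this is a sketch of the standard one-way ANOVA F statistic: the between-group mean square divided by the within-group mean square. The function name and the data are hypothetical.

```python
import statistics


def one_way_anova_f(groups):
    """F = (SS_between / (k - 1)) / (SS_within / (N - k))."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical outcome values in three exposure groups.
groups = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
f_stat = one_way_anova_f(groups)
print(round(f_stat, 3))
```

A large F value suggests that the group means differ by more than the within-group variability would explain.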
What is the definition of precision in the context of diagnostics?
How close diagnostic test results are to each other
What is the definition of sensitivity in the context of diagnostics?
The proportion of people with the disease or condition that test positive
What is the definition of specificity in the context of diagnostics?
The proportion of people without the disease or condition that test negative
What is the formula to calculate sensitivity?
A/(A+C), where A = true positives and C = false negatives
What is the formula to calculate specificity?
D/(B+D), where D = true negatives and B = false positives
What is the positive predictive value, and how is it calculated?
Proportion of people testing positive who have the condition. It is calculated as A/(A+B)
What is the negative predictive value and how is it calculated?
Proportion of people testing negative who do not have the disease. It is calculated as D/(C+D), where D = true negatives and C = false negatives
What is the crucial difference to remember between sens/spec and predictive values?
Sensitivity and specificity are characteristics of the test itself, whereas NPV and PPV also depend on the prevalence of the condition or disease in the population
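The four measures and the effect of prevalence can be sketched as below. The code names the cells directly (true/false positives and negatives) rather than using letters; in the lettering implied by sensitivity = A/(A+C) and PPV = A/(A+B), A = true positives, B = false positives, C = false negatives and D = true negatives. All counts and function names are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased who test positive
        "specificity": tn / (tn + fp),  # non-diseased who test negative
        "ppv": tp / (tp + fp),          # positives who have the disease
        "npv": tn / (tn + fn),          # negatives who do not have the disease
    }


def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """PPV for the same test applied at a different prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical evaluation: 100 people with the disease, 200 without.
m = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
print(m)

# Same sensitivity and specificity, but PPV drops sharply when the disease is rare.
print(round(ppv_at_prevalence(0.9, 0.85, 1 / 3), 3))
print(round(ppv_at_prevalence(0.9, 0.85, 0.01), 3))
```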
What are the four main barriers to the development and use of diagnostics in LMICs?
1) Lack of investment and innovation
2) Limited access to diagnostic tests
3) Lack of regulatory control and quality standards for evaluation
4) Infrastructure and human resource capacity
What is a reference standard?
The best test we have available to estimate an individual’s disease status
What is the index test?
A new or improved test which is tested against the reference standard
What is economic evaluation in the context of test diagnostics?
“… the comparative analysis of alternative courses of action in terms of both their costs and consequences.”
What does correlation do?
Measures the strength of linear association between two continuous variables (exposure and outcome)
What are the four components of the Pearson correlation coefficient?
- True value in the population (ρ)
- Estimated in sample by r
- Can take values between -1 and 1
- It is only valid within the range of values in the sample
What is the r score if there is no correlation?
r=0
What is the r score of an imperfect positive correlation?
0 < r < 1
What is the r score of a perfect positive correlation?
r=1
What is the r score of an imperfect negative correlation?
-1 < r < 0
What is the r score of a perfect negative correlation?
r= -1
What does r =-1 indicate?
A perfect negative linear relationship; as the value of one variable increases, the value of another decreases
What does r=1 indicate?
A perfect positive linear relationship. As the value of one variable increases, the value of the other increases
What does r=0 indicate?
There is no linear relationship between the 2 continuous variables
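The cards do not give the formula for r, so this sketch uses the standard definition: the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The function name and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))        # perfect positive line
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))        # perfect negative line
print(pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))  # imperfect positive
```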
What are arbitrary labels for strength of positive correlation?
0 - 0.19 very weak
0.2 - 0.39 weak
0.4 - 0.59 moderate
0.6 - 0.79 strong
0.8 - 1.0 very strong