Module 3 Flashcards
How do you summarise categorical data?
- frequencies, proportions or percentages
What test is used to compare the distribution of a categorical variable to a hypothesised distribution?
chi-squared goodness-of-fit test / one-sample chi-squared test
What is an example of a one-sample chi squared test hypothesis?
H0: the variable is evenly distributed across the categories
H1: the variable is not evenly distributed across the categories
Why is the one sample chi-squared test (chi squared goodness-of-fit test) used?
- to quantify the discrepancy between the expected and observed frequencies
e.g. between the sample and the hypothesised value
What is the shape of the chi-squared distribution?
- non-symmetric (right-skewed; values are always positive)
- changes with the df (degrees of freedom)
What are degrees of freedom?
= number of groups - 1
- indicates how many of the data points are ‘flexible’
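As a minimal sketch of the goodness-of-fit calculation, the snippet below uses the standard Pearson statistic (sum of (observed - expected)^2 / expected) with df = number of groups - 1. The die-roll counts are made up for illustration; in practice a library routine such as `scipy.stats.chisquare` would also give the p-value.

```python
# Hypothetical observed counts for a six-sided die rolled 60 times.
observed = [8, 12, 9, 11, 6, 14]

# Under H0 (evenly distributed) every category has the same expected count.
n_groups = len(observed)
expected = [sum(observed) / n_groups] * n_groups

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom = number of groups - 1.
df = n_groups - 1

print(f"chi-squared = {chi_squared:.2f}, df = {df}")
```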
What test is used to look at association between 2 categorical variables?
- chi-squared test of independence
How is the test statistic calculated for chi-squared test of independence?
- same as for the one-sample test, except the expected count has to be calculated for each cell in the table
e.g. expected count = (column total x row total) / overall total
How do you calculate the df for a contingency table/cross-table?
df = (number of rows - 1) x (number of columns - 1)
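The expected counts and degrees of freedom for a cross-table can be sketched as follows. The 2x3 table is invented for illustration, and the statistic is the usual sum of (observed - expected)^2 / expected over all cells; `scipy.stats.chi2_contingency` does the same job and also returns a p-value.

```python
# Hypothetical 2x3 table of observed counts (rows = groups, columns = outcome categories).
table = [
    [20, 30, 50],
    [30, 30, 40],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
overall_total = sum(row_totals)

# Expected count for each cell = (row total x column total) / overall total.
expected = [[r * c / overall_total for c in col_totals] for r in row_totals]

# Sum the contribution of every cell in the table.
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)

# df = (number of rows - 1) x (number of columns - 1).
df = (len(table) - 1) * (len(table[0]) - 1)

print(f"chi-squared = {chi_squared:.3f}, df = {df}")
```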
What does a small x^2 value mean?
- the observed value is approximately equal to the expected value in each cell
- the two only differ due to sample variability
What causes a large x^2 value?
- sample variability (given by p-value)
- Null hypothesis is not true
What are the chi-squared test of independence assumptions?
- the observational units are independent
- the expected cell counts should be >5
What are the limitations of x^2 test of independence?
- not informative about how variables are related
- can only really be used for bivariate analysis
What are other options for assessing associations in categorical variables?
- relative risk
- odds ratio
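A small sketch of both measures from a 2x2 table, assuming the usual layout (rows = exposed / unexposed, columns = disease / no disease). The counts are hypothetical.

```python
# Hypothetical 2x2 table.
a, b = 30, 70   # exposed: with disease, without disease
c, d = 10, 90   # unexposed: with disease, without disease

# Relative risk: risk in the exposed group divided by risk in the unexposed group.
risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
relative_risk = risk_exposed / risk_unexposed

# Odds ratio: odds of disease in the exposed group divided by odds in the unexposed group.
odds_exposed = a / b
odds_unexposed = c / d
odds_ratio = odds_exposed / odds_unexposed

print(f"relative risk = {relative_risk:.2f}, odds ratio = {odds_ratio:.2f}")
```

Unlike the chi-squared test, both values say how strongly and in which direction the variables are related (a value of 1 means no association).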
Can chi-squared test of independence be used for before and after?
- no because the measurement is on the same individual
What is a McNemar’s test?
- used for 2x2 tables to test repeated measurements of the same variable
SIMPLE CONCEPT:
- if no change, participants stay on the diagonal
- if change, participants move off the diagonal
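The diagonal idea can be sketched with a hypothetical before/after table. The statistic shown is the common form of McNemar's test without continuity correction, (b - c)^2 / (b + c), which uses only the off-diagonal cells and is compared with a chi-squared distribution with 1 df.

```python
# Hypothetical paired 2x2 table: rows = before, columns = after.
#            after: yes, after: no
table = [
    [40, 5],    # before: yes
    [15, 40],   # before: no
]

# Diagonal cells: participants whose answer did not change.
unchanged = table[0][0] + table[1][1]

# Off-diagonal cells: participants who changed in either direction.
yes_to_no = table[0][1]
no_to_yes = table[1][0]

# McNemar statistic (no continuity correction) uses only the changers.
mcnemar_stat = (yes_to_no - no_to_yes) ** 2 / (yes_to_no + no_to_yes)

print(f"unchanged = {unchanged}, McNemar statistic = {mcnemar_stat:.2f}")
```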
What test is used for continuous data?
one sample t-test
What is a t-test?
- parametric test used for testing differences in means
- the one-sample version tests the hypothesis that the mean of a sample is equal to a fixed value
What does the one sample t-test assume?
- data are normally distributed
What is the test statistic equation for a one-sample t-test?
t = (sample mean - expected value) / (sample SD / square root of the sample size)
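The equation above can be sketched with the standard library. The sample values and the expected value of 5.0 are made up; `scipy.stats.ttest_1samp` would give the same statistic plus a p-value.

```python
import math
import statistics

# Hypothetical sample and hypothesised (expected) value.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
expected_value = 5.0

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)        # sample SD (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)   # sample SD / square root of n

t_stat = (sample_mean - expected_value) / standard_error
df = n - 1  # degrees of freedom for a one-sample t-test

print(f"t = {t_stat:.3f}, df = {df}")
```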
What influences the fatness of the t-distribution's tails?
- degrees of freedom
What makes a t-distribution more normally distributed?
- bigger sample size/more df
What is the distribution used for the t-test like if n > 30?
- sampling distribution of means is approximately normally distributed
When is a two sample/independent sample t-test used?
- to compare two groups
- the dependent variable is continuous
- the independent variable is categorical
What are the assumptions for a two-sample t-test?
- data are normally distributed, or n > 30
- results come from two independent samples
- variances in the two groups are the same
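Because the equal-variance assumption holds, the two sample variances can be pooled. The sketch below uses the standard pooled-variance form of the two-sample t statistic on invented data; `scipy.stats.ttest_ind` implements the same test.

```python
import math
import statistics

# Hypothetical measurements from two independent groups.
group_a = [12.1, 11.4, 13.0, 12.6, 11.9]
group_b = [10.2, 11.1, 10.8, 9.9, 10.5]

n_a, n_b = len(group_a), len(group_b)
mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_stat = (mean_a - mean_b) / standard_error
df = n_a + n_b - 2

print(f"t = {t_stat:.3f}, df = {df}")
```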
What test is used if it is unknown whether the sample is normal?
non-parametric test (Mann-Whitney test)
What does a Mann-Whitney test / Wilcoxon rank-sum test compare?
- medians of two samples
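The test works on the ordering of the values rather than the values themselves. As a rough sketch, the U statistic can be computed by counting, over every pair, how often a value from one group exceeds a value from the other. The data and the helper name `mann_whitney_u` are hypothetical; `scipy.stats.mannwhitneyu` provides the full test.

```python
# Hypothetical measurements from two independent groups.
group_a = [3, 5, 8, 9]
group_b = [1, 2, 4, 6]

def mann_whitney_u(x, y):
    """Count the pairs where a value in x exceeds a value in y (ties count 0.5)."""
    u = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u += 1.0
            elif xi == yi:
                u += 0.5
    return u

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)

# The two U values always add up to (size of group a) x (size of group b).
print(f"U_a = {u_a}, U_b = {u_b}")
```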
When is a paired t-test used?
- before and after
- left and right arm
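A paired t-test is equivalent to a one-sample t-test on the within-person differences, tested against zero. A minimal sketch with invented before/after readings (`scipy.stats.ttest_rel` is the library equivalent):

```python
import math
import statistics

# Hypothetical before/after measurements on the same five individuals.
before = [140, 152, 138, 147, 160]
after = [135, 150, 136, 140, 151]

# Work with the difference for each individual.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)

# One-sample t statistic on the differences, with expected value 0.
t_stat = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1

print(f"mean difference = {mean_diff}, t = {t_stat:.3f}, df = {df}")
```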
What is an ANOVA?
- one way analysis of variance
- used when >2 groups
What are the ANOVA hypotheses?
H0: all group means are the same
H1: at least one mean differs
What are the two types of variation within data for ANOVA?
- between groups
- within group
How can you tell if the variation is between groups?
- distributions are at different levels of the x-axis
How can you tell if the variation is within groups?
- the distributions overlap but are very wide
How do you calculate total variation?
SStotal (total sum of squares) = sum of (each observation - overall mean)^2
What is the conversion of SS to MS (mean square) for?
- account for different df in each calculation
What is the total variance equation?
MStotal = SStotal/(N-1)
What is the between group variance equation?
MSgroups = SSgroups/(k-1)
What is the within group variance equation?
MSerror = SSerror/(N-k)
What does the ratio of MSgroups/MSerror show?
- how much bigger the group effect is compared to random noise
What is the variance ratio?
- ratio of two variances
- denoted F (f is the test statistic for ANOVA)
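The sums of squares, mean squares and F ratio can be sketched on a small invented data set with k = 3 groups and N = 9 observations. SStotal is taken as the squared distance of each observation from the overall mean, and it splits into SSgroups + SSerror. `scipy.stats.f_oneway` computes the same F along with a p-value.

```python
# Hypothetical data: three groups of three observations each.
groups = {
    "A": [4, 5, 6],
    "B": [6, 7, 8],
    "C": [8, 9, 10],
}

all_values = [v for values in groups.values() for v in values]
N = len(all_values)   # total number of observations
k = len(groups)       # number of groups
overall_mean = sum(all_values) / N

# Total variation: each observation against the overall mean.
ss_total = sum((v - overall_mean) ** 2 for v in all_values)

# Between-group variation: each group mean against the overall mean.
ss_groups = sum(
    len(values) * (sum(values) / len(values) - overall_mean) ** 2
    for values in groups.values()
)

# Within-group variation: each observation against its own group mean.
ss_error = sum(
    (v - sum(values) / len(values)) ** 2
    for values in groups.values()
    for v in values
)

# Convert SS to MS by dividing by the matching degrees of freedom.
ms_groups = ss_groups / (k - 1)
ms_error = ss_error / (N - k)

# Variance ratio F = MSgroups / MSerror.
f_stat = ms_groups / ms_error

print(f"SStotal = {ss_total}, SSgroups = {ss_groups}, SSerror = {ss_error}, F = {f_stat}")
```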
When is a post-hoc test used?
- if the H0 is rejected in an ANOVA to determine which means are different
What is the most common post hoc test?
- tukey
What are the steps for interpreting ANOVA results?
- check ANOVA assumptions
- conduct ANOVA
- if p-value > 0.05 then do not reject H0
- if p-value < 0.05 then reject H0 and do post-hoc testing
What test is used if the normality assumption (ANOVA) is not met?
- a non-parametric test such as the Kruskal-Wallis test
What test is used for equal variances assumption?
- Levene’s test, which tests the H0 that the variances of the groups are the same
- if the test is significant, the variances are not equal
What test statistic equation is used for the one-sample chi-squared test?
x^2 = sum of (observed - expected)^2 / expected, calculated over every category
What type of test is a t-test?
- parametric test
What is an assumption of a parametric test?
- assumes the data follows a known distribution
What does a one-sample t-test test?
- the hypothesis that the mean of a sample is equal to a fixed value
What test is used to decide if variances are equal?
Levene’s test
What is an advantage of paired test?
- takes out the variation between patients, leaving only the effect of the drug
What are the assumptions for ANOVA?
- normally distributed or >30
- equal variances
- independence among observations
What overall type 1 error rate is maintained across all post-hoc tests?
- 0.05