Module 3 Flashcards by Elo Casey

How do you summarise categorical data when using one variable?

frequencies, proportions or percentages

What test is used to compare the distribution of a categorical variable to a hypothesised distribution?

chi-squared goodness-of-fit test (also called the one-sample chi-squared test)

What is an example of a one-sample chi squared test hypothesis?

H0: the variable is evenly distributed across its categories
H1: the variable is not evenly distributed across its categories

Why is the one sample chi-squared test (chi squared goodness-of-fit test) used?

to quantify the discrepancy between the expected and observed frequencies
e.g. between the sample distribution and the hypothesised distribution
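A minimal sketch of how that discrepancy can be quantified, using the standard chi-squared statistic (the sum of (observed - expected)^2 / expected over all categories). The function name and the counts are hypothetical and chosen only for illustration; when no hypothesised proportions are given, the sketch assumes H0 is an even distribution.

```python
def chi_squared_gof(observed, expected_props=None):
    """Return (chi2, df) for a one-sample chi-squared goodness-of-fit test."""
    n = sum(observed)
    k = len(observed)
    if expected_props is None:
        # H0: evenly distributed, so every category expects n / k
        expected = [n / k] * k
    else:
        # H0: categories follow the hypothesised proportions
        expected = [n * p for p in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, k - 1  # df = number of groups - 1


# Hypothetical counts, for illustration only
observed = [30, 20, 10]
chi2, df = chi_squared_gof(observed)
print(f"chi-squared = {chi2:.2f}, df = {df}")
```

The larger the statistic, the further the observed counts are from what H0 predicts.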

What is the shape of the chi-squared distribution?

- non-symmetric (right-skewed) and never negative
- the shape changes with the df (degrees of freedom)

What are degrees of freedom?

- df = number of groups - 1 (for the goodness-of-fit test)
- indicates how many of the data points are 'flexible' (free to vary)

What test is used to look at association between 2 categorical variables?

chi-squared test of independence

How is the test statistic calculated for the chi-squared test of independence?

same as for the goodness-of-fit test, except the expected count must be calculated for each cell in the table
e.g. expected count = (column total x row total) / overall total
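As a rough sketch of that calculation: the expected count for each cell is (row total x column total) / overall total, and the statistic sums (observed - expected)^2 / expected over every cell. The function names and the 2x2 table below are hypothetical, for illustration only.

```python
def expected_counts(table):
    """Expected cell counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_squared_independence(table):
    """Sum (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 cross-tabulation, for illustration only
table = [[20, 30],
         [30, 20]]
expected = expected_counts(table)
chi2 = chi_squared_independence(table)
print(expected)
print(chi2)
```

Printing the expected counts first also makes it easy to check that none of them is too small before trusting the statistic.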

How do you calculate the df for a contingency table/cross-table?

df = (number of rows - 1) x (number of columns - 1)
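The formula is small enough to check by hand, but a short sketch makes the pattern explicit. The function name is hypothetical.

```python
def contingency_df(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table."""
    return (n_rows - 1) * (n_cols - 1)


print(contingency_df(2, 2))  # 2x2 table
print(contingency_df(3, 4))  # 3x4 table
```

A 2x2 table therefore has 1 degree of freedom, and a 3x4 table has 6.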

What does a small χ² (chi-squared) value mean?

the observed value is approximately equal to the expected value in each cell
the two differ only because of sampling variability

What causes a large χ² (chi-squared) value?

- sampling variability (the p-value gives the probability of a value this large if H0 is true)
- the null hypothesis is not true

What are the chi-squared test of independence assumptions?

- the observational units are independent
- the expected cell counts should be ≥5

What are the limitations of the χ² test of independence?

- not informative about how the variables are related
- can only really be used for bivariate analysis

What are other options for assessing associations in categorical variables?

- relative risk
- odds ratio
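Both measures can be read off a 2x2 table. The sketch below assumes the usual layout (a and b are the exposed group with and without the outcome, c and d the unexposed group with and without the outcome); the function names and counts are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Risk of the outcome in the exposed group divided by risk in the unexposed group."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


def odds_ratio(a, b, c, d):
    """Odds of the outcome in the exposed group divided by odds in the unexposed group."""
    return (a * d) / (b * c)


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome
rr = relative_risk(20, 80, 10, 90)
odds = odds_ratio(20, 80, 10, 90)
print(f"relative risk = {rr:.2f}, odds ratio = {odds:.2f}")
```

Unlike the chi-squared statistic, both values describe the direction and size of the association.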

Can chi-squared test of independence be used for before and after?

no, because both measurements are taken on the same individuals, so the observations are not independent

What is McNemar's test?

used for 2x2 tables to test repeated measurements of the same variable

SIMPLE CONCEPT:

if no change, participants stay on the diagonal
if change, participants move off the diagonal
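A minimal sketch of that idea, using the standard McNemar statistic without continuity correction: only the two off-diagonal cells (the participants who changed) enter the calculation. The function names and the counts are hypothetical, and the p-value helper relies on the chi-squared distribution with 1 degree of freedom.

```python
import math


def mcnemar_statistic(table):
    """table = [[a, b], [c, d]] with rows = before and columns = after."""
    b = table[0][1]  # changed in one direction
    c = table[1][0]  # changed in the other direction
    if b + c == 0:
        raise ValueError("no participants changed, so the statistic is undefined")
    return (b - c) ** 2 / (b + c)


def chi2_p_value_df1(chi2):
    """Upper-tail probability of a chi-squared distribution with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical before/after counts, for illustration only
table = [[40, 5],
         [15, 40]]
chi2 = mcnemar_statistic(table)
p = chi2_p_value_df1(chi2)
print(f"chi-squared = {chi2:.2f}, p = {p:.3f}")
```

The diagonal cells (40 and 40 here) do not affect the result, which matches the concept above.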

What test is used for continuous data?

one-sample t-test (to compare the sample mean with a fixed value)

What is a t-test?

- parametric test used for testing differences in means
- the one-sample version tests the hypothesis that the mean of a sample is equal to a fixed value

What does the one sample t-test assume?

the data are normally distributed

What is the test statistic equation for a one-sample t-test?

t = (sample mean - expected value) / (sample SD / square root of the sample size), with df = sample size - 1
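The equation can be sketched directly with the standard library. The function name and the data are hypothetical; `statistics.stdev` is the sample standard deviation (n - 1 in the denominator).

```python
import math
import statistics


def one_sample_t(sample, expected_value):
    """Return (t, df) for a one-sample t-test against a fixed value."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)        # sample SD
    standard_error = sample_sd / math.sqrt(n)   # precision of the mean
    t = (sample_mean - expected_value) / standard_error
    return t, n - 1


# Hypothetical measurements, tested against a fixed value of 4
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t, df = one_sample_t(sample, 4)
print(f"t = {t:.3f}, df = {df}")
```

The t value is then compared with a t-distribution on n - 1 degrees of freedom.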

What influences the fatness of the t-distribution's tails?

degrees of freedom

What makes a t-distribution closer to a normal distribution?

bigger sample size/more df

What is the sampling distribution for a t-test if n > 30?

the sampling distribution of the mean is approximately normally distributed

When is a two sample/independent sample t-test used?

to compare the means of two groups

the dependent variable is continuous
the independent variable is categorical (two groups)

What are the assumptions for a two-sample t-test?

- the data are normally distributed, or n > 30
- the results come from two independent samples
- the variances in the two groups are the same
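When those assumptions hold, the statistic is usually built on a pooled variance. The sketch below assumes the equal-variance (pooled) form of the test; the function name and the data are hypothetical.

```python
import math
import statistics


def two_sample_t(x, y):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n1, n2 = len(x), len(y)
    var1, var2 = statistics.variance(x), statistics.variance(y)
    # Pooling is only sensible under the equal-variances assumption
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(x) - statistics.mean(y)) / standard_error
    return t, n1 + n2 - 2


# Hypothetical measurements for two independent groups
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
t, df = two_sample_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df}")
```

If the variances clearly differ, the pooled form is no longer appropriate and an unequal-variance version of the test would be needed instead.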

What test is used if it is unknown whether the data are normally distributed?

non-parametric test (Mann-Whitney test)

What does a Mann-Whitney test (Wilcoxon rank-sum test) compare?

- the medians of two samples
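A rough sketch of the U statistic behind the test, assuming the pairwise-comparison definition: count how often a value in one sample exceeds a value in the other, with ties counted as one half. The function name and the data are hypothetical.

```python
def mann_whitney_u(x, y):
    """Return (U, U_x, U_y), where U is the smaller of the two U values."""
    # U_x counts the (x, y) pairs in which the x value is larger; ties count 0.5
    u_x = sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )
    u_y = len(x) * len(y) - u_x  # the two U values always sum to n1 * n2
    return min(u_x, u_y), u_x, u_y


# Hypothetical measurements for two independent groups
group_a = [3, 5, 8]
group_b = [4, 6, 9, 10]
u, u_a, u_b = mann_whitney_u(group_a, group_b)
print(u, u_a, u_b)
```

Because only the ordering of the values matters, the test does not rely on the data being normally distributed.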

When is a paired t-test used?

when both measurements come from the same individual, e.g.
- before and after
- left and right arm
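A paired t-test is equivalent to a one-sample t-test on the within-person differences, tested against 0. The sketch below takes that route; the function name and the before/after values are hypothetical.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df) for a paired t-test on before - after differences."""
    if len(before) != len(after):
        raise ValueError("paired data need the same number of measurements")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    t = statistics.mean(diffs) / standard_error  # H0: mean difference is 0
    return t, n - 1


# Hypothetical before/after measurements on four people
before = [10, 12, 9, 11]
after = [8, 11, 9, 8]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

Working on the differences is what removes the person-to-person variation from the comparison.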

What is an ANOVA?

- one-way analysis of variance
- used to compare means when there are more than 2 groups

What are the ANOVA hypotheses?

H0: the means are the same
H1: at least one mean differs

What are the two types of variation within data for ANOVA?

- between groups
- within groups

How can you tell if the variation is between groups?

- the group distributions sit at different positions along the x-axis

How do you calculate total variation?

SStotal (total sum of squares) = sum of (each observation - overall mean)^2

What is the conversion of SS to MS (mean square) for?

- to account for the different df in each calculation

What is the total variance equation?

MStotal = SStotal/(N-1)

What is the between group variance equation?

MSgroups = SSgroups/(k-1)

What is the within group variance equation?

MSerror = SSerror/(N-k)

What does the ratio of MSgroups/MSerror show?

- how much bigger the group effect is compared with random noise

What is the variance ratio?

- the ratio of two variances (MSgroups / MSerror)
- denoted F (F is the test statistic for ANOVA)
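The sums of squares, mean squares and F ratio from the cards above can be sketched in a few lines. Here N is the total number of observations and k the number of groups; the function name and the data are hypothetical.

```python
def one_way_anova(groups):
    """Return the sums of squares, mean squares and F for a one-way ANOVA."""
    values = [v for g in groups for v in g]
    n_total, k = len(values), len(groups)
    overall_mean = sum(values) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Total variation: every observation against the overall mean
    ss_total = sum((v - overall_mean) ** 2 for v in values)
    # Between-group variation: group means against the overall mean
    ss_groups = sum(
        len(g) * (m - overall_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group variation: observations against their own group mean
    ss_error = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

    ms_groups = ss_groups / (k - 1)
    ms_error = ss_error / (n_total - k)
    return {
        "ss_total": ss_total,
        "ss_groups": ss_groups,
        "ss_error": ss_error,
        "ms_groups": ms_groups,
        "ms_error": ms_error,
        "f": ms_groups / ms_error,
    }


# Hypothetical data for three groups
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

Note that SStotal equals SSgroups + SSerror, which is a useful check on the arithmetic.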

When is a post-hoc test used?

- if the H0 is rejected in an ANOVA to determine which means are different

What is the most common post hoc test?

- Tukey's test

What are the steps for interpreting ANOVA results?

- check the ANOVA assumptions
- conduct the ANOVA
- if p-value > 0.05, do not reject H0
- if p-value < 0.05, reject H0 and do post-hoc testing

What test is used if the normality assumption is violated (ANOVA)?

- a non-parametric test such as the Kruskal-Wallis test

What test is used for equal variances assumption?

- Levene's test, which tests the H0 that the variances of the groups are the same
- if the test is significant, the variances are not equal

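What test statistic equation is used for the one-sample chi-squared test?

general form: (observed - expected) / precision
for chi-squared: χ² = sum over all categories of (observed - expected)^2 / expected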

What type of test is a t-test?

- parametric test

What is an assumption of a parametric test?

- assumes the data follows a known distribution

What does a one-sample t-test test?

- the hypothesis that the mean of a sample is equal to a fixed value

What test is used to decide if variances are equal?

Levene's test

What is an advantage of paired test?

- removes the variation between patients, leaving only the effect of the drug

What are the assumptions for ANOVA?

- the data are normally distributed, or n > 30
- equal variances
- independence among observations

What overall type I error rate is maintained across all post-hoc tests?

- 0.05

Which of the following are assumptions of the Chi-square test of independence?
- There are no assumptions for this test
- Expected cell counts ≥5
- Data are normally distributed
- Observations are independent
- Cell counts ≥5

- Expected cell counts ≥5
- Observations are independent

If I conduct a Chi-square Test of Independence on a 3x4 contingency table and discover I have several cells with expected counts < 5, what should I do?
- Remove the troublesome categories and repeat the analysis.
- Continue with the analysis and report the results.
- Try using another statistical test.
- Review the expected counts for each cell to identify the problem categories and then try to combine them with another category if it is sensible to do so.

Review the expected counts for each cell to identify the problem categories and then try to combine them with another category if it is sensible to do so.

True or false? I would use a Chi-square Test of Independence if I want to look for an association between two ordinal variables in my data set.

true

True or false? I would use a Chi-square Test of Independence if I want to look for an association between two continuous variables in my data set.

false

How many degrees of freedom are there in a Chi-square analysis if I am comparing two variables, one with three categories and the other with four categories?

6

The degrees of freedom (df) are calculated using the formula (number of categories in variable 1 - 1) x (number of categories in variable 2 - 1). When expressed in a contingency table we can simplify this to (number of rows - 1) x (number of columns - 1). In this question there were 3 categories in one variable and 4 in the other, so df = (3-1) x (4-1) = 6.

Which is the most appropriate summary of results for the following R Commander output?

    Paired t-test

    data: Baseline Triglyceride and Final Triglyceride levels
    t = 1.200, df = 15, p-value = 0.249
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -10.915 39.040
    sample estimates:
    mean of the differences(Baseline - Final)
                                       14.062

- There was a significant difference in the mean triglyceride level and the mean final triglyceride level (p>0.05).
- The mean triglyceride values were significantly higher than the mean final triglyceride levels (mean difference = 14.1 mmol/L, p > 0.05).
- The final triglyceride values were on average 14.1 mmol/L higher than the triglyceride levels (95% CI: -10.9 - 39.0 mmol/L, t15=1.20, p=0.249). This difference was not considered statistically significant.
- The triglyceride values were on average 14.1 mmol/L higher than the final triglyceride levels (95% CI: -10.9 - 39.0 mmol/L, t15=1.20, p=0.249). This difference was not considered statistically significant.

- The triglyceride values were on average 14.1 mmol/L higher than the final triglyceride levels (95% CI: -10.9 - 39.0 mmol/L, t15=1.20, p=0.249). This difference was not considered statistically significant.

You use an ANOVA to compare the means of three or more groups instead of doing multiple pairwise t-tests because ...
- The ANOVA controls for the type I error, reducing the chances of incorrectly concluding there is a difference between some groups.
- my lecturer told me to.
- the ANOVA is testing something different to the t-test.
- it is easier to do one ANOVA in SPSS compared to running multiple t-tests.

- The ANOVA controls for the type I error, reducing the chances of incorrectly concluding there is a difference between some groups.
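To see why multiple pairwise t-tests inflate the type I error, the sketch below counts the pairwise comparisons for k groups and approximates the chance of at least one false positive as 1 - (1 - alpha)^m. This approximation assumes the m tests are independent, which pairwise tests on shared groups are not exactly, so treat the numbers as indicative only. The function name is hypothetical.

```python
from math import comb


def familywise_error(k_groups, alpha=0.05):
    """Return (number of pairwise tests, approximate chance of >= 1 false positive)."""
    m = comb(k_groups, 2)            # number of pairwise comparisons
    fwer = 1 - (1 - alpha) ** m      # assumes independent tests
    return m, fwer


for k in (3, 4, 5):
    m, fwer = familywise_error(k)
    print(f"{k} groups: {m} t-tests, approximate type I error = {fwer:.3f}")
```

Even with only three groups, the approximate overall error rate is well above the nominal 0.05 used for a single test.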

If the assumption for equal variances is violated in an ANOVA what should you do?
- Report the usual ANOVA F statistic
- Don't worry about it because the ANOVA is robust to violations in the assumptions
- Use the Kruskal-Wallis test instead of the ANOVA
- Use the Welch F or Brown-Forsythe F test statistics

Use the Welch F or Brown-Forsythe F test statistics

If you obtained the following ANOVA output, would you conduct post-hoc tests?

                      Df Sum Sq Mean Sq F value Pr(>F)
    Factor(Treatment)  3     90   30.00   0.364   0.55
    Residuals         68   5600   82.35

- Yes
- No

No. The ANOVA output indicates there is no significant difference between the means of the groups being compared. Therefore we don't need to do post-hoc tests.

True or false? I can use the results of the normality check done as part of my univariate descriptive statistics to check the normality assumption for a t-test.

False

How do you summarise categorical data when using two variables?

Cross tabulation (contingency table)

How do you calculate the expected frequencies?

(column total x row total) / overall total

Module 3 Flashcards

(64 cards)