Module 3 Flashcards

1
Q

How do you summarise categorical data when using 1 variable?

A
  • frequencies, proportions or percentages
2
Q

What test is used to compare the distribution of a categorical variable to a hypothesised distribution?

A

chi-squared goodness-of-fit test (also called the one-sample chi-squared test)

3
Q

What is an example of a one-sample chi squared test hypothesis?

A

H0: the variable is evenly distributed across its categories
H1: the variable is not evenly distributed across its categories

4
Q

Why is the one sample chi-squared test (chi squared goodness-of-fit test) used?

A
  • to quantify the discrepancy between the expected and observed frequencies
    e.g. between the sample distribution and the hypothesised distribution
5
Q

What is the shape of the chi-squared distribution?

A
  • non-symmetric (right-skewed); values are always positive
  • the shape changes with the df (degrees of freedom)

6
Q

What are degrees of freedom?

A

  • for the goodness-of-fit test: df = number of groups - 1
  • indicates how many of the data points are ‘flexible’ (free to vary)

7
Q

What test is used to look at association between 2 categorical variables?

A
  • chi-squared test of independence
8
Q

How is the test statistic calculated for chi-squared test of independence?

A
  • same as for the goodness-of-fit test, except the expected count must be calculated for each cell in the table
    i.e. expected count = (column total x row total) / overall total
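
The cell-by-cell calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical 2x3 table of counts (the numbers are made up); it computes the expected counts, the χ² statistic as the sum of (observed - expected)^2 / expected, and the df, but not the p-value.

```python
# Hypothetical 2x3 contingency table of observed counts
# (rows = groups, columns = categories). The numbers are made up.
table = [
    [20, 30, 50],
    [30, 30, 40],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total x column total) / overall total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum over all cells of (observed - expected)^2 / expected
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)

# df = (number of rows - 1) x (number of columns - 1)
df = (len(table) - 1) * (len(table[0]) - 1)

# Assumption check: every expected cell count should be at least 5
assumption_ok = all(e >= 5 for row in expected for e in row)

print(f"expected counts: {expected}")
print(f"chi-squared = {chi_sq:.3f}, df = {df}, expected counts OK: {assumption_ok}")
```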
9
Q

How do you calculate the df for a contingency table/cross-table?

A

df = (number of rows - 1) x (number of columns - 1)

10
Q

What does a small χ² value mean?

A
  • the observed value is approximately equal to the expected value in each cell
  • the observed and expected values differ only because of sampling variability
11
Q

What causes a large χ² value?

A
  • sampling variability alone (the p-value gives the probability of a value at least this large if H0 is true)
  • the null hypothesis is not true

12
Q

What are the chi-squared test of independence assumptions?

A
  • the observational units are independent
  • the expected cell counts should be at least 5 (≥5)

13
Q

What are the limitations of the χ² test of independence?

A
  • not informative about how the variables are related
  • can only really be used for bivariate analysis

14
Q

What are other options for assessing associations in categorical variables?

A
  • relative risk
  • odds ratio
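
As a rough illustration of how these two measures are computed from a 2x2 table, here is a minimal Python sketch. The counts and the exposed/unexposed labels are hypothetical, and the formulas are the standard textbook definitions rather than anything specific to this module.

```python
# Hypothetical 2x2 table (made-up counts):
#               event   no event
# exposed        a=30     b=70
# unexposed      c=15     d=85
a, b = 30, 70
c, d = 15, 85

# Relative risk: risk of the event in the exposed group
# divided by the risk in the unexposed group.
risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
relative_risk = risk_exposed / risk_unexposed

# Odds ratio: odds of the event in the exposed group
# divided by the odds in the unexposed group, i.e. (a*d)/(b*c).
odds_ratio = (a / b) / (c / d)

print(f"relative risk = {relative_risk:.2f}")
print(f"odds ratio    = {odds_ratio:.2f}")
```

Unlike the χ² statistic, both measures describe the direction and size of the association: a value of 1 means no association.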

15
Q

Can the chi-squared test of independence be used for before-and-after data?

A
  • no, because the measurements are made on the same individuals, so the observations are not independent
16
Q

What is a McNemar’s test?

A
  • used for 2x2 tables to test repeated measurements of the same variable

SIMPLE CONCEPT:

  • if no change, participants stay on the diagonal
  • if change, participants move off the diagonal
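
The diagonal idea can be sketched in Python. This is a hedged illustration with made-up counts; it uses the standard uncorrected form of the McNemar statistic, (b - c)^2 / (b + c) with 1 df, which is an assumption here because the cards do not give the formula.

```python
# Hypothetical paired 2x2 table (made-up counts):
#                 after: yes   after: no
# before: yes         40           5
# before: no          15          40
table = [
    [40, 5],
    [15, 40],
]

# Diagonal cells: participants whose answer did not change.
unchanged = table[0][0] + table[1][1]

# Off-diagonal cells: participants who changed, in each direction.
b = table[0][1]  # yes before, no after
c = table[1][0]  # no before, yes after

# Standard McNemar statistic without continuity correction (assumed form);
# only the off-diagonal (changed) cells contribute. It has 1 df.
mcnemar_stat = (b - c) ** 2 / (b + c)

print(f"unchanged = {unchanged}, changed = {b + c}")
print(f"McNemar statistic = {mcnemar_stat:.2f} (df = 1)")
```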
17
Q

What test is used for continuous data?

A

one-sample t-test (compares the mean of a continuous variable with a fixed value)

18
Q

What is a t-test?

A
  • parametric test used for testing differences in means
  • the one-sample version tests the hypothesis that the mean of a sample is equal to a fixed value

19
Q

What does the one sample t-test assume?

A
  • the data are normally distributed
20
Q

What is the test statistic equation for a one-sample t-test?

A

t = (sample mean - expected value) / (sample SD / square root of the sample size)
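
A minimal Python sketch of this equation, assuming a small made-up sample and a hypothesised value of 5.0. It uses only the standard library and stops at the test statistic and df (no p-value).

```python
import math
import statistics

# Hypothetical sample and hypothesised (expected) value; the numbers are made up.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 4.7, 5.3]
expected_value = 5.0

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)         # sample SD (n - 1 in the denominator)
standard_error = sample_sd / math.sqrt(n)    # sample SD / square root of sample size

# t = (sample mean - expected value) / standard error
t_stat = (sample_mean - expected_value) / standard_error
df = n - 1

print(f"mean = {sample_mean:.2f}, SD = {sample_sd:.3f}, SE = {standard_error:.3f}")
print(f"t = {t_stat:.3f}, df = {df}")
```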

21
Q

What influences the fatness of the tails of the t-distribution?

A
  • degrees of freedom
22
Q

What makes a t-distribution closer to a normal distribution?

A
  • a bigger sample size (more df)
23
Q

What is the shape of the sampling distribution for a t-test if n > 30?

A
  • sampling distribution of means is approximately normally distributed
24
Q

When is a two sample/independent sample t-test used?

A

to compare the means of two independent groups

  • the dependent variable is continuous
  • the independent variable is categorical (two groups)
25
What are the assumptions for a two-sample t-test?
- the data are normally distributed or n > 30
- the results come from two independent samples
- the variances in the two groups are the same
26
What test is used if it is unknown whether the sample is normally distributed?
- a non-parametric test (Mann-Whitney test)
27
What does a Mann-Whitney test (Wilcoxon rank-sum test) compare?
- the medians of two samples
28
When is a paired t-test used?
- before and after
- left and right arm
29
What is an ANOVA?
- one-way analysis of variance
- used to compare means when there are >2 groups
30
What are the ANOVA hypotheses?
H0: the means are the same
H1: at least one mean differs
31
What are the two types of variation within data for ANOVA?
- between groups
- within groups
32
How can you tell if the variation is between groups?
- the group distributions sit at different positions along the x-axis
33
How do you calculate total variation?
SStotal (total sum of squares) = sum of (observation - overall mean)^2, taken over all observations
34
What is the conversion of SS to MS (mean square) for?
- to account for the different df in each calculation
35
What is the total variance equation?
MStotal = SStotal/(N-1)
36
What is the between group variance equation?
MSgroups = SSgroups/(k-1)
37
What is the within group variance equation?
MSerror = SSerror/(N-k)
38
What does the ratio of MSgroups/MSerror show?
- how much bigger the group effect is compared with random noise
39
What is the variance ratio?
- the ratio of two variances
- denoted F (F is the test statistic for ANOVA)
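
The SS, MS and F calculations from the preceding cards can be sketched together in Python. This is a minimal illustration with three made-up groups; N is the total number of observations and k is the number of groups, and the formulas for SSgroups and SSerror are the standard one-way ANOVA definitions (an assumption here, since the cards only spell out SStotal).

```python
# Hypothetical data: three groups of three observations each (made-up numbers).
groups = {
    "A": [4, 5, 6],
    "B": [6, 7, 8],
    "C": [8, 9, 10],
}

all_values = [x for values in groups.values() for x in values]
N = len(all_values)   # total number of observations
k = len(groups)       # number of groups
grand_mean = sum(all_values) / N

# Total variation: every observation compared with the overall mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Between-group variation: each group mean compared with the overall mean,
# weighted by the group size.
ss_groups = sum(
    len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values()
)

# Within-group variation: each observation compared with its own group mean.
ss_error = sum((x - sum(v) / len(v)) ** 2 for v in groups.values() for x in v)

# Convert SS to MS by dividing by the matching df.
ms_groups = ss_groups / (k - 1)
ms_error = ss_error / (N - k)

# Variance ratio: the ANOVA test statistic.
f_stat = ms_groups / ms_error

print(f"SStotal = {ss_total}, SSgroups = {ss_groups}, SSerror = {ss_error}")
print(f"MSgroups = {ms_groups}, MSerror = {ms_error}, F = {f_stat}")
```

In this sketch SSgroups + SSerror equals SStotal, which is a useful sanity check on any hand calculation.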
40
When is a post-hoc test used?
- if H0 is rejected in an ANOVA, to determine which means differ
41
What is the most common post hoc test?
- Tukey's test
42
What are the steps for interpreting ANOVA results?
- check the ANOVA assumptions
- conduct the ANOVA
- if p-value > 0.05, do not reject H0
- if p-value < 0.05, reject H0 and do post-hoc testing
43
What test is used if the normality assumption is violated (ANOVA)?
- a non-parametric test such as the Kruskal-Wallis test
44
What test is used for equal variances assumption?
- Levene's test, which tests the H0 that the variances of the groups are the same
- if the test is significant, the variances are not equal
45
What test statistic equation is used for the one-sample chi-squared test?
χ² = sum over all categories of (observed - expected)^2 / expected
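
A minimal Python sketch of the goodness-of-fit statistic in its usual form, the sum over categories of (observed - expected)^2 / expected. It assumes made-up counts from 60 rolls of a die tested against an even distribution, and reports only the statistic and df (number of groups - 1), not the p-value.

```python
# Hypothetical observed counts: 60 rolls of a die (made-up numbers).
observed = [8, 12, 9, 11, 6, 14]

n = sum(observed)
k = len(observed)

# H0: evenly distributed, so each face is expected n / k times.
expected = [n / k] * k

# Chi-squared = sum over categories of (observed - expected)^2 / expected
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# df = number of groups - 1
df = k - 1

print(f"expected per category = {expected[0]}")
print(f"chi-squared = {chi_sq:.2f}, df = {df}")
```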
46
What type of test is a t-test?
- parametric test
47
What is an assumption of a parametric test?
- assumes the data follows a known distribution
48
What does a one-sample t-test test?
- the hypothesis that the mean of a sample is equal to a fixed value
49
What test is used to decide if variances are equal?
Levene's test
50
What is an advantage of paired test?
- removes the variation between patients, leaving only the effect of the drug
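
One way to see this is that a paired t-test works on the within-patient differences, so each patient acts as their own control. The sketch below assumes made-up before and after values for five patients and uses the standard paired form (a one-sample t-test of the differences against 0), which is not spelled out in the cards.

```python
import math
import statistics

# Hypothetical measurements on the same five patients (made-up numbers).
before = [150, 160, 145, 170, 155]
after = [140, 155, 140, 160, 150]

# Patients differ a lot from each other (145 to 170 before treatment),
# but the paired test only looks at the change within each patient.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)

# Paired t statistic: mean difference divided by its standard error.
t_stat = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1

print(f"differences = {differences}")
print(f"mean difference = {mean_diff}, t = {t_stat:.3f}, df = {df}")
```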
51
What are the assumptions for ANOVA?
- normally distributed or n > 30
- equal variances
- independence among observations
52
What overall type I error rate is maintained across all post-hoc tests?
- 0.05
53
Q

Which of the following are assumptions of the Chi-square test of independence?
- There are no assumptions for this test
- Expected cell counts ≥5
- Data are normally distributed
- Observations are independent
- Cell counts ≥5

A

- Expected cell counts ≥5
- Observations are independent
54
Q

If I conduct a Chi-square Test of Independence on a 3x4 contingency table and discover I have several cells with expected counts < 5, what should I do?
- Remove the troublesome categories and repeat the analysis.
- Continue with the analysis and report the results.
- Try using another statistical test.
- Review the expected counts for each cell to identify the problem categories and then try to combine them with another category if it is sensible to do so.

A

Review the expected counts for each cell to identify the problem categories and then try to combine them with another category if it is sensible to do so.
55
True or false? I would use a Chi-square Test of Independence if I want to look for an association between two ordinal variables in my data set.
True
56
True or false? I would use a Chi-square Test of Independence if I want to look for an association between two continuous variables in my data set.
False
57
How many degrees of freedom are there in a Chi-square analysis if I am comparing two variables, one with three categories and the other with four categories?
6. The degrees of freedom (df) are calculated using the formula (number of categories in variable 1 - 1) x (number of categories in variable 2 - 1). When expressed in a contingency table, this simplifies to (number of rows - 1) x (number of columns - 1). In this question there were 3 categories in one variable and 4 in the other, so df = (3-1) x (4-1) = 6.
58
Q

Which is the most appropriate summary of results for the following R Commander output?

Paired t-test
data: Baseline Triglyceride and Final Triglyceride levels
t = 1.200, df = 15, p-value = 0.249
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: -10.915 39.040
sample estimates: mean of the differences (Baseline - Final) 14.062

- There was a significant difference in the mean baseline triglyceride level and the mean final triglyceride level (p>0.05).
- The mean baseline triglyceride values were significantly higher than the mean final triglyceride levels (mean difference = 14.1 mmol/L, p > 0.05).
- The final triglyceride values were on average 14.1 mmol/L higher than the baseline triglyceride levels (95% CI: -10.9 - 39.0 mmol/L, t15=1.20, p=0.249). This difference was not considered statistically significant.
- The baseline triglyceride values were on average 14.1 mmol/L higher than the final triglyceride levels (95% CI: -10.9 - 39.0 mmol/L, t15=1.20, p=0.249). This difference was not considered statistically significant.

A

- The baseline triglyceride values were on average 14.1 mmol/L higher than the final triglyceride levels (95% CI: -10.9 - 39.0 mmol/L, t15=1.20, p=0.249). This difference was not considered statistically significant.
59
Q

You use an ANOVA to compare the means of three or more groups instead of doing multiple pairwise t-tests because ...
- The ANOVA controls for the type I error, reducing the chances of incorrectly concluding there is a difference between some groups.
- my lecturer told me to.
- the ANOVA is testing something different to the t-test.
- it is easier to do one ANOVA in SPSS compared to running multiple t-tests.

A

- The ANOVA controls for the type I error, reducing the chances of incorrectly concluding there is a difference between some groups.
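
A rough Python sketch of why multiple pairwise t-tests inflate the type I error. It assumes each test is run at alpha = 0.05 and, as a simplifying approximation, that the tests are independent, so the chance of at least one false positive across m tests is 1 - (1 - alpha)^m.

```python
from math import comb

alpha = 0.05  # type I error rate for a single test

# Chance of at least one false positive when every pair of k groups
# is compared with its own t-test (assumes independent tests, which
# is only an approximation for pairwise comparisons).
familywise_error = {}
for k in (3, 4, 5):
    n_tests = comb(k, 2)  # number of pairwise comparisons among k groups
    familywise_error[k] = 1 - (1 - alpha) ** n_tests
    print(f"{k} groups: {n_tests} t-tests, "
          f"chance of at least one false positive = {familywise_error[k]:.3f}")
```

With only three groups the overall error rate is already well above 0.05, which is the problem a single ANOVA avoids.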
60
Q

If the assumption for equal variances is violated in an ANOVA, what should you do?
- Report the usual ANOVA F statistic
- Don't worry about it because the ANOVA is robust to violations in the assumptions
- Use the Kruskal-Wallis test instead of the ANOVA
- Use the Welch F or Brown-Forsythe F test statistics

A

Use the Welch F or Brown-Forsythe F test statistics
61
Q

If you obtained the following ANOVA output, would you conduct post-hoc tests?

                   Df  Sum Sq  Mean Sq  F value  Pr(>F)
Factor(Treatment)   3      90    30.00    0.364    0.55
Residuals          68    5600    82.35

- Yes
- No

A

No. The ANOVA output indicates there is no significant difference between the means of the groups being compared. Therefore we don't need to do post-hoc tests.
62
True or false? I can use the results of the normality check done as part of my univariate descriptive statistics to check the normality assumption for a t-test.
False
63
How do you summarise categorical data when using 2 variables?
Cross tabulation (contingency table)
64
How do you calculate the expected frequencies?
(column total x row total) / overall total