Lecture 4: Analysis of Continuous and Categorical Variables Flashcards

1
Q

Descriptive vs Inferential Statistics

A

descriptive: describing the central tendency and dispersion of data

inferential: use sample data to draw conclusions about the population that the sample is meant to represent (sampling naturally involves error)
o Estimate parameters and test hypotheses to make inferences about the population
o Compare means and evaluate relationships
o Test statistics, p-values, confidence intervals

2
Q

Which test do I apply if I have 2 related samples and parametric data?

A

paired t-test

3
Q

Which test do I apply if I have 2 related samples and non-parametric data?

A

Wilcoxon signed-rank test

4
Q

Which test do I apply if I have 2 independent samples and parametric data?

A

Independent t-test

5
Q

Which test do I apply if I have 2 independent samples and non-parametric data?

A

Mann-Whitney U test

6
Q

Which test do I apply if I have 3 or more groups and parametric data?

A

ANOVA

7
Q

Which test do I apply if I have 3 or more groups and non-parametric data?

A

Kruskal-Wallis test
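
The test-selection cards above can be summarised as a small lookup. This is only a sketch: the function name `choose_test` and its parameters are hypothetical, and it assumes that the "3 or more groups" cards refer to independent groups. Combinations the cards do not cover raise an error rather than guess.

```python
def choose_test(n_groups: int, related: bool, parametric: bool) -> str:
    """Return the test named on the cards for a given study design.

    Assumption: the 3-or-more-groups cards describe independent groups.
    """
    if n_groups == 2:
        if related:
            return "paired t-test" if parametric else "Wilcoxon test"
        return "independent t-test" if parametric else "Mann-Whitney U test"
    if n_groups >= 3 and not related:
        return "ANOVA" if parametric else "Kruskal-Wallis test"
    raise ValueError("combination not covered by these cards")


print(choose_test(2, related=False, parametric=False))
print(choose_test(4, related=False, parametric=True))
```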

8
Q

What is Student's t-test?

A

Used to compare means between two groups
o Related groups: paired t-test (e.g. pre- and post- study measures on the same participants)
o Independent groups: unpaired/independent t-test
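
As an illustration of the related-groups case, a paired t-test can be sketched by hand: it works on the within-participant differences. The function name `paired_t` and the pre/post values are made up for this example; only the t-statistic and df are computed, not the p-value.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired t-test on matched pre/post measurements."""
    if len(pre) != len(post):
        raise ValueError("paired samples must have the same length")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD of differences / sqrt(n)).
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1


pre = [72, 75, 80, 68, 77]   # made-up pre-study measures
post = [70, 72, 79, 65, 74]  # made-up post-study measures, same participants
t, df = paired_t(pre, post)
print(f"t = {t:.2f}, df = {df}")
```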

9
Q

What is the null hypothesis in Student's t-test?

A

The population means of the two groups are equal (there is no difference between the means)

10
Q

What are the degrees of freedom in Student's t-test?

A

The amount of information provided by the data that can be used to estimate population parameters and the variability of the estimates

o df = n - # of estimated parameters
o As df increases, the t-distribution more closely resembles a normal distribution

• E.g. One-sample t-test to estimate the population mean
o Estimates the standard deviation about the mean
o Uses a t-distribution with df = n – 1
o df = n – 1 for the paired t-test as well
• E.g. Two-sample independent t-test to compare two means
o Uses a t-distribution with df = n1 + n2 – 2

11
Q

What are the assumptions of the t-test?

A

• Samples are independent
• Variable is normally distributed
• Variance homogeneity: the variance within each group is equal
o Levene’s test for equality of variances in SPSS (automatically conducted)
o Informs you whether to use the results for pooled or unpooled variance
• t-tests are fairly robust even if the assumptions are not perfectly met

12
Q

What is the t-statistic?

A

Difference between the means divided by the pooled or unpooled standard error of the difference between the means
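
A minimal sketch of this calculation for two independent samples follows. The function name `independent_t` and the data are hypothetical; "unpooled" is taken here to mean the standard error that does not assume equal variances.

```python
import math
import statistics


def independent_t(x, y, pooled=True):
    """Return the t-statistic for two independent samples.

    pooled=True assumes equal variances in both groups;
    pooled=False uses the unpooled standard error.
    """
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    if pooled:
        # Pooled variance: df-weighted average of the two sample variances.
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    else:
        se = math.sqrt(v1 / n1 + v2 / n2)
    return (statistics.mean(x) - statistics.mean(y)) / se


x = [24, 26, 22, 28]  # made-up values for group 1
y = [20, 23, 21, 24]  # made-up values for group 2
t_pooled = independent_t(x, y, pooled=True)
t_unpooled = independent_t(x, y, pooled=False)
print(round(t_pooled, 3), round(t_unpooled, 3))
```

With equal group sizes, as in this example, the pooled and unpooled standard errors coincide; they differ when the group sizes and variances are unequal.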

13
Q

What is a confidence interval?

A

Degree of uncertainty: the range around the sample statistic within which the corresponding population parameter is likely to lie

14
Q

True/False

The larger the sample, the narrower the CI

A

True

Greater likelihood that the sample statistic approximates the population parameter
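
The effect of sample size on CI width can be illustrated with a rough sketch. It assumes a normal (z) critical value instead of a t critical value for simplicity, and the mean, SD and sample sizes are made up; `ci_for_mean` is a hypothetical helper.

```python
import math
from statistics import NormalDist


def ci_for_mean(mean, sd, n, level=0.95):
    """Approximate CI for a mean using a normal critical value.

    Assumption: z is used instead of t; for small n a t critical
    value would give a slightly wider interval.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin


widths = {}
for n in (25, 100, 400):
    low, high = ci_for_mean(24.0, 4.0, n)
    widths[n] = high - low
    print(n, round(widths[n], 2))
```

Because the margin scales with 1/sqrt(n), quadrupling the sample size halves the width of the interval.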

15
Q

True/False

If the CI for the difference between two means contains 0 (the null value), then the means are not statistically different (non-significant finding)

A

True

16
Q

How to report independent t-test results?

A

Report the means and standard deviations for both groups, t-value, degrees of freedom, and p-value.

o E.g. Males (mean ± StD): 24.0 ± 4.1, Females (mean ± StD): 22.9 ± 4.0; t = 1.3, df = 116, p = 0.20

P-values from t-test often reported in subject characteristics table when data is compared between two groups (e.g. males vs. females, intervention vs. control group)

17
Q

Analysis of Variance (ANOVA)

A

Test to determine if means differ between 3 or more groups
o Unlike t-tests, uses variance to assess differences
• Tests the null hypothesis that the means of all groups are equal
o The F-test will result in rejection of the null hypothesis when the variability between group means is sufficiently larger than the variability within the groups.

18
Q

F-statistic

A

Ratio of the variability between group means to the variability within the groups. If the ratio is large, it indicates that not all means are equal –> a significant p-value will result

19
Q

In an F-test what does it mean to reject the null hypothesis?

A

variability between group means is sufficiently larger than variability within the groups

20
Q

Degrees of freedom in ANOVA

A

• Df1: df associated with the numerator of the F-statistic
o Df = k – 1
o k: # of group means
• Df2: df associated with the denominator of the F-statistic
o Df = n – k

The numerator represents between-group variability, so Df1 is calculated as k – 1 (k is the number of groups). The denominator represents within-group variability, and Df2 is calculated as n – k (sample size minus the number of group means).
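
To make the F ratio and its two degrees of freedom concrete, here is a small sketch of a one-way ANOVA computed by hand. The function name `one_way_anova` and the three tiny groups are made up for illustration; no p-value is computed.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df1, df2) for a one-way ANOVA on a list of groups."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total sample size
    grand_mean = statistics.mean([v for g in groups for v in g])
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum(
        (v - statistics.mean(g)) ** 2 for g in groups for v in g
    )
    df1, df2 = k - 1, n - k
    f = (ss_between / df1) / (ss_within / df2)
    return f, df1, df2


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up data
f, df1, df2 = one_way_anova(groups)
print(f"F({df1}, {df2}) = {f:.1f}")
```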

21
Q

What does variability look like in SPSS?

A

If variability within a group is low, the dots for that group sit close together. If variability between the groups is high, the groups' dots lie further apart from each other, so you are more likely to find a significant difference.

Dots that are more spread out within a group indicate more variation (high within-group variability); tightly clustered dots indicate less dispersion and more precision. When the differences between the groups are larger than the differences within the groups, you will have a significant finding.

22
Q

True/False

A low F value (little variability between your groups relative to within them) leads to retaining the null hypothesis.

A

True

23
Q

What are the assumptions of ANOVA?

A
  • Variable is normally distributed
  • The errors are normally distributed
  • The cases are independent from each other
  • Variance homogeneity
24
Q

How to report ANOVA results?

A
  • Report means and standard deviations for all groups, as well as the F value, degrees of freedom, and p-value
  • Text for this example:
o Analysis of variance indicated that the different physical activity groups report different levels of caloric intake:
§ Sedentary: 1640.1 ± 516.7
§ Light: 2030.8 ± 570.9
§ Moderate: 1999.4 ± 670.7
§ High: 2005.1 ± 633.5
§ F(3, 144) = 3.3, p = 0.02
25
Q

Post hoc test ANOVA

A

• ANOVA tells you if there is a difference between means, but not specifically which means differ
o Conduct multiple comparisons post-hoc test to determine which means differ

26
Q

What post hoc tests are available for ANOVA?

A

o Tukey’s Test
o Least Significant Difference (LSD)
o Dunnett’s test (compares each group with a single control group)

27
Q

One-way ANOVA

A

considers only one independent variable (factor) for independent groups

28
Q

Repeated measures one-way ANOVA

A

one-way ANOVA for related groups

29
Q

Multivariate ANOVA (MANOVA)

A

ANOVA with several dependent variables

30
Q

Factorial ANOVA

A

Compares means across two or more independent variables (factors)

31
Q

What tests can be used for the analysis of categorical variables?

A

• Chi-square test of independence
o Tests whether there is a significant association between two or more categorical variables

• Fisher’s Exact Test
o Use when 20% or more of the cells have expected counts < 5

• Test for Trend
o More powerful for ordinal data

32
Q

Chi-square test of independence

A

Tests whether there is a significant association between two or more categorical variables

• Utilizes contingency tables (also referred to as cross-tabulation, crosstab, or two-way table)
o 2 x 2 table when each variable has 2 groups
o 2 x k table when one variable has 2 groups and the other has k groups
• Assesses goodness of fit between observed values and theoretically expected values
• Df = k – 1 for a 2 x k table (k = number of groups/columns); in general, Df = (rows – 1) x (columns – 1)
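
The comparison of observed and expected counts can be sketched as follows. The function name `chi_square_independence` and the counts are made up; expected counts are derived from the row and column totals, and df is computed as (rows – 1) x (columns – 1), which reduces to k – 1 for a 2 x k table.

```python
def chi_square_independence(table):
    """Return (chi2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


observed = [[30, 20], [20, 30]]  # made-up 2 x 2 table
chi2, df, expected = chi_square_independence(observed)
print(f"chi-square = {chi2:.1f}, df = {df}")
```

The returned `expected` table can also be inspected to see how many cells have small expected counts before relying on the chi-square approximation.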

33
Q

Fisher’s Exact Test

A

Use when 20% or more of the cells have expected counts < 5
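
For a 2 x 2 table, the exact p-value can be sketched directly from the hypergeometric distribution. The function name `fisher_exact_2x2` and the counts are hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention, stated as an assumption.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum probabilities of all tables no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )


p = fisher_exact_2x2(3, 1, 1, 3)  # made-up small-sample table
print(round(p, 4))
```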

34
Q

Test for Trend

A

More powerful for ordinal data

• Referred to as Linear-by-Linear association in SPSS
o Tests for trends in contingency tables larger than 2x2
• Takes into account ordered nature of data
o Relates to odds rather than variances
o Null hypothesis: a change in ranks makes no difference to the odds of the outcome (i.e. Odds Ratio = 1)
o Therefore, df = 1
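
One standard form of the linear-by-linear statistic is M² = (N – 1) x r², where r is the correlation between row scores and column scores weighted by the cell counts. The sketch below assumes that form and equally spaced scores; the function name `linear_by_linear`, the scores and the counts are all made up for illustration.

```python
import math


def linear_by_linear(table, row_scores, col_scores):
    """Return (M2, df) with M2 = (N - 1) * r**2 and df = 1."""
    n = sum(sum(row) for row in table)
    mean_r = sum(u * sum(row) for u, row in zip(row_scores, table)) / n
    mean_c = sum(v * sum(col) for v, col in zip(col_scores, zip(*table))) / n
    cov = var_r = var_c = 0.0
    for u, row in zip(row_scores, table):
        for v, count in zip(col_scores, row):
            cov += count * (u - mean_r) * (v - mean_c)
            var_r += count * (u - mean_r) ** 2
            var_c += count * (v - mean_c) ** 2
    r = cov / math.sqrt(var_r * var_c)  # count-weighted Pearson correlation
    return (n - 1) * r ** 2, 1


table = [[10, 20, 30], [30, 20, 10]]  # made-up 2 x 3 table, ordered columns
m2, df = linear_by_linear(table, row_scores=[0, 1], col_scores=[1, 2, 3])
print(f"M2 = {m2:.2f}, df = {df}")
```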

35
Q

What are the assumptions of Chi-squared test?

A

o Variables are ordinal or nominal
o Groups/categories are independent
§ Use McNemar's test for related groups (if the observations are from the same people)
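
For related groups, McNemar's test uses only the discordant pairs of a paired 2 x 2 table. The sketch below assumes the uncorrected chi-square form (no continuity correction) with df = 1; the function name `mcnemar` and the counts are made up.

```python
import math


def mcnemar(b, c):
    """McNemar chi-square (no continuity correction) and its p-value.

    b and c are the two discordant cell counts of the paired 2 x 2 table.
    """
    stat = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square distribution with df = 1.
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p


stat, p = mcnemar(15, 5)  # made-up discordant counts
print(f"chi-square = {stat:.1f}, p = {p:.3f}")
```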