Week 9, 10, 11, 12 Flashcards

1
Q

Expected contingency table

A

The contingency table you expect from a population where the null hypothesis is true.

2
Q

Hypothesis testing for categorical data is based on __

A

contingency tables

3
Q

Difference between 1 way and 2 way expected contingency tables

A

A 1 way contingency table tests whether counts differ between the levels of a single variable
-Expectation: Counts are distributed equally among cells (take the total count and divide by the number of levels)

A 2 way contingency table tests whether the counts of two variables are independent of each other
-Expectation: Counts are distributed independently among cells
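
A minimal sketch of the 1 way expectation, assuming made-up counts for three levels (the function name and data are illustrative, not from the course):

```python
def expected_one_way(observed):
    """Expected counts for a 1 way table: the total spread equally over the levels."""
    total = sum(observed)
    k = len(observed)  # number of levels (cells)
    return [total / k] * k

# Hypothetical observed counts for three levels of one variable
observed = [18, 30, 12]
print(expected_one_way(observed))  # [20.0, 20.0, 20.0]
```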

4
Q

How to calculate independence for 2 way tables

A

Calculating independence requires first calculating the marginal distributions (row totals and column totals) as proportions of the table total
The expected table is the product of the row and column proportions for each cell, multiplied by the table total
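
The calculation can be sketched as follows, assuming a hypothetical 2 x 2 table of counts (names and numbers are illustrative only):

```python
def expected_two_way(observed):
    """Expected counts under independence for a 2 way table (list of rows)."""
    total = sum(sum(row) for row in observed)
    # Marginal distributions as proportions of the table total
    row_props = [sum(row) / total for row in observed]
    col_props = [sum(col) / total for col in zip(*observed)]
    # Each cell: row proportion x column proportion x table total
    return [[rp * cp * total for cp in col_props] for rp in row_props]

# Hypothetical observed counts (2 rows x 2 columns)
observed = [[10, 20], [30, 40]]
for row in expected_two_way(observed):
    print(row)  # roughly [12, 18] and then [28, 42]
```

The expected table keeps the same row totals, column totals, and table total as the observed table.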

5
Q

What does the chi-square score measure

A

Measures the distance between observed and expected contingency tables. It works by calculating the squared difference between the two tables on a cell-by-cell basis, dividing each by the expected count, and summing over all cells

6
Q

The four steps in calculating the chi-square score

A

Take the difference between each observed and expected cell

Square the difference

Divide by the expected value

Sum over all cells in the table
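
The four steps can be sketched in a few lines, assuming hypothetical observed and expected tables (the data and function name are illustrative):

```python
def chi_square_score(observed, expected):
    """Chi-square score for two tables of the same shape (lists of rows)."""
    score = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            diff = o - e          # step 1: observed minus expected
            squared = diff ** 2   # step 2: square the difference
            scaled = squared / e  # step 3: divide by the expected value
            score += scaled       # step 4: sum over all cells
    return score

# Hypothetical tables; the expected counts assume independence
observed = [[10, 20], [30, 40]]
expected = [[12.0, 18.0], [28.0, 42.0]]
print(round(chi_square_score(observed, expected), 2))  # 0.79
```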

7
Q

Chi Square Distribution

A

The null distribution for hypothesis testing with categorical data

It is the distribution of chi-square scores you would get from sampling an imaginary statistical population where the null hypothesis was true

Takes only non-negative values (the score is a sum of squared terms, so it cannot be below zero)

8
Q

Degrees of freedom for 1 way tables vs 2 way tables

A

1 way: df=k-1
Number of cells minus 1

2 way: df=(r-1)(c-1)
(rows-1 multiplied by columns-1)
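
As a quick check of both formulas, a small sketch (function names are illustrative):

```python
def df_one_way(k):
    """Degrees of freedom for a 1 way table with k cells."""
    return k - 1

def df_two_way(rows, columns):
    """Degrees of freedom for a 2 way table."""
    return (rows - 1) * (columns - 1)

print(df_one_way(3))     # 2
print(df_two_way(2, 2))  # 1
print(df_two_way(3, 4))  # 6
```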

9
Q

Statistical test conclusions for chi square scores

A

-Reject the null hypothesis if X^2 observed > X^2 critical, or equivalently if p < alpha

-Fail to reject the null hypothesis if X^2 observed < X^2 critical, or equivalently if p > alpha
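
A hedged sketch of both decision rules, assuming a hypothetical observed score and df = 1. The p-value shortcut below uses a closed form that holds for df = 1 only; other degrees of freedom need a table or a statistics library.

```python
import math

def chi_square_decision(chi2_observed, chi2_critical):
    """Decision by comparing the observed score to the critical value."""
    if chi2_observed > chi2_critical:
        return "reject H0"
    return "fail to reject H0"

def p_value_df1(chi2_observed):
    """Upper-tail p-value of the chi-square distribution; valid for df = 1 only."""
    return math.erfc(math.sqrt(chi2_observed / 2))

def p_value_decision(p, alpha=0.05):
    """Decision by comparing the p-value to alpha."""
    return "reject H0" if p < alpha else "fail to reject H0"

chi2_observed = 0.79   # hypothetical observed score
chi2_critical = 3.841  # tabulated critical value for df = 1, alpha = 0.05
print(chi_square_decision(chi2_observed, chi2_critical))  # fail to reject H0
print(p_value_decision(p_value_df1(chi2_observed)))       # fail to reject H0
```

Both rules give the same conclusion because the p-value falls below alpha exactly when the observed score exceeds the critical value.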

10
Q

Reports for chi square tests should include:

A

Name of Test
Degrees of freedom
Total count in the observed table
The observed chi-squared value (two decimal places)
P-value (three decimal places)

11
Q

Write out the formula for the t-observed for correlation

A

Refer to formula sheet
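
The course formula sheet is not reproduced in this deck. For reference, the standard textbook form of the test statistic is shown below, where r is the sample correlation and n is the number of paired observations; the sheet's notation may differ.

```latex
t_{\text{observed}} = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad df = n - 2
```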

12
Q

Reporting of correlation test should include

A

Symbol for the test
Degrees of freedom
Observed correlation value
P-value (three decimal places)

13
Q

Correlation

A

Evaluate the association between two numerical variables (looking for a pattern)
No implied causation between the variables
Both variables are assumed to have variation
Not used for prediction
Pearson’s correlation coefficient (r for a sample, ρ for the population) measures the strength of the association

14
Q

Write out the formula for r

A

Refer to formula sheet
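
The formula sheet itself is not included here. As a hedged illustration, the usual definition of Pearson's r (the sum of cross-products divided by the square root of the product of the two sums of squares) can be computed like this, with made-up data:

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and the two sums of squares
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 3))  # 0.775
```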

15
Q

Difference between correlation and linear regression

A

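Correlation describes an association and cannot be used to predict; linear regression can predict the response variable from the predictor variable (typical of experimental studies)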

16
Q

What is the linear equation? Describe the components

A

y = a + b(x)

Slope (b):
-Amount that the response variable (y) increases or decreases for every unit change in the predictor variable (x)

Intercept (a):
-The value of the response variable (y) when the predictor variable (x) is at 0
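
A tiny sketch of how the two components behave, assuming hypothetical values for a and b:

```python
def predict(a, b, x):
    """Predicted response from the linear equation y = a + b * x."""
    return a + b * x

a, b = 2.0, 0.5  # hypothetical intercept and slope

print(predict(a, b, 0))                       # 2.0, the intercept (y at x = 0)
print(predict(a, b, 11) - predict(a, b, 10))  # 0.5, the slope (change in y per unit x)
```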

17
Q

Statistical Model (3 components)

A

Systematic component: describes the mathematical function used for predictions
-Linear equation

Random component: describes the probability distribution for sampling error (Normal distribution in linear regression)
-Error distribution
-Only occurs in the response variable

Link function: connects the systematic component to the random component

18
Q

How to calculate the sum of squares

A
  1. Calculate the residual for each data point
  2. Take the square of each residual
  3. Sum the squared residuals across all data points
  4. Divide by the degrees of freedom (this turns the sum of squares into a mean square, i.e. a variance)
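
The steps can be sketched for a fitted line, assuming hypothetical data and coefficients. The df = n - 2 used here is the usual choice for a regression with one predictor (two estimated parameters) and is an assumption about the course's convention.

```python
def residual_sum_of_squares(x, y, a, b):
    """Steps 1 to 3: residuals, squared residuals, and their sum."""
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]  # step 1
    squared = [res ** 2 for res in residuals]                # step 2
    return sum(squared)                                      # step 3

def residual_mean_square(x, y, a, b):
    """Step 4: divide the sum of squares by the degrees of freedom."""
    df = len(x) - 2  # assumption: intercept and slope were both estimated
    return residual_sum_of_squares(x, y, a, b) / df

# Hypothetical data with its least-squares line y = 2.2 + 0.6x
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(residual_sum_of_squares(x, y, 2.2, 0.6), 2))  # 2.4
print(round(residual_mean_square(x, y, 2.2, 0.6), 2))     # 0.8
```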
19
Q

Null and alternative hypothesis for linear regression

A

Intercept: How does the intercept (a) relate to a reference value (Ba)

Slope: How does the slope (b) relate to a reference value (Bb)

20
Q

What is the null distribution in linear regression hypothesis tests

A

a t-distribution

21
Q

What are the 4 main assumptions of linear regression

A
  1. Linearity
    - The response variable is a linear function of the predictor variable
    - y = a + bx is a straight line
  2. Independence
    - The residuals along the predictor variable should be independent of each other
    - Evaluated qualitatively using a plot of residuals against the predictor variable
  3. Normality
    - Residual variation should be Normally distributed
    - Evaluated using the Shapiro-Wilk test
  4. Homoscedasticity
    - The residual variation should be similar across the range of the predictor variable
22
Q

F test

A

Evaluates the difference in variance between two groups

23
Q

Null and alternative hypothesis of F-test

A

Ho: ratio of variances is 1

Ha: ratio of variances is not 1
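
A minimal sketch of the observed F score as a ratio of two sample variances, using made-up groups. Which group goes in the numerator is a convention this deck does not state, so the order here is an assumption.

```python
from statistics import variance

def f_score(group_1, group_2):
    """Observed F score: sample variance of group 1 over that of group 2."""
    return variance(group_1) / variance(group_2)

# Hypothetical measurements for two groups
group_1 = [2, 4, 6, 8]
group_2 = [4, 5, 5, 6]

print(round(f_score(group_1, group_2), 2))  # 10.0
print(len(group_1) - 1, len(group_2) - 1)   # degrees of freedom for each group: 3 3
```

Under the null hypothesis the two variances are equal, so the ratio is expected to be close to 1.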

24
Q

What to report for F-score

A

Mean, sd, sample size for each group
Observed F score
Degrees of freedom for each group
P-value

25
Q

Single Factor ANOVA

A

Used with a numerical response variable and a categorical predictor variable (factor)

Group variation: the variation among means of the categorical levels

Residual variation: the variation among sampling units within each categorical level

ANOVA evaluates whether there is a difference in means among categorical levels
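
A hedged sketch of how group and residual variation combine into an F score, using the standard mean-square ratio and made-up data (the function name is illustrative):

```python
def one_way_anova_f(groups):
    """F score for a single factor ANOVA; groups is a list of lists of values."""
    values = [v for group in groups for v in group]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(group) / len(group) for group in groups]
    # Group variation: level means around the grand mean
    ss_group = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation: sampling units around their own level mean
    ss_residual = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_group = ss_group / (k - 1)
    ms_residual = ss_residual / (n - k)
    return ms_group / ms_residual

# Hypothetical measurements for three levels of one factor
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(round(one_way_anova_f(groups), 2))  # 13.0
```

A large F score means the variation among level means is large relative to the variation within levels.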

26
Q

Post-Hoc tests

A

Secondary statistical tests designed to indicate which groups have different means
Only used if the ANOVA F-test rejects the null hypothesis

27
Q

Two Factor ANOVA

A

Two categorical variables and their interaction

28
Q

Two factor ANOVA is used to answer 3 questions

A

Main effects A
-Differences among the levels of factor A averaging across the levels of factor B

Main effects B
-Differences among the levels of factor B averaging across the levels of factor A

Interactions
-Differences among the levels of one factor within each level of the other factor
-Cell by cell comparisons

29
Q

Interactions are deviations from____

A

additivity: when the combined effect of the levels is their simple sum
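
One way to see a deviation from additivity is with a 2 x 2 table of cell means. This sketch uses hypothetical means; the contrast is zero when the effect of factor B is the same at both levels of factor A.

```python
def interaction_contrast(cell_means):
    """Difference in the B effect between the two levels of A (2 x 2 cell means)."""
    (a1b1, a1b2), (a2b1, a2b2) = cell_means
    effect_b_at_a1 = a1b2 - a1b1
    effect_b_at_a2 = a2b2 - a2b1
    return effect_b_at_a2 - effect_b_at_a1

# Hypothetical cell means: rows are levels of A, columns are levels of B
additive = [[10, 12], [15, 17]]     # B adds 2 at both levels of A
interacting = [[10, 12], [15, 25]]  # B adds 2 at A1 but 10 at A2

print(interaction_contrast(additive))     # 0
print(interaction_contrast(interacting))  # 8
```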