Chapters Beyond 9 Flashcards

1
Q

What do the mean squares represent?

A

The between-group variability and the within-group variability.

In the F statistic, between-group is on top (numerator) and within-group is on the bottom (denominator).

2
Q

When do we do pairwise comparisons?

A

When we reject H_o in an ANOVA and want to see which means are different between groups.

3
Q

Why do we do pairwise comparisons?

A

To be able to see which groups have different means

4
Q

What does it mean if two variables measured on the same subject are associated?

A

Knowing one value of the variable tells us something about the value of the other variable.

5
Q

What do we use to denote the correlation coefficient, and what are its units?

A

r.

r has no units

6
Q

What is the parameter for the correlation coefficient?

A

Rho (ρ)

7
Q

r = 1 corresponds to what correlation?

A

A perfect positive correlation.

8
Q

r = -1 corresponds to what correlation?

A

A perfect negative correlation.

9
Q

r = 0 signifies what?

A

No correlation. No linear association.

10
Q

What happens to the scatter plot when there is a strong linear association?

A

The points are tightly clustered around a line.

11
Q

What is the explanatory variable?

A

X

12
Q

Which is the response variable?

A

Y

13
Q

Does r change if we interchange the explanatory and response variables?

A

No
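
A minimal Python sketch (made-up data; assumes numpy is available) showing that swapping the two variables leaves r unchanged:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # explanatory variable
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # response variable

r_xy = np.corrcoef(x, y)[0, 1]   # correlation of x with y
r_yx = np.corrcoef(y, x)[0, 1]   # correlation of y with x
print(r_xy, r_yx)                # identical values; r also has no units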

14
Q

When there is a strong linear association, what do we know?

A

That information about one variable helps in predicting the other.

15
Q

What happens in a weak association?

A

The points are scattered broadly

16
Q

What does a low r mean?

A

It means there is little or no linear association; however, it does not necessarily mean there is no association at all (the relationship could be nonlinear).
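
A quick illustration (made-up data; assumes numpy): a perfect but nonlinear (quadratic) relationship can still give r near 0.

import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2                        # y is completely determined by x, but not linearly

r = np.corrcoef(x, y)[0, 1]
print(r)                          # ~0: no linear association, yet a clear association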

17
Q

Correlation does not imply _______

A

Causation

18
Q

What is linear regression used for?

A

To find a line that summarizes the linear relationship between two variables.

With it we can make predictions about y, the response variable

19
Q

How do we notate the regression line?

A

y_i = beta_0 + beta_1 * x_i + epsilon_i

beta_0 = intercept
beta_1 = slope
epsilon_i = error term; indicates how far y_i is from the line.
20
Q

What is beta_0?

A

Intercept

Represents the average value of y when x is zero.

21
Q

What is beta_1?

A

Slope

Represents the change in the average value of y for every one-unit increase in x.

22
Q

What is epsilon_i?

A

Error term. Indicates how far y_i is from the line.

23
Q

We estimate the slope and intercept of the regression line from the data to get:

A

y_hat_i = beta_hat_0 + beta_hat_1 * x_i

24
Q

You don’t want to use a regression line to estimate values that are…

A

Outside the range of the data in our sample. (This is known as extrapolation.)

25
Q

What is the least squares regression line?

A

The line that minimizes the total squared vertical error (i.e., the sum of the squared residuals).

26
Q

What are residuals?

A

The vertical deviations between the points and the line: epsilon_i = y_i - (beta_0 + beta_1 * x_i)

27
Q

In the least squares line:

beta_hat_1 can be found by?

A

r * (sd_y / sd_x)

28
Q

In the least squares line:

beta_hat_0 can be found by?

A

y_bar - beta_hat_1 * x_bar
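
A short Python sketch (made-up data; assumes numpy) applying the two formulas above, beta_hat_1 = r * (sd_y / sd_x) and beta_hat_0 = y_bar - beta_hat_1 * x_bar:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

r = np.corrcoef(x, y)[0, 1]
sd_x, sd_y = x.std(ddof=1), y.std(ddof=1)      # sample standard deviations

beta_hat_1 = r * (sd_y / sd_x)                 # slope
beta_hat_0 = y.mean() - beta_hat_1 * x.mean()  # intercept

print(beta_hat_0, beta_hat_1)
# np.polyfit(x, y, 1) returns the same slope and intercept as a check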

29
Q

In ANOVA, what describes how much the observations vary around the sample mean?

A

The within-group variance.

30
Q

In ANOVA, what describes how much the sample means vary around the overall (grand) mean?

A

The between-group variance.
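
A minimal sketch (made-up groups; assumes scipy) of a one-way ANOVA, where the F statistic is the between-group mean square divided by the within-group mean square:

from scipy import stats

group1 = [4.1, 5.0, 4.8, 5.3]
group2 = [6.2, 5.9, 6.5, 6.1]
group3 = [5.0, 4.7, 5.2, 4.9]

f_stat, p_value = stats.f_oneway(group1, group2, group3)  # F = MS_between / MS_within
print(f_stat, p_value)   # small p-value: reject H_o, then follow up with pairwise comparisons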

31
Q

What does a Chi-squared Goodness of Fit Test measure?

A

It measures the discrepancy between observed cell counts and the cell counts expected under the null hypothesis, to assess whether the hypothesized distribution is plausible.

32
Q

What is H_o for the goodness of fit test?

A

H_o: p_1 = p_1*, p_2 = p_2*, …, p_k = p_k*

33
Q

What is H_A for the goodness of fit test?

A

H_A: p_i != p_i*, for some i.

34
Q

What are the expected counts in a goodness of fit test?

A

The hypothesized proportion for each category multiplied by the total number of observations in our test: e_i = p_i* * n.
All the expected counts summed should equal the total number of observations in our test!

35
Q

What is the Pearson chi-square test statistic?

A

chi-squared = sum from i = 1 to k of [(x_i - e_i)^2] / e_i

or the sum of [(observed - expected)^2] / expected
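
A small Python sketch (made-up counts and hypothesized proportions; assumes scipy) of a goodness-of-fit test:

from scipy import stats

observed = [18, 55, 27]                # observed cell counts x_i
p_star = [0.25, 0.50, 0.25]            # hypothesized proportions p_i*
n = sum(observed)
expected = [p * n for p in p_star]     # expected counts e_i = p_i* * n (they sum to n)

chi2, p_value = stats.chisquare(observed, f_exp=expected)  # sum of (x_i - e_i)^2 / e_i
print(chi2, p_value)                   # p-value from the right tail, k - 1 = 2 degrees of freedom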

36
Q

How is the p-value for a chi-squared goodness of fit test calculated?

A

We use the chi-square table, and we always use the one-tailed, right-hand (upper) tail.

37
Q

What are the degrees of freedom for a chi-square goodness of fit test?

A

k-1, or the number of groups minus 1.

38
Q

What do we use a chi-square test of independence for?

A

To determine whether two variables, summarized in a 2-way contingency table, are independent.

39
Q

What is a two-way contingency table?

A

A set of frequencies that summarizes how a set of objects is simultaneously classified under two different categorizations.

40
Q

What is H_o in a test for independence?

A

H_o: the two variables are independent

41
Q

What is H_A in a test for independence?

A

H_A: the two variables are not independent.

42
Q

How do we calculate the expected count in a test for independence? (e_ij)

A

e_ij = (row_i total * col_j total)/grand total

43
Q

What is the chi-square test statistic for a test for independence?

A

chi^2 = sum from i = 1 to r, sum from j = 1 to c, of (x_ij - e_ij)^2 / e_ij
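
A short sketch (made-up 2x3 contingency table; assumes scipy) of a test of independence:

import numpy as np
from scipy import stats

table = np.array([[20, 30, 25],
                  [30, 20, 25]])       # two-way contingency table of observed counts x_ij

chi2, p_value, dof, expected = stats.chi2_contingency(table, correction=False)
print(chi2, p_value, dof)              # dof = (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2
print(expected)                        # e_ij = row_i total * col_j total / grand total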

44
Q

What are the degrees of freedom in a chi-square test for independence involving an r*c table?

A

(r-1)(c-1)

45
Q

When does the chi-square test for independence work well?

A

When all expected counts e_ij >= 5.

46
Q

What two hypotheses tests are there for categorical data?

A

1) goodness of fit test
2) chi-square test of independence
(both are chi-square tests!)