Final Flashcards

1
Q

ANOVA

A

analysis of variance; tests for differences in sample means across two or more groups

2
Q

In ANOVA if H0 is false, there should be…

A

…a substantial difference between categories but not within them

3
Q

F ratio =

A

mean square between / mean square within

4
Q

F ratio is bigger when…

A

categories are more distinct and tightly clustered

5
Q

[ . [ ] . ]

[ . ]

A

F ratio = smaller/larger = smaller

6
Q

[ . ] [ . ]

[ . ]

A

F ratio = larger/smaller = larger

7
Q

The assumptions of an ANOVA test

A

independent random samples, interval/ratio measurement, normal distribution, population variances are equal

8
Q

Limitations of ANOVA

A
  • requires an interval/ratio dependent variable and a nominal independent variable
  • just bc a result is significant doesn’t mean it’s substantive
  • the alternate hypothesis is not specific
9
Q

the alternate hypothesis of ANOVA test

A

At least one of the population means differs from the others

10
Q

synonyms of mean square between/mean square within

A

mean square between = sum of squares between / degrees of freedom between (SSB/dfb)

mean square within = sum of squares within / degrees of freedom within (SSW/dfw)
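
A minimal sketch of how the mean squares and the F ratio could be computed by hand. It assumes the usual one-way ANOVA degrees of freedom (dfb = k − 1 and dfw = N − k, which are not stated on the card); the function name `f_ratio` and the scores are hypothetical.

```python
def f_ratio(groups):
    """Return (MSB, MSW, F) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    n = len(all_scores)          # total number of cases (N)
    k = len(groups)              # number of categories
    grand_mean = sum(all_scores) / n

    # SSB: group size times squared distance of each group mean from the grand mean
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # SSW: squared distance of each score from its own group mean
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    dfb = k - 1                  # assumed: degrees of freedom between
    dfw = n - k                  # assumed: degrees of freedom within
    msb = ssb / dfb
    msw = ssw / dfw
    return msb, msw, msb / msw


# Hypothetical data: three distinct, tightly clustered groups
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
msb, msw, f = f_ratio(groups)
print(msb, msw, f)
```

Because the groups are far apart and tightly clustered, the mean square between is much larger than the mean square within, so F is large.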

11
Q

Is ANOVA one-tailed or two-tailed?

A

one-tailed

12
Q

the main question of ANOVA

A

is there more variance between categories or within?

13
Q

strengths of chi-square test

A

allows use of nominal (and ordinal) variables for dependent instead of just interval/ratio like ANOVA

14
Q

in a bivariate table, is the independent variable in the columns or rows?

A

independent = columns, dependent = rows

15
Q

chi-square test

A

a test of independence/significance based on bivariate, crosstabulation tables.

16
Q

the H0 of chi-square test

A

the variables are independent, so the observed frequencies match the expected frequencies (Fo = Fe)

17
Q

what does Fe stand for and how is it calculated?

A

Fe = expected frequency = (row marginal × column marginal) / n
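
A small sketch of the expected-frequency calculation, followed by the chi-square obtained. The chi-square formula used here, the sum of (Fo − Fe)² / Fe over all cells, is the standard one but is not written on the card, so treat it as an assumption; the function names and counts are hypothetical.

```python
def expected_frequencies(table):
    """Fe for every cell: (row marginal * column marginal) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Chi-square obtained: sum of (Fo - Fe)^2 / Fe over all cells."""
    fe_table = expected_frequencies(table)
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, fe_table)
        for fo, fe in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 table of observed frequencies (rows = dependent variable)
observed = [[30, 10],
            [20, 40]]
fe = expected_frequencies(observed)
chi2 = chi_square(observed)
print(fe, chi2)
```

The expected frequencies keep the same row and column marginals as the observed table; only the cell counts change.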

18
Q

assumptions of chi-square test

A

independent random sample, nominal level of measurement, no assumption of sampling distribution

19
Q

why is there no assumption of sampling distribution in chi-square test?

A

bc the chi-square test is non-parametric, i.e. it makes no assumption about the shape of the population distribution

20
Q

degrees of freedom in chi-square

A

(rows - 1)(columns - 1)
degrees of freedom is like Sudoku → given the marginals, how many cells must be filled in before all of the remaining blanks can be figured out

21
Q

limitations of chi-square

A
  • tells us whether the variables are related, but not the strength, pattern, or nature of the relationship
  • difficult to interpret when variables have many categories
  • with a small sample size, it cannot be assumed that chi-square sampling distribution is accurate
  • very sensitive to sample size
22
Q

how does chi-square test react to large sample sizes?

A

as the sample size increases, the chi-square obtained increases. With large samples, trivial relationships may be statistically significant (i.e. a substantively unimportant relationship can still lead us to reject H0)
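
A rough illustration of this sensitivity, assuming the standard chi-square formula (sum of (Fo − Fe)² / Fe) and the standard critical value of 3.841 for 1 degree of freedom at alpha = 0.05. The counts are hypothetical: the same weak 52% vs 48% pattern at n = 100 and at n = 10,000.

```python
def chi_square(table):
    """Chi-square obtained for a table of observed frequencies (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n   # expected frequency
            total += (fo - fe) ** 2 / fe
    return total


CRITICAL_DF1 = 3.841  # critical chi-square, df = 1, alpha = 0.05

small = [[26, 24], [24, 26]]                        # n = 100
large = [[c * 100 for c in row] for row in small]   # same percentages, n = 10,000

chi_small = chi_square(small)
chi_large = chi_square(large)
print(chi_small, chi_small > CRITICAL_DF1)
print(chi_large, chi_large > CRITICAL_DF1)
```

The column percentages are identical in both tables, yet only the larger sample pushes the chi-square obtained past the critical value.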

23
Q

three questions of bivariate association

A

(1) does an association exist?
(2) how strong is the association?
(3) what is the pattern/direction of association?

24
Q

when do we want to use Lambda?

A

for nominal variables with large sample sizes that can’t be properly assessed with chi-square.

25
Q

PRE measures

A

Proportional Reduction in Error
1st prediction: ignore information about the independent variable and make many errors E in predicting the value of the dependent variable
2nd prediction: take into account information about the independent variable in predicting the value of the dependent. If variables are associated, we should make fewer errors

26
Q

is Lambda PRE?

A

yes

lambda = (E1 - E2)/E1
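
A sketch of the lambda calculation from a bivariate table. The counting rules for E1 and E2 are the usual textbook ones and are an assumption here, since the card only gives the final formula: E1 = n minus the largest row marginal, and E2 = for each column, the column total minus its largest cell, summed across columns. The function name and counts are hypothetical.

```python
def lambda_pre(table):
    """Lambda for a table with the dependent variable in the rows
    and the independent variable in the columns."""
    n = sum(sum(row) for row in table)
    # E1: errors made when predicting the modal row category for every case
    e1 = n - max(sum(row) for row in table)
    # E2: errors made when predicting the modal row category within each column
    e2 = sum(sum(col) - max(col) for col in zip(*table))
    return (e1 - e2) / e1


# Hypothetical observed frequencies
table = [[30, 10],
         [20, 40]]
lam = lambda_pre(table)
print(lam)
```

Here E1 = 40 and E2 = 30, so knowing the independent variable removes a quarter of the prediction errors.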

27
Q

interpreting Lambda statistic

A

e.g. lambda = .33 means that knowing the independent variable improves our ability to predict the dependent variable by 33%. In other words, errors of prediction are reduced by 33%
0.00–0.10 = weak
0.11–0.30 = moderate
0.31–1.00 = strong

28
Q

Limitations of Lambda

A
  • asymmetric (value will vary depending on which variable is independent, so care is needed in designating the independent variable)
  • when row totals are very unequal, Lambda can be zero even when there is an association between the variables
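
The second limitation can be sketched with a hypothetical table whose row totals are very unequal (80 vs 20). The E1 and E2 counting rules below are the usual textbook ones and are assumed, not taken from the card.

```python
def lambda_pre(table):
    """Lambda with the dependent variable in rows, independent in columns."""
    n = sum(sum(row) for row in table)
    e1 = n - max(sum(row) for row in table)               # errors ignoring the independent variable
    e2 = sum(sum(col) - max(col) for col in zip(*table))  # errors using the independent variable
    return (e1 - e2) / e1


# Hypothetical counts: row 1 dominates both columns
table = [[45, 35],
         [5, 15]]

# Column percentages for row 1 differ (90% vs 70%), so the variables are associated
pct_row1 = [table[0][j] / (table[0][j] + table[1][j]) for j in range(2)]
lam = lambda_pre(table)
print(pct_row1, lam)
```

Row 1 is the modal category in every column, so the prediction never changes and lambda comes out as zero despite the difference in percentages.
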
29
Q

when row marginals are very unequal, what test should be used?

A

chi-square

30
Q

analyzing association between variables at ordinal level

A

to detect an association in a bivariate table, use chi-square, then use Somers’ d to measure the strength of the association (uses the same scale as Lambda to determine strength)

31
Q

scattergrams

A

display relationships between two interval/ratio variables

32
Q

describe the axes of a scattergram

A

X = independent variable (horizontal axis), Y = dependent variable (vertical axis)

33
Q

regression line

A

aka line of best fit, a line that gets as close to all cases as possible.

34
Q

assessing the strength of regression lines

A

clustering around the lines indicates strength of the linear relationship between two variables

35
Q

formula of regression line

A
Y = a + bX
where,
Y = score on the dependent variable
a = the Y intercept
b = the slope, i.e. the amount of change in Y produced by a one-unit change in X
X = score on the independent variable
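
A minimal sketch of how a and b could be found, assuming the "line of best fit" is the ordinary least-squares line (the card does not name the method). The function name `fit_line` and the scores are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariation of X and Y divided by the variation of X
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: the line passes through the point (mean_x, mean_y)
    a = mean_y - b * mean_x
    return a, b


# Hypothetical scores that fall exactly on Y = 1 + 2X
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = fit_line(xs, ys)
print(a, b)
```

With these scores each one-unit change in X raises Y by 2, so b is 2 and the line crosses the vertical axis at 1.
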
36
Q

Pearson’s r

A

measure of association for two interval-ratio variables
0.00–0.10 = weak
0.11–0.30 = moderate
0.31–1.00 = strong

37
Q

r squared

A

aka coefficient of determination, provides PRE interpretation
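
A short sketch computing Pearson's r and r squared, using the standard definition of r (covariation of X and Y divided by the square root of the product of their variations), which is assumed here rather than taken from the cards. The function name and scores are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r for two interval-ratio variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)
r_squared = r ** 2   # proportion of the variation in Y explained by X
print(r, r_squared)
```

For these scores r is about 0.8, so r squared is about 0.64: X explains roughly 64% of the variation in Y.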

38
Q

multivariate regression

A

looks at the part of y that x can explain that z can’t explain i.e. the effect of x on y while controlling for z

39
Q

formula for multivariate regression line

A

Y = a + (b1)(X1) + (b2)(X2)

each partial slope shows the effect of one independent variable on Y while controlling for the other

40
Q

the ANOVA test is designed for independent variables measured at the ____ level.

A

nominal

41
Q

In the ANOVA test, the sample means should be roughly equal in value…

A

… if the null hypothesis is true.

42
Q

The ANOVA test uses means and standard deviations to compare the amount of variation _____ with the amount of variation _____.

A

within categories, between categories

43
Q

In the ANOVA test, if the null hypothesis is false, the means of the different samples should be _____ and the standard deviations of the different samples should be _____.

A

very different in value, low in value

44
Q

In the ANOVA test, if the null hypothesis is true, then…

(a) SSB should be at least twice as much as dfb
(b) SSB should be much greater than SSW
(c) the mean square between should be roughly equal to or smaller than the mean square within
(d) the combined dfb and dfw should be much greater than the SST

A

(c) the mean square between should be roughly equal to or smaller than the mean square within

45
Q

ANOVA is a one tailed test and we are concerned only with those outcomes in which there is more variance…

A

…between categories than within categories.

46
Q

To conduct a chi square test, the variables must first be organized into a ______.

A

bivariate table

47
Q

The subtotals calculated for bivariate tables are also known as ____.

A

marginals

48
Q

In the context of chi square, variables are independent if…

(a) they are related.
(b) cause and effect can be proved.
(c) the obtained chi square falls in the critical region.
(d) the score of a case on one variable has no effect on the score of the case on the other variable.

A

(d) the score of a case on one variable has no effect on the score of the case on the other variable.

49
Q

In a 2x2 table, all cell frequencies are exactly the same. This is consistent with which of the following conditions?

A

The variables are independent.

50
Q

When the null hypothesis in the chi square test for independence is true, there should be…

A

…little difference between the observed frequencies and the expected frequencies.

51
Q

A Chi square test has been conducted to assess the relationship between marital status and church attendance. The obtained Chi square is 23.45 and the critical Chi square is 9.488. What may be concluded?

A

Reject the null hypothesis; church attendance and marital status are dependent.

52
Q

In a research study conducted to determine if arrests were related to the socioeconomic class of the offender, the chi square critical score was 9.488 and the chi square test statistic was 12.2. We can conclude that the variables are ____.

A

dependent.

53
Q

If variables are arranged in a bivariate table, we can see if they are associated by…

(a) adding their scores vertically.
(b) subtracting their scores horizontally.
(c) computing percentages in the direction of the independent variable.
(d) computing percentages in the direction of the dependent variable.

A

(c) computing percentages in the direction of the independent variable.

54
Q

In the case of a perfect association, predictions from one variable to another can be made (with/without) error.

A

without

55
Q

“As education increases, income rises.” This is an example of a ______ relationship.

A

positive

56
Q

If a researcher is looking to perform an analysis based upon the relationship between the number of arrests as an adult and number of encounters with police as a juvenile, they would use which measure of association?

(a) Somers d.
(b) Lambda.
(c) Chi-Square.
(d) none of these would be appropriate.

A

(d) none of these would be appropriate.

57
Q

Proportional reduction in error (PRE) measures of association are based on the logic of ______.

A

prediction

58
Q

If there is no association between two variables, knowledge of the independent variable does what to the number of errors of prediction?

A

Does not change the number of errors of prediction.

59
Q

A bivariate table shows the association between gender and whether or not a person ever attends formal religious services. Lambda was .34. What may be concluded?

(a) Women are more likely to attend church.
(b) Men are more likely to attend church.
(c) Knowing a person’s gender improves our ability to predict whether or not they attend religious services by 34%.
(d) Knowing whether a person attends religious services improves our ability to predict their gender by 34%.

A

(c) Knowing a person’s gender improves our ability to predict whether or not they attend religious services by 34%.

60
Q

A researcher has computed a Somers’s d of −0.75 between marital happiness and number of children. What can be concluded from this result?

A

that there is a strong, negative relationship between number of children and marital happiness

61
Q

On a scatterplot, the regression line…

A

…comes as close as possible to touching every score.

62
Q

There is no linear relationship between two interval-ratio variables when the regression line on a scatterplot…

A

is parallel to the horizontal axis.

63
Q

The Y intercept is the point where…

A

…the regression line crosses the vertical axis of the scattergram.

64
Q

If a regression line is parallel to the horizontal axis of the scattergram, the slope (b) will be ___.

A

0.00

65
Q

If the regression line showing the effect of education on income has a slope of 1000…

A

…every year of education increases income by 1000

66
Q

A researcher wants to measure the strength of the association between income (measured in dollars per year) and education (measured in number of years of formal schooling). Which of the following would be the most appropriate measure?

(a) the slope (b)
(b) y-intercept
(c) chi-square
(d) pearson’s r

A

(d) pearson’s r

67
Q

In a study of the relationship between geographical mobility (number of times a person has changed residences) and number of friends, Pearson’s r is reported as .40. Which of the following would be the most correct interpretation?

A

Mobility explains 16% of the variation in number of friends.

68
Q

mean square between groups

A

a measure of the variability between the group means (SSB/dfb); used because we consider many groups simultaneously and evaluate whether their sample means differ more than we would expect from natural variation

69
Q

Null hypothesis for mean square between groups

A

If the null hypothesis is true, any variation in the sample means is due to chance and shouldn’t be too large.

70
Q

what is the statistic for ANOVA

A

F statistic (the F ratio)