Final Flashcards by Momo Sylvia

ANOVA

analysis of variance; tests for differences in sample means across 2 or more groups

In ANOVA if H0 is false, there should be…

…a substantial difference between categories but not within them

F ratio =

mean square between / mean square within

F ratio is bigger when…

category means are more distinct and the cases within each category are tightly clustered

[ . [ ] . ]

[ . ]

F ratio = smaller/larger = smaller

[ . ] [ . ]

[ . ]

F ratio = larger/smaller = larger

The assumptions of an ANOVA test

independent random samples, interval/ratio measurement, normal distribution, population variances are equal

Limitations of ANOVA

requires an interval/ratio dependent variable and a nominal independent variable
just because a result is significant doesn't mean it's substantive
the alternative hypothesis is not specific

the alternative hypothesis of the ANOVA test

At least one of the population means differs from the others

how are mean square between and mean square within calculated?

mean square between = sum of squares between / degrees of freedom between

mean square within = sum of squares within / degrees of freedom within
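
As a rough illustration of how these pieces combine into the F ratio, here is a minimal Python sketch. The function name `f_ratio` and the two small data sets are hypothetical, chosen only so the arithmetic is easy to follow; it assumes the usual one-way ANOVA degrees of freedom (k - 1 between, N - k within).

```python
def f_ratio(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Sum of squares between: group size times the squared distance
    # of each group mean from the grand mean.
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares within: squared distance of each score
    # from its own group mean.
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    dfb = k - 1          # degrees of freedom between
    dfw = n_total - k    # degrees of freedom within
    msb = ssb / dfb      # mean square between
    msw = ssw / dfw      # mean square within
    return msb / msw, dfb, dfw


# Hypothetical scores for three categories.
distinct = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]      # distinct, tightly clustered
overlapping = [[1, 5, 9], [2, 5, 8], [3, 5, 7]]   # same means, wide spread

f_distinct, dfb, dfw = f_ratio(distinct)
f_overlap, _, _ = f_ratio(overlapping)
print(f_distinct, f_overlap, dfb, dfw)
```

The distinct, tightly clustered categories give a much larger F than the overlapping ones, which matches the "F ratio is bigger when…" card.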

Is ANOVA one-tailed or two-tailed?

one-tailed

the main question of ANOVA

is there more variance between categories or within?

strengths of chi-square test

allows use of nominal (and ordinal) variables for dependent instead of just interval/ratio like ANOVA

in a bivariate table, is the independent variable in the columns or rows?

independent = columns, dependent = rows

chi-square test

a test of independence/significance based on bivariate, crosstabulation tables.

the H0 of chi-square test

the variables are independent; the observed frequencies equal the expected frequencies (Fo = Fe)

what does Fe stand for and how is it calculated?

expected frequencies = (row marginal x column marginal) / n
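
A small Python sketch of this calculation, assuming the table is stored as a list of rows of observed frequencies. The function name `expected_frequencies` and the numbers are made up for illustration.

```python
def expected_frequencies(table):
    """Expected cell frequencies under independence."""
    row_totals = [sum(row) for row in table]          # row marginals
    col_totals = [sum(col) for col in zip(*table)]    # column marginals
    n = sum(row_totals)
    # Fe for each cell = (row marginal x column marginal) / n
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2x2 table of observed frequencies (Fo).
observed = [[10, 20],
            [30, 40]]
expected = expected_frequencies(observed)
print(expected)
```

Each expected row and column still sums to the same marginals as the observed table; only the distribution of cases across cells changes.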

assumptions of chi-square test

independent random sample, nominal level of measurement, no assumption of sampling distribution

why is there no assumption of sampling distribution in chi-square test?

because the chi-square test is non-parametric, i.e. it requires no assumption about the shape of the distribution

degrees of freedom in chi-square

(rows - 1)(columns - 1)
degrees of freedom is like Sudoku: how many cells can be missing while it is still possible to figure out all of the blanks

limitations of chi-square

tells us whether the variables are independent, but it doesn’t tell us about the pattern/nature of the relationship
difficult to interpret when variables have many categories
with a small sample size, it cannot be assumed that chi-square sampling distribution is accurate
very sensitive to sample size

how does chi-square test react to large sample sizes?

as the sample size increases, the obtained chi-square increases. With large samples, trivial relationships may be statistically significant (i.e. a relationship can be significant without being strong or important)
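
The sensitivity to sample size can be seen in a short Python sketch. It assumes the standard chi-square formula, the sum of (Fo - Fe)² / Fe over all cells; the function name `chi_square` and the tables are hypothetical.

```python
def chi_square(table):
    """Obtained chi-square: sum of (Fo - Fe)^2 / Fe over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n   # expected frequency
            total += (fo - fe) ** 2 / fe
    return total


# Hypothetical table, then the same percentages with 10 times the cases.
small = [[10, 20], [30, 40]]
large = [[cell * 10 for cell in row] for row in small]

chi_small = chi_square(small)
chi_large = chi_square(large)
# Critical chi-square for df = 1 at alpha = .05 is 3.841.
print(chi_small > 3.841, chi_large > 3.841)
```

The pattern of percentages is identical in both tables, yet only the larger one exceeds the critical value: multiplying every cell by 10 multiplies the obtained chi-square by 10.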

three questions of bivariate association

(1) does an association exist?
(2) how strong is the association?
(3) what is the pattern/direction of association?

when do we want to use Lambda?

for nominal variables with large sample sizes that can’t be properly assessed with chi-square.

PRE measures

Proportional Reduction in Error
1st prediction: ignore information about the independent variable and make many errors (E1) in predicting the value of the dependent variable
2nd prediction: take into account information about the independent variable in predicting the value of the dependent variable. If the variables are associated, we should make fewer errors (E2)

is Lambda PRE?

yes | lambda = (E1 - E2)/E1
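
A minimal Python sketch of the lambda formula, assuming the independent variable is in the columns and the dependent variable in the rows. The function name `lambda_pre` and both tables are hypothetical. E1 counts the errors made by always guessing the largest row category; E2 counts the errors made by guessing the largest cell within each column.

```python
def lambda_pre(table):
    """Lambda = (E1 - E2) / E1 for a table with the dependent variable in rows."""
    n = sum(sum(row) for row in table)
    # E1: ignore the independent variable, always predict the modal row.
    e1 = n - max(sum(row) for row in table)
    # E2: within each column, predict that column's modal row.
    e2 = sum(sum(col) - max(col) for col in zip(*table))
    return (e1 - e2) / e1   # undefined if E1 is 0


# Hypothetical table with equal row totals.
balanced = [[30, 10],
            [10, 30]]
# Hypothetical table with very unequal row totals: the column percentages
# differ (90% vs 70% in row 1), yet row 1 is the modal guess in both columns.
unequal = [[45, 35],
           [5, 15]]

lam_balanced = lambda_pre(balanced)
lam_unequal = lambda_pre(unequal)
print(lam_balanced, lam_unequal)
```

The second table gives a lambda of zero even though the columns differ, because knowing the column never changes the best guess.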

interpreting Lambda statistic

e.g. lambda = .33 means that knowing the independent variable improves our ability to predict the dependent variable by 33%; in other words, prediction errors are reduced by 33%
0.00–0.10 = weak
0.11–0.30 = moderate
0.31–1.00 = strong

Limitations of Lambda

- asymmetric (the value will vary depending on which variable is treated as independent, so care is needed in designating the independent variable)
- when row totals are very unequal, Lambda can be zero even when there is an association between the variables

when row marginals are very unequal, what test should be used?

chi-square

analyzing association between variables at the ordinal level

to detect an association in a bivariate table, use chi-square; then use Somers' d to measure the strength of the association (uses the same scale as Lambda to determine strength)

scattergrams

display relationships between two interval/ratio variables

describe the axes of a scattergram

X = independent variable (horizontal axis), Y = dependent variable (vertical axis)

regression line

aka line of best fit, a line that gets as close to all cases as possible.

assessing the strength of regression lines

the more tightly the cases cluster around the line, the stronger the linear relationship between the two variables

formula of regression line

```
Y = a + bX
where
Y = score on the dependent variable
a = the Y intercept
b = the slope, i.e. the amount of change produced in Y by a unit change in X
X = score on the independent variable
```
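
To make a and b concrete, here is a short Python sketch. It assumes the usual least-squares criterion for "line of best fit", which the cards do not state explicitly; the function name `least_squares` and the education/income figures are invented for illustration.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Y = a + bX under least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariation of X and Y divided by the variation in X.
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: the line passes through the point (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: years of education (X) and income (Y).
education = [10, 12, 14, 16]
income = [21000, 23000, 25000, 27000]

a, b = least_squares(education, income)
predicted = a + b * 13   # predicted income for 13 years of education
print(a, b, predicted)
```

With these made-up numbers the slope is 1000, so each additional year of education corresponds to a predicted income 1000 higher.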

Pearson's r

measure of association for two interval-ratio variables
0.00–0.10 = weak
0.11–0.30 = moderate
0.31–1.00 = strong

r squared

aka coefficient of determination; provides a PRE interpretation (the proportion of variation in Y explained by X)
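
A brief Python sketch of Pearson's r and r squared, using the standard formula (covariation of X and Y divided by the square root of the product of their variations). The function name `pearson_r` and the scores are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r for two interval-ratio variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores on two variables.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2   # proportion of variation in Y explained by X
print(round(r, 2), round(r_squared, 2))
```

Squaring r is what gives the PRE reading: an r of .40, for example, corresponds to an r squared of .16, or 16% of the variation explained.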

multivariate regression

looks at the part of y that x can explain that z can't explain i.e. the effect of x on y while controlling for z

formula for multivariate regression line

Y = a + (b1)(X1) + (b2)(X2) | each partial slope shows the effect of one independent variable on Y while controlling for the other

the ANOVA test is designed for independent variables measured at the ____ level.

nominal

In the ANOVA test, the sample means should be roughly equal in value...

... if the null hypothesis is true.

The ANOVA test uses means and standard deviations to compare the amount of variation _____ with the amount of variation _____.

within categories, between categories

In the ANOVA test, if the null hypothesis is false, the means of the different samples should be _____ and the standard deviations of the different samples should be _____.

very different in value, low in value

In the ANOVA test, if the null hypothesis is true, then... (a) SSB should be at least twice as much as dfb (b) SSB should be much greater than SSW (c) the mean square between should be roughly equal to or smaller than the mean square within (d) the combined dfb and dfw should be much greater than the SST

(c) the mean square between should be roughly equal to or smaller than the mean square within

ANOVA is a one tailed test and we are concerned only with those outcomes in which there is more variance...

...between categories than within categories.

To conduct a chi square test, the variables must first be organized into a ______.

bivariate table

The subtotals calculated for bivariate tables are also known as ____.

marginals

In the context of chi square, variables are independent if... (a) they are related. (b) cause and effect can be proved. (c) the obtained chi square falls in the critical region. (d) the score of a case on one variable has no effect on the score of the case on the other variable.

(d) the score of a case on one variable has no effect on the score of the case on the other variable.

In a 2x2 table, all cell frequencies are exactly the same. This is consistent with which of the following conditions?

The variables are independent.

When the null hypothesis in the chi square test for independence is true, there should be...

....little difference between the observed frequencies and the expected frequencies.

A Chi square test has been conducted to assess the relationship between marital status and church attendance. The obtained Chi square is 23.45 and the critical Chi square is 9.488. What may be concluded?

Reject the null hypothesis, church attendance and marital status are dependent

In a research study conducted to determine if arrests were related to the socioeconomic class of the offender, the chi square critical score was 9.488 and the chi square test statistic was 12.2. We can conclude that the variables are ____.

dependent.

If variables are arranged in a bivariate table, we can see if they are associated by... (a) adding their scores vertically. (b) subtracting their scores horizontally. (c) computing percentages in the direction of the independent variable. (d) computing percentages in the direction of the dependent variable.

(c) computing percentages in the direction of the independent variable.

In the case of a perfect association, predictions from one variable to another can be made (with/without) error.

without

"As education increases, income rises." This is an example of a ______ relationship.

positive

If a researcher is looking to perform an analysis based upon the relationship between the number of arrests as an adult and number of encounters with police as a juvenile, they would use which measure of association? (a) Somers d. (b) Lambda. (c) Chi-Square. (d) none of these would be appropriate.

(d) none of these would be appropriate.

Proportional reduction in error (PRE) measures of association are based on the logic of ______.

prediction

If there is no association between two variables, knowledge of the independent variable does what to the number of errors of prediction?

Does not change the number of errors of prediction.

A bivariate table shows the association between gender and whether or not a person ever attends formal religious services. Lambda was .34. What may be concluded? (a) Women are more likely to attend church. (b) Men are more likely to attend church. (c) Knowing a person's gender improves our ability to predict whether or not they attend religious services by 34%. (d) Knowing whether a person attends religious services improves our ability to predict their gender by 34%.

(c) Knowing a person's gender improves our ability to predict whether or not they attend religious services by 34%.

A researcher has computed a Somers’s d of −0.75 between marital happiness and number of children. What can be concluded from this result?

that there is a strong, negative relationship between number of children and marital happiness

On a scatterplot, the regression line...

...comes as close as possible to touching every score.

There is no linear relationship between two interval-ratio variables when the regression line on a scatterplot...

is parallel to the horizontal axis.

The Y intercept is the point where...

...the regression line crosses the vertical axis of the scattergram.

If a regression line is parallel to the horizontal axis of the scattergram, the slope (b) will be ___.

0.00

If the regression line showing the effect of education on income has a slope of 1000...

...each additional year of education is associated with an increase in income of 1000

A researcher wants to measure the strength of the association between income (measured in dollars per year) and education (measured in number of years of formal schooling). Which of the following would be the most appropriate measure? (a) the slope, b (b) the Y intercept (c) chi-square (d) Pearson's r

(d) Pearson's r

In a study of the relationship between geographical mobility (number of times a person has changed residences) and number of friends, Pearson’s r is reported as .40. Which of the following would be the most correct interpretation?

Mobility explains 16% of the variation in number of friends.

mean square between groups

the variability among the group means; ANOVA considers many groups simultaneously and evaluates whether their sample means differ more than we would expect from natural variation

Null hypothesis for mean square between groups

If the null hypothesis is true, any variation in the sample means is due to chance and shouldn’t be too large.

what is the test statistic for ANOVA?

the F statistic (F ratio)

Final Flashcards

(70 cards)