Chapters 14-16 Quantitative Data Analysis Flashcards

1
Q

Three Common Arguments and Claims in Quantitative Political Science

A

Descriptive claims (e.g., percentages)
Claims of group differences
Claims of relationships between variables

2
Q

Descriptive data helps us understand ____ variable(s).

A

One

3
Q

Two Types of Basic Descriptive Statistics

A

Measures of central tendency

Measures of dispersion/variation

4
Q

Selection of Univariate Stats/Levels of Measurement

A

Nominal
- central tendency is found through the mode
- dispersion is found through the variation ratio
Ordinal
- central tendency is found through the mode and median
- dispersion is found through the variation ratio and range
Interval
- central tendency is found through the mode, median, and mean
- dispersion is found through the variation ratio, range, and standard deviation
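The card above can be written down as a small lookup table. This is only a sketch: the dictionary layout and the `stats_for` helper are illustrative names, not something taken from the course material.

```python
# Univariate statistics appropriate at each level of measurement,
# transcribed from the card. Names and layout are illustrative assumptions.
UNIVARIATE_STATS = {
    "nominal": {
        "central_tendency": ["mode"],
        "dispersion": ["variation ratio"],
    },
    "ordinal": {
        "central_tendency": ["mode", "median"],
        "dispersion": ["variation ratio", "range"],
    },
    "interval": {
        "central_tendency": ["mode", "median", "mean"],
        "dispersion": ["variation ratio", "range", "standard deviation"],
    },
}


def stats_for(level):
    """Return the statistics listed on the card for a level of measurement."""
    return UNIVARIATE_STATS[level.lower()]


print(stats_for("Ordinal"))
```

Each higher level of measurement keeps the statistics of the level below it and adds one more.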

5
Q

Three Possible Measures of Central Tendency

A

Mode - that which occurs most frequently
Median - the middle value when the cases are arranged in order of increasing magnitude
Mean - the arithmetic average (sum of the values divided by the number of cases)
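A minimal sketch of the three measures using Python's standard `statistics` module. The `scores` list is made-up data for illustration only.

```python
import statistics

# Hypothetical sample of scores, invented for illustration.
scores = [2, 3, 3, 5, 7, 8, 14]

mode_value = statistics.mode(scores)      # value that occurs most frequently
median_value = statistics.median(scores)  # middle value once the cases are sorted
mean_value = statistics.mean(scores)      # sum of values divided by number of cases

print(mode_value, median_value, mean_value)
```

With this sample the three measures disagree (3, 5, and 6), which is typical when a few high values stretch the distribution.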

6
Q

Pros and Cons of the three Measures of Central Tendency

A
Mode cons:
- sensitive to how the categories are constructed (Green and not Green vs. Green, NDP, Liberal, etc.)
- doesn't use all the data
Mode pro:
- can be used with nominal measures
Median con:
- does not use precise values
Median pro:
- stable, not affected by extreme values
Mean pro:
- uses precise values
Mean con:
- skewed by extreme values
7
Q

Statistical Distribution and Measures of Central Tendency

A

If the data are normally distributed (a symmetric bell curve), the mode, median, and mean will all be the same.
If the distribution is skewed, the three measures diverge: extreme values pull the mean toward the tail, while the median and mode are less affected.
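A small sketch of that pull, using two made-up samples that differ only in one extreme value. The data are hypothetical and chosen so the arithmetic is easy to follow.

```python
import statistics

# Hypothetical data: identical except for the last case.
symmetric = [1, 2, 3, 3, 3, 4, 5]
skewed = [1, 2, 3, 3, 3, 4, 40]  # one extreme high value (positive skew)

summary = {}
for label, data in (("symmetric", symmetric), ("skewed", skewed)):
    summary[label] = (
        statistics.mode(data),
        statistics.median(data),
        statistics.mean(data),
    )
    print(label, summary[label])
```

In the symmetric sample all three measures are 3. In the skewed sample the mode and median stay at 3 while the mean is dragged up to 8.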

8
Q

3 Measures of Dispersion, and why we need them

A

Standard Deviation
Variation Ratio
Range
Because central tendency doesn’t give us all the information!

9
Q

Deviation and Standard Deviation

A
  • deviation: how far an individual score is from the mean
  • standard deviation: roughly the average deviation from the mean (the square root of the average squared deviation)
  • affected significantly by outliers (like all means) and sample size
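The calculation can be sketched step by step. The `scores` list is hypothetical, and the example uses the population formula (dividing by n); the sample formula divides by n - 1 instead, which is what `statistics.stdev` does.

```python
import math
import statistics

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)

# Deviation: how far each individual score is from the mean.
deviations = [x - mean for x in scores]

# Population standard deviation: square root of the average squared deviation.
population_sd = math.sqrt(sum(d ** 2 for d in deviations) / len(scores))

# The standard library gives the same population figure, plus the sample version.
library_population_sd = statistics.pstdev(scores)
sample_sd = statistics.stdev(scores)  # divides by n - 1

print(mean, population_sd, sample_sd)
```

The sample version is slightly larger than the population version, and the gap shrinks as the number of cases grows.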
10
Q

Mean is appropriate to use when…

A

The standard deviation is minimal.

11
Q

Variation Ratio

A

The proportion of cases that aren't in the modal category.

High ratio means data are more dispersed.
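A minimal sketch of the calculation, treating the variation ratio as 1 minus the share of cases in the modal category. The function name and the vote counts are hypothetical.

```python
from collections import Counter


def variation_ratio(values):
    """Proportion of cases that fall outside the modal category."""
    counts = Counter(values)
    modal_count = max(counts.values())
    return 1 - modal_count / len(values)


# Hypothetical data: 80 of 100 cases sit in the modal category.
votes = ["Liberal"] * 80 + ["NDP"] * 12 + ["Green"] * 8
ratio = variation_ratio(votes)

print(round(ratio, 2))
```

Because the function only counts category membership, it works for nominal data where the mean and range are unavailable.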

12
Q

Range

A
  • Difference between the highest and lowest score
  • Can't be used for nominal data, since the categories have no order

13
Q

If there is an even number of cases and the two middle values are different, the median becomes…

A

The mean of the two middle numbers.
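A quick check of the odd and even cases with the standard `statistics.median` function. Both lists are made up for illustration.

```python
import statistics

# Hypothetical cases, invented for illustration.
odd_cases = [3, 8, 1, 9, 5]       # sorted: 1, 3, 5, 8, 9
even_cases = [3, 8, 1, 9, 5, 6]   # sorted: 1, 3, 5, 6, 8, 9

odd_median = statistics.median(odd_cases)    # the single middle value
even_median = statistics.median(even_cases)  # mean of the two middle values, 5 and 6

print(odd_median, even_median)
```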

14
Q

Positive and Negative Skew

A
Negative skew (extreme low values)
Positive skew (extreme high values)
Too many extreme values make the mean a poor measure of central tendency.
15
Q

If 80/100 cases are in the modal category, the variation ratio is…

A

0.2

So small.

16
Q

Null Hypothesis

A

Mean of the control group = mean of the treatment group

17
Q

Alternative Hypothesis

A

Mean of the control group isn’t equal to that of the treatment group

18
Q

Type 1 Error

A

False positive (rejecting a null hypothesis that is actually true)

19
Q

Type 2 Error

A

False negative (failing to reject a null hypothesis that is actually false)

20
Q

Inferential Statistics

A

Stats which test the probability that sample statistics are reasonable estimates of population parameters.

21
Q

5 Steps of Hypothesis Testing

A
  1. Formulate the null and alternative hypotheses
  2. Select a confidence level
  3. Calculate the appropriate inferential statistic
  4. Using the table for the test statistic, find the critical value (expected value) at the selected confidence level.
  5. If the calculated statistic equals or exceeds the critical value, reject the null.
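The five steps can be sketched in code. This is an illustration under stated assumptions, not the course's prescribed test: it uses a two-tailed, large-sample z-test for a difference in means, looks up the critical value from the standard normal distribution instead of a printed table, and runs on made-up data. The function name `two_sample_z_test` is hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def two_sample_z_test(control, treatment, confidence_level=0.95):
    # Step 1: null = the group means are equal; alternative = they differ.
    # Step 2: the confidence level is chosen by the caller.
    # Step 3: calculate the test statistic (difference in means / standard error).
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    standard_error = math.sqrt(
        statistics.variance(control) / len(control)
        + statistics.variance(treatment) / len(treatment)
    )
    z = mean_diff / standard_error
    # Step 4: find the critical value at the selected confidence level.
    alpha = 1 - confidence_level
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: reject the null if the statistic equals or exceeds the critical value.
    reject_null = abs(z) >= critical
    return z, critical, reject_null


# Hypothetical data: the treatment group scores 3 points higher on average.
control = [50 + (i % 7) for i in range(40)]
treatment = [53 + (i % 7) for i in range(40)]

z, critical, reject_null = two_sample_z_test(control, treatment)
print(round(z, 2), round(critical, 2), reject_null)
```

Raising `confidence_level` raises the critical value, so the same data must show a larger test statistic before the null is rejected.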
22
Q

Confidence Levels are determined by…

A

Probability.

23
Q

Use Inferential Statistics to Reject the Null Hypothesis and find a relationship

A

24
Q

If you find false positives to be more objectionable than false negatives, you will likely want a ____ confidence level.

A

Higher.

25
Q

Lower confidence levels (I feel like it should be worded "lower confidence criteria") make it ____ to reject the null hypothesis.

A

Easier.

26
Q

Higher confidence levels make it ____ to reject the null hypothesis.

A

Harder.

27
Q

I find the terms high or low confidence levels confusing as fuck because…

A

When they say high confidence levels, they don't mean how confident one is that there is a causal relationship; they mean that there's a higher confidence threshold/criterion to meet before one can assert that there is a causal relationship.

28
Q

Question to Ask of Descriptive Stats

A

Are the sample data representative of the population?

29
Q

Some Questions to Ask of Claims of Differences Between Groups

A

How large are the differences?

Are they due to chance?

30
Q

Some Questions to Ask of Claims of Relationships Between Variables

A

How strong is the relationship?
Is it due to chance?
Is it a causal relationship?

31
Q

Alpha Level

A
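(Tied to the confidence level: alpha = 1 - confidence level, so an alpha of 0.05 corresponds to a 95% confidence level)
• The confidence level is the probability that the sample statistic is an accurate estimate of the population parameter, and that the population parameter lies within an estimated range of values (known as the confidence interval)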
32
Q

Confidence Interval

A

The Estimated Range of Values for the Population Parameter

33
Q

If the sample statistic is 45% and the margin of error is +/- 3%, then the confidence interval would be…

A

42%-48%
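A sketch of where a margin like +/- 3% can come from. The card does not give a sample size or a formula, so both are assumptions here: the code uses the common normal-approximation formula for a proportion, and n = 1056 is a hypothetical sample size chosen because it yields roughly 3 points at 95% confidence.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(p, n, confidence_level=0.95):
    """Normal-approximation interval for a sample proportion p from n cases."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin, margin


# Hypothetical survey: 45% support among 1056 respondents.
low, high, margin = proportion_confidence_interval(0.45, 1056)

print(f"{low:.1%} to {high:.1%} (margin {margin:.1%})")
```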

34
Q

P > 0.10

A

90% of confidence intervals would contain the population parameter; 10% would not

35
Q

A higher confidence level (criteria) means that the sample statistic will reflect the population parameter ____ accurately, but ____ precisely.

A

More accurately (because the confidence interval is wider), but less precisely (because the interval covers more possible values)
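The trade-off can be seen by holding the data fixed and changing only the confidence level. This is a sketch with assumed numbers: the sample statistic of 45 and the standard error of 1.5 are hypothetical, and the interval uses the normal approximation.

```python
from statistics import NormalDist

sample_statistic = 45.0  # hypothetical sample percentage
standard_error = 1.5     # hypothetical standard error, in percentage points

intervals = {}
for level in (0.90, 0.95, 0.99):
    # Two-tailed multiplier for this confidence level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * standard_error
    intervals[level] = (sample_statistic - margin, sample_statistic + margin)

widths = [high - low for low, high in intervals.values()]

for level, (low, high) in intervals.items():
    print(f"{level:.0%}: {low:.1f} to {high:.1f}")
```

Each step up in confidence level widens the interval, so it is more likely to contain the population parameter but pins it down less precisely.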

36
Q

A lower confidence level (criteria) leads to a _____ confidence interval.

A

Narrower / smaller

37
Q

A lower confidence level makes it _____ to reject the null hypothesis, and has a better chance of leading to a type _ error.

A

Makes it easier to reject the null hypothesis.
More likely to lead to a type 1 error.
(The reverse holds for a higher confidence level: harder to reject, and more likely to lead to a type 2 error.)

38
Q

Greater sample sizes will result in ___ sampling errors.

A

Fewer.
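One common way to quantify sampling error is the margin of error, which shrinks with the square root of the sample size. The sketch below assumes a proportion of 0.5 and a 95% confidence multiplier of 1.96; the function name and sample sizes are illustrative.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p from n cases."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical sample sizes, each four times the previous one.
margins = {n: margin_of_error(n) for n in (100, 400, 1600)}

for n, margin in margins.items():
    print(n, f"{margin:.1%}")
```

Quadrupling the sample size only halves the margin of error, which is why very large samples bring diminishing returns.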

39
Q

Three Considerations Prior to Deciding Confidence Levels

A

What's your sample size going to be? The larger the sample, the higher the confidence level should be.
Does the study have tight controls? Then the confidence level should be higher.
Is it exploratory? If so, the confidence level can be lower.

40
Q

Effect Size

A

The size of the difference between the control and treatment groups, regardless of sample size.
Should be considered in addition to p value.
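The card does not name a specific effect-size measure. One widely used option is Cohen's d, the difference in group means divided by the pooled standard deviation; the sketch below uses it as an assumption, with made-up data.

```python
import math
import statistics


def cohens_d(control, treatment):
    """Difference in means expressed in pooled standard deviation units."""
    n1, n2 = len(control), len(treatment)
    pooled_variance = (
        (n1 - 1) * statistics.variance(control)
        + (n2 - 1) * statistics.variance(treatment)
    ) / (n1 + n2 - 2)
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    return mean_diff / math.sqrt(pooled_variance)


# Hypothetical scores, invented for illustration.
control = [4, 5, 6, 5, 4, 6]
treatment = [6, 7, 8, 7, 6, 8]

d = cohens_d(control, treatment)
print(round(d, 2))
```

Unlike a p value, this figure does not automatically grow as more cases are added, which is why it is reported alongside statistical significance.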

41
Q

Confidence levels must be understood in relation to ____

A

Sample size.

42
Q

Substantively Significant

A

The extent to which something actually matters. (Ex.: Does it modify, build upon, or reject your theory?)

43
Q

4 Questions when Asking Whether to Reject Null Hypothesis or Not

A

Sample size
Confidence level
Confidence level appropriateness
5 Criteria for Causality

44
Q

We can use measures of association to…

A

measure the strength of a bivariate relationship.

45
Q

5 Considerations when looking at Bivariate Relationships

A

Is there a relationship?
What is the direction of the relationship? (not applicable to nominal variables)
What is the strength of the relationship?
Is the relationship statistically significant?
Does the relationship continue to exist when other variables are controlled?

46
Q

Independent variable goes on _ axis, while the dependent variable goes on the _ axis.

A
IV = X
DV = Y
47
Q

Perfect Correlation

A

Knowing the value of one variable always lets us know the value of the other.
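For interval-level data, a perfect correlation shows up as a coefficient of exactly +1 or -1. The sketch below computes Pearson's r by hand; the choice of Pearson's r, the function name, and the data are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(variation_x * variation_y)


# Hypothetical data: y is fully determined by x in both cases.
xs = [1, 2, 3, 4, 5]
ys_rising = [3, 5, 7, 9, 11]    # y = 2x + 1
ys_falling = [11, 9, 7, 5, 3]   # y = 13 - 2x

r_positive = pearson_r(xs, ys_rising)
r_negative = pearson_r(xs, ys_falling)

print(r_positive, r_negative)
```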

48
Q

Weak, Moderate, Strong Correlation

A

Knowledge of the IV allows us to better predict the value of the DV.

49
Q

Two Things Measures of Association Do

A
- condense the patterns in a contingency table or scatter plot into a single numerical value
- provide a standardized and compact way to convey relationship information
50
Q

Measures of Association for Nominal v Ordinal and Interval

A

Measures for nominal variables range from 0 to 1. Measures for ordinal and interval variables range from -1 to +1.

51
Q

Interpreting Associations

A
0.00 No relationship
+/- 0.01-0.09 Very weak
+/- 0.10-0.20 Weak
+/- 0.21-0.30 Moderate
+/- 0.31-0.49 Moderately strong
+/- 0.50-0.99 Strong, very strong
+/- 1.00 Perfect relationship
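The table above can be turned into a small lookup function. This is a sketch: the function name is hypothetical, and the handling of values that fall between the listed bands (such as 0.095 or 0.205) is an assumption, since the table only lists two-decimal ranges.

```python
def describe_association(coefficient):
    """Map a measure of association to the verbal labels in the table."""
    size = abs(coefficient)  # the sign gives direction, not strength
    if size > 1:
        raise ValueError("a measure of association cannot exceed 1 in magnitude")
    if size == 0:
        return "No relationship"
    if size < 0.10:
        return "Very weak"
    if size <= 0.20:
        return "Weak"
    if size <= 0.30:
        return "Moderate"
    if size < 0.50:
        return "Moderately strong"
    if size < 1.00:
        return "Strong, very strong"
    return "Perfect relationship"


for value in (0.0, 0.05, -0.15, -0.25, 0.49, -0.7, 1.0):
    print(value, describe_association(value))
```

The sign is dropped before the lookup because a coefficient of -0.25 is just as strong as one of +0.25; only the direction differs.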
52
Q

How to determine:

  1. Is there a relationship?
  2. What is its direction?
  3. What is its strength?
  4. Is it statistically significant?
  5. Does the relationship still exist when other variables are controlled?
A
  1. Contingency table, scatterplot
  2. Contingency table, scatterplot
  3. Measures of association
  4. Inferential statistics
  5. Contingency tables with controls, regression analysis
53
Q

2 Types of Descriptive Statistics

A

Measures of central tendency

Measures of variation

54
Q

Three Possible Outcomes after New Variable is Added to Bivariate

A

The bivariate relationship:
holds constant - this increases confidence in the original relationship
gets stronger - the new variable is a reinforcing variable
disappears - the new variable is a confounding or intervening variable that accounts for the original relationship