Quiz 2 POLI 399 Quantitative Research Methods Flashcards

1
Q

Why is research design important?

A

purpose: to impose controlled restrictions on our observations of the empirical world.
- defines the domain of generalizability and lets us say whether a relationship is causal or not by ruling out alternative explanations.
- lets us draw causal inferences with confidence.

2
Q

Causality *

A
  • we can never know for sure that one variable causes another. confidence in a causal claim increases if we:
    1. demonstrate covariation: the cause and effect are moving together consistently and in a patterned way. the IV and DV move together, and the pattern must match the proposition.
    2. eliminate sources of spuriousness: rule out a common cause that is moving both the IV and the DV. if there is one, the relationship appears to be causal but we can’t make a causal claim.
    3. establish time order: the cause must come first. changes in the IV must be present before changes in the DV.
3
Q

Classical Experimental Design

A
  • There is an experimental group and a control group. The experimental group is given (or has) the IV and the control group does not.
  • Conduct experiments to control for spuriousness and establish time order.
4
Q

Components of Experimental Design

A
  • Demonstrates causality
  • Comparison to determine covariation
  • Manipulation to determine when the IV is introduced to assess time order
  • Control to look for sources of spuriousness.
5
Q

Internal Validity of a Research Design

A
  • enables us to infer with reasonable confidence that the IV has a causal influence on DV
  • has to do with causality
  • increasing internal validity increases our confidence that we have a causal claim
  • threats: can be extrinsic or intrinsic
6
Q

Extrinsic Threats to Internal Validity*

A

extrinsic: arises from the selection of cases. selection bias means that the experimental group differs from the control group before the experimental group is exposed to the IV. this introduces potential spuriousness (something outside the hypothesis is moving both the IV and the DV).

Ensure groups are equivalent through precision matching, frequency distribution matching and randomization.

7
Q

How do we ensure groups are equivalent *

A

extrinsic
Precision matching: match each person in one group with a person in the other group on certain characteristics they both share. impractical, and not randomization.
Frequency distribution matching: the same proportions of characteristics are present in each group. while the individuals may be different, the group averages are the same.
Randomization (best way): randomly assign people to the experimental and control groups. everyone needs an equal chance of being selected, and for this to work you need a large enough sample. you still need to demonstrate frequency distribution matching. can’t bias results if randomization is done properly.
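
A minimal Python sketch of random assignment, assuming cases are identified by simple ID numbers. The function name `randomly_assign`, the seed and the 100 participants are illustrative assumptions, not part of the course material.

```python
import random

def randomly_assign(case_ids, seed=None):
    """Shuffle the cases, then split them into experimental and control groups."""
    rng = random.Random(seed)      # a seed only makes the example repeatable
    shuffled = list(case_ids)
    rng.shuffle(shuffled)          # every ordering of the cases is equally likely
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# 100 hypothetical participants, numbered 1 to 100
experimental, control = randomly_assign(range(1, 101), seed=42)
print(len(experimental), len(control))
```

After assigning, you would still compare the two groups on background characteristics to check that their distributions match.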

8
Q

Intrinsic Threats to Internal Validity*

A

Changes in the cases being studied: people change between pre test and post test
Flaws in measurement: having to do with validity (if indicator isn’t representing the target concept)
Reactive effects of being observed: changes in pre test and post test
-Undermine ability to make a causal claim
1. history: events that occur in the time between the pre-test and post-test can affect DV values separately from the introduction of the IV.
2. maturation: people change (for example, change their minds) within the pre-test to post-test period. these can be psychological or physical processes that have an effect on DV values independent of the IV.
3. mortality: participants die or lose interest and therefore selectively drop out.
4. instrumentation: reliability problem because the measuring instruments are not performing in a consistent way.
5. regression effect: regression to the mean. the participant appears atypical on the pre-test but more typical on the post-test. this is external to IV exposure.
6. reactivity: test effect. pre-test causes the values to change apart from IV.

9
Q
Which is not an intrinsic threat to internal validity
A. selection bias
B. maturation
C. instrumentation
D. all are
E. none are
A

-Selection bias, because it is an extrinsic threat to internal validity. it has to do with case selection.

10
Q

Demonstrating causality requires

A

making comparisons
implementing controls
establishing time order, sequence. IV must come before the DV

11
Q

Countering intrinsic threats to internal validity *

A

if groups are truly equivalent, what happens to one group should affect the other group in the same way. for example, history: both groups are exposed to the same events.
this is all contingent on whether the sampling has been done well or not

12
Q

Threats to external validity

A

has to do with generalizability: the degree of applicability to the real world and whether it is possible to generalize to what people in the real world are actually doing.

  • unrepresentative cases: people in the experiment don’t match the population. need representative cases.
  • artificiality of research setting: less likely to match real life contexts.
  • reactivity: an intrinsic threat that also has to do with external validity. people are essentially reacting to being studied. corresponds to a traditional critique of the scientific method: the Hawthorne Effect.
  • tradeoff between the level of confidence in causality and the level of confidence in generalizability.
13
Q

Variation on Classic Experiment

A
  • post-test only: the experimental group is exposed to the IV and measurement happens only at the post-test.
  • avoid making a causal claim because there is no baseline
14
Q

Quasi-Experimental Design

A
  • Can’t use the word “cause”
  • Using statistics to establish controls and comparisons
  • weaker causal argument.
  • researcher can’t randomly assign observations.
  • ex post facto experiment, meaning that the researcher approximates the post-test only control group design through multivariate statistical methods.
  • sort cases on values of spuriousness
15
Q

Control Variables

A
  • test hypothesis using control variables
  • involves showing that the IV and DV covary in a consistent way, it is not enough to show empirical association, one must look for other variables that may eliminate or change the observed relationship
  • the effects of the control variable are held constant while the IV-DV relationship is being examined. if the relationship is true, it does not matter if the control variable is held constant or made to vary.
16
Q

Types of Control Variables 1.

A

Sources of Spuriousness

  • spurious variables cause both the IV and DV. when the common cause is controlled, the relationship weakens or disappears, so there is no real covariation.
  • relationship is destroyed
  • identify it by logically assessing whether there is something causing both the IV and DV, or whether there is anything directly acting on the IV/DV.
  • partial tables are different from the original crosstabulation: they have weaker differences across columns
  • IVs that are given characteristics, such as age or gender, cannot have sources of spuriousness because nothing can move them
17
Q

The higher a country’s literacy rate, the more democratic it will be. Plausible source of spuriousness.

A

Level of economic development

18
Q

Types of Control Variables 2.

A

Intervening variables

  • the presumed causal mechanism that mediates the relationship between the IV and DV
  • this variable explains why the IV is causing the DV
  • intervening variables have to do with why the causal mechanism appears the way it does
  • can’t distinguish whether a variable is spurious or intervening from statistics alone
19
Q

The lower people’s incomes, the less interest they have in politics. What is the intervening variable?

A

attention to news or political alienation

20
Q

Types of Control Variables 3.

A

Conditional Variables

  • has to do with how generalizable the relationship is.
  • affects either the strength of the relationship between IV and DV or the form of the relationship between the two.
  • IV may have a predictive function in terms of DV for some people but not all
  • examine whether there are some sorts of people for whom the IV will not have the predicted effect on the DV.
    1. specify relationship in terms of interest, knowledge or concern
    2. specify relationship in terms of time or place
    3. specify relationship in terms of social background characteristics
  • partial tables differ from the original table in two different ways. for example, gaps across one partial table may grow, signifying that the relationship got stronger (if seen with a corresponding increase in the chosen measure of association), and gaps across the other partial table may shrink (with a corresponding decrease in the chosen measure of association).
21
Q

The more people favour public healthcare, the less likely they are to vote for a party/candidate on the right. What is a plausible conditional variable?

A

political knowledge

*ideology cannot be a plausible conditional variable

22
Q

The older people are the more likely they are to oppose same-sex marriage. What is a plausible conditional variable?

A

sexual orientation

23
Q

What are research problems?

A
  • are always questions that display how one concept is related to another concept.
  • the goal of a research problem is to maximize generalizability.
24
Q

Why is generality important in the context of research problems?

A
  • the scientific method has generality as one of the goals.
  • the research that one engages in has implications for the sample and the relationship being studied in that specific instance.
  • the reason that people care about a research problem is due to its implications such as policy.
25
Q

Which of the following is the BEST statement of a research problem?

a. socioeconomic status related to political engagement
b. income related to voting
c. education related to interest in politics
d. all
e. none

A

A. socioeconomic status related to political engagement.

26
Q

Research process

A
find something to explain
formulate research problem
develop hypothesis (operationalize)
identify plausible sources of spuriousness, intervening and conditional variables
choose indicators
collect and analyze data
27
Q

Stages in Data Analysis

A

test hypothesis
test for spuriousness
if non-spurious, test for intervening variables, test for conditional

28
Q

The differences between the values of a nominal scale are:

A

qualitative (another way to say categories)

29
Q

Nominal-level measurement enables us to :

A

tally frequencies

30
Q

Descriptive v.s Inferential Statistics*

A

descriptive: characteristics or summaries of a population, a sample or a single variable. the same statistics are used for samples and populations.
- univariate descriptive stats describe one variable, bivariate descriptive stats describe the relationship between two, and multivariate stats involve more than two.
- distribution: how many cases take each value.
- central tendency: the most typical value. central tendency means nothing without dispersion, which is how much the values vary from the most typical value.
- measures of association: how strong the relationship is.

inferential statistics: used to generalize from a sample to the population from which the sample was drawn. use the sample to make inferences about the population.

31
Q

Describing a frequency

A

descriptive statistic
a frequency distribution is the number of observations in each category of the variable, that is, how frequently each value occurs. absolute (raw) frequencies are converted to relative frequencies to make them comparable:
go from N (number of observations) to percent.
rounding: 0.4 and below round down, 0.5 rounds to the nearest even number, and 0.6 and above round up.
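
A small Python sketch of turning raw frequencies into relative frequencies, using the round-half-to-even rule described above. The responses are invented for illustration, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_EVEN

def frequency_table(values):
    """Return {category: (raw frequency, whole-number percent)}."""
    counts = Counter(values)
    n = len(values)
    table = {}
    for category, count in counts.items():
        percent = Decimal(count) * 100 / Decimal(n)
        # ROUND_HALF_EVEN sends an exact .5 to the nearest even number
        rounded = percent.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)
        table[category] = (count, int(rounded))
    return table

# invented responses, for illustration only
answers = ["agree"] * 5 + ["neutral"] * 2 + ["disagree"] * 1
table = frequency_table(answers)
for category, (count, percent) in table.items():
    print(category, count, f"{percent}%")
```

Here 62.5% rounds to 62 and 12.5% rounds to 12, so rounded percentages will not always sum to exactly 100.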

32
Q

The two basic types of statistics are:

A

inferential and descriptive

33
Q

Central tendency

A

descriptive stats
is the value that best represents the entire distribution, as it is the most typical. must be presented together with dispersion, otherwise it is misleading
dispersion tells us how typical the most typical value is. it determines how effective the measure of central tendency is in representing the entire distribution. you need dispersion to have variation, which is necessary for both sample size and covariation

34
Q

Measure of central tendency and dispersion for a nominal level variable

A

measure of central tendency: mode, which is the most frequently occurring value. in terms of raw frequencies this is the category with the most observations. in terms of relative frequencies it is the category with the highest valid percent.
dispersion: variation ratio, the proportion of cases that do not fall into the modal category. this shows how typical the modal value is.
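
As a rough illustration, the mode and variation ratio can be computed as follows in Python. The data are made up, and `mode_and_variation_ratio` is a hypothetical helper name.

```python
from collections import Counter

def mode_and_variation_ratio(values):
    """Return the modal category and the share of cases outside it."""
    counts = Counter(values)
    modal_value, modal_count = counts.most_common(1)[0]
    variation_ratio = 1 - modal_count / len(values)
    return modal_value, variation_ratio

# invented nominal data: 6 cases in A, 3 in B, 1 in C
religions = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
modal_value, variation_ratio = mode_and_variation_ratio(religions)
print(modal_value, round(variation_ratio, 2))
```

A variation ratio near 0 means the mode is very typical; a value near 1 means most cases fall outside the modal category.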

35
Q

Measure of central tendency and dispersion for an ordinal level variable

A

measure of central tendency: median which is the value taken by the middle case. half the cases are above the median and half the cases are below
dispersion: range, which runs from the lowest to the highest value taken by cases. the inter-quartile range is the range of values taken by the middle 50% of cases. the endpoints are a quartile above and a quartile below the median, so 50% of the cases fall between those x categories.
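
A Python sketch of the median, range and inter-quartile range for an ordinal variable, assuming categories are coded 1 to 5. It uses one simple convention (take the value of the case sitting at each cumulative fraction) so that results are always real categories; other quartile conventions exist. The data and the helper name `value_at` are assumptions for illustration.

```python
import math

def value_at(sorted_values, fraction):
    """Value taken by the case sitting at the given cumulative fraction of cases."""
    position = math.ceil(fraction * len(sorted_values))  # 1-based case position
    return sorted_values[max(position, 1) - 1]

# invented ordinal responses coded 1 (low) to 5 (high)
responses = sorted([1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5])

median = value_at(responses, 0.50)
lower_quartile = value_at(responses, 0.25)
upper_quartile = value_at(responses, 0.75)
value_range = (responses[0], responses[-1])

print("median:", median)
print("range:", value_range)
print("inter-quartile range:", (lower_quartile, upper_quartile))
```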

36
Q

Measure of central tendency and dispersion for an interval/ratio variable

A

measure of central tendency: mean or average value. the sum of all the values is divided by the number of cases. the mean is the preferred measure but it is subject to distortion when there are extreme cases. if these are the circumstances it may be better to use the median.
dispersion: mean deviation is an underestimate and variance is in squared units, but standard deviation is the best measure of dispersion for interval/ratio level variables. it is a conservative measure with comprehensible units.
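
A short Python sketch comparing these measures on invented scores. It assumes the population formulas (dividing by N); a course that divides by N - 1 would use `statistics.variance` and `statistics.stdev` instead.

```python
import statistics

# invented interval/ratio scores
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)
mean_deviation = sum(abs(x - mean) for x in scores) / len(scores)
variance = statistics.pvariance(scores)   # in squared units
std_dev = statistics.pstdev(scores)       # back in the original units

print(mean, mean_deviation, variance, std_dev)

# one extreme case pulls the mean far more than the median
with_outlier = scores[:-1] + [90]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)
print(mean_with_outlier, median_with_outlier)
```

In this example the mean deviation (1.5) is smaller than the standard deviation (2), and replacing one score with an extreme value moves the mean to about 15 while the median stays at 4.5.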

37
Q

z-Scores

A

z-scores tell us the exact number of standard deviation units that any particular case lies above or below the mean.
on a normal distribution, about 95% of cases fall within 2 z-scores of the mean.
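
A minimal Python sketch of a z-score, z = (value - mean) / standard deviation, on invented scores. It assumes the population standard deviation and uses the standard library's `NormalDist` to check the share of a normal distribution within 2 z-scores of the mean.

```python
import statistics
from statistics import NormalDist

# invented scores with mean 5 and standard deviation 2
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(scores)
std_dev = statistics.pstdev(scores)

def z_score(value):
    """Number of standard deviation units the value lies from the mean."""
    return (value - mean) / std_dev

z_of_nine = z_score(9)    # above the mean, so positive
z_of_two = z_score(2)     # below the mean, so negative

# share of a standard normal distribution between z = -2 and z = +2
share_within_two = NormalDist().cdf(2) - NormalDist().cdf(-2)
print(z_of_nine, z_of_two, round(share_within_two, 4))
```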

38
Q

How do you choose which measure of central tendency to use?

A

depends on the level of measurement, which determines the mathematical operations that can be performed
nominal- mode
ordinal- median
interval/ratio- mean

39
Q

Which of the following is not an example of a descriptive stat:

a. frequency distribution
b. standard deviation
c. all
d. none

A

a. frequency distribution because a frequency distribution is a table and a descriptive stat

40
Q

Demonstrating covariance

A
  • all depends on the level of measurement
  • degree: which is how strong the IV is in predicting the DV
  • form: which values of the DV are associated with which values of the IV. this is the how piece
  • statistical significance: can the relationship be generalized to the population from which the sample was drawn.
41
Q

Creating crosstabs at the nominal level, ordinal or higher

A

Nominal: can remove categories. interpret the row that is named in your hypothesis.
Ordinal or higher: cannot remove categories but can strategically collapse them. want 3 categories or fewer. interpret the top and bottom rows and leave the middle. look for consistent increases in one direction. a curvilinear pattern means that you cannot support the hypothesis because the hypothesis is a linear conjecture.
Both: IV in the columns, DV in the rows. cell frequency is the number of cases in the cell. use column percentages. look at the gap in percentages across columns and interpret it in percentage points. 1-4 is trivial, 5-8 may be a relationship, 10 or more is a relationship. the larger the percentage difference, the stronger the relationship.

42
Q

In a crosstab, percentages should be calculated based on …

A

the number of cases in each category of the IV

column percentages*
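
A minimal Python sketch of column percentages, assuming each case is recorded as an (IV value, DV value) pair. The education and voting data are invented, and `column_percentages` is a hypothetical helper name.

```python
from collections import Counter

def column_percentages(pairs):
    """Percentage of each IV category (column) that takes each DV value."""
    cell_counts = Counter(pairs)
    column_totals = Counter(iv for iv, _ in pairs)
    return {
        (iv, dv): 100 * count / column_totals[iv]
        for (iv, dv), count in cell_counts.items()
    }

# invented data: 100 cases per IV category
cases = (
    [("low education", "voted")] * 40
    + [("low education", "did not vote")] * 60
    + [("high education", "voted")] * 70
    + [("high education", "did not vote")] * 30
)

percentages = column_percentages(cases)
# gap across columns for the row named in the hypothesis, in percentage points
gap = percentages[("high education", "voted")] - percentages[("low education", "voted")]
print(gap)
```

Percentaging within each IV category is what makes the columns comparable even when they contain different numbers of cases.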

43
Q

Interpreting Crosstabs

A
  1. differences in distribution of DV for different categories of the IV. gaps across columns?
  2. compare percentage differences with the hypothesis
  3. do the differences support the hypothesis
  4. significance, can you generalize to the population the sample was drawn from
44
Q

Statistical significance

A
  • the probability that the relationship in the sample occurred by chance and does not exist in the population.
  • the lower the probability that it occurred by chance, the higher the significance
  • a p value of less than 0.05 means we are 95% confident that the relationship is not due to chance. there is still up to a 5% probability that it is due to chance.
45
Q

Type I v. Type II errors

A

type I: false positive. incorrectly infer from the sample that there is a relationship when there isn’t one. more serious because you want to be really confident that there is a relationship. conservatism in statistics. (has to do with inferential statistics)
type II: false negative. assume there is not a relationship when there is one. (has to do with replicability)

46
Q

Chi-Square* definition

A
  • theoretical probability distribution
  • gives the likelihood of each possible degree of relationship occurring in a sample if there were no relationship in the population from which the sample was drawn.
  • assumes there is no relationship and compares it to what is present in the relationship being studied.
47
Q

Chi-Square steps*

A

Step 1: null hypothesis, states there is no relationship (no gaps across columns)
Step 2: calculate expected cell frequencies. if there is no relationship, the cell % in each row should be the same as the marginal row %
Step 3: compare expected and observed cell frequencies
Step 4: adjust sample size
Step 5: calculate degrees of freedom
Step 6: consult theoretical chi square distribution table.
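
The calculation behind these steps can be sketched in Python. This assumes the usual Pearson formula, the sum over cells of (observed - expected)^2 / expected, with degrees of freedom (rows - 1) x (columns - 1). The table is invented, and 3.841 is the standard critical value for 1 degree of freedom at p = 0.05.

```python
def chi_square(observed):
    """observed[r][c]: rows are DV categories, columns are IV categories."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(observed):
        for c, obs in enumerate(row):
            # frequency we would expect if there were no relationship
            expected = row_totals[r] * column_totals[c] / n
            # dividing by the expected count scales each difference to cell size
            statistic += (obs - expected) ** 2 / expected
    degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, degrees_of_freedom

# invented 2x2 table: rows = voted / did not vote, columns = low / high education
observed = [[40, 70],
            [60, 30]]
statistic, degrees_of_freedom = chi_square(observed)

CRITICAL_VALUE_DF1_P05 = 3.841  # from a chi-square table, df = 1, p = 0.05
significant = statistic > CRITICAL_VALUE_DF1_P05
print(round(statistic, 2), degrees_of_freedom, significant)
```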

48
Q

Expected Cell Frequency

A

(column marginal x row marginal) / total number of cases
Chi-square assumes that no more than 25% of the cells have an expected frequency of less than 5.
- number of cells = row categories x column categories = x, so at most 0.25x cells can fall below 5
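
A Python sketch of the expected cell frequencies and the 25% check, using an invented table that is small enough to fail the check. The helper names are hypothetical.

```python
def expected_frequencies(observed):
    """(row marginal x column marginal) / total number of cases, for every cell."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r_total * c_total / n for c_total in column_totals]
            for r_total in row_totals]

def share_of_small_cells(expected, threshold=5):
    """Proportion of cells whose expected frequency is below the threshold."""
    cells = [value for row in expected for value in row]
    return sum(value < threshold for value in cells) / len(cells)

# invented 2x2 table with only 25 cases
observed = [[12, 3],
            [8, 2]]
expected = expected_frequencies(observed)
small_share = share_of_small_cells(expected)
assumption_met = small_share <= 0.25
print(expected, small_share, assumption_met)
```

Here two of the four cells have expected frequencies below 5, so the assumption is not met and categories would need to be collapsed or more cases collected.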

49
Q

Chi-Squares Assumptions

A
  1. hypothesized relationship in advance
  2. random sample (no one has a 0 for their odds of inclusion).
  3. no more than 25% of cells have an expected frequency of less than 5.
  4. a non-significant chi-square means no relationship, does not mean that our sample** is unrepresentative
50
Q

If p = 0.5 then…

A non-significant chi-square indicates that …

A

it means that our relationship is probably due to chance

probably no relationship in the population

51
Q

A type 1 error occurs when we conclude

A

that there is a relationship when there actually isn’t

this is an error in generalization and has to do with inferential statistics

52
Q

Measures of association

A

descriptive statistics that describe how strong the relationship is between the IV and the DV
dependent on level of measurement
analyze the table, if results are significant then you know you can generalize to the population and move to measures of association which assess how strong the relationship is.

53
Q

Measures of Association for nominal variables

A

1 nominal variable then
Lambda: PRE (proportional reduction in error) measure that reveals how much our predictive ability is improved when the IV is known. 0 means no improvement, 1 means perfect predictive ability. for nominal variables, the best prediction is the mode: if you only know the DV, what is the best guess?
lambda = (total error without IV - total error with IV) / total error without IV
misleading because it is always zero if the modal value is the same for all categories of the IV. it underestimates the strength of the relationship, so use Cramer’s V (not a PRE measure).
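
A Python sketch of lambda in the standard PRE form, (E1 - E2) / E1, where E1 is the number of errors made guessing the overall mode of the DV and E2 is the number of errors made guessing the mode within each IV category. Both tables are invented, and the function assumes E1 is not zero.

```python
def lambda_pre(table):
    """table[r][c]: rows are DV categories, columns are IV categories."""
    n = sum(sum(row) for row in table)
    # errors when always guessing the overall modal DV category
    errors_without_iv = n - max(sum(row) for row in table)
    # errors when guessing the modal DV category inside each IV column
    errors_with_iv = sum(sum(col) - max(col) for col in zip(*table))
    return (errors_without_iv - errors_with_iv) / errors_without_iv

# the mode differs across columns, so knowing the IV improves prediction
mode_differs = [[40, 70],
                [60, 30]]
# same mode in both columns: lambda is 0 despite a 10-point gap across columns
same_mode = [[60, 70],
             [40, 30]]

lambda_differs = lambda_pre(mode_differs)
lambda_same = lambda_pre(same_mode)
print(round(lambda_differs, 3), lambda_same)
```

The second table shows why lambda can mislead: the columns clearly differ, yet lambda reports no improvement because the best guess is the same in every column.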

54
Q

Measures of Association for ordinal/ordinal or nominal dichotomy/ordinal

A

Tau-b for square tables or Tau-c for rectangular tables
Tau is a PRE measure, proportionate reduction in error.
dichotomy nominals are coded 1-0 to trick the computer into assuming a higher level of measurement than is actually present

55
Q

Gamma*

A

PRE measure, proportionate reduction in error
1. pairs of cases are ranked on both variables. gamma is based on the probability of drawing a pair ranked in the same order on both variables (a positive pair) rather than in the opposite order. it can show perfect inversion, where one variable goes up and the other goes down (-1), or perfect agreement, where the rankings are the same (1). begin with the upper right part of the table, then move to the bottom left.
DO NOT USE GAMMA because it ignores every pair of cases that is tied, that is, all pairs that take the same value on the IV, the DV or both. this means that the relationship is overstated, so gamma is not conservative enough to use as a measure of association.
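
To see how many pairs gamma leaves out, here is a Python sketch using the usual formula gamma = (C - D) / (C + D), where C counts pairs ranked in the same order on both variables and D counts pairs ranked in opposite orders. The table is invented, and its rows and columns are assumed to run from low to high.

```python
def concordant_discordant(table):
    """Count same-order (C) and opposite-order (D) pairs in an ordered table."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only rows ranked higher
                for m in range(n_cols):
                    if m > j:                   # higher on both variables
                        concordant += table[i][j] * table[k][m]
                    elif m < j:                 # higher on one, lower on the other
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant

# invented table: rows = DV low/high, columns = IV low/high
table = [[60, 30],
         [40, 70]]
concordant, discordant = concordant_discordant(table)
gamma = (concordant - discordant) / (concordant + discordant)

n = sum(sum(row) for row in table)
total_pairs = n * (n - 1) // 2
pairs_used = concordant + discordant
print(round(gamma, 3), pairs_used, total_pairs)
```

In this example only 5,400 of the 19,900 possible pairs enter the calculation; every tied pair is dropped, which is how gamma ends up overstating the relationship.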

56
Q

Which of the following is an inferential statistic

a. Cramers V
b. Lambda
c. Chi-square
d. all
e. none

A

Chi-square

57
Q

Which of the following is NOT a bivariate stat:

a. chi
b. lambda
c. gamma
d. none
e. all

A

all of them are

58
Q

Strength of Measures of Association Chart

A
under 0.10 trivially weak (start over)
between 0.10-0.14 weak
between 0.15-0.19 moderate
between 0.20 and 0.29 moderately strong
0.3 and above strong
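
The chart can be written as a small Python lookup. Two assumptions are made here: values that fall between the listed ranges (such as 0.145) are assigned by simple thresholds, and negative values are judged by their size, ignoring direction.

```python
def strength_label(value):
    """Translate a measure of association into the chart's verbal label."""
    size = abs(value)   # assumption: the sign gives direction, not strength
    if size < 0.10:
        return "trivially weak"
    if size < 0.15:
        return "weak"
    if size < 0.20:
        return "moderate"
    if size < 0.30:
        return "moderately strong"
    return "strong"

for example in (0.05, 0.12, 0.17, 0.22, 0.35):
    print(example, strength_label(example))
```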
59
Q

How to test controls

A
  1. determine original table and calculate the gaps across columns.
  2. create partial tables looking at the relationship between the IV and DV for each category of the control variable.
  3. compare partial tables to the original table
60
Q

When you apply controls what can happen?

A
  1. stays the same (more or less). gaps across columns stay relatively the same, significant and measure of association remains the same as well. this is replication, means control has no effect.
  2. weakening or disappearance of the relationship across all categories for all partial tables indicates a spurious or intervening variable. gaps across columns shrink or disappear, becomes insignificant or measure of association drops significantly.
  3. weakening for one partial table but strengthening for others. differences that vary in strength or form across partial tables indicate a conditional variable.
    - the point of reference is always the original table with the IV/DV relationship. never compare across partial tables.
    - order of testing: first test for spuriousness (look for things that cause both the IV and DV), then for intervening variables (look for things that mediate the relationship between the IV and DV; the time order argument comes in here, as the IV must come before the DV), then for conditional variables (some sorts of people are likely to take a particular value of the DV regardless of the IV).
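
A Python sketch of testing a control by comparing each partial table with the original table. The data are invented so that the gap disappears once the control is applied; all names are hypothetical.

```python
def percent_yes(records, iv_value):
    """Column percentage: share of one IV category whose DV value is 'yes'."""
    column = [dv for iv, dv in records if iv == iv_value]
    return 100 * column.count("yes") / len(column)

def gap_across_columns(records):
    """Percentage-point gap between the 'high' and 'low' IV columns."""
    return percent_yes(records, "high") - percent_yes(records, "low")

def make_cases(control, iv, n_yes, n_total):
    """Build invented cases as (control value, IV value, DV value) triples."""
    return [(control, iv, "yes")] * n_yes + [(control, iv, "no")] * (n_total - n_yes)

cases = (
    make_cases("rich", "high", 64, 80) + make_cases("rich", "low", 16, 20)
    + make_cases("poor", "high", 4, 20) + make_cases("poor", "low", 16, 80)
)

# original table: ignore the control variable
original_gap = gap_across_columns([(iv, dv) for _, iv, dv in cases])

# partial tables: one IV/DV table per category of the control variable
partial_gaps = {
    category: gap_across_columns([(iv, dv) for c, iv, dv in cases if c == category])
    for category in ("rich", "poor")
}
print(original_gap, partial_gaps)
```

The original 36-point gap drops to 0 in both partial tables. That pattern points to a spurious or intervening variable, and deciding between the two is a matter of logic about time order, not statistics.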