exam 2 morgan's study guide notes Flashcards
what is acquiescence bias
People tend to agree with statements regardless of their content, often because agreeing seems like the answer people want to hear.
what can researchers do to reduce the effects of acquiescence bias
word some questions in the opposite direction (reverse-worded items)
That way, the bias pushes answers in opposite directions and cancels itself out.
Do surveys usually use open-ended or close-ended questions
close-ended
what is biased wording in survey questions
when questions have cultural or political bias
they can mean different things to different groups of people or be inappropriate questions to ask a certain group
what is unclear wording in survey questions
If the wording is unclear, people can misunderstand the question and have varying interpretations of it
what are leading questions in survey questions
questions that are set up in a way that makes you feel pressured to agree
what is this an example of
“How satisfied are you with your pay and working conditions?”
double-barreled question
what is this an example of
“Would you agree with most Americans that the U.S. should not have withdrawn from the Iran nuclear deal?”
leading question
what is a double-barreled question in survey questions
when a question is asking about two different concepts at one time
what is a negatively worded question in survey questions
when questions are phrased in a negative light
the word ‘not’ introduces measurement error because people who read things quickly might skip over the word and miss the whole concept
what are some things people can’t answer?
questions about things too far in the past
automatic habits people might not be aware of
asking them to guess what effect something had on them (self-reporting)
what is this an example of
“It’s not easy to figure out the truth behind political issues.”
negatively worded
what is this an example of
“During the last presidential election, how many hours did you spend per week watching CNN?”
Questions too far in the past
what is this an example of
“How often do you make eye contact in conversations?”
habits people aren’t aware of
what is this an example of
“Did watching the debate make you more supportive of candidate X?”
self-reporting
why do we use multi-item measures
to reduce measurement errors
what are the two different kinds of multiple-item measures
index and scales
how do we use multi-item measures
multiple item measures are averaged together so random errors in each item cancel out
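A minimal simulation sketch of why averaging helps. Everything here is an assumption for illustration (the number of respondents, the number of items, a shared true score, and normally distributed random error); it is not taken from the course materials.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

N_PEOPLE = 5000   # hypothetical number of respondents
N_ITEMS = 4       # hypothetical number of items in the measure
TRUE_SCORE = 5.0  # assume one shared true score, to keep things simple


def noisy_item(true_score):
    """One survey item: the true score plus random error (SD of 1)."""
    return true_score + random.gauss(0, 1)


single_item_errors = []
averaged_errors = []
for _ in range(N_PEOPLE):
    items = [noisy_item(TRUE_SCORE) for _ in range(N_ITEMS)]
    # Error if we relied on the first item alone
    single_item_errors.append(items[0] - TRUE_SCORE)
    # Error if we average all items together
    averaged_errors.append(statistics.mean(items) - TRUE_SCORE)

single_sd = statistics.pstdev(single_item_errors)
averaged_sd = statistics.pstdev(averaged_errors)
print(f"error SD using one item: {single_sd:.2f}")
print(f"error SD using the {N_ITEMS}-item average: {averaged_sd:.2f}")
```

The averaged measure should show a noticeably smaller error spread than any single item, because the random errors partly cancel.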
what are the cons of interviews
Possible confounds of interviewer effects (hard to guarantee you treat everyone the same)
They might have a better rapport with men than women and vice versa.
It is more expensive.
What kinds of questions are exceptions to the RAS model; i.e., what questions do most people actually have stored opinions about?
Questions that are exempt from the RAS model are those about presidential support, attitudes about parties, or broadsides on prominent issues like abortion.
Issues that are stable, meaning that your answer will never waver, like abortion.
what are multiple-item measures in scales
items that correlate because they measure the same underlying idea
what is an example of a multiple-item measure in an index
voting in the morning and voting at night are virtually the same behavior, just at different times
however, if a person says they voted in the morning, you can be sure that they didn’t also vote at night
therefore, the two items cannot correlate
What do polls of the public’s issue opinions measure if people don’t have opinions on those issues stored in memory? Are polls meaningless, then?
Polls are not meaningless; while they may not really measure public opinion, they do measure public response.
what are the three kinds of evidence against the assumption that opinion questions measure stored opinions
opinion instability in panel surveys (people give different answers over time)
question order effects
question framing effects (questions worded in different ways but mean the same thing)
how does a measurement error occur
when people aren’t likely to take their time and give careful answers
Do people really “have” opinions, meaning pre-formed opinions stored in their memory, about most political issues?
No, the reality is that when you are asked an opinion question, you come up with the answer quickly by using already stored information you have on the question.
what are interview pros
Higher response rate & completion rate, and lower measurement error.
what are response rates
how many people you try getting into the survey that actually participate
how can you include experiments in a self-administered survey online
by changing question order or wording manipulation
media stimuli for media effects
what are the self-administered survey pros
Recruitment can be via other methods (e.g. phone)
cheaper
don’t need to enter data if it’s done online
lower social desirability bias effects
what are the self-administered survey cons
has a lower response rate than interview style, especially internet
what is a longitudinal cross-section survey
Surveys that repeat a measurement by asking the same questions again over time.
what are the three types of longitudinal surveys
trend
cohort
panel
why are longitudinal cross-section surveys important
They are important because we want to find changes in variables and see how they compare to other measures. Measuring over time rules out reverse causality, but it doesn’t rule out third-variable influence.
what is an example of a trend survey
tracking polls
what is a trend survey
Studies that look at the same population but use a different random sample from that population every time. The same people are not chosen every time, but everyone who is chosen is from the same population.
what is a cohort survey
Surveys focused on a specific sub-population to see how it changes over time. The respondents are not the same people each time, but they represent the same sub-population.
what is the margin of error for subgroups
The 3% margin of error for the poll overall only applies to results that use the whole sample. Since this result used only 30% of the sample, the margin of error has to be bigger.
what is the margin of error for a candidate
41% approve w/ 3% margin means the real approval rate in the population is 95% likely to be 38%-44%
what is the difference between a panel survey and a panel used for sampling
A panel survey measures the same people over a period of time and counts as a longitudinal survey.
A panel used for sampling is a group of people previously recruited to participate in future surveys. The same people are measured, but for different surveys.
what is an example of cohort surveys
People born between the same set of years and how they change
what is the margin of error for lead
The margin of error for a lead is double.
So, the same sample size that had a 3% margin for a candidate would have about 6% for the candidate’s lead over the opponent.
If candidate A is 5% ahead of B with a 6% margin of error, we’re 95% sure that reality ranges from B ahead by 1% to A ahead by 11%
E.g., a poll of 1000 people w/a 3% margin
Split into race/ethnicity subgroups
Hispanics 15% = 150 people, 8% margin of error
So if 45% of Hispanics approve of the president in the sample, in population, we’re 95% sure it’s between…
37% and 53%
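A rough sketch of where these margins come from. The formula used here (1.96 times the square root of p(1 - p)/n, with p = 0.5 as the worst case) is the standard 95% margin of error for a proportion and is an assumption, since the notes only give the results. Doubling the margin for a lead is the rule of thumb from the notes.

```python
import math

Z_95 = 1.96  # z value for a 95% confidence level


def margin_of_error(n, p=0.5):
    """Approximate 95% margin of error for a proportion from a sample of n."""
    return Z_95 * math.sqrt(p * (1 - p) / n)


full_sample = 1000
subgroup = round(0.15 * full_sample)  # 15% of the sample

moe_full = margin_of_error(full_sample)  # whole-sample margin
moe_sub = margin_of_error(subgroup)      # bigger margin for the subgroup
moe_lead = 2 * moe_full                  # rule of thumb: double for a lead

print(f"whole sample (n={full_sample}): +/- {moe_full:.1%}")
print(f"subgroup (n={subgroup}): +/- {moe_sub:.1%}")
print(f"lead: about +/- {moe_lead:.1%}")

# 45% approval in the subgroup, with the margin rounded to whole points
margin_points = round(moe_sub * 100)
print(f"subgroup range: {45 - margin_points}% to {45 + margin_points}%")
```

The smaller the group, the larger the margin, which is why a subgroup result is less precise than the headline number.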
what are panel surveys
When you re-measure the exact same people. This allows us to measure change at the individual level and identify which individuals are changing.
what is secondary Analysis
An analysis based on data from someone else
Why do election polls need likely voter models?
The population doesn’t exist yet, so you need these models to create an idea of who will vote. Also, if you don’t use likely voter models, you are likely to see a lot of systematic bias.
If a poll report results say “likely voters,” they’re multiplying each vote by its…
likelihood
why would we use secondary analysis
If you have a question and using a survey is the right basic method, then you want to start by looking for existing data/surveys. This is because there are many free high-quality survey data sets online with much better sampling than you can ever afford.
Why are weights used in polling?
Weights adjust how much each respondent counts, so that harder-to-reach demographic groups are represented in proportion to their share of the population.
Which variables are traditionally used for weights, and which variable did pollsters add to correct for the 2020 election’s underestimate of votes for Trump?
Traditionally, weights are for basic demographics like race, age and sex.
To correct for the 2020 election’s underestimate of votes for Trump, pollsters added a measure of partisanship (such as party identification or past vote), since the problem was partisan nonresponse bias.
What do weights in polling try to correct for?
They try to correct for response biases and adjust the sample to better match the population (at least on demographics).
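A minimal sketch of one common way to build a demographic weight: divide a group’s share of the population by its share of the sample. The groups, shares, and approval numbers are hypothetical, and real pollsters use more elaborate procedures than this.

```python
# Hypothetical shares, for illustration only
population_share = {"under_50": 0.60, "50_plus": 0.40}
sample_share = {"under_50": 0.40, "50_plus": 0.60}  # under-50s were harder to reach

# Weight = population share / sample share
weights = {g: population_share[g] / sample_share[g] for g in population_share}

# Hypothetical approval rate within each group
approve = {"under_50": 0.50, "50_plus": 0.30}

# Unweighted estimate: each group counts by its share of the sample
unweighted = sum(sample_share[g] * approve[g] for g in approve)

# Weighted estimate: each group counts by its share of the population
weighted = sum(sample_share[g] * weights[g] * approve[g] for g in approve)

print(f"weights: {weights}")
print(f"unweighted approval: {unweighted:.0%}")
print(f"weighted approval: {weighted:.0%}")
```

Under-represented groups get a weight above 1 and over-represented groups get a weight below 1, which pulls the estimate toward what the full population would say.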
What information is vote likelihood based on?
Often based on past voting history or on self-reported intention to vote
(or both + other variables like education, self-reported interest, attitude extremity, etc.)
100 participants in a poll support A, and each one of them is assigned a 30% chance of voting
how many votes for A
30
Why was the 2020 election more concerning for pollsters than the 2016 election, even though they predicted Biden’s win in 2020 and failed to predict Trump’s win in 2016?
2020 was concerning because the prediction was way outside the margin of error and consistently over-estimated Biden’s support, which means there was systematic bias at play.
70 participants support B, each with a 100% chance of voting
how many votes for B
70
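The two likely-voter cards can be worked through in a short sketch: each respondent’s vote is multiplied by their likelihood of voting, then the results are added up. The data layout is a made-up illustration, not a real pollster’s model.

```python
# Each respondent: (preferred candidate, assigned likelihood of voting)
respondents = [("A", 0.30)] * 100 + [("B", 1.00)] * 70

expected_votes = {}
for candidate, likelihood in respondents:
    # Each person counts as a fraction of a vote equal to their likelihood
    expected_votes[candidate] = expected_votes.get(candidate, 0.0) + likelihood

total = sum(expected_votes.values())
shares = {c: v / total for c, v in expected_votes.items()}

for candidate in sorted(expected_votes):
    print(f"{candidate}: {expected_votes[candidate]:.0f} expected votes "
          f"({shares[candidate]:.0%} of likely voters)")
```

Among all respondents, A is ahead 100 to 70, but once likelihood is applied B is ahead 70 to 30.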
what is inter-coder reliability
do we get the same codes from different people?
validity is so much more important
check inter-coder reliability (how often you agree after 100 cases or so)
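A simple sketch of checking how often two coders agree. Percent agreement is used here because it is easy to follow; it is a simpler check than Krippendorff’s alpha and does not correct for agreement by chance. The codes below are made up.

```python
# Hypothetical codes from two coders for the same 10 cases
coder_1 = ["pos", "neg", "neg", "pos", "neu", "pos", "neg", "neu", "pos", "neg"]
coder_2 = ["pos", "neg", "pos", "pos", "neu", "pos", "neg", "neg", "pos", "neg"]

# Count the cases where both coders assigned the same code
matches = sum(a == b for a, b in zip(coder_1, coder_2))
agreement = matches / len(coder_1)

print(f"agreed on {matches} of {len(coder_1)} cases ({agreement:.0%})")
```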
What does a random sample let you do for content analysis
For what’s in the media, you start with sampling, and a random sample lets you generalize to a population of content. It is an alternative because it has higher reliability and validity.
Why is random sampling for content analysis sometimes an alternative to automated coding of big data?
You can get a full population of 5 million tweets, sample a few thousand of them, and human code them all.
It is of higher quality, and you can still generalize it to five million.
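A minimal sketch of drawing that random sample, assuming the tweets can be referred to by position (0 to 4,999,999). The sample size of 2,000 stands in for “a few thousand”, and the ID scheme is hypothetical.

```python
import random

random.seed(0)  # fixed seed so the same sample can be drawn again

POPULATION_SIZE = 5_000_000  # full population of tweets
SAMPLE_SIZE = 2_000          # assumed size for "a few thousand"

# range() stands in for the list of tweet IDs without holding them in memory
tweet_ids = range(POPULATION_SIZE)

# Simple random sample without replacement: every tweet has an equal chance
sample_ids = random.sample(tweet_ids, SAMPLE_SIZE)

print(f"drew {len(sample_ids)} tweets to hand-code, e.g. {sample_ids[:5]}")
```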
what is Krippendorff’s alpha
a statistic for inter-coder reliability
.7 or above is OK
.8 or above is good
it’s a number we hope is above .7, but it’s better to be above .8
what are coding sheets
where you do the coding and save the results
what is a coding guide
rules for how to code each variable
what is an example of a coding sheet
spreadsheets with cases as rows and variables as columns
what are manifest concept meanings
these are clearly in the text without having to make assumptions
what are latent meanings
concepts are not directly stated in the text
we have to infer them
it is much harder to get reliable measurements
what does Dr. Pingree recommend
manifest concept meanings or latent meanings
manifest concept meanings
what is the purpose of a blind overlapping subset in the 3rd step (final coding)
Reliability relies on this step. If the two researchers are not aware of which cases overlap and their codes are still the same, then you can be confident that the coding is reliable.
what is a case study
understand one event in depth (e.g., the study of decision-making that led to the BP oil spill, based on emails, records, and interviews). Focused on one case/event/situation in a lot of depth.
what is a participant observation
Understand communication by being part of it (e.g., working as a journalist, joining an activist group, etc.). You go and do it actively.
what is an example of a participant observation
interested in understanding how newsrooms work
what is an in-depth interview
open-ended interview of 1 person. It’s much simpler than participant studies.
what is a focus group
open-ended interview of a group of people. Structured conversation with them about a topic that you are interested in.
What kinds of research questions are qualitative studies for?
They aim to maximize depth and understand one person/one organization/one event at a time in as much detail as possible without generalizing beyond it. Qualitative: exploratory questions, tentative answers (uncertain; there are possibilities that need to be followed up with other research)
What kinds of conclusions can you draw from qualitative research?
Tentative conclusions that require further research. Remember, you aren’t trying to generalize your conclusions. You can say there are possibilities and make suggestions.
causal possibilities not causal claims
say X happened before Y, not that X caused Y
How is theory used differently in planning a qualitative study compared to a quantitative Study?
Unlike in quantitative research, you don’t aim for narrow questions with very specific operationalization. Questions are supposed to be more general and leave room for answers you didn’t plan for. Questions can change during the study.
Instead of using theory to make specific predictions, you use it to help you label something that you might not have known how to label.
What is a reflexive journal used for in qualitative research?
One of the two kinds of notes you take during the study. Notes about your feelings.
A diary kept by a researcher doing participant observation (or any other qualitative method)
Your thoughts/feelings about the study, tentative conclusions, and evolving plans for how to proceed.
Effective measurement of variables can help…
minimize the effects of measurement error and improve validity.
The mean, median, and mode are three measures of…
central tendency
what is central tendency
measures of what’s a typical response for a particular variable
what is the mean
the simple average
add up all of the responses and divide by the number of responses
what is the median
the middle value
sort the responses in order and then use the middle value
what is the mode
most common response
whatever number you see the most
how can a mean be misleading
A mean can be misleading because outliers (extreme values at either end) can throw off the average.
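A quick sketch of the three measures on made-up income data (in thousands), with one extreme outlier to show how the mean gets pulled away from the typical response. It uses Python’s built-in statistics module.

```python
import statistics

# Hypothetical incomes in thousands; 1000 is an extreme outlier
incomes = [30, 35, 40, 40, 55, 1000]

mean = statistics.mean(incomes)      # add up and divide by the count
median = statistics.median(incomes)  # middle value after sorting
mode = statistics.mode(incomes)      # most common value

print(f"mean: {mean}, median: {median}, mode: {mode}")
```

Here the mean lands far above what most people in the list earn, while the median and mode stay near the typical value.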
when do you use the mode
when you use a nominal level of measurement
when do you use the median
when you use an ordinal measure
When you use an interval/ratio level of measurement, use…
the mean if there are no extreme outliers
if there are, use the median
what is standard deviation
sums up how far all of the scores are from the mean
how is standard deviation calculated
square the difference of each score from the mean, average the squared differences, then take the square root
ex. a score 5 steps away from the mean would contribute 25
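A step-by-step sketch of the calculation. The notes give the squaring step; averaging the squared differences and taking the square root are the standard remaining steps, filled in here. This version divides by the number of scores (a sample-based SD would divide by one less), and the scores are made up.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

mean = sum(scores) / len(scores)

# Step 1: square the difference of each score from the mean
squared_diffs = [(s - mean) ** 2 for s in scores]

# Step 2: average the squared differences (this is the variance)
variance = sum(squared_diffs) / len(scores)

# Step 3: take the square root to get back to the original units
sd = math.sqrt(variance)

print(f"mean: {mean}, variance: {variance}, SD: {sd}")
```

Squaring is what makes scores far from the mean count more: the score of 9 is 4 steps away and contributes 16, while a score of 4 contributes only 1.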
what does a big or small SD mean
a big SD means scores are more spread out from the mean; scores farther from the mean matter more, but all scores matter
a small SD means scores are more bunched near the mean
Why use SD instead of range to represent how spread out a variable is
You would use standard deviation instead of the range because the range only tells you how far apart the two most extreme outliers are. We want something that uses all the data.
what is the z score formula
z-score=(score-mean)/SD
what is a z score
how far a score is above or below its mean in SD units
if a z score is 0…
the score equals the mean
if a z score is -1…
the score is 1 SD below its mean
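The z-score formula as a small function, using a made-up variable with a mean of 5 and an SD of 2.

```python
def z_score(score, mean, sd):
    """How far a score is above or below its mean, in SD units."""
    return (score - mean) / sd


MEAN = 5  # hypothetical mean
SD = 2    # hypothetical standard deviation

print(z_score(5, MEAN, SD))  # score equals the mean
print(z_score(3, MEAN, SD))  # score is 1 SD below the mean
print(z_score(9, MEAN, SD))  # score is 2 SDs above the mean
```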
what is a type 1 error
Occurs when we conclude that a relationship exists, but in reality, it does not.
what is a type 2 error
Occurs when we conclude there is no relationship, but in reality, there is
what is a p value
the probability of getting a result at least as strong as the one in our sample if the null hypothesis is correct in the population.
The larger the p-value, the more likely it is that…
we are studying a population where no relationship exists and the less likely we are to reject the null
what is a null hypothesis
predicts that there is no relationship
What we conclude if p > .05.
p > .05 means we don’t know if there is a relationship. It doesn’t prove there is or isn’t a relationship (it could also be that the sample size is too small or there is measurement error)
How to choose between chi-square, t-test, and correlation.
Choosing depends on the levels of the variables, meaning are they nominal or interval/ratio
when do you use a chi-square
if both are nominal
to see if group frequencies are related
what test would you use here
Are people more trusting of media in the control group or in the treatment group?
chi-square
what test would you use here
Do more people vote in the treatment group or in the control group?
chi-square
when do you use a correlation test
if both are either interval or ratio
tests to see if being above the mean on one variable is related to being above or below the mean on the other
what test would you use here
How is age in years related to political ideology (on a 1 to 7 scale)?
correlation
when do you use a t-test
if there is one nominal variable and one interval/ratio variable
use the test to see if groups have different means
what test would you use here
Do men or women make more money?
t-test
what is the nominal variable here
Do men or women make more money?
gender
what is the interval or ratio here
Do men or women make more money?
income
what are expected values in a chi-square test
how many would be in each cell if there’s no relationship between variables
what is the chi-square value based on
differences between the observed and expected cells
how do you interpret the results of a chi-square test
(in relation to p value)
the variables are related if it’s p<.05
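A sketch of the chi-square arithmetic on a made-up 2x2 table (group by voted/didn’t vote). Expected counts are how many would be in each cell if there were no relationship, computed as row total times column total divided by the grand total; that formula and the chi-square sum are the standard ones and are not spelled out in the notes.

```python
# Hypothetical observed counts: rows = treatment, control; columns = voted, didn't
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell if the two variables were unrelated
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square: squared observed-minus-expected differences, scaled by expected
chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

print(f"expected counts: {expected}")
print(f"chi-square: {chi_square}")
```

For a 2x2 table, a chi-square above roughly 3.84 corresponds to p < .05, so this made-up table would count as showing a relationship.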
how do you interpret the results of a t-test
if p<.05 the two groups have different means
how is correlation calculated
Convert 2 interval/ratio variables to Z scores
Correlation r = average result of multiplying their Z scores together
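The two steps above can be sketched directly: convert each variable to z-scores, multiply them pair by pair, and average the products. The SD here divides by the number of cases so that the average comes out as r, and the ages and ideology scores are made up.

```python
import math


def z_scores(values):
    """Convert raw values to z-scores (SD computed by dividing by n)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def correlation(x, y):
    """r = average of the products of paired z-scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Hypothetical data: age in years and ideology on a 1 to 7 scale
age = [20, 30, 40, 50, 60]
ideology = [2, 3, 3, 5, 6]

r = correlation(age, ideology)
print(f"r = {r:.2f}")
```

When both z-scores in a pair have the same sign (both above or both below their means), the product is positive, which is why r is positive when high values go with high values.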
how do you interpret the results of a correlation
Make the strength clear with words like “slight tendency,” “strong tendency.”
Make the direction clear: start with a label for people who are higher on one variable, say they have a tendency to be higher/lower on the other
what is r in correlation
correlation strength and direction, -1 to 1
Which bivariate test is ANOVA an extension of?
ANOVA is an extension of the t-test and is used when testing more than two groups.
what is anova
ANOVA (analysis of variance) tests whether means differ across groups. It can handle more than two groups and more than one independent (grouping) variable.
what is one-way anova
a single grouping variable, which can have more than two groups (e.g., political party: Democrat, Republican, and independent)
what is a factorial anova
two or more grouping variables
Lets you see the main effect of a factor on its own, and the interactions show if the effect of one factor depends on another factor.
Lets you answer three questions and separate a factor from the interactions.
when do you use one-way anova
when only looking for differences in means
when do you use factorial anova
when comparing group means when groups are organized using multiple factors
what is the main effect of factorial anova
what each factor does
what is the interaction of factorial anova
whether the effect of one factor depends on the other
what does the interaction correspond to in the plot?
Interaction corresponds to lines with different slopes (lines that are not parallel)
Which bivariate test is regression an extension of?
Regression is an extension of the correlation test.
what is a weakness of anova
anova can only have nominal predictors
How are regression results different from bivariate correlations?
The result for each predictor is how it predicts the outcome, controlling for all the other predictors
what are the two kinds of betas
beta and standardized beta
what is beta
can interpret direction and strength
what is standardized beta
helps us compare the strengths of relationships across different predictors
If betas aren’t standardized, they’ll…
each be paired with a standard error in parentheses
if you see another number in parentheses to the right of or below each beta…
it’s unstandardized
The hint to decide whether betas are standardized is to look for…
numbers in pairs (a beta with its standard error) and whether the number is bigger or smaller than 1 (standardized betas usually fall between -1 and 1).
What kind of nominal variables can’t be used in regression?
nominal variables with more than two levels
what is a dummy variable
A 0/1 variable that stands for one level of a nominal variable. When you have a nominal variable with three or more levels, you can convert it into a set of dummies. You always have to leave one dummy out, and that missing dummy is the comparison group. A significant result for a dummy shows that its group is different from the comparison group.
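A small sketch of dummy coding a three-level nominal variable. The party values and the choice of “dem” as the comparison group are hypothetical; the point is that one level gets no dummy of its own.

```python
# Hypothetical nominal variable with three levels
party = ["dem", "rep", "ind", "rep", "dem"]

REFERENCE = "dem"  # the "missing dummy": this level is the comparison group

levels = sorted(set(party))
dummy_levels = [lvl for lvl in levels if lvl != REFERENCE]

# One 0/1 column per non-reference level
rows = [
    {f"party_{lvl}": int(p == lvl) for lvl in dummy_levels}
    for p in party
]

for p, row in zip(party, rows):
    print(p, row)
```

A respondent in the comparison group has 0 on every dummy, so each dummy’s regression result is read as that group compared with the comparison group.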
how to do interactions in regression
Each variable to be interacted is mean centered
Subtract the variable’s mean from its value
Then multiply the two mean-centered variables together to create an interaction term
Interaction term is added as a predictor to the regression
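The steps above can be sketched with two made-up predictors (age and weekly hours of news). The variable names are hypothetical, and the regression itself is left out; this only builds the interaction term that would be added as a predictor.

```python
def mean_center(values):
    """Subtract the variable's mean from each of its values."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical interval/ratio predictors
age = [20, 30, 40, 50, 60]
news_hours = [1, 2, 3, 4, 5]

age_c = mean_center(age)
news_c = mean_center(news_hours)

# Interaction term: product of the two mean-centered variables
interaction = [a * n for a, n in zip(age_c, news_c)]

print(f"centered age: {age_c}")
print(f"centered news hours: {news_c}")
print(f"interaction term: {interaction}")
```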
what are pie charts used for
intended to show parts of a whole
why are pie charts bad
they do not make important differences easy to see
how do you make pie charts okay
it can be ok with very few slices (2-5)
each very different in size
sorted from biggest to smallest
never use multiple pie charts for comparison
what are bars and column charts for
comparison across items
when do you use horizontal bars
if category labels are long or lots of categories
when do you use vertical columns
if categories represent time
what are end decorations for bar charts and columns
Don’t decorate the ends; no rounded, shadowed, or 3d bars/columns
Simple flat tops make comparison easy, which is the whole point of these charts.
Axis at 0: start at zero to avoid distortion
what is descending order for bar charts and columns
If categories have an inherent order (e.g., months), always put them in that order. Otherwise order them by value to make comparison easier.
what are meaningful colors for bar charts and columns
If you vary colors, do it meaningfully, not decoratively.
what are stacked group charts for
comparing values using the combination of two categorical variables.
how do stacked group charts make it easy to see
Differences in total height of stack and Differences in the bottom category.
what are line charts used for
For showing the change in one variable over many points in a continuous other variable (e.g., change over time)
Straight lines connect where the tops of the columns would be in a column chart.
axis at 0: line charts
Unlike bar & column graphs, don’t always need to start at 0
The purpose of line graph is to show the change
Including zero can mean zooming so far out that changes disappear
don’t smooth to a curve: line charts
Smoothing/curve-fitting algorithms distort the data and misrepresent how many data points the line is based on
avoid multiple axes: line charts
Different scaling of two axes can invite very different conclusions
why are election maps bad
either distort geography or distort politics
why are election cartograms bad
Cartograms distort the map too much and make it hard to find states
why are red and blue or shades of purple election maps bad
Red & blue makes sense for highlighting winner-take-all results like the Electoral college
Purple is better for showing where it’s close (more useful for strategy)
Are people more trusting of media in the control group or in the treatment group?
what kind of test
why
Test: Chi-square test
Reasoning: Chi-square test is used for categorical data comparison, like comparing proportions or frequencies between groups. Here, you’re comparing the proportions of people trusting media in two different groups (control vs. treatment).
Do more people vote in the treatment group or in the control group?
what kind of test
why
Test: Chi-square test
Reasoning: Similar to the first scenario, you’re comparing proportions of people who vote in two different groups, so a chi-square test would be appropriate.
How is age in years related to political ideology (on a 1 to 7 scale)?
what kind of test
why
Test: Correlation
Reasoning: Correlation measures the strength and direction of a relationship between two continuous variables. Here, you’re examining the relationship between age (continuous) and political ideology (also continuous but measured on a scale), so Pearson correlation coefficient or Spearman’s rank correlation coefficient would be appropriate depending on the nature of the data.