stats midterm Flashcards

Question 1

Q

Statistics:

Answer

A

The practice or science of collecting and analyzing numerical data in large quantities to interpret, summarize, and present it in a meaningful way.
A numerical fact or datum: a piece of data that provides information on a particular subject, often used in reference to quantitative research or studies.

Question 2

Q

Data

Answer

A

-Information, especially facts or numbers, collected to be examined and considered and used to help decision-making
-Information in an electronic form that can be stored and used by a computer

Question 3

Q

Data literacy

Answer

A

the combination of skills and mindsets that allows individuals to find insights and meaning within their data to enable effective, data-informed decision-making

Question 4

Q

Data literacy imparts the skills and mindset to find

Answer

A

meaning within data

Question 5

Q

Politics

Answer

A

-The activities of the government, members of law-making organizations, or people who try to influence the way a country is governed
-The relationships within a group or organization that allow particular people to have power over others

Question 6

Q

Political science

Answer

A

uses data to figure out the correct answer to important questions like these

Question 7

Q

Two styles of research

Answer

A

-Qualitative
-Quantitative

Question 8

Q

Qualitative research

Answer

A

based on information that cannot be easily measured, such as people’s feelings, rather than on information that can be shown in numbers

Question 9

Q

Quantitative research

Answer

A

related to information that can be shown in numbers and amounts

Question 10

Q

Topic

Answer

A

a matter dealt with in a text, discourse, or conversation; a
subject

Question 11

Q

Theory

Answer

A

a plausible general principle or body of principles offered to
explain phenomena

Question 12

Q

A causal theory differs from a theory in that it

Answer

A

explicitly states the
relationship between two variables

Question 13

Q

Variable

Answer

A

a characteristic, number, or quantity that can be measured
or counted and can take on different values

Question 14

Q

Invariance

Answer

A

The property of remaining unchanged regardless of changes in the conditions of measurement

Question 15

Q

A hypothesis is even more - than a causal theory

Question 16

Q

A hypothesis - the variables

Answer

A

operationalizes

Question 17

Q

Operationalization

Answer

A

precisely defining the variables and how they are measured

Question 18

Q

Pre-registration

Answer

A

makes your hypothesis and plan for
hypothesis testing public

Question 19

Q

Once you have - your research plan, you can test your hypotheses

Answer

A

pre-registered

Question 20

Q

Hypothesis testing

Answer

A

the use of statistics on data to test a hypothesis

Question 21

Q

Methodology

Answer

A

the use of statistics

Question 22

Q

Empirical analysis

Answer

A

the use of statistics on observational
data – not experimental data

Question 23

Q

Empirical testing

Answer

A

the use of statistics on observational data to test a hypothesis

Question 24

Q

Hypothesis testing uses statistics to test:

Answer

A

whether an association exists between the two variables,
the strength of any association between the two variables, and
the probability that the association between the two variables is
due to random chance

Question 25

Q

Normative arguments include words like

Answer

A

“should” or “ought to.”

Question 26

Q

parsimonious

Question 27

Q

Time dimension

Answer

A

points in time at which your data changes

Question 28

Q

Time-series data

Answer

A

a sequence of data points collected or recorded at successive points
in time, typically at equally spaced intervals, that represents how a particular variable or set of variables changes over time

Question 29

Q

Hierarchical dimension

Answer

A

the level at which your data changes

Question 30

Q

Multi-level data

Answer

A

data that is structured in multiple nested levels, where observations are grouped within higher-level units

Question 31

Q

Spatial dimension

Answer

A

geographic locations in which your data changes

Question 32

Q

Cross-sectional data

Answer

A

data collected at a single point in time from multiple units, such as states or countries, to analyze variations across those units

Question 33

Q

Moderator (Z)

Answer

A

a variable that influences the strength or direction of the
relationship between an independent and a dependent variable in a study.

Question 34

Q

Mediator (Z)

Answer

A

a variable that explains the process or mechanism through which an independent variable affects a dependent variable, acting as an intermediary in the relationship

Question 35

Q

Formal theory

Answer

A

a framework that uses mathematical models and logical structures to rigorously analyze and predict the behavior of complex systems or phenomena

Question 36

Q

Rational choice theory

Answer

A

individuals make decisions by systematically evaluating the costs and benefits to maximize their personal utility or advantage

Question 37

Q

Utility

Answer

A

the sum of all benefits of an action minus the sum of all costs from that
action
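
The definition on this card can be written as a one-line calculation. The sketch below is only an illustration: the function name `utility` and the benefit and cost numbers are made up, not taken from the course material.

```python
def utility(benefits, costs):
    """Utility as defined on this card: total benefits minus total costs."""
    return sum(benefits) - sum(costs)


# Hypothetical numbers: two benefits and one cost attached to a single action.
action_utility = utility(benefits=[5, 2], costs=[3])
print(action_utility)
```

A positive result means the benefits of the action outweigh its costs under these assumed numbers.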

Question 38

Q

Utility maximizer

Answer

A

an individual who seeks to make choices that yield the highest possible level of benefit based on their preferences and available options

Question 39

Q

Expected utility

Answer

A

the overall anticipated satisfaction or benefit (utility) derived from a particular choice or outcome
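
One common way to make this concrete, assuming the standard textbook formulation, is to weight the utility of each possible outcome by its probability and add the results. The function name `expected_utility` and the numbers below are hypothetical.

```python
def expected_utility(outcomes):
    """Probability-weighted sum of utilities.

    outcomes: list of (probability, utility) pairs; probabilities must sum to 1.
    """
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


# Hypothetical choice: a 25% chance of a payoff worth 8 and a 75% chance of nothing.
eu = expected_utility([(0.25, 8.0), (0.75, 0.0)])
print(eu)
```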

Question 40

Q

Game theory

Answer

A

a branch of formal modeling that focuses on analyzing strategic interactions between rational decision-makers, where the outcome for each participant depends not only on their own choices but also on the choices of others

Question 41

Q

The prisoner’s dilemma

Answer

A

a classic game theory scenario where two individuals, who cannot communicate, face a choice between cooperating with each other or betraying one another
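
A small payoff table shows why the scenario is a dilemma. The payoffs below (negative numbers standing for years in prison) are illustrative assumptions, as are the names `PAYOFFS` and `best_response`; any payoffs with the same ordering would behave the same way.

```python
# (my move, other's move) -> (my payoff, other's payoff); hypothetical values.
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "betray"): (-3, 0),
    ("betray", "cooperate"): (0, -3),
    ("betray", "betray"): (-2, -2),
}
MOVES = ("cooperate", "betray")


def best_response(other_move):
    """My payoff-maximizing move against a fixed move by the other player."""
    return max(MOVES, key=lambda mine: PAYOFFS[(mine, other_move)][0])


# Betraying is the best response to either move, so both players betray,
# even though mutual cooperation would leave both of them better off.
responses = {other: best_response(other) for other in MOVES}
print(responses)
```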

Question 42

Q

Social choice theory

Answer

A

a domain within formal modeling that examines how individual
preferences can be aggregated to make collective decisions

Question 43

Q

Intransitive Preferences

Answer

A

a preference structure that violates the transitivity condition. For example, an individual might prefer option A over option B, option B over
option C, but still prefer option C over option A (A > B, B > C, but C > A).
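
The cycle in this example can be checked mechanically. The sketch below assumes strict preferences stored as ordered pairs; the function name `is_transitive` is hypothetical.

```python
from itertools import permutations


def is_transitive(prefers):
    """prefers: set of (x, y) pairs meaning 'x is strictly preferred to y'."""
    options = {option for pair in prefers for option in pair}
    for a, b, c in permutations(options, 3):
        # Transitivity requires: a > b and b > c together imply a > c.
        if (a, b) in prefers and (b, c) in prefers and (a, c) not in prefers:
            return False
    return True


cyclic = {("A", "B"), ("B", "C"), ("C", "A")}   # the example on this card
ordered = {("A", "B"), ("B", "C"), ("A", "C")}  # a transitive ranking
print(is_transitive(cyclic), is_transitive(ordered))
```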

Question 44

Q

Spatial models

Answer

A

a specialized form of formal modeling that incorporates spatial or geographic dimensions into the analysis of strategic interactions.

Question 45

Q

Spatial models of voting

Answer

A

a formal modeling approach used to analyze how voters’ preferences
and spatial positioning influence electoral outcomes

Question 46

Q

Preference mapping

Answer

A

voters and candidates are positioned on a spatial map (often a one-dimensional or two-dimensional continuum) based on their ideological or policy preferences

Question 47

Q

Vote maximization

Answer

A

candidates choose positions or policies to maximize their votes, typically moving towards the median voter or the center of voter preferences to appeal to the largest
segment of the electorate

Question 48

Q

Equilibrium analysis

Answer

A

The model identifies equilibrium points, where candidates’ positions stabilize because any deviation would result in fewer votes. The most common equilibrium is the median
voter theorem, where candidates converge to the preferences of the median voter
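
The median voter result can be illustrated with a toy election. The sketch below assumes a one-dimensional policy scale, an odd number of voters, and voters who each pick the candidate closest to their own position; the voter positions and the name `votes_for` are made up.

```python
from statistics import median


def votes_for(a, b, voters):
    """Count voters strictly closer to position a than to position b."""
    return sum(1 for v in voters if abs(v - a) < abs(v - b))


voters = [1, 2, 4, 7, 9]   # hypothetical ideal points on a left-right scale
m = median(voters)          # the median voter's position
rival = 6                   # any other position a competing candidate might take

median_votes = votes_for(m, rival, voters)
rival_votes = votes_for(rival, m, voters)
print(m, median_votes, rival_votes)
```

Under these assumptions the candidate at the median wins a majority, which is why moving away from the median costs votes.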

Question 49

Q

Causal relationship

Answer

A

a connection between two variables where one variable directly influences or determines the outcome of the other

Question 50

Q

Confounder

Answer

A

a variable that influences both the independent and dependent variables, potentially leading to a misleading or spurious association between them.

Question 51

Q

Spurious relationship

Answer

A

a false or misleading association between two variables that is actually caused by a third, confounding variable, rather than a direct causal link between the two

Question 52

Q

Control variable

Answer

A

a variable or condition that is held constant or regulated in an
experiment or study to isolate the effect of the independent variable on the dependent variable, ensuring that the results are not influenced by extraneous factors

Question 53

Q

Deterministic relationship

Answer

A

a connection between two variables where one variable’s value is precisely determined by the value of the other, with no randomness or uncertainty involved

Question 54

Q

Probabilistic relationship

Answer

A

a connection between two variables where changes in one variable are associated with changes in the likelihood or probability of different
outcomes in the other variable, but the relationship is not perfectly predictable

Question 55

Q

Observational data

Answer

A

information collected from real-world observations or measurements without conducting experiments

Question 56

Q

Experimental data

Answer

A

information collected from experiments where variables are
systematically manipulated to observe their effects on other variables, allowing for causal inferences

Question 57

Q

Randomized controlled trials (RCTs)

Answer

A

experimental studies where participants are randomly assigned to either a treatment group or a control group to evaluate the effectiveness of an intervention while minimizing biases

Question 58

Q

Treatment group

Answer

A

a group of participants in a study that receives the treatment or intervention being tested, allowing researchers to assess its effects compared to a control group

Question 59

Q

Random assignment

Answer

A

the process of randomly allocating
participants to control and treatment groups in a study to ensure that each group is comparable and to eliminate selection bias
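
A minimal sketch of random assignment, assuming a simple even split into two groups. The function name `randomly_assign` and the participant IDs are hypothetical; the seed is only there to make the example repeatable.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Ten hypothetical participant IDs, 0 through 9.
treatment, control = randomly_assign(range(10), seed=42)
print(treatment, control)
```

Because chance alone decides who lands in which group, neither the researcher nor the participants can select into the treatment.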

Question 60

Q

Selection bias

Answer

A

when the sample of participants in a study is not representative of the population being studied, leading to distorted or unrepresentative results

Question 61

Q

Randomized controlled trials are considered the gold standard for
causal research because they can cross the

Answer

A

four causal hurdles.

Question 62

Q

Experiments can exhibit low levels of

Answer

A

external validity

Question 63

Q

External validity

Answer

A

the degree to which one can be confident that the results of an analysis apply to the broader population

Question 64

Q

Natural experiments

Answer

A

experiments that leverage naturally occurring random variations or events to investigate causal effects, without direct manipulation of the independent variable by the researcher

Answer 63

A

internal validity

Answer 64

A

studies that compare the effects of an intervention or treatment between pre-selected groups that are not randomly assigned, aiming to assess causal relationships while
controlling for confounding variables

Answer 65

A

research designs that aim to evaluate
interventions or treatments without full randomization, often using
pre-existing groups or natural conditions to infer causal relationships

Answer 66

A

research designs in which the
researcher does not have control over values of the independent
variable because the independent variable occurs naturally

Answer 67

A

a specific question or statement in a survey designed to gather
data on a particular aspect of a respondent’s attitudes, opinions, or behaviors

Answer 68

A

items that allow respondents to provide their answers in
their own words

Answer 69

A

item that asks respondents to rank a list of choices according to their preferences or importance

Answer 70

A

response options that allow respondents to rate their level of agreement or disagreement with a series of statements on an interval scale, typically ranging from “strongly disagree” to “strongly agree”

Answer 71

A

a type of response with only two choices

Answer 72

A

multiple questions or items that measure a single underlying construct

Answer 73

A

the process of assessing whether a multi-item scale accurately and reliably captures the construct it is intended to measure, ensuring
that it reflects the intended attributes and performs consistently across different contexts and populations

Answer 74

A

data collected about respondents’ characteristics, such as age, gender, education level, income, and ethnicity

Answer 75

A

the entire group of individuals or units from which a sample is drawn
and to whom the survey findings are intended to generalize

Answer 76

A

a subset of individuals or units selected from a larger population
for the purpose of conducting a survey or study to draw conclusions about the entire population

Answer 77

A

to select and examine a subset of a population or data set to draw conclusions or make inferences about the larger population

Answer 78

A

the number of individual units or observations selected from a
population for a study, used to ensure the results are statistically reliable and representative of the larger group

Answer 79

A

the probability that a statistical test will correctly reject a false null hypothesis, thereby detecting an effect or relationship if one truly exists

Answer 80

A

a subset of a population that accurately reflects the characteristics and diversity of the larger group, allowing the results to be generalized to the entire population

Answer 81

A

when each member of the population has a known, non-zero chance of being selected for the sample, allowing for statistical inference and generalization to the population

Answer 82

A

when members of the sample are not selected at random, making it difficult to determine the likelihood of any member being chosen and limiting the ability to generalize the findings

Answer 83

A

a type of non-probability sample where participants are selected based on their easy availability and proximity to the researcher, rather than through random sampling, which can lead to biases and limited generalizability

Answer 84

A

a method of inquiry that focuses on collecting and analyzing numerical data to identify patterns, test hypotheses, and make generalizations about a population

Answer 85

A

forming a precise definition for and clear understanding of the concepts being studied

Answer 86

A

a broad, abstract idea or general notion that provides a
foundational understanding

Answer 87

A

a specific, measurable version of a concept used in research
to operationalize and test theoretical ideas

Answer 88

A

the extent to which a measurement tool appears to measure what it is supposed to measure, based on casual inspection

Answer 89

A

the extent to which a variable or measurement is related to other measures that theory suggests should be related

Answer 90

A

the extent to which a variable or measurement accurately represents all of the elements that define the concept it is intended to measure

Answer 91

A

the consistency and stability of a measurement tool across
repeated applications

Answer 92

A

when only the entities that have “survived” a particular process are considered, leading to a skewed understanding or conclusion.

Answer 93

A

a method of inquiry that focuses on understanding and interpreting the meanings, experiences, and perspectives of individuals or groups through non-numerical data, such as interviews, observations, and texts

Answer 94

A

represent categories or groups and do not have a numeric value

Answer 95

A

categorical variables with no inherent order or ranking among the categories.

Answer 96

A

categorical variables that have a meaningful order or ranking, but the intervals between the categories are not necessarily equal.

Answer 97

A

represent quantities and can be measured on a numeric scale

Answer 98

A

can take any value within a range and can be subdivided into finer increments with equal unit distances

Answer 99

A

can only take specific, distinct values, often counts or integers

Answer 100

A

a class of statistics used to describe the variation of continuous variables based on their ranking from lowest to highest values

Answer 101

A

a statistical term that divides a dataset into four equal parts, with
each quartile containing 25% of the data
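
As an illustration, Python's standard library can compute the three cut points that split a dataset into quartiles. The data values are made up, and note that different textbooks and software use slightly different interpolation rules, so the cut points may differ a little from a hand calculation.

```python
from statistics import median, quantiles

data = [3, 7, 8, 5, 12, 14, 21]  # hypothetical observations

# n=4 asks for the three cut points that divide the sorted data into four parts.
q1, q2, q3 = quantiles(data, n=4)

# The interquartile range is the width of the box in a box plot.
iqr = q3 - q1
print(q1, q2, q3, iqr)
```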

Answer 102

A

a graphical representation of data
that displays the median, quartiles, and potential
outliers, using a box to show the interquartile range
and “whiskers” to indicate the range of the data

Answer 103

A

numerical measures derived from the data values themselves and their positions relative to the mean or origin

Answer 104

A

if you subtract the mean of a dataset
from each data point, the sum of these deviations will always be zero
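
This property is easy to verify on a small made-up dataset. With non-integer data the sum may come out as a tiny rounding error instead of exactly zero.

```python
from statistics import mean

data = [2, 4, 4, 5, 10]             # hypothetical values
m = mean(data)                       # the mean of the dataset
deviations = [x - m for x in data]   # each value minus the mean
total = sum(deviations)              # negative and positive deviations cancel
print(deviations, total)
```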

Answer 105

A

expected value because it is the
value you would most expect the variable to take.

Answer 106

A

a measure of the dispersion of a variable around its mean

Answer 107

A

another measure of the dispersion of a variable around
its mean.

Answer 108

A

a visual depiction of the distribution of a single variable based on a smoothed calculation of the density of cases across the range of values

Answer 109

A

a measure that indicates the symmetry of the variable’s distribution around the mean

Answer 110

A

a measure that indicates the steepness of the distribution of a variable

Answer 111

A

lots of nonrespondents.

Answer 112

A

a sample such that each member of the underlying population does NOT necessarily have an equal probability of being selected.

Answer 113

A

the process of using what we
know about a sample to make probabilistic statements about the broader population.

Answer 114

A

parameters are numerical values that
describe certain characteristics or features of a sample or an entire population, such as the mean, variance, or proportion.

Answer 115

A

a fundamental result from
statistics indicating that if one were to collect an infinite number of random samples and plot the resulting sample means, those sample means would be distributed normally around the true population mean
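
A small simulation can approximate this idea with a finite number of samples. The sketch below assumes a deliberately skewed population (an exponential distribution with mean 1); the sample size of 50, the 2,000 repetitions, and the seed are arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def sample_mean(n):
    """Mean of n draws from a skewed population (exponential with mean 1)."""
    return mean([rng.expovariate(1.0) for _ in range(n)])


sample_means = [sample_mean(50) for _ in range(2000)]

center = mean(sample_means)    # should sit near the population mean of 1
spread = stdev(sample_means)   # should sit near 1 / sqrt(50)
print(center, spread)
```

Even though the population itself is far from bell-shaped, the sample means cluster symmetrically around the population mean.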

Answer 116

A

a mathematical function that describes the probabilities of different outcomes in a random variable or set of data

Answer 117

A

the underlying mechanism or
model that describes how data is produced and collected

Answer 118

A

an outcome whose occurrence is not influenced by the outcome of another event.

Answer 119

A

a bell-shaped statistical distribution that can be entirely characterized by its mean and standard deviation.

Answer 120

A

One standard deviation in each direction captures 68.3% of the area under the curve.
Two standard deviations in each direction capture 95.4% of the area under the curve.
Three standard deviations in each direction capture 99.7% of the area under the curve.
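
These areas can be reproduced, to within rounding, from the standard normal cumulative distribution function. The helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```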

Answer 121

A

the standard deviation of the sampling distribution of sample means.
-It is the measure of the variability or dispersion of sample means around the population mean
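
In practice this quantity is usually estimated from a single sample as the sample standard deviation divided by the square root of the sample size. The sketch below uses that standard estimate; the data and the function name `standard_error_of_mean` are made up.

```python
import math
from statistics import stdev


def standard_error_of_mean(sample):
    """Estimated standard error: sample standard deviation / sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
se = standard_error_of_mean(sample)
print(se)
```

Larger samples shrink this value, which is why estimates from bigger samples are more precise.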

Answer 122

A

a probabilistic statement about the likely value of a population characteristic based on the observations in a sample.

Answer 123

A

a testable statement predicting a relationship or effect between variables, often framed as an expectation of what will happen

Answer 124

A

a specific type of hypothesis that assumes no effect or no difference between variables and serves as a baseline to test against

Answer 125

A

an alternative scenario or condition that contrasts with the proposed effect or relationship in the hypothesis, effectively serving as the null hypothesis which assumes no effect or difference

Answer 126

A

a predetermined threshold derived from a particular statistical distribution used to conduct a statistical test

Answer 127

A

the probability of rejecting the null hypothesis when it is actually true, representing the threshold for statistical significance.

Answer 128

A

a value calculated by:
* identifying the sample statistic (e.g., the mean),
* determining its standard error (e.g. standard error of the mean), and
* using a specific formula to assess how far the sample result deviates from the null hypothesis
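
The three steps above can be sketched for one common case, a one-sample test of a mean, where the formula is (sample mean minus the value under the null hypothesis) divided by the standard error. The data, the null value of 4, and the function name `one_sample_t` are assumptions for illustration; other tests use other formulas.

```python
import math
from statistics import mean, stdev


def one_sample_t(sample, null_mean):
    """Distance between the sample mean and the null value, in standard errors."""
    sample_statistic = mean(sample)                           # step 1
    standard_error = stdev(sample) / math.sqrt(len(sample))   # step 2
    return (sample_statistic - null_mean) / standard_error    # step 3


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
t = one_sample_t(sample, null_mean=4)
print(t)
```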

Answer 129

A

the probability of obtaining a test statistic as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true
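
A minimal sketch of this definition, assuming the test statistic follows a standard normal distribution when the null hypothesis is true and that the test is two-sided. The function name `two_sided_p_value` is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a statistic at least as extreme as z, in either direction,
    when the null hypothesis is true and the statistic is standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


p = two_sided_p_value(1.96)
print(p)
```

A test statistic of about 1.96 sits right at the conventional 0.05 threshold under these assumptions.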

Answer 130

A

p = 0.05.

Answer 131

A

an indication that an observed effect or relationship in the data is unlikely to have occurred by random chance alone. (Assuming the null hypothesis is true and the study is repeated an infinite number of times by drawing random samples from the same population, less than 5% of these results will be more extreme than the current result.)

Answer 132

A

the alternative hypothesis is proven to be true. It just means you can reject the null hypothesis

Answer 133

A

a statistical test that
evaluates whether observed categorical data align with the expected frequencies based on a specific hypothesis

Answer 134

A

a matrix that displays the frequency distribution of two categorical variables, showing how their values intersect

Answer 135

A

the number of independent values or quantities that can vary in a statistical calculation, typically indicating the number of values that are free to vary after certain constraints are applied
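
For a table of two categorical variables, the degrees of freedom are (rows - 1) x (columns - 1): once the row and column totals are fixed, only that many cells are free to vary. The sketch below computes a chi-square statistic and its degrees of freedom for a made-up 2 x 2 table; the function name `chi_square` and the counts are hypothetical.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected

    degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, degrees_of_freedom


observed = [[30, 10], [20, 40]]  # hypothetical counts
statistic, df = chi_square(observed)
print(statistic, df)
```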

Answer 136

A

degrees of freedom