chapter 13: inferential statistics Flashcards
inferential statistics
statistical procedures that allow researchers to draw conclusions, or make inferences, about a population based on data from a sample.
statistics
descriptive summary values (ex means, correlation coefficients) computed for one or more variables measured in a sample.
parameters
corresponding values in the population.
sampling error
the random variability in a statistic from sample to sample.
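A minimal sketch (Python/NumPy assumed, hypothetical population values) of sampling error: repeated samples drawn from the same population yield different sample means purely by chance.

```python
import numpy as np

# Hypothetical population: mean 100, standard deviation 15.
rng = np.random.default_rng(0)

# Draw five independent samples of 25 and compare their means.
sample_means = [rng.normal(loc=100, scale=15, size=25).mean() for _ in range(5)]
print([round(m, 1) for m in sample_means])  # the means vary from sample to sample
```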
null hypothesis testing
a formal approach to deciding between two interpretations of a statistical relationship in a sample: the null hypothesis (H0), which holds that there is no relationship in the population and that the sample relationship reflects only sampling error, and the alternative hypothesis.
alternative hypothesis
the hypothesis (H1), proposed as the alternative to the null hypothesis, that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.
reject null hypothesis
a decision made by researchers in null hypothesis testing when the sample relationship would be extremely unlikely if the null hypothesis were true.
retain null hypothesis
a decision made by researchers in null hypothesis testing when the sample relationship would not be extremely unlikely if the null hypothesis were true.
p value
the probability of obtaining the sample result or a more extreme result if the null hypothesis were true.
alpha
the criterion for how low the p value must be before the sample result is considered unlikely enough to reject the null hypothesis (usually .05).
statistically significant
an effect that is unlikely to be due to random chance and therefore likely represents a real effect in the population.
practical significance
refers to the importance or usefulness of the result in some real-world context.
t-test
a statistical test used to examine the difference between two means.
one-sample t-test
used to compare a sample mean (M) with a hypothetical population mean that provides some interesting standard of comparison.
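A hedged sketch (SciPy assumed, made-up scores) of a one-sample t-test: it compares a sample mean against a hypothetical population mean of 100 and reports the t statistic and p value defined above.

```python
import numpy as np
from scipy import stats

# Made-up sample scores; 100 is the hypothetical comparison mean.
scores = np.array([104, 98, 110, 105, 97, 112, 101, 99, 108, 103])

t_stat, p_value = stats.ttest_1samp(scores, popmean=100)
print(f"M = {scores.mean():.2f}, t = {t_stat:.2f}, p = {p_value:.3f}")
# If p < .05 (alpha), reject the null hypothesis that the population mean is 100.
```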
test statistic
a statistic (ex F, t) computed from the sample data that is compared with what would be expected under the null hypothesis in order to find the p value.
critical value
the absolute value that a test statistic (ex F, t) must exceed for the result to be considered statistically significant at the chosen α level (usually α = .05).
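A small sketch (SciPy assumed) of looking up a two-tailed critical value for t: with α = .05 and 9 degrees of freedom, the test statistic must exceed roughly 2.26 in absolute value.

```python
from scipy import stats

alpha = 0.05
df = 9  # ex n - 1 for a one-sample t-test with n = 10

# Two-tailed: put alpha/2 in each tail.
critical_t = stats.t.ppf(1 - alpha / 2, df)
print(f"|t| must exceed {critical_t:.2f} to be significant at alpha = {alpha}")
```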
two-tailed test
where we reject the null hypothesis if the test statistic for the sample is extreme in either direction (positive or negative).
one-tailed test
where we reject the null hypothesis only if the t score for the sample is extreme in one direction that we specify before collecting the data.
dependent-samples t-test
used to compare two means for the same sample tested at two different times or under two different conditions. aka paired-samples t-test.
difference score
a method to reduce pairs of scores (ex pre- and post-test) to a single score by calculating the difference between them.
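A hedged sketch (SciPy, made-up pre/post scores) of the two preceding cards: it computes difference scores and shows that a paired-samples t-test gives the same result as a one-sample t-test of those differences against zero.

```python
import numpy as np
from scipy import stats

# Made-up pre- and post-test scores for the same five participants.
pre = np.array([12, 15, 11, 14, 13])
post = np.array([15, 18, 12, 17, 15])

diff = post - pre  # difference scores
t_paired, p_paired = stats.ttest_rel(post, pre)
t_diff, p_diff = stats.ttest_1samp(diff, popmean=0)

print(f"paired: t = {t_paired:.2f}, p = {p_paired:.3f}")
print(f"one-sample on differences: t = {t_diff:.2f}, p = {p_diff:.3f}")  # identical
```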
independent-samples t-test
used to compare the means of two separate samples (M1 and M2).
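A minimal sketch (SciPy, fabricated scores) of comparing two separate samples, ex a treatment group and a control group.

```python
import numpy as np
from scipy import stats

# Fabricated scores for two separate groups of participants.
treatment = np.array([23, 27, 25, 30, 26, 28])
control = np.array([20, 22, 24, 21, 23, 25])

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"M1 = {treatment.mean():.2f}, M2 = {control.mean():.2f}, "
      f"t = {t_stat:.2f}, p = {p_value:.3f}")
```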
analysis of variance (ANOVA)
a statistical test used when there are more than two groups or condition means to be compared.
one-way ANOVA
used for between-subjects designs with a single independent variable.
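A small sketch (SciPy, invented data) of a one-way ANOVA comparing three group means from a between-subjects design with a single independent variable.

```python
import numpy as np
from scipy import stats

# Invented scores for three independent groups (one independent variable, three levels).
group_a = np.array([4, 5, 6, 5, 7])
group_b = np.array([6, 7, 8, 7, 9])
group_c = np.array([9, 8, 10, 9, 11])

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```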
mean squares between groups
an estimate of the population variance that is based on the differences among the group means.
mean squares within groups
an estimate of the population variance that is based on the differences among the scores within each group.
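A hedged sketch (NumPy, same invented groups as the one-way ANOVA sketch above) showing how the F statistic is the ratio of the two mean squares defined in the two cards above.

```python
import numpy as np

# Same invented groups as in the one-way ANOVA sketch above.
groups = [np.array([4, 5, 6, 5, 7]),
          np.array([6, 7, 8, 7, 9]),
          np.array([9, 8, 10, 9, 11])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k, n_total = len(groups), all_scores.size

# MSB: variance estimate based on differences among the group means.
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# MSW: variance estimate based on differences among scores within each group.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_within = ss_within / (n_total - k)

f_stat = ms_between / ms_within  # matches the F reported by f_oneway above
print(f"MSB = {ms_between:.2f}, MSW = {ms_within:.2f}, F = {f_stat:.2f}")
```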
post hoc comparisons
an unplanned (not hypothesized) test of which pairs of group mean scores are different from which others.
repeated measures ANOVA
compares the means from the same participants tested under different conditions or at different times; the dependent variable is measured multiple times for each participant.
factorial ANOVA
a statistical test used for factorial designs, in which there is more than one independent variable (factor); it tests the main effect of each independent variable and the interactions among them.
type I error
a false positive in which the researcher concludes that their results are statistically significant when in reality there is no real effect in the population and the results are due to chance. in other words, rejecting the null hypothesis when it is true.
type II error
a missed opportunity in which the researcher concludes that their results are not statistically significant when in reality there is a real effect in the population and they just missed detecting it. in other words, retaining the null hypothesis when it is false.
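A simulation sketch (NumPy/SciPy, assumed setup) of the type I error rate: when the null hypothesis is actually true, about 5% of tests still come out significant at α = .05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n_sims = 0.05, 10_000

# Both groups are drawn from the same population, so the null hypothesis is true.
false_positives = 0
for _ in range(n_sims):
    a = rng.normal(loc=50, scale=10, size=20)
    b = rng.normal(loc=50, scale=10, size=20)
    t, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

print(f"type I error rate is about {false_positives / n_sims:.3f}")  # close to alpha
```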
file drawer problem
the problem that studies failing to find a statistically significant result are unlikely to be published; as a consequence, the published literature fails to contain a full representation of the positive and negative findings about a research question.
open science practices
practices in which researchers openly share their research materials and data with other researchers in order to increase the transparency and openness of the scientific enterprise.
p-hacking
when researchers make various decisions during the research process that increase their chance of a statistically significant result (and of a type I error), such as arbitrarily removing outliers, selectively choosing which dependent variables to report, or presenting only significant results, until their results yield a desirable p value.
statistical power
in research design, the probability of rejecting the null hypothesis given the sample size and the expected strength of the relationship; affected by sample size and effect size.
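A hedged simulation sketch (NumPy/SciPy, made-up effect size and sample size) estimating power as the proportion of simulated studies that correctly reject the null hypothesis when a real effect of the assumed size exists.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha, n_sims, n_per_group = 0.05, 5_000, 30
effect = 0.5  # assumed true difference of half a standard deviation

rejections = 0
for _ in range(n_sims):
    control = rng.normal(loc=0, scale=1, size=n_per_group)
    treatment = rng.normal(loc=effect, scale=1, size=n_per_group)
    t, p = stats.ttest_ind(treatment, control)
    if p < alpha:
        rejections += 1

print(f"estimated power is about {rejections / n_sims:.2f}")  # larger n or effect -> more power
```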
confidence intervals
a range of values that is computed in such a way that some percentage of the time (usually 95%) the population parameter will lie within that range.
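A minimal sketch (NumPy/SciPy, made-up scores) of a 95% confidence interval for a mean, computed as the sample mean plus or minus the critical t value times the standard error.

```python
import numpy as np
from scipy import stats

# Made-up sample of scores.
scores = np.array([104, 98, 110, 105, 97, 112, 101, 99, 108, 103])

n = scores.size
mean = scores.mean()
se = scores.std(ddof=1) / np.sqrt(n)   # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)  # two-tailed critical value for 95%

lower, upper = mean - t_crit * se, mean + t_crit * se
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```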
bayesian statistics
an approach in which the researcher specifies the probability that the null hypothesis and any important alternative hypotheses are true before conducting the study, conducts the study, and then updates the probabilities based on the data.
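A toy sketch (plain Python, invented numbers) of the updating step described above: start with prior probabilities for H0 and H1, then revise them with Bayes' rule once the probability of the observed data under each hypothesis is known.

```python
# Invented prior probabilities assigned before the study.
prior_h0, prior_h1 = 0.5, 0.5

# Invented likelihoods: probability of the observed data under each hypothesis.
likelihood_h0, likelihood_h1 = 0.02, 0.10

# Bayes' rule: posterior is proportional to prior times likelihood.
evidence = prior_h0 * likelihood_h0 + prior_h1 * likelihood_h1
posterior_h0 = prior_h0 * likelihood_h0 / evidence
posterior_h1 = prior_h1 * likelihood_h1 / evidence

print(f"P(H0 | data) = {posterior_h0:.2f}, P(H1 | data) = {posterior_h1:.2f}")
```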
replicability crisis
a phrase that refers to the inability of researchers to replicate earlier research findings.
HARKing
hypothesizing after the results are known. a practice where researchers analyze data without an a priori hypothesis, claiming afterward that a statistically significant result had been originally predicted.