Chapter 13 - Inferential Statistics Flashcards

1
Q

Parameters

A

The values in the population that correspond to sample statistics (e.g., the population mean that corresponds to a sample mean)

2
Q

Sampling error

A

The random variability in a statistic from sample to sample

(Note that the term error here refers to random variability and does not imply that anyone has made a mistake. No one “commits a sampling error.”)
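
A quick way to see sampling error in action is to draw repeated random samples from the same population and watch the sample means vary. This is only an illustrative sketch; the population values and sample size below are made up.

```python
import numpy as np

# Hypothetical population: mean 100, SD 15 (values chosen only for illustration)
rng = np.random.default_rng(0)

# Draw five random samples of n = 25 and compute each sample mean
sample_means = [rng.normal(loc=100, scale=15, size=25).mean() for _ in range(5)]

# The means differ from sample to sample -- that random variability is sampling error
print(sample_means)
```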

3
Q

The two possible interpretations of a statistical relationship in a sample

A

In fact, any statistical relationship in a sample can be interpreted in two ways:

There is a relationship in the population, and the relationship in the sample reflects this.

There is no relationship in the population, and the relationship in the sample reflects only sampling error.

The purpose of null hypothesis testing is simply to help researchers decide between these two interpretations.

4
Q

NHST (null hypothesis significance testing)

A

A formal approach to deciding between two interpretations of a statistical relationship in a sample.

5
Q

Null Hypothesis

A

The idea that there is no relationship in the population and that the relationship in the sample reflects only sampling error (often symbolized H0 and read as “H-zero”).

6
Q

Alternative Hypothesis

A

An alternative to the null hypothesis (often symbolized as H1), this hypothesis proposes that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.

7
Q

Steps of NHST

A

Although there are many specific null hypothesis testing techniques, they are all based on the same general logic. The steps are as follows:

Assume for the moment that the null hypothesis is true. There is no relationship between the variables in the population.

Determine how likely the sample relationship would be if the null hypothesis were true.

If the sample relationship would be extremely unlikely, then reject the null hypothesis in favor of the alternative hypothesis. If it would not be extremely unlikely, then retain the null hypothesis.

8
Q

P value

A

A crucial step in null hypothesis testing is finding the probability of the sample result or a more extreme result if the null hypothesis were true (Lakens, 2017).[1] This probability is called the p value. A low p value means that the sample or more extreme result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A p value that is not low means that the sample or more extreme result would be likely if the null hypothesis were true and leads to the retention of the null hypothesis.
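
One way to make this concrete is a simulation: assume the null hypothesis is true (no relationship in the population), generate many samples under that assumption, and count how often a relationship at least as extreme as the observed one appears. The observed correlation and sample size below are hypothetical, and the simulation is only a sketch of the general logic, not the exact procedure any particular test uses.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
observed_r, n = 0.30, 20          # hypothetical sample correlation and sample size

# Simulate sampling error under the null: x and y are truly unrelated in the population
null_rs = []
for _ in range(10_000):
    x = rng.normal(size=n)
    y = rng.normal(size=n)
    null_rs.append(pearsonr(x, y)[0])

# p value: proportion of null samples at least as extreme as the observed result (two-tailed)
p_value = np.mean(np.abs(null_rs) >= abs(observed_r))
print(p_value)   # a low value would lead to rejecting the null hypothesis
```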

9
Q

How low does a p value have to be before sample result is considered unlikely enough to reject null hypothesis?

A

But how low must the p value be before the sample result is considered unlikely enough to reject the null hypothesis? In null hypothesis testing, this criterion is called α (alpha) and is almost always set to .05. If there is a 5% chance or less of a result at least as extreme as the sample result if the null hypothesis were true, then the null hypothesis is rejected. When this happens, the result is said to be statistically significant. If there is greater than a 5% chance of a result as extreme as the sample result when the null hypothesis is true, then the null hypothesis is retained.

10
Q

Misunderstanding the P value

A

The p value is one of the most misunderstood quantities in psychological research (Cohen, 1994)[2]. Even professional researchers misinterpret it, and it is not unusual for such misinterpretations to appear in statistics textbooks!

The most common misinterpretation is that the p value is the probability that the null hypothesis is true—that the sample result occurred by chance. For example, a misguided researcher might say that because the p value is .02, there is only a 2% chance that the result is due to chance and a 98% chance that it reflects a real relationship in the population. But this is incorrect. The p value is really the probability of a result at least as extreme as the sample result if the null hypothesis were true. So a p value of .02 means that if the null hypothesis were true, a sample result this extreme would occur only 2% of the time.

You can avoid this misunderstanding by remembering that the p value is not the probability that any particular hypothesis is true or false. Instead, it is the probability of obtaining the sample result if the null hypothesis were true.

11
Q

clarifying what it means to test the null hypothesis

A

Recall that null hypothesis testing involves answering the question, “If the null hypothesis were true, what is the probability of a sample result as extreme as this one?” In other words, “What is the p value?”

12
Q

Interpretive skill

A

Weak relationships based on medium or small samples are never statistically significant, and strong relationships based on medium or larger samples are always statistically significant. If you keep this lesson in mind, you will often know whether a result is statistically significant based on the descriptive statistics alone. It is extremely useful to be able to develop this kind of intuitive judgment. One reason is that it allows you to develop expectations about how your formal null hypothesis tests are going to come out, which in turn allows you to detect problems in your analyses. For example, if your sample relationship is strong and your sample is medium, then you would expect to reject the null hypothesis.

13
Q

A statistically _____ result is not necessarily a ____ one.

A

significant, strong

14
Q

practical significance

A

Practical significance refers to the importance or usefulness of the result in some real-world context. Many sex differences are statistically significant—and may even be interesting for purely scientific reasons—but they are not practically significant. In clinical practice, this same concept is often referred to as “clinical significance.”

15
Q

T test

A

The most common null hypothesis test for a difference between two means is the t-test.

The t-test is a test that involves looking at the difference between two means.

16
Q

One Sample t-test

A

Used to compare a sample mean (M) with a hypothetical population mean (μ0) that provides some interesting standard of comparison
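
A minimal sketch using SciPy; the scores and the comparison value of 20 are hypothetical.

```python
from scipy.stats import ttest_1samp

# Hypothetical scores, compared against a hypothetical population mean of 20
scores = [21, 24, 19, 25, 22, 26, 23, 20]
t_stat, p_value = ttest_1samp(scores, popmean=20)
print(t_stat, p_value)   # reject the null hypothesis if p_value < .05
```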

17
Q

Test Statistic

A

A statistic (e.g., F, t, etc.) that is computed to compare against what is expected under the null hypothesis, and thus helps find the p value.

18
Q

Critical Value

A

The absolute value that a test statistic (e.g., F, t, etc.) must exceed to be considered statistically significant.
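
For example, the critical value for a two-tailed t-test with α = .05 can be looked up from the t distribution. A sketch with SciPy; the degrees of freedom are hypothetical.

```python
from scipy.stats import t

df = 19                                # hypothetical degrees of freedom (n - 1 for n = 20)
critical_t = t.ppf(1 - 0.05 / 2, df)   # two-tailed, alpha = .05
print(critical_t)                      # about 2.09; |t| must exceed this to be significant
```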

19
Q

Two-tailed test

A

Where we reject the null hypothesis if the test statistic for the sample is extreme in either direction (+/-).

20
Q

One-tailed test

A

Where we reject the null hypothesis only if the t score for the sample is extreme in one direction that we specify before collecting the data.
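
SciPy's t-test functions let you choose the tail via an alternative argument (available in SciPy 1.6 or newer); the data below are hypothetical.

```python
from scipy.stats import ttest_1samp

scores = [21, 24, 19, 25, 22, 26, 23, 20]   # hypothetical data

# Two-tailed: a result extreme in either direction counts
_, p_two = ttest_1samp(scores, popmean=20, alternative='two-sided')

# One-tailed: only a result extreme in the pre-specified direction (here, mean > 20) counts
_, p_one = ttest_1samp(scores, popmean=20, alternative='greater')

# When the effect is in the specified direction, the one-tailed p is half the two-tailed p
print(p_two, p_one)
```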

21
Q

Dependent-samples t-test

A

Used to compare two means for the same sample tested at two different times or under two different conditions (sometimes called the paired-samples t-test).

22
Q

Difference score

A

A method to reduce pairs of scores (e.g., pre- and post-test) to a single score by calculating the difference between them.
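
A sketch of both ideas with hypothetical pre- and post-test scores: the dependent-samples (paired) t-test is equivalent to a one-sample t-test on the difference scores against μ0 = 0.

```python
import numpy as np
from scipy.stats import ttest_rel, ttest_1samp

pre  = np.array([10, 12, 9, 14, 11, 13])    # hypothetical pre-test scores
post = np.array([12, 15, 10, 16, 12, 15])   # hypothetical post-test scores

# Dependent-samples (paired) t-test
t_paired, p_paired = ttest_rel(post, pre)

# Equivalent: reduce each pair to a difference score, then run a one-sample t-test against 0
diffs = post - pre
t_diff, p_diff = ttest_1samp(diffs, popmean=0)

print(p_paired, p_diff)   # the two p values match
```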

23
Q

Independent-samples t-test

A

Used to compare the means of two separate samples (M1 and M2).
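
A minimal sketch with two hypothetical groups:

```python
from scipy.stats import ttest_ind

group1 = [24, 27, 21, 30, 26, 25]   # hypothetical scores for condition 1
group2 = [19, 22, 20, 24, 18, 21]   # hypothetical scores for condition 2

t_stat, p_value = ttest_ind(group1, group2)   # assumes equal variances by default
print(t_stat, p_value)
```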

24
Q

ANOVA stands for?

A

ANalysis Of VAriance
A statistical test used when there are more than two groups or condition means to be compared.

25
Q

one-way ANOVA

A

Used for between-subjects designs with a single independent variable
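
A sketch of a one-way ANOVA for three hypothetical groups using SciPy:

```python
from scipy.stats import f_oneway

group_a = [4, 5, 6, 5, 4]     # hypothetical scores in condition A
group_b = [6, 7, 8, 7, 6]     # hypothetical scores in condition B
group_c = [9, 8, 10, 9, 8]    # hypothetical scores in condition C

f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f_stat, p_value)   # a significant F means at least one group mean differs from the others
```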

26
Q

Mean squares between groups (MSB)

A

An estimate of the population variance based on the differences among the sample means.

27
Q

Mean squares within groups (MSW)

A

An estimate of the population variance based on the differences among the scores within each group.
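
Putting the last two cards together, with k groups, n_j scores in group j, N total scores, group means M_j, and grand mean M_G, the standard one-way ANOVA formulas are:

```latex
MS_B = \frac{SS_B}{k - 1} = \frac{\sum_j n_j (M_j - M_G)^2}{k - 1}, \qquad
MS_W = \frac{SS_W}{N - k} = \frac{\sum_j \sum_i (x_{ij} - M_j)^2}{N - k}, \qquad
F = \frac{MS_B}{MS_W}
```

If the null hypothesis is true, MSB and MSW estimate the same population variance and F should be near 1; an F much larger than 1 suggests the group means differ by more than sampling error alone would produce.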

28
Q

post hoc comparisons

A

An unplanned (not hypothesized) test of which pairs of group mean scores are different from which others.
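
The Tukey HSD test is one common post hoc procedure; a sketch assuming SciPy 1.8 or newer, with hypothetical data.

```python
from scipy.stats import tukey_hsd

group_a = [4, 5, 6, 5, 4]     # hypothetical data for three conditions
group_b = [6, 7, 8, 7, 6]
group_c = [9, 8, 10, 9, 8]

result = tukey_hsd(group_a, group_b, group_c)
print(result)   # pairwise comparisons with test statistics, p values, and confidence intervals
```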

29
Q

repeated-measures ANOVA

A

Compares the means for the same participants tested under different conditions or at different times; the dependent variable is measured multiple times for each participant.

30
Q

Factorial ANOVA

A

A statistical method to detect differences in the means between conditions when there are two or more independent variables in a factorial design. It allows the detection of main effects and interaction effects.

31
Q

Rejecting the null hypothesis when it is true is called a ____ __ _____

A

Type I Error - This error means that we have concluded that there is a relationship in the population when in fact there is not. This is a false positive.

32
Q

Retaining the null hypothesis when it is false is called a ____ ___ ___

A

Type II Error - This error means that we have concluded that there is no relationship in the population when in fact there is a relationship. This is a false negative.

33
Q

Why is setting α to .05 considered ideal?

A

According to some researchers, it keeps the rates of both Type I and Type II errors at acceptable levels.

34
Q

What is the file-drawer problem?

A

The problem that research results that fail to reach statistical significance tend not to be published. As a consequence, the published literature fails to contain a full representation of the positive and negative findings about a research question.

35
Q

What is the result of the file-drawer problem?

A

One effect of this tendency is that the published literature probably contains a higher proportion of Type I errors than we might expect on the basis of statistical considerations alone. Even when there is a relationship between two variables in the population, the published research literature is likely to overstate the strength of that relationship.

36
Q

What is p-hacking?

A

When researchers make various decisions in the research process to increase their chance of a statistically significant result (and a Type I error) by arbitrarily removing outliers, selectively choosing which dependent variables to report, presenting only significant results, and so on, until their results yield a desirable p value.

37
Q

Statistical Power

A

In research design, it means the probability of rejecting the null hypothesis given the sample size and expected relationship strength.

38
Q

How to increase statistical power

A

Given that statistical power depends primarily on relationship strength and sample size, there are essentially two steps you can take to increase statistical power: increase the strength of the relationship or increase the sample size. Increasing the strength of the relationship can sometimes be accomplished by using a stronger manipulation or by more carefully controlling extraneous variables to reduce the amount of noise in the data (e.g., by using a within-subjects design rather than a between-subjects design). The usual strategy, however, is to increase the sample size. For any expected relationship strength, there will always be some sample large enough to achieve adequate power.
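
A power analysis before the study can tell you how large a sample you need. A sketch using statsmodels; the effect size of d = 0.5 and the targets of α = .05 and power = .80 are hypothetical choices.

```python
from statsmodels.stats.power import TTestIndPower

# How many participants per group would an independent-samples t-test need?
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(n_per_group)   # roughly 64 per group for a medium-sized effect
```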

39
Q

Confidence Intervals

A

A range of values that is computed in such a way that some percentage of the time (usually 95%) the population parameter will lie within that range.
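
A sketch of a 95% confidence interval for a mean, built from the t distribution; the data are hypothetical.

```python
import numpy as np
from scipy.stats import t

scores = np.array([21, 24, 19, 25, 22, 26, 23, 20])   # hypothetical sample

mean = scores.mean()
sem = scores.std(ddof=1) / np.sqrt(len(scores))        # standard error of the mean
margin = t.ppf(0.975, df=len(scores) - 1) * sem        # 95%, two-sided
print(mean - margin, mean + margin)                    # the 95% confidence interval
```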

40
Q

Bayesian Statistics

A

An approach in which the researcher specifies the probability that the null hypothesis and any important alternative hypotheses are true before conducting the study, conducts the study, and then updates the probabilities based on the data.
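
A toy sketch of the Bayesian idea with two hypotheses and made-up numbers: specify prior probabilities, then update them with the likelihood of the observed data under each hypothesis using Bayes' rule.

```python
# Hypothetical prior probabilities for the null and alternative hypotheses
prior_h0, prior_h1 = 0.5, 0.5

# Hypothetical likelihoods: probability of the observed data under each hypothesis
likelihood_h0, likelihood_h1 = 0.02, 0.10

# Bayes' rule: posterior is proportional to prior * likelihood
evidence = prior_h0 * likelihood_h0 + prior_h1 * likelihood_h1
posterior_h0 = prior_h0 * likelihood_h0 / evidence
posterior_h1 = prior_h1 * likelihood_h1 / evidence
print(posterior_h0, posterior_h1)   # the probabilities, updated in light of the data
```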

41
Q

Replicability crisis

A

A phrase that refers to the inability of researchers to replicate earlier research findings.

42
Q

5 questionable research practices:

A
  1. Deleting outliers to influence statistical results.
  2. Cherry picking results.
  3. HARKing - Hypothesizing After Results are Known.
  4. p-hacking - checking whether a result is statistically significant before deciding whether to recruit more participants.
  5. Outright fabrication of data (fraud).
43
Q

Important things to do now

A

Designing and conducting studies that have sufficient statistical power, in order to increase the reliability of findings.

Publishing both null and significant findings (thereby counteracting the publication bias and reducing the file drawer problem).

Describing one’s research designs in sufficient detail to enable other researchers to replicate the study using an identical or at least very similar procedure.

Conducting high-quality replications and publishing these results (Brandt et al., 2014)[10].

44
Q

Open science practices

A

A practice in which researchers openly share their research materials with other researchers in hopes of increasing the transparency and openness of the scientific enterprise.

45
Q

Criticisms and Defenses of Null Hypothesis Testing

A

Null hypothesis testing has been criticized on the grounds that researchers misunderstand it, that it is illogical, and that it is uninformative. Others argue that it serves an important purpose—especially when used with effect size measures, confidence intervals, and other techniques. It remains the dominant approach to inferential statistics in psychology.