Statistics Flashcards

1
Q

Descriptive Statistics

A

Organizes, summarizes, and communicates a group of numerical observations

2
Q

Inferential Statistics

A

Uses sample data to make general estimates about the larger population

3
Q

Sample

A

Set of observations drawn from the population of interest

4
Q

Population

A

includes all possible observations about which we’d like to know something

5
Q

Variable

A

Any observation of a physical, attitudinal, or behavioural characteristic that can take on different values

6
Q

Discrete observation

A

Can take on only specific values (e.g., whole numbers); no other values can exist between these numbers; example: the number of times one woke up early in a week

7
Q

Continuous observation

A

can take on a full range of values (numbers out to several decimal places); infinite number of potential values exist; A person might complete a task in 12.839 seconds, etc.

8
Q

Nominal Variable

A

Variable used for observations that have categories, or names, as their values; example: coding female as 1 and male as 2, where the numbers are only labels

9
Q

Ordinal Variable

A

A variable used for observations that have rankings as their values; example: in team sports, which team placed first, second, and third

10
Q

Interval Variables

A

Used for observations that have numbers as their values; the distance (or interval) between pairs of consecutive numbers is assumed to be equal; example: temperature, because the interval from one degree to the next is always the same; some interval variables are also discrete (they cannot be anything but whole numbers), such as some personality and attitude measures

11
Q

Ratio Variables

A

Variables that meet the criteria for interval variables but also have meaningful zero points; reaction time; time has a meaningful zero

12
Q

Scale Variable

A

Variable that meets the criteria for an interval variable or a ratio variable

13
Q

Level

A

Discrete value or condition that a variable can take on

14
Q

Independent Variable

A

Has at least two levels that we either manipulate or observe to determine its effects on the dependent variable; example: when asking whether gender predicts one’s attitude about politics, gender is the independent variable, with two levels (male and female)

15
Q

Dependent Variable

A

Outcome variable that we hypothesise to be related to, or caused by, changes in the independent variable

16
Q

Confounding Variable

A

Any variable that systematically varies with the independent variable so that we cannot logically determine which variable is at work; also called a confound; example: if you start using a diet drug AND start exercising at the same time, you cannot tell which one caused any weight loss

17
Q

Reliable measure

A

One that is consistent; example: if your weight now reads the same as your weight an hour from now, your scale is reliable

18
Q

Valid measure

A

One that measures what it was intended to measure; example: your scale is valid if its reading matches your weight as measured at the doctor’s office

19
Q

Hypothesis Testing

A

process of drawing conclusions about whether a particular relation between variables is supported by evidence

20
Q

Operational Definition

A

Specifies the operations or procedures used to measure or manipulate a variable

21
Q

Correlation

A

An association between two or more variables

22
Q

Random assignment

A

Every participant in the study has an equal chance of being assigned to any of the groups or experimental conditions in a study
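
One way to picture random assignment is a shuffle followed by dealing participants into conditions. The sketch below is only illustrative: the function name `randomly_assign`, the participant labels, and the condition names are assumptions, not part of the deck.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them into conditions in turn."""
    rng = random.Random(seed)          # seed only makes the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)              # every ordering is equally likely
    assignment = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        assignment[conditions[index % len(conditions)]].append(person)
    return assignment

# Hypothetical participants and conditions, invented for illustration.
participants = ["P1", "P2", "P3", "P4", "P5", "P6"]
groups = randomly_assign(participants, ["control", "treatment"], seed=1)
print(groups)
```

Dealing in turn keeps the group sizes equal, which is a design choice of this sketch rather than a requirement of random assignment.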

23
Q

Experiment

A

A study in which participants are randomly assigned to a condition or level of one or more independent variables

24
Q

Between-Groups Research Design

A

Participants experience one, and only one, level of the independent variable

25
Q

Within-Groups Research Design

A

The different levels of the independent variable are experienced by all participants in the study, also called a Repeated measures design

26
Q

Outlier

A

an extreme score that is either very high or very low in comparison with the rest of the scores in the sample

27
Q

Outlier Analysis

A

Studies that examine observations that do not fit the overall pattern of the data, in an effort to understand the factors that influence the dependent variable

28
Q

Raw Score

A

Data point that has not yet been transformed or analyzed

29
Q

Frequency Distribution

A

Describes the pattern of a set of numbers by displaying a count or proportion for each possible value of a variable

30
Q

Frequency Table

A

Visual description of data that shows how often each value occurred, that is, how many scores were at each value

31
Q

Grouped Frequency Table

A

Visual depiction of data that reports the frequencies within a given interval rather than the frequencies for a specific value
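
A frequency table and a grouped frequency table can be built from raw scores with a simple count. This is a minimal sketch assuming whole-number scores; the function names, the interval width, and the quiz scores are hypothetical.

```python
from collections import Counter

def frequency_table(scores):
    """Return (value, count) pairs, highest value first."""
    return sorted(Counter(scores).items(), reverse=True)

def grouped_frequency_table(scores, width):
    """Return (low, high, count) per interval; assumes whole-number scores."""
    counts = Counter((score // width) * width for score in scores)
    return [(low, low + width - 1, n)
            for low, n in sorted(counts.items(), reverse=True)]

# Hypothetical quiz scores, invented for illustration.
scores = [7, 9, 7, 10, 3, 7, 9, 4]
for value, count in frequency_table(scores):
    print(value, count)
for low, high, count in grouped_frequency_table(scores, width=5):
    print(f"{low}-{high}", count)
```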

32
Q

Normal Distribution

A

A specific frequency distribution that is a bell-shaped, symmetric, unimodal curve

33
Q

Skewed distribution

A

Distributions in which one of the tails of the distribution is pulled away from the centre; lopsided, off-centre, or nonsymmetric

34
Q

Positively skewed data

A

The distribution’s tail extends to the right, in a positive direction

35
Q

Floor effect

A

Situation in which a constraint prevents a variable from taking values below a certain point

36
Q

Negatively skewed data

A

Have a distribution with a tail that extends to the left, in a negative direction

37
Q

Ceiling effect

A

situation in which a constraint prevents a variable from taking on values above a given number

38
Q

Hint to tell whether the data is positively or negatively skewed

A

The tail tells the tale. Negative scores are to the left, so when the long, thin tail of a distribution is to the left of the distribution’s centre, it is negatively skewed. When the long, thin tail is to the right of the distribution’s centre, it is positively skewed.

39
Q

Ways to present raw data

A

Frequency Tables, Grouped Frequency tables, Histograms, and Frequency Polygons

40
Q

Ways to mislead with graphs

A
False Face Validity Lie
Biased Scale Lie
Sneaky sample lie
Interpolation Lie
Extrapolation Lie
Inaccurate Values Lie
41
Q

Types of Graphs

A
Scatterplot
Line Graph
Time Series Plot
Bar Graph
Pictorial Graphs
Pie Charts
42
Q

Central tendency

A

Refers to the descriptive statistic that best represents the centre of a data set, the particular value that all the other data seem to gather around; it is what we mean when we refer to the typical score; can be measured through the mean, median, and mode

43
Q

Mean

A

Arithmetic average of a group of scores

44
Q

Statistic

A

A number based on a sample taken from a population

45
Q

Parameter

A

number based on the whole population

46
Q

Median

A

The middle score of all the scores in a sample when the scores are arranged in ascending order; if there is no single middle score, the median is the mean of the two middle scores

47
Q

Mode

A

The most common score of all the scores in the sample; used (1) when one particular score dominates a distribution (2) when the distribution is bimodal or multimodal (3) when the data are nominal
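
The three measures of central tendency (mean, median, mode) can be compared side by side on a small data set. This is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical.

```python
import statistics

# Hypothetical scores, invented for illustration.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)        # arithmetic average: 30 / 6
median = statistics.median(scores)    # mean of the two middle scores, 3 and 5
modes = statistics.multimode(scores)  # list of the most common score(s)

# A distribution with two equally common scores is bimodal.
bimodal_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean, median, modes, bimodal_modes)
```

`multimode` returns a list, so it also covers the bimodal and multimodal cases.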

48
Q

Unimodal distribution

A

has one mode, or most common score

49
Q

Bimodal distribution

A

has two modes, or most common scores

50
Q

Multimodal distribution

A

has more than two modes, or most common scores

51
Q

Standard deviation

A

The square root of the average of the squared deviations from the mean; the typical amount that each score varies, or deviates, from the mean
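
The definition can be followed step by step: subtract the mean, square, average, then take the square root. This sketch divides by N, matching the "average" in the card; estimating a population value from a sample would usually divide by N - 1 instead. The scores are hypothetical.

```python
import math

def standard_deviation(scores):
    """Square root of the average squared deviation from the mean (divides by N)."""
    n = len(scores)
    mean = sum(scores) / n
    squared_deviations = [(score - mean) ** 2 for score in scores]
    variance = sum(squared_deviations) / n
    return math.sqrt(variance)

# Hypothetical scores: mean is 5, squared deviations sum to 32, variance is 4.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sd = standard_deviation(scores)
print(sd)
```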

52
Q

Measures of variability

A

Range
Variance
Standard Deviation
Interquartile Range

53
Q

Independent measures t-test

A

Nonparametric equivalent: Mann-Whitney U test

54
Q

Repeated measures t-test

A

Nonparametric equivalent: Wilcoxon signed-rank test

55
Q

Independent measures Anova

A

Nonparametric equivalent: Kruskal-Wallis test

56
Q

Repeated measures Anova

A

Nonparametric equivalent: Friedman test

57
Q

Pearson r

A

Nonparametric equivalent: Spearman’s rho (rank-order correlation)

58
Q

Random sample

A

One in which every member of the population has an equal chance of being selected into the study

59
Q

Convenience Sample

A

One that uses participants who are readily available

60
Q

Generalizability

A

Refers to researchers’ ability to apply findings from one sample or in one context to other samples or contexts; also known as external validity

61
Q

Replication

A

refers to the duplication of scientific results, ideally in a different context or with a sample that has different characteristics

62
Q

Volunteer sample

A

special kind of convenience sample in which participants actively choose to participate in a study; also called a self-selected sample

63
Q

Control Group

A

A level of the independent variable that does not receive the treatment of interest in a study; designed to match an experimental group in all ways but the experimental manipulation itself

64
Q

Experimental Group

A

Level of the independent variable that receives the treatment or intervention of interest in an experiment

65
Q

Null hypothesis

A

a statement that postulates that there is no difference between populations or that the difference is in a direction opposite from that anticipated by the researcher

66
Q

Research hypothesis

A

Statement that postulates that there is a difference between populations or sometimes, more specifically, that there is a difference in a certain direction, positive or negative; also called an alternative hypothesis

67
Q

Making a Decision About Our Hypothesis

A

We decide to reject the null hypothesis (the evidence supports a difference)
We decide to fail to reject the null hypothesis (the evidence does not support a difference)

68
Q

Rules of Formal Hypothesis Testing

A

The null hypothesis is that there is no difference between groups and usually, our hypotheses explore the possibility of a mean difference
We either reject or fail to reject the null hypothesis. There are no other options.
We never use the word accept in reference to formal hypothesis testing

69
Q

Type I Error

A

Occurs when we reject the null hypothesis but the null hypothesis is correct; false positive; rejecting the null hypothesis falsely; detrimental consequences because people often take action based on a mistaken finding

70
Q

Type II Error

A

Occurs when we fail to reject the null hypothesis but the null hypothesis is false; false negative; results in a failure to take action because a research intervention is not supported or a given diagnosis is not received

71
Q

Standardization

A

Converts individual scores to standard scores, for which we know the percentiles if the data are normally distributed

72
Q

z Score

A

The number of standard deviations a particular score is from the mean; can be computed if we know the mean and the standard deviation of a population
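
In symbols, z = (X - μ) / σ, and the conversion runs in both directions. This is a minimal sketch; the function names and the population values (mean 100, standard deviation 15) are assumptions chosen for illustration.

```python
def z_score(raw, population_mean, population_sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - population_mean) / population_sd

def raw_score(z, population_mean, population_sd):
    """Convert a z score back to the original scale."""
    return z * population_sd + population_mean

# Hypothetical population: mean 100, standard deviation 15.
z = z_score(130, 100, 15)       # two standard deviations above the mean
raw = raw_score(-1, 100, 15)    # one standard deviation below the mean
print(z, raw)
```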

73
Q

z scores into percentiles

A

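2-14-34-34-14-2: the approximate percentages of scores that fall in each standard-deviation band of a normal curve, from z = -3 to z = +3 (2%, 14%, 34%, 34%, 14%, 2%)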
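
Read as percentages of a normal curve between consecutive whole z scores from -3 to +3, the 2-14-34-34-14-2 pattern gives approximate percentiles by adding up the bands to the left of a z score. The sketch below assumes that reading and handles only whole-number z scores; the names are hypothetical.

```python
# Approximate percent of scores between consecutive z scores from -3 to +3.
BANDS = [2, 14, 34, 34, 14, 2]

def approx_percentile(z):
    """Approximate percentile for a whole-number z score between -3 and 3."""
    if z not in range(-3, 4):
        raise ValueError("this sketch only handles whole z scores from -3 to 3")
    return sum(BANDS[: z + 3])   # add every band to the left of z

for z in range(-3, 4):
    print(z, approx_percentile(z))
```

These are rounded textbook values; a z table gives more precise percentiles.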

74
Q

Central Limit Theorem

A

Refers to how a distribution of sample means is a more normal distribution than a distribution of scores, even when the population distribution is not normal; repeated sampling approximates a normal curve even when the original population is not normally distributed; a distribution of means is less variable than a distribution of individual scores; as a guideline, each sample should comprise at least 30 scores

75
Q

Distribution of means

A

Distribution composed of many means that are calculated from all possible samples of a given size, all taken from the same population
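
For a tiny population, a distribution of means can be built exhaustively. The sketch below assumes sampling with replacement and a sample size of 2; the population values are hypothetical. It illustrates two properties: the mean of the means equals the population mean, and the means are less spread out than the individual scores.

```python
import math
from itertools import product
from statistics import mean, pstdev

# Hypothetical population of six scores, invented for illustration.
population = [1, 2, 3, 4, 5, 6]
sample_size = 2

# Every possible ordered sample of the given size (sampling with replacement).
sample_means = [sum(sample) / sample_size
                for sample in product(population, repeat=sample_size)]

mean_of_means = mean(sample_means)
spread_of_scores = pstdev(population)
spread_of_means = pstdev(sample_means)   # the standard error
expected_standard_error = spread_of_scores / math.sqrt(sample_size)

print(mean_of_means, spread_of_scores, spread_of_means)
```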

76
Q

Ways to describe the same scores within a normal distribution

A

Raw Scores
z Scores
Percentile Rankings

77
Q

Assumptions

A

The characteristics that we ideally require the population from which we are sampling to have so that we can make accurate inferences

78
Q

Parametric Tests

A

Inferential statistical analyses based on a set of assumptions about the population

79
Q

Nonparametric Tests

A

Inferential statistical analyses that are not based on a set of assumptions about the population

80
Q

Assumptions for Conducting Analyses

A

The dependent variable is assessed using a scale measure, so there is an equal distance between the numbers. If the variable is nominal or ordinal, we cannot make this assumption.
The participants are randomly selected.
The distribution of the population of interest is approximately normal.

81
Q

Steps of Hypothesis Testing

A

Identify the populations, comparison distribution, and assumptions
State the null and research hypotheses
Determine the characteristics of the comparison distribution
Determine critical values or cutoffs
Calculate the test statistic
Decide whether to reject or fail to reject the null hypothesis

82
Q

Statistically significant finding

A

A finding is statistically significant if the data differ from what we would expect by chance if there were, in fact, no actual difference; statistical significance does not necessarily mean the finding is important or meaningful

83
Q

Robust hypothesis test

A

one that produces fairly accurate results even when the data suggest that the population might not meet some of the assumptions

84
Q

Critical value

A

Test statistic value beyond which we reject the null hypothesis, also known as a cutoff

85
Q

Critical region

A

refers to the area in the tails of the comparison distribution in which we reject the null hypothesis if our test statistic falls there.

86
Q

p level/alpha

A

The probability used to determine the critical values, or cutoffs, in hypothesis testing

87
Q

Two-tailed test

A

Hypothesis test in which the research hypothesis does not indicate a direction of the mean difference or change in the dependent variable, but merely indicates that there will be a mean difference

88
Q

Point estimate

A

Summary statistic from a sample that is just one number used as an estimate of the population parameter

89
Q

Interval Estimate

A

Based on a sample statistic and provides a range of plausible values for the population parameter; used when reporting polls

90
Q

Confidence Interval

A

Interval estimate, based on the sample statistic, that would include the population mean a certain percentage of the time if we sampled from the same population repeatedly; centred around the sample mean. The 95% confidence interval is most commonly used: 95% of the distribution falls between the two tails. The confidence level is 95%; the confidence interval is the range between the two values that surround the sample mean.
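
A 95% confidence interval around a sample mean can be sketched for the z-based case, assuming the population standard deviation is known. The critical value 1.96 is the standard two-tailed z cutoff for 95%; the function name and the numbers in the example are hypothetical.

```python
import math

def z_confidence_interval(sample_mean, population_sd, n, z_critical=1.96):
    """Return (lower, upper) bounds centred on the sample mean."""
    standard_error = population_sd / math.sqrt(n)
    margin = z_critical * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical study: sample mean 105, population SD 15, 25 participants.
lower, upper = z_confidence_interval(105, 15, 25)
print(round(lower, 2), round(upper, 2))
```

When the population standard deviation must be estimated from the sample, a t critical value would take the place of 1.96.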

91
Q

Note on sample size and statistic

A

As sample size increases, there is a corresponding increase in test statistic during hypothesis testing; A larger sample size should influence our level of confidence but it shouldn’t increase our confidence that the story is important

92
Q

Effect Size

A

Indicates the size of a difference and is unaffected by sample size; tells us how much two populations DO NOT overlap; the less overlap, the bigger the effect size

93
Q

How to decrease amount of overlap between two distributions

A

If means are farther apart

If the variation within each population is smaller

94
Q

Effect Size and Standard Deviation

A

When two population distributions decrease their spread, the overlap of the distributions is less and the effect size is bigger

95
Q

Cohen’s d

A

Developed by Jacob Cohen; a measure of effect size that assesses the difference between two means in terms of standard deviation, not standard error; similar to a z statistic

96
Q

Cohen’s Conventions for Effect Sizes

A

Effect Size    Convention    Overlap
Small          0.2           85%
Medium         0.5           67%
Large          0.8           53%
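
Cohen's d divides a mean difference by the standard deviation (not the standard error), and the conventions above give a rough label for the result. This is a sketch for the single-sample case with a known population standard deviation; the function names, the cutoff handling, and the example numbers are assumptions.

```python
def cohens_d(sample_mean, population_mean, population_sd):
    """Mean difference expressed in standard deviation units."""
    return (sample_mean - population_mean) / population_sd

def cohen_label(d):
    """Rough label based on Cohen's conventions of 0.2, 0.5, and 0.8."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below the small convention"

# Hypothetical values: sample mean 108, population mean 100, SD 10.
d = cohens_d(108, 100, 10)
print(d, cohen_label(d))
```

Because the sample size never enters the formula, d stays the same however many participants are tested.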

97
Q

Statistical Power

A

Measure of our ability to reject the null hypothesis given that the null hypothesis is false; the probability that we will reject the null hypothesis when we should reject the null hypothesis; the probability that we will not make a Type II Error. Acceptable rate is .80

98
Q

Ways to increase power of a statistical Test

A
  1. Increase the alpha. Take the p level of 0.05 and increase it to 0.10; Side effect of increasing the probability of a Type I error from 5% to 10%
  2. Turn a two-tailed hypothesis into a one tailed hypothesis
  3. Increase N. Increasing sample size leads to an increase in the test statistic, making it easier to reject the null because a larger test statistic is more likely to fall beyond the cutoff
  4. Exaggerate the levels of the independent variable. Example is to add to the length of group therapy if the study is on the effectiveness of group therapy for social phobia
  5. Decrease the standard deviation (use reliable measure from the beginning of the study and sampling from a more homogeneous group in which participants’ responses are more likely to be more similar to begin with)
99
Q

Meta-analysis

A

Study that involves the calculation of a mean effect size from the individual effect sizes of many studies

100
Q

Ways of analysing data

A

Hypothesis Testing
Confidence Intervals
Effect Size
Power Analysis

101
Q

t Distributions

A

Help us specify precisely how confident we can be in our research findings; The t test, based on t distributions, tells us how confident we can be that our sample differs from the larger population; used instead of a z distribution when sampling requires us to estimate the population standard deviation from the sample standard deviation

102
Q

t Statistic

A

Indicates the distance of a sample mean from a population mean in terms of the standard error

103
Q

Single-Sample t-test

A

Hypothesis test in which we compare data from one sample to a population for which we know the mean but not the standard deviation
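
The single-sample t statistic can be sketched as the distance between the sample mean and the population mean, divided by a standard error estimated from the sample (using N - 1). The function name and the data are hypothetical.

```python
import math
import statistics

def single_sample_t(sample, population_mean):
    """Return (t statistic, degrees of freedom) for one sample."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    estimated_sd = statistics.stdev(sample)       # divides by N - 1
    standard_error = estimated_sd / math.sqrt(n)
    t = (sample_mean - population_mean) / standard_error
    return t, n - 1

# Hypothetical sample compared with a population mean of 7.
sample = [6, 8, 10, 12, 14]
t, df = single_sample_t(sample, 7)
print(round(t, 2), df)
```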

104
Q

Degrees of freedom

A

The number of scores that are free to vary when estimating a population parameter from a sample

105
Q

Reporting a t statistic in APA Format

A
  1. Write the symbol for the test statistic
  2. Write the degrees of freedom, in parentheses
  3. Write an equal sign and then the value of the test statistic, typically to two decimal places
  4. Write a comma and then indicate the p value by writing “p=” and then the actual value

t(4) = 2.87, p < 0.05
It appears that counselling centre clients who sign a contract to attend at least 10 sessions do attend more sessions, on average, than do clients who do not sign such a contract, t(4) = 2.87, p < 0.05
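
The four steps can be turned into a small formatting helper. This is only a sketch: the function name is hypothetical, and the p information is passed in as ready-made text so that either an exact value or a cutoff such as "p < 0.05" can be reported.

```python
def apa_t_report(t_value, df, p_text):
    """Format a t statistic as: symbol, (df), value to two decimals, p text."""
    return f"t({df}) = {t_value:.2f}, {p_text}"

report = apa_t_report(2.873, 4, "p < 0.05")
print(report)
```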

106
Q

Dot plot

A

Graph that displays all the data points in a sample with the range of scores along the x-axis and a dot for each data point above the appropriate value

107
Q

t Test

A

Types of t tests:
Single-sample t test (when we compare a sample mean to a population mean but don’t know the population standard deviation)
Paired-samples t test, also called a dependent-samples t test (when we are comparing two samples and every participant is in both samples, a within-groups design; before and after comparisons)
Independent-samples t test (when we are comparing two samples and every participant is in only one sample, a between-groups design)

108
Q

Assumptions for a paired samples t test

A
  1. The dependent variable is scale
  2. The participants were randomly selected
  3. The population is normally distributed
109
Q

Order Effects/Practice Effects

A

Refer to how a participant’s behaviour changes when the dependent variable is presented for a second time

110
Q

Counterbalancing

A

minimises the practice effect by varying the order of presentation of different levels of the independent variable from one participant to the next
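
One simple way to counterbalance is to list every possible order of the levels and hand the orders out in rotation. The sketch below assumes full counterbalancing; the function name, level names, and participant labels are hypothetical.

```python
from itertools import cycle, permutations

def counterbalanced_orders(levels, participants):
    """Give each participant the next order of levels, cycling through all orders."""
    orders = cycle(permutations(levels))
    return {person: next(orders) for person in participants}

# Hypothetical two-level study with four participants.
schedule = counterbalanced_orders(["A", "B"], ["P1", "P2", "P3", "P4"])
for person, order in schedule.items():
    print(person, order)
```

With many levels the number of orders grows quickly, so a subset of orders may be more practical than the full set.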

111
Q

Example of a Paired Samples t Test

A

Salaries for the same position in two different cities; scores of 30 students on two different exams; scores on tests before and after interventions
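
For before-and-after data like these, the paired-samples t test works on difference scores: one difference per participant, then a single-sample t test of those differences against zero. This is a minimal sketch; the function name and the scores are hypothetical.

```python
import math
import statistics

def paired_samples_t(before, after):
    """Return (t statistic, degrees of freedom) computed from difference scores."""
    differences = [b - a for a, b in zip(before, after)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return mean_difference / standard_error, n - 1

# Hypothetical test scores for the same four people before and after an intervention.
before = [10, 12, 9, 11]
after = [12, 15, 10, 13]
t, df = paired_samples_t(before, after)
print(round(t, 2), df)
```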

112
Q

Independent samples t-test

A

used to compare two means for a between-groups design, situation in which each participant is assigned to only one condition; difference between means

113
Q

Three Assumptions for Independent Samples t-test

A

1) The dependent variable is a rating on a liking measure, which can be considered a scale variable
2) We do not know whether the population is normally distributed, but there are at least 30 participants
3) Participants are randomly selected

114
Q

Example of independent samples t-test

A

Group 1: Low trust in leader
Group 2: High trust in leader

Level of agreement with their supervisor from 1 (strongly disagree) to 7 (strongly agree)

Population 1: women exposed to humorous cartoons
Population 2: men exposed to humorous cartoons
Dependent variable: percentage of cartoons characterised as funny (scale)
H0: μ1 = μ2

Are women really more talkative than men?
How long, in minutes, do male and female students spend getting ready for a date?
Can women experience “mother hearing,” an increased sensitivity to and awareness of noises, in particular, those of children? Mothers and non mothers.

115
Q

Taylor and Ste-Marie studied eating disorders in 41 Canadian female figure skaters. They compared the figure skaters’ data on the Eating Disorder Inventory to the means of known populations, including women with eating disorders. On average, the figure skaters were more similar to the population of women with eating disorders than to those without eating disorders

A

Single sample t test because we have one sample of figure skaters and are comparing that sample to a population (women with eating disorders) for which we know the mean

116
Q

In an article titled “A Fair and Balanced Look at the News: What Affects Memory for controversial Arguments,” Wiley found that people with a high level of previous knowledge about a given controversial topic (abortion, military intervention) had better average recall for arguments on both sides of that issue than did those with lower levels of knowledge

A

Independent Samples t test because we have two samples, and no participant can be in both samples. One cannot have both high level and low level of knowledge about a topic

117
Q

Engle-Friedman and colleagues studied the effects of sleep deprivation. Fifty students were assigned to one night of sleep loss (students were required to call the laboratory every half-hour all night) and then one night of no sleep loss (normal sleep). The next day, the students were offered a choice of math problems with differing levels of difficulty. Following sleep loss, students tended to choose less challenging problems

A

We would use a paired-samples t test because we have two samples, but every student is assigned to both samples - one night of sleep loss and one night of no sleep loss

118
Q

Anova Example

A

Rather than running three experiments to compare Group 1 vs Group 2, Group 1 vs Group 3, and Group 2 vs Group 3, putting all three groups in a single experiment is far more efficient. Scores on the final exam for Group 1 (control group) and Group 3 (take-responsibility group) were the same. Average scores on the final exam for Group 2 (self-esteem group) sank to .37%.

119
Q

Using T tests to compare three groups

A

Leads to more chances of committing a Type I error. (0.95)(0.95)(0.95) = 0.857, this gives us almost a 15% chance of having at least one Type I error if we run 3 analyses.
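
The arithmetic generalises to any number of tests: the chance of at least one Type I error is 1 minus the chance of making none. This sketch assumes the tests are independent; the function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests

rate = familywise_error_rate(0.05, 3)   # 1 - (0.95)(0.95)(0.95)
print(round(rate, 3))
```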

120
Q

F distributions

A

Allow us to conduct a single hypothesis test with multiple groups; more complex variations of the z distributions and the t distributions

121
Q

Anova

A

Analysis of Variance; a hypothesis test typically with one or more nominal independent variables with at least three groups overall and a scale dependent variable

122
Q

F statistic

A

Ratio of two measures of variance: (1) between groups variance, which indicates differences among sample means, and (2) within-groups variance, which is essentially an average of the sample variances; a way of measuring whether three or more groups vary from one another; an expansion of the z statistic and t statistic

123
Q

Between Groups Variance

A

An estimate of the population variance based on the differences among the means

124
Q

Within Groups Variance

A

An estimate of the population variance based on the differences within each of the three (or more) sample distributions
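
The two variance estimates combine into the F statistic as between-groups variance divided by within-groups variance. The sketch below works through a one-way between-groups case; the function name and the three groups of scores are hypothetical.

```python
def one_way_f(groups):
    """F = between-groups variance / within-groups variance."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-groups: how far each sample mean is from the grand mean.
    ss_between = sum(len(group) * (m - grand_mean) ** 2
                     for group, m in zip(groups, group_means))
    # Within-groups: how far each score is from its own sample mean.
    ss_within = sum((score - m) ** 2
                    for group, m in zip(groups, group_means)
                    for score in group)

    df_between = len(groups) - 1
    df_within = sum(len(group) - 1 for group in groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three groups of three participants.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_statistic = one_way_f(groups)
print(round(f_statistic, 2))
```

Here the sample means (2, 3, 6) differ a lot relative to the spread inside each group, so F comes out well above 1.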

125
Q

When to use z, t, and F statistic

A
z = one sample; population mean and standard deviation are known
t = one sample, only the population mean is known; or two samples
F = three or more samples
126
Q

One-Way Anova

A

A hypothesis test that includes one nominal independent variable with more than two levels and a scale dependent variable

127
Q

Within-Groups Anova

A

A hypothesis test in which there are more than two samples, and each sample is composed of the same participants; also called a repeated-measures ANOVA

128
Q

Between-Groups Anova

A

A hypothesis test in which there are more than two samples, and each sample is composed of different participants

129
Q

Assumptions for Anova

A

Samples are randomly selected.
Population distribution is normal.
All samples come from populations with the same variances

130
Q

External Validity

A

Ability to generalize beyond the sample

131
Q

Homoscedasticity

A

Homoscedastic populations are those that have the same variance

132
Q

Heteroscedastic Populations

A

Those that have different variances

133
Q

Null Hypothesis for Anova

A

H0: μ1 = μ2 = μ3 = μ4

134
Q

Source Table

A

Presents the important calculations and final results of an Anova in a consistent and easy-to-read format

135
Q

Grand Mean

A

The mean of every score in a study, regardless of which sample the score came from

136
Q

R²

A

Proportion of variance in the dependent variable that is accounted for by the independent variable
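
For a one-way between-groups ANOVA this proportion is commonly computed as the between-groups sum of squares divided by the total sum of squares. The sketch below assumes that formula; the function name and the scores are hypothetical.

```python
def r_squared(groups):
    """Between-groups sum of squares divided by total sum of squares."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    ss_between = sum(len(group) * (sum(group) / len(group) - grand_mean) ** 2
                     for group in groups)
    return ss_between / ss_total

# Hypothetical scores for three groups of three participants.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
effect = r_squared(groups)
print(round(effect, 4))
```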

137
Q

Planned comparison

A

A test that is conducted when there are multiple groups of scores, but specific comparisons have been specified prior to data collection

138
Q

Post hoc Test

A

Statistical procedure frequently carried out after we reject the null hypothesis in an analysis of variance; it allows us to make multiple comparisons among several means; often referred to as a follow-up test

139
Q

A priori comparisons

A

Guided by an existing theory or a previous finding

140
Q

Choices a researcher can make

A

Conducting one or more independent samples t tests with a p level of 0.05
Conducting one or more independent-samples t tests using a more conservative p level as determined by a Bonferroni Test
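
The more conservative p level from a Bonferroni test is the overall p level divided by the number of comparisons. This is a minimal sketch; the function names are hypothetical, and the count of comparisons assumes every pair of groups is compared.

```python
from math import comb

def pairwise_comparisons(n_groups):
    """Number of distinct pairs of groups."""
    return comb(n_groups, 2)

def bonferroni_p_level(overall_p_level, n_comparisons):
    """Per-comparison p level that keeps the overall level at or below the target."""
    return overall_p_level / n_comparisons

n_pairs = pairwise_comparisons(3)             # three groups give three pairs
per_test = bonferroni_p_level(0.05, n_pairs)
print(n_pairs, round(per_test, 4))
```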

141
Q

Tukey HSD Test

A

Post hoc test that determines the differences between means in terms of standard error; the resulting HSD is compared to a critical value; sometimes referred to as the q test

Involves (1) calculation of differences between each pair of means (2) division of each difference by the standard error

142
Q

Within Groups Degrees of Freedom for a one-way between-groups ANOVA

A

df within = df1 + df2 + df3 + df4

Find the degrees of freedom for each group by subtracting 1 from the number of people in that sample, then sum them

143
Q

One way within-groups ANOVA

A

When there’s just one nominal or ordinal independent variable (type of beer), the independent variable has more than two levels (cheap, mid-range, and high-end), the dependent variable is scale (ratings of beers), and every participant is in every group (each participant tastes the beers in every category)

144
Q

How to reduce error in a within groups design

A

Each group includes exactly the same participants, groups are identical on all the relevant variables; same taste preferences, amount of alcohol typically consumed, tendency to be critical or lenient when rating, and so on

145
Q

Steps of Hypothesis Testing for within-groups ANOVA

A

Identify the populations, distribution, assumptions
State the null and research hypotheses
Determine the characteristics of the comparison distribution
(F distribution; degrees of freedom: df within = (df between)(df subjects); df total = df between + df subjects + df within)
Determine critical values or cutoffs (F statistic for a p level of 0.05)
Calculate the test statistic
Make a decision
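
The degrees of freedom in step 3 can be sketched as below, assuming a hypothetical study with three levels and five participants who each take part in every level.

```python
# Hypothetical design: 3 levels of the independent variable, 5 participants.
n_groups = 3
n_participants = 5

df_between = n_groups - 1
df_subjects = n_participants - 1
df_within = df_between * df_subjects
df_total = df_between + df_subjects + df_within

# Check: df total also equals the total number of observations minus 1.
n_observations = n_groups * n_participants
print(df_between, df_subjects, df_within, df_total, n_observations - 1)
```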

146
Q

As social scientists

A

We should critically examine the research design and, regardless of its merits, call for a replication

147
Q

Problems to watch for when using matched groups

A

We may not be aware of all of the important variables of interest
If one of the people in a matched pair decides not to complete the study, then we must also discard the data for that person's match

148
Q

Statistical Interaction

A

Occurs in a factorial design when two or more independent variables have an effect in combination that we do not see when we examine each independent variable on its own

149
Q

Two-way ANOVA

A

Hypothesis test that includes two nominal independent variables, regardless of their numbers of levels, and a scale dependent variable

150
Q

Factorial ANOVA

A

A statistical analysis used with one scale dependent variable and at least two nominal independent variables (factors); also called a multifactorial ANOVA

151
Q

Factor

A

Term used to describe an independent variable in a study with more than one independent variable

152
Q

How to name an ANOVA

A

Number of IVs    Participants in one or all samples    Always follows descriptors
One-way          Between-groups                        ANOVA
Two-way          Within-groups
Three-way        Mixed-design

153
Q

Example of ANOVA

A

Examine (1) the effect of Lipitor versus other medication (2) the effect of grapefruit juice versus other beverages (3) ways in which a drug and a juice might combine to create some entirely new and unexpected effect
            Lipitor    Zocor    Placebo
GF JUICE    L & G      Z & G    P & G
WATER       L & W      Z & W    P & W

154
Q

Main effect

A

Occurs in a factorial design when one of the independent variables has an influence on the dependent variable

155
Q

Quantitative interaction

A

An interaction in which one independent variable exhibits a strengthening or weakening of its effect at one or more levels of the other independent variable, but the direction of the initial effect does not change

156
Q

Qualitative interaction

A

Particular type of quantitative interaction of two (or more) independent variables in which one independent variable reverses its effect depending on the level of the other independent variable

157
Q

Marginal Mean

A

The mean of a row or a column in a table that shows the cells of a study with a two-way ANOVA design
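
A minimal sketch using the drug-by-beverage layout from the ANOVA example card. The cell means are made up for illustration, and averaging cell means assumes equal numbers of participants per cell.

```python
# Hypothetical cell means (rows: beverage, columns: drug).
cell_means = {
    "grapefruit juice": {"Lipitor": 60.0, "Zocor": 40.0, "placebo": 20.0},
    "water": {"Lipitor": 30.0, "Zocor": 30.0, "placebo": 30.0},
}

# Row marginal means: average across the cells in each row.
row_marginals = {
    row: sum(cells.values()) / len(cells) for row, cells in cell_means.items()
}

# Column marginal means: average across the cells in each column.
columns = list(next(iter(cell_means.values())))
column_marginals = {
    col: sum(cells[col] for cells in cell_means.values()) / len(cell_means)
    for col in columns
}
print(row_marginals)
print(column_marginals)
```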

158
Q

Six Steps of a Two-Way ANOVA

A

Identify the populations, distribution, assumptions
State the null and research hypotheses
Determine characteristics of the comparison distribution
Determine the critical values, or cutoffs
Calculate the test statistic
Make a decision

159
Q

Mixed Design ANOVA

A

Used to analyse data from a study with at least two independent variables; at least one variable must be within-groups and at least one must be between-groups

160
Q

Multivariate Analysis of Variance (MANOVA)

A

Form of ANOVA in which there is more than one dependent variable; The word multivariate refers to the number of dependent variables, not the number of independent variables

161
Q

Analysis of Covariance (ANCOVA)

A

Type of ANOVA in which a covariate is included so that statistical findings reflect effects after a scale variable has been statistically removed

162
Q

Covariate

A

scale variable that we suspect associates, or covaries, with the independent variable of interest; statistically subtracts the effect of a possible confounding variable

163
Q

Multivariate Analysis of Covariance (MANCOVA)

A

An ANOVA with multiple dependent variables and the inclusion of a covariate

164
Q

Example of Two-Way Between-Groups ANOVA

A

Online dating Website allows users to post personal ads to meet others. Each person is asked to specify a range from the youngest age acceptable to the oldest age acceptable. Data were randomly selected from ads of 25-year-old people living in the New York City area. Scores represent youngest acceptable ages listed by those in the sample.

25 y.o. women seeking men
25 y.o. women seeking women
25 y.o. men seeking women
25 y.o. men seeking men

Two independent variables (gender of the seeker, levels: male and female; gender of the person being sought, levels: male and female); one dependent variable (youngest acceptable age of the person being sought)

165
Q

Correlation

A

Association or relation between two variables; gives new ways to measure behaviour and to distinguish among the influences of overlapping variables

166
Q

Correlation Coefficient

A

A statistic that quantifies a relation between two variables

167
Q

Positive Correlation

A

An association between two variables such that participants with high scores on one variable tend to have high scores on the other variable as well, and those with low scores on one variable tend to have low scores on the other variable

168
Q

Three main characteristics of the Correlation Coefficient

A

It can be either positive or negative
It always falls between -1.00 and 1.00
It is the strength (or magnitude) of the coefficient, not its sign, that indicates how large it is

169
Q

Negative Correlation

A

An association between two variables in which participants with high scores on one variable tend to have low scores on the other variable

170
Q

Guidelines on Size of Correlation and Correlation Coefficient

A

Size of Correlation Correlation Coefficient
Small 0.10
Medium 0.30
Large 0.50

171
Q

Limitations of Correlation

A

Correlation is Not Causation

Restricted Range

172
Q

Possible Causal Explanations for a Correlation

A

The first variable might cause the second variable
The second variable could cause the first variable
A third variable could cause both the first and the second variables

173
Q

Effect of an extreme outlier on a correlation

A

A correlation can be dramatically altered by a restricted range or by an extreme outlier

174
Q

Pearson Correlation Coefficient

A

Statistic that quantifies a linear relation between two scale variables; a single number is used to describe the direction and strength of the relation between two variables when their overall pattern indicates a straight-line relation
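
A sketch of the calculation from deviation scores; the function name and the data are illustrative assumptions, not part of the deck.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation computed from deviations around each mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [value - mean_x for value in x]
    dev_y = [value - mean_y for value in y]
    # Numerator: sum of the products of the paired deviations.
    numerator = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    # Denominator: square root of the product of the two sums of squares.
    denominator = sqrt(sum(dx ** 2 for dx in dev_x) * sum(dy ** 2 for dy in dev_y))
    return numerator / denominator

# Hypothetical scores on two scale variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```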

175
Q

Hypothesis Testing with the Pearson Correlation Coefficient

A
  1. Identify the population, distribution, and assumptions
  2. State the null and research hypotheses
    H0: ρ = 0; H1: ρ ≠ 0
  3. Determine the characteristics of the comparison distribution (df = N-2)
  4. Determine the critical/cutoff values. Look up values in r table given the degrees of freedom and the p level
  5. Calculate the test statistic
  6. Make a decision
176
Q

Coefficient alpha

A

Estimate of a test's or measure's reliability, calculated by taking the average of all possible split-half correlations

177
Q

Partial Correlation

A

Technique that quantifies the degree of association between two variables after statistically removing the association of a third variable with both of those two variables
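
One common way to compute a first-order partial correlation is from the three pairwise correlations. The sketch below uses that standard formula with made-up correlation values; the function name is hypothetical.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the association of z with both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations.
r_xy_given_z = partial_correlation(r_xy=0.6, r_xz=0.5, r_yz=0.4)
print(round(r_xy_given_z, 4))
```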

178
Q

Simple Linear Regression

A

A statistical tool that lets us predict a person’s score on the dependent variable from his or her score on one independent variable
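
A minimal least-squares sketch, assuming made-up scores; the variable names are illustrative.

```python
# Hypothetical scores: x is the independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Slope: sum of deviation products divided by the sum of squared x deviations.
slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
    (a - mean_x) ** 2 for a in x
)
# Intercept: the line passes through the point (mean_x, mean_y).
intercept = mean_y - slope * mean_x

# Predicted score on the dependent variable for a new x of 6.
predicted = intercept + slope * 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```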

179
Q

Regression

A

A statistical technique that can provide specific quantitative predictions that more precisely explain relations among variables

180
Q

Regression to the mean

A

Regression of the dependent variable; the tendency of scores that are particularly high or low to drift toward the mean over time

181
Q

Standardized Regression Coefficient

A

A standardised version of the slope in a regression equation; it is the predicted change in the dependent variable, in standard deviations, for an increase of 1 standard deviation in the independent variable
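
A sketch of one way to standardise a slope: multiply it by the ratio of the square roots of the two sums of squares. The data are made up; with a single predictor the result matches the Pearson correlation for the same data.

```python
from math import sqrt

# Hypothetical scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
ss_x = sum((a - mean_x) ** 2 for a in x)
ss_y = sum((b - mean_y) ** 2 for b in y)

# Unstandardised slope from least squares.
slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / ss_x

# Standardised coefficient: rescale the slope into standard deviation units.
beta = slope * sqrt(ss_x) / sqrt(ss_y)
print(round(beta, 4))
```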

182
Q

Regression Line

A

The line that best fits the points on the scatterplot; the regression line is the line that leads to the least amount of error in prediction

183
Q

Standard Error of the Estimate

A

Statistic indicating the typical distance between a regression line and the actual data points; we are concerned with variability around the line of best fit rather than variability around the mean

184
Q

Regression to the mean

A

Occurs because extreme scores tend to become less extreme, that is, they tend to regress towards the mean

185
Q

Proportionate Reduction in Error/Coefficient of Determination

A

Statistic that quantifies how much more accurate predictions are when we use the regression line instead of the mean as a prediction tool
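
A worked sketch with made-up data: error when predicting from the mean (SS total) is compared with error when predicting from the regression line (SS error).

```python
# Hypothetical scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
    (a - mean_x) ** 2 for a in x
)
intercept = mean_y - slope * mean_x

# Error when the mean is used as the prediction for everyone.
ss_total = sum((b - mean_y) ** 2 for b in y)
# Error when the regression line is used instead.
ss_error = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Proportionate reduction in error.
r_squared = (ss_total - ss_error) / ss_total
print(round(r_squared, 2))
```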

186
Q

Orthogonal variable

A

An independent variable that makes a separate and distinct contribution in the prediction of a dependent variable, as compared with another variable

187
Q

Multiple regression

A

A statistical technique that includes two or more predictor variables in a prediction equation

188
Q

Use of Statistical Techniques

A

A way of quantifying whether multiple pieces of evidence really are better than one
A way of quantifying precisely how much better each additional piece of evidence actually is

189
Q

Stepwise Multiple Regression

A

A type of multiple regression in which computer software determines the order in which independent variables are included in the equation

190
Q

Hierarchical multiple regression

A

type of multiple regression in which the researcher adds independent variables to the equation in an order determined by theory

191
Q

Structural Equation Modeling (SEM)

A

A statistical technique that quantifies how well sample data “fit” a theoretical model that hypothesises a set of relations among multiple variables

192
Q

Statistical (or Theoretical) Model

A

Hypothesized network of relations among multiple variables, often portrayed graphically

193
Q

Path

A

A term that statisticians use to describe the connection between two variables in a statistical model

194
Q

Path Analysis

A

A statistical method that examines a hypothesised model, usually by conducting a series of regression analyses that quantify the paths at each succeeding step in the model

195
Q

Manifest Variables

A

The variables in a study that we can observe and that are measured

196
Q

Latent Variables

A

The ideas we want to research but cannot directly measure

197
Q

Chi Square Statistic

A

Allows us to test relations between variables when they are nominal

198
Q

When to use nonparametric test

A
  1. When the dependent variable is nominal (whether or not a woman gets pregnant)
  2. When the dependent variable is ordinal
  3. When the sample size is small and we suspect that the underlying population of interest is skewed
199
Q

Chi Square Test for Goodness-of-Fit

A

A nonparametric hypothesis test used with one nominal variable

200
Q

Chi-Square Test for Independence

A

A nonparametric hypothesis test used with two nominal variables

201
Q

Example of Chi Square

A

Researchers reported that the best soccer players in the world were more likely to have been born early in the year than later. 52 elite youth players in Germany were born in January, February, or March. Only 4 players were born in October, November, or December

202
Q

Steps to conduct chi-square test for goodness-of-fit for Hypothesis Testing

A
  1. Identify populations, distribution, and assumptions
  2. State the null and research hypotheses
  3. Determine the characteristics of the comparison distribution (how many degrees of freedom)
  4. Determine the critical values or cutoffs (use the chi-square table, basis is degrees of freedom and p level)
  5. Calculate the test statistic
  6. Make a decision
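
The steps above can be sketched with the counts from the soccer example card (52 and 4). Treating these as the only two categories, with equal expected proportions under the null hypothesis, is an assumption made here for illustration.

```python
# Observed counts from the soccer example: born Jan-Mar vs. born Oct-Dec.
observed = [52, 4]
# Assumption: the null hypothesis expects equal proportions in the two categories.
expected_proportions = [0.5, 0.5]

total = sum(observed)
expected = [p * total for p in expected_proportions]

# Chi-square: sum of (observed - expected)^2 / expected over categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: number of categories minus 1.
df = len(observed) - 1
critical_value = 3.841  # chi-square table, df = 1, p level of 0.05
reject_null = chi_square > critical_value
print(round(chi_square, 2), df, reject_null)
```
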
203
Q

Steps in hypothesis testing for chi-square test for independence

A
  1. Identify populations, distribution, and assumptions
  2. State the null and research hypotheses
  3. Determine the characteristics of the comparison distribution
  4. Determine the critical values or cutoffs using the degrees of freedom and the p level
  5. Calculate the test statistic
  6. Make a decision
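
A sketch of the test statistic for a 2 x 2 table. The observed counts are hypothetical (they are not the values from any example in the deck); expected frequencies come from row total times column total divided by N.

```python
# Hypothetical observed counts: rows are groups, columns are outcomes.
observed = [[30, 70], [20, 80]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell: (row total * column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)
print(expected, round(chi_square, 3), df)
```
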
204
Q

Example of chi square test for independence

A

EXPECTED FREQUENCIES WITH TOTALS

               Pregnant    Not Pregnant
Clown
No Clown
205
Q

Relative risk/relative likelihood/relative chance

A

A measure created by making a ratio of two conditional proportions
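
A minimal sketch with hypothetical counts: each conditional proportion is the share of a group with the outcome, and relative risk is one proportion divided by the other.

```python
# Hypothetical counts (not from the deck).
treated_with_outcome, treated_total = 30, 100
control_with_outcome, control_total = 20, 100

# Conditional proportions: outcome rate within each group.
proportion_treated = treated_with_outcome / treated_total
proportion_control = control_with_outcome / control_total

# Relative risk: ratio of the two conditional proportions.
relative_risk = proportion_treated / proportion_control
print(round(relative_risk, 2))
```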

206
Q

Adjusted Standardized Residual

A

The difference between the observed frequency and the expected frequency for a cell in a chi-square research design, divided by the standard error

207
Q

Spearman Rank-Order Correlation Coefficient

A

A nonparametric statistic that quantifies the association between two ordinal variables; coefficient can range from -1 to +1; can indicate a strong correlation but not causation
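
A sketch using the common shortcut formula for ranks with no ties, 1 - 6(sum of D squared) / (N(N squared - 1)). The ranks are made up.

```python
# Hypothetical ranks on two ordinal variables (no tied ranks).
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]

n = len(rank_x)
# D is the difference between each participant's two ranks.
sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))

r_s = 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))
print(round(r_s, 2))
```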

208
Q

Wilcoxon Signed-Rank Test

A

Nonparametric hypothesis test used when there are two groups, a within-groups design, and an ordinal dependent variable

209
Q

Hypothesis Testing for Wilcoxon Signed Rank Test

A
  1. Identify the assumptions (differences between pairs must be ranked, random selection, difference scores should come from a symmetric population distribution)
  2. State the null and research hypotheses (only in words, not symbols)
  3. Determine the characteristics of the comparison distribution (T statistic; decide the cutoff or critical value; one-tailed or two-tailed test; determine the sample size)
  4. Determine the critical values (check the table)
  5. Calculate the test statistic
  6. Make the decision
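
Step 5 can be sketched as below. The difference scores are hypothetical, contain no zeros or ties, and are ranked so that the smallest absolute difference gets rank 1 (a common convention, assumed here).

```python
# Hypothetical difference scores (e.g., after minus before) for six pairs.
differences = [5, -2, 8, 3, -1, 6]

# Rank the differences by absolute value; rank 1 is the smallest.
ranked = list(enumerate(sorted(differences, key=abs), start=1))

# Sum the ranks separately for positive and negative differences.
sum_positive = sum(rank for rank, d in ranked if d > 0)
sum_negative = sum(rank for rank, d in ranked if d < 0)

# T is the smaller of the two sums of ranks.
t_statistic = min(sum_positive, sum_negative)
print(sum_positive, sum_negative, t_statistic)
```
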
210
Q

Mann-Whitney U Test

A

Nonparametric hypothesis test used when there are two groups, a between-groups design, and an ordinal dependent variable; U statistic

211
Q

Hypothesis Testing for Mann-Whitney U Test

A
  1. Identify the assumptions
  2. State the null and research hypotheses
  3. Determine the characteristics of the comparison distribution
  4. Determine the critical values, or cutoffs (we want the smaller of the two test statistics to be equal to or smaller than this critical value)
  5. Calculate the test statistics
  6. Make the decision
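
A sketch of the two U statistics from step 5, using the standard formula U = n1 n2 + n(n + 1)/2 - sum of ranks for that group. The scores are hypothetical and contain no ties.

```python
# Hypothetical scores for two independent groups (no tied scores).
group_a = [12, 15, 20]
group_b = [18, 25, 30]

# Rank all scores together; rank 1 is the lowest score.
combined = sorted(group_a + group_b)
rank_of = {score: rank for rank, score in enumerate(combined, start=1)}

n_a, n_b = len(group_a), len(group_b)
sum_ranks_a = sum(rank_of[s] for s in group_a)
sum_ranks_b = sum(rank_of[s] for s in group_b)

u_a = n_a * n_b + n_a * (n_a + 1) / 2 - sum_ranks_a
u_b = n_a * n_b + n_b * (n_b + 1) / 2 - sum_ranks_b

# The smaller U is the one compared with the critical value.
smaller_u = min(u_a, u_b)
print(u_a, u_b, smaller_u)
```
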
212
Q

Kruskal-Wallis H Test

A

A nonparametric hypothesis test used when there are more than two groups, a between-groups design, and an ordinal dependent variable; H statistic

213
Q

Hypothesis Testing for Kruskal-Wallis Test

A
  1. Identify the assumptions
  2. State the null and research hypotheses
  3. Determine the characteristics of the comparison distribution
  4. Determine the critical values, or cutoffs using a table, based on a chi square distribution with a p level of 0.05, and degrees of freedom
  5. Calculate the test statistic
  6. Make a decision
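
A sketch of steps 4 and 5 using the standard rank-sum form of H. The ranks are hypothetical, already assigned across all participants, and contain no ties.

```python
# Hypothetical ranks for three independent groups.
groups = {"g1": [1, 2, 3], "g2": [4, 5, 6], "g3": [7, 8, 9]}

n_total = sum(len(ranks) for ranks in groups.values())

# H = 12 / (N(N + 1)) * sum(R_j^2 / n_j) - 3(N + 1), where R_j is a group's rank sum.
h_statistic = (
    12 / (n_total * (n_total + 1))
    * sum(sum(ranks) ** 2 / len(ranks) for ranks in groups.values())
    - 3 * (n_total + 1)
)

# Degrees of freedom: number of groups minus 1.
df = len(groups) - 1
critical_value = 5.991  # chi-square table, df = 2, p level of 0.05
print(round(h_statistic, 2), df, h_statistic > critical_value)
```
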
214
Q

Bootstrapping

A

Statistical process in which the original sample data are used to represent the entire population, and we repeatedly take samples from the original sample data to form a confidence interval
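
A minimal sketch of a percentile bootstrap for the mean. The sample, the number of resamples (2000), the fixed seed, and the 95% level are all illustrative choices.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical original sample, treated as a stand-in for the population.
sample = [4, 8, 6, 5, 3, 9, 7, 5, 6, 7]
n_resamples = 2000

# Repeatedly resample with replacement and record each resample's mean.
boot_means = sorted(
    mean(random.choices(sample, k=len(sample))) for _ in range(n_resamples)
)

# 95% confidence interval: cut off the lowest and highest 2.5% of the means.
lower = boot_means[n_resamples * 25 // 1000]
upper = boot_means[n_resamples * 975 // 1000 - 1]
print(lower, upper)
```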