Statistics for People Who (Think They) Hate Statistics Flashcards

1
Q

Analysis of variance

A

A test for the difference between two or more means. A simple analysis of variance (or ANOVA) has only one independent variable, whereas a factorial analysis of variance tests the means of more than one independent variable. One-way analysis of variance looks for differences between the means of more than two groups.

See One-way analysis of variance and Simple analysis of variance

2
Q

Arithmetic mean

A

A measure of central tendency calculated by summing all the scores and dividing by the number of scores.

(See Mean)

3
Q

Asymptotic

A

The quality of the normal curve such that the tails never touch the horizontal axis.

4
Q

Average

A

The most representative score in a set of scores.

5
Q

Bell-shaped curve

(normal curve)

A

A distribution of scores that is symmetrical about the mean, the median, and the mode and has asymptotic tails.

Often called the normal curve.

6
Q

Class interval

A

A fixed range of values, used in the creation of a frequency distribution.

7
Q

Coefficient of alienation

A

The amount of variance in one variable that is not accounted for by the variance in another variable.

Also known as Coefficient of nondetermination

8
Q

Coefficient of determination

A

The amount of variance in one variable that is accounted for by the variance in another variable.
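
A minimal sketch of how this relates to the coefficient of nondetermination, assuming the common convention that the coefficient of determination is the squared correlation (r squared) and that the unexplained share is 1 minus that value. The correlation of 0.70 is a made-up illustrative number.

```python
# Hypothetical correlation between two variables (illustrative value only).
r = 0.70

# Coefficient of determination: share of variance accounted for.
r_squared = r ** 2

# Coefficient of nondetermination (alienation): share left unexplained.
unexplained = 1 - r_squared

print(f"determination = {r_squared:.2f}, nondetermination = {unexplained:.2f}")
```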

9
Q

Coefficient of nondetermination

A

The amount of variance in one variable that is not accounted for by the variance in another variable.

Also known as Coefficient of alienation

10
Q

Concurrent criterion validity

A

How well a test outcome is consistent with a criterion that exists in the present.

11
Q

Confidence interval

A

The best estimate of the range of a population value given the sample value.

12
Q

Construct-based validity

A

How well a test reflects an underlying idea, such as intelligence or aggression.

13
Q

Content validity

A

A type of validity that examines how well a test samples a universe of items.

14
Q

Correlation coefficient

A

A numerical index that reflects the relationship between two variables, specifically how the value of one variable changes when the value of the other variable changes.

See Pearson product-moment correlation

15
Q

Correlation matrix

A

A table showing correlation coefficients among more than two variables.

16
Q

Criterion

A

The outcome variable or the predicted variable in a regression equation.

See Dependent variable

17
Q

Criterion-based validity

A

A type of validity that examines how well a test reflects some criterion that exists in either the present (concurrent) or the future (predictive).

18
Q

Critical value

A

The value resulting from application of a statistical test that is necessary for rejection (or nonacceptance) of the null hypothesis.

19
Q

Cross-tabulation table

A

A table that shows frequencies by two or more variables. The levels of one variable become column labels, and the levels of the other variable become row labels. Often called a crosstab.

20
Q

Cumulative frequency distribution

A

A frequency distribution that shows frequencies for class intervals along with the cumulative frequency at each.
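
As a rough illustration, the running total can be built with `itertools.accumulate`. The class intervals and frequencies below are invented for the example.

```python
from itertools import accumulate

# Made-up class intervals and the number of scores falling in each.
class_intervals = ["0-4", "5-9", "10-14", "15-19"]
frequencies = [3, 7, 5, 2]

# Running total of frequencies, one entry per class interval.
cumulative = list(accumulate(frequencies))

for interval, freq, cum in zip(class_intervals, frequencies, cumulative):
    print(f"{interval:>6}  f = {freq:<3} cumulative f = {cum}")
```

The last cumulative value always equals the total number of scores.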

21
Q

Data

A

A record of an observation or an event such as a test score, a grade in math class, or response time.

22
Q

Data mining

A

Examining large data sets for patterns.

23
Q

Data point

A

An observation.

24
Q

Data set

A

A set of data points.

25
**Degrees of freedom**
A value, which is different for different statistical tests, that approximates the sample size or number of individual cells in an experimental design.
26
**Dependent variable**
The outcome variable or the predicted variable in a regression equation. See ***Criterion***
27
**Descriptive statistics**
Values that organize and describe the characteristics of a collection of data, sometimes called a *data set.*
28
**Direct correlation**
A positive correlation where the values of both variables change in the same direction. See ***Positive correlation***
29
**Directional research hypothesis**
A research hypothesis that posits a difference between groups in one direction. See ***Nondirectional research hypothesis***
30
**Effect size**
A measure of the magnitude of difference between two groups, usually calculated as Cohen's *d*.
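
A minimal sketch of one common way to compute Cohen's *d*, assuming the pooled sample standard deviation as the denominator (other variants exist). The function name `cohens_d` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Cohen's d using the pooled sample standard deviation (an assumed variant)."""
    n1, n2 = len(group1), len(group2)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)

# Made-up scores for two groups.
group_a = [4, 5, 6, 7, 8]
group_b = [2, 3, 4, 5, 6]
d = cohens_d(group_a, group_b)
print(f"Cohen's d = {d:.2f}")
```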
31
**Error in prediction**
The difference between the observed score (*Y*) and the predicted score (*Y′*). See ***Standard error of estimate***
32
**Error score**
The part of a test score that is random and contributes to the unreliability of a test.
33
**Exabyte**
1,152,921,504,606,846,976 bytes of data - lots and lots of data, and the amount of data in the world grew just as you read this. Wow.
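
A quick check of that figure, assuming the binary definition of an exabyte (2 to the 60th power bytes); the decimal definition would instead be 10 to the 18th.

```python
# Binary exabyte: 2 raised to the 60th power (assumed definition).
exabyte_bytes = 2 ** 60

# Print with thousands separators for readability.
print(f"{exabyte_bytes:,}")
```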
34
**Factorial analysis of variance**
An analysis of variance with more than one factor or independent variable.
35
**Factorial design**
A research design used to explore more than one treatment variable.
36
**Frequency distribution**
A method for illustrating how often scores occur in groups called class intervals.
37
**Frequency polygon**
A graphical representation of a frequency distribution that uses a continuous line to show the number of values that fall within a *class interval*.
38
**Goodness-of-fit test**
A chi-square test on one dimension, which examines whether the distribution of frequencies is different from what one would expect by chance.
39
**Histogram**
A graphical representation of a frequency distribution that uses bars of different heights to show the number of values that fall within each *class interval*.
40
**Hypothesis**
An if-then statement of conjecture that relates variables to one another and is used to reflect the general problem statement or question that is the motivation for asking a research question.
41
**Independent variable**
The treatment variable that is manipulated or the predictor variable in a regression equation. See ***Predictor***
42
**Indirect correlation**
A negative correlation where the values of variables move in opposite directions. See ***Negative correlation***
43
**Inferential statistics**
Tools that are used to infer characteristics of a population based on data from a sample of that population.
44
**Interaction effect**
The outcome where the effect of one factor differs across the levels of another factor.
45
**Internal consistency reliability**
A type of reliability that examines whether items on a test measure only one dimension, construct, or area of interest.
46
**Interrater reliability**
A type of reliability that examines whether observers are consistent with one another.
47
**Interval level of measurement**
A level of measurement that places a variable's values into categories that are equidistant from each other, as when points are evenly spaced along a scale.
48
**Kurtosis**
The quality of a distribution that defines how flat or peaked it is.
49
**Leptokurtic**
The quality of a distribution that is relatively peaked compared with a normal distribution.
50
**Line of best fit**
The regression line that best fits the observed scores and minimizes the error in prediction.
51
**Linear correlation**
A correlation that is best expressed visually as a straight line.
52
**Main effect**
In analysis of variance, when a factor or an independent variable has a significant effect upon the outcome variable.
53
**Mean**
A type of average calculated by summing values and dividing that sum by the number of values. Also known as ***Arithmetic mean***.
54
**Mean deviation**
The average deviation for all scores from the mean of a distribution, calculated as the sum of the absolute value of the scores' deviations from the mean divided by the number of scores.
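
The calculation described above can be sketched as follows. The helper name `mean_deviation` and the scores are made up for illustration.

```python
from statistics import mean

def mean_deviation(scores):
    """Sum of absolute deviations from the mean, divided by the number of scores."""
    m = mean(scores)
    return sum(abs(x - m) for x in scores) / len(scores)

# Made-up scores with a mean of 6.
scores = [2, 4, 6, 8, 10]
md = mean_deviation(scores)
print(f"mean deviation = {md}")
```

Absolute values are needed because the signed deviations from the mean always sum to zero.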
55
**Measures of central tendency**
The mean, the median, and the mode.
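
A short sketch computing all three with Python's standard `statistics` module, using invented scores that include one extreme value.

```python
from statistics import mean, median, mode

# Made-up scores; 11 is noticeably more extreme than the rest.
scores = [3, 5, 5, 6, 11]

average = mean(scores)    # sum of scores divided by the number of scores
middle = median(scores)   # midpoint of the ordered scores
most_common = mode(scores)  # most frequently occurring score

print(average, middle, most_common)
```

Here the extreme score pulls the mean above the median and the mode.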
56
**Median**
The midpoint in a set of values, such that 50% of the cases in a distribution fall below the median and 50% fall above it.
57
**Midpoint**
The central point in a class interval.
58
**Mode**
The most frequently occurring score in a distribution.
59
**Multiple regression**
A statistical technique whereby several variables are used to predict one.
60
**Negative correlation**
A correlation where the values of variables move in opposite directions. See ***Indirect correlation***
61
**Nominal level of measurement**
The most gross level of measurement by which a variable's value can be placed in one and only one category.
62
**Nondirectional research hypothesis**
A research hypothesis that posits a difference between groups but not in either direction. See ***Directional research hypothesis***
63
**Nonparametric statistics**
Distribution-free statistics that do not require the same assumptions as do parametric statistics. See ***Parametric statistics***
64
**Normal curve** **(Bell-shaped curve)**
A distribution of scores that is symmetrical about the mean, the median, and the mode and has asymptotic tails. Often called the ***Bell-shaped curve.***
65
**Null hypothesis**
A statement of equality between sets of variables. See ***Research hypothesis***
66
**Observed score**
The score that is recorded or observed. See ***True score***
67
**Obtained value**
The value that results from the application of a statistical test. See ***Test statistic value***
68
**Ogive**
A visual representation of a cumulative frequency distribution.
69
**One-sample** *z* **test**
Used to compare a sample mean to a population mean.
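
A minimal sketch, assuming the usual textbook form in which the difference between the sample mean and the population mean is divided by the standard error of the mean (population standard deviation over the square root of the sample size). All numbers are made up.

```python
from math import sqrt

# Hypothetical values for illustration only.
sample_mean = 104
population_mean = 100
population_sd = 15
n = 36

# Standard error of the mean: population SD scaled by the sample size.
standard_error = population_sd / sqrt(n)

# z expresses the mean difference in standard-error units.
z = (sample_mean - population_mean) / standard_error
print(f"z = {z:.2f}")
```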
70
**One-tailed test**
A directional test, reflecting a directional hypothesis.
71
**One-way analysis of variance**
A test for the difference between two or more means. A simple analysis of variance (or ANOVA) has only one independent variable, whereas a factorial analysis of variance tests the means of more than one independent variable. One-way analysis of variance looks for differences between the means of more than two groups. See ***Analysis of variance***
72
**Ordinal level of measurement**
A level of measurement that places a variable's value into a category and assigns that category an order with respect to other categories.
73
**Outliers**
Those scores in a distribution that are noticeably much more extreme than the majority of scores. Whether a score is an outlier or not is usually an arbitrary decision made by the researcher.
74
**Parallel forms reliability**
A type of reliability that examines consistency across different forms of the same test.
75
**Parametric statistics**
Statistics used for the inference from a sample to a population that assume the variances of each group are similar and that the sample is large enough to represent the population. See ***Nonparametric statistics***
76
**Partial correlation**
A numerical index that reflects the relationship between two variables with the removal of the influence of a third variable (called a mediating or confounding variable).
77
**Pearson product-moment correlation**
A numerical index that reflects the relationship between two variables, specifically how the value of one variable changes when the value of the other variable changes. See ***Correlation coefficient***
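
One way to compute this index from deviation scores is sketched below. The function name `pearson_r` and the data are hypothetical, and the function assumes neither variable is constant.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from sums of squared deviations and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up paired scores.
x = [1, 2, 3, 4, 5]
y_same = [2, 4, 6, 8, 10]      # moves in the same direction as x
y_opposite = [10, 8, 6, 4, 2]  # moves in the opposite direction

print(pearson_r(x, y_same), pearson_r(x, y_opposite))
```

The first pair illustrates a direct (positive) correlation and the second an indirect (negative) one.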
78
**Percentile rank**
The percentage of cases equal to and below a particular score in a distribution or set of scores.
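
Following that definition literally, a small sketch (the helper name `percentile_rank` and the scores are invented):

```python
def percentile_rank(scores, score):
    """Percentage of cases equal to or below the given score."""
    at_or_below = sum(1 for x in scores if x <= score)
    return 100 * at_or_below / len(scores)

# Made-up set of ten test scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
pr = percentile_rank(scores, 80)
print(f"percentile rank of 80 = {pr}")
```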
79
**Pivot table**
A tool in statistical software, such as SPSS or Excel, that allows the user to easily manipulate the rows, columns, and frequencies included in cross-tabulation tables.
80
**Platykurtic**
The quality of a distribution that is relatively flat compared with a normal distribution.
81
**Population**
All the possible subjects or cases of interest. See ***Sample***
82
**Positive correlation**
A correlation where the values of both variables change in the same direction. See ***Direct correlation***
83
**Post hoc**
After the fact, referring to tests done to determine the true source of a difference among three or more groups.
84
**Predictor**
The treatment variable that is manipulated or the predictor variable in a regression equation. (See ***Independent variable***)
85
**Range**
The positive difference between the highest and lowest score in a distribution. It is a gross measure of variability. Exclusive range is the highest score minus the lowest score. Inclusive range is the highest score minus the lowest score plus 1.
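
Both versions of the range, computed for a made-up set of scores:

```python
# Made-up scores for illustration.
scores = [12, 15, 20, 31, 47]

# Exclusive range: highest score minus lowest score.
exclusive_range = max(scores) - min(scores)

# Inclusive range: the exclusive range plus 1.
inclusive_range = exclusive_range + 1

print(exclusive_range, inclusive_range)
```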
86
**Ratio level of measurement**
A level of measurement defined as having an absolute zero.
87
**Regression equation**
The equation that defines the points and the line that are closest to the observed scores.
88
**Regression line**
The line drawn based on values in a regression equation. Also known as a ***trend line***.
89
**Reliability**
The consistency of a test.
90
**Research hypothesis**
A statement of inequality between two variables. See ***Null hypothesis***
91
**Sample**
A subset of a population. See ***Population***
92
**Sampling error**
The difference between sample and population values.
93
**Scales of measurement**
Different ways of categorizing measurement outcomes: nominal, ordinal, interval, and ratio.
94
**Scattergram or scatterplot**
A plot of paired data points on an x-axis and y-axis, used to visually represent a correlation.
95
**Significance level**
The risk set by the researcher for rejecting a null hypothesis when it is true. See ***Statistical significance***
96
**Simple analysis of variance**
A test for the difference between two or more means. A simple analysis of variance (or ANOVA) has only one independent variable, whereas a factorial analysis of variance tests the means of more than one independent variable. One-way analysis of variance looks for differences between the means of more than two groups. See ***Analysis of variance*** and ***One-way analysis of variance***
97
**Skew or skewness**
The quality of a distribution that defines the disproportionate frequency of certain scores. A longer right tail than left corresponds to a smaller number of occurrences at the high end of the distribution; this is a ***positively skewed distribution***. A shorter right tail than left corresponds to a larger number of occurrences at the high end of the distribution; this is a ***negatively skewed distribution***.
98
**Source table**
An analysis of variance summary table that lists sources of variance.
99
**Standard deviation**
The average amount of variability in a set of scores or the scores' average deviation from the mean.
100
**Standard error of estimate**
A measure of accuracy in prediction that reflects variability about the regression line. See ***Error in prediction***
101
**Standard score**
A raw score that is adjusted for the mean and standard deviation of the distribution from which the raw score comes. (See ***z score***)
102
**Statistical significance**
The risk set by the researcher for rejecting a null hypothesis when it is true. (See ***Significance level***)
103
**Statistics**
A set of tools and techniques used to describe, organize, and interpret information or data.
104
**Test of independence**
A chi-square test of two dimensions or more that examines whether the distribution of frequencies on a variable is independent of other variables.
105
**Test-retest reliability**
A type of reliability that examines a test's consistency over time.
106
**Test statistic value**
The value that results from the application of a statistical test. (See ***Obtained value***)
107
**Trend line**
The line drawn based on values in a regression equation. Also known as a ***regression line***.
108
**True score**
The score that, if it could be observed, would reflect the actual ability or behavior being measured. See ***Observed score***
109
**Two-tailed test**
A nondirectional test, reflecting a nondirectional hypothesis.
110
**Type I error**
The probability of rejecting a null hypothesis when it is true.
111
**Type II error**
The probability of accepting a null hypothesis when it is false.
112
**Unbiased estimate**
A conservative estimate of a population parameter.
113
**Validity**
How well a test measures what it says it does.
114
**Variability**
How much scores differ from one another or, put another way, the amount of spread or dispersion in a set of scores.
115
**Variance**
The square of the standard deviation and another measure of a distribution's spread or dispersion.
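
A brief sketch of that relationship using Python's `statistics` module, assuming the sample form of both measures (dividing by n minus 1). The scores are made up.

```python
from statistics import stdev, variance

# Made-up scores with a mean of 6.
scores = [2, 4, 6, 8, 10]

# Sample variance: sum of squared deviations from the mean over (n - 1).
var = variance(scores)

# Sample standard deviation: the square root of the variance.
sd = stdev(scores)

print(f"variance = {var}, standard deviation = {sd:.3f}")
```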
116
**Y′ or Y prime**
The predicted Y value in a regression equation.
117
***z* score**
A raw score that is adjusted for the mean and standard deviation of the distribution from which the raw score comes. See ***Standard score***
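
A minimal sketch of the adjustment, assuming the common form in which the raw score's deviation from the mean is divided by the sample standard deviation. The helper name `z_score` and the scores are hypothetical.

```python
from statistics import mean, stdev

def z_score(raw, scores):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean(scores)) / stdev(scores)

# Made-up distribution with a mean of 6.
scores = [2, 4, 6, 8, 10]
z = z_score(10, scores)
print(f"z = {z:.2f}")
```

A raw score equal to the mean has a *z* score of 0, and scores below the mean have negative *z* scores.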