Final Exam Flashcards

1
Q

Descriptive statistics

A

Statistical tools to organize and summarize data
- information about a collection of observations (their central tendency)
- information about the variability in a set of observations
- information about the shape of a distribution of observations

2
Q

Inferential statistics

A

Statistical tools to generalize beyond collections (samples) of actual observations in order to make predictions and test hypotheses about the general population

3
Q

Population

A

Any complete collection of observations or potential observations (ENTIRE group of interest)
- population characteristics are called parameters
- μ, σ

4
Q

Real population

A

All potential observations are available at the time of sampling
- ex. anxiety scores of current participants in a meditation program

5
Q

Hypothetical population

A

One in which not all potential observations are available at the time of sampling

6
Q

Sample

A

Any smaller collection of actual observations drawn from a population
- sample characteristics are called statistics
- x̅, s

7
Q

Level of measurement

A

Specifies the extent to which a number, word, letter, etc. represents something in the world

8
Q

Nominal

A
  • Words, letters, or numerical codes
  • Observations are sorted into categories, no order
9
Q

Ordinal

A
  • Values have an inherent, logical order
  • No equal intervals
10
Q

Interval

A

The distance between consecutive points on the scale is the same all the way along the scale

11
Q

Ratio

A

Amounts or counts of quantitative data that reflect differences in degree based on equal intervals and a true zero

12
Q

Qualitative data

A

Consists of words, letters, or numerical codes that represent a class or category

13
Q

Quantitative data

A

Consists of numbers that represent an amount or a count

14
Q

Why is data type important?

A

We use different statistical tests depending on the type of data we have collected

15
Q

Frequency distribution

A

A collection of observations produced by sorting observations into classes and showing their frequency (f) of occurrence

16
Q

Ungrouped frequency distribution

A
  • Frequencies are tallied for each and every value
  • Each class has a single value
  • Only use these for data sets that have ≤ 20 values
17
Q

Grouped frequency distribution

A
  • Observations are sorted into classes of multiple values
  • Use for data sets with > 20 values
18
Q

Relative frequency

A

Shows the frequency of each class as a fraction of the total frequency for the entire distribution
- frequency per class/total

19
Q

Cumulative frequency

A

Shows the total number of observations in a class and all lower-ranking classes
- add up from the bottom

20
Q

Cumulative relative frequency

A

Shows the cumulative frequency of each class as a proportion of the total
- divide the cumulative frequency by the total
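
Quick check in Python: a minimal sketch of an ungrouped frequency table with relative, cumulative, and cumulative relative frequencies. The scores are made up for illustration, and the names (`scores`, `rows`) are not from the course.

```python
from collections import Counter

# Hypothetical quiz scores (illustrative data only)
scores = [3, 5, 4, 5, 2, 4, 5, 3, 4, 5]

freq = Counter(scores)        # f for each value (one value per class)
total = sum(freq.values())    # total frequency for the distribution

rows = []
cumulative = 0
for value in sorted(freq):    # lowest class first, so we add up from the bottom
    cumulative += freq[value]
    rows.append({
        "value": value,
        "f": freq[value],
        "relative_f": freq[value] / total,
        "cumulative_f": cumulative,
        "cumulative_relative_f": cumulative / total,
    })

for row in rows:
    print(row)
```

The last row's cumulative frequency should equal the total, and its cumulative relative frequency should be 1.0.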

21
Q

Percentile rank

A

Percentage of scores in the entire distribution with values equal to or smaller than that score
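
A minimal sketch of this definition in Python, assuming "similar or smaller" means "less than or equal to" (some textbooks count ties differently). The function name and the exam scores are hypothetical.

```python
def percentile_rank(scores, score):
    """Percentage of scores that are equal to or smaller than `score`."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical exam scores (illustrative data only)
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# 6 of the 10 scores are at or below 80, so the percentile rank is 60
print(percentile_rank(scores, 80))
```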

22
Q

Measures of central tendency

A

Means, medians, and modes

23
Q

Mean

A

The average
- sum of all scores/number of scores

24
Q

Median

A

The middle value when observations are ordered from smallest to largest (or vice versa)

25
Q

Mode

A

The most frequent score in a distribution
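
Quick check in Python: the three measures of central tendency on a small made-up data set, using the standard library's `statistics` module. The data are hypothetical.

```python
import statistics

# Hypothetical scores (illustrative data only)
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of scores / number of scores = 30 / 6
median = statistics.median(scores)  # even count, so the two middle values (3 and 5) are averaged
mode = statistics.mode(scores)      # 3 occurs most often

print(mean, median, mode)
```

Here the mean (5) sits above the median (4) because the single high score of 10 pulls the mean upward.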

26
Q

Variability

A

The degree to which scores are spread out across a distribution
- range
- variance
- standard deviation

27
Q

Range

A

Highest value – lowest value

28
Q

Variance

A

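A measure of how far data points fall from the mean: the average of the squared deviations from the mean
- population: σ^2 = SS/N
- sample: s^2 = SS/(n - 1)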

29
Q

Standard deviation

A

A measure of how dispersed the data is in relation to the mean
- σ = sqrt(sum of squares/N)

30
Q

Sum of squares

A

A statistical measure of deviation from the mean
- population: SS = Σ(x - μ)^2
- sample: SS = Σ(x - x̅)^2
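
A worked sketch in Python tying together range, sum of squares, variance, and standard deviation. The scores are made up; the sample versions divide by n - 1, which is the usual convention but is not stated on these cards.

```python
import math

# Hypothetical scores (illustrative data only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

value_range = max(scores) - min(scores)          # highest value - lowest value

mean = sum(scores) / n                           # 40 / 8 = 5.0
ss = sum((x - mean) ** 2 for x in scores)        # sum of squared deviations from the mean

# Treating the scores as a whole population
pop_variance = ss / n
pop_sd = math.sqrt(pop_variance)

# Treating the scores as a sample (assumption: n - 1 in the denominator)
sample_variance = ss / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(value_range, ss, pop_variance, pop_sd)
print(sample_variance, sample_sd)
```

For these scores SS = 32, so the population variance is 4 and the population standard deviation is 2.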

31
Q

Negatively skewed distribution

A

The majority of observations are at the high end of the distribution, with few low scores (the tail points toward the low end)
- ex. retirement ages, scores on an easy test

32
Q

Positively skewed distribution

A

Most scores are at the low end of the distribution, with few high scores
- ex. U.S. incomes, scores on a very difficult test

33
Q

The normal distribution

A
  • Most of the area under the curve falls in the middle
  • No skew, a bell curve
  • Symmetrical
  • Mean = median = mode
  • Half of scores fall on either side of the mean
  • Total area under the curve = 1.00 or 100%
  • X-axis is in the units of the original measurement (lbs, inches, mph)
    • ex. IQ, height, weight
34
Q

Standard normal distribution

A
  • X-axis is in standard deviation units (x-axis can be turned into Z-scores)
  • Mean is always 0
  • Standard deviation is always 1
35
Q

Z-score

A

A unit-free, standardized score that indicates how many standard deviations a score is above or below the mean
- can be positive or negative (unlike standard deviations; scores above the mean are positive, scores below the mean are negative)
- population: z = (x - μ)/σ
- sample: z = (x - x̅)/s
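
Quick check in Python: the z-score formula as a small function. The mean of 100 and standard deviation of 15 are hypothetical values chosen for easy arithmetic.

```python
def z_score(x, mean, sd):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical scale with mean 100 and standard deviation 15
print(z_score(130, 100, 15))  # two standard deviations above the mean
print(z_score(85, 100, 15))   # one standard deviation below the mean
```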

36
Q

Table A / Z Table

A

Provides z-scores and their associated areas under the curve

37
Q

How to use table A/the Z table

A
  • Sketch the problem, know what you’re looking for, and plan the solution
  • Calculate the necessary z-scores
  • Find the appropriate areas under the standard normal curve in table A
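
The steps above can be sketched in Python, with `statistics.NormalDist` standing in for the printed table. This is a software substitute, not the layout of Table A itself, and the height problem (mean 69 in, standard deviation 3 in) is made up.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # standard normal curve: mean 0, SD 1

# Hypothetical problem: what proportion of heights fall below 72 inches?
z = (72 - 69) / 3                        # step 2: calculate the z-score

area_below = std_normal.cdf(z)           # step 3: area to the left of z
area_above = 1 - area_below              # total area is 1.00, so the rest is to the right
area_between = std_normal.cdf(1) - std_normal.cdf(-1)   # area within 1 SD of the mean

print(round(area_below, 4), round(area_above, 4), round(area_between, 4))
```
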
38
Q

Correlation

A

The relationship between variables, and how paired values of two variables change together (ex. height and weight, years of education and annual income, medication and anxiety)
- described as positive or negative, strong, moderate, or weak

39
Q

Positive correlations

A

As one variable increases, the other increases (as one decreases, the other also decreases)

40
Q

Negative correlations

A
  • As one variable increases, the other decreases
  • As one variable decreases, the other increases
41
Q

Scatterplot

A

Graphs showing individual data points plotted as combinations of two variables
- useful for determining the direction of a relationship (negative or positive)
- useful for determining the strength of a relationship (strong, moderate, weak)

42
Q

Pearson’s r

A

Describes the strength of correlation and direction of the relationship
- r = (Σ ZxZy)/(n-1)
- ranges from -1 to +1
- direction indicated by sign (+ or -)
- strength indicated by value (0 = no relationship, ±1 = perfect relationship)
- 0 < |r| < .3 = weak
- .3 < |r| < .7 = moderate correlation
- |r| > .7 = strong correlation
- also called the Pearson correlation coefficient
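
A minimal sketch of the z-score formula for r in Python. The paired data are made up, and the handling of values exactly at .3 and .7 in `strength` is an assumption, since the cutoffs above leave those two points unassigned.

```python
import statistics

def pearson_r(xs, ys):
    """r = sum of (Zx * Zy) / (n - 1), using sample standard deviations."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)   # n - 1 in the denominator
    z_products = [((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

def strength(r):
    """Rough label for |r| (assumption: .3 counts as moderate, .7 as strong)."""
    size = abs(r)
    if size < 0.3:
        return "weak"
    if size < 0.7:
        return "moderate"
    return "strong"

# Hypothetical paired observations (illustrative data only)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r, 4), strength(r))
```

For these data r is about .77: positive in direction and strong by the cutoffs above.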

43
Q

Coefficient of determination (r^2)

A

The percentage of variance in one variable explained/predicted by the relationship between two variables
- ex. r^2 = (.94)^2 = .88
- 88% of the variation in psych GRE score is explained by the relationship between grades on a cognition final and psych GRE scores
- 1 - r^2 = (1 - .88) = .12 tells me that 12% of the variation in psych GRE scores is NOT explained by the relationship between grades on a cognition final and psych GRE scores
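
The arithmetic in this example, checked in Python (r = .94 is the value from the card; the variable names are just for illustration):

```python
r = 0.94

r_squared = r ** 2             # proportion of variance explained
unexplained = 1 - r_squared    # proportion of variance NOT explained

print(round(r_squared, 2), round(unexplained, 2))
```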

44
Q

Linear regression

A

Plots a straight line through a cluster of dots on a scatterplot, and uses that line to predict the value of one variable from the value of another

45
Q

Least squares regression line

A

Best fitting line for a set of data: the line that minimizes the sum of the squared vertical distances (prediction errors) from each data point to the line
- Y’ = bx + a
- Y’ = predicted value
- x = value for which we are predicting y
- b = slope of regression line = r(sqrt((SSy)/(SSx)))
- a = y-intercept of the regression line = ȳ - bx̄
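
A worked sketch of these formulas in Python. The data are made up, and r is computed here as SP / sqrt(SSx * SSy), where SP is the sum of the products of deviations; that is algebraically the same as the z-score formula for r but is an assumption about how r would be obtained.

```python
import math

# Hypothetical paired observations (illustrative data only)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

ss_x = sum((x - mean_x) ** 2 for x in xs)                       # SSx
ss_y = sum((y - mean_y) ** 2 for y in ys)                       # SSy
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))   # sum of products

r = sp / math.sqrt(ss_x * ss_y)      # Pearson's r
b = r * math.sqrt(ss_y / ss_x)       # slope of the regression line
a = mean_y - b * mean_x              # y-intercept of the regression line

def predict(x):
    """Y' = bx + a"""
    return b * x + a

print(round(b, 2), round(a, 2), round(predict(6), 2))
```

For these data the line is Y' = 0.6x + 2.2, so the predicted value at x = 6 is 5.8.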

46
Q

Standard error of the estimate

A

A rough measure of the average amount of prediction error, used to judge the accuracy of predictions
- Sy|x = sqrt((Σ (y - y’)^2)/(n-2))
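
A minimal sketch of this formula in Python on made-up data. The slope is computed as SP/SSx, which gives the same least squares slope as r(sqrt(SSy/SSx)); all names and data are hypothetical.

```python
import math

# Hypothetical paired observations (illustrative data only)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
ss_x = sum((x - mean_x) ** 2 for x in xs)
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

b = sp / ss_x                 # least squares slope
a = mean_y - b * mean_x       # y-intercept

# Squared prediction errors: (y - y')^2 for each observation
squared_errors = [(y - (b * x + a)) ** 2 for x, y in zip(xs, ys)]

standard_error_of_estimate = math.sqrt(sum(squared_errors) / (n - 2))
print(round(standard_error_of_estimate, 4))
```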

47
Q

Independent variable

A

A variable (or treatment) manipulated by the investigator in an experiment

48
Q

Dependent variable

A

The variable believed to be influenced (changed) by the IV

49
Q

Sampling distribution of the mean

A

Refers to the probability distribution of means for all possible random samples of a given size from some population
- mean = same as population mean
- shape will approximate a normal curve if sample size is sufficiently large (central limit theorem)

50
Q

Standard error of the mean

A

The sampling distribution’s standard deviation
- σx̅ = σ / √n
- measures variability in the sampling distribution
- extent to which sample means vary around their mean
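
Quick check in Python: the standard error shrinks as sample size grows. The value σ = 15 and the sample sizes are hypothetical.

```python
import math

def standard_error_of_mean(sigma, n):
    """Standard deviation of the sampling distribution of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation of 15
print(standard_error_of_mean(15, 25))    # n = 25
print(standard_error_of_mean(15, 100))   # quadrupling n halves the standard error
```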

51
Q

Null hypothesis

A

A statistical hypothesis that nothing special is going on with respect to a specific characteristic of the underlying population; the hypothesis of no difference

52
Q

Alternative hypothesis

A

Opposite of null; states that the sample is special or different from the population

53
Q

Significance level

A

Indicates how rare a sample mean must be to reject the null hypothesis
- α (alpha)

54
Q

Type I error (α)

A

Rejecting a null hypothesis when it is in fact true

55
Q

Type II Error (β)

A

Incorrectly retaining (failing to reject) a null hypothesis when it is in fact false; β is the probability of making this error

56
Q

Confidence interval

A

A range of values that, with a known degree of certainty, includes an unknown population characteristic
- x̅ ± (Zconf)(σx̅)
- Zconf is the critical z value used in the decision rule
- a 95% CI is a range of values that in the long run would contain the parameter of interest 95% of the time
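
A minimal sketch of the interval formula in Python, assuming the population standard deviation σ is known. `NormalDist().inv_cdf` is used in place of a table lookup for Zconf; the sample mean, σ, and n are made up.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """sample mean +/- (Zconf)(standard error of the mean)."""
    # Critical z that leaves (1 - confidence) / 2 in each tail, about 1.96 for 95%
    z_conf = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sigma / math.sqrt(n)
    margin = z_conf * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical: sample mean 105, sigma 15, n 25 (standard error = 3)
low, high = z_confidence_interval(105, 15, 25)
print(round(low, 2), round(high, 2))
```

With these numbers the 95% interval runs from about 99.12 to 110.88.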

57
Q

Cohen’s d

A

Tells you about the observed mean difference in terms of SD units
- (mean 1 – mean 2)/standard deviation
- .2 = small
- .5 = medium
- .8 = large
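
Quick check in Python. The card does not say which standard deviation goes in the denominator, so it is passed in as a parameter; treating .2, .5, and .8 as lower cutoffs in `effect_label` is an assumption, and all names and numbers are hypothetical.

```python
def cohens_d(mean_1, mean_2, sd):
    """Mean difference expressed in standard deviation units."""
    return (mean_1 - mean_2) / sd

def effect_label(d):
    """Rough label (assumption: .2, .5, .8 are treated as lower cutoffs)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"

# Hypothetical means on a scale with standard deviation 15
d = cohens_d(105, 100, 15)
print(round(d, 2), effect_label(d))
```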

58
Q

T-test

A

Used when we don’t know the population standard deviation (σ) and must estimate it from the sample
- t = (x̄ - μx̄)/Sx̄
- Sx̄ = estimated standard error = s/sqrt(n)
- x̄ = sample mean
- μx̄ = hypothesized population mean
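
A minimal sketch of a one-sample t ratio in Python. The sample and the hypothesized mean are made up; the degrees of freedom (n - 1) are returned as well because they are needed to look up a critical t, although they are not listed on this card.

```python
import math
import statistics

def one_sample_t(sample, hypothesized_mean):
    """t = (sample mean - hypothesized mean) / estimated standard error."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    s = statistics.stdev(sample)           # sample SD (n - 1 in the denominator)
    estimated_se = s / math.sqrt(n)        # estimated standard error = s / sqrt(n)
    t = (sample_mean - hypothesized_mean) / estimated_se
    return t, n - 1                        # t ratio and degrees of freedom

# Hypothetical sample tested against a hypothesized population mean of 3
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t, df = one_sample_t(sample, 3)
print(round(t, 3), df)
```

Whether this t is rare enough to reject the null hypothesis depends on the critical t for the chosen significance level and these degrees of freedom.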