Statistics Flashcards

1
Q

nominal data

A

involves tallying people to see which non-ordered category each person falls into
e.g. sex, voting preference, ethnicity

2
Q

ordinal data

A

involves tallying people to see which ordered category each person falls into
group means cannot be calculated from ordinal data

3
Q

interval data

A

involves obtaining numerical scores for each person, where score values have equal intervals
either no zero score (e.g. IQ scores, t-scores) or zero is not absolute (e.g. temperature)
group mean can be calculated from interval data

4
Q

ratio data

A

involves obtaining numerical scores for each person, where scores have equal intervals and an absolute zero
e.g. savings in bank, scores on EPPP, number of children, weight
comparisons can be made across score values (e.g. $10 is twice as much as $5)

5
Q

measures of central tendency

A

mean, median, mode
best measure of central tendency typically the mean
when data skewed or there are some very extreme scores present, median preferable

6
Q

standard deviation

A

measure of average deviation (or spread) from the mean in a given set of scores
square root of the variance
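As a rough illustration of how the two measures relate, the sketch below uses Python's standard `statistics` module on a made-up set of scores. The data are hypothetical, and using the population formulas is an assumption; the sample versions divide by N-1 instead of N.

```python
import math
import statistics

# Hypothetical scores, chosen only so the arithmetic comes out cleanly
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)           # arithmetic mean of the scores
variance = statistics.pvariance(scores)  # population variance: mean squared deviation
sd = statistics.pstdev(scores)           # population SD: square root of the variance

print(mean, variance, sd)
print(math.isclose(sd, math.sqrt(variance)))
```

Squaring `sd` recovers `variance`, which is the relationship the variance card states.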

7
Q

variance

A

standard deviation squared

8
Q

range

A

crudest measure of variability

difference between highest and lowest value obtained

9
Q

positive skew

A

higher proportion of scores in the lower range of values
mode has lowest value, mean has highest value
(bump on left)
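A small sketch of the mode/mean ordering in a positively skewed set of scores. The scores are invented for illustration only.

```python
import statistics

# Hypothetical positively skewed scores: most values are low, one is very high
scores = [1, 1, 1, 2, 2, 3, 4, 10]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # middle score
mean = statistics.mean(scores)      # pulled toward the long right tail

print(mode, median, mean)
print(mode < median < mean)
```

Reversing the tail (a few very low scores among mostly high ones) would reverse the ordering, as in negative skew.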

10
Q

negative skew

A

higher proportion of scores in the higher range of values
mean has lowest value, mode has highest value
(bump on right)

11
Q

kurtosis

A

how peaked a distribution is
leptokurtic distribution - very sharp peak
platykurtic - flattened

12
Q

norm-referenced score

A

provides information on how the person scored relative to the group
e.g. percentile rank

13
Q

criterion-referenced or domain-referenced score

A

e.g. percentage correct

14
Q

standard scores

A

based on the standard deviation of the sample

e.g. z-scores, t-scores, IQ scores, SAT scores, EPPP scores

15
Q

z-scores

A

mean of zero, SD of one
shape of z-score distribution always identical to shape of the raw score distribution
useful because correspond directly to percentile ranks (ONLY IF distribution is normal) and easy to calculate from raw score data
transforming raw scores into z-scores does not normalize distribution

16
Q

z-score formula

A

z=(score-mean)/(SD)
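The formula can be sketched in a few lines of Python. The function name `z_score` and the IQ-style numbers (mean 100, SD 15) are illustrative assumptions, and the percentile step is only valid if the underlying distribution is normal.

```python
from statistics import NormalDist

def z_score(score, mean, sd):
    """Convert a raw score to a z-score: (score - mean) / SD."""
    return (score - mean) / sd

# Hypothetical IQ-style scale with mean 100 and SD 15
z = z_score(115, 100, 15)

# Percentile rank from z is only meaningful if the distribution is normal
percentile = NormalDist().cdf(z) * 100

print(z, round(percentile, 1))
```

A score one SD above the mean gives z = 1.0, which sits at roughly the 84th percentile of a normal distribution.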

17
Q

standard error of the mean

A

if researcher were to take many, many samples of equal size and plot the mean IQ scores of these samples, researcher would get normal distribution of means
any spread or deviation in these means is error
average amount of deviation = standard error of the mean

18
Q

standard error of the mean formula

A

SD(population) / SQRT (N)
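A minimal sketch of the formula, with hypothetical values (population SD of 15 and two sample sizes) chosen to show that larger samples shrink the standard error.

```python
import math

def standard_error_of_mean(population_sd, n):
    """Standard error of the mean: population SD divided by the square root of N."""
    return population_sd / math.sqrt(n)

sem_25 = standard_error_of_mean(15, 25)    # N = 25
sem_100 = standard_error_of_mean(15, 100)  # N = 100

print(sem_25, sem_100)
```

Quadrupling the sample size halves the standard error, because N enters through its square root.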

19
Q

central limit theorem

A

assuming an infinite number of equal sized samples are drawn from the population, and the means of these samples are plotted, a normal distribution of the means will result
tells researcher how likely it is that particular mean will be obtained just by chance - can calculate whether the obtained mean is most likely due to treatment or experimental effects or to chance (sampling error, random error)
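The idea can be approximated with a small simulation. This is only a sketch: it draws a finite number of samples (2000 of size 30, both arbitrary choices) from a deliberately skewed population (exponential with mean 1 and SD 1), and the fixed seed is there only to make the run repeatable.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 30
NUM_SAMPLES = 2000

# Population: exponential with mean 1 and SD 1 (strongly skewed, not normal)
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sem = 1.0 / math.sqrt(SAMPLE_SIZE)  # population SD / sqrt(N)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(predicted_sem, 3))
```

The sample means should cluster around the population mean, with a spread close to the standard error of the mean, even though the population itself is skewed.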

20
Q

rejection region

A

aka rejection of unlikely values
size of rejection region corresponds to alpha level e.g. when alpha is .05, rejection region is 5% of curve
when obtained values fall in rejection region, null hypothesis rejected, researcher concludes treatment did have an effect
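A sketch of the decision rule, assuming a z-test on a standard normal curve. The helper names `critical_z` and `reject_null` are hypothetical, and the one-tailed case here assumes the rejection region is in the upper tail.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Boundary of the rejection region on a standard normal curve."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def reject_null(z_obtained, alpha=0.05, two_tailed=True):
    """True if the obtained z falls in the rejection region."""
    z_crit = critical_z(alpha, two_tailed)
    if two_tailed:
        return abs(z_obtained) >= z_crit
    return z_obtained >= z_crit  # upper-tail test (assumption)

print(round(critical_z(0.05), 2))                    # two-tailed cutoff
print(round(critical_z(0.05, two_tailed=False), 3))  # one-tailed cutoff
print(reject_null(2.1), reject_null(1.5))
```

With alpha at .05, the two-tailed region splits the 5% across both tails, so its cutoff is further from zero than the one-tailed cutoff.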

21
Q

Type I error

A

mistakenly rejecting null (differences found when they don’t exist)
corresponds to alpha

22
Q

Type II error

A

mistakenly accepting null (differences not found, but they do exist)
corresponds to beta

23
Q

power

A

defined as ability to correctly reject the null
increased when sample size is large, magnitude of intervention is large, random error is small, statistical test is parametric, test is one-tailed
power = 1-beta
as alpha increases, so does power
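These relationships can be checked numerically for one simple case. The sketch assumes a one-tailed z-test with a known population SD; the function name `z_test_power` and the effect size, SD, and sample sizes are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha):
    """Power of a one-tailed z-test: probability of correctly rejecting the null."""
    z_crit = NormalDist().inv_cdf(1 - alpha)  # edge of the rejection region
    shift = effect * math.sqrt(n) / sigma     # true effect in standard-error units
    return 1 - NormalDist().cdf(z_crit - shift)

power_base = z_test_power(effect=5, sigma=15, n=25, alpha=0.05)
power_big_alpha = z_test_power(effect=5, sigma=15, n=25, alpha=0.10)
power_big_n = z_test_power(effect=5, sigma=15, n=100, alpha=0.05)

beta = 1 - power_base  # probability of a Type II error

print(round(power_base, 3), round(power_big_alpha, 3), round(power_big_n, 3))
```

Raising alpha or the sample size both raise power in this setup, and beta falls by the same amount that power rises.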

24
Q

non-parametric tests

A

e.g. Chi-square, Mann-Whitney, Wilcoxon

if DV is nominal or ordinal

25
Q

parametric tests

A

e.g. t-test, ANOVA

if DV is interval or ratio

26
Q

assumptions of parametric tests

A

homoscedasticity - there should be similar variability or SD in the different groups
data are normally distributed

27
Q

Kolmogorov-Smirnov test

A

same qualifications as independent samples or single sample t-test, except it’s a non-parametric test
1 IV, 1 DV
1 or 2 independent groups

28
Q

Wilcoxon (sign rank)

A

same qualifications as matched t-test, except it’s a non-parametric test
1 IV, 1 DV
2 correlated groups

29
Q

Kruskal-Wallis

A

same qualifications as 1-way ANOVA, except it’s a non-parametric test
1 IV, 1 DV
>2 independent groups

30
Q

Friedman test

A

same qualifications as 1-way repeated measures ANOVA, except it’s a non-parametric test
1 IV, 1 DV
>2 correlated groups

31
Q

single sample chi-square test

description and degrees of freedom

A

nominal data collected for one independent variable
e.g. 100 psychologists sampled as to voting preference
df = #columns - 1 (in example, 3-1=2 df)
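A sketch of the statistic and its degrees of freedom for the voting-preference example. The observed counts are invented, and equal expected frequencies across the three categories is an assumption about the null hypothesis.

```python
def chi_square_single_sample(observed):
    """Chi-square statistic and df for one nominal variable."""
    n = sum(observed)
    expected = n / len(observed)  # equal expected counts under the null (assumption)
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    df = len(observed) - 1        # number of columns minus 1
    return chi2, df

# Hypothetical counts: 100 psychologists across 3 voting preferences
observed = [50, 30, 20]
chi2, df = chi_square_single_sample(observed)

print(round(chi2, 2), df)
```

The obtained chi-square would then be compared with the critical value for 2 df at the chosen alpha level.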

32
Q

multiple sample chi-square

A

nominal data collected for two IVs
e.g. 100 psychologists sampled for voting preference and ethnicity
df = (#rows - 1)(#columns-1)
in example (3-1)(5-1) = 2X4 = 8
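A sketch for the two-variable case, where expected counts come from the row and column totals. The 3 x 5 table below is hypothetical (3 voting preferences by 5 ethnic groups, 100 people in total), as is the function name.

```python
def chi_square_two_way(table):
    """Chi-square statistic and df for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)  # (rows - 1)(columns - 1)
    return chi2, df

# Hypothetical counts for 100 psychologists
table = [
    [10, 8, 6, 4, 2],
    [5, 7, 9, 7, 5],
    [3, 5, 7, 9, 13],
]
chi2, df = chi_square_two_way(table)

print(round(chi2, 2), df)
```

A table whose rows are exactly proportional to one another would give a chi-square of zero, since every observed count would match its expected count.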

33
Q

t-test for single sample

A

interval or ratio data collected for one group of subjects

df=N-1

34
Q

t-tests for matched or correlated samples

A

interval or ratio data collected for two correlated groups of subjects
df = #pairs - 1

35
Q

t-tests for independent samples

A

interval or ratio data collected for two independent groups of subjects
df = N-2

36
Q

one-way ANOVAs: dfs

A

df total = N-1
df between groups = #groups-1
df within groups = dftotal - dfbetween

37
Q

One-Way ANOVA:

F ratio

A

MSbetween/MSwithin
When F ratio equals or is approximately 1, the result is not significant
As F ratio rises above about 2.0, it is typically considered significant (the exact critical value depends on the dfs)

38
Q

One-Way ANOVA: mean squares

A

MS between = SS between/df between

MS within = SS within/df within
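The df, mean square, and F ratio formulas can be tied together in one short sketch. The three groups of scores are hypothetical, and the function name and dictionary keys are illustrative choices.

```python
def one_way_anova(groups):
    """Sums of squares, dfs, mean squares, and F for independent groups."""
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)
    grand_mean = sum(all_scores) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between: group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within: each score around its own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_total = n_total - 1
    df_between = len(groups) - 1
    df_within = df_total - df_between

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }

# Hypothetical scores for three independent groups
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

Here the group means differ far more than the scores vary within each group, so the F ratio comes out well above 1.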

39
Q

Post Hoc tests

A

Scheffe followed by Tukey, provide most protection from Type I error (most conservative)
Fisher’s LSD provides least protection from Type I error
Duncan, Dunnett, Newman-Keuls provide mid-range protection
REVERSE true for Type II error

40
Q

assumptions of bivariate correlations

A

linear relationship
homoscedasticity - similar spread of scores across the entire scatter plot
unrestricted range

41
Q

Spearman’s Rho or Kendall’s Tau Correlation

A

ordinal (rank ordered) X

ordinal (rank ordered) Y

42
Q

Pearson’s r Correlation

A

interval or ratio X

interval or ratio Y

43
Q

Point-Biserial Correlation

A

interval or ratio X

true dichotomy Y

44
Q

Biserial Correlation

A

interval or ratio X

artificial dichotomy Y

45
Q

Phi Correlation

A

true dichotomy X

true dichotomy Y

46
Q

Tetrachoric Correlation

A

artificial dichotomy X

artificial dichotomy Y

47
Q

Eta correlation

A

curvilinear relationship between X and Y

48
Q

zero-order correlation

A

most basic correlation
analyzes relationship between X and Y when it is believed that there are no extraneous variables affecting the relationship

49
Q

partial correlation (first order correlation)

A

examines the relationship between X and Y with the effect of a third variable removed
e.g. if it is believed that parent education (third variable) affects both SAT and GPA, this variable could be measured and its effect removed from the correlation of SAT and GPA
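A sketch using the standard first-order partial correlation formula, which is not given on the card itself. The three input correlations are hypothetical values for SAT (X), GPA (Y), and parent education (Z).

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of X and Y with Z removed from both variables."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical zero-order correlations
r_sat_gpa = 0.5
r_sat_parent_ed = 0.6
r_gpa_parent_ed = 0.7

r_partial = partial_correlation(r_sat_gpa, r_sat_parent_ed, r_gpa_parent_ed)
print(round(r_partial, 2))
```

With these made-up values, most of the SAT-GPA relationship disappears once parent education is removed, which is the pattern the example on the card describes.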

50
Q

part (semipartial) correlation

A

examines relationship between X and Y with the influence of a third variable removed from only one of the original variables

51
Q

coefficient of multiple determination

A

R squared
index of the amount of variability in the criterion Y that is accounted for by the combination of all the predictors (Xs)

52
Q

multiple R

A

correlation between 2 or more IVs (Xs) and one DV (Y) where Y is always interval or ratio data and at least one X is interval or ratio data

53
Q

multicollinearity

A

problem that occurs in multiple regression when predictors are highly correlated with one another and essentially redundant

54
Q

canonical R

A

extension on multiple R
correlation between two or more IVs (X) and two or more DVs (Y)
e.g. examining relationship between time spent studying for EPPP (X1) and number of practice tests completed (X2) with score obtained on exam (Y1) and amount of subjective distress experienced while taking the exam (Y2)

55
Q

discriminant function analysis

A

special case of multiple regression
used when there are two or more Xs and one Y
however, used when Y is nominal (categorical)

56
Q

loglinear analysis

A

aka logit analysis
used to predict categorical Y based on categorical Xs
e.g. if type of graduate school and sex were used to predict likelihood of passing or failing the EPPP

57
Q

path analysis

A

applies multiple regression techniques to testing a model that specifies causal links among variables

58
Q

structural equation modeling

A

enables researchers to make inferences about causation

e.g. LISREL (Linear Structural Relations)

59
Q

factor analysis

A

operates by extracting as many significant factors from data as possible

60
Q

eigenvalues

A

factor analysis
indicates strength of factor
<1.0 usually not considered significant
aka characteristic root

61
Q

factor loadings

A

correlation between a variable (e.g. item or subtest) and underlying factor
interpreted if equal or exceed +/- .30

62
Q

orthogonal rotation

A

type of factor rotation
axes remain perpendicular (90 degrees)
always results in factors that have no correlation with one another
generally preferred because easier to interpret
communalities must be calculated

63
Q

communalities

A

calculated in orthogonal rotation
refers to how much of a test’s variability is explained by combination of all the factors
factor loadings all squared and added together
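The calculation is short enough to sketch directly. The loadings below are hypothetical values for one test on three orthogonal factors.

```python
def communality(loadings):
    """Sum of squared factor loadings for one test (orthogonal factors)."""
    return sum(loading ** 2 for loading in loadings)

# Hypothetical loadings of a single test on three factors
loadings = [0.6, 0.3, 0.2]
h2 = communality(loadings)

print(round(h2, 2))
```

The result is the proportion of the test's variability explained by the factors together; the remainder is left unexplained by the factor solution.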

64
Q

oblique rotation

A

type of factor rotation
angle between axes is non-perpendicular and factors are correlated
some argue that oblique rotations are preferable to orthogonal rotations because factors tend to be correlated in the real world

65
Q

principal components analysis

A

type of factor analysis
when one is trying to extract factors and there is no empirical or theoretical guidance on the values of the communalities
always results in a few unrelated factors, called components
factors empirically derived, researcher has no prior hypotheses
first factor (component) accounts for largest amount of variability, each additional component explaining somewhat less

66
Q

(principal) factor analysis

A

type of factor analysis

communality values would need to be ascertained before analysis

67
Q

Normal curve

A

See figure (normal curve; image not included)