Stats Flashcards

1
Q

2 Basic Mathematical Principles important for EPPP

A

Squaring Decimals

Square rooting Decimals

2
Q

Critical Factor in determining the type of stat test to be used

A

Type of data, particularly for the DV

3
Q

4 Types of Data

*NOIR

A

Nominal
Ordinal
Interval
Ratio

4
Q

Nominal data

A

Unordered categorical data; numbers are assigned for identification purposes only and carry no further meaning
Ex- sex, political party, race
Can compute percentages

5
Q

Ordinal Data

A

Ordered categorical data

Ex-grouped according to SES

6
Q

Interval Data

A

Numerical scores with equal intervals, but no zero score, or zero is not absolute (e.g. temp in Celsius or Fahrenheit)

7
Q

Ratio data

A

Numerical score, has an absolute zero
Ex- money in bank, EPPP score, weight
Means can be calculated, as well as ratio comparisons across values (e.g. twice as much)

8
Q

2 Broad classes of statistics

A

Descriptive

Inferential

9
Q

With descriptive stats, the data collected is ____, whereas with inferential stats, the goal is to make inferences about the ___ from the ___

A

simply described
population
sample

10
Q

2 basic groups of Descriptive stats

A
  1. Stats on the whole group’s data
  2. Stats describing an individual’s score relative to the group

11
Q

Descriptive stats on group data include

A

measures of central tendency
measures of variability
Graphs

12
Q

Measures of Central Tendency

A

Mean-avg score
Median- score at 50th percentile
Mode-most frequently occurring score
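
A minimal sketch of the three measures, using Python's standard `statistics` module. The `scores` list is hypothetical and includes one extreme score to show how skew separates the three values.

```python
import statistics

# Hypothetical scores; the 40 is an extreme score that creates a positive skew.
scores = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(scores)      # average score
median = statistics.median(scores)  # score at the 50th percentile
mode = statistics.mode(scores)      # most frequently occurring score

print(f"mean={mean}, median={median}, mode={mode}")
```

With these made-up scores the extreme value pulls the mean (9) well above the median (4), while the mode stays at 3.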

13
Q

The best measure of central tendency is typically the ___

A

mean

14
Q

If data is skewed (extreme scores present) the most accurate measure of central tendency is ___

A

median

15
Q

Measure of Variability

A

Standard Deviation-avg spread from the mean
Variance-avg squared deviation from the mean (SD squared)
Range-diff between lowest & highest score obtained
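
A short sketch of the three measures of variability with the standard `statistics` module. The scores are hypothetical, and the population versions (`pvariance`, `pstdev`) are used here as an assumption; sample versions would divide by N - 1 instead.

```python
import math
import statistics

# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(scores)   # average squared deviation from the mean
sd = statistics.pstdev(scores)            # square root of the variance
score_range = max(scores) - min(scores)   # highest score minus lowest score

# The SD is the square root of the variance, and the variance is the SD squared.
assert math.isclose(sd, math.sqrt(variance))

print(f"variance={variance}, sd={sd}, range={score_range}")
```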

16
Q

Standard deviation is the __ __ of the variance

A

square root

17
Q

Variance is the standard deviation ___

A

squared

18
Q

Data that are not normally distributed are ___ or ___, meaning that scores are not equally distributed above & below the mean, or that the peak is sharper or flatter than normal

A

skewed, kurtotic

19
Q

In a positive skew, how are measures of central tendency impacted?

A

Mode is lowest, mean is highest

20
Q

In a negative skew, how are measures of central tendency impacted?

A

Mode is highest, mean is lowest

21
Q

Leptokurtic distribution

A

Very sharp peak

22
Q

Platykurtic Distribution

A

Flattened

23
Q

Normal Distribution

A

Bell shaped

24
Q

Norm referenced score

A

provides info as to how a person scored relative to the group

25
Q

The most informative norm referenced score is the ___ ___.

A

Percentile rank

26
Q

Graphs for percentile ranks are ___ or ___

A

flat, rectangular

27
Q

Standard scores

A

based on standard deviation of the sample

28
Q

Examples of standard scores

A
z-scores
T-scores
IQ scores
SAT scores
EPPP scores
29
Q

z-score

A

most basic standard score
corresponds directly to standard deviation units, mean of 0, SD of 1
Ex- z score of +2 means the score is 2 SDs above the mean
Shape of z score distribution always same as raw score distribution

30
Q

z-score formula

A

z = (score - mean) / standard deviation
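
A minimal sketch of the z-score calculation. The function name `z_score` and the example scale (mean 100, SD 15) are assumptions chosen for illustration.

```python
def z_score(score, mean, sd):
    """Return how many standard deviations `score` lies above or below `mean`."""
    return (score - mean) / sd

# Hypothetical scale with a mean of 100 and an SD of 15.
z = z_score(130, 100, 15)
print(z)  # a score of 130 sits 2 SDs above the mean
```

Note that the subtraction happens before the division; dividing only the mean by the SD gives a different (wrong) number.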

31
Q

Parameters vs. Statistics

A

Population values vs Sample Values

32
Q

mu

A

population mean

33
Q

sigma

A

population standard deviation

34
Q

Sampling Error

A

Samples are not perfectly representative of the population (sample means not identical to pop mean)

35
Q

Standard Error of the Mean

A

The avg amount of deviation in a distribution of sample means

36
Q

Standard Error of the Mean formula

A

SEM = population SD / square root of N (N = sample size)
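
A small sketch of the formula. The helper name `standard_error_of_mean` and the numbers (SD of 15, samples of 25 and 100) are hypothetical.

```python
import math

def standard_error_of_mean(population_sd, n):
    """Population SD divided by the square root of the sample size."""
    return population_sd / math.sqrt(n)

sem_small_sample = standard_error_of_mean(15, 25)    # 15 / 5
sem_large_sample = standard_error_of_mean(15, 100)   # 15 / 10

print(sem_small_sample, sem_large_sample)
```

Quadrupling the sample size halves the standard error in this example, which is why larger samples give more stable means.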

37
Q

Central Limit Theorem

A

If an infinite number of equal sized samples are drawn from a population, the means of these samples will be a normal distribution.
The mean of the means (the grand mean) will equal the population mean
The standard deviation of the means will equal the SD of the population divided by the square root of the sample size (standard error of the mean)
*the shape of a sampling distribution of means approaches normality as sample size increases
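
The theorem can be checked roughly with a simulation. This is only a sketch under stated assumptions: the population is a made-up skewed (exponential) one, and 2,000 samples stand in for the "infinite number" in the card, so the match is approximate rather than exact.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical, strongly skewed population (exponential distribution).
population = [random.expovariate(1.0) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 50  # every sample has the same size
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(2_000)
]

grand_mean = statistics.fmean(sample_means)    # mean of the sample means
sd_of_means = statistics.pstdev(sample_means)  # observed spread of the means
predicted_sem = pop_sd / n ** 0.5              # population SD / sqrt(sample size)

print(f"population mean={pop_mean:.3f}, grand mean={grand_mean:.3f}")
print(f"SD of means={sd_of_means:.3f}, predicted SEM={predicted_sem:.3f}")
```

Even though the population is skewed, the grand mean lands close to the population mean and the SD of the sample means lands close to the predicted standard error.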

38
Q

Standard Error of the Mean helps us to determine

A

Whether an obtained mean is most likely due to treatment/experimental effects vs chance (sampling error)
Ex: if the SEM of IQ is 3 (population mean of 100) and a test of an IQ enhancement program yields a sample mean IQ of 103, the difference is only 1 standard error from the mean and is likely due to chance. A sample mean IQ of 110 would be more than 3 standard errors from the mean, meaning that it is likely statistically significant.
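
The arithmetic behind the IQ example can be sketched as follows, assuming a population mean IQ of 100 (the card does not state it explicitly). The helper name `standard_errors_from_mean` is hypothetical.

```python
def standard_errors_from_mean(sample_mean, population_mean, sem):
    """Distance of a sample mean from the population mean, in SEM units."""
    return (sample_mean - population_mean) / sem

POPULATION_MEAN = 100  # assumed population mean for IQ
SEM = 3                # standard error of the mean given in the example

distance_103 = standard_errors_from_mean(103, POPULATION_MEAN, SEM)
distance_110 = standard_errors_from_mean(110, POPULATION_MEAN, SEM)

print(distance_103, round(distance_110, 2))
```

A sample mean of 103 is 1 standard error out, well within chance variation, while 110 is a little over 3 standard errors out.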

39
Q

Key concepts in hypothesis testing

A

Null Hypothesis
Alternative Hypothesis
Rejection Region

40
Q

Null Hypothesis

A

States that there are no differences between groups; experimental research always hopes to reject the null hyp
*results almost always stated in terms of the null hypothesis

41
Q

Alternative Hypothesis

A

Directly states that there are differences between groups

42
Q

Rejection region/Region of Unlikely Values

A

The tail end of the curve; unlikely that a researcher will obtain means in this region simply by chance. Suggests that treatment did have an effect & null hyp is rejected

43
Q

Size of the rejection region corresponds to the ___ ___

A

alpha level

Ex: alpha of .05 indicates that rejection region is 5% of the curve

44
Q

Acceptance/Retention region

A

No sig diffs between groups, null hyp is retained (accepted)

45
Q

2 Factors contributing to conclusions re: stat significance

A
  1. Treatment Effects
  2. Sampling Error

46
Q

The only way to know w/certainty if a tx effect is significant is to:

A

Replicate study numerous times

47
Q

4 Possible Outcomes in terms of Correctness of Research Findings

A

Type I Error
Type II Error
Power
Correct Decision w/no name

48
Q

Type I Error

A

Null is rejected, but later turns out to be a mistake, or diffs are found when they do not actually exist

49
Q

The size of ___ directly corresponds to the likelihood of making a Type I Error

A

Alpha

50
Q

Conventional cutoffs for alpha (.05, .01, .001) indicate that:

A

obtained means are different enough to be attributed to tx effects and not to chance

51
Q

Type II Error

A

Null is accepted, but this is a mistake, or no diffs are found where differences do actually exist

52
Q

The value of ___ corresponds to the probability of making Type II error

A

beta

53
Q

Power

A

Null is rejected, and this is correct

Defined as the ability to correctly reject the null

54
Q

Factors affecting Power

A
Increased w/:
Large Sample Size
Small random error
Magnitude of intervention is large
Statistical test is parametric
Test is one tailed
55
Q

___ has the most sig measurable effect on power; as ___ increases, so does power.

A

Beta; Alpha

56
Q

Correct Decision w/no name

A

Null is accepted and this is correct

57
Q

In determining the appropriate statistical test, you must first determine:

A

what type of question is being addressed in the research

58
Q

Commonly asked questions in research

A

Questions of Difference between groups
Questions of Relationship & Prediction
Questions of Structure or Fit

59
Q

Steps to Select the Appropriate Test of Difference

A
  1. Type of Data of the DV (Nominal, Ordinal, Interval, Ratio)
  2. Number of IVs and Levels of IVs
  3. Sample/Group Independence vs. Correlation
60
Q

If the DV is Nominal or Ordinal, a ___ test will be used

A

non-parametric, for example chi-square, Mann-Whitney, Wilcoxon

61
Q

If the DV is interval or ratio data, a ___ test will be used

A

parametric, for example t-test or ANOVA

62
Q

If there is more than one DV (interval or ratio data), a ___ will be the stat test of choice

A

MANOVA

63
Q

Independent Groups

A

Subjects randomly assigned to conditions or are grouped based on a pre-existing characteristic (gender or ethnicity)

64
Q

3 Factors Resulting in Correlated Groups

A
  1. Repeated measures
  2. Subjects matched prior to assignment to groups (e.g. matched on income, IQ, etc)
  3. Inherent relationship between subjects (twins, siblings, spouses)
65
Q

In order to use a parametric test, what 3 assumptions must be met?

A
  1. Data is interval or ratio
  2. Homoscedasticity-similar variability or SDs in the different groups
  3. Data must be normally distributed
    * If one of these is not met, stat of choice will typically be one used for ordinal data
66
Q

Assumption for the chi square test

A

Non parametric test

Answer: Independence of observations (no repeated measures design)

67
Q

Degrees of freedom

A
# of possible variations in outcome that can be obtained
*calculated differently based on the type of stat test
68
Q

Single Sample Chi Square

A

Nominal data collected for 1 IV

Ex: 100 psychologists sampled as to their political affiliation (political party seen as columns or groups)

69
Q

Single Sample Chi Square degrees of freedom formula

A

df= #columns - 1

70
Q

Multiple Sample Chi Square degrees of freedom formula

A

Nominal data collected for 2 IVs

df= (#rows - 1) x (#columns -1)
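
The two chi square df formulas can be sketched in one hypothetical helper, `chi_square_df`, which treats a single sample test as a table with one row. The example table sizes are made up.

```python
def chi_square_df(n_rows, n_columns):
    """Degrees of freedom for a chi square table.

    Single sample (one row):  #columns - 1
    Multiple sample:          (#rows - 1) x (#columns - 1)
    """
    if n_rows == 1:
        return n_columns - 1
    return (n_rows - 1) * (n_columns - 1)

df_single = chi_square_df(1, 3)    # e.g. one sample sorted into 3 political parties
df_multiple = chi_square_df(4, 2)  # e.g. a 4x2 table

print(df_single, df_multiple)
```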

71
Q

Standard Error of the Mean has a direct relationship with the ____ ____ ____ and an inverse (indirect) relationship with ___ ___

A

population standard deviation
sample size
*SEM increases as SD increases and sample size decreases

72
Q

2 Way ANOVA calculates:

A

3 F ratios (one for each main effect and one for the interaction)

73
Q

df formula for single sample t test

A

df=N - 1

N- number of subjects

74
Q

when do we use a one sample t test?

A

interval or ratio data collected for one group of subjects

Ex-BDI obtained for 30 subjects

75
Q

when do we use a t test for matched or correlated samples?

A

interval or ratio data collected for 2 correlated groups of subjects
Ex- BDI obtained for 2 matched groups of 15 people (so 30 total)

76
Q

df formula for matched samples t test

A

df= #pairs - 1

77
Q

when do we use a Multiple sample chi square?

A

nominal data collected for 2 IVs

Ex- 100 psychologists sampled as to voting pref and ethnicity

78
Q

when do we use a t test for independent samples?

A

interval or ratio data collected for 2 independent groups of subjects
Ex-BDI obtained for 2 groups of 15 randomly assigned subjects (30 total)

79
Q

df formula for t test for independent samples

A

df= N - 2

N- total number of subjects across both groups

80
Q

One Way ANOVA

A

interval or ratio data collected for more than 2 groups of subjects
Ex- 60 subjects assigned to one of 4 tx groups

81
Q

Formulas for df in one way ANOVA

A

df total= N - 1
df between groups= #groups - 1
df within groups= dftotal - dfbetweengroups
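
A quick sketch that applies the df formulas for the t tests and the one way ANOVA to the sample sizes used in the example cards (30 subjects, 15 pairs, 60 subjects in 4 groups). The function names are hypothetical.

```python
def df_single_sample_t(n_subjects):
    return n_subjects - 1          # df = N - 1

def df_matched_t(n_pairs):
    return n_pairs - 1             # df = #pairs - 1

def df_independent_t(n_total):
    return n_total - 2             # df = N - 2

def df_one_way_anova(n_total, n_groups):
    df_total = n_total - 1
    df_between = n_groups - 1
    df_within = df_total - df_between
    return df_total, df_between, df_within

print(df_single_sample_t(30))   # BDI obtained for 30 subjects
print(df_matched_t(15))         # 2 matched groups of 15 form 15 pairs
print(df_independent_t(30))     # 2 independent groups of 15
print(df_one_way_anova(60, 4))  # 60 subjects assigned to 4 tx groups
```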

82
Q

Formula for Expected Frequency in Chi Square when N & the groups are given

A
Expected Freq= N/total # of cells
Ex- 4x2 chi square with a sample of 160
total # of cells is 8
160/8=20
expected freq in each cell=20
83
Q

Formula for expected freq in any cell when data are given for a chi square

A

Expected freq for any cell= (sum of the row x sum of the column)/ N
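
Both expected frequency formulas can be sketched as below. The helper names and the 2x2 `observed` table are hypothetical; the 4x2 example with a sample of 160 is the one from the previous card.

```python
def expected_equal(n, n_rows, n_columns):
    """Expected frequency when only N and the table size are given."""
    return n / (n_rows * n_columns)

def expected_from_margins(observed):
    """Expected frequency for every cell: (row sum x column sum) / N."""
    row_sums = [sum(row) for row in observed]
    column_sums = [sum(column) for column in zip(*observed)]
    n = sum(row_sums)
    return [[r * c / n for c in column_sums] for r in row_sums]

per_cell = expected_equal(160, 4, 2)   # 4x2 table, sample of 160

observed = [[30, 20],   # hypothetical observed counts
            [10, 40]]
expected = expected_from_margins(observed)

print(per_cell)
print(expected)
```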

84
Q

When do you use a one-way ANOVA?

A

when more than 2 groups are being compared on one IV
Ex- comparing 4 diff depression txs
preferable to using multiple t tests to avoid increasing probability of Type I error

85
Q

Stat for One Way ANOVA

A

F Ratio

Want to find high variability between groups and low within

86
Q

Formula for F Ratio; Guidelines for significance

A

F ratio= Mean Square between groups/Mean Square within groups
*Mean square is measure of avg variability
F Ratio near 1, no significance
Typically sig when above 2.0 (rule of thumb; the critical value depends on df and alpha)
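
A minimal sketch of the F ratio computed by hand for a one way ANOVA. The function name `f_ratio` and the three small groups are hypothetical; mean squares are sums of squares divided by their df.

```python
import statistics

def f_ratio(groups):
    """Mean square between groups divided by mean square within groups."""
    all_scores = [x for group in groups for x in group]
    grand_mean = statistics.fmean(all_scores)

    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(group) * (statistics.fmean(group) - grand_mean) ** 2
        for group in groups
    )
    # Within groups: how far each score sits from its own group mean.
    ss_within = sum(
        (x - statistics.fmean(group)) ** 2 for group in groups for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical scores for 3 tx groups
f = f_ratio(groups)
print(round(f, 2))
```

Here the group means (2, 3, 6) differ a lot while scores within each group are tightly packed, so the F ratio is large. Whether a given F is significant still has to be checked against the critical value for its df.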

87
Q

A significant F Ratio with an ANOVA means:

A

There are differences between groups, but you do not know which ones. Must perform post hoc analyses

88
Q

Post hoc analyses following significant ANOVA involve:

A

many pairwise comparisons

89
Q

Possible post hoc tests following sig ANOVA, in order from most to least protection from Type I error

A
Scheffe
Tukey
Duncan
Dunnett
Newman-Keuls
Fisher's least sig diff
*reverse order for protection from Type II error
90
Q

When to use a Two Way ANOVA & main advantage over 2 separate one way ANOVAs

A

Groups are being compared on 2 IVs (ex- sex and treatment); examines main effects for each IV and interaction effects

91
Q

In a 2 way ANOVA, if there are sig main & interaction effects, which is interp first?

A

Interactions

92
Q

To calculate Main & Interaction effects of a 2 Way ANOVA on the test you:

A
  1. Find the sum of each column (if sums are different, there is a main effect for that IV)
  2. Find the sum of each row (if sums are different, there is a main effect for the second IV)
  3. Divide the table into squares and find the sum of each diagonal for each square (if the diagonal sums are diff, there is an interaction effect between the IVs)
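
The three checks above can be sketched for a single 2x2 table of cell means. This is only the exam shortcut described in the card, not a significance test (a real analysis needs the three F ratios); the function name and the tables are hypothetical.

```python
def shortcut_effects(table):
    """Apply the column, row and diagonal sum checks to a 2x2 table of cell means."""
    (a, b), (c, d) = table
    column_effect = (a + c) != (b + d)   # step 1: compare column sums
    row_effect = (a + b) != (c + d)      # step 2: compare row sums
    interaction = (a + d) != (b + c)     # step 3: compare diagonal sums
    return column_effect, row_effect, interaction

# Hypothetical cell means, rows = one IV, columns = the other IV.
crossover = [[10, 20],
             [20, 10]]
additive = [[10, 20],
            [30, 40]]

print(shortcut_effects(crossover))  # interaction only
print(shortcut_effects(additive))   # both main effects, no interaction
```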
93
Q

When do we use a MANOVA?

A

When there is more than one outcome measure or DV

94
Q

When an IV is quantitative, how do we analyze the data?

A

Trend Analysis
Ex: IV is dosage of a drug, length of time, etc
Data is non-linear, so we are less interested in group diffs than in trends in the data

95
Q

Stats depicting relationships between variables are termed ____, while stats that predict are termed ___ or ___

A

correlations

regressions/analyses

96
Q

Bivariate correlations

A

look at relationship between variables, X (predictor) and Y (criterion)

97
Q

Range of Correlation Coefficient

A

-1.0 to +1.0 (describes strength and direction of the correlation)

98
Q

Graphic depictions of correlations

A

Scatter plot: each data point represents an individual’s score on both X and Y; the more closely the points cluster around a line, the stronger the correlation

99
Q

Correlation coefficient tells you

A

how the variability or spread of Y scores for any given X score compares to the total variability of Y scores
Ex- if there is no correlation at all (coefficient of 0.0), for any given X, the range of possible Y could be anywhere from bottom to top of possible scores

100
Q

Coefficient of Determination

A

correlation coefficient squared
Represents amount of variability in Y that is explained or accounted for by X
Ex- correlation coefficient of .50 for level of education and income
.5 squared= .25, meaning that 25% of variability in income is explained by education level
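
A short sketch of the squaring step, plus a hand-rolled Pearson r so the coefficient's origin is visible. The `pearson_r` helper and the X/Y scores are hypothetical; only the .50 example comes from the card.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Card example: r = .50 means 25% of the variability in Y is explained by X.
card_r_squared = 0.50 ** 2

# Hypothetical scores on X and Y.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)
r_squared = r ** 2   # coefficient of determination

print(card_r_squared, round(r, 2), round(r_squared, 2))
```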

101
Q

Simple Linear Regression Equation

A

Derived anytime the correlation coefficient is other than 0.0, based on line of best fit through the scatter plot of scores
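
A minimal sketch of deriving the line of best fit by least squares and using it to predict Y from X. The function name `fit_line` and the scores are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the scatter plot."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # the line passes through (mean X, mean Y)
    return slope, intercept

# Hypothetical predictor (X) and criterion (Y) scores.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

slope, intercept = fit_line(xs, ys)
predicted_y = slope * 6 + intercept   # predicted Y for a new X of 6

print(round(slope, 2), round(intercept, 2), round(predicted_y, 2))
```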

102
Q

3 basic assumptions of bivariate correlations

A

Linear relationship between X and Y
Homoscedasticity-similar spread of scores across scatter plot
Unrestricted range of scores on both X and Y

103
Q

Impact of restriction of range

A

Correlation, reliability and validity coefficients are typically lower when the range of either variable is restricted

104
Q

For Bivariate correlations, if both X and Y are interval or ratio data, you use

A

Pearson r

105
Q

For Bivariate correlations, if both X and Y are ordinal (rank ordered) data, you use

A

Spearman’s rho or Kendall’s Tau

106
Q

Zero Order Correlation

A

most basic correlation

analyzes rel btwn X and Y when no extraneous variables affect the relationship

107
Q

Partial Correlation (First Order)

A

examines rel btwn X and Y when effect of a third, confounding variable is removed
Ex: examine relationship btwn GPA & SAT scores after removing impact of parental education

108
Q

Part (Semipartial) Correlation

A

examines rel btwn X and Y when the effect of a third, confounding variable is removed from only one of the orig variables

109
Q

Moderator Variable (in Bivariate Corr)

A

A variable that influences the strength of relationship between predictor & criterion
Ex- relationship between income & smoking may be different strength at diff ages

110
Q

Mediator Variable (in Bivar Corr)

A

Explains why there is a rel between predictor & criterion

Ex- if effect of education removed from link btwn SES and smoking, corr goes down to almost 0

111
Q

Multivariate Tests of correlation & prediction

A
Involve several predictors or IVs & one or more criteria or DVs
Multiple R
Multiple Regression
Canonical R & Canonical Analysis
Discriminant Function Analysis
Loglinear Analysis
Path Analysis
Structural Equation Modeling
112
Q

Multiple R

A

Correlation btwn 2 or more IVs and one DV, where Y is always interval or ratio data and at least one X is interval or ratio data

113
Q

Coefficient of Multiple Determination

A

Index of amt of variability in criterion Y that is accounted for by all predictors (Xs).

114
Q

Multiple Regression

A

Uses Multiple R to derive equation that allows prediction of the criterion based on values of the predictors

  • To optimally predict, want low corr btwn predictors (Xs) and moderate to high corr btwn each predictor and the criterion
  • Compensatory technique b/c low scores on one predictor can be compensated for by high scores on another
115
Q

Multicollinearity

A

Problem that occurs w/multiple regression equation when predictors are highly correlated with one another

116
Q

2 most common subtypes of multiple regression

A

Stepwise-computerized, forward or backward

Hierarchical-researcher controls, adds variables to regr analysis in order most consistent w/theory proposed

117
Q

Canonical R & Canonical Analysis

A

Extension of multiple R
Corr btwn 2 or more IVs (predictor set) and 2 or more DVs (criterion set)
*compensatory approach

118
Q

Discriminant Fx Analysis

A

Used when there are 2 or more predictors (Xs) and one nominal (categorical) criterion variable
Ex: predicting likelihood of passing or failing EPPP (categorical Y) based on time spent studying and number of practice tests completed
*compensatory

119
Q

Loglinear Analysis

A

Used to predict categorical criterion (Y) based on categorical predictors (Xs)
Ex: type of grad program (categorical X) and sex (categorical X) used as predictors for passing or failing EPPP (cat Y)
*compensatory

120
Q

2 Approaches that apply correlational techniques to causal modeling

A

Path Analysis

Structural Equation Modeling

121
Q

Tests of Structure

A

determine which variables in the set fit best together or form coherent subsets that are relatively independent of one another
Includes:
Factor Analysis, Cluster Analysis

122
Q

Factor Analysis

A

Extracts as many sig factors as possible from the data (strongest to weakest); the stronger the factor, the more variability in scores it accounts for

123
Q

Eigenvalue

A

indicates strength of a factor; factors with eigenvalues less than 1.0 are typically not interpreted

124
Q

Factor Analysis starts w/___ ___ and computes ___ ___, which are correlations between a variable and the underlying factor

A

correlation matrix

factor loadings

125
Q

Factor Rotation

A

Makes factor loadings more distinct & interpretable

126
Q

2 types of factor rotation

A

Orthogonal (axes remain perpendicular)

Oblique (axes not perpendicular; factors are allowed to correlate)

127
Q

Cluster analysis

A

Gather data on variety of DVs and look for naturally occurring subgroups in the data, without a priori hypotheses