Stats Flashcards
Tests of structure
Factor analysis, principal components analysis, cluster analysis
What is the underlying structure of a construct?
Discriminant analysis
Predicts group membership
When would you use a factorial/two-way vs. split-plot/mixed ANOVA?
Factorial: 2 IVs, both independent groups (e.g. TX and gender)
Split-plot: 2 IVs, mixed independent and correlated groups (e.g. TX and time)
Cross sequential research
Combination of cross-sectional and longitudinal designs: cohorts of different ages are followed over time, but for shorter periods than in a full longitudinal study
What increases statistical power? (Power = correctly rejecting the null/finding an effect)
Larger sample size
Stronger intervention
Less error
Parametric test
One-tailed test
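A minimal sketch of how some of these factors move power, assuming a one-sample z-test with the effect in the predicted (positive) direction. The function name `z_test_power` and the example effect sizes are illustrative assumptions, not part of the cards. "Less error" shows up here as a larger standardized effect size, since the same raw difference is divided by a smaller SD.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05, one_tailed=False):
    """Approximate power of a one-sample z-test.

    effect_size is the standardized difference (raw difference / SD),
    assumed to be positive, i.e. in the predicted direction.
    """
    std_normal = NormalDist()
    # Distance between the true mean and the null mean, in standard-error units.
    shift = effect_size * sqrt(n)
    if one_tailed:
        critical = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf(critical - shift)
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # Two-tailed: probability of landing in either rejection region.
    return (1 - std_normal.cdf(critical - shift)) + std_normal.cdf(-critical - shift)


baseline = z_test_power(0.5, 16)
print("baseline (d=0.5, n=16, two-tailed):", round(baseline, 3))
print("larger sample (n=64):", round(z_test_power(0.5, 64), 3))
print("stronger intervention (d=0.8):", round(z_test_power(0.8, 16), 3))
print("one-tailed test:", round(z_test_power(0.5, 16, one_tailed=True), 3))
```

Each of the three variations should print a higher value than the baseline.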
How to tell if data is independent or correlated?
Independent:
Random assignment
Groups based on pre-existing differences (e.g. gender)
Correlated:
Measured over time, or repeatedly
Matched subjects
Related subjects
Assumptions for parametric tests
Interval/ratio data
Homoscedasticity: same spread (variance) between groups
Normally distributed data
Assumptions for chi-square test
Independent observations, so cannot be measured more than once, or over time
Multicollinearity
A problem in multiple regression: occurs when predictors are highly correlated with each other
Formula for z-scores
(Raw score - mean)/SD
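As a quick worked example, the formula can be sketched in Python. The function name `z_score` and the numbers (mean 100, SD 15) are assumptions chosen for illustration.

```python
def z_score(raw_score, mean, sd):
    """Convert a raw score to a z-score: (raw score - mean) / SD."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (raw_score - mean) / sd


# Hypothetical scale with mean 100 and SD 15.
print(z_score(115, 100, 15))  # 1.0, one SD above the mean
print(z_score(85, 100, 15))   # -1.0, one SD below the mean
```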
Canonical R & Canonical Analysis
Two or more X’s and two or more Y’s
Canonical R = correlation
Canonical Analysis = prediction/regression
Discriminant function analysis
Predicts a nominal Y (where people fall into categories) from interval/ratio X’s, using a special multiple regression equation. For example, college admissions, or pass/fail on an exam
Log linear analysis
Predicting a nominal Y with nominal X’s
Path analysis and structural equation modeling
Uses correlational techniques to test out causal models
Path analysis tests out researcher-identified relationships
Structural equation modeling tests out different paths, uses LISREL
Test of difference for ordinal/nominal data, or when I/R data violates assumptions of parametric tests
Nominal: Chi-square
- 1 IV
- 2+ IV = multiple sample Chi-square
- Correlated data = McNemar
Ordinal, 1 IV
- 1 group = Kolmogorov (single vodka)
- 2 gps, independent = Kolmogorov-Smirnov, Mann-Whitney (double vodka)
- 2 gps, correlated = Wilcoxon (oxen are yoked/correlated)
Scheffe & Tukey vs. Fisher’s LSD
Post-hoc ANOVA tests to identify where the significant group difference is coming from
Scheffe & Tukey: best protection from type 1 error
Fisher’s LSD: best protection from type 2 error
Assumptions of bivariate correlations
- Homoscedasticity
- Unrestricted range on both variables
Homoscedasticity
Equal variability:
- Across the entire scatter plot, for correlations
- Between groups, for tests of difference
Which correlation coefficient to use?
- both interval/ratio: Pearson r
- both ordinal: Spearman’s rho, Kendall’s tau
- interval/ratio and dichotomous: point biserial/biserial (point biserial for a true dichotomy, biserial for an artificial one)
- both true dichotomies: phi
- both artificial dichotomies: tetrachoric
- curvilinear relationship: eta
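To make the first two rows concrete, here is a small sketch assuming Spearman's rho is computed as Pearson r on the ranks, with tied values sharing their average rank. The helper names and the data are illustrative assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r for two equal-length lists of interval/ratio scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson r applied to the ranked scores."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up data: y rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print("Pearson r:", round(pearson_r(x, y), 3))
print("Spearman rho:", round(spearman_rho(x, y), 3))
```

Because the ordering of the two variables matches perfectly, rho comes out at 1 while r falls slightly below it.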
Latent trait model, or item response theory
Item performance is related to the amount of the respondent’s latent trait. Latent trait models are used to establish a uniform scale of measurement that can be applied to individuals of varying ability and test content of varying difficulty
Classical test theory
Total variability in observed scores is the sum of true score variability and error variability; reliability is the proportion of total variability that is true score variability
Cluster analysis
Looking for naturally occurring subgroups in data without a priori hypothesis (e.g. profiles of individuals)
Relationship of standard error with standard deviation
All of the standard errors have a direct relationship with their related standard deviation: as the SD increases, the standard error increases
Standard error of the mean versus measurement versus estimate
All measures of average variability
Standard error of the mean: variability of group mean from population mean. SDpop/sqrt(N)
Standard error of the measurement: variability of individual scores due to measurement error. Formula includes SDx and reliability coefficient Rxx. Range is 0 to SDx
Standard error of the estimate: variability in prediction error. Formula includes SDy (why why why??) and validity coefficient Rxy. Range is 0 to SDy
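The three standard errors can be sketched side by side. The cards only say which terms each formula includes, so the full textbook forms used below are filled in as an assumption: SDx * sqrt(1 - Rxx) for measurement and SDy * sqrt(1 - Rxy^2) for estimate. Function names and example values are illustrative.

```python
from math import sqrt


def se_mean(sd_pop, n):
    """Standard error of the mean: SDpop / sqrt(N)."""
    return sd_pop / sqrt(n)


def se_measurement(sd_x, r_xx):
    """Standard error of measurement: SDx * sqrt(1 - Rxx)."""
    return sd_x * sqrt(1 - r_xx)


def se_estimate(sd_y, r_xy):
    """Standard error of estimate: SDy * sqrt(1 - Rxy squared)."""
    return sd_y * sqrt(1 - r_xy ** 2)


# Hypothetical values chosen to give round answers.
print(round(se_mean(15, 25), 2))           # 3.0
print(round(se_measurement(15, 0.84), 2))  # 6.0
print(round(se_estimate(10, 0.6), 2))      # 8.0

# The ranges on the cards fall out of the formulas:
print(se_measurement(15, 1.0), se_measurement(15, 0.0))  # 0.0 15.0
print(se_estimate(10, 1.0), se_estimate(10, 0.0))        # 0.0 10.0
```

Perfect reliability or validity gives a standard error of 0, and a coefficient of 0 gives a standard error equal to the SD, which is why each one rises and falls with its SD.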