Research Design and Statistics Flashcards
What is the sequence of the scientific method?
- Form a hypothesis
- Operationally define the hypothesis: what will be measured to show results?
- Collect & analyze data
- Disseminate results
Independent Variable: Define
The variable that is manipulated by researchers
The variable that is thought to impact the dependent variable
Dependent Variable: define
The outcome variable
What is hypothesized to change as a result of the IV
Predictor & Criterion Variables: Define
**Predictor:** Essentially the same as the IV, but it can’t be manipulated
E.g. gender, age
Criterion: essentially the dependent variable
This is for correlational research
Can a variable have levels in a study?
Yes, especially the independent variable
E.g. Male & Female could be levels of the predictor variable
No treatment/Med Only/Combined treatment could be levels of the IV for treatment group
Factorial Designs
These have multiple IV’s
E.g. 1 IV is treatment; 2nd IV is type of schizophrenia
If you look at the effects of all levels on each other, it becomes a factorial design
What gives a study Internal Validity?
If you can determine a causal relationship between the IV and DV
No/limited effects of extraneous variables
Internal Validity in Multiple Group Studies: what impacts it?
The groups must be comparable to control for extraneous/confounding factors
Internal Validity: History
What is it? Any external event that affects scores on the dependent variable
Example: learning environment between groups is different, w/ one being superior
Internal Validity: Maturation
What is it? an internal change that occurs in subjects while the experiment is in progress
Example: time may lead to intellectual development, or fatigue, boredom, hunger may impact it
Internal Validity: Testing
What is it? practice effects
Example: take an EPPP sample test, attend a course, and then retake the exam to see if the course helped. Improvement may just reflect knowing what to expect on the test
Internal Validity: Instrumentation
What is it? changes in DV scores that are due to the measuring instrument changing
Example: raters may gain more experience over time. This is why we need highly reliable measuring instruments
Internal Validity: What is Statistical Regression?
What is it? extreme scores tend to fall closer to the mean upon re-testing
Example: if you test severe rated depression people, just by nature they are likely to report as less depressed next time regardless of any IV
Internal Validity: Selection
What is it? Pre-existing subject factors that account for scores on DV
Example: Classroom A students may simply just be smarter than Classroom B students, so regardless of different interventions they will score better
Internal Validity: Attrition (Differential Dropout)
What is it? drop out is inevitable, so if you have 2 different groups and there are differences in the type of people who drop out from each group, it can affect internal validity
Example: studying a new SSRI, some people may experience a worsening of depression/SI while on it, and they drop out. Because they dropped out, the med may appear more helpful than it truly is
Internal Validity: Experimenter Bias
What is it? the researcher’s preconceived bias impacts how they interact with subjects, which impacts the subjects’ scores
AKA: experimental expectancy effect, rosenthal effect, pygmalion effect
Example: experimenter unintentionally communicates expectations to subject
Prevention: double-blind technique
Protecting Internal Validity: Random Assignment
Each person has equal chance of ending up in a particular group
Protecting Internal Validity: Matching
What is it? ID subjects who are matched on an expected confounding variable, and then randomly assign them to treatment/control group
Ensures that both groups have equal proportion of the confounding variable
Protecting Internal Validity: Blocking
What is it? make the confounding variable another IV to determine to what extent it may be impacting the DV
Allows you to separate the effects of a variable and see interactions
Protecting Internal Validity: Holding the Extraneous Variable Constant
What is it? only use subjects who match the same on the extraneous variable
Problem: results not generalizable to other groups
Protecting Internal Validity: Analysis of Covariance
What is it? a stat strategy that adjusts DV scores so that subjects are equalized in terms of status on extraneous variables
Pitfall: only effective for extraneous variables that have been identified by the researchers
External Validity: Define
The degree to which results of a study can be generalized to other settings, times, people
Threats to External Validity: Interaction between Selection & Treatment
What is it? effects of a treatment don’t generalize to other target populations
Example: may work with college students, but not non-college students
Threats to External Validity: Interaction between History & Treatment
What is it? effects of treatment don’t generalize beyond setting and/or time period the experiment was done in
Threats to External Validity: Interaction between Testing & Treatment
What is it? pre-tests may sensitize the subjects to the purpose of the research study
AKA: Pretest sensitization
Example: pre-test before a film designed to reduce racism. The group who viewed the film may be primed and more motivated to pay attention to the film, as opposed to those who may watch the film without a pretest
Threats to External Validity: Demand Characteristics
What are they? cues in the research setting that may tip subjects off to the hypothesis
People pleasers may act in ways to confirm the hypothesis, while others may act to disprove it
Threats to External Validity: Hawthorne Effect
What is it? research subjects may behave differently simply because they are participating in research
Threats to External Validity: Order Effects
AKA Carryover effects & Multiple Treatment Interference
What is it? DV is impacted by other aspects of the study
Example: subjects get three treatments, always in the same order. Last treatment may show the best results, but there’s no way of knowing if it’s just from that treatment, or from impacts of the previous two
Stratified Random Sampling
Protecting External Validity
Take a random sample from subgroups of a population
Example: random sample of different age groups
Cluster Sample
Protecting External Validity
The unit of sampling is a naturally occurring group of individuals
Example: residents of a city
Naturalistic Research
Protecting External Validity
Behaviour is observed and recorded in its natural setting
Reduces many external validity concerns, but has little to no internal validity
What is analogue research?
Protecting External Validity
Results of lab studies are used to draw conclusions about real-world phenomenon
E.g. Milgram’s obedience studies
Single and Double-Blind Research
Protecting External Validity
Single Blind: subjects don’t know what group they are in
Double Blind: neither subjects nor researchers know what group subjects are in
Reduce demand characteristics, researcher bias and hawthorne effect
Counterbalancing
Protecting External Validity
Controls for order effects by ensuring variables are received in different order
Latin Square Design: order the administration of variables so that each appears only once in each position
True Experimental Research
Subjects randomly assigned to groups
Groups receive different levels of manipulated variable
Greatest for internal validity
Quasi Experimental Research
When to use? when random assignment is not possible
Example: studying a learning program that is being introduced to all grade 1 classes
Next best for internal validity
Correlational Research
What is it used for?
Does it have Internal Validity?
Internal Validity: correlational research has none
Used for: prediction, especially for variables that can’t be manipulated
Developmental Research: 3 types
**Goal:** Assessing variables over time
Longitudinal: same people studied over a long time
* Pitfall: underestimates changes, because it’s often those who drop out that have the most significant changes
Cross-Sectional: different groups of subjects, divided by age, are assessed at same time
* Pitfall: cohort effects lead to overestimation of differences (e.g. may not account for an advantage a different generation had, such as something that aided memory)
Cross-Sequential: combines the two. Samples of diff groups are assessed more than once
Time-Series Design
What is it?
What are the benefits?
Take multiple measurements over time (e.g. multiple pretest/posttest) to assess effects of IV
Benefits: controls for threats to internal validity. You can add a control group to help with history effects
Example: a smoking-reduction program in a school. The pattern across multiple pretests and posttests can indicate whether change reflects a pre-existing trend (a confound) or a result of the program
Single Subjects Design
Can be one subject, or multiple that are treated as one group
Used for: behaviour modification research
Dependent variable measured multiple times during phases of the study (phase 1-no treatment/phase 2-treatment)
Single Subject Design: AB Design
Single baseline and single treatment phase
Phase 1: collect data on frequency of behaviour before treatment
Phase 2: give treatment, collect data on if it reduced behaviour
Single Subject Design: Reversal (Withdrawal)
Benefits: controls for extraneous factors, which AB does not
What does it do? give treatment, withdraw treatment and reassess, and then provide treatment again. If behaviour continues again without treatment, the effect was likely due to treatment
Types:
ABA: baseline -> treatment -> withdraw
ABAB: baseline -> treatment -> withdraw -> treatment
Multiple Baseline Design
When to use?
Types of baselines to use
Used when: reversal not possible for ethical reasons
It doesn’t involve withdrawal of treatment
Treatment applied sequentially
Multiple Baseline Across Behaviours: start with one behaviour, then use same treatment for another
Multiple Baseline Across Settings: home, school
Multiple Baseline Across Subjects: try treatment on another subject
Qualitative Research: Surveys
Types
Risks/Benefits
Cons: many threats to validity
Pros: can try to ensure random sample
Types: personal interviews, telephone surveys, mail surveys
Qualitative Research: Case Studies
Con: lack internal and external validity
Pro: thorough on one person
Useful as pilot studies that can ID variables to be studied in a more systematic manner
Qualitative Research: Protocol Analysis
What is it? research involving the collection and analysis of verbatim reports
Example: subject thinks aloud while doing something, which is then analyzed to look for themes/concepts evident as the subject performed the task
Scales of Measurement: Nominal Data
Unordered categories, none of which are higher than the others
E.g. male/female
Scales of Measurement: Ordinal Data
Provides info about the ordering of categories, but not specifics
E.g. strongly agree, agree, neutral, disagree, strongly disagree
Scales of Measurement: Interval Data
Numbers are scaled at equal distances, but the scale has no absolute zero point
e.g. IQ scores, temperature
Multiplication or division not possible, but addition and subtraction are
Scales of Measurement: Ratio Data
Identical to interval, but they have an absolute zero
E.g. dollar amounts, time, distance, height, weight, frequency of behaviours per hr
What does a Frequency Distribution provide? How are they displayed?
A summary of a set of data
tables, bar graphs, histograms
Normal Distribution
Symmetrical, half scores above mean and half below
Most scores are close to mean
Skewed Distributions
May happen with ceiling/floor effects
Negatively Skewed: has a tail on the left. Indicates easy test
Positively Skewed: has a tail on the right. Indicates difficult test
Measures of Central Tendency: the mean
Arithmetic average
Add all values and divide by n
Con: sensitive to extreme values
Measures of Central Tendency: the median
What is it? The middle value of data when ordered from lowest to highest (Md)
Odd groups: literally the middle number
Even groups: mean of the two middle numbers
Pros: not as affected by extreme scores, so good for skewed distributions
Measures of Central Tendency: the mode
What is it? the most frequent value in a set of numbers
May have multiple modes (bimodal/multimodal)
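A minimal sketch of all three measures of central tendency with hypothetical data (the extreme value shows why the mean is sensitive to outliers while the median is not):

```python
# Mean, median, and mode of a small hypothetical data set.
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 100]  # note the extreme value

mean = statistics.mean(scores)       # sum / n -- pulled upward by the 100
median = statistics.median(scores)   # mean of the two middle values (even n)
mode = statistics.mode(scores)       # most frequent value

print(mean, median, mode)  # 15.875 4.5 5
```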
Relationship between the Mean, Median & Mode
Normal Distribution: all equal
Positively Skewed Distribution: mean higher than median, median higher than mode
Negatively Skewed Distribution: mean is less than median, median is less than mode
Measures of Variability: the range
What is it? the difference between the highest and lowest scores
Cons: impacted by extremes, so doesn’t give accurate representation of the distribution
Measures of Variability: The Variance
What is it? The average of the squared differences of each observation from the mean
For me: Get the mean. Find how far each score is from the mean. Square each distance, then add them all up. Take the average of that sum. This is the variance.
What to know?
1. measure of variability of distribution
2. many stat tests use it in formulas
3. It’s equal to the square of the SD
Measures of Variability: the standard deviation
What is it? the expected deviation from the mean of a score chosen at random
Higher SD = more scores are likely to deviate from the mean
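The variance-to-SD chain above (squared deviations → average → square root) can be sketched with hypothetical data:

```python
# Population variance as the average squared deviation from the mean,
# and SD as its square root (variance = SD squared).
import math

scores = [4, 8, 6, 5, 3, 4]                  # hypothetical data
n = len(scores)
mean = sum(scores) / n                       # 30 / 6 = 5.0
sq_devs = [(x - mean) ** 2 for x in scores]  # squared distance of each score from the mean
variance = sum(sq_devs) / n                  # average of the squared deviations
sd = math.sqrt(variance)                     # SD is the square root of the variance

print(variance)                          # 2.666...
print(math.isclose(sd ** 2, variance))   # True: variance = SD squared
```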
Transformed Scores: z-scores
What are they? raw scores stated in standard deviation terms. Measures how many SD’s a raw score is from the mean
Calculate by: subtract the mean from the raw score, and divide by the SD
Pro: can compare across different measures and tests
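A one-line sketch of the z-score formula, using a hypothetical IQ-style scale (mean 100, SD 15):

```python
def z_score(x, mean, sd):
    # z = (score - mean) / SD : how many SDs a raw score sits from the mean
    return (x - mean) / sd

# Hypothetical: a score of 115 on a scale with mean 100, SD 15
print(z_score(115, 100, 15))  # 1.0 -> one SD above the mean
```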
Transformed Scores: t-scores
What are they? mean of 50, Sd of 10
Percentile Ranks: what shape is their distribution and what does it mean?
Shape: flat/rectangular (uniform)
Means: within a given number of percentile ranks, there will always be the same number of scores
Standard Deviation Curve: 4 things to know
- In a normal distribution, 68% of scores fall between -1.0Z and +1.0Z
- In a ND, 95% of scores fall between z scores -2.0 and +2.0
- In a ND, z-score +1.0 is a percentile rank of 84 (top 16%). -1.0 z-score is a PR of 16 (bottom 16%)
- In an ND, z-score of +2.0 is 98th PR (top 2%). z-score -2.0 is PR of 2 (bottom 2%)
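These benchmark percentages can be checked against the standard normal cumulative distribution, sketched here with only the standard library:

```python
import math

def phi(z):
    # Cumulative probability for the standard normal distribution
    # (percentile rank / 100 for a given z-score)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(phi(1) - phi(-1), 2))  # 0.68 -> ~68% of scores within +/-1 SD
print(round(phi(2) - phi(-2), 2))  # 0.95 -> ~95% of scores within +/-2 SD
print(round(phi(1) * 100))         # 84 -> z = +1.0 is about the 84th percentile
print(round(phi(2) * 100))         # 98 -> z = +2.0 is about the 98th percentile
```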
Where are Percentile Rank Scores Clustered?
Most are around the mean (PR 50-84)
At the extreme ends there are fewer (PR 84-98)
What is the point of Inferential Statistics?
To allow us to make inferences about the population based on a sample
What is Sampling Error?
Inferential Statistics
The inevitable error between the sample scores and the population
What is the Standard Error of the Mean?
The extent to which a sample mean can be expected to deviate from its corresponding population mean
What is the relationship between Standard Error of the Mean and Sample Size?
As sample size increases, the standard error decreases
INVERSE relationship
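The inverse relationship can be shown with the standard error formula (SD divided by the square root of n), here with a hypothetical SD of 15:

```python
import math

def sem(sd, n):
    # Standard error of the mean: SD of the sampling distribution of means
    return sd / math.sqrt(n)

# Same population SD, growing sample sizes: the standard error shrinks
for n in [25, 100, 400]:
    print(n, sem(15, n))  # 25 -> 3.0, 100 -> 1.5, 400 -> 0.75
```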
Null VS Alternative Hypothesis
Inferential Statistics
Null: no difference between means of sampled populations. IV has no effect on DV.
Alternative: IV does have an effect on DV
4 possible outcomes of testing a null hypothesis
- Retain null, no difference exists in population (correctly retained)
- False null rejected, differences do exist in population (correctly rejected)
- Null rejected, no differences exist (incorrectly rejected)
- False null retained, differences do exist (incorrectly retained)
One-tailed VS Two-tailed Hypotheses
Inferential Statistics
One-tailed: we hypothesize a particular direction. E.g. we anticipate one mean to be significantly higher than the other mean
Two-tailed: hypothesize a difference in means, but not in what direction.
Type I Error
Null hypothesis is rejected but it is true
You think you have something but you really don’t
Alpha Level and Type I Error
Inferential Statistics
Set by the researcher in advance; it is the probability of making a Type I Error
Usually p = .05 or .01
Type II (Beta) Error
Inferential Statistics
Fail to reject null hypothesis, but it is false
Thinking you don’t have something when you really do
Type II Error and Power
Inferential Statistics
Power: the probability of NOT making a Type II error
* 1-beta
* Sensitivity of a statistical test to detect an existing difference
What affects Power?
Inferential Statistics
- Sample size
- Alpha: higher alpha level = higher power
- One-tailed tests are more powerful
- Magnitude of Population Difference: more difference between population means = more likely to detect them. Can be impacted by increasing the difference between levels of the IV
Parametric Tests: what are they used for and what are their assumptions?
Inferential Statistics
Used for: interval and ratio data
Assumptions:
1. Normal distribution of the DV (robust to violations)
2. Homogeneity of Variance: variance of the groups is equal (robust to violations)
3. Independence of Observations: scores within the same sample or group shouldn’t be correlated (if they are, the scores could be impacted by a group factor). Not robust
Nonparametric Tests
Used for?
How are they similar/different than parametric?
Name 2 types
Inferential Statistics
Used for: DV measured on ordinal or nominal scale
Differences from Parametric:
* don’t assume normal distribution
* Less powerful
Similarity to Parametric:
* assume data come from unbiased sample
Types:
* chi-square
* Mann-Whitney U
How to decide to reject the null hypothesis?
Inferential Statistics
The obtained stat value is compared to a critical value in a table, which depends on:
1. the pre-set alpha level
2. the degrees of freedom for the test
t-test: what is it used for? what does it mean?
**Used for:** To test hypotheses about 2 different means
It cannot be used for more than 2 means
Means: t-ratio, if significant, indicates that the means are different
One-sample t-test: when to use
Inferential Statistics
When a study involves only one sample
Compare one mean to a known population mean
Rarely used
degrees of freedom: N - 1
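A minimal sketch of the one-sample t statistic with hypothetical scores (the t ratio divides the mean difference by the standard error):

```python
import math
import statistics

def one_sample_t(sample, pop_mean):
    # t = (sample mean - population mean) / (s / sqrt(n)), with df = n - 1
    n = len(sample)
    m = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    t = (m - pop_mean) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical: five scores compared against a known population mean of 100
t, df = one_sample_t([104, 110, 98, 106, 102], 100)
print(round(t, 2), df)  # 2.0 4
```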
T-test for independent samples
Degrees of Freedom?
When to use?
Inferential Statistics
Used for: compare 2 means from unrelated samples (e.g. treatment & control group; test scores of students from different schools; avg height of men & women)
Degrees of Freedom: N - 2
Assumptions:
-Homogeneity of variances
-Data in each group ~normally distributed
Paired Samples T-Test
When to use?
Degrees of Freedom?
Inferential Statistics
Used for: samples that are related to each other somehow (e.g. matched sample, pretest-posttest)
Degrees of Freedom: N - 1 (N is the number of pairs of scores)
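The paired test reduces to a one-sample test on the difference scores; a sketch with hypothetical pretest-posttest data:

```python
import math
import statistics

def paired_t(pre, post):
    # Paired t-test works on the difference scores; df = number of pairs - 1
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical pretest-posttest scores for five subjects
pre = [10, 12, 9, 14, 11]
post = [13, 14, 12, 15, 14]
t, df = paired_t(pre, post)
print(round(t, 1), df)  # 6.0 4
```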
One-Way Analysis of Variance (ANOVA): when to use
Inferential Statistics
- In a study w/ one independent variable where the means of more than two groups are compared
- What is the probability that these means are from the same population?
- F Ratio: if significant, the null is rejected
- It doesn’t tell you which means are different, so must do post-hoc tests
ANOVA: what does the F ratio represent?
Inferential Statistics
It represents a comparison between 2 estimates of variance
1. Between-group variance
2. Within-group variance
ANOVA: F ratio and significant relationship
How does it affect the 2 types of variance?
Inferential Statistics
If the null hypothesis is true, the 2 estimates of variance should be ~equal
If null hypothesis false, the between-group variance should be HIGHER than within-group
Differences between group means should be large enough to not be accounted for by error
What is the ANOVA fraction?
Inferential Statistics
variance between groups/variance within groups
If the numerator is BIG and the denominator SMALL, the ratio is likely to be significant
ANOVA: sum of squares
Inferential Statistics
What does it do? measure of the variability of a set of data
In ANOVA Summary Table:
1. between-group sum of squares
2. within-group sum of squares (error/residual)
3. Total Sum of Squares
Used to calculate the F-Ratio
ANOVA: degrees of freedom
Inferential Statistics
Two Types
1. df between (k - 1)
2. df within (N - k)
*K = number of groups
*N = total number of observations
ANOVA: mean square
Inferential Statistics
What is it? the stat measure to estimate between and within-group variance
Mean Square Between: sum of squares between / df between
Mean Square Within: sum of squares within / df within
ANOVA: how to get the F ratio
What two other values are used?
Equation:
mean square between / mean square within
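The full chain from the preceding cards (sums of squares → degrees of freedom → mean squares → F) can be sketched with hypothetical group data:

```python
import statistics

def one_way_anova_f(groups):
    # F = mean square between / mean square within
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    k = len(groups)       # number of groups
    n = len(all_scores)   # total number of observations

    # Between-group SS: weighted squared distance of each group mean from the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group SS: squared distance of each score from its own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)   # df between = k - 1
    ms_within = ss_within / (n - k)     # df within = N - k
    return ms_between / ms_within

# Hypothetical scores for three treatment groups
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(one_way_anova_f(groups))  # 27.0
```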
ANOVA: what do post-hoc tests do?
Inferential Statistics
They make pairwise comparisons or complex comparisons between means
Pairwise: compare the mean of the novel treatment group to the mean of the typical treatment group
Complex: compare combined mean of novel treatment and typical treatment with the control group mean
ANOVA: risks of doing multiple post-hoc comparisons
Increases risk of Type I error
ANOVA
When to use certain post-hoc tests
- Scheffé test is most conservative (most protection against Type I error, but increases chances of Type II)
- If only doing pairwise comparisons, Tukey is the best one to choose
One-way ANOVA for repeated measures
Used when all subjects receive all levels of the IV (e.g. group receives novel treatment and typical treatment)
ANCOVA: when to use it?
When you need to adjust dependent variable scores to control for effects of extraneous variables
Factorial ANOVA: when to use?
When study has more than one IV and you want to look at the effects of each IV separately (main effect) but also together (interactions)
It helps you see the bigger picture, as the reality is that multiple factors play into dependent variables
Factorial ANOVA: main effect
the effect of one independent variable by itself
Factorial ANOVA: interaction
effects of an independent variable at the different levels of the other independent variables
E.g. one-sided versus two-sided communication have different effectiveness based on a person’s intelligence
In graphs, can be seen as intersecting lines (e.g. in an X pattern)
What are some variations of the Factorial ANOVA?
Mixed ANOVA: more than one independent variable, with at least one between-subjects IV and at least one repeated-measures (within-subjects) variable
Multivariate Analysis of Variance (MANOVA): when to use
When: study involves 2+ dependent variables and 1+ independent variable. You want to look at the effect of each IV separately but also together
Why use it over multiple one-way ANOVA or factorial ANOVA? Reduces the likelihood of Type I error
Chi-Square Test: when to use?
Nonparametric
Use for: categorical data (nominal)
e.g. survey results
Means: tests whether the obtained frequencies in a set of categories differ significantly from the frequencies expected under the null hypothesis
How to calculate df in chi-square test?
Single sample & Multiple sample
Single sample chi-square: C - 1
Multiple sample chi-square: (C - 1)(R - 1)
C = no. of columns; R = no. of rows
3 considerations in using Chi-Square Test
- observations can’t be related to one another, so it can’t be used in before-after studies
- each observation classified into only one category/cell (e.g. you can only belong to one political party)
- Percentages of observations w/i categories can’t be compared. Frequency data is required.
How to calculate expected frequencies in chi-square tests?
Single Sample: divide the no. of subjects by the number of cells
Multiple Sample: (row total × column total) / grand total for each cell
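A sketch of the multiple-sample case with a hypothetical 2x2 contingency table, using the standard expected-frequency rule (each cell’s expected value is row total × column total / grand total), then the chi-square statistic built from it:

```python
def expected_frequencies(table):
    # Multiple-sample chi-square: expected cell = (row total * column total) / grand total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

# Hypothetical 2x2 contingency table (group x response)
observed = [[30, 10], [20, 40]]
expected = expected_frequencies(observed)
print(expected)  # [[20.0, 20.0], [30.0, 30.0]]

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected
chi_sq = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected)
             for o, e in zip(o_row, e_row))
print(round(chi_sq, 2))  # 16.67, with df = (2 - 1)(2 - 1) = 1
```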
Mann-Whitney U
Non-parametric tests
Used when:
1. data from the study are rank-ordered ordinal data (ranked, but differences between ranks aren’t consistent)
2. there are two independent groups you want to compare
3. assumptions of the independent-samples t-test (parametric tests) are not met
Wilcoxon Matched-Pairs Test
Nonparametric test
Used for:
* Comparing two related groups (repeated measures) using rank-ordered data
* When assumptions of parametric tests (the paired t-test) are not met
* Asks: is there a consistent difference between the two sets of paired data?
Kruskal-Wallis Test
Nonparametric test
- Comparing 3+ groups
- Data not normally distributed; assumptions for ANOVA not met
- Ordinal data
- Question: Is there a significant difference in the ranks of the data between the groups?
Pearson r Correlation Coefficient
- Used when calculating the relationship between two variables measured on an interval or ratio scale
- Calculated based on z-scores, but don’t need to know specifics
What affects the Pearson r?
3 things (LHR)
- Linearity: assumes linear relationship between two variables, so can’t be used for curvilinear relationships
- Homoscedasticity: refers to an equal distribution of scores throughout the scattergram. Heteroscedasticity is when they are not equally dispersed. It lowers the r
- Range of Scores: wider range makes for more accurate correlation
What is the coefficient of determination?
Regression
- The squared correlation coefficient
- Indicates the percentage of variability in one measure (the DV) that is accounted for by the variability in the other measure (the IV)
- E.g. a .70 correlation between IQ and grades = 49% of the variation in grades explained by IQ (get this by squaring the correlation coefficient)
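A sketch of both steps with hypothetical IQ/grade data: compute Pearson r, then square it to get the proportion of shared variance.

```python
import math

def pearson_r(x, y):
    # Pearson r: covariance scaled by the two standard deviations
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical IQ and grade data
iq = [90, 100, 110, 120]
grades = [70, 75, 85, 80]
r = pearson_r(iq, grades)
print(round(r, 2), round(r ** 2, 2))  # 0.8 0.64 -> 64% of variance shared
```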
Point-Biserial and Biserial Coefficients
Correlation Coefficients
Point-Biserial:
* Look @ relationship between a continuous variable and dichotomous variable
Biserial:
* Look @ relationship between one continuous variable and an artificially dichotomized variable (a continuous variable that has been divided up)
Phi and Tetrachoric Coefficients
Correlation
Phi Coefficient:
* 2 naturally binary dichotomous variables
* No assumption about distribution
Tetrachoric Coefficient:
* Two artificially dichotomized variables
* Assumes the variables are continuous and that they follow a normal distribution
Contingency
Correlation coefficients
- Correlation between two nominally scaled variables (unordered variables, each having more than two categories)
- Describes how two categorical variables are related
- Uses contingency tables
- Things like Phi, Chi-square measure the strength of the associations between variables
Spearman’s Rho
Correlation coefficients
- Correlation measure between two variables w/ an ordinal scale (ranked data)
- Relationship is monotonic rather than strictly linear; may have outliers
- If data was linear and continuous w/o ranks, pearson r would be used
- E.g. same students ranked on two different tests, Rho could be used to correlate them
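The ranked-students example can be sketched with the standard rank-difference formula (assuming no tied ranks; the data are hypothetical):

```python
def spearman_rho(x, y):
    # rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), computed on ranks (assumes no ties)
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_sq = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d_sq) / (n * (n ** 2 - 1))

# Hypothetical: the same five students scored on two different tests
test1 = [85, 70, 92, 60, 78]
test2 = [80, 72, 95, 65, 70]
print(spearman_rho(test1, test2))  # 0.9
```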
Eta
Correlation coefficients
- Strength of relationship between categorical and continuous variable
- This measures NON-LINEAR relationships
- Eta (η): the correlation ratio; tells you the strength of association between the variables
- Eta-squared (η²): expresses the PROPORTION of variance in the continuous variable that can be explained by the categorical variable
- Often used with ANOVA
- Ranges from 0 to 1
What is the purpose of a regression?
- An equation that is used to estimate the value of one variable based on the value of another
- It finds the line of best fit, which is used to predict the dependent variable
- E.g. can the EPPP score predict my performance ratings as a psychologist?
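A sketch of finding the line of best fit by the least squares criterion, with hypothetical score/rating data:

```python
def least_squares(x, y):
    # Line of best fit y = a + b*x, minimizing squared prediction errors
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))   # slope
    a = my - b * mx                           # intercept
    return a, b

# Hypothetical: predict a performance rating from a test score
scores = [1, 2, 3, 4]
ratings = [2, 4, 6, 8]
intercept, slope = least_squares(scores, ratings)
print(intercept, slope)  # 0.0 2.0 -> predicted rating = 2 * score
```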
What variables are in a regression?
- Predictor/Independent Variable
- Criterion/dependent Variable
Don’t need to know the equation
The Assumptions of Regression
- Linear Relationship: often depicted with the line of best fit (determined using the least squares criterion)
- Normality of Residuals: error scores are normally distributed with a mean of 0
- Independence of Errors
- Homoscedasticity: variance of residuals is constant across levels of IVs
- No Perfect Multicollinearity: IVs not highly correlated with each other
- Exogeneity: IVs not correlated with the error term
- Correct Model Specification: includes relevant variables, excludes unnecessary ones
How to Substitute Regression for ANOVA?
Code the subjects’ status on the IV using numbers, which are then put into the regression equation to predict the DV
What is Multiple Correlation Coefficient?
- A measure of how well multiple IV’s predict the DV in a multiple regression
- Higher values = stronger relationship between the combination of predictor variables and the criterion variable
- R ranges from 0 to 1
- R2 (coefficient of determination) tells you the PROPORTION of variance in DV that is explained by IVs
What is multiple regression?
The scores on more than one predictor are used to estimate scores on a criterion
4 Things to understand about multiple correlation/multiple regression?
- Multiple correlation coefficient is highest when predictor variables have high correlations with the criterion but low correlations with each other (multicollinearity = predictors correlate with each other)
- Multiple correlation coefficient is never lower than the highest simple correlation between an individual predictor and the criterion
- Multiple R can never be negative
- Can be squared (coefficient of multiple determination)
What is multicollinearity?
Multiple Regression
When predictors have high correlations with one another in a multiple regression
What is the Coefficient of Multiple Determination?
Multiple Regression
- The multiple correlation squared
- Indicates the proportion of variance in the criterion variable accounted for by combo of predictor variables
- Ranges from 0 to 1, w/ higher scores meaning the IVs provide a good fit to the data
Stepwise Multiple Regression
Forward & Backward Multiple Regressions
When to use? if you have a large number of potential predictors, but want to use a small subset of them for the final equation
Forward Stepwise MR: start with one predictor and add others to the equation one at a time, checking predictive power after each one.
MOST COMMON ONE
Backward Stepwise Regression: start with all potential predictors, remove them one at a time and check for predictive power.
Canonical Correlation
- This is used when there are multiple criterion and multiple predictor variables, and you want to understand their overall relationship
- Looks @ two SETS of variables
- Creates LINEAR COMBINATIONS of the variables that are maximally correlated w/ one another
Discriminant Function Analysis
- Creates discriminant functions (linear combos of the IV’s) that best distinguish between groups
- Used when: to classify cases into groups, or find which variables best differentiate between groups
- Wilks’ lambda test looks at how well a DFA separates the groups
- Eigenvalues: amount of variance in the DV explained by a discriminant function
- Canonical Correlation: tells you how well the IVs explain group differences
How is it different from multiple regression?
It predicts criterion GROUP rather than criterion SCORE
E.g. high achievement or low achievement group, rather than specific scores
What is Differential Validity?
Discriminant Function Analysis
A characteristic in which the predictors involved in classifying people into criterion groups should have a different correlation with each criterion variable
E.g. if you are trying to predict which major a uni student will excel in, the predictor variables should all be different to differentiate between possible groups (english, science, etc)
Logistic Regression
What is it used for?
What do scores mean?
How is it different from DFA?
Used for: make predictions about which criterion group a person belongs to
How is it different from Discriminant Function Analysis?
* doesn’t rely on the same assumptions
* predictors can be nominal (categorical) or continuous
Use when:
* with dichotomous dependent variables (e.g. responder/non-responder to therapy)
Scores:
Range from 0 to 1 (predicted probabilities)
E.g. 0.80 = 80% chance of being a responder
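The 0-1 score comes from the logistic function, which maps the model’s linear predictor onto a probability; a sketch with a hypothetical logit value:

```python
import math

def predicted_probability(logit):
    # Logistic function maps the linear predictor onto a 0-1 probability
    return 1 / (1 + math.exp(-logit))

# Hypothetical: a logit of about 1.386 corresponds to an 80% predicted probability
print(round(predicted_probability(1.386), 2))  # 0.8 -> 80% chance of being a responder
```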
What are the assumptions of the Discriminant Function Analysis?
- Normality
- Homogeneity of Variance-Covariance Matrices
- Independence: observations independent of each other
- Linearity: relationships between IV’s linear
Multiple Cutoff
Correlation & Regression
Cut-offs are used for each predictor, and missing the cut-off on even one predictor eliminates the candidate
E.g. job selection in which you need ALL eligibility criteria
Compared to multiple regression, in which high scores on one predictor can compensate for lower scores on another (e.g. GRE scores)
Partial Correlation
If a relationship between two variables is obtained, but you suspect that the relationship may be due to another variable, you can ‘partial out’ its effect
E.g. partial out hot weather in the correlation between ice cream sales and boat accidents
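The "partialling out" can be sketched with the first-order partial correlation formula. The data below are invented so that a third variable (temperature) drives both of the others:

```python
import math

def pearson_r(x, y):
    """Ordinary (zero-order) Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

temperature = [1, 2, 3, 4, 5, 6, 7, 8]        # hypothetical hot-weather index
ice_cream   = [3, 5, 6, 9, 11, 12, 15, 17]    # tracks temperature
accidents   = [4, 7, 10, 12, 16, 19, 21, 25]  # also tracks temperature

# The raw ice cream/accidents correlation is near 1, but it shrinks
# sharply once temperature is partialled out.
```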
What is a Suppressor Variable?
Partial Correlation
This is a spurious/extraneous variable that reduces the correlation rather than inflating it
E.g. reading skill affecting scores on a test for a job that doesn't require reading skill
Structural Equation Modeling
What is it? A general term for techniques that are based on correlations between multiple variables
Assumptions: linear relationship between variables
Used for: testing causal models based on multiple variables
Steps for using Structural Equation Modeling to Test Causal Models based on Multiple Variables
- Specify a causal model involving many variables: IQ -> education -> empathy -> parenting -> children's IQ
- Conduct statistical analysis: compute correlations between all pairs of variables
- Interpret results of the analysis: show whether the data are consistent with the model
Path Analysis
Structural Equation Modeling
Correlation & Regression
Verify causal models that propose one-way causal flows between variables
Can be used only with observed variables (what you measure)
LISREL
Structural Equation Modeling
Correlation & Regression
Can be used with both one-way and two-way causal relationships
E.g. prediction that self esteem increases work success, which in turn leads to more self esteem
Uses observed and latent (inferred) variables
Trend Analysis
Correlation & Regression
Used when:
* both variables are quantitative (interval/ratio)
* interested in the trend of change rather than magnitude
Break points: point where the scores for subjects change direction in a predictable way
What does it tell you? what trends are significant
Theoretical Sampling Distribution
- Population: the whole set of cases the researcher is interested in
- Sample Distribution: set of scores obtained from a sample of a population
- Sampling Distribution: multiple samples are taken from the population, and the means of those samples are used to create a frequency distribution
*samples must be the same size
*each population member must have the same probability of being selected
What is Sampling with Replacement?
Theoretical Sampling Distribution
When you pick a sample from a population, record the mean of that sample, and then return the sample to the population before you select your next sample
The ones you just put back have the same probability of ending up in the next sample as do all the rest
What does the Central Limit Theorem state?
- As sample size increases, the shape of the sampling distribution approaches normality. This is true even if the population distribution of scores isn't normal
- The mean of the sampling distribution is equal to the mean of the population
2 Properties of the Sampling Distribution
- The sampling distribution has less variability than the population distribution
- The SD of the sampling distribution (the standard error) is equal to the population SD divided by the square root of the size of the samples from which the means were obtained
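These properties can be checked by simulation. The population and sample sizes below are arbitrary, and `random.choices` draws with replacement, matching the sampling-with-replacement idea described earlier:

```python
import random
import statistics

random.seed(0)  # deterministic illustration

# Arbitrary (non-normal) population of 10,000 scores
population = [random.uniform(0, 100) for _ in range(10_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

n = 25  # size of each sample (all samples the same size)
sample_means = [
    statistics.mean(random.choices(population, k=n))  # sampling with replacement
    for _ in range(5_000)
]

# Mean of the sampling distribution ≈ population mean, and its SD
# (the standard error) ≈ pop_sd / sqrt(n).
```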
What is test robustness?
Related to parametric test assumptions
Central Limit Theorem
When the rate of Type I errors is not increased by violations of the assumptions of parametric statistical tests
The Central Limit Theorem is why parametric tests are robust to violations of the normality assumption, provided that the sample size is large enough to bring normality to the sampling distribution
Homogeneity of variance assumption: parametric tests are robust as long as there is an equal number of subjects in each experimental group
Time-Series Analysis
You don’t need independence of observations to use this test (as opposed to t-tests)
Autocorrelation: the correlation between observations separated by a given lag (e.g. each observation with the one that precedes it, at lag 1)
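A lag-k autocorrelation can be computed directly; the steadily increasing series below is an arbitrary illustration:

```python
def autocorrelation(series, lag):
    """Correlation between the series and itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom

trend = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up trending series

# Adjacent observations in a trending series are far from independent:
print(autocorrelation(trend, 1))  # → 0.7
```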
Bayes’ Theorem
This is a formula used to get a special type of conditional probability
E.g. what is the probability that an 85-year-old has Alzheimer’s, given that they tested positive on a diagnostic test?
- Basically, what is the probability that they have it and the result is not a false positive?
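A minimal sketch of the computation, using hypothetical base-rate and accuracy figures:

```python
def posterior_probability(prevalence, sensitivity, specificity):
    """P(condition | positive test) via Bayes' theorem."""
    true_pos = prevalence * sensitivity            # has it AND tests positive
    false_pos = (1 - prevalence) * (1 - specificity)  # healthy but tests positive
    return true_pos / (true_pos + false_pos)

# With a 10% base rate and a test that is 90% sensitive and 90% specific
# (all hypothetical numbers), a positive result still implies only about
# a 50% chance of actually having the condition.
print(round(posterior_probability(0.10, 0.90, 0.90), 3))  # → 0.5
```

The low base rate is what drags the posterior down, which is exactly the false-positive concern the card raises.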
Meta-Analysis
What is it?
What measure does it use?
Multiple studies analyzed at once, each study becomes a separate subject
Effect Size: indicates the magnitude of the IV’s effect
Calculated for each DV, then summed and divided by the number of effects (i.e. averaged)
It’s the difference between the means of the treatment group and control group, divided by the SD of the control group
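The effect-size formula described above (mean difference over the control group's SD), with invented scores for illustration:

```python
import statistics

control   = [10, 12, 11, 9, 13]   # hypothetical control-group scores
treatment = [14, 16, 15, 13, 17]  # hypothetical treatment-group scores

def effect_size(treated, control):
    """(treatment mean - control mean) / control-group SD."""
    return ((statistics.mean(treated) - statistics.mean(control))
            / statistics.stdev(control))

# In a meta-analysis, one such effect size per DV would be computed,
# then summed and divided by the number of effects (averaged).
print(round(effect_size(treatment, control), 2))  # → 2.53
```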