Research Methods II Flashcards
Frequency distribution
Organized tabulation of individuals in each category on the scale of measurement.
f (frequency)
Frequency of a particular score.
Cumulative frequency
Accumulation of frequencies as one moves up the scale. The cumulative frequency for a score is the sum of its frequency and the frequencies of all scores below it. The highest score should have a cumulative frequency equal to the total sample size.
Cumulative percentile rank
Accumulation of the percentage of all scores as one moves up the scale. Starting with the lowest score, divide the cumulative frequency of a particular score by the total sample size (and multiply by 100).
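A minimal Python sketch of these three ideas, using pandas and made-up scores (the data are illustrative, not from the course):

```python
import pandas as pd

# Hypothetical quiz scores (illustrative only)
scores = pd.Series([2, 3, 3, 4, 4, 4, 5, 5, 6])

freq = scores.value_counts().sort_index()        # f for each score
cum_freq = freq.cumsum()                         # cumulative frequency
cum_pct_rank = 100 * cum_freq / len(scores)      # cumulative percentile rank

table = pd.DataFrame({"f": freq, "cum f": cum_freq, "cum %": cum_pct_rank})
print(table)  # the highest score's cum f equals the total sample size (9)
```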
Mean
Sum of scores divided by number of scores. Average.
Median
Score that divides the distribution in half; the 50th percentile.
Mode
Score in distribution with greatest frequency.
Degrees of Freedom (v or df)
Number of values used to estimate a parameter minus the number of parameters to be estimated.
Why use df in sample SD?
All scores in a set are free to vary EXCEPT the last one; the last score is restricted once the mean (or the sum) and the number of scores are known. The correct way to get an UNBIASED estimate of the population variance is to divide the sum of squared deviations by N - 1.
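A quick numpy sketch of the biased (divide by N) versus unbiased (divide by N - 1) estimate, using made-up numbers:

```python
import numpy as np

x = np.array([4.0, 6.0, 7.0, 9.0])   # hypothetical sample

biased_sd = np.std(x)            # divides squared deviations by N
unbiased_sd = np.std(x, ddof=1)  # divides by N - 1 (df), the unbiased estimator

print(biased_sd, unbiased_sd)    # the ddof=1 value is slightly larger
```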
Transformation of the Scale for the SD
- Adding a constant to, or subtracting a constant from, each score in a distribution will not change the standard deviation but will change the mean by the same constant.
- Multiplying or dividing each score by a constant causes the standard deviation and the mean to be multiplied or divided by the same constant.
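These two rules can be checked directly; a small numpy sketch with hypothetical scores:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])           # hypothetical scores

shifted = x + 10      # adding a constant
scaled = x * 3        # multiplying by a constant

print(x.mean(), x.std(ddof=1))              # original mean and SD
print(shifted.mean(), shifted.std(ddof=1))  # mean shifts by 10, SD unchanged
print(scaled.mean(), scaled.std(ddof=1))    # mean and SD both multiplied by 3
```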
Symmetrical distribution
Distribution in which the left side of the distribution “mirrors” the right side of the distribution.
Skewed distribution
A distribution is skewed if one tail is longer than the other. Long tail in the positive direction = positive skew. Long tail in the negative direction = negative skew.
Order of Mode, Median, & Mean for Positive skew
Left to right:
- Mode
- Median
- Mean
Order of Mode, Median & Mean for negative skew
Left to right:
- Mean
- Median
- Mode
Kurtosis
“Peakedness” or “flatness” of a distribution; how fat or thin a distribution is. The degree to which a frequency distribution is flat (low kurtosis) or peaked (high kurtosis).
Mesokurtic
Distribution with zero kurtosis. Normal distribution.
Leptokurtic
Distribution with positive kurtosis. Acute peak around the mean, fat tails.
Platykurtic
Distribution with negative kurtosis. Small peak around mean, thin tails.
Bimodal distribution
2 modes.
Rectangular distribution
Has a mean and a median but no mode; there is no mode because all scores have the same frequency.
Sampling error
The amount of error between a statistic calculated from a sample and the corresponding population parameter.
A sampling distribution
The distribution of a statistic obtained from all possible samples of a specific size drawn from the population.
Distribution of sample means
Collection of sample means for all possible random samples of a particular size (n) that can be obtained from the population.
Central Limit Theorem
For any population with mean µ and standard deviation σ, the distribution of sample means for samples of size n will approach a normal distribution with a mean of µ and a standard deviation of σ/√n as n approaches infinity.
The Standard Error of Xbar
The standard deviation of the distribution of sample means.
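A small simulation sketch of the standard error, using numpy and assumed population values (µ = 50, σ = 10, n = 25; all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 50.0, 10.0, 25          # hypothetical population parameters

# Draw many samples of size n and keep each sample mean
sample_means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)

print(sample_means.mean())             # close to mu
print(sample_means.std(ddof=1))        # close to sigma / sqrt(n) = 2.0
print(sigma / np.sqrt(n))              # theoretical standard error of X-bar
```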
Law of Large Numbers
The larger the sample size, the more probable it is that the sample mean will be close to the population mean.
Confidence Intervals
Used to estimate, with a chosen level of confidence, that the actual µ falls within a certain range around the sample mean.
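A sketch of a 95% confidence interval for µ using scipy's t distribution and made-up sample values:

```python
import numpy as np
from scipy import stats

x = np.array([48, 52, 55, 47, 51, 53, 50, 49], dtype=float)  # hypothetical sample

mean = x.mean()
sem = stats.sem(x)                       # estimated standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, len(x) - 1, loc=mean, scale=sem)

print(f"95% CI for mu: [{ci_low:.2f}, {ci_high:.2f}]")
```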
Statistical Model
Statistical representation of the real world.
A Simple Statistical Model
The mean is a hypothetical value (i.e., doesn’t have to be a value that actually exists in the data set). As such, the mean is a simple statistical model.
Measuring the “Fit” of the Model
The mean is a model of what happens in the real world: the typical score. It is not a perfect representation of the data.
Null Hypothesis
The predicted relationship does NOT exist. The symbol is Ho.
Alternative Hypothesis
The predicted relationship does exist. The symbol is H1.
Type I error
The rejection of the null hypothesis when the null hypothesis is true. Saying there is a relationship when it does not exist.
Type II error
The acceptance of the null hypothesis when the null hypothesis is false. Saying there is no relationship when there is a relationship.
Alpha Level
The probability of a Type I error that the researcher is willing to accept; choosing a small alpha minimizes the risk of a Type I error.
Power
The probability of correctly rejecting the null hypothesis when the null is false. It is the probability that Type II error is not committed.
Factors Affecting Power
- The alpha level. Increasing the alpha level increases the power of the statistical test.
- One-tailed vs. 2-tailed test (a one-tailed test has more power than a two-tailed test).
- Sample size. As sample size increases, so does power.
- Reducing error variance increases power. (Test everyone in the same quiet room rather than in different rooms with different noises.)
Increasing effect size of the independent variable…
Would increase power.
Greater subject variability will…
Decrease power.
Parametric tests based on normal distribution requires 4 basic assumptions
- Normally distributed sampling distribution.
- Homogeneity of variance/ homoscedasticity.
- Interval or ratio data.
- Independence of scores.
Normal distribution
A probability distribution of a random variable with perfect symmetry, a skew of 0, and a kurtosis of 0.
Non-parametric tests
A family of statistical tests that do not rely on the restrictive assumptions of parametric test. Does not assume sampling distribution is normally distributed.
Homogeneity of variance (HOV)
Assumption that the variance of 1 continuous variable is stable/consistent between treatment groups of a discrete variable. For t-tests & ANOVAs.
Homoscedasticity
Assumption that the variance of 1 continuous variable is stable/consistent across scores of another continuous variable. For regressions.
Independence of scores
One data point does not influence another data point.
Big advantage of Parametric Tests
More powerful (statistically speaking, in rejecting the null hypothesis when it is false) compared to non-parametric tests.
Big advantage of Nonparametric tests
More freedom! Not restricted by assumptions to do data analysis.
Kolmogorov-Smirnov Test
- Tests if data differ from a normal distribution.
- Significant = non-normal data.
- Non-significant = normal data.
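A rough illustrative sketch with scipy (data are simulated and standardized first; this is a simplified check, not the course's exact procedure):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=100, scale=15, size=200)     # hypothetical scores

# Compare standardized scores against a standard normal distribution
z = (x - x.mean()) / x.std(ddof=1)
stat, p = stats.kstest(z, "norm")

print(stat, p)   # non-significant p (> .05) -> data look normal
```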
Histograms
Frequency distribution with bars drawn adjacent to one another. Gives a continuous figure that emphasizes the continuity of the variable. Good for continuous variables.
Q-Q Plots
Quantile-quantile plot. Plots the quantiles of the data against the quantiles of a particular distribution (e.g., the normal). If the values fall on the diagonal line, the data share the same distribution as the comparison (normal) distribution.
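A minimal sketch of the numbers behind a Q-Q plot using scipy's probplot (simulated data; plotting itself is omitted):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=100)                       # hypothetical data

# Theoretical normal quantiles vs. ordered sample values
(theoretical_q, ordered_x), (slope, intercept, r) = stats.probplot(x, dist="norm")

print(r)   # r near 1 -> points lie close to the diagonal, i.e., roughly normal
```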
Quantile
The proportion of cases we find below a certain value.
A perfect normal distribution would…
Have a skewness of 0 and a kurtosis of 0.
What does a significant Levene’s test mean?
- Tests if variances in different groups are the same.
- Significant = variances not equal (bad)
- Non-significant = variances are equal (good)
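A small scipy sketch with two made-up groups:

```python
from scipy import stats

group_a = [12, 14, 15, 13, 16, 14]     # hypothetical scores per group
group_b = [11, 19, 8, 22, 10, 20]

stat, p = stats.levene(group_a, group_b)
print(stat, p)   # significant p (< .05) -> variances not equal (assumption violated)
```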
Log transformation (log (xi)) to…
Reduce positive skew.
Square root transformation (square root Xi) to reduce…
Positive skew and to stabilize variance.
Reciprocal transformation (1/xi) can also reduce…
Skewness. Dividing 1 by each score also reduces the impact of large scores. This transformation reverses the order of the scores; you can avoid this by reversing the scores before the transformation: 1/(xhighest - xi).
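A numpy sketch of the three transformations on made-up positively skewed scores (the "+ 1" in the last line is an added tweak to avoid dividing by zero for the highest score):

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0, 3.0, 4.0, 20.0])   # hypothetical positively skewed scores

log_x = np.log(x)                 # log transformation (scores must be > 0)
sqrt_x = np.sqrt(x)               # square root transformation
recip_x = 1 / x                   # reciprocal (reverses the order of scores)
recip_keep_order = 1 / (x.max() - x + 1)  # reverse first so the order is preserved

print(log_x, sqrt_x, recip_x, recip_keep_order, sep="\n")
```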
Potential problems with transforming data
Transforming the data helps as often as it hinders the accuracy of F.
Use Robust methods (e.g.. Bootstrap) to…
account for violations of assumptions (e.g., normality).
Correlations
Way of measuring the extent to which two continuous variables are related. Measures pattern of responses across variables.
Scatterplot
A graph of the paired scores on two variables. A perfect linear relationship (r = +1.00 or -1.00) occurs when all the data points lie on a straight line in the scatterplot.
Covariance
The average of the cross-product deviations.
- Calculate the error between the mean and each subject’s score for the first variable (x).
- Calculate the error between the mean and their score for the second variable (y).
- Multiply these error values.
- Add these values to get the sum of the cross-product deviations, then divide by N - 1 to get the covariance.
Problems with Covariance
- Depends upon the units of measurement, e.g. the covariance of two variables measured in miles might be 4.25, but if the same scores are converted to kilometers, the covariance is 11.
- Solution: standardize it. Divide by the standard deviations of both variables.
- The standardized version of covariance is known as the correlation coefficient. It is relatively unaffected by units of measurement.
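A numpy sketch of the covariance, its standardization into r, and the units problem (all numbers made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])          # hypothetical paired scores
y = np.array([2.0, 3.0, 5.0, 4.0, 6.0])

cov_xy = np.cov(x, y, ddof=1)[0, 1]              # average cross-product deviation
r_xy = cov_xy / (x.std(ddof=1) * y.std(ddof=1))  # standardize -> correlation

print(cov_xy, r_xy)
print(np.corrcoef(x, y)[0, 1])                   # same r, computed directly

# Changing units changes the covariance but not the correlation
print(np.cov(x * 1.609, y, ddof=1)[0, 1], np.corrcoef(x * 1.609, y)[0, 1])
```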
The Correlation Coefficient
Measures the degree and direction of the linear relationship between two variables in terms of standardized scores (z-scores with a mean of 0 and an SD of 1).
Sum of Products
To calculate the Pearson Correlation, you need to first calculate the Sum of Products.
Correlation simply describes…
a relationship between 2 variables, not causality.
Third-variable Problem
Causality between two variables cannot be assumed because there may be other measured or unmeasured variables affecting the results.
Direction of causality
Correlation coefficients say nothing about which variable causes the other to change.
Measurement error affecting r
The more error there is in a measure, the smaller the correlation can be. Reliability is the correlation between two parallel measures of the variable of interest.
Coefficient of Determination
- r squared = coefficient of determination
- It measures the proportion of variability in one variable that can be determined from the relationship with the other variable.
Linear Transformations
- If variables are transformed into standard scores, or a constant is added to or multiplied by each score, the correlation between the two variables will remain the same.
Spearman’s rho
- Pearson’s correlation on the ranked data
- Good for non-normally distributed data
Kendall’s tau
- Also measures relationship between ordinal variables.
- It gives more accurate p-values for small data sets, but less popular than Spearman’s rho.
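A scipy sketch of both rank-based coefficients on made-up data:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]                 # hypothetical ordinal/skewed data
y = [3, 1, 4, 5, 6, 8, 7, 9]

rho, p_rho = stats.spearmanr(x, y)           # Pearson's r on the ranked data
tau, p_tau = stats.kendalltau(x, y)          # better p-values for small samples

print(rho, p_rho)
print(tau, p_tau)
```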
Regression Line
The straight line that best describes the linear relationship between two variables.
- Describes the relationship between two variables.
- Identifies the center or “central tendency” of the relation.
- Line used for prediction.
Equation for Linear Relationship
Yi = b0 + b1Xi + Ei
Yi
criterion variable, dependent variable
Xi
predictor variable, independent variable
b1
Regression coefficient for the predictor (IV)
- Gradient (slope) of the regression line
- Direction/strength of the relationship
b0
Y intercept (value of Y when X = 0). Also called the constant.
- Point at which the regression line crosses the Y-axis (ordinate)
Ei
The error (residual) of the regression line for each case.
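A sketch of the full equation's pieces (b0, b1, and the residuals Ei) with scipy's linregress and made-up data:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical predictor (IV)
y = np.array([2.1, 3.9, 6.2, 7.8, 9.9, 12.2])  # hypothetical criterion (DV)

fit = stats.linregress(x, y)
print(fit.intercept)    # b0: value of Y when X = 0
print(fit.slope)        # b1: change in Y per one-unit change in X
print(fit.rvalue**2)    # proportion of variance explained

residuals = y - (fit.intercept + fit.slope * x)   # Ei for each case
print(residuals)
```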
Assumption of Regression:
Linearity
Since it is based on linear correlations, multiple regression assumes linear bivariate relationships between each x and y, and also between y and predicted y.
Assumption of Regression:
Normality
Both univariate & multivariate distributions of residuals (actual scores minus predicted scores) are normally distributed. Thus, the Y scores are independent and normally distributed (Shapiro-Wilk).
Assumptions of Regression:
Independence of scores
Independence of Y (outcome: DV) scores.
Assumptions of Regression:
Independence of errors
The errors (residuals) from observations using the regression formula should not be correlated with each other (Durbin-Watson test).
Assumption of Regression:
Minimal Multicollinearity
The predictors (IVs) should not be highly correlated with each other. Rule of thumb is no higher than r = .80 between predictors.
Assumption of Regression:
Homoscedasticity
Variance of the residuals is uniform for all values of X.
Violation of Regression Assumption:
Heteroscedasticity
Spread of the data points is smaller for low X values and wider for higher X values.
Standard Error of Estimate
- The measure of accuracy of regression.
- The regression equation allows predictions, but does not provide information about their accuracy.
- Standard distance between the regression line and the actual data points.
- The greater the correlation, the smaller the standard error of the estimate.
The least squares criterion
- Want the sums of the squared errors to be the smallest.
- Yields values for the b-weights and the y-intercept that result in the sum of the squared residuals being at a minimum.
- The best fitting-line has the smallest total squared error.
Regression to the Mean
- Occurs when you have a nonrandom sample from a population and two measures that are imperfectly correlated.
- The sample posttest mean is closer to the posttest population mean than their pretest mean to the pretest population mean.
Characteristics of the Regression to the Mean
- Can occur because the sample is not random.
- Group phenomenon.
- Happens between any two variables.
- The more extreme the sample group, the greater the regression to the mean.
- r = 1: no regression to the mean.
- r = .5: 50% regression to the mean.
- r = .2: 80% regression to the mean.
- r = 0: 100% regression to the mean.
Multiple regression
2 or more independent variables in the regression model
- include more than one predictor to enhance the prediction of Y
Zero-Order Correlations
Relationship between two variables, ignoring influence of other variables in prediction.
Higher-order correlations:
First order
Relation between 2 variables after controlling for influence of 1 other variable.
Higher-order correlations:
Second order
Relation between 2 variables after controlling for influence of 2 other variables.
Partial Correlation
Relationship between two variables after removing the influence of a third variable from both of them.
Part (semi-partial) correlations
Relationship between two variables after removing the influence of a third variable from just one of them (the independent variable/predictor).
Methods of Variable Entry into the Multiple Regression Model:
Simultaneous Entry
All variables are entered at the same time and the Beta weights are determined simultaneously.
Methods of Variable Entry into the Multiple Regression:
Sequential (hierarchical) Entry
Used to build a subset of predictors.
Methods of Variable Entry into the Multiple Regression Model:
Apriori
Variables entered are determined by some theory.
Methods of Variable Entry into the Multiple Regression Model:
Statistical Criteria
Computer decides which variables are entered based on their unique predictive abilities.
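A sketch of sequential (hierarchical) entry using statsmodels with a made-up data frame (column names y, x1, x2 are illustrative): fit one block, add the next, and compare R-squared.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data; variable names are made up for illustration
df = pd.DataFrame({
    "y":  [10, 12, 14, 15, 18, 20, 21, 24, 25, 28],
    "x1": [1, 2, 2, 3, 4, 4, 5, 6, 6, 7],
    "x2": [3, 2, 4, 4, 5, 7, 6, 8, 9, 9],
})

step1 = smf.ols("y ~ x1", data=df).fit()        # block 1: first predictor
step2 = smf.ols("y ~ x1 + x2", data=df).fit()   # block 2: add second predictor

print(step1.rsquared, step2.rsquared)           # R-squared change across blocks
print(step2.params)                             # B weights (unique effects)
```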
Problem of Shrinkage:
Low N:k Ratio
Need enough participants for each predictor. If the number of participants is low relative to the number of predictors, the sample estimates may not predict well in the population.
Problem of Shrinkage:
Multicollinearity
High correlations between predictors can cause instability of prediction.
Problem of Shrinkage:
Measurement error
If the measurement does not reflect the true score, the application of beta weights to a new sample may not be accurate.
Reading/Interpreting Regression Tables:
R squared
Variance explained by the regression model, aka coefficient of determination
Reading/Interpreting Regression Tables:
B
B weight, raw score; the unique effect of the predictor
Reading/Interpreting Regression Tables:
SE B
Standard error of B weight of the predictor
Reading/Interpreting regression Tables:
beta
beta weight; the standardized unique effect of the predictor based on z-scores
Bootstrapping
A method of resampling in which the model is re-run multiple (e.g., thousands of) times on different resamples (drawn with replacement) of the same data.
- It is used to provide a more reliable estimate(s) of the statistic.
- It is especially useful for small samples and non-normal data.
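A bare-bones percentile bootstrap of a sample mean in numpy (data and number of resamples are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.array([3, 5, 7, 8, 9, 12, 13, 20], dtype=float)   # hypothetical small sample

boot_means = np.array([
    rng.choice(x, size=len(x), replace=True).mean()   # resample WITH replacement
    for _ in range(5000)
])

# Percentile bootstrap 95% confidence interval for the mean
print(np.percentile(boot_means, [2.5, 97.5]))
```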
Moderator
A variable that determines when or how much the IV will affect the DV (e.g., at low, average, or high levels of the moderator).
Mediator
The mechanism: how or why the effect of the IV occurs (a third variable through which the IV affects the DV).
Why center the continuous IVs to create the interaction terms?
To create the interaction term for 2 continuous independent variables in multiple regression, the scores of the predictors must first be centered around their means; centering reduces multicollinearity between the predictors and the interaction term.
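A sketch of mean-centering and building the interaction term, assuming statsmodels and invented variable names:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical continuous predictors; names are illustrative only
df = pd.DataFrame({
    "y":  [5, 7, 6, 9, 11, 10, 13, 15, 14, 17],
    "x1": [1, 2, 2, 3, 3, 4, 5, 5, 6, 7],
    "x2": [2, 1, 3, 3, 4, 4, 5, 6, 6, 7],
})

df["x1_c"] = df["x1"] - df["x1"].mean()     # mean-center each predictor
df["x2_c"] = df["x2"] - df["x2"].mean()
df["x1x2"] = df["x1_c"] * df["x2_c"]        # interaction term from centered scores

model = smf.ols("y ~ x1_c + x2_c + x1x2", data=df).fit()
print(model.params)
```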
Disadvantage of Baron & Kenny
Low statistical power.
Complete (full) mediation
Relationship between IV & DV completely disappears. Beta weight approaches 0.
Partial mediation
Relationship between IV & DV remains but is reduced. The beta weight may still be significant, but the difference between the new & old beta weights is significant.
Mediation Package Method
More flexible and statistically powerful.
Average Causal Mediation Effect (ACME)
Indirect effect of M (total effect - direct effect). Must be significant to show that mediation is significant.
Average Direct Effects (ADE)
Must NOT be significant to prove complete mediation.
Total effect
Combined indirect & direct effect.
Bootstrapping for…
more accuracy & power.
Rationale for the t-test or when do you use a t-test?
Only one independent variable is manipulated in only two ways and only one outcome is measured.
Assumptions of t-test
- Independent t-test & dependent t-test are PARAMETRIC TESTS based on the NORMAL DISTRIBUTION.
- Sampling distribution is NORMALLY DISTRIBUTED.
- Dependent variable data are CONTINUOUS (interval or ratio).
- Independent t-test assumes:
x Homogeneity of variances.
x Scores in different conditions are independent.
Independent t-test
- Two means based on independent data.
- Data from different groups of people.
The t-test as a GLM (Regression equation)
Analyses with nominal IVs are usually done with t-tests or ANOVAs, but they can be done in regression as long as the predictors are coded correctly (dummy coding).
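A sketch of that equivalence with made-up two-group data: a regression on a 0/1 dummy-coded group variable gives the same t and p as the independent t-test (statsmodels and scipy assumed).

```python
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Hypothetical two-group data; 'group' is dummy coded 0/1
df = pd.DataFrame({
    "score": [10, 12, 11, 13, 12, 15, 17, 16, 18, 19],
    "group": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})

t, p = stats.ttest_ind(df["score"][df["group"] == 0], df["score"][df["group"] == 1])
model = smf.ols("score ~ group", data=df).fit()   # regression with dummy-coded IV

print(t, p)
print(model.tvalues["group"], model.pvalues["group"])   # same t (sign aside) and p
```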
Dependent t-test
Compares two means based on related data.
Cohen’s d for effect size
0.2 = small
0.5 = medium
0.8 = large
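A numpy sketch of Cohen's d from two hypothetical groups (pooled-SD version):

```python
import numpy as np

g1 = np.array([10, 12, 11, 13, 12], dtype=float)   # hypothetical group scores
g2 = np.array([15, 17, 16, 18, 19], dtype=float)

# Pooled standard deviation, then Cohen's d = mean difference / pooled SD
n1, n2 = len(g1), len(g2)
sp = np.sqrt(((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2))
d = (g2.mean() - g1.mean()) / sp

print(d)   # interpret against the 0.2 / 0.5 / 0.8 benchmarks
```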
What to do when assumptions are broken:
- Nonparametric version of independent t-test
x Mann-Whitney test; aka Wilcoxon rank-sum test
- Nonparametric version of dependent t-test
x Wilcoxon signed-rank test
- Robust test
x Bootstrapping
Mann-Whitney test; aka Wilcoxon rank-sum test
- Non-parametric equivalent of the independent sample t-test.
- Used to test differences between two conditions in which different participants have been used.
Wilcoxon signed-rank test
To compare two sets of scores when these scores come from the same participants.
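A scipy sketch of both nonparametric tests with invented data:

```python
from scipy import stats

# Hypothetical data; different participants per condition for Mann-Whitney
cond_a = [12, 15, 11, 18, 14, 13]
cond_b = [20, 22, 17, 25, 19, 24]
print(stats.mannwhitneyu(cond_a, cond_b))       # Wilcoxon rank-sum equivalent

# Same participants in both conditions for the Wilcoxon signed-rank test
pre  = [12, 15, 11, 18, 14, 13]
post = [14, 18, 12, 21, 15, 17]
print(stats.wilcoxon(pre, post))
```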
Robust tests
These functions require the data to be in two different columns.
Bootstrapping
Robust method to compare independent means.
Trimmed means
A mean calculated after a percentage of extreme scores (e.g., 20%) is excluded.
- The percentage of extreme scores excluded can vary.
M estimator: median
- Median rather than trimmed mean.
Analysis of Variance (ANOVA)
- Determines if mean differences exist for 2 or more treatments or populations. Involves comparison of variances that reflect different sources of variability.
- Tests whether the differences between samples are due to chance (sampling error) or whether systematic treatment effects have caused the scores in one group to be different from the scores in another.
F-ratio = t-squared under what conditions?
t² = F when there is 1 independent variable with only 2 treatment conditions.
Components of the ANOVA
- Between-Treatment Group Variability: differences among sample means between treatment conditions.
x Treatment effect, individual differences & experimental error could explain differences.
- Within-Treatment Group Variability: variability within each sample.
x Individual differences and experimental error.
What is the F ratio when there is no treatment effect?
F ratio = (individual differences + experimental error) / (individual differences + experimental error) ≈ 1, because the treatment effect is 0.
What is the F ratio when there is a treatment effect?
F ratio = (treatment effect + individual differences + experimental error) / (individual differences + experimental error) > 1
Assumptions of ANOVA
- Independence
- Normality
- Homogeneity of variance
Test homogeneity of variance assumption
Levene’s Test
The ANOVA as a GLM (regression equation)
- Requires data to be in wide format rather than long format.
- Robust ANOVA based on trimmed means.
- Compare medians rather than means.
- Add bootstrap to the trimmed mean method.
Planned Contrasts (Group Comparisons)
- The variability explained by the model (experimental manipulation, SSm) is due to participants being assigned to different groups.
- This variability can be broken down further to test specific hypotheses about which groups might differ.
- We break down the variance according to hypotheses made a priori (before the experiment).
Robust ANOVA
- Compare medians rather than means
- Add bootstrap to trimmed mean method
Post Hoc tests
Determine specifically which treatment groups or samples are different from each other.
Testwise alpha level
Alpha level you select for each individual hypothesis test.
Experimentwise alpha level
Total probability of a Type I error that is accumulated from all the separate tests in the experiment.
Bonferroni
- Calculates a new pairwise alpha to keep the familywise alpha at .05.
- Strictest correction procedure, lowest power.
- Bonferroni α = α / number of tests.
- Most commonly used.
- May overcorrect for Type I error.
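A sketch of the correction using statsmodels' multipletests on made-up p-values:

```python
from statsmodels.stats.multitest import multipletests

p_values = [0.01, 0.04, 0.03, 0.20]   # hypothetical p-values from 4 pairwise tests

reject, p_adjusted, _, alpha_per_test = multipletests(
    p_values, alpha=0.05, method="bonferroni"
)

print(alpha_per_test)   # .05 / 4 = .0125, the new pairwise alpha
print(reject)           # which comparisons stay significant after correction
```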
Scheffe Test
- Allows the researcher to conduct any & all comparisons while preventing the experimentwise error from exceeding the alpha level.
- Strict, also low power.
Tukey HSD Test
- Helps control the experimentwise Type I error when the set of comparisons consists of pairs of treatment means (pairwise comparisons).
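A statsmodels sketch with invented scores and group labels:

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical scores and their treatment-group labels
scores = np.array([4, 5, 6, 5, 7, 8, 6, 9, 10, 9, 11, 12], dtype=float)
groups = np.array(["A"] * 4 + ["B"] * 4 + ["C"] * 4)

result = pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05)
print(result)   # every pairwise comparison with familywise error held at .05
```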
Kruskal-Wallis test
- Non-parametric counterpart of the one-way independent ANOVA.
- Used when data violate assumptions. Based on ranked data.
Friedman’s within-subjects ANOVA
- Used for testing differences between conditions when there are 2 or more conditions.
- Same participants for all conditions. Repeated measures design.
- Based on ranked data.
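A scipy sketch of both rank-based ANOVAs with made-up data:

```python
from scipy import stats

# Hypothetical data from three independent groups
g1, g2, g3 = [3, 4, 2, 5], [6, 7, 5, 8], [9, 8, 10, 11]
print(stats.kruskal(g1, g2, g3))            # non-parametric one-way ANOVA

# Same participants measured under three conditions (repeated measures)
c1, c2, c3 = [3, 4, 2, 5, 4], [5, 6, 4, 7, 6], [8, 7, 6, 9, 8]
print(stats.friedmanchisquare(c1, c2, c3))  # non-parametric within-subjects ANOVA
```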
Analysis of Covariance
Any ANOVA design can become an ANCOVA by the addition of a concomitant variable called a covariate (CV). ANCOVA is an extension of ANOVA in which the effects of the IVs on the DV are assessed after the effects of one or more covariates are partialed out (partitioning the variance).
Advantages of ANCOVA
- Reduces error variance – by explaining some of the unexplained variance (SSr) the error variance in the model can be reduced.
- Increases statistical power – by equating the treatment groups, we reduce error variance due to subject variability, and thus, increase power. Matching is a “procedural” way to equate the groups, whereas ANCOVA is a statistical way to equate them.
- Greater experimental control – by controlling known extraneous variables, we gain greater insight into the effect of the predictor variable(s).
Choosing a good covariate
- A covariate is a source of variation that is not controlled for in the design of the experiment, but which does affect the dependent variable.
- The covariate is correlated with the dependent variable. We want a correlation of at least r = .20.
- The covariate should be independent of the independent variable(s), and it shouldn’t correlate highly with any other covariates.
- Adding a covariate complicates the design. It also means you’ll probably need more subjects so you won’t get empty cells in the new design.
Adjusted means
- Use a regression equation to find the adjusted means for each treatment group.
- Will be used when you want to do main group comparisons (contrasts).
Homogeneity of regression slopes assumption
- The regression slope has to be the same for each treatment group.
- Want parallel lines, not interactions.
Assumption covariate (CV) is independent of the independent variable (IV)
There should be no relationship between the covariate and the IV. Check by running a regular ANOVA with the covariate as the outcome; the IV effect should be non-significant.
Testing HOV assumption
Levene Test - same as for ANOVA; you want the result to be non-significant.
Contrast (coding) to do Type III ANCOVA
By default, R uses Type I sums of squares, which evaluate predictors in the order they were entered into the model. Use Type III to get the unique effect of each predictor.
- Use contrast coding to put the data in the format required for the Type III model.
- The second line of the output describes the relationship between the covariate & the DV.
- The third line indicates that the IV has a significant effect on activity level AFTER controlling for partner activity level.
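The course works in R; as a rough Python analogue, a statsmodels sketch of a Type III ANCOVA with sum-to-zero contrasts (data frame, column names, and values are all invented for illustration):

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical data: DV, a 3-level IV, and one covariate
df = pd.DataFrame({
    "dv":        [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
    "group":     ["a", "a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "c"],
    "covariate": [2, 3, 3, 4, 3, 4, 5, 5, 4, 5, 6, 6],
})

# Sum-to-zero contrast coding so Type III sums of squares are meaningful
model = smf.ols("dv ~ C(group, Sum) + covariate", data=df).fit()
print(sm.stats.anova_lm(model, typ=3))   # unique effect of the IV after the covariate
```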
Robust ANCOVA being bootstrap ANCOVA
Do robust ANCOVA to free the analysis from the restriction of homogeneity of regression slopes.
Benefit of Factorial Designs
- We can look at how variables interact. Moderation model.
- Interactions: Show how the effects of one IV on the DV might depend on the effects (level) of another IV.
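A sketch of a 2 x 2 factorial ANOVA with an interaction term, using statsmodels and made-up factor names and scores:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical 2 x 2 factorial data; factor names are illustrative
df = pd.DataFrame({
    "y": [5, 6, 5, 7, 9, 10, 9, 11, 6, 7, 6, 8, 15, 16, 14, 17],
    "A": ["low"] * 8 + ["high"] * 8,
    "B": (["ctrl"] * 4 + ["treat"] * 4) * 2,
})

model = smf.ols("y ~ C(A) * C(B)", data=df).fit()   # main effects + A x B interaction
print(sm.stats.anova_lm(model, typ=2))              # a significant A:B term means the effect of B depends on A
```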
Main effects
- The separate effect of each independent variable AVERAGED over the levels of the other independent variable (AVERAGED EFFECTS)
Simple effects
The variability among the treatment means associated with one independent variable at a particular level of the other independent variable. AKA simple main effects.
Interaction
An interaction is present when the effect of one of the independent variables on the dependent variable is not the same at all levels of the second independent variable.
The simple effects of one of the independent variables are not the same at all levels of the second independent variable.
When an interaction is present the main effect is NOT representative of the corresponding simple effects.
The main effects do not fully or accurately describe the data.
Interpreting graphs:
Main effect A
If the averaged line representing the conditions of A is flat (no slope), then there is no main effect of A. If that averaged line has a slope, then there is a significant main effect of A.
Interpreting graphs:
Main effect B
If the dots representing the average of all the conditions of B are right on top of each other, then there is no main effect of B. If the dots are not directly on top of each other and are spaced apart, then there is a significant main effect of B.
Interpreting graphs:
Interaction
If the lines are parallel, then there is no interaction. If the lines are NOT parallel, then there is a significant interaction.
Interaction comparison
Within a 3-way ANOVA (A x B x C) there are smaller interactions (e.g., A x B at level C1 and at level C2) called simple interactions. The comparison of these simple interactions is called an interaction comparison.
Interaction contrast
- Any interaction analysis that analyses each factor only on 2 of its levels.
- So for a 4 x 4 design, an interaction contrast would be 2 x 2 (where 2 levels of each factor are ignored).
- In a 3-factor design, 3 x 4 x 3, an interaction contrast would be 2 x 2 x 2 (where 1 level of A, 2 levels of B, and 1 level of C are ignored).
- These types of interaction analyses are useful because like main & simple comparisons, they break the interaction terms into a single degree of freedom.
- It would enable us to understand the interaction or the simple effects in more detail.
Contrast coding to do Type III ANOVA
- Can utilize planned (apriori) group comparisons in the model
- Do contrast coding of the IVs first so that we can make group comparisons of no versus any alcohol & 2 pints versus 4 pints of alcohol for example.
Goodness of fit test
Used to see if two or more categories of a nominal variable differ significantly from expectation.
- df = (C - 1): the number of categories minus 1.
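A scipy sketch of the goodness-of-fit test with invented counts and equal expected frequencies:

```python
from scipy import stats

observed = [18, 30, 12]            # hypothetical counts in 3 categories
expected = [20, 20, 20]            # expectation (e.g., equal frequencies)

chi2, p = stats.chisquare(observed, f_exp=expected)
print(chi2, p)                     # df = C - 1 = 2
```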
Pearson’s Chi-Square Test/Chi-Square test of Association
- Used to see whether there’s a relationship between two categorical variables.
- χ² observed needs to be greater than χ² critical.
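A scipy sketch of the test of association on a made-up 2 x 3 contingency table:

```python
from scipy import stats

# Hypothetical 2 x 3 contingency table of observed counts
table = [[20, 15, 25],
         [30, 20, 10]]

chi2, p, df, expected = stats.chi2_contingency(table)
print(chi2, p, df)     # df = (rows - 1) x (columns - 1) = 2
print(expected)        # model-predicted frequencies for each cell
```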
Likelihood Ratio Statistics
- An alternative to Pearson’s chi-square, based on maximum likelihood theory.
- Creates a model for which the probability of obtaining the observed set of data is maximized.
- This model is then compared with the model expected under the null hypothesis.
- The resulting statistic compares the observed frequencies with those predicted by the model.
- i and j are the rows and columns of the contingency table and ln is the natural logarithm.
- Preferred to Pearson’s chi-square when the total sample (N) is small.
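The card defines i, j, and ln but the equation itself is missing; the standard form of the likelihood ratio statistic, sketched here as a reconstruction, is:

```latex
% Likelihood ratio statistic (standard form), where observed_{ij} are the cell
% counts and model_{ij} are the frequencies predicted by the model:
L\chi^2 = 2 \sum_{i}\sum_{j} \mathrm{observed}_{ij}
          \ln\!\left(\frac{\mathrm{observed}_{ij}}{\mathrm{model}_{ij}}\right)
```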
Assumptions of Chi-square
- Independence
x Each person, item, or entity contributes to only one cell of the contingency table.
- The expected frequencies should be greater than 5.
x In larger contingency tables, up to 20% of expected frequencies can be below 5 for a category or group, but there is a loss of statistical power.
x Even in larger contingency tables, no expected frequencies should be below 1. If you find yourself in this situation, consider using Fisher’s exact test.
Chi-square test
df = (rows - 1) x (columns - 1)