Stats Final Flashcards
Parsimonious
Using the fewest parameters (predictors) needed to explain the data adequately; leaving out variables that add little explanatory value.
What is multiple regression (compared to bivariate regression)? What is the purpose of the analysis?
There are at least two predictors in multiple regression and only one in bivariate regression. The purpose is to develop a predictive model based on correlations between variables.
What does R^2 measure? Identify another name for it.
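Proportion of variance in the DV explained by the IVs combined; also called the coefficient of determination (the squared multiple correlation coefficient). R itself, the multiple correlation coefficient, is the correlation between the actual and predicted values of the DV.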
In SPSS output, how does a B weight differ from a beta weight? How are they used?
B weights are the unstandardized slope weights used to calculate the predicted value of the DV; beta weights are the standardized B weights, used to compare the relative strength of the predictors.
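A minimal sketch of how a B weight relates to a beta weight in the one-predictor case. The data and the names `education`, `salary`, and `slope` are hypothetical, made up for illustration; this is not SPSS output.

```python
from statistics import mean, stdev

# Hypothetical data: years of education (X) and salary in $1000s (Y).
education = [10, 12, 12, 14, 16, 16, 18, 20]
salary = [30, 35, 38, 42, 50, 48, 60, 65]

def slope(x, y):
    """Unstandardized slope (B weight) for one predictor, by least squares."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

b_weight = slope(education, salary)
# Standardizing rescales the slope by the ratio of the standard deviations.
beta_weight = b_weight * stdev(education) / stdev(salary)

print(f"B = {b_weight:.3f} ($1000s of salary per year of education)")
print(f"beta = {beta_weight:.3f} (standard deviation units)")
```

With a single predictor, the beta weight equals Pearson r, which is why it can be read as a strength-of-relationship index.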
What is the error or residual in regression analysis?
The amount of variability in the outcome variable (DV) that is not explained by the predictors (IVs).
What is the equation of the regression line used to predict values of the outcome (criterion) variable?
Y-prime = a + bX
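As a rough illustration, the intercept a and slope b can be estimated by least squares and then used to predict a new value. The function name `fit_line` and the data are assumptions for this sketch; the data were chosen to fall exactly on a line so the result is easy to check by hand.

```python
from statistics import mean

def fit_line(x, y):
    """Return (a, b) for the least-squares line Y' = a + b*X."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = my - b * mx  # the line always passes through (mean of X, mean of Y)
    return a, b

# Hypothetical scores that follow Y = 1 + 2X exactly.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

a, b = fit_line(x, y)
y_prime = a + b * 6  # predicted Y when X = 6
print(a, b, y_prime)
```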
Name two reasons it’s preferable to conduct an ANOVA rather than multiple t-tests.
Number of comparisons
Missing info
Need one answer, not several
Inflated Type I error rate
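The inflated Type I error rate can be sketched numerically. The example below assumes the pairwise comparisons are independent, which is only an approximation for real t-tests on shared groups; `familywise_error` is a hypothetical helper name.

```python
from math import comb

def familywise_error(k, alpha=0.05):
    """Return (number of pairwise comparisons, approximate familywise error rate).

    Assumes each comparison is an independent test at the given alpha.
    """
    comparisons = comb(k, 2)  # k groups give k(k - 1)/2 pairs
    return comparisons, 1 - (1 - alpha) ** comparisons

for k in (3, 4, 5):
    c, fw = familywise_error(k)
    print(f"k = {k}: {c} t-tests, chance of at least one Type I error = {fw:.3f}")
```

With k = 4 groups there are 6 pairwise t-tests, and the chance of at least one false rejection rises to roughly .26 instead of .05.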
A mean square is the same as what type of statistic?
Variance
How is the variance partitioned in ANOVA?
Between and within groups
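A small sketch of the partition, using made-up scores for three groups. It checks that the total sum of squares splits exactly into a between-groups part and a within-groups part.

```python
from statistics import mean

# Hypothetical scores for three independent groups.
groups = {"A": [4, 5, 6], "B": [7, 8, 9], "C": [10, 11, 12]}

all_scores = [s for g in groups.values() for s in g]
grand_mean = mean(all_scores)

# Total: every score's squared deviation from the grand mean.
ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
# Between: each group mean's squared deviation from the grand mean, weighted by n.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within: each score's squared deviation from its own group mean.
ss_within = sum((s - mean(g)) ** 2 for g in groups.values() for s in g)

print(ss_total, ss_between, ss_within)
```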
What is another word for groups in statistics-speak?
Levels
Why Conduct Correlational Research? (4)
Ethical reasons
Financial reasons
Nature of the question
Can’t form an experimental group
A researcher rejects the null hypothesis for a one-way between-groups ANOVA, where k = 4. What is the next step in the analysis?
Post-hoc test
What is covariance?
The sum of the products of each pair’s deviations from the X and Y means, divided by the number of pairs of scores. (Dividing by n – 1 instead gives the sample estimate.)
∑(X – X-bar)(Y – Y-bar) / n
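The covariance formula translates directly into code. This sketch uses hypothetical scores; the `sample` flag is an added convenience (an assumption, not part of the card) that switches the divisor from n to n - 1.

```python
from statistics import mean

def covariance(x, y, sample=False):
    """Sum of cross-products of deviations, divided by n (or n - 1 if sample=True)."""
    mx, my = mean(x), mean(y)
    cross_products = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    divisor = len(x) - 1 if sample else len(x)
    return cross_products / divisor

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

print(covariance(x, y))               # divides by n
print(covariance(x, y, sample=True))  # divides by n - 1
```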
Assumptions of Linear Correlation (3)
- Linearity: The data are best described by a straight line
- Normality: Data points are normally distributed.
- Homoscedasticity: Equal variance of data points along the line. In the case of heteroscedasticity, the Pearson r will underestimate the strength of a correlation.
What does the correlation coefficient help us do?
It helps us predict the value of one variable if we know the value of another. The goal is to develop a formula for making predictions about the DV based on observed values of the IV.
Simple bivariate regression
One predictor, one outcome (e.g., predicting salary from education).
Difference between correlation and regression?
Correlation focuses on degree of “scatter”; regression focuses on the slope of the line.
Error equals?
Y-Y’
The error of prediction is the difference between the actual and predicted values of Y. This difference is also known as the residual.
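A quick sketch of computing the error for each case. The regression line (a = 1, b = 2) and the scores are hypothetical, picked only so the arithmetic is easy to follow.

```python
def residuals(x, y, a, b):
    """Return Y - Y' for each case, where Y' = a + b*X."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical line and observed scores.
a, b = 1.0, 2.0
x = [1, 2, 3]
y = [3.5, 4.5, 7.5]

errors = residuals(x, y, a, b)
# A positive error means the line underpredicted; negative means it overpredicted.
print(errors)
```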
Sources of error (3)
- Measurement: Very few variables can be measured with perfect accuracy
- Sampling – the sample will never be exactly like the population
- Uncontrolled variation – uncontrolled variables may “disturb” the relationship between the IV and DV
What is multiple regression?
Combining multiple IVs (predictors) to predict the value of the DV.
The omnibus F
Overall test of the significance of the model.
B weights
Unstandardized regression coefficients. They represent the slope weight for each variable in the model and are used to create the regression equation.
Beta weights
Standardized regression coefficient. They’re based on z scores with a mean of 0 and standard deviation of 1
Multicollinearity
Indicates that the predictor variables (IVs) are correlated with one another. Some correlation among predictors is expected, but if they’re too highly correlated, the individual regression weights become unstable and hard to interpret.
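One way to see why highly correlated predictors cause trouble is the two-predictor formula for standardized weights, which divides by 1 minus the squared correlation between the predictors. The correlations below are hypothetical, and `standardized_betas` is an illustrative helper name, not an SPSS function.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Beta weights for two predictors, from the three pairwise correlations.

    r_y1, r_y2: correlation of each predictor with the DV.
    r_12: correlation between the two predictors.
    """
    denom = 1 - r_12 ** 2  # shrinks toward 0 as the predictors overlap
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2

# Same predictor-DV correlations, different overlap between predictors.
for r_12 in (0.0, 0.5, 0.9):
    b1, b2 = standardized_betas(0.5, 0.4, r_12)
    print(f"r_12 = {r_12}: beta1 = {b1:.3f}, beta2 = {b2:.3f}")
```

In this made-up case, raising the predictor intercorrelation to .9 pushes the second beta weight negative even though that predictor correlates positively with the DV.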
Where on the SPSS output would you find how much variance in our DV is explained by all of our IVs?
R Square
Which intercept does the constant in the SPSS output represent?
Y-intercept
What does Levene’s Test tell us?
Whether we have homogeneity or heterogeneity of variance.
What extra step do we take in calculating the estimated standard error of the difference between means?
Pooled variance estimate. Because we are dealing with two samples from two different populations, we need to combine their variances to get one value for the denominator of the t-test.
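A minimal sketch of the pooling step for an independent-samples t-test, assuming equal population variances. The scores and the helper name `pooled_se` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_se(sample1, sample2):
    """Return (pooled variance, estimated standard error of the difference)."""
    n1, n2 = len(sample1), len(sample2)
    # Weight each sample variance by its degrees of freedom.
    pooled_var = (
        (n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)
    ) / (n1 + n2 - 2)
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, se_diff

# Hypothetical scores for two independent groups.
group1 = [2, 4, 6]
group2 = [1, 2, 3, 4, 5]

pooled_var, se_diff = pooled_se(group1, group2)
t = (mean(group1) - mean(group2)) / se_diff  # the SE is the denominator of t
print(pooled_var, se_diff, t)
```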
Identify the ways in which we can get dependent or paired samples?
Pre-post or matched pairs
What is making the means of different groups vary? (3)
Individual differences
Experimental error
The IV
The most basic type of ANOVA is called the one-way between groups design. Why?
Because we only have one IV (one-way) and we have separate, independent groups.
ANOVA assumptions (3)
Independence: there is no relationship between the observations (scores) in the different groups and between the observations in the same group.
Normality: the data are normally distributed. Can be checked by looking at skewness and kurtosis data.
Equality of Variance: Can be checked by asking for the homogeneity of variance option in SPSS.
What test would you use for an experiment with 3 groups of individual scores?
1-way ANOVA
The F Ratio
Variance between groups / variance within groups, or:
F = (inherent variance + treatment effect) / inherent variance
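The ratio of between-groups to within-groups variance can be sketched with mean squares (each sum of squares divided by its degrees of freedom). The scores and the function name `one_way_f` are assumptions made for this illustration.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    scores = [s for g in groups for s in g]
    grand_mean = mean(scores)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
    ms_between = ss_between / df_between  # variance between groups
    ms_within = ss_within / df_within     # variance within groups
    return ms_between / ms_within, df_between, df_within

# Hypothetical scores for three groups.
groups = [[2, 3, 4], [4, 5, 6], [9, 10, 11]]
f_ratio, df_b, df_w = one_way_f(groups)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```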
inherent variance
Within-Groups Variance
The sample variance for any group is used as an estimate of the inherent variance in that specific population.
Post-Hoc Tests
Help us find out which groups differ significantly from one another.
Two-way (Factorial) ANOVA
has two independent variables.
Why would it be a paired vs independent t-test?
Paired-samples t tests compare scores on two different variables but for the same group of cases; independent-samples t tests compare scores on the same variable but for two different groups of cases.
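For the paired case, the test works on the difference scores within each pair. This is a sketch with hypothetical pre and post scores; `paired_t` is an illustrative name.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for a paired-samples t-test on matched scores."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # t is the mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical pre-test and post-test scores for the same five people.
pre = [10, 12, 9, 11, 13]
post = [12, 14, 10, 14, 15]

t, df = paired_t(pre, post)
print(f"t({df}) = {t:.3f}")
```

Because each person serves as their own comparison, only the five difference scores enter the calculation, so df is the number of pairs minus 1.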
What is residual?
The difference between the observed values of Y (the dependent variable) and the predicted values of Y (Y – Y’).
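Coefficient of determination (how to find it, what it means)
The coefficient of determination (R²) is found by squaring the correlation coefficient (r, or the multiple R in multiple regression). It measures how well a statistical model predicts an outcome: the proportion of variance in the DV explained by the predictor(s).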
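A short sketch showing two routes to the same number in the one-predictor case: squaring Pearson r, and computing 1 minus the ratio of residual to total sum of squares. The data are hypothetical.

```python
from statistics import mean

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)  # total sum of squares for Y

# Route 1: square the correlation coefficient.
r = sxy / (sxx * syy) ** 0.5
r_squared_from_r = r ** 2

# Route 2: proportion of variance left over after prediction, subtracted from 1.
b = sxy / sxx
a = my - b * mx
ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - ss_residual / syy

print(r_squared_from_r, r_squared)
```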
Assumptions of parametric tests (2)
- Normal distribution
- Homogeneity of Variance
What influences power? (3)
- difference between group means
- inherent variability
- sample size
What is a z-test?
A statistical test to determine whether two population means are different when the variances are known and the sample size is large.
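A rough sketch of a two-sample z-test with known population variances. All numbers are hypothetical, and `two_sample_z` is an illustrative name; the two-tailed p-value comes from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Return (z, two-tailed p) for two sample means with known variances."""
    se = sqrt(var1 / n1 + var2 / n2)  # standard error of the difference
    z = (mean1 - mean2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical: two samples of 100, known population variance of 225 in each.
z, p = two_sample_z(105, 100, 225, 225, 100, 100)
print(f"z = {z:.3f}, p = {p:.4f}")
```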
What is a multiple regression test?
A statistical technique that can be used to analyze the relationship between a single dependent variable and several independent variables.
keyword: PREDICT
What is a chi-square test?
Used to compare observed results with expected results. The purpose of this test is to determine if a difference between observed data and expected data is due to chance, or if it is due to a relationship between the variables you are studying.
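The comparison of observed and expected results can be sketched as a goodness-of-fit statistic. The counts are hypothetical and `chi_square` is an illustrative helper name.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical: 60 people choose among three options; equal preference is expected.
observed = [30, 20, 10]
expected = [20, 20, 20]

statistic = chi_square(observed, expected)
df = len(observed) - 1
print(f"chi-square({df}) = {statistic:.2f}")
# The tabled critical value for df = 2 at alpha = .05 is 5.991,
# so a statistic above that would lead to rejecting the null hypothesis.
```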
What is a t-test?
Used to compare the means of two groups.