2textbook Flashcards
Coefficient of determination
Represents the proportion of the variance in one variable (x) that is accounted for by the other variable (y). Computed as r² (the square of the correlation coefficient). If the correlation between two variables (x and y) is 0.3, then 0.3 squared = 0.09, so 9% of the variance in x is accounted for by y. It is the proportion of variance in x that is systematic variance shared with y.
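The arithmetic on this card can be checked with a minimal sketch. The function name `coefficient_of_determination` is only an illustrative label, not something from the textbook.

```python
def coefficient_of_determination(r: float) -> float:
    """Return r squared: the proportion of variance shared by two variables."""
    return r ** 2

r = 0.3  # the correlation used in the card's example
r_squared = coefficient_of_determination(r)
print(f"r = {r}, r^2 = {r_squared:.2f} ({r_squared:.0%} of the variance is shared)")
```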
Statistical significance can be influenced by
Sample size (the larger the sample, the more likely a correlation is significant). Magnitude of the correlation. The p-value cutoff (alpha level) chosen.
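To see why sample size and magnitude both matter, here is a small sketch using the usual t test for a Pearson correlation, t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The sample sizes are made up for illustration; the same r = .30 produces a larger t as n grows.

```python
import math

def t_statistic(r: float, n: int) -> float:
    """t value for testing whether a Pearson r differs from zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Same correlation, different (hypothetical) sample sizes.
for n in (10, 30, 100):
    print(f"n = {n:3d}  r = 0.30  t = {t_statistic(0.30, n):.2f}")
```

The t value is then compared against the critical value for the chosen alpha level, so a stricter cutoff makes significance harder to reach.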
Partial Correlation
The correlation between two variables after the influence of a third variable is statistically removed.
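For three variables, the first-order partial correlation can be computed directly from the three pairwise correlations. The sketch below uses that standard formula; the input values are invented for illustration.

```python
import math

def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with z statistically removed from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations: x and y correlate .50,
# but both are strongly related to a third variable z.
print(round(partial_correlation(r_xy=0.50, r_xz=0.60, r_yz=0.70), 3))
```

With these numbers most of the x-y relationship disappears once z is removed, which is the pattern a researcher would look for when z is suspected of driving the correlation.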
Spearman rank-order correlation
Correlation between two variables when one or both of the variables is on an ordinal scale (the numbers reflect rank ordering). E.g., the correlation between a teacher's ranking of the best to worst students (ordinal scale) and the students' IQ scores (interval scale): ask the teacher to rank the students in the class from 1 to 30 based on how intelligent they think each student is, then correlate the rankings with the students' actual measured IQ.
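One common way to compute Spearman's coefficient is to convert both variables to ranks and then take the ordinary Pearson correlation of the ranks. The sketch below does that with only the standard library; the five students and their scores are made up, and the helper names are illustrative.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson(ranks(x), ranks(y))

teacher_rank = [1, 2, 3, 4, 5]          # 1 = judged most intelligent
measured_iq = [130, 125, 110, 115, 100]  # hypothetical IQ scores
print(round(spearman(teacher_rank, measured_iq), 2))
```

Because rank 1 means "best" here, agreement between the teacher and the IQ test shows up as a negative coefficient.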
Point-biserial correlation
Used when one variable is dichotomous and the other is on an interval or ratio scale. Gender is dichotomous (male or female). To correlate gender with spatial memory you would assign all males a 1 and all females a 2. A significant positive correlation would mean that females tend to score higher on spatial memory than males; a significant negative correlation would mean that males tend to score higher.
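The point-biserial coefficient is numerically the Pearson correlation with the dichotomous variable coded as two numbers. A minimal sketch using the card's 1/2 coding follows; the spatial memory scores are invented.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

gender = [1, 1, 1, 2, 2, 2]                  # 1 = male, 2 = female (card's coding)
spatial_memory = [10, 12, 11, 15, 14, 16]    # hypothetical scores

r_pb = pearson(gender, spatial_memory)
print(round(r_pb, 3))  # positive: the group coded 2 scores higher in this made-up data
```

Swapping the codes (females = 1, males = 2) flips the sign but not the size of the coefficient, which is why the coding has to be known before interpreting the direction.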
Phi coefficient
Used when BOTH variables being correlated are dichotomous (e.g., gender, handedness, yes/no answer).
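For two dichotomous variables the data reduce to a 2 x 2 table of counts, and phi can be computed from the four cells. The sketch below uses the standard formula; the counts and the handedness/answer labels are hypothetical.

```python
import math

def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for a 2 x 2 table laid out as:
           y=1  y=0
    x=1     a    b
    x=0     c    d
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Hypothetical counts: rows = right/left handed, columns = answered yes/no.
print(round(phi_coefficient(a=20, b=10, c=10, d=20), 3))
```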
On-line outliers
Extreme on both variables and in line with the overall trend (e.g., the very top right of the scatter plot). INFLATES r.
Off-line outliers
Extreme scores that fall off to the side of the trend, such as points at the bottom right or top left of the scatter plot. DEFLATES r.
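The effect of both kinds of outlier can be seen by adding a single point to a small data set. This is a sketch with invented numbers, assuming a positive underlying relationship.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]   # hypothetical scores
y = [2, 1, 4, 3, 5]

r_base = pearson(x, y)
r_online = pearson(x + [15], y + [15])   # extreme on both, continues the trend
r_offline = pearson(x + [15], y + [0])   # extreme on x only, bottom right of the plot

print(f"no outlier:       r = {r_base:.2f}")
print(f"on-line outlier:  r = {r_online:.2f}")
print(f"off-line outlier: r = {r_offline:.2f}")
```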
Spurious correlation
A correlation between two variables that is not due to any direct relationship between them but rather to their relation to other variables. If researchers think a correlation is spurious, they look for third variables.
Factors that distort correlation coefficients
Restricted range. Outliers. Reliability of measures (the less reliable the measures, the lower the coefficients).
How restricted range distorts coefficients
The size of the correlation may be reduced by a restriction of the range in the variables being correlated. A restricted range occurs when most participants have similar scores (less variability). This can occur when you are correlating scores that are all either high or low on one variable. E.g., if you correlate the SAT scores of people who get into college with their college GPA, you may be dealing with a restricted range, because usually those with higher SAT scores get into college. You must ensure you have a broad range of scores.
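A small sketch of the restriction effect: the same made-up data are correlated once over the full range and once using only the top half of the predictor scores. The numbers are invented and scaled arbitrarily; they are not real SAT or GPA values.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical admission-test scores (1-10) and a later outcome that
# follows the test score plus some alternating noise.
test_score = list(range(1, 11))
outcome = [s + (2 if s % 2 else -2) for s in test_score]

r_full = pearson(test_score, outcome)

# Keep only applicants with high test scores (the "admitted" group).
admitted = [(s, o) for s, o in zip(test_score, outcome) if s >= 6]
r_restricted = pearson([s for s, _ in admitted], [o for _, o in admitted])

print(f"full range:       r = {r_full:.2f}")
print(f"restricted range: r = {r_restricted:.2f}")
```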
Regression
Predict scores on one variable from scores on another variable. E.g., use GRE scores to predict success in grad school.
Regression line
A regression line is a straight line that summarizes the linear relationship between two variables. The regression line minimizes the sum of the squared deviations around the line. It describes how an outcome variable y changes as a predictor variable x changes. A regression line is often used as a model to predict the value of the response y for a given value of the explanatory variable x.
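The least-squares line has a closed-form solution: the slope is the sum of cross-products divided by the sum of squared deviations of x, and the intercept makes the line pass through the two means. A minimal sketch with invented data:

```python
from statistics import mean

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares regression line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def predict(slope, intercept, new_x):
    """Predicted y for a given x using the fitted line."""
    return intercept + slope * new_x

x = [1, 2, 3, 4, 5]   # hypothetical predictor scores
y = [2, 4, 5, 4, 5]   # hypothetical outcome scores

slope, intercept = fit_line(x, y)
print(f"y-hat = {intercept:.1f} + {slope:.1f} * x")
print(f"prediction for x = 6: {predict(slope, intercept, 6):.1f}")
```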
Multiple regression
Multiple regression is used when there is more than one predictor variable. If you are predicting success in grad school you may use three predictor variables: GRE scores, university GPA, and IQ scores. Then you can predict success in grad school based on all three predictors, which usually is more accurate than using one predictor. It allows the researcher to simultaneously consider the influence of all the predictor variables on the outcome variable.
Standard multiple regression
Standard multiple regression (simultaneous multiple regression): enter all the predictor variables at the same time. You can predict grad school success by entering GPA, GRE, and IQ score simultaneously.
Stepwise multiple regression
Enter the predictor variables one at a time. First enter the predictor variable that correlates the highest with the outcome variable. Next, enter the variable that relates most strongly to the outcome variable after the first variable is entered; it will account for the highest amount of variance in the outcome variable once the first predictor variable is in the equation. This may or may not be the variable with the second highest correlation: if the variable with the second highest correlation is highly correlated with the first variable, then it may not predict a unique amount of the variance in the outcome variable. In short: enter the strongest predictor variable first, then add the predictor variable that contributes most strongly to the criterion variable GIVEN THAT THE FIRST PREDICTOR VARIABLE IS ALREADY IN THE EQUATION. See page 166, second-to-last paragraph.
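The first two steps of this selection rule can be sketched from correlations alone, using the standard formula for R² with two predictors. All correlation values below are hypothetical, and real stepwise procedures in statistics packages also apply entry and removal significance tests that are not shown here.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """R^2 when the outcome is predicted from two predictors at once."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical correlations of each predictor with grad school success.
r_with_outcome = {"GRE": 0.50, "GPA": 0.45, "IQ": 0.35}
# Hypothetical correlations among the predictors themselves.
r_between = {
    frozenset(("GRE", "GPA")): 0.80,
    frozenset(("GRE", "IQ")): 0.20,
    frozenset(("GPA", "IQ")): 0.30,
}

# Step 1: the predictor with the highest correlation with the outcome.
first = max(r_with_outcome, key=lambda name: abs(r_with_outcome[name]))

def gain(candidate: str) -> float:
    """Extra variance explained by the candidate once `first` is in the equation."""
    r_12 = r_between[frozenset((first, candidate))]
    combined = r_squared_two_predictors(
        r_with_outcome[first], r_with_outcome[candidate], r_12
    )
    return combined - r_with_outcome[first] ** 2

# Step 2: the remaining predictor that adds the most unique variance.
remaining = [name for name in r_with_outcome if name != first]
second = max(remaining, key=gain)

print("entered first: ", first)
print("entered second:", second)
for name in remaining:
    print(f"  gain from {name}: {gain(name):.3f}")
```

In this made-up example GPA has the second highest correlation with the outcome, but because it overlaps heavily with GRE it adds very little, so IQ is entered second.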
Hierarchical multiple regression
Enter the predictor variables in a predetermined order, based on hypotheses the researcher wants to test. This lets you partial out the effects of predictor variables entered in early steps to see whether other predictor variables still make a UNIQUE contribution to the variance in the outcome variable.
We want to determine the relation between drinking while pregnant and the child’s IQ score. But we know that mothers who drink while pregnant also tend to smoke and use other drugs while pregnant, which could also decrease the child’s IQ.
We can enter smoking and other drug use into the regression equation first and then enter drinking, to see whether, after smoking and other drug use are accounted for (partialled out), drinking uniquely predicts IQ scores above and beyond smoking and other drug use.
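The logic of the drinking example comes down to the change in R² between steps. The sketch below simplifies to a single control variable (smoking) so that R² can be computed from three correlations; the correlation values are invented, and a real analysis with several controls would fit the regression in a statistics package instead.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """R^2 when the outcome is predicted from two predictors at once."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical correlations (not real findings).
r_iq_smoking = -0.30        # child's IQ with maternal smoking
r_iq_drinking = -0.35       # child's IQ with maternal drinking
r_smoking_drinking = 0.50   # smoking with drinking

# Step 1: enter the control variable alone.
r2_step1 = r_iq_smoking ** 2
# Step 2: add drinking to the equation.
r2_step2 = r_squared_two_predictors(r_iq_smoking, r_iq_drinking, r_smoking_drinking)
# The increase is the variance in IQ that drinking explains beyond smoking.
r2_change = r2_step2 - r2_step1

print(f"step 1 R^2 (smoking only):       {r2_step1:.3f}")
print(f"step 2 R^2 (smoking + drinking): {r2_step2:.3f}")
print(f"R^2 change for drinking:         {r2_change:.3f}")
```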
Mediation effects
Occur when the effect of x on y is actually occurring because of a third variable, z. First enter the possible mediator variables. Then you can see whether x uniquely predicts variance in y after z is accounted for and partialled out (statistically removed). E.g., there is a correlation between drowning and eating ice cream, but this relation may be due to a mediator variable, summer (heat). We could first enter heat into the regression to determine how strongly heat is uniquely related to drowning; then, after heat is removed, we can determine whether eating ice cream is actually uniquely related to drowning.
Structural equations modeling
Allows you to test hypotheses about the pattern of correlations. The researcher makes precise predictions about how three or more variables are causally related (e.g., x causes y, which causes z). Then you can compare your hypothesized correlation matrix against the real correlation matrix. This analysis determines the degree to which the observed pattern of correlations matches, or fits, the researcher's predictions or model. You can also test two different models against each other to see which one fits best with the observed correlation matrix.
Factor Analysis
Analyze the interrelationships among a number of variables. Look for a pattern in the correlation matrix; look for correlations among the correlations. Can determine if some variables are all highly correlated with each other but not with other variables that may only correlate with each other.
Multiple correlation coefficient (R)
The ability of all the predictor variables together to predict the outcome variable. Represents the degree of the relationship between the outcome variable and the set of predictor variables. Ranges from .00 to 1.00; the larger the R, the better the set of predictor variables accounts for the variance in the outcome variable. R can be squared to show the percent of the variance in the outcome variable (y) that is accounted for by the set of predictor variables. E.g., R = .50 accounts for 25% of the variance in y.
Randomized groups factorial design
Participants are randomly assigned to one of the combinations of the independent variables.
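Random assignment to cells can be sketched by listing every combination of factor levels and dealing shuffled participants across them. The factor names, levels, participant IDs, and the fixed seed are all hypothetical choices made so the example is reproducible.

```python
import itertools
import random

def randomized_groups(participants, factors, seed=0):
    """Randomly assign each participant to one cell of a factorial design."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    cells = list(itertools.product(*factors.values()))
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled participants across the cells in turn so that
    # cell sizes stay as equal as possible.
    return {pid: cells[i % len(cells)] for i, pid in enumerate(shuffled)}

# Hypothetical 2 x 2 design.
factors = {"dose": ["low", "high"], "task": ["easy", "hard"]}
participants = [f"P{i:02d}" for i in range(1, 21)]

assignment = randomized_groups(participants, factors)
for pid in participants[:4]:
    print(pid, assignment[pid])
```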
Matched factorial design
Match participants based on their score on a measure related to the dependent variable. If there are 6 cells, take the six highest scorers and randomly assign each of them to a different cell, then do the same with the next six, and so on.
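The matching procedure can be sketched as: rank participants on the matching measure, cut the ranking into blocks as large as the number of cells, and randomly spread each block across the cells. The pretest scores, IDs, and seed below are made up.

```python
import random

def matched_assignment(scores, n_cells, seed=0):
    """Map each cell index to the participant IDs assigned to it."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    ranked = sorted(scores, key=scores.get, reverse=True)  # highest scorers first
    cells = {cell: [] for cell in range(n_cells)}
    for start in range(0, len(ranked), n_cells):
        block = ranked[start:start + n_cells]   # participants with similar scores
        cell_order = list(range(n_cells))
        rng.shuffle(cell_order)                 # random cell for each block member
        for pid, cell in zip(block, cell_order):
            cells[cell].append(pid)
    return cells

# Hypothetical pretest scores for 12 participants, 6-cell design.
scores = {f"P{i}": 100 - i for i in range(12)}
cells = matched_assignment(scores, n_cells=6)
for cell, members in cells.items():
    print(cell, members)
```

Each cell ends up with one participant from the top six scorers and one from the next six, so the cells start out roughly equivalent on the matching measure.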
Repeated measures factorial design
All participants complete all conditions (cells). So in a 2 x 2 (4 cells) each participant completes 4 conditions; in a 3 x 3 x 2 (18 cells) each participant completes all 18 conditions.
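The number of cells is the product of the number of levels of each independent variable, which a short sketch can confirm for the two designs on the card. The function names are illustrative.

```python
import itertools
import math

def number_of_cells(levels):
    """Number of conditions in a factorial design, e.g. (3, 3, 2) -> 18."""
    return math.prod(levels)

def list_conditions(levels):
    """Every combination of levels, with levels numbered from 1."""
    return list(itertools.product(*(range(1, k + 1) for k in levels)))

for design in [(2, 2), (3, 3, 2)]:
    label = " x ".join(str(k) for k in design)
    print(f"{label}: {number_of_cells(design)} cells")

print(list_conditions((2, 2)))
```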
Mixed factorial design
Has at least one between-subjects and one within-subjects variable. E.g., in a 2 x 2, one independent variable is between subjects (caffeine vs. no caffeine) and the other independent variable is within subjects (visual memory test, verbal memory test). Randomly assign participants to the between-subjects condition; all participants complete every level of the within-subjects variable.