Final exam Flashcards
bivariate correlations (ch. 8)
An association that involves exactly two variables; also called a bivariate association. There are three types of associations: positive, negative, and zero. To investigate an association, researchers need to measure the first variable and the second variable in the same group of people.
e.g. "Smoking is related to more happiness" is an association claim (supported by a correlation).
describing associations with categorical data (dating study)
For the association between marital satisfaction and online dating, the dating variable is categorical; its values fall in either one category or another.
A person meets their spouse either online or offline. The other variable in this association, marital satisfaction, is quantitative; 7 means more marital satisfaction than 6, 6 means more than 5, and so on.
graphing associations with one variable as categorical
Figure 8.3 is a scatterplot of data in which one variable (marital satisfaction) is quantitative and the other variable (where a person met his or her spouse) is categorical. The correlation between these two variables is r = .06, which is a small correlation.
Figure 8.4: Bar Graph of Meeting Location and Marital Satisfaction
It’s much more common to plot the results of an association between one quantitative variable and one categorical variable in a bar graph (Figure 8.4). In a bar graph, each individual is not represented by a data point. Instead, the graph shows the mean (average) marital satisfaction for those who met their spouse online and the mean for those who met their spouse offline. The online mean is slightly higher than the offline mean, corresponding to a weak association between the two variables.
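As a rough illustration of how such an association might be computed, here is a minimal Python sketch using simulated (made-up) satisfaction scores rather than the study's actual data; the group means and the small point-biserial r only mirror the pattern described above.

```python
# Hypothetical illustration (simulated data, not the dating study's):
# one categorical variable (met spouse online vs. offline) and one
# quantitative variable (marital satisfaction on a 1-7 scale).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
online = rng.normal(5.6, 1.0, 200)    # simulated satisfaction scores, online group
offline = rng.normal(5.5, 1.0, 200)   # simulated satisfaction scores, offline group

# Point-biserial r: the correlation between a dichotomous (0/1) variable
# and a quantitative variable.
group = np.concatenate([np.ones_like(online), np.zeros_like(offline)])
scores = np.concatenate([online, offline])
r, p = stats.pointbiserialr(group, scores)
print(f"r = {r:.2f}")                 # a small r, in the spirit of the .06 above

# A bar graph of this association would simply plot the two group means.
print(f"online mean = {online.mean():.2f}, offline mean = {offline.mean():.2f}")
```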
Cohen’s r guidelines for evaluating strength of associations
Psychologists sometimes use the terms weak, moderate, and strong to describe r values of .1, .3, and .5, respectively. (Both r and d are effect sizes: r describes the strength of a correlation; d describes the size of the difference between two group means.)
It’s better, however, to treat effect size as one indicator of the importance of a relationship, while recognizing that judgments about importance also depend on the context. Even a tiny effect size can be important.
e.g. At the Olympic level, a tiny adjustment to an athlete’s form or performance might mean the difference between earning a medal and not reaching the podium at all.
Cohen’s guidelines for evaluating association strength based on r: an r of about ±.10 is small (weak), about ±.30 is medium (moderate), and about ±.50 is large (strong).
Analyzing associations when one variable is categorical: what is a t test?
1) t test: a statistic to test the difference between two group averages.
Although it is possible to calculate an r value when at least one of your variables is categorical, it’s more common to use a t test to determine whether the group means are statistically different from one another.
2) a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups are different from one another.
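A minimal sketch of a t test in Python, using scipy.stats.ttest_ind on made-up satisfaction scores (the numbers below are assumptions for illustration, not data from any study):

```python
# Comparing two group means with an independent-samples t test.
from scipy import stats

online_satisfaction = [6, 7, 5, 6, 6, 7, 5, 6]    # invented scores, group 1
offline_satisfaction = [5, 6, 6, 5, 7, 5, 6, 5]   # invented scores, group 2

t, p = stats.ttest_ind(online_satisfaction, offline_satisfaction)
print(f"t = {t:.2f}, p = {p:.3f}")  # p < .05 would suggest the group means differ reliably
```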
correlational studies support association NOT causal claims
When the method of the study involved measuring both variables, the study is correlational, and therefore it can support an association claim. Association claims can be graphed using scatterplots or bar graphs, and they can be made using r or t tests. However, an association claim is not supported by a particular kind of statistic or graph; it is supported by a study design— correlational research—in which all the variables are measured.
e.g. The two variables are “being a parent or not” and “level of happiness.” Association claims can be supported by correlational studies. Why can we assume the study was correlational?
We can assume this is a correlational study because parenting and level of happiness are probably measured variables (it’s not realistically possible to manipulate them). In a correlational study, all variables are measured.
interrogating association claims
Construct validity: How well was each variable measured?
Statistical validity: How well do the data support the conclusion?
Internal validity: Can we make a causal inference from the association? (Not relevant for association claims; see below.)
External validity: To whom can the association be generalized?
The most important validities to interrogate for an association claim are construct validity and statistical validity.
Although internal validity is relevant for causal claims, not association claims, you need to be able to explain why correlational studies do not establish internal validity.
Construct validity of an association claim
Ask about the construct validity of each variable.
How well was each of the variables measured?
Does the measure have good reliability?
Is it measuring what it’s intended to measure?
What is the evidence for its face validity, for its concurrent validity, and for its discriminant and convergent validity?
For example: In the Mehl study, you would ask questions about the researchers’ operationalizations of deep talk and well-being.
Recall that deep talk in this study was observed via the EAR recordings and coded later by research assistants, while well-being was measured using the subjective well-being (SWB) scale (self-report)
Statistical validity of an association claim - effect size
Not all associations are equal; some are stronger than others. The term effect size describes the strength of a relationship (association) between two or more variables.
e.g. In Figure 8.6, both associations are positive, but B is stronger than A (its r is closer to 1), so B depicts a larger effect size.
We use Cohen’s guidelines for labeling effect size as small, medium, or large.
Statistical validity of an association claim - predictions
Strong effect sizes enable more accurate predictions. The more strongly correlated two variables are, the more accurate our predictions can be. Both scatterplots here depict positive correlations. Which scatterplot shows the stronger relationship? Part A does, which means we can make more accurate predictions from the data in Part A.
In other words, we can more accurately predict an individual’s score on one variable when given the score on the other variable. Conversely, we make more prediction errors as associations become weaker as in Part B. Both positive and negative associations can allow us to predict one variable when given the other variable.
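A hedged sketch of why stronger correlations support more accurate predictions, using simulated data: the stronger the r, the smaller the typical error around a simple regression line.

```python
# Simulated data; variable names and slopes are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 500)
y_strong = 0.8 * x + rng.normal(0, 0.6, 500)    # strong positive association
y_weak   = 0.2 * x + rng.normal(0, 0.98, 500)   # weak positive association

for label, y in [("strong", y_strong), ("weak", y_weak)]:
    r = np.corrcoef(x, y)[0, 1]
    slope, intercept = np.polyfit(x, y, 1)       # simple regression line
    errors = y - (slope * x + intercept)         # prediction errors (residuals)
    print(label, f"r = {r:.2f}", f"typical prediction error = {errors.std():.2f}")
```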
Statistical validity of an association claim: statistical significance
Statistical significance refers to the conclusion researchers make regarding how probable it is that they would get a correlation of that size by chance, assuming that there is no correlation in the real world.
e.g. It is notable that the 95% CI for the association between sitting and MTL thickness [–.64, –.07] does not include zero. The CI for meeting one’s spouse online [.05, .07] doesn’t include zero either. In both of these cases, we can infer that the true relationship is unlikely to be zero. When the 95% CI does not include zero, it is common to say that the association is statistically significant. A statistically significant correlation is one that is unlikely to have come from a population in which the association is zero.
Logic of statistical inference (Statistical validity of an association claim)
- Researchers collect data from a sample and make inferences to the population. Typically the sample mirrors what is happening in the population, but this isn’t always the case. Calculations of statistical significance help researchers evaluate the probability that the result came from a population in which the association is actually zero.
- If there’s an association between two variables in the population, then there is usually an association in the sample.
- If there’s no association between the two variables in the population of interest, then there’s probably no association in the sample.
- But sometimes, even if there isn’t an association in the population, simply by chance there may be an association in the sample.
What does statistically significant result mean? (Statistical validity of an association claim)
A probability estimate (or p value) provides information about statistical significance by evaluating the probability that the association in the sample came from a population in which the association is zero. If p is very small (less than .05, i.e., 5%), then it’s very unlikely that the result came from a zero-association population. Thus, a finding of p < .05 is considered statistically significant.
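A minimal sketch, assuming simulated data, of how r, its p value, and a 95% CI (via Fisher's z transformation) fit together:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(0, 1, 100)
y = 0.3 * x + rng.normal(0, 1, 100)   # a modest true association, invented for illustration

r, p = stats.pearsonr(x, y)
print(f"r = {r:.2f}, p = {p:.4f}")    # a p below .05 is conventionally called "statistically significant"

# 95% CI for r via Fisher's z transformation
n = len(x)
z = np.arctanh(r)
se = 1 / np.sqrt(n - 3)
lo, hi = np.tanh([z - 1.96 * se, z + 1.96 * se])
print(f"95% CI [{lo:.2f}, {hi:.2f}]")  # if the CI excludes zero, the association is significant
```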
What does a statistically non-significant result mean? (Statistical validity of an association claim)
If p is relatively high (greater than .05), then the result is non-significant (not statistically significant). Therefore, we can’t rule out the possibility that the result came from a zero-association population.
Outliers meaning? (Statistical validity of an association claim).
Outlier: an extreme score (or perhaps a few) that lies far away from the rest of the scores. The two scatterplots in Figure 8.10 are identical except for the outlier in the upper right-hand corner in the top scatterplot. The correlation coefficient for the top graph is r = .37 and for the bottom graph is r = .26. Outliers can cause problems for association claims because they may exert a large amount of influence. In bivariate correlations, outliers are most problematic when they involve extreme scores on both variables.
Outliers are most influential when the sample is small (see Figure 8.11). The two scatterplots are identical except for the outlier. Removing the outlier changes the correlation from .49 to .15, which is much bigger than the change from .37 to .26.
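A brief simulated illustration of both points: a single case that is extreme on both variables pulls r around, and it does so much more in a small sample than in a large one.

```python
import numpy as np

rng = np.random.default_rng(3)

def r_with_and_without_outlier(n):
    # Two unrelated variables, plus one case that is extreme on both.
    x = rng.normal(50, 10, n)
    y = rng.normal(50, 10, n)
    x_out = np.append(x, 120)
    y_out = np.append(y, 120)
    return np.corrcoef(x, y)[0, 1], np.corrcoef(x_out, y_out)[0, 1]

for n in (20, 500):
    without, with_out = r_with_and_without_outlier(n)
    print(f"n = {n}: r without outlier = {without:.2f}, with outlier = {with_out:.2f}")
```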
Restriction of range (Statistical validity of an association claim).
When there is not a full range of scores on one of the variables in a correlational study, the correlation can appear weaker than it really is.
Figure 8.13: Restriction of Range Underestimates the True Correlation
SAT scores can range from 600 to 2,400, but College S only admits students who score 1,800 or higher (restriction of range; Figure 8.13A). Thus, the range is restricted to 1,800–2,400. We can see what the scatterplot would look like if the range were not restricted (Figure 8.13B). In the restricted-range scatterplot, r = .33; in the scatterplot where the range is not restricted, the correlation is stronger (r = .57). What can researchers do about restriction of range? There are statistical techniques that allow correction for restriction of range, or researchers can recruit more participants to try to widen the range.
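A simulated sketch of restriction of range (the 1,800 cutoff mirrors the College S example, but the scores below are made up): the correlation in the restricted sample comes out noticeably weaker than in the full range.

```python
import numpy as np

rng = np.random.default_rng(4)
sat = rng.uniform(600, 2400, 2000)                               # full range of SAT scores
gpa = 2.0 + (sat - 600) / 1800 * 1.5 + rng.normal(0, 0.4, 2000)  # a true positive association

full_r = np.corrcoef(sat, gpa)[0, 1]
admitted = sat >= 1800                                           # College S admits only 1,800+
restricted_r = np.corrcoef(sat[admitted], gpa[admitted])[0, 1]
print(f"full range r = {full_r:.2f}, restricted range r = {restricted_r:.2f}")  # restricted r is smaller
```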
Curvilinear association (Statistical validity of an association claim).
An association in which the relationship between the two variables is not a straight line; the correlation coefficient may be zero (or close to zero) even though the variables are related.
e.g. As people’s age increases, their use of the health care system decreases, but as they approach 60 years of age and beyond, health care use increases again. The correlation coefficient is r = .01, which doesn’t adequately capture the curvilinear nature of the relationship. However, the scatterplot can inform us about curvilinearity in cases where the correlation coefficient suggests that there is no correlation.
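A simulated sketch of a curvilinear (U-shaped) association in which Pearson's r comes out near zero even though the two variables are clearly related; the age and health-care-use numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
age = rng.uniform(0, 90, 1000)
health_care_use = (age - 45) ** 2 / 100 + rng.normal(0, 3, 1000)  # U-shaped pattern

r = np.corrcoef(age, health_care_use)[0, 1]
print(f"r = {r:.2f}")   # close to zero; a scatterplot would reveal the U shape
```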
To establish causation, a study must satisfy three criteria (the three causal criteria):
1. Covariance of cause and effect. The results must show a correlation, or association, between the cause variable and the effect variable.
2. Temporal precedence. The method must ensure that the cause variable preceded the effect variable; it must come first in time.
3. Internal validity. There must be no plausible alternative explanations for the relationship between the two variables.
More on internal validity: When is the potential third variable a problem?
External validity of an association claim?
- How important is it?
Does the association generalize to other people, places, and times? It is important to note that the size of the sample does not matter as much as the way the sample was selected from the population of interest.
What’s moderator (moderating variables)?
When the relationship between two variables changes depending on the level of another variable, that other variable is called a moderator.
Example: Consider a study on the correlation between attendance at professional sports games and the success of the team. Using data gathered over many Major League Baseball seasons, Oishi and his team found that in cities with high residential mobility, there is a positive correlation between success and attendance; people in those cities are more likely to attend games when the team is having a winning season. In cities with low residential mobility, there is no significant correlation between success and attendance; for example, Pittsburgh Pirates fans attend games regardless of how successful the season is. We say that residential mobility moderates the relationship between success and attendance.
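A hedged sketch of how a moderator analysis like this could look in code: the success-attendance correlation is computed separately at each level of residential mobility. The slopes, attendance figures, and group labels are assumptions for illustration, not Oishi's data.

```python
import numpy as np

rng = np.random.default_rng(6)

def simulate_city(slope, n=50):
    # Invented seasons: success = winning percentage, attendance depends on it (or not).
    success = rng.uniform(0.3, 0.7, n)
    attendance = 20000 + slope * success * 10000 + rng.normal(0, 2000, n)
    return success, attendance

high_mobility = simulate_city(slope=1.5)   # hypothetical high-mobility city
low_mobility = simulate_city(slope=0.0)    # hypothetical low-mobility city

for label, (success, attendance) in [("high mobility", high_mobility),
                                     ("low mobility", low_mobility)]:
    r = np.corrcoef(success, attendance)[0, 1]
    print(f"{label}: r = {r:.2f}")   # positive in the high-mobility city, near zero in the other
```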
Simple experiments (ch. 10)
Let’s begin with two examples of experiments that supported valid causal claims.
Example 1: Taking Notes
In the Mueller and Oppenheimer study, students were randomly assigned to take notes either on a laptop or in longhand. Having selected five different TED talks on interesting topics, the researchers showed one of the lectures on a video screen and told the students to take notes using their assigned method. After the lecture, students spent 30 minutes on a distractor activity. Then they were tested on what they had learned from the TED talk.
The results Mueller and Oppenheimer obtained are shown in Figure 10.2. Students in both the laptop and the longhand groups scored about equally on the factual questions, but the longhand group scored higher on the conceptual questions.
Example 2: Eating Pasta
Some researchers at Cornell University conducted an experiment to see if serving bowl size has an effect on portion size. Participants were randomly assigned to either the “large bowl” or “medium bowl” condition.
Each participant’s plate was weighed before he or she ate the pasta and afterward to determine the amount of pasta consumed. The graph on the left shows that participants took more pasta from the large serving bowl than from the medium one and they consumed about 140 calories more.
The researchers concluded that the size of the serving bowl influenced how much pasta people served themselves and how much they ate.
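A minimal sketch of how the analysis of a simple experiment like this might look, assuming simulated calorie counts (the roughly 140-calorie difference echoes the result described above, but the individual numbers are invented): random assignment defines the two conditions, and a t test compares the group means.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_per_group = 30

# Random assignment to conditions (the manipulated/independent variable);
# calories consumed is the measured/dependent variable.
large_bowl = rng.normal(520, 120, n_per_group)    # assumed values for illustration
medium_bowl = rng.normal(380, 120, n_per_group)

t, p = stats.ttest_ind(large_bowl, medium_bowl)
print(f"mean difference = {large_bowl.mean() - medium_bowl.mean():.0f} calories, "
      f"t = {t:.2f}, p = {p:.4f}")
```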
independent vs dependent variable?
Independent variable: the manipulated variable. The researcher assigns participants to a particular level of the variable. Example: note-taking method (levels: laptop, longhand) was the IV in the note-taking study.
Dependent variable: the measured variable (outcome). The researcher records what happens in terms of behavior or attitudes, based on self-reports, behavioral observations, or physiological measures. Example: number of anagrams solved correctly.