2bivar Flashcards
Strength of r
+/- .10 = small, or weak; +/- .30 = medium, or moderate; +/- .50 = large, or strong
Point-biserial correlation
A statistical test used for evaluating the association between one categorical variable (with two levels) and one quantitative variable.
Phi coefficient
A statistical test used for evaluating the association between two categorical variables, each with two levels.
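A minimal sketch of the phi coefficient, assuming both categorical variables have exactly two levels so the data fit a 2x2 table of counts. The function name and the counts are hypothetical.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table of counts laid out as [[a, b], [c, d]]."""
    # Denominator: square root of the product of the row and column totals.
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom

# Hypothetical counts: rows = group 1 / group 2, columns = yes / no
phi = phi_coefficient(30, 10, 15, 25)
print(round(phi, 3))
```

The result equals the Pearson's r you would get by coding both variables as 0 and 1.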
Considerations for investigating statistical validity
Need to consider: the effect size; the statistical significance of the relationship; any subgroups or outliers; whether a zero relationship might actually be curvilinear.
Statistically Significant
A conclusion that a result is extreme enough that it is unlikely to have happened by chance if the null hypothesis is true. Statistical significance calculations help researchers evaluate the probability that the result came from a population in which the association is really zero. How often would we get an r of 0.24 just by chance, even if there is no association in the population?
P Value
The p value helps evaluate the probability that the sample’s association came from a population in which the association is zero. If the probability is less than 5% (p < .05), the result is conventionally considered statistically significant.
Spurious
An association that is attributable only to systematic mean differences on subgroups within the sample. When you consider the subgroups separately, there is no association (or the opposite association appears). More broadly, a relationship in which two events or variables have no direct causal connection, yet it may be wrongly inferred that they do, due to either coincidence or the presence of a certain third, unseen factor (referred to as a “confounding factor” or “lurking variable”). Suppose there is found to be a correlation between A and B. Aside from coincidence, there are three possible relationships: A causes B, B causes A, or C causes both A and B.
Outlier
One or a few cases that stand out as either much higher or much lower than most of the other scores in a sample. In a bivariate correlation, outliers are mainly problematic when they involve extreme scores on both of the variables. Outliers matter the most when a sample is small.
The statistically valid way to analyze a curvilinear relationship is to use:
a quadratic model: test the correlation between one variable and the square of the other
Causation: Directionality problem
A situation in which it is unclear which variable in an association came first.
Causation: Third-variable problem
A situation in which plausible alternative explanations exist for the association between two variables.
Moderator
A third variable that, depending on its level, changes the relationship between two other variables. Ex 1: Marital status moderates the relationship between maternal employment and child achievement. Ex 2: Weekend/weekday status does not moderate the relationship between small talk and well-being.
Moderators vs Subgroups
When we are asking about moderators, our goal is to ask whether the association between the two variables is different within the levels of some third variable (the moderator). When we are asking about subgroups, our goal is to make sure that the overall association between the two variables is the same within the two subgroups.
Spurious Correlation example
The example we discussed in class is consistent with this definition, though usually spurious is a term reserved for situations where the correlation is an artifact of how the data was collected or analyzed specifically. I’ll go over the example from class again (which refers to figures on slides #24 and 25 from the Bivariate correlational research section). Imagine that you have a large sample of college students and you measure the number of times each student skips class per semester, along with his or her GPA for that semester. You make a scatterplot, which looks something like the one in the first figure. Most people would be surprised by this outcome: More absences are associated with higher grades? But notice that this scatterplot lumps all students together, whereas there are four subgroups being studied: freshmen, sophomores, juniors, and seniors. Now consider that the seniors, because they are taking classes in their own major, tend to get higher grades overall. The freshmen, because they are taking many required classes and are still trying to find a major that matches their talents, get lower grades overall. In addition to getting higher grades, the seniors are skipping more classes than the conscientious freshmen (exhibiting what is often called “senioritis”). The relationship between skipping classes and achieving higher grades is spurious; they are not related because of a direct relationship between the two variables, but because they are related to another variable (year in school). Because there are systematic differences between the subgroups (based on year in school) on both variables, there appears to be an overall relationship between those two variables.