Psych Stats Exam #4 Flashcards
How is correlational research different from experimental?
1) no manipulation of the IV
2) no random assignment
3) at least 2 DVs measured
Purpose of correlational research
to explore association between variables
Correlation definition
the linear association between variables
What does the correlation coefficient provide?
An indicator of a linear relationship
Visualizing correlation
scatterplots: each point represents two measurements of the same person
Things to look for in a scatterplot
- direction
- scatter/dispersal
- shape
Negative correlation
subjects with high scores on one variable tend to have low scores on the other variable
“when a score of X is above the mean of X, scores of Y will tend to be below the mean of Y” (and vice versa)
Positive Correlation
subjects with high scores on one variable tend to have high scores on the other variable (or low/low)
“when a score of X is above the mean of X, scores of Y will tend to be above the mean of Y” (and below/below)
Correlation coefficient definition (r)
statistic that quantifies the linear relationship between two variables
“a measure of the tendency for paired scores to vary systematically”
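A minimal sketch of how r can be computed from paired scores, using the standard deviation-score formula. The absences and exam scores are made-up illustration data, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of paired scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Made-up data: absences and exam scores for 5 students
absences = [0, 1, 2, 4, 6]
scores = [95, 90, 85, 70, 60]
r = pearson_r(absences, scores)  # strongly negative: more absences, lower scores
```

The sign of the result gives the direction and its absolute value gives the strength of the linear relationship.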
What does the sign of r tell us?
- direction NOT magnitude
R value ranges
from -1 to +1
- the absolute value tells us magnitude
Perfect linear relationship
+1 or -1 (usually doesn’t exist in nature)
R effect size guidelines
small: 0.1
medium: 0.3
large: 0.5
R as a descriptive statistic
describes effect size
R as an inferential statistic
you can compare it to a critical value to see whether it falls in the rejection region
Null hypothesis of correlation
there is not a linear relationship between A and B (r = 0)
r for a population
ρ (rho)
degrees of freedom for correlation
df(r) = N-2
- N = number of pairs of observations (20 individual scores = 10 pairs, so df = 8)
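One common way to test r for significance is to convert it to a t statistic with df = N - 2 and compare that to a critical t. The sketch below assumes example values of r = -0.85 from 10 pairs. The critical value 2.306 is the standard two-tailed t-table entry for df = 8 at alpha = .05; `r_to_t` and `T_CRITICAL_05` are hypothetical names.

```python
import math

def r_to_t(r, n_pairs):
    """Convert a correlation to a t statistic with df = N - 2."""
    df = n_pairs - 2
    t = r * math.sqrt(df / (1 - r ** 2))
    return t, df

# Two-tailed critical t at alpha = .05, copied from a standard t table.
# Only df = 8 is included here; a real analysis would look up its own df.
T_CRITICAL_05 = {8: 2.306}

t, df = r_to_t(-0.85, 10)
# Reject the null when |t| is larger than the critical value
significant = abs(t) > T_CRITICAL_05[df]
```

Many textbooks instead provide a table of critical r values indexed by df, which leads to the same decision.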
Example of correlation write-up
“There is a statistically significant negative correlation - a negative linear relationship - between number of absences and exam score, r(8) = -0.85, p < 0.05. The more classes students miss, the worse they tend to perform on the exam.”
Correlation does not equal causation
Factors that influence r
1) truncated range
2) outliers
3) nonlinear relationships
Truncated range
zooming in on one group of people (ex: just high or low scores)
- can alter correlation: misrepresents the true strength of the existing relationship by restricting the range of scores in the sample
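A small illustration of truncated range, using made-up scores. It computes r on the full sample and again after keeping only the high scorers on X; `pearson_r` is a hypothetical helper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation from deviation scores."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Made-up scores: y rises with x, with a small alternating wobble
x = list(range(1, 11))
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

r_full = pearson_r(x, y)

# Truncate the range: keep only the people with high x scores (x >= 8)
kept = [(a, b) for a, b in zip(x, y) if a >= 8]
r_truncated = pearson_r([a for a, _ in kept], [b for _, b in kept])
```

In this data set the full-range correlation is about .94, while the truncated one drops to about .65, even though the underlying relationship is the same.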
Outliers and small sample sizes
can mask or exaggerate a relationship between variables
- with a small sample size, outliers heavily affect results
- extremity of outlier: very extreme outliers have larger influences
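A sketch of how a single extreme outlier can exaggerate a relationship in a small sample. The data are made up and `pearson_r` is a hypothetical helper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation from deviation scores."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Five made-up pairs with almost no linear relationship
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r_without = pearson_r(x, y)

# Add one extreme pair far from the rest of the data
r_with = pearson_r(x + [20], y + [20])
```

Here r moves from roughly .14 to roughly .97 because of one added point. An outlier placed in the opposite corner could just as easily mask a real relationship.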
Pearson’s correlation coefficient
for linear relationships only
used for parametric tests (scale DV)
Examples of nonparametric inferential tests
- chi-squared tests
- Mann-Whitney U test
Spearman’s correlation
used in nonparametric tests
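Spearman’s correlation can be computed by converting each variable to ranks and then applying the Pearson formula to the ranks. The sketch below assumes tied scores share the average of their ranks; the data and the helper names are made up for illustration.

```python
import math

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def spearman_rho(xs, ys):
    """Pearson correlation applied to the ranks of each variable."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up data: y always increases with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

Because the ranks line up perfectly, Spearman’s value is 1 here, while Pearson’s r is lower since the raw scores do not fall on a straight line.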
When do we use nonparametric tests?
1) When assumptions of parametric tests are not met (population skewed or nonlinear)
2) small sample sizes (usually under 30)
3) DV is not scale (ordinal and nominal)
Disadvantages of nonparametric tests
1) tend to have low statistical power (higher probability of type II error)
Chi-squared test
- used when we only have a nominal variable
“how different are the observed values from the expected values under the null hypothesis”
What is “O”
observed value
What is “E”
expected value (under the null hypothesis)
What is Σ
sigma: summation
What is χ²
chi-squared: test statistic
Types of Chi-Squared tests:
1) chi-squared test for goodness of fit: one nominal variable, 2+ categories
- df = number of categories - 1
2) chi-squared test for independence: 2 nominal variables
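Both tests use the same statistic, χ² = Σ (O − E)² / E. The sketch below works through each type with made-up counts. Two details are assumptions added here: the goodness-of-fit null is taken to be equal preference across categories, and the independence test uses the standard expected count (row total × column total / N) with df = (rows − 1)(columns − 1).

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories or cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Goodness of fit: made-up counts of 60 students choosing among 3 study methods
observed = [30, 20, 10]
expected = [sum(observed) / len(observed)] * len(observed)  # 20 in each category
chi2_fit = chi_squared(observed, expected)  # 10.0
df_fit = len(observed) - 1                  # number of categories - 1

# Independence: made-up 2 x 2 table of counts for two nominal variables
table = [[30, 10],
         [20, 40]]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)
# Expected count for each cell = row total * column total / N
expected_cells = [[r * c / n for c in col_totals] for r in row_totals]
chi2_ind = chi_squared(
    [o for row in table for o in row],
    [e for row in expected_cells for e in row],
)
df_ind = (len(table) - 1) * (len(table[0]) - 1)
```

Each χ² value would then be compared to the critical value for its df in a chi-squared table.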
Misuse of NHST parts
1) failure to control for bias
2) low statistical power
3) poor quality control
4) p-hacking
5) publication bias
What is the replication crisis?
ongoing methodological crisis in which many psychological findings have proven difficult to replicate or reproduce
- reproducibility: obtaining consistent results using the same original data, methodology, and analysis
- replicability: obtaining consistent results across several studies that aim to answer the same question with different data
Open science collaboration
- attempted to reproduce the findings of 100 journal articles
- 270 scientists
- only 39% replicated
Power posing
- only self-reported feelings replicated, no physiological impact
Smiling makes you happier
did not hold up
P value definition
the probability of your observed results (or results more extreme) occurring if the null is true
Why reliance on P-value can be misleading
1) can result in binary thinking: 0.049 is significant but 0.051 is not
2) statistical significance is not necessarily meaningful (need to look at effect size)
Tools to use besides P-values
1) confidence intervals: a range around the sample mean that shows how precisely it estimates the true population mean
- small interval = better precision
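A rough sketch of a confidence interval around a sample mean. It assumes a normal (z) approximation to keep the example dependency-free; with a sample this small, a course would normally use a critical t value instead, which gives a somewhat wider interval. The scores and the name `mean_ci` are made up.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_ci(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    m = mean(sample)
    se = stdev(sample) / math.sqrt(len(sample))  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m - z * se, m + z * se

scores = [72, 75, 78, 80, 81, 83, 85, 88, 90, 94]  # made-up exam scores
low, high = mean_ci(scores)
low80, high80 = mean_ci(scores, confidence=0.80)
```

The interval is centered on the sample mean, and its width shrinks when the standard error is smaller (larger samples or less variability) or when a lower confidence level is chosen.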
Significant result but small effect size
something may be there but not meaningful
Not significant result but large effect size
might indicate you missed something (type II error) - might indicate low power
Ways to increase power (can be abused for p-hacking)
1) use a higher alpha
2) use a one-tailed hypothesis instead of two
3) increase sample size
4) somehow reduce variability
5) somehow make the difference between population means bigger
P-hacking (definition)
the misuse of data analysis to find and report statistically significant effects
- data dredging, data snooping, significance chasing
Ways to P-hack
1) trimming data sets (get rid of outliers, zooming in)
2) adjusting values in the data set (what you think participants “mean”)
3) significance chasing: adding a few more participants at a time until the result becomes statistically significant
4) selective reporting: running many analyses but only reporting the ones that showed the desired effect
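To see why running many analyses and reporting only the significant ones is a problem, the arithmetic below shows the chance of at least one false positive when every null hypothesis is true. It assumes the tests are independent; the function name is hypothetical.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """P(at least one p < alpha) across independent tests when every null is true."""
    return 1 - (1 - alpha) ** num_tests

for k in (1, 5, 20):
    # 1 test -> 0.05, 5 tests -> about 0.226, 20 tests -> about 0.642
    print(k, round(chance_of_false_positive(k), 3))
```

With 20 analyses, a "significant" result somewhere is more likely than not even if nothing real is going on.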
Debunking published research
very hard - once we see reported evidence, it is hard to change our perceptions
Publication bias
journals tend to publish significant results - may lead researchers to engage in shady research practices
- leads to a biased, incomplete understanding: important to know what is NOT different as well
- “file drawer problem”
Best Practices
- publish what you plan to collect and analyze so you don’t adjust it later
- people held accountable
Simple regression
use data to produce an equation for a straight line that captures the trend of the data
- used to make predictions about Y given a particular X score
Multiple Regression
use data to produce an equation for a line including MANY variables
- multiple predictor variables
- can compare strength of different variables on how they jointly affect Y
IV in regression
predictor variable
DV in regression
outcome/criterion variable
Line of best fit
captures the best trend of the data
Simple linear regression equation
ŷ = a + bX
ŷ = predicted score on outcome
a = intercept
b = slope of regression line (predicts change in Y for an increase of 1 unit in X)
b = unstandardized regression coefficient
- can not flip variables and get same regression
Ordinary least squares (OLS) estimation
used to draw the line that minimizes the sum of squared errors/residuals
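A minimal sketch of fitting ŷ = a + bX by least squares, using the standard formulas b = SP / SSx and a = mean(Y) − b·mean(X). The data are made up and `fit_line` is a hypothetical name.

```python
def fit_line(xs, ys):
    """Return intercept a and slope b of the least-squares line y-hat = a + bX."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sum of squares for X
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x
    a = mean_y - b * mean_x
    return a, b

# Made-up data: absences (X) and exam scores (Y) for 5 students
absences = [0, 1, 2, 4, 6]
scores = [95, 90, 85, 70, 60]

a, b = fit_line(absences, scores)
predicted = a + b * 3  # predicted exam score for a student with 3 absences

# Swapping predictor and outcome gives a different line
a_flip, b_flip = fit_line(scores, absences)
```

Here each additional absence predicts a drop of about 6 points. The flipped fit has a different slope and intercept, which is why the predictor and outcome cannot be swapped freely.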
standardized beta
a 1 standard deviation increase in (IV) is related to (beta value) standard deviation increase in (DV)
- used in multiple regressions
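A standardized beta can be obtained by rescaling the unstandardized slope: beta = b × (SD of X / SD of Y). The sketch below shows this for a single predictor with made-up data, where beta equals Pearson’s r. In multiple regression each predictor’s b is rescaled the same way, but the b values themselves come from a model containing all predictors, which is not shown here.

```python
from statistics import mean, stdev

def slope(xs, ys):
    """Unstandardized least-squares slope b for one predictor."""
    mx, my = mean(xs), mean(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    return sp / ss_x

def standardized_beta(xs, ys):
    """Rescale b into standard deviation units: beta = b * SD(X) / SD(Y)."""
    return slope(xs, ys) * stdev(xs) / stdev(ys)

# Made-up data: absences (X) and exam scores (Y)
absences = [0, 1, 2, 4, 6]
scores = [95, 90, 85, 70, 60]
beta = standardized_beta(absences, scores)
```

Because beta is expressed in standard deviation units, predictors measured on different scales can be compared directly.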
Write up for multiple regression (beta)
“Controlling for all other measured variables (TV exposure, age, lower grades, parent education and education aspirations), exposure to sexual content on TV is still a significant predictor of pregnancy”
all variables relate to one another
Regression can not:
1) establish temporal precedence: do not know what came first (can not determine cause and effect)
2) control for variables that aren’t measured (can not measure all the variables in the world)
How is regression different from correlation?
Correlation: association between 2 variables
Regression: prediction of DV using IV
When to use Mann-Whitney U
test for significant difference between two independent samples (two levels of IV, ordinal/nominal DV)
- parametric partner: independent samples t-test
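A sketch of the U statistic, computed by counting, for every pair of scores across the two groups, how often a score in the first group beats one in the second (ties count as half). Reporting the smaller of the two U values follows the common table-lookup convention. The data and function name are made up.

```python
def mann_whitney_u(group1, group2):
    """Return the smaller of U1 and U2 from pairwise comparisons."""
    u1 = 0.0
    for a in group1:
        for b in group2:
            if a > b:
                u1 += 1      # group1 score outranks group2 score
            elif a == b:
                u1 += 0.5    # ties are split evenly
    u2 = len(group1) * len(group2) - u1
    return min(u1, u2)

# Made-up scores for two independent samples
group_a = [12, 15, 18, 20]
group_b = [14, 16, 22, 25, 30]
u = mann_whitney_u(group_a, group_b)  # 5.0
```

A small U means the two groups’ scores barely overlap in rank order. The value would be compared to a critical U for the two sample sizes.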
When to use Wilcoxon signed-rank T-test
Test for significant difference between two paired samples (two levels of IV, nominal/ordinal DV)
- parametric partner: paired samples t-test
When to use Wilcoxon-Wilcox comparison test
Test for significant differences among all pairs of independent samples (three levels of IV, and ordinal/nominal DV)
- parametric partner: one-way ANOVA, Tukey HSD tests
When to use Spearman correlation coefficient
Describe the degree of correlation between two variables (nominal/ordinal DV)
- parametric partner: Pearson coefficient (r)
When is the mean larger than the median?
Positively skewed data
When is the mean smaller than the median?
Negatively skewed data
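A quick check with made-up scores: the mean is pulled toward the long tail, so it sits above the median when the tail is on the high end and below it when the tail is on the low end.

```python
from statistics import mean, median

# Long tail toward high scores (positive skew)
positively_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Long tail toward low scores (negative skew)
negatively_skewed = [1, 17, 18, 18, 19, 19, 19, 20]

pos_mean, pos_median = mean(positively_skewed), median(positively_skewed)
neg_mean, neg_median = mean(negatively_skewed), median(negatively_skewed)
```

In the first list the mean (4.75) is larger than the median (3); in the second the mean (16.375) is smaller than the median (18.5).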
Descriptive Statistics
Summarizing a distribution of data with a single number - conclusions you draw from numbers
Sample size and rejecting the null
Sample size increase: easier to reject the null
number describing the population
- μ (mu) = mean
- ŝ (s-hat) = estimate of the population standard deviation (σ)
number describing sample
- mean = M
- S = standard deviation
Practical use of power
- can be used to determine the sample size required to detect an effect size
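A rough sketch of a sample-size calculation. It assumes a two-tailed comparison of two independent group means, an effect size expressed as Cohen’s d, and a normal approximation; none of these specifics come from the cards, and exact t-based calculations give slightly larger numbers. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed in each of two groups."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical z for a two-tailed test
    z_power = z.inv_cdf(power)          # z matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)

needed_medium = n_per_group(0.5)  # 63 per group under these assumptions
needed_small = n_per_group(0.2)
needed_large = n_per_group(0.8)
```

Smaller effects require many more participants to reach the same power, which is why power analysis is done before collecting data.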
Type I error
False alarm: you said yes but there is no effect
Type II error
Miss: you missed an effect that was actually there
Statistical Power definition
the probability that we will correctly reject the null when we should
What is NHST?
Null-hypothesis significance testing
Testing against a null hypothesis (no effect or difference) to see how unlikely your results would be if the null were true
Robust parametric tests
When an assumption of a parametric test is violated, but the test still operates (mostly) as intended
- The tests we’ve covered this semester are robust against the assumption of normality
Spearman vs Pearson Correlation
- Pearson (parametric): scale DV
- Spearman (nonparametric): nominal/ordinal DV
If assumptions of a parametric test are met and you use a nonparametric test you are more likely to…
make a Type II error