Lecture 5 Flashcards
What does the null hypothesis basically say?
There is no difference
What does the alternative hypothesis say?
There is a difference
What is a Type 1 error?
False Positive: rejecting the null hypothesis when it’s true
IOW: There’s no difference but you wrongly say that there is.
What is a Type 2 error?
False Negative: failing to reject the null hypothesis when it’s false
IOW: There’s a difference but you wrongly say there isn’t
What data do parametric statistics analyze?
Quantitative
What are examples of parametric statistics?
t-test, ANOVA, Pearson correlation, linear regression
When do we use parametric statistics?
- Data must meet assumptions for the model to be correct
- Based on a theoretical distribution (usually the normal distribution), so the data need to be normally distributed
What data is non-parametric statistics used to analyze?
Qualitative data (nominal or ordinal)
What are examples of non-parametric statistics?
Spearman’s rho, Mann-Whitney U, Friedman’s ANOVA, Wilcoxon signed-ranks
When do we use non-parametric statistics?
- When we have violated assumptions
- When we have nominal or ordinal data
What is a linear regression?
One predictor variable and one outcome variable
- significant relationship if the slope does not equal 0
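A minimal sketch of where the slope comes from, using ordinary least squares with one predictor and one outcome. The data (hours studied and exam score) and the function name `fit_line` are made up for illustration. Whether the slope is significantly different from 0 still needs the t-test on the slope that the stats software reports.

```python
from statistics import mean

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: hours studied (predictor) and exam score (outcome).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, score)
print(f"predicted score = {intercept:.1f} + {slope:.1f} * hours")
```

A slope of 0 would mean a flat line: knowing the predictor tells you nothing about the outcome.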
What are the 5 parametric assumptions for t-test or one-way ANOVA?
- Interval/ratio (I/R) data
- Normality
- Homogeneity of Variance (HOV)
- Free of Extreme outliers
- Independence of observations
What are 3 ways to test for normality?
- Histograms
- Skewness/kurtosis
- Shapiro-Wilk Test
When does skewness/kurtosis tell you normality is NOT met?
if skewness/kurtosis is >2 or <-2
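A rough sketch of the ±2 rule of thumb. It uses the simple moment-based formulas for skewness and excess kurtosis; SPSS applies small-sample adjustments, so its values will differ somewhat. The function names and data are hypothetical.

```python
from statistics import mean

def skewness(data):
    """Moment-based skewness: 0 for a perfectly symmetric set of scores."""
    n, m = len(data), mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    n, m = len(data), mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3

def within_rule_of_thumb(data, cutoff=2.0):
    """True if both skewness and kurtosis fall between -cutoff and +cutoff."""
    return abs(skewness(data)) <= cutoff and abs(excess_kurtosis(data)) <= cutoff

symmetric = [1, 2, 3, 4, 5]
one_extreme_score = [1, 1, 1, 1, 1, 1, 1, 1, 1, 50]

print(within_rule_of_thumb(symmetric))          # True
print(round(skewness(one_extreme_score), 2))    # 2.67, outside the +/-2 range
```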
When does Shapiro-Wilk tell you normality is met?
If the significance is >.05
When is HOV not an issue?
In repeated-measures tests
What does HOV mean?
In designs looking for differences, the variances of the outcome variable should be about the same in each group
How do you test for HOV?
Levene’s Test
Assumption is MET if the significance is >.05
Why must your data be free of influential outliers?
- For t-test or ANOVA: it will pull the mean toward the outlier
- For regression: it will pull the best-fit line towards the outlier
What are 4 ways to test for influential outliers?
- Histogram
- Skewness/kurtosis
- Boxplots
- Regression: Cook’s Distance
When will Cook’s Distance tell you that “Free of Influential Outliers” assumption is NOT MET?
if > 1
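A sketch of how Cook's Distance could be computed by hand for a simple (one-predictor) regression, assuming the standard formula based on each point's residual and leverage. The data are invented so that the last point sits far from the rest; the function name is hypothetical.

```python
from statistics import mean

def cooks_distances(x, y):
    """Cook's Distance for every point in a one-predictor regression."""
    n = len(x)
    p = 2  # parameters estimated: intercept and slope
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage: how far each x value sits from the mean of x.
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]

# Hypothetical data: nine points on a straight line plus one point far off it.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [2, 4, 6, 8, 10, 12, 14, 16, 18, 10]

distances = cooks_distances(x, y)
flagged = [i for i, d in enumerate(distances) if d > 1]
print(flagged)  # [9]: only the last point exceeds the cutoff of 1
```

The flagged point combines a large residual with high leverage, which is why it pulls the best-fit line toward itself.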
What does independence of observations mean?
Data has to be independent & can’t follow a pattern over time
Scores from one participant can’t influence another participant’s scores
What are 3 regression assumptions?
- Linearity
- Homoscedasticity
- Outlier testing in regression
HOV for ___ stats
Homoscedasticity for ____ stats
HOV: difference stats
Homoscedasticity: relationship stats
If the variance is not evenly distributed, this is called?
Heteroscedasticity
Based on linearity: the model is a linear model, so the data must be ____
Linear
What is the easiest way to check for linearity?
Scatterplot
Tells you if the data points are mostly in a straight line
What is a residual?
The difference between the observed score and the predicted score (the line)
If you have curvilinear data, can you use a linear model?
No, lots of error
What type of graph is also used to check for homoscedasticity and outliers?
Scatterplots
An outlier will have a ____residual
Large
After we determine every data point’s residual score, we have to determine the ________
Standardized residual
similar to z-score AKA distance from the line in terms of standard deviations
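A simplified sketch of the residual-to-standardized-residual step: each residual is divided by the standard deviation of the residuals, so it reads like a z-score. Stats packages may also adjust for leverage, which this version does not. Data and function name are hypothetical.

```python
from math import sqrt
from statistics import mean

def standardized_residuals(x, y):
    """Residuals from the least-squares line, in standard-deviation units."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual = observed score minus predicted score (the line).
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation; n - 2 because slope and intercept were estimated.
    s = sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
    return [e / s for e in residuals]

# Hypothetical data: the sixth score (30) sits well above the line.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 4, 6, 8, 10, 30, 14, 16]

z = standardized_residuals(x, y)
largest = max(range(len(z)), key=lambda i: abs(z[i]))
print(largest, round(z[largest], 2))
```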
UH OH! You violated an assumption. What are some possible solutions?
- Trim the data
- Winsorizing
- Transform the data
- Analyze with Bootstrapping in SPSS
- Use non-parametric stats
Which of these are good and bad?
- Trim the data
- Winsorizing
- Transform the data
- Analyze with Bootstrapping in SPSS
- Use non-parametric stats
1-3 BAD
4-5 GOOD
What is it called when I delete a certain number or percentage of scores from the extremes?
Trim the data
What is it called when I substitute outliers with the highest value that isn’t an outlier?
Winsorizing
What is it called when I apply a log transformation to the scores in hopes of improving normality, but then I’m no longer studying the original data?
Transform the data
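A small sketch of what the three data-altering fixes do to the same set of scores. The scores, the 10% trim, and the 0 to 30 "not an outlier" range are all assumptions chosen for illustration; how an outlier is defined is up to the analyst.

```python
from math import log10

def trim(scores, proportion):
    """Delete the given proportion of scores from EACH extreme."""
    ordered = sorted(scores)
    k = int(len(ordered) * proportion)
    return ordered[k:len(ordered) - k] if k else ordered

def winsorize(scores, low, high):
    """Replace scores outside [low, high] with the most extreme score still inside it."""
    inside = [s for s in scores if low <= s <= high]
    floor, ceiling = min(inside), max(inside)
    return [min(max(s, floor), ceiling) for s in scores]

def log_transform(scores):
    """Base-10 log of every score (scores must be positive)."""
    return [log10(s) for s in scores]

scores = [12, 14, 15, 15, 16, 17, 18, 19, 20, 95]

trimmed = trim(scores, 0.10)             # drops 12 and 95
winsorized = winsorize(scores, 0, 30)    # 95 is replaced by 20
logged = log_transform(scores)           # 95 shrinks to about 1.98

print(trimmed)
print(winsorized)
```

In all three cases the analysis is run on something other than the scores that were actually collected.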
What is a hypothesis test?
Statistical method that uses sample data to evaluate a hypothesis about a population
What is the goal of a hypothesis test?
To rule out chance (sampling error) as a plausible explanation for the results from a research study
The alpha level determines the risk of a Type ____ error
Type 1 Error
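A simulation sketch of why alpha is the Type 1 risk: run many imaginary studies in which the null hypothesis really is true and count how often it gets rejected anyway. This assumes a two-tailed z-test with a known population SD of 1; the sample size, number of studies, and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

def type1_error_rate(alpha=0.05, n=20, n_studies=4000, seed=1):
    """Share of null-is-true studies that wrongly reject the null."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(n_studies):
        # Population mean really is 0, so any "effect" is sampling error.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) * sqrt(n)  # population SD is known to be 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_studies

rate = type1_error_rate()
print(rate)  # should land close to alpha
```

Lowering alpha lowers the false-positive rate, since fewer studies clear the stricter critical value.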
The critical region consists of the outcomes that are very unlikely to occur if ____ hypothesis is true.
Null
because the null says there is no difference, so results in the critical region would be very unlikely if the null were true
The middle 95% is if Ho is _____ AKA ______ change
Ho is true
No change
If the t-statistic is in the critical region, do we reject the null hypothesis?
Yes
alpha level vs. p value
alpha: set in advance/pre-set significance
p-value: probability of getting results at least this extreme from sampling error alone (i.e. if the null hypothesis is true)
The hypothesis test is influenced not only by the _____ and the _____ of the sample but also the _____ of the sample
The hypothesis test is influenced not only by the size of the treatment effect and the variability of the sample but also the size of the sample
Cohen’s d is a measure of _____
Effect size
What are 2 ways to increase ES?
- Increase the mean difference (numerator)
- Decrease the standard deviation (denominator)
What is the effect “size” for each of the Cohen’s d values below?
d = .2
d = .5
d = .8
d = .2 Small Effect
d = .5 Medium Effect
d = .8 Large Effect
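A sketch of Cohen's d for two groups: the mean difference (numerator) divided by the pooled standard deviation (denominator). The data are hypothetical, and treating .2, .5, and .8 as lower cutoffs for each label is an assumption made here for illustration.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)

def effect_label(d):
    """Label |d| using .2 / .5 / .8 as lower cutoffs (an assumption)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"

treatment = [6, 7, 8, 9, 10]
control = [5, 6, 7, 8, 9]

d = cohens_d(treatment, control)
print(round(d, 2), effect_label(d))  # 0.63 medium
```

A bigger mean difference or a smaller standard deviation would both push d up.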
What is power?
Probability the statistical test will reject the null hypothesis when the treatment does have an effect
What are 4 ways to increase power?
- Increase ES
- Increase sample size
- Increase the alpha
- Use a 1-tail test
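A sketch that checks all four levers with a power calculation. To stay simple it assumes a one-sample z-test with a known SD rather than a t-test, and it assumes the one-tailed test is in the direction of the true effect; the effect sizes and sample sizes are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(d, n, alpha=0.05, tails=2):
    """Power of a one-sample z-test when the true effect size is d (d > 0)."""
    std_normal = NormalDist()
    shift = d * sqrt(n)  # expected z-score when the effect is real
    if tails == 1:
        z_crit = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf(z_crit - shift)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Chance of landing in either tail of the critical region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

baseline = z_test_power(d=0.5, n=20)
bigger_effect = z_test_power(d=0.8, n=20)
bigger_sample = z_test_power(d=0.5, n=40)
bigger_alpha = z_test_power(d=0.5, n=20, alpha=0.10)
one_tailed = z_test_power(d=0.5, n=20, tails=1)

for name, value in [
    ("baseline", baseline),
    ("bigger effect size", bigger_effect),
    ("bigger sample", bigger_sample),
    ("bigger alpha", bigger_alpha),
    ("one-tailed", one_tailed),
]:
    print(f"{name}: {value:.2f}")
```

Each change raises power above the baseline, but a bigger alpha buys that power at the cost of more Type 1 risk.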
What is an independent t-test?
compares 2 means based on independent data
e.g. data from different groups of people
What is a dependent t-test?
Repeated or Paired
compares 2 means based on related data
e.g. pre-post testing; matched samples or twins
If SPSS tells you your sig. or p value is .0000, what should you do?
NEVER write this; a p-value is never exactly 0
Write <.0005 or smaller
What is the statistical advantage for repeated measures t-test over independent t-test?
Less error because you use the same person and test them twice
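A sketch of that advantage using made-up pre/post scores. The same numbers are analyzed twice: once as if they came from two separate groups (independent t, pooled variance) and once as paired scores from the same people (t on the difference scores). Function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

def independent_t(a, b):
    """t-statistic treating a and b as two unrelated groups (pooled variance)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(b) - mean(a)) / standard_error

def paired_t(pre, post):
    """t-statistic on the difference scores of the same participants."""
    diffs = [after - before for before, after in zip(pre, post)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error

# Hypothetical scores: people differ a lot from each other,
# but everyone improves by a similar amount.
pre = [10, 12, 14, 16, 18]
post = [11, 14, 15, 18, 20]

t_independent = independent_t(pre, post)
t_paired = paired_t(pre, post)
print(round(t_independent, 2), round(t_paired, 2))  # 0.76 6.53
```

The mean difference is identical in both analyses; only the error term shrinks, because differences between people drop out when each person is compared with themselves.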
What is the non-parametric equivalent for independent t-tests?
Mann Whitney U-Test
What is the non-parametric equivalent for repeated measures t-test?
Wilcoxon signed ranks test
Which non-parametric test is more powerful?
Mann-Whitney U-Test
different from t-test
What data is the non-parametric tests done on?
Mean rankings
e.g. both sets of data are put in rank order and the test is done to see if the MEAN ranks are different
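A sketch of the ranking step for two independent groups (the Mann-Whitney situation): pool the scores, rank them, then compare the mean rank of each group. Giving tied scores the average of their ranks is the usual convention and is assumed here. Data and function names are hypothetical.

```python
def rank_with_ties(values):
    """Rank values from 1 (smallest); tied values share the average of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Positions i..j-1 hold the same value; average their 1-based ranks.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]

def mean_ranks(group_a, group_b):
    """Mean rank of each group after ranking all scores together."""
    ranks = rank_with_ties(group_a + group_b)
    ranks_a, ranks_b = ranks[:len(group_a)], ranks[len(group_a):]
    return sum(ranks_a) / len(ranks_a), sum(ranks_b) / len(ranks_b)

group_a = [3, 5, 6, 8]
group_b = [7, 9, 10, 200]   # 200 is an extreme score

mean_rank_a, mean_rank_b = mean_ranks(group_a, group_b)
print(mean_rank_a, mean_rank_b)  # 2.75 6.25
```

The extreme score of 200 simply becomes rank 8, which is why rank-based tests are not thrown off by outliers.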
As you repeatedly test the same dataset, you are more likely to commit a _____ Error
Type 1
What is the Bonferroni Correction?
Divides alpha by the # of tests you plan to run
Would you apply Bonferroni here:
12 correlations are to be conducted between SAT scores and 12 demographic variables?
Yes
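A sketch of the arithmetic for the 12-correlation example, assuming alpha = .05. The familywise error formula treats the tests as independent, which is a simplifying assumption; function names are hypothetical.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha after the Bonferroni Correction."""
    return alpha / n_tests

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one Type 1 error across n_tests, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** n_tests

per_test_alpha = bonferroni_alpha(0.05, 12)
uncorrected = familywise_error_rate(0.05, 12)
corrected = familywise_error_rate(per_test_alpha, 12)

print(round(per_test_alpha, 4))  # 0.0042
print(round(uncorrected, 2))     # 0.46
print(round(corrected, 3))       # 0.049
```

Without the correction, there is nearly a coin-flip chance of at least one false positive across the 12 correlations; with it, the overall risk stays just under .05.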