study notes Flashcards
Crosstabs
H0: no association between variables A and B
H1: association between A and B
Assumptions:
1. two nominal variables
2. at least 80% of the cells need an expected count of 5 or more
SPSS: Analyze -> Descriptive Statistics -> Crosstabs -> Statistics (Chi-square, Phi and Cramér's V)
Cells -> Observed + Expected + Row, Column, Total
Residuals: standardized
the larger the gap between observed and expected counts, the stronger the evidence against H0
-> Cramér's V measures the strength of the association (0 = none, 1 = perfect)
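A minimal sketch of what the Crosstabs output is built from, assuming a made-up 2x2 table (the counts are hypothetical, not from any dataset). It only reproduces the expected counts, the chi-square statistic, the standardized residuals and Cramér's V; the p-value still comes from the SPSS output.

```python
import math

# Hypothetical 2x2 crosstab of observed counts (rows = variable A, columns = variable B).
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under H0 (no association): row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Assumption check: share of cells with an expected count of 5 or more.
cells = [e for row in expected for e in row]
share_ok = sum(e >= 5 for e in cells) / len(cells)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))

# Standardized residual per cell: (observed - expected) / sqrt(expected).
std_resid = [[(o - e) / math.sqrt(e) for o, e in zip(o_row, e_row)]
             for o_row, e_row in zip(observed, expected)]

# Cramér's V = sqrt(chi2 / (n * (k - 1))), k = smaller of number of rows and columns.
k = min(len(observed), len(observed[0]))
cramers_v = math.sqrt(chi2 / (n * (k - 1)))

print(round(chi2, 3), round(cramers_v, 3), share_ok)
```

Cells with a large standardized residual (in absolute value) are the ones that contribute most to the chi-square value.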
one sample t-test Assumptions
H0: the population mean equals the specified mean value
H1: the population mean is different from the specified mean value
Assumptions:
1. Normal distribution (approximately)
H0: normal distribution
H1: no normal distribution
SPSS: Analyze -> descriptive statistics -> explore (add outliers)
- no outliers -> if there are outliers, remove them
one sample t-test (the test)
-> when the assumptions are met do the test
SPSS: Analyze -> Compare Means -> One-sample t-test
to reject H0 -> p < 0,05
|t| > 2 (rule of thumb) and confidence interval does not cross 0 -> results are statistically significant
Cohen's d: size of the effect ->
around 0,2 = small effect
around 0,5 = medium effect
0,8 and above = large effect (d can be larger than 1)
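A small sketch of the numbers behind the one-sample t-test, assuming a made-up sample and test value (both hypothetical). It computes only the t statistic and Cohen's d; the p-value and the confidence interval are read from the SPSS output.

```python
import math
import statistics

# Hypothetical sample and specified mean value (made up for illustration).
sample = [4, 6, 8, 10, 12]
test_value = 5

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)        # sample standard deviation (divides by n - 1)
se = sd / math.sqrt(n)               # standard error of the mean

t_stat = (mean - test_value) / se    # compared against a t distribution with n - 1 df
cohens_d = (mean - test_value) / sd  # effect size: mean difference in SD units

print(round(t_stat, 3), round(cohens_d, 3))
```

Note that t = d * sqrt(n), so the same effect size gives a larger t with a larger sample.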
Independent samples t-test Assumptions
H0: the two population means are equal
H1: the two population means are different
Assumptions
1. normal distribution
H0: normal distribution
H1: no normal distribution
SPSS: Analyze -> Descriptive Statistics -> Explore -> Statistics: Outliers, Plots: Normality plots with tests
- no outliers
-> if there are outliers, remove them with the 1,5 x IQR rule
SPSS: Explore -> Statistics -> Percentiles
IQR = Q3 - Q1
lower outliers: values below Q1 - 1,5 x IQR
upper outliers: values above Q3 + 1,5 x IQR
-> take them out with Data -> Select Cases -> If condition is satisfied
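The 1,5 x IQR rule can be sketched as follows, assuming a made-up list of values. The quartile method used here (the default of Python's statistics.quantiles) is an assumption; tools define quartiles slightly differently, so the fences may not match SPSS to the last decimal.

```python
import statistics

# Hypothetical measurements; 40 is an obvious extreme value.
data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 40]

# Quartiles (default "exclusive" method, an assumption for this sketch).
q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Equivalent of Select Cases with an "if" condition: keep values inside the fences.
outliers = [v for v in data if v < lower_fence or v > upper_fence]
kept = [v for v in data if lower_fence <= v <= upper_fence]

print(lower_fence, upper_fence, outliers)
```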
Independent samples t-test
- Homogeneity of variances -> check Levene's test in the t-test output (Sig. > 0,05 -> read the "Equal variances assumed" row)
SPSS: Analyze -> Compare Means -> Independent-Samples T Test
|t| > 2 (rule of thumb) and confidence interval does not cross 0 -> results are statistically significant
p < 0,05 -> we can reject H0
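A sketch of the two t statistics SPSS prints for an independent samples t-test, assuming two made-up groups (hypothetical data). The pooled version corresponds to "equal variances assumed", the Welch version to "equal variances not assumed"; p-values are not computed here.

```python
import math
import statistics

# Hypothetical scores for two independent groups.
group_a = [10, 12, 14, 16, 18]
group_b = [8, 9, 10, 11, 12]

n_a, n_b = len(group_a), len(group_b)
mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
var_a = statistics.variance(group_a)  # sample variances (n - 1)
var_b = statistics.variance(group_b)

# Equal variances assumed: pooled variance, df = n_a + n_b - 2.
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
t_pooled = mean_diff / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
df_pooled = n_a + n_b - 2

# Equal variances not assumed (Welch): separate variances, adjusted df.
se_welch = math.sqrt(var_a / n_a + var_b / n_b)
t_welch = mean_diff / se_welch
df_welch = (var_a / n_a + var_b / n_b) ** 2 / (
    (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
)

print(round(t_pooled, 3), df_pooled, round(t_welch, 3), round(df_welch, 2))
```

With equal group sizes both t values coincide and only the degrees of freedom differ.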
Paired samples t-test
H0: the population mean difference between the paired values is equal to zero
H1: the population mean difference between the paired values is NOT equal to zero
Assumptions to test
1. normal distribution
-> we need to compute a new variable from the difference of the variables
SPSS: Transform -> Compute Variable
test normality of the difference variable with Analyze -> Descriptive Statistics -> Explore
H0: normal distribution
H1: no normal distribution
- test for outliers
-> if there are none, you can run the test
Paired samples t-test: conduct the test
SPSS: Analyze -> Compare Means and Proportions -> Paired-Samples T Test
Use the original variables (not the computed difference)
|t| > 2 (rule of thumb) and confidence interval does not cross 0 -> results are statistically significant
Cohen's d:
around 0,2 = small effect
around 0,5 = medium effect
0,8 and above = large effect
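A sketch of the paired samples logic, assuming made-up before/after scores (hypothetical). It shows that the test is a one-sample t-test on the difference variable, with Cohen's d taken as the mean difference divided by the SD of the differences (one common definition, assumed here).

```python
import math
import statistics

# Hypothetical paired measurements for the same five cases.
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 18, 12]

# The difference variable (the one checked for normality and outliers).
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)

t_stat = mean_diff / (sd_diff / math.sqrt(n))  # df = n - 1
cohens_d = mean_diff / sd_diff

print(diffs, round(t_stat, 3), round(cohens_d, 3))
```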
one way ANOVA
H0: all population means are equal
H1: at least one mean is different
Assumptions to test:
- Normal distribution
Analyze -> Descriptive Statistics -> Explore
- no outliers
- Homogeneity of variances (conduct the test)
Analyze -> compare means -> one way ANOVA
Options: click Descriptive, Homogeneity of variance test, Welch test, Means plot
Post Hoc: Tukey's-b and Games-Howell
H0: there is homogeneity
H1: there is no homogeneity
conduct the test and look at Sig. (Levene): Sig. > 0,05 -> homogeneity holds; Sig. < 0,05 -> use the Welch test and Games-Howell
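A sketch of how the F value in the ANOVA table is built, assuming three made-up groups (hypothetical data). Only the sums of squares and F are computed; Sig. and the post hoc tests come from SPSS.

```python
import statistics

# Hypothetical scores for three groups.
groups = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [5, 6, 7]}

all_values = [x for values in groups.values() for x in values]
grand_mean = statistics.mean(all_values)

# Between-groups: how far each group mean is from the grand mean.
ss_between = sum(len(v) * (statistics.mean(v) - grand_mean) ** 2
                 for v in groups.values())
# Within-groups: spread of the values around their own group mean.
ss_within = sum((x - statistics.mean(v)) ** 2
                for v in groups.values() for x in v)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

print(round(ss_between, 3), round(ss_within, 3), round(f_stat, 3))
```

A large F means the group means differ more than the variation inside the groups would suggest.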
Correlation (Pearson)
1 = perfect positive correlation
0 = no linear correlation
-1 = perfect negative correlation
Assumptions:
1. Linear relationship
SPSS: Graphs -> chart builder -> scatter plot
- Normal distribution
SPSS: Analyze -> Descriptive Statistics -> Explore -> both variables in the dependent list
- outliers
-> remove outliers
skewness value is more important than the Shapiro-Wilk test -> skewness should lie between -1 and 1
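A sketch of the skewness check, assuming the bias-adjusted sample skewness formula (assumed to be the version behind the value in the Explore output) and two made-up lists.

```python
import statistics

def sample_skewness(values):
    """Bias-adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / sd)^3)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)

# Hypothetical data: one symmetric variable, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 3, 4, 10]

for name, values in [("symmetric", symmetric), ("right_tail", right_tail)]:
    skew = sample_skewness(values)
    verdict = "ok" if -1 <= skew <= 1 else "outside -1..1"
    print(name, round(skew, 3), verdict)
```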
Test:
Analyze -> Correlate -> Bivariate
0,1 - 0,3 = small
0,3 - 0,5 = medium
0,5 - 1 = strong
H0: no association
H1: there is an association
-> if there is no linear relationship we cannot use Pearson
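A sketch of Pearson's r computed by hand, assuming made-up x and y values. The strength labels follow the cut-offs in these notes; the label for values below 0,1 is an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y divided by the product of their spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label the size of r using the cut-offs from the notes (sign is ignored)."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # assumption: the notes give no label below 0.1

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 3), strength(r))
```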
Spearman's correlation
Assumptions:
1. paired observations measured on an ordinal or continuous scale
2. relationship does not need to be linear -> it needs to be monotonic
SPSS: Analyze -> Correlate -> Bivariate (tick Spearman)
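A sketch of why Spearman only needs a monotonic relationship: it is Pearson's r computed on the ranks. The data are made up (y = x cubed, monotonic but not linear), and the helper names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho = Pearson's r on the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: perfectly monotonic, but curved.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Here Spearman's rho is 1 because the ordering is perfectly preserved, while Pearson's r stays below 1 because the points do not lie on a straight line.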
Kendall's Tau
Data requirements:
1. Two variables
2. Paired observations
H0: the two variables are independent
H1: There is an association between the variables
SPSS: Analyze -> Correlate -> Bivariate (tick Kendall's tau-b)
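A sketch of Kendall's tau-b by counting concordant and discordant pairs, assuming made-up paired observations. The function name is hypothetical, and the sketch assumes neither variable is constant.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, corrected for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        if dx * dy > 0:
            concordant += 1   # both variables move in the same direction
        elif dx * dy < 0:
            discordant += 1   # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    denominator = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    return (concordant - discordant) / denominator

# Hypothetical paired observations (no ties): 7 concordant and 3 discordant pairs.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]

tau = kendall_tau_b(x, y)
print(round(tau, 3))
```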
Simple linear regression (Assumptions 1)
independent variable = x-axis
dependent variable = y-axis
- Linear relationship
SPSS: Graphs -> Chart Builder -> scatter plot -> add line of best fit
- independence of observations
Durbin-Watson -> value between 1,5 and 2,5
- no significant outliers
- Data needs to show homoscedasticity
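A sketch of the Durbin-Watson statistic used for the independence check, assuming made-up residuals in their original case order (hypothetical values). The 1,5 to 2,5 range is the rule of thumb from these notes.

```python
def durbin_watson(residuals):
    """Sum of squared differences between consecutive residuals / sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals from a fitted regression line.
residuals = [1, 2, -1, -2, 1]

dw = durbin_watson(residuals)
independent = 1.5 <= dw <= 2.5

print(round(dw, 3), independent)
```

Values close to 2 suggest no autocorrelation; values near 0 or 4 suggest that neighbouring residuals are related.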
Simple linear regression test
SPSS: Analyze -> Regression -> Linear
Statistics: Estimates, Confidence Intervals, Durbin-Watson, Casewise diagnostics
Plots
Y = ZRESID
X = ZPRED
Histogram and Normal probability plot
- the residuals of the regression line need to be normally distributed -> see in Histogram
H0: our x variable has no explanatory power
H1: x variable has explanatory power (it predicts changes in y)
Model summary: Adjusted R-squared tells us what % of the variation in the dependent variable is explained by the independent variable
Model:
y = Constant + B * x + e
Statistically significant if
|t| > 2, p (Sig.) < 0,05 and confidence interval does not cross the 0 line
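A sketch of where the Constant, B, R-squared and the t value for B come from, assuming made-up x and y values (hypothetical data). It is a plain least-squares fit; Sig. and the confidence interval are read from the SPSS coefficients table.

```python
import math

# Hypothetical data: x = independent variable, y = dependent variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

slope = sxy / sxx                       # B
intercept = mean_y - slope * mean_x     # Constant

# Residuals e = observed y minus predicted y.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((b - mean_y) ** 2 for b in y)

r_squared = 1 - ss_res / ss_tot
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)  # one predictor

# t value for B: slope divided by its standard error (df = n - 2).
se_slope = math.sqrt(ss_res / (n - 2) / sxx)
t_slope = slope / se_slope

print(round(intercept, 3), round(slope, 3), round(adj_r_squared, 3), round(t_slope, 3))
```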
Binned data
SPSS: Transform -> Visual Binning -> binned variable name
-> make cutpoints -> first cutpoint: minimum, number of cutpoints or width -> apply -> make labels
-> Analyze -> Descriptive Statistics -> Frequencies -> bar chart
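A sketch of equal-width binning followed by a frequency count, assuming made-up ages, cutpoints and labels (all hypothetical). It assumes that a value equal to a cutpoint falls into the lower bin, i.e. the cutpoint is the included upper end of its bin.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical continuous variable.
ages = [18, 20, 22, 25, 30, 31, 38, 40, 45, 52]

# Cutpoints from a first cutpoint, a width and a number of cutpoints.
first_cutpoint, width, n_cutpoints = 20, 10, 3
cutpoints = [first_cutpoint + i * width for i in range(n_cutpoints)]

# One label per bin: n_cutpoints + 1 bins in total.
labels = ["<= 20", "21 - 30", "31 - 40", "41+"]

# bisect_left returns the index of the first cutpoint >= value,
# so a value equal to a cutpoint lands in the lower bin.
binned = [labels[bisect_left(cutpoints, age)] for age in ages]
freq = Counter(binned)

# Text version of the frequency bar chart.
for label in labels:
    print(f"{label:>8} {'#' * freq[label]}")
```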