intro to stats Flashcards
what is the probability of both A and B occurring?
P(A) x P(B), provided A and B are independent
what is the combination rule?
N! / (r! (N-r)!)
where N = group/sample size
where r = size of each selection (2 for pairs, 3 for triplets etc.)
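The combination rule can be checked with a few lines of Python. This is only a sketch: the function name `combinations` is made up for illustration.

```python
import math

def combinations(n: int, r: int) -> int:
    """Number of ways to choose r items from a group of n: N! / (r! (N-r)!)."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(combinations(5, 2))  # number of pairs that can be made from 5 people
print(math.comb(5, 2))     # built-in equivalent (Python 3.8+)
```

Both calls give 10 for pairs drawn from a group of 5.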
when is the Wilcoxon Rank-Sum Test used?
to test for a statistically significant difference between 2 different groups
how to work out significance using the Wilcoxon Rank-Sum Test?
rank all of the data, irrespective of the group it’s in, starting at 1 until the end number
calculate the sum of the ranks (W) in the group with the lower n or, if both groups are the same size, the group with the lower rank total
result is significant if the calculated W value is less than or equal to the critical value in the Wilcoxon table
how to rank tied scores (2, 3 etc. of the same number) in a Wilcoxon Rank-Sum Test?
take the ranks that the tied numbers would occupy and give each of them the mean of those ranks
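The ranking and tie rules for the rank-sum test can be sketched in Python. This is a minimal illustration, not a full test. The names `average_ranks` and `rank_sum_w` are made up for this example, rank 1 is assumed to go to the lowest score, and the critical value still has to be looked up in a Wilcoxon table.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # 0-based positions i..j-1 hold ranks i+1..j, whose mean is (i + 1 + j) / 2
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [ranks[v] for v in values]

def rank_sum_w(group_a, group_b):
    """Rank sum of the smaller group, or the lower sum if the sizes match."""
    ranks = average_ranks(list(group_a) + list(group_b))
    sum_a = sum(ranks[:len(group_a)])
    sum_b = sum(ranks[len(group_a):])
    if len(group_a) < len(group_b):
        return sum_a
    if len(group_b) < len(group_a):
        return sum_b
    return min(sum_a, sum_b)

print(average_ranks([10, 20, 20, 30]))       # the two 20s share rank 2.5
print(rank_sum_w([3, 5, 5], [1, 2, 5, 8]))   # W for the smaller group
```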
name three measures of central tendency
mean
median
mode
what type of statistics are measures of central tendency?
descriptive statistics
when is the only time you would use a 1 tailed hypothesis/test?
if there is previous research indicating a certain outcome
how to work out significance using Wilcoxon matched-pairs test?
calculate the difference between each pair (the same person's scores in 2 different conditions)
remove pairs whose difference is 0 and adjust N accordingly
rank the differences, ignoring the sign, from lowest to highest
calculate the sum of the ranks of the positive differences and the sum of the ranks of the negative differences
T value is the smaller of these two rank sums
result is significant if T value is less than or equal to the critical value in the table
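The matched-pairs steps can be sketched as follows. This is an illustrative outline only: `wilcoxon_t` is a made-up name, whole-number scores are assumed so that ties compare exactly, tied differences are given the mean of their ranks, and the critical value for the adjusted N still comes from the table.

```python
def wilcoxon_t(cond_1, cond_2):
    """Return (T, adjusted N) for paired scores from two conditions."""
    # difference for each pair, dropping pairs whose difference is 0
    diffs = [a - b for a, b in zip(cond_1, cond_2) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def rank(value):
        # mean of the 1-based positions where this absolute difference occurs
        first = abs_sorted.index(value) + 1
        count = abs_sorted.count(value)
        return first + (count - 1) / 2

    positive = sum(rank(abs(d)) for d in diffs if d > 0)
    negative = sum(rank(abs(d)) for d in diffs if d < 0)
    return min(positive, negative), len(diffs)

t_value, n = wilcoxon_t([10, 12, 9, 15, 11], [8, 12, 10, 11, 14])
print(t_value, n)  # T and the N left after removing zero differences
```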
what is the formula for the binomial test and what does each letter mean?
P(X) = ( N! / (X! (N-X)!) ) x p^X x q^(N-X)
N = total number of people
X = number in the group
p = probability of being in the group
q = probability of not being in the group (1 - p)
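As a quick check of the binomial formula, here is a small Python sketch. The function name `binomial_probability` is invented for this example.

```python
import math

def binomial_probability(x: int, n: int, p: float) -> float:
    """Probability of exactly x out of n falling in the group with probability p."""
    q = 1 - p  # probability of not being in the group
    return math.comb(n, x) * p ** x * q ** (n - x)

# e.g. exactly 2 out of 4 people in a group that each person joins with p = 0.5
print(binomial_probability(2, 4, 0.5))
```

Adding the probabilities for every possible X from 0 to N should give 1, which is a useful sanity check.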
how to do the N-by-1 chi square?
calculate the expected value for each category by finding the mean of the observed frequencies (total / number of categories)
apply the formula (O-E)^2/E to each category and add up all the values
calculate degrees of freedom (N-1, where N is the number of categories)
compare to the critical value
significant if calculated value is more than the critical value
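The N-by-1 chi-square steps translate directly into code. This sketch assumes the null hypothesis is equal frequencies in every category, as the mean-based expected value implies. `chi_square_n_by_1` is a made-up name.

```python
def chi_square_n_by_1(observed):
    """Return (chi-square, degrees of freedom) for a list of category counts."""
    expected = sum(observed) / len(observed)  # same expected count in each category
    chi_sq = sum((o - expected) ** 2 / expected for o in observed)
    df = len(observed) - 1
    return chi_sq, df

chi_sq, df = chi_square_n_by_1([10, 20, 30])
print(chi_sq, df)  # compare chi_sq with the tabled critical value for df
```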
how to do the contingency chi square test?
add up the row and column totals
calculate the expected frequency of a cell by doing the row total x column total / overall total
apply the formula (O-E)^2 / E and add up all the values
calculate degrees of freedom by doing (number of rows - 1) x (number of columns - 1)
look up the critical value; significant if calculated value is more than critical value
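A rough Python version of the contingency chi-square steps, with the table given as a list of rows. The name `contingency_chi_square` is illustrative only.

```python
def contingency_chi_square(table):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected frequency = row total x column total / overall total
            expected = row_totals[i] * col_totals[j] / total
            chi_sq += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df

print(contingency_chi_square([[10, 20], [20, 10]]))
```

A table whose rows are in exactly the same proportions, such as `[[10, 20], [20, 40]]`, gives a chi-square of 0.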
how to do the variance test?
- work out the mean for the group
- do each observation minus the mean
- square each value
- add all the squared values up and divide by N - 1 to get the variance
- do the same for the other group
- divide the larger variance by the smaller variance to get the F value
- find the degrees of freedom for both the groups
- use the d.f. of the larger variance along the top and the d.f. of the smaller variance down the side to find the critical value
- significant if F is larger than the critical value
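The variance test steps can be sketched like this. The names `sample_variance` and `variance_ratio` are made up, and the critical value still comes from an F table.

```python
def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by N - 1."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / (len(scores) - 1)

def variance_ratio(group_a, group_b):
    """Return (F, d.f. for the top of the table, d.f. for the side of the table)."""
    var_a, var_b = sample_variance(group_a), sample_variance(group_b)
    df_a, df_b = len(group_a) - 1, len(group_b) - 1
    if var_a >= var_b:
        return var_a / var_b, df_a, df_b
    # the larger variance always goes on top, with its d.f. listed first
    return var_b / var_a, df_b, df_a

print(variance_ratio([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12]))
```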
how to do the Z test?
use when there is 1 participant with 1 observation and the scores are normally distributed
1) calculate the Z value using Z = (X - mean) / standard deviation
(X means the observed value)
2) work out whether 1 or 2 tailed depending on info given in question
3) find the Z score in the table
4) find corresponding p value
5) if 2 tailed then double the p value
6) significant if p<0.05
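The Z test steps can be sketched in Python, with the standard library's `NormalDist` standing in for the printed Z table (a swap made for this example). `z_test` is a made-up name, and the one-tailed p value assumes the score lies in the predicted direction.

```python
from statistics import NormalDist

def z_test(x, mean, sd, two_tailed=True):
    """Return (Z, p) for one observed score x against a known mean and sd."""
    z = (x - mean) / sd
    # area in one tail beyond |Z|, which is what the Z table provides
    tail = 1 - NormalDist().cdf(abs(z))
    p = 2 * tail if two_tailed else tail
    return z, p

z, p = z_test(130, mean=100, sd=15)
print(z, p, p < 0.05)
```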
what is the standard deviation in relation to the variance?
the standard deviation is the square root of the variance
how to roughly measure whether a data set is normally distributed?
use Exploratory Data Analysis (EDA)
- create roughly N/4 equal-sized bins (bin width = range of the data set divided by N/4, rounded to a whole number)
- make a tally for each number in the data set next to the correct bin
- not normally distributed if the data shows bimodality (2 peaks) or a positive/negative skew
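The tally method can be sketched in a few lines. This is an assumption-laden illustration: `tally_bins` is a made-up name, whole-number data is assumed, and because the bin width is rounded the number of bins is only roughly N/4.

```python
def tally_bins(data):
    """Return (bin width, {bin start: count}) using roughly N/4 bins."""
    n_bins = max(1, round(len(data) / 4))
    width = max(1, round((max(data) - min(data)) / n_bins))
    low = min(data)
    tallies = {}
    for x in data:
        # start of the bin that x falls into
        start = low + ((x - low) // width) * width
        tallies[start] = tallies.get(start, 0) + 1
    return width, dict(sorted(tallies.items()))

width, tallies = tally_bins([1, 2, 2, 3, 4, 5, 5, 6, 6, 7, 9, 12])
for start, count in tallies.items():
    print(f"{start}-{start + width - 1}: {'|' * count}")
```

The printed tally marks form a sideways histogram, which makes two peaks or a long tail easy to spot.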
How to do the related samples t-test?
Work out the differences between the 2 conditions for each participant
Check if the differences are normally distributed
Calculate the mean of the differences
Calculate the standard deviation of the differences
Calculate the t value by doing mean of differences / (standard deviation of differences / square root of N)
Decide if 1 or 2 tailed
Significant if t value is greater than critical value
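A minimal sketch of the related samples t calculation. The name `related_t` is made up, and the d.f. of N - 1 used to look up the critical value is not on the card but is the standard value for this test.

```python
import math
import statistics

def related_t(cond_1, cond_2):
    """Return (t, d.f.) for paired scores from two conditions."""
    diffs = [a - b for a, b in zip(cond_1, cond_2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (divides by N - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

print(related_t([5, 7, 9, 11], [4, 5, 6, 7]))
```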
When to use the binomial test?
Frequency data
Comparing groups
2 groups
When to use the chi-squared test?
Frequency data
Comparing groups
More than 2 groups
When to use contingency chi-squared?
Frequency data
Relationship between 2 variables
When to do the Z test?
Continuous data
Only one person in sample
When to do the variance test?
Continuous data
Sample size of larger than 1
Comparing variances
When to do the Wilcoxon matched pairs test?
Continuous data
Sample size more than 1
Comparing central tendency
Related data (repeated measures or within subjects design)
Differences between groups not normal
When to do the related t-test?
Continuous data
Sample size of more than 1
Comparing central tendency
Related sample (within subjects or repeated measures)
Differences between groups normal
When to do the Wilcoxon rank-sum test?
Continuous data
Sample size more than 1
Comparing central tendency
Unrelated sample (independent measures, between subjects design)
Not normally distributed or variances unequal
when to use the unrelated samples t-test?
continuous data
sample size more than 10
unrelated data (independent groups or between subjects)
normally distributed (do histograms for both groups)
variances equal (do variance test and if not significant then equal)
do Wilcoxon rank sum if sample less than 10, not normally distributed or variances different
how to do an unrelated samples t-test when the sample size is equal in both groups?
1) make sure both groups are normally distributed
2) find the variances of each group using the variance equation
3) check if the variances are roughly equal by doing the variance test (larger variance / smaller variance; if not significant then carry on)
4) work out the standard error by doing square root of ((variance 1 + variance 2) / N), where N is the size of one of the groups
5) work out the t stat by doing difference in means / standard error
6) calculate the d.f by doing 2N-2
7) significant if the t stat is more than t crit in the table
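The equal-sample-size calculation can be sketched as follows. `unrelated_t_equal_n` is a made-up name. The normality and variance checks in steps 1 to 3 are assumed to have been done already.

```python
import math
import statistics

def unrelated_t_equal_n(group_a, group_b):
    """Return (t, d.f.) for two independent groups of the same size."""
    if len(group_a) != len(group_b):
        raise ValueError("groups must be the same size for this version")
    n = len(group_a)
    var_a = statistics.variance(group_a)  # sample variance (divides by N - 1)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt((var_a + var_b) / n)
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, 2 * n - 2

print(unrelated_t_equal_n([2, 4, 6, 8], [1, 3, 5, 7]))
```

The sign of t only shows which group has the larger mean; its size is what gets compared with t crit.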
how to do an unrelated samples t-test when the sample size is unequal in each group?
1) make sure both groups are normally distributed
2) work out the variances of each group
3) check if the variances are roughly equal by doing the variance test (larger variance / smaller variance) and if not significant then carry on
4) find the pooled variance estimate by doing ((d.f. of first group x variance 1) + (d.f. of second group x variance 2)) / (sum of sample sizes - 2)
5) calculate the standard error by doing square root of (pooled variance estimate x (1/N1 + 1/N2))
6) calculate the t stat by doing difference in means / standard error
7) calculate the d.f by doing N1 + N2 -2
8) significant if t stat is more than t crit
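The pooled-variance version can be sketched in the same way. `unrelated_t_unequal_n` is a made-up name, and the normality and variance checks are assumed to have been done beforehand.

```python
import math
import statistics

def unrelated_t_unequal_n(group_a, group_b):
    """Return (t, d.f.) for two independent groups using the pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    # each variance is weighted by its own degrees of freedom
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

print(unrelated_t_unequal_n([2, 4, 6, 8], [1, 2, 3]))
```

When both groups happen to be the same size, this gives the same t as the simpler equal-size formula.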
when to do the Spearman’s test?
continuous data
relationship between 2 variables
sample is more than 1
one or both variables are not normally distributed
how to do the Spearman’s test?
1) test for normal distribution for both groups
2) rank both variables independently
3) work out the mean of the ranks
4) work out the covariance of the ranks: calculate X - mean X and Y - mean Y for each observation, multiply each corresponding pair together, add them all up, then divide by N-1
5) calculate the standard deviation (s) of each variable's ranks independently: add up the squared values of X - mean X, divide by N-1, then take the square root
6) calculate r by dividing the covariance by the product of the two standard deviations
7) decide whether one or two tailed
8) compare to r table and if larger than critical value then significant relationship
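Steps 2 to 6 of the Spearman's procedure can be sketched in Python. The names `average_ranks` and `spearman_r` are invented for this example, tied scores are assumed to share the mean of their ranks, and the critical value still comes from the r table.

```python
import math

def average_ranks(values):
    """Rank from 1 (lowest); tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def spearman_r(x, y):
    """Correlation between the ranks of x and the ranks of y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry) / (n - 1))
    return covariance / (sd_x * sd_y)

print(spearman_r([10, 20, 30, 40], [1, 3, 2, 4]))
```

Because only the ranks are used, an extreme score such as 100 in place of 4 would not change the result.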
when to do the Pearson’s test?
continuous data
sample more than 1
relationship between 2 variables
both groups normally distributed
how to do the Pearson’s test?
1) find if both normally distributed using the histogram method
2) work out the mean for both groups
3) work out the covariance
4) work out the standard deviations for each variable
5) use the covariance and the standard deviations to calculate the r value (covariance / (standard deviation of X x standard deviation of Y))
6) decide whether 1 or 2 tailed
7) compare the r value to the critical value and if the r value is larger then relationship significant
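A short sketch of the Pearson's r calculation on the raw scores. `pearson_r` is a made-up name, and the normality check in step 1 is assumed to have been done already.

```python
import statistics

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return covariance / (statistics.stdev(x) * statistics.stdev(y))

print(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```

The result always lies between -1 and 1, so a value outside that range signals an arithmetic slip.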
when to do an outliers test?
do it before starting any of the following tests, unless explicitly told not to:
t-test
wilcoxon test
variance test
how to do an outlier test for within and between groups designs
within subjects - do outlier check on the differences between groups
between subjects - do outlier check on each group
what are words in questions that also mean ‘variance’?
consistency of scores
dispersion
what are words in questions that also mean ‘relationship’?
association
when X increases Y decreases (indicating correlation)
what are words in questions that mean ‘compare central tendency’?
typical value
average value
performance better in one group than another
how to do the outlier test?
Boxplot
1) rank data scores from low (including negative numbers) to high
2) work out the median position of the data and the corresponding score that goes with it
3) calculate the lower quartile position by doing (median position + 1) / 2 and find the corresponding score (if the median position is a fraction then round it down first)
4) calculate the upper quartile position by doing (number of data points + 1) - the lower quartile position and then find the corresponding score
5) calculate the inter-quartile range (IQR) by doing the upper quartile score - the lower quartile score
6) calculate the length of the whiskers by doing 1.5 x the IQR
7) a data point is an outlier when it is more than 2 whisker lengths away from the nearest quartile
8) if any outliers found then remove them from the data set
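The boxplot steps can be sketched in Python. Several things here are assumptions for illustration: the names `score_at` and `find_outliers`, taking the score at a .5 position as the average of its two neighbours, and the default of 2 whisker lengths, which simply follows step 7 as written on the card.

```python
import math

def score_at(sorted_scores, position):
    """Score at a 1-based position; a .5 position averages its two neighbours."""
    lower = sorted_scores[math.floor(position) - 1]
    upper = sorted_scores[math.ceil(position) - 1]
    return (lower + upper) / 2

def find_outliers(scores, whiskers=2):
    """Return the scores lying beyond `whiskers` whisker lengths from a quartile."""
    ordered = sorted(scores)
    n = len(ordered)
    median_pos = (n + 1) / 2
    lq_pos = (math.floor(median_pos) + 1) / 2  # round the median position down first
    uq_pos = n + 1 - lq_pos
    lq, uq = score_at(ordered, lq_pos), score_at(ordered, uq_pos)
    whisker = 1.5 * (uq - lq)  # whisker length = 1.5 x IQR
    low_limit = lq - whiskers * whisker
    high_limit = uq + whiskers * whisker
    return [x for x in ordered if x < low_limit or x > high_limit]

print(find_outliers([1, 2, 3, 4, 5, 6, 7, 8, 50]))  # [50]
```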