basic statistics lecture Flashcards
neurophysiological datasets are usually
multidimensional
null-hypothesis (H0)
states that there is no difference (no effect) between the conditions or groups being compared
estimation
process of inferring an unknown quantity of a population using sample data
parameter
quantity describing the population
three mutually complementary aspects of summary data description
- frequency distributions (shape)
- measures of centre (mean, median, mode)
- measures of dispersion
central limit theorem
the sum or mean of a large number of measurements randomly sampled from any population (with finite variance) is approximately normally distributed
sampling variability
the variability among random samples from the same population
sampling distribution
probability distribution that characterizes some aspect of sampling variability
increments
increasing the sample size and the number of repetitions improves the estimate of the population mean and brings the sampling distribution closer to a normal shape
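A small simulation can make the effect of sample size on the sampling distribution concrete. This is only a sketch: the exponential population, the sample sizes, and the helper name `sample_means` are assumptions chosen for illustration, not part of the lecture.

```python
import random
import statistics

def sample_means(sample_size, repetitions, rng):
    """Draw `repetitions` random samples of `sample_size` values from a
    skewed population (exponential, true mean 1) and return each sample mean."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(repetitions)
    ]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
spread = {}
for n in (2, 20, 200):
    means = sample_means(n, 2000, rng)
    # Spread of the sampling distribution of the mean for this sample size
    spread[n] = statistics.stdev(means)
    print(n, round(statistics.fmean(means), 3), round(spread[n], 3))
```

The spread of the sample means shrinks as the sample size grows, and a histogram of `means` looks increasingly normal even though the population itself is skewed.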
type 1 error
false positive
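probability of no false positive across N independent tests: (1-alpha)^N, so the probability of at least one false positive is 1-(1-alpha)^N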
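The growth of the false positive risk with the number of tests can be checked numerically. A minimal sketch, assuming the N tests are independent; the function name `familywise_error_rate` is made up for this example.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    # (1 - alpha) ** n_tests is the probability of no false positive at all
    return 1.0 - (1.0 - alpha) ** n_tests

for n in (1, 10, 100):
    print(n, round(familywise_error_rate(0.05, n), 3))
```

Even at alpha = 0.05 per test, the chance of at least one false positive approaches 1 once the number of tests reaches the hundreds, which is the situation the corrections on the next card address.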
type 2 error
false negative
correcting type 1 error
- Bonferroni correction (very conservative)
alpha* = alpha / number of tests
- False discovery rate (FDR)
FDR = number of false positives / number of significant features
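The two corrections can be compared on a handful of p-values. This is a sketch under assumptions: the card does not name a specific FDR procedure, so the Benjamini-Hochberg step-up rule is used here as one common choice, and the p-values are made up.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject where p <= alpha* with alpha* = alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]

def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up rule (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject

p_values = [0.001, 0.008, 0.02, 0.045, 0.30]  # made-up example values
print(bonferroni_reject(p_values))
print(benjamini_hochberg_reject(p_values))
```

On these values Bonferroni rejects fewer hypotheses than the FDR rule, which illustrates why it is described as very conservative.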
permutation-based methods (controlling type 1 errors)
- Neyman-Pearson approach
Z, t, F and chi-square are derived from this
base for permutation approach
crucial difference with other methods -> it evaluates test statistics under their sampling distribution
does not easily deal with the multivariate nature of electrophysiological data
randomization
- hypothesis testing on measures of association
- mixes the real data randomly
variable 1 from an individual is paired with variable 2 from a randomly chosen individual -> this is repeated, and the estimate is then computed from the randomized data
- whole process repeated numerous times
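The randomization procedure above can be sketched for one measure of association. The choice of Pearson's correlation, the two-sided comparison, the data, and the helper names are all assumptions made for illustration.

```python
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def randomization_test(x, y, n_randomizations=2000, seed=0):
    """Shuffle y relative to x many times and count how often the
    randomized association is at least as strong as the observed one."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_randomizations):
        rng.shuffle(shuffled)  # pair each x with a randomly chosen y
        if abs(pearson_r(x, shuffled)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_randomizations

# Made-up data with a clear positive association
x = [float(i) for i in range(1, 13)]
y = [2.1, 2.9, 4.2, 4.8, 6.5, 6.1, 8.3, 8.0, 9.9, 10.4, 12.2, 11.8]
observed_r, p_value = randomization_test(x, y)
print(round(observed_r, 3), p_value)
```

Shuffling one variable breaks any real pairing between the two, so the shuffled correlations show what association would arise by chance alone.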
analysis of variance (ANOVA)
- method to compare group means (>2 groups)
- a generalisation of two sample t-test
- compares variability between groups with variability within groups
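The comparison of between-group and within-group variability can be written out as the one-way ANOVA F statistic. A minimal sketch with made-up data; the function name `one_way_anova_f` is an assumption of this example.

```python
import statistics

def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = statistics.fmean(v for g in groups for v in g)
    # Variability of the group means around the grand mean
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Variability of the observations around their own group mean
    ss_within = sum(
        (v - statistics.fmean(g)) ** 2 for g in groups for v in g
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]  # made-up data
f_stat = one_way_anova_f(groups)
print(f_stat)
```

With only two groups, this F equals the square of the pooled two-sample t statistic, which is the sense in which ANOVA generalises the t-test.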
steps to perform non-parametric tests
- collect trials of two experimental conditions in a single set
- randomly draw as many trials from this combined set as there were trials in condition 1 and place these in subset 1; place the remaining trials in subset 2 (= random partition)
- calculate test statistic
- repeat steps 2 and 3 many times and construct a histogram of the test statistics
- use the histogram and calculate the proportion of random partitions that resulted in a larger test statistic than the observed one = p-value
- if p-value < alpha -> the difference between the two conditions is statistically significant
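The steps above can be sketched as a short permutation test. The absolute difference of means as test statistic, the made-up trial values, and the function name are assumptions; the sketch also counts partitions with a statistic at least as large as the observed one (>=), a common slightly conservative variant of "larger than".

```python
import random
import statistics

def permutation_test(cond1, cond2, n_permutations=5000, seed=0):
    """Return (observed statistic, p-value) for two sets of trials."""
    rng = random.Random(seed)

    def stat(a, b):
        # Assumed test statistic: absolute difference of the means
        return abs(statistics.fmean(a) - statistics.fmean(b))

    observed = stat(cond1, cond2)
    combined = list(cond1) + list(cond2)      # step 1: single combined set
    n1 = len(cond1)
    at_least_as_large = 0
    for _ in range(n_permutations):           # step 4: repeat many times
        rng.shuffle(combined)                 # step 2: random partition
        subset1, subset2 = combined[:n1], combined[n1:]
        if stat(subset1, subset2) >= observed:  # step 3: test statistic
            at_least_as_large += 1
    # step 5: proportion of random partitions reaching the observed value
    return observed, at_least_as_large / n_permutations

# Made-up single-channel trial values for two conditions
cond1 = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 6.4, 5.9]
cond2 = [3.9, 4.2, 4.8, 4.1, 3.7, 4.5, 4.0, 4.4]
observed_diff, p_val = permutation_test(cond1, cond2)
print(round(observed_diff, 3), p_val)
```

The p-value is then compared with alpha as in the last step; no assumption about the shape of the underlying distribution is needed.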
non-parametric statistical testing of EEG and MEG data
- EEG and MEG data have spatiotemporal structure
- data are collected in different conditions
- analyses of MEG/EEG data have to deal with the multiple comparisons problem
- there are a large number of statistical comparisons