Probability, Sampling and Distributions Flashcards
what is a gaussian distribution
another name for the normal distribution: the symmetric, bell-shaped curve described by its mean and standard deviation
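what's positive skew
the bulk of the data (mode, median and mean) sits on the left, with a long tail stretching to the right; the mean is pulled furthest towards the tail
what's negative skew
the bulk of the data (mean, median and mode) sits on the right, with a long tail stretching to the left; the mean is pulled furthest towards the tail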
how do you measure skew
Pearson's coefficient of skew
uses the difference between the mean and the median
when the 'tail' of the data is on the left of the mean then PCOS (Pearson's coefficient of skew, not the ovary thing) is negative
when the tail of the data is on the right of the mean then PCOS is positive
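A minimal sketch of the sign rule above, assuming the common median-based form of the coefficient, 3 × (mean − median) / standard deviation. The function name and the two data sets are made up for illustration.

```python
import statistics

def pearson_skew(data):
    """Median-based Pearson coefficient of skew: 3 * (mean - median) / std dev."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    sd = statistics.stdev(data)  # sample standard deviation
    return 3 * (mean - median) / sd

# Made-up data: one has its tail on the right, the other is its mirror image.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]
left_tailed = [-x for x in right_tailed]

print(pearson_skew(right_tailed))  # positive: tail on the right
print(pearson_skew(left_tailed))   # negative: tail on the left
```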
what are parametric tests
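tests that use population parameters: summary values such as the mean and standard deviation, estimated from the data
they assume the mean and standard deviation accurately represent the population distribution of the data (typically that it is roughly normal)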
how and why do we transform data
we perform the same mathematical operation on all the data we have
it helps to reduce the impact of outliers and skew
e.g. we take the log of each data point, then perform a statistical test on the transformed values
also useful for viewing data in a standardised format
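A small sketch of the log transform described above, using made-up values that span several orders of magnitude. Before the transform the mean sits far above the median (positive skew); after it the two coincide. Note the log only works when every value is greater than zero.

```python
import math
import statistics

# Made-up, strongly right-skewed data (each value 10x the previous one).
data = [1, 10, 100, 1000, 10000]

# Apply the same operation to every data point (all values must be > 0).
logged = [math.log10(x) for x in data]

raw_gap = statistics.mean(data) - statistics.median(data)
log_gap = statistics.mean(logged) - statistics.median(logged)

print("raw:    mean - median =", raw_gap)  # large and positive
print("logged: mean - median =", log_gap)  # essentially zero
```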
what is a z-score
it tells us how many standard deviations a value is above or below the mean
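As a quick illustration, a z-score is (value − mean) / standard deviation. The scores below are invented, and the sketch assumes they are the whole population, so it uses the population standard deviation.

```python
import statistics

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Made-up test scores, treated here as the whole population (an assumption).
scores = [50, 60, 70, 80, 90]
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # population standard deviation

z_top = z_score(90, mean, sd)     # above the mean, so positive
z_middle = z_score(70, mean, sd)  # exactly at the mean, so zero
z_bottom = z_score(50, mean, sd)  # below the mean, so negative
print(z_top, z_middle, z_bottom)
```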
what is sampling error
we take a sample at random from a population of data in order to estimate parameters of the whole population
but the mean of each sample differs from the true mean of the population
this is sampling error
e.g. trying to find out the mean age of 50 people in a room by picking 10 at random and asking them
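The room example can be simulated roughly as follows. The ages are randomly generated placeholders (the 18 to 65 range is an assumption), so the exact numbers mean nothing; the point is that each sample of 10 gives a slightly different mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical room: 50 people with invented ages between 18 and 65.
ages = [random.randint(18, 65) for _ in range(50)]
true_mean = statistics.mean(ages)

errors = []
for i in range(3):
    sample = random.sample(ages, 10)       # ask 10 people at random
    sample_mean = statistics.mean(sample)
    errors.append(sample_mean - true_mean)  # sampling error for this sample
    print(f"sample {i + 1}: mean = {sample_mean:.1f}, error = {errors[-1]:+.1f}")

print(f"true mean of the room = {true_mean:.1f}")
```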
what is the standard error (usually standard error of the mean)
tells us how much a statistic (usually the mean) is likely to vary between samples
effectively a measure of how confident we can be that the sample mean is close to the true population mean
it depends on: the variability of the original data (standard deviation of the population) and the amount of data used to create the sample mean (more data gives a smaller standard error)
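A rough sketch of both dependencies, using the usual formula standard error = standard deviation / √n. It assumes the sample standard deviation stands in for the population one, which is normally unknown. The data are made up.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample std dev divided by sqrt(sample size)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

small = [4, 8, 6, 5, 7]  # made-up sample, n = 5
large = small * 4        # same values repeated, n = 20

se_small = standard_error(small)
se_large = standard_error(large)

print(f"n = {len(small)}: SE = {se_small:.3f}")
print(f"n = {len(large)}: SE = {se_large:.3f}")  # more data, smaller SE
```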
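what's a confidence interval
similar to standard error but with a more intuitive feel: a range around the sample estimate that is likely to contain the true population value (e.g. a 95% confidence interval is roughly the mean ± 1.96 standard errors)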
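One common way to get a 95% confidence interval for a mean is mean ± 1.96 × standard error. The sketch below assumes that normal approximation; for small samples like this made-up one, a t-distribution multiplier would give a somewhat wider interval.

```python
import math
import statistics

def confidence_interval_95(sample):
    """Approximate 95% CI for the mean, using the normal multiplier 1.96."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    margin = 1.96 * se
    return mean - margin, mean + margin

sample = [4, 8, 6, 5, 7]  # made-up data with a mean of 6
low, high = confidence_interval_95(sample)
print(f"95% CI: {low:.2f} to {high:.2f}")
```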
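what's a key principle to remember about error bars
as a rule of thumb, if they overlap then it suggests there isn't a significant difference (a rough guide only: it depends on what the bars show, and a proper test is still needed)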
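The overlap check itself is simple to sketch. The function name, group means and bar half-widths below are invented, and the result is only the rough visual heuristic, not a significance test.

```python
def intervals_overlap(a, b):
    """True if two (low, high) error-bar ranges share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical groups: mean plus or minus the half-width of the error bar.
group_a = (10 - 2, 10 + 2)  # 8 to 12
group_b = (13 - 2, 13 + 2)  # 11 to 15
group_c = (20 - 2, 20 + 2)  # 18 to 22

print(intervals_overlap(group_a, group_b))  # bars overlap
print(intervals_overlap(group_a, group_c))  # bars do not overlap
```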