cleaning, storing and summarising data Flashcards
what 3 things would you do to determine whether data is parametric (approximately normally distributed) or non-parametric?
- plot a histogram
- plot a QQ plot
- carry out a Shapiro-Wilk test
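how does the Shapiro-Wilk test test for parametric/non-parametric data?
it tests the null hypothesis that the data come from a normal distribution. a p-value below the chosen significance level (commonly 0.05) rejects the null hypothesis, so the data are treated as non-normal (non-parametric); otherwise normality is not rejected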
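A minimal sketch of that decision rule, assuming SciPy is installed and using its `scipy.stats.shapiro` function. The helper name `shapiro_verdict`, the 0.05 threshold and both samples are illustrative assumptions, not part of the original cards; the samples are built from theoretical quantiles so the result does not depend on random draws.

```python
import math
from statistics import NormalDist

from scipy import stats


def shapiro_verdict(sample, alpha=0.05):
    """Run a Shapiro-Wilk test and report whether normality is NOT rejected.

    alpha is the significance level (0.05 is an assumed, conventional choice).
    """
    w, p = stats.shapiro(sample)
    return float(w), float(p), bool(p >= alpha)


# Made-up illustrative samples (assumption): evenly spaced quantiles of a
# normal distribution and of an exponential (right-skewed) distribution.
n = 100
probs = [(i - 0.5) / n for i in range(1, n + 1)]
bell_shaped = [NormalDist(50, 10).inv_cdf(q) for q in probs]
right_skewed = [-math.log(1 - q) for q in probs]

for name, sample in [("bell-shaped", bell_shaped), ("right-skewed", right_skewed)]:
    w, p, looks_normal = shapiro_verdict(sample)
    decision = "normality not rejected" if looks_normal else "normality rejected"
    print(f"{name}: W={w:.3f}, p={p:.4f} -> {decision}")
```

Failing to reject the null hypothesis does not prove the data are normal, so the test is best read alongside the histogram and QQ plot.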
how do histograms help determine normality of data and what will more data achieve?
the shape of the distribution can be observed: is it a normal bell shape or something else?
larger data sets tend to have a smoother curve
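A rough text-only sketch of the histogram check, using just the Python standard library. The function names `bin_counts` and `print_histogram`, the choice of 10 bins and the randomly generated samples are all assumptions made for illustration. Printing a small and a large sample side by side shows how the outline tends to smooth out as more data is added.

```python
import random


def bin_counts(values, n_bins=10):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values are equal
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def print_histogram(values, n_bins=10, max_width=40):
    """Print one row of asterisks per bin, scaled to max_width characters."""
    counts = bin_counts(values, n_bins)
    tallest = max(counts)
    for c in counts:
        print("*" * round(c / tallest * max_width), c)


# Illustrative samples (assumption): draws from a normal distribution.
random.seed(1)
small_sample = [random.gauss(0, 1) for _ in range(30)]
large_sample = [random.gauss(0, 1) for _ in range(3000)]

print("n = 30")
print_histogram(small_sample)
print("n = 3000")
print_histogram(large_sample)
```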
should outliers be removed?
only if there is a definite reason for doing so, e.g. the method was clearly wrong or a freak confounding variable skewed the results
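how do QQ plots determine normality of data and what would a strong positive correlation suggest?
the ordered data are plotted against the theoretical values (quantiles) expected from a normal distribution. a strong positive correlation, with points lying close to a straight line, would suggest normal data.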
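A small sketch of the idea behind a QQ plot, using only the Python standard library. The helper names `qq_points` and `qq_correlation`, the plotting positions `(i - 0.5) / n` and both samples are assumptions chosen for illustration. Because both axes are sorted, the correlation is always positive, so what matters is how close it gets to 1.

```python
import math
from statistics import NormalDist


def qq_points(sample):
    """Pair each sorted observation with the matching standard-normal quantile."""
    n = len(sample)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(sample)))


def qq_correlation(sample):
    """Pearson correlation between theoretical and observed quantiles."""
    pts = qq_points(sample)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative samples (assumption): one roughly symmetric, one skewed.
roughly_symmetric = [2.1, 3.4, 4.0, 4.6, 5.0, 5.2, 5.7, 6.3, 6.9, 8.1]
strongly_skewed = [1, 1, 1, 2, 2, 3, 5, 9, 20, 60]

print("roughly symmetric:", round(qq_correlation(roughly_symmetric), 3))
print("strongly skewed:  ", round(qq_correlation(strongly_skewed), 3))
```

The symmetric sample gives a correlation very close to 1, while the skewed sample falls noticeably lower because its points bend away from the straight line.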