lecture 4 - data types, collection, management and exploratory data analysis Flashcards
numerical discrete data
numbers are separated by regular gaps (e.g. whole-number counts)
numerical continuous data
numbers can take any value in a continuum
categorical nominal data
categories are arbitrary and can be reordered
categorical ordinal
the order of the categories is important
what is the goal of exploratory data analysis
describe data numerically and visualise it graphically
what are the things used to describe data
- centre - bulk of the data
- spread - consistency of data
- shape - symmetrical or skewed
what is the goal of summary statistics
convey as much information about the data in as few numbers as possible
median
midpoint of the data
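A minimal sketch of the median, assuming Python's standard `statistics` module; the sample values are made up for illustration. With an even number of values, `statistics.median` averages the two middle values.

```python
import statistics

# Illustrative data (not from the lecture).
odd_sample = [3, 1, 4, 1, 5]   # sorted: 1, 1, 3, 4, 5
even_sample = [3, 1, 4, 1]     # sorted: 1, 1, 3, 4

# Odd count: the single middle value of the sorted data.
odd_median = statistics.median(odd_sample)

# Even count: the mean of the two middle values.
even_median = statistics.median(even_sample)

print(odd_median, even_median)
```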
what are the measurements of spread?
- percentiles, range and interquartile range
- variance and standard deviation
pth percentile
the value that p% of the data are less than or equal to
Q1
25th percentile
Q2
50th percentile
Q3
75th percentile
what does a 5 number summary contain
min, Q1, median, Q3, max
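A small sketch of a 5 number summary, assuming Python's standard `statistics.quantiles` with `method="inclusive"`. The function name `five_number_summary` and the data are hypothetical, and other quartile conventions can give slightly different Q1 and Q3 on the same data.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    # n=4 splits the data into quarters and returns the three cut points.
    # "inclusive" treats the data as containing its own min and max.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (min(data), q1, q2, q3, max(data))

# Illustrative data (not from the lecture).
values = [9, 1, 8, 2, 7, 3, 6, 4, 5]
summary = five_number_summary(values)
print(summary)

# The interquartile range is the distance between Q3 and Q1.
iqr = summary[3] - summary[1]
print(iqr)
```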
variance
measure of how far the data are spread around the mean (based on squared deviations from the mean)
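A minimal sketch of variance and standard deviation computed by hand. The flashcards do not say whether the population form (divide by n) or the sample form (divide by n - 1) is intended, so the hypothetical `variance` function below takes a `sample` flag; the data are made up for illustration.

```python
import math

def variance(data, sample=False):
    """Average squared deviation of the data from its mean.

    sample=False divides by n (population variance);
    sample=True divides by n - 1 (sample variance).
    """
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return squared_deviations / (n - 1 if sample else n)

# Illustrative data (not from the lecture); the mean is 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = variance(values)                  # squared deviations sum to 32, n = 8
samp_var = variance(values, sample=True)    # same sum divided by n - 1 = 7
pop_sd = math.sqrt(pop_var)                 # standard deviation is the square root

print(pop_var, samp_var, pop_sd)
```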