3. Representations of data Flashcards
How do you calculate outliers?
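High outliers: any value greater than Q3 + k(Q3 - Q1)
Low outliers: any value less than Q1 - k(Q3 - Q1)
k is a constant, usually given in the question; 1.5 is a common choice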
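The outlier rule can be sketched in Python. This is only an illustration: the function names are hypothetical, the quartiles are passed in rather than calculated (quartile conventions vary), and the default k = 1.5 is an assumption, not something fixed by the rule itself.

```python
def outlier_limits(q1, q3, k=1.5):
    """Return (lower_limit, upper_limit); k=1.5 is an assumed default."""
    iqr = q3 - q1  # interquartile range, Q3 - Q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the values lying strictly outside the two limits."""
    low, high = outlier_limits(q1, q3, k)
    return [v for v in values if v < low or v > high]


# Made-up example: Q1 = 10 and Q3 = 20, so the IQR is 10
print(outlier_limits(10, 20))
print(find_outliers([2, 12, 15, 18, 40], 10, 20))
```

With these made-up numbers the limits are -5 and 35, so only 40 counts as an outlier.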
What is the difference between outliers and anomalies?
Outliers: unusual yet legitimate values, could still be correct
Anomalies: clear errors that would distort the results if left in; they can be caused by experimental or recording errors
The process of removing anomalies is known as cleaning the data
What are the features of a box plot?
Lowest value
Lower quartile
Median
Upper quartile
Highest value
Outliers are marked with a cross on the graph
What are cumulative frequency graphs?
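A graph of the running total of the frequencies, plotted against the upper class boundary of each class, with the points joined by a smooth curve
It can be used to estimate the median, quartiles and percentiles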
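A cumulative frequency graph is built from running totals of the class frequencies. The sketch below shows how the points to plot could be worked out; the function name and the class data are made up for illustration.

```python
from itertools import accumulate


def cumulative_points(upper_bounds, frequencies):
    """Pair each upper class boundary with the running total of frequencies."""
    running_totals = accumulate(frequencies)  # e.g. 4, 4+9, 4+9+12, ...
    return list(zip(upper_bounds, running_totals))


# Hypothetical grouped data: classes ending at 10, 20, 30 and 40
points = cumulative_points([10, 20, 30, 40], [4, 9, 12, 5])
print(points)
```

The last running total equals the total frequency (30 here), which is a quick check that the table has been added up correctly.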
How do you compare data?
Comment on:
A measure of location
A measure of spread
E.g. mean and standard deviation or median and interquartile range
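As a rough illustration of comparing two data sets, the sketch below reports one measure of location (the mean) and one measure of spread (the standard deviation) for each. The data are made up, the `summarise` helper is hypothetical, and the population form of the standard deviation (dividing by n) is assumed.

```python
import statistics


def summarise(values):
    """Return a measure of location and a measure of spread for the data."""
    return {
        "mean": statistics.mean(values),           # measure of location
        "std_dev": statistics.pstdev(values),      # population standard deviation
    }


# Two made-up data sets with the same mean but different spread
set_a = [2, 4, 4, 4, 5, 5, 7, 9]
set_b = [5, 5, 5, 5, 5, 5, 5, 5]
print(summarise(set_a))
print(summarise(set_b))
```

A comparison would then read something like: both sets have the same mean, but set A has the larger standard deviation, so its values are more spread out.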
What is the equation for histograms?
Frequency = frequency density × class width
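The histogram relationship can be used in either direction. A minimal sketch with hypothetical function names and made-up numbers:

```python
def frequency_density(frequency, class_width):
    """Height of a histogram bar: frequency divided by class width."""
    return frequency / class_width


def frequency_from_bar(density, class_width):
    """Recover the frequency from a bar's height and width."""
    return density * class_width


# Made-up class: 10 observations in a class of width 4
height = frequency_density(10, 4)
print(height)
print(frequency_from_bar(height, 4))
```

Here the bar has height 2.5, and multiplying back by the class width returns the original frequency of 10.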