Presentation of Data Flashcards
what are the purposes of screening data
to detect blunders
locate outliers
determine distributional properties
determine number of missing values
how is a small data set screened?
by eye
how is a large data set screened
frequency table or histogram
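A minimal sketch of screening a data set with a frequency table, assuming missing values are stored as None and that a plausible range for the variable is known in advance. The ages, the 0 to 120 range and the function name screen are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical ages; None marks a missing value and 250 is a likely blunder.
ages = [34, 41, None, 29, 250, 41, 38, None, 35, 41]

def screen(values, low, high):
    """Count missing values, flag values outside [low, high], tabulate the rest."""
    present = [v for v in values if v is not None]
    return {
        "missing": len(values) - len(present),
        "out_of_range": [v for v in present if not low <= v <= high],
        "frequencies": Counter(present),
    }

report = screen(ages, low=0, high=120)
print("missing values:", report["missing"])
print("possible blunders:", report["out_of_range"])
for value, freq in sorted(report["frequencies"].items()):
    print(value, freq)
```

A flagged value is only a candidate blunder; whether it is an error or a genuine outlier still has to be checked against the source records.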
what are the 2 main types of variables
categorical and quantitative
what are categorical variables
occur when each individual falls into a category
divided into nominal (no ordering, eg sex)
and ordinal (have an ordering, eg pain)
what is the frequency distribution
frequency of the occurrence of different values of a variable
what is relative frequency
frequency expressed as a proportion of the total frequency
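The two definitions above can be sketched as follows. The pain scores and the function name frequency_table are hypothetical; the calculation itself is just a count per value divided by the total count.

```python
from collections import Counter

# Hypothetical pain scores recorded on an ordinal scale.
pain = ["mild", "none", "mild", "severe", "moderate", "mild", "none", "moderate"]

def frequency_table(values):
    """Return {value: (frequency, relative frequency)}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: (n, n / total) for value, n in counts.items()}

table = frequency_table(pain)
for value, (freq, rel) in table.items():
    print(f"{value:<10}{freq:>3}{rel:>8.3f}")
```

The relative frequencies always sum to 1, which is a quick check that no observation was dropped.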
how are interval scale variables graphically presented
histograms or box plots
what is a box plot
graphical display of the 5 point summary of the data: the minimum, 1st quartile, median, 3rd quartile and maximum values
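A minimal sketch of the 5 point summary behind a box plot, using the standard library. The sample is hypothetical, and the choice of method="inclusive" is an assumption: several quartile conventions exist and they can give slightly different values for the same data.

```python
import statistics

# Hypothetical sample.
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12]

def five_point_summary(values):
    """Return (minimum, 1st quartile, median, 3rd quartile, maximum)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))

summary = five_point_summary(sample)
print(summary)
```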
what do summary statistics attempt to capture
a typical value (the location) and the spread (or dispersion)
what 2 measurements are used for location
mean and median
what is mean
sum of all the observations divided by the total number of observations
what is the median
middle value when the sample is arranged in increasing order. approx 50% of the sample is less than the median and approx 50% is greater than the median
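The two location measures above can be written out directly from their definitions. The samples are hypothetical. For an even number of observations this sketch takes the average of the two middle values, which is the usual convention but is not stated on the card.

```python
def mean(sample):
    """Sum of all the observations divided by the number of observations."""
    return sum(sample) / len(sample)

def median(sample):
    """Middle value of the sample once arranged in increasing order."""
    ordered = sorted(sample)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even sample size: average the two middle values (assumed convention).
    return (ordered[mid - 1] + ordered[mid]) / 2

clean = [3, 5, 7, 9, 11]
with_outlier = [3, 5, 7, 9, 100]  # hypothetical outlying observation

print(mean(clean), median(clean))
print(mean(with_outlier), median(with_outlier))
```

In this example the outlying observation pulls the mean upwards while the median stays the same.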
what summary statistics measure the spread
range, interquartile range, variance, standard deviation, coefficient of variation
what is the range
difference between the largest and smallest observations in the sample - not recommended as it is severely affected by outlying observations
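A minimal sketch of the spread measures listed above, using the standard library. The samples are hypothetical. Two assumptions are made: the variance is the sample variance (divisor n - 1), and quartiles use the "inclusive" convention; other conventions would change the numbers slightly.

```python
import statistics

def spread(sample):
    """Return the range, interquartile range, variance, SD and coefficient of variation."""
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return {
        "range": max(sample) - min(sample),
        "iqr": q3 - q1,
        "variance": statistics.variance(sample),  # sample variance
        "sd": sd,
        "cv": sd / statistics.mean(sample),  # SD relative to the mean
    }

clean = spread([1, 3, 5, 7, 9])
with_outlier = spread([1, 3, 5, 7, 90])  # hypothetical outlying observation

print(clean)
print(with_outlier)
```

Comparing the two results shows why the range is not recommended: here a single outlying observation changes the range a great deal, while the interquartile range is unchanged.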