CSCI 343 Quiz 1 Flashcards
continuous data types
float, double
discrete data types
int
categorical data types
specific set, enum (ex: red, orange, yellow; classified, unclassified, part-time)
binary data types
boolean, 0/1, T/F, logical
ordinal data types
categorical with order (ex: rating 1, 2, 3, 4, 5; fresh, soph, jr, sr)
mean (aka average)
sum / count
trimmed mean
drop a fixed number of the highest values and the lowest values, then calculate the mean of the remaining values (ex: Olympics)
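A minimal sketch of the trimmed mean in Python, assuming the same number of values (the hypothetical parameter `k`) is dropped from each end. The function name and the sample scores are illustrative only.

```python
def trimmed_mean(values, k):
    """Mean of the values left after dropping the k lowest and k highest."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must leave at least one value")
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]  # slice off k values at each end
    return sum(kept) / len(kept)


# Hypothetical judges' scores with one very low outlier.
scores = [9.5, 9.0, 8.5, 9.0, 2.0, 10.0, 9.0]
print(trimmed_mean(scores, 1))
```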
weighted mean (aka weighted average)
sum of each value times its corresponding weight, divided by the sum of the weights (ex: calculating grades)
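A short sketch of the weighted mean for the grading example. The function name, the grades, and the weights are made up for illustration.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical course: exam worth 50, project worth 30, homework worth 20.
grades = [90, 80, 70]
weights = [50, 30, 20]
print(weighted_mean(grades, weights))
```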
median
middle number (when sorted) if there is an odd # of elements; if there is an even # of elements, the median is the average of the two middle elements
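A sketch of the odd/even rule in Python. The function name is illustrative; the standard library's `statistics.median` follows the same rule.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    if not values:
        raise ValueError("median of empty data")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: single middle element
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle elements


print(median([3, 1, 2]))
print(median([4, 1, 3, 2]))
```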
(median/mean) is better for skewed data sets b/c ?
median b/c it is not pulled toward outliers the way the mean is
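A small illustration of why the median holds up better, using Python's standard `statistics` module. The data values are invented: one outlier replaces the largest value.

```python
import statistics

typical = [30, 32, 35, 38, 40]
with_outlier = [30, 32, 35, 38, 1000]  # same data, largest value replaced by an outlier

# The mean moves a lot; the median does not move at all.
print(statistics.mean(typical), statistics.median(typical))
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```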
deviations
difference between the observed values and the mean
variance
sum of the squared deviations from the mean, divided by n-1
standard deviation
square root of the variance
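A sketch tying deviations, variance, and standard deviation together, assuming the sample (n-1) form from the variance card. The function names and data are illustrative.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    deviations = [x - mean for x in values]  # observed value minus the mean
    return sum(d ** 2 for d in deviations) / (n - 1)


def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(data), sample_std(data))
```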
range
max - min
percentile
the pth percentile is a value (not necessarily in the data set) such that at least p% of the data items are this value or less and at least (100-p)% of the data items are this value or more
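One way to compute a value that satisfies this definition is the nearest-rank method, sketched below. This is only one of several conventions (many libraries interpolate between data points instead), and the function name is illustrative.

```python
import math


def percentile_nearest_rank(values, p):
    """Smallest data value with at least p% of the items at or below it."""
    if not values:
        raise ValueError("percentile of empty data")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    # Rank is 1-based; p = 0 is mapped to the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


data = [15, 20, 35, 40, 50]
print(percentile_nearest_rank(data, 40))
print(percentile_nearest_rank(data, 50))
```

Note that nearest rank always returns a value from the data set, while the definition also allows values that fall between data points.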