Week Three - Data Types/Variables/Descriptive Statistics Flashcards
What are the 4 main data types?
categorical/nominal
ordinal
interval
ratio
Define Categorical/Nominal Variables
Discrete
An arbitrary label (e.g., male, non-smoker)
Label can be text or a number
Text: vanilla, chocolate, strawberry
Number: 1, 18, 7 (the number is only a code, not a quantity)
Only valid mathematical operation is counting
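A minimal Python sketch of why counting is the only meaningful operation on nominal data. The flavour list and the numeric codes are made-up illustration data, not part of the original cards.

```python
from collections import Counter

# Hypothetical nominal data: ice-cream flavour labels.
flavours = ["vanilla", "chocolate", "vanilla", "strawberry", "vanilla"]

# Counting how often each label occurs is a valid operation.
counts = Counter(flavours)
most_common_label, most_common_count = counts.most_common(1)[0]
print(most_common_label, most_common_count)

# Numeric codes are still arbitrary labels: adding or averaging
# them (e.g. (1 + 18 + 7) / 3) would not describe any flavour.
codes = {"vanilla": 1, "chocolate": 18, "strawberry": 7}
```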
Define key characteristics of an Ordinal Scale
Discrete
Inherent order (ranks)
Some information about quantity
Movement along the scale indicates a change in amount, but doesn’t indicate how much change
Can perform logical (comparison) operations on this scale, such as greater than or less than
(e.g., kinder, primary, high, college, bachelor, masters, phd)
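A short Python sketch of comparing values on an ordinal scale. The helper name `is_higher` and the rank dictionary are assumptions made for illustration; the ranks encode order only, so the gaps between them carry no meaning.

```python
# Education levels in their inherent order (lowest to highest).
levels = ["kinder", "primary", "high", "college", "bachelor", "masters", "phd"]

# Map each level to its rank; only the ordering of ranks is meaningful.
rank = {level: position for position, level in enumerate(levels)}

def is_higher(a, b):
    """Return True if level a ranks above level b."""
    return rank[a] > rank[b]

print(is_higher("masters", "primary"))
```

Subtracting ranks (for example `rank["phd"] - rank["masters"]`) would produce a number, but it would not say how much more education is involved.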
Define key characteristics of an Interval Scale
Order + equal intervals
Continuous (though measurement may not be)
Mathematical operations (addition, subtraction)
How much more (or less) of something is there?
Does not have true zero
If the scale has zero in it, 0 does not mean absence of the thing.
E.g., temperature (Celsius)
0° vs 5°; 25° vs 30°: the difference is 5° in both cases (0 °C does not mean no heat)
Define the key characteristics of Ratio Scales
Order, equal intervals + a true zero
Physical quantities are ratio scale (mass, length, time, etc.)
0 kg = absence of mass; 0 meters = absence of length
Can calculate ratios of different values
50 kg is twice as much as 25 kg
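A small Python sketch contrasting interval and ratio scales. The Kelvin conversion is an addition for illustration (it is not on the original cards): Kelvin has a true zero, so it shows what the real temperature ratio is.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return celsius + 273.15

# Ratio scale: mass has a true zero, so ratios are meaningful.
mass_ratio = 50 / 25  # 50 kg really is twice 25 kg

# Interval scale: differences are meaningful...
same_difference = (5 - 0) == (30 - 25)

# ...but ratios are not. 20 / 10 gives 2.0, yet 20 degrees C is not
# "twice as hot" as 10 degrees C, as the Kelvin ratio shows.
celsius_ratio = 20 / 10
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(mass_ratio, same_difference, celsius_ratio, round(kelvin_ratio, 3))
```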
What are the 2 forms of discrete variables?
Categorical and ordinal
What is the Mode?
Most commonly occurring value in a set
Sample can have more than one mode
Bimodal = two modal values
Multimodal = more than two modal values
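A minimal Python sketch of finding the mode(s), assuming Python 3.8 or later for `statistics.multimode`. The scores are made-up illustration data.

```python
import statistics

# Hypothetical sample in which 3 and 7 each occur twice.
scores = [2, 3, 3, 5, 7, 7, 9]

# multimode returns every value tied for most frequent,
# in the order they first appear in the data.
modes = statistics.multimode(scores)
is_bimodal = len(modes) == 2

print(modes, is_bimodal)
```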
What is the Median?
Middle value of the ordered data: the same number of observations fall below and above it
What is the Mean?
Sum of all scores divided by the number of scores (average); the value around which scores are distributed
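A short Python sketch comparing the median and the mean on the same made-up sample. The single extreme score is included to illustrate that it pulls the mean but leaves the median unchanged.

```python
import statistics

# Hypothetical scores with one extreme value.
scores = [1, 2, 3, 4, 100]

# Median: middle value once the data are sorted.
median_score = statistics.median(scores)

# Mean: sum of the scores divided by how many there are.
mean_score = sum(scores) / len(scores)

print(median_score, mean_score)
```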
What are the 3 most commonly used measures of spread/dispersion?
Range
IQR (interquartile range)
Sample SD (standard deviation)
Define the ‘Range’. What happens if a range score is an outlier?
Maximum - Minimum
If min and/or max is an outlier, the range overestimates variability in the data
Range tends to increase as sample size increases
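A minimal Python sketch of the range and its sensitivity to an outlier. The helper name `value_range` and both samples are assumptions made for illustration.

```python
def value_range(data):
    """Range = maximum - minimum."""
    return max(data) - min(data)

# Hypothetical sample without an outlier.
typical = [4, 5, 6, 7, 8]

# Same sample with one extreme value added.
with_outlier = typical + [40]

range_typical = value_range(typical)
range_with_outlier = value_range(with_outlier)

print(range_typical, range_with_outlier)
```

One extra observation makes the range nine times larger here, even though the other five values are unchanged.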
Define ‘quartiles’
Quartiles are the cut points that divide the ordered data into four equal-sized groups
What is the lower quartile?
What is the upper quartile?
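Lower quartile (Q1) = 25th percentile: 25% of the data fall below it
Upper quartile (Q3) = 75th percentile: 75% of the data fall below it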
What is the IQR?
What does it measure?
Bigger IQR = ?
Q3 - Q1 (the difference between the upper and lower quartiles)
IQR measures the spread of the middle 50% of the data
Bigger IQR = greater dispersion
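A Python sketch of computing the quartiles and IQR, assuming Python 3.8 or later for `statistics.quantiles`. The data are made up, and other quartile conventions (different software, or `method="inclusive"`) can give slightly different cut points.

```python
import statistics

# Hypothetical ordered sample of 11 observations.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 asks for the three cut points that split the data into quarters.
# The default method is "exclusive".
q1, q2, q3 = statistics.quantiles(data, n=4)

# IQR = Q3 - Q1, the spread of the middle 50% of the data.
iqr = q3 - q1

print(q1, q2, q3, iqr)
```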
What is variance? What does it measure?
A measurement of the spread between numbers in a data set.
It measures how far the numbers in the set are from the mean, and therefore how far they tend to be from one another.
Variance is roughly the average of the squared differences from the mean (the sample variance divides by n - 1 rather than n)
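A Python sketch of the variance calculation done step by step and then checked against the standard library. The data are made up; the "roughly" in the definition is because the sample variance divides by n - 1, while the population variance divides by n.

```python
import math
import statistics

# Hypothetical sample with a mean of 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)

# Squared difference of each value from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]

# Population variance: divide by n.
population_variance = sum(squared_diffs) / len(data)

# Sample variance: divide by n - 1.
sample_variance = sum(squared_diffs) / (len(data) - 1)

# Sample SD is the square root of the sample variance.
sample_sd = math.sqrt(sample_variance)

print(population_variance, round(sample_variance, 3), round(sample_sd, 3))
```

The same values should come back from `statistics.pvariance(data)`, `statistics.variance(data)`, and `statistics.stdev(data)`.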