L14, L16- Statistics and Distribution Flashcards
describe general addition rule (of probabilities)
P(A or B) = P(A) + P(B) - P(A and B)
OR (if A/B are mutually exclusive)
P(A or B) = P(A) + P(B)
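A quick check of both forms of the addition rule, as a minimal sketch. The fair six-sided die, the events, and the helper `prob` are illustrative assumptions, not part of the original cards.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes), all outcomes equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

even = {2, 4, 6}          # A: roll is even
above_three = {4, 5, 6}   # B: roll is greater than 3 (overlaps A)
one = {1}                 # C: roll is 1 (mutually exclusive with A)

# General rule: subtract the overlap so it is not counted twice.
general = prob(even) + prob(above_three) - prob(even & above_three)

# Mutually exclusive case: the overlap is empty, so nothing is subtracted.
exclusive = prob(even) + prob(one)

print(general, prob(even | above_three))
print(exclusive, prob(even | one))
```

In each printed pair, the value from the rule should match the probability computed directly from the union of the two sets.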
describe multiplication rule (of probabilities)
P(A and B) = P(A given B)P(B)
OR (if A/B are independent)
P(A and B) = P(A)P(B)
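A small worked example of both forms of the multiplication rule. The card-drawing scenario (a standard 52-card deck with 4 aces) and the variable names are assumptions chosen for illustration.

```python
from fractions import Fraction

# Event B: the first card drawn is an ace.
p_first_ace = Fraction(4, 52)

# Dependent case (no replacement): P(A given B) changes because
# one ace has already left the deck.
p_second_ace_given_first = Fraction(3, 51)
p_both_without_replacement = p_second_ace_given_first * p_first_ace

# Independent case (card is put back and the deck reshuffled):
# the second draw is unaffected by the first, so P(A given B) = P(A).
p_second_ace = Fraction(4, 52)
p_both_with_replacement = p_second_ace * p_first_ace

print(p_both_without_replacement)  # 1/221
print(p_both_with_replacement)     # 1/169
```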
list types of categorical data
- nominal data
- ordinal data
list types of continuous data
- interval data (equal distance between data points)
- ratio data (interval data with a true 0)
explain the two basic descriptions for numerical data
Measures of Central Tendency: center of the data (mean, median, mode)
Measures of Variation: spread of data / distribution
(T/F) every dataset has one mode
F- there can be zero, one, or multiple modes of a dataset
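The three cases can be sketched with a small helper. The function `modes` is hypothetical, and it assumes one common convention: a dataset in which no value repeats is treated as having no mode.

```python
from collections import Counter

def modes(data):
    """Return the list of modes, sorted; empty if no value repeats (assumed convention)."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # Every value occurs exactly once, so no value stands out as a mode.
        return []
    return sorted(value for value, count in counts.items() if count == top)

print(modes([1, 2, 3, 4]))     # zero modes
print(modes([1, 2, 2, 3]))     # one mode
print(modes([1, 1, 2, 2, 3]))  # multiple modes
```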
list the measures of variation
- range
- variance
- standard deviation (sqrt(var))
- quartiles
define IQR
interquartile range (Q3 - Q1)
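The measures of variation, including the IQR, can be computed with Python's standard `statistics` module, as a rough sketch. The dataset is made up for illustration. Quartile values depend on the interpolation method, and `statistics.quantiles` uses its default "exclusive" method here, so other tools may report slightly different Q1 and Q3.

```python
import math
import statistics

# Made-up sample data for illustration.
data = [2, 4, 4, 5, 7, 9, 10, 12]

data_range = max(data) - min(data)       # range
variance = statistics.variance(data)     # sample variance
std_dev = statistics.stdev(data)         # sample standard deviation

# n=4 splits the data into quartiles and returns the cut points [Q1, Q2, Q3].
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                            # interquartile range

print(data_range, variance, std_dev)
print(q1, q3, iqr)
print(math.isclose(std_dev, math.sqrt(variance)))  # std dev is sqrt(var)
```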
differentiate between Probability Density and Probability Distribution
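Density- for continuous variables; the probability of a range of values is the area under the curve (integral), and the total area is 1
Distribution- for discrete (categorical) variables; each outcome has its own probability, and the probabilities of all possible outcomes sum to 1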
define µ and x̅ (in statistics)
µ- population mean
x̅- sample mean
define Standard Normal Distribution
µ = 0, σ = 1
values above µ have positive Z-values, values below µ have negative Z-values
area under the curve = probability
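To illustrate "area under the curve = probability", the sketch below uses `statistics.NormalDist` from the Python standard library, whose defaults are µ = 0 and σ = 1. The variable names are illustrative.

```python
from statistics import NormalDist

# NormalDist() defaults to mu=0, sigma=1: the standard normal distribution.
standard_normal = NormalDist()

# cdf(z) is the area under the curve to the left of z, i.e. P(Z <= z).
p_below_mean = standard_normal.cdf(0)

# Area between two Z-values is the difference of the two left-hand areas.
p_within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(p_below_mean)      # half of the area lies below the mean
print(p_within_one_sd)   # roughly 0.68
```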
what is the formula to convert a normal distribution to a standard normal distribution
Z = (X - µ) / σ
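A minimal sketch of the conversion. The helper `z_score` and the example values (a score of 85 from a distribution with µ = 70 and σ = 10) are assumptions for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Convert a value x from a normal distribution (mu, sigma) to a Z-value."""
    return (x - mu) / sigma

# Assumed example: score of 85, population mean 70, standard deviation 10.
z = z_score(85, mu=70, sigma=10)
print(z)  # 1.5

# The probability is the same on either scale:
# P(X <= 85) under N(70, 10) matches P(Z <= 1.5) under the standard normal.
p_original = NormalDist(mu=70, sigma=10).cdf(85)
p_standard = NormalDist().cdf(z)
print(p_original, p_standard)
```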
to check the normality of data (normality assumption), (1) and (2) are used
1- histogram
2- boxplot