Stats Review Flashcards
exploratory data analysis (EDA)
the first step in data analysis; used to understand the structure, quality, and patterns of the data before running statistical tests
numerical data types
continuous: can take any value within a range (e.g., weight)
discrete: takes only distinct, separate values, usually counts
categorical data types
nominal: unordered categories
ordinal: ordered categories
research process
Develop a scientific question.
Collect data.
Analyze data (EDA first, then statistical tests).
Make conclusions based on findings.
summarizing numerical data
measures of center: mean, median
measures of spread: range, IQR, standard deviation
five number summary: min, Q1, median, Q3, max
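A minimal sketch of the summaries listed above, using Python's standard `statistics` module. The sample values are made up for illustration, and the "inclusive" quartile method is an assumption; other quartile conventions can give slightly different Q1 and Q3.

```python
import statistics

# Hypothetical sample values, made up for illustration only.
data = [2, 4, 4, 5, 7, 9, 10, 12, 15]

# Measures of center
mean = statistics.mean(data)
median = statistics.median(data)

# Quartiles using the "inclusive" method (an assumption; conventions differ).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Measures of spread
data_range = max(data) - min(data)
iqr = q3 - q1
std_dev = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)

# Five number summary: min, Q1, median, Q3, max
five_number_summary = (min(data), q1, median, q3, max(data))

print("mean:", mean, "median:", median)
print("range:", data_range, "IQR:", iqr, "std dev:", std_dev)
print("five number summary:", five_number_summary)
```

The median and IQR depend only on the middle of the sorted data, so they are less affected by extreme values than the mean and standard deviation.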
summarizing categorical data
summarized using counts and proportions, visualized with bar charts
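A small sketch of counting categories and turning the counts into proportions. The response values are hypothetical, and the text bars only stand in for a real bar chart.

```python
from collections import Counter

# Hypothetical categorical responses, made up for illustration only.
responses = ["good", "excellent", "good", "fair",
             "good", "excellent", "poor", "good"]

# Counts per category
counts = Counter(responses)

# Proportions: each count divided by the total number of observations
total = sum(counts.values())
proportions = {category: count / total for category, count in counts.items()}

# Crude text bar chart: one '#' per observation, most common category first
for category, count in counts.most_common():
    print(f"{category:<10} {'#' * count} ({proportions[category]:.3f})")
```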
unimodal
one peak
multimodal
multiple peaks
right skewed
long tail toward larger values; mean is typically > median
left skewed
long tail toward smaller values; median is typically > mean
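A quick check of the mean versus median relationship on two made-up samples. The numbers are hypothetical and chosen so the tails are obvious; the relationship is a common pattern, not a guarantee for every skewed data set.

```python
import statistics

# Hypothetical right-skewed sample: most values small, a few very large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 20, 38]

# Mirror the values to get a left-skewed sample (long tail of small values).
left_skewed = [40 - x for x in right_skewed]

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

# The long tail pulls the mean toward it; the median stays near the bulk.
print("right skewed: mean", right_mean, "median", right_median)
print("left skewed:  mean", left_mean, "median", left_median)
```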
standard deviation and variance
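both measure how spread out the data are around the mean; variance is the average squared deviation from the mean, and standard deviation is its square root (same units as the data); larger standard deviation = wider distribution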
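A sketch of how variance and standard deviation relate, comparing a narrow and a wide sample with the same mean. The data are made up, `sample_variance` is a hypothetical helper written for this example, and the n - 1 (sample) denominator is an assumption, since the card does not say whether the sample or population form is meant.

```python
import math
import statistics

# Two hypothetical samples with the same mean (10) but different spread.
narrow = [9, 10, 10, 10, 11]
wide = [2, 6, 10, 14, 18]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


narrow_var = sample_variance(narrow)
wide_var = sample_variance(wide)

# Standard deviation is the square root of the variance.
narrow_sd = math.sqrt(narrow_var)
wide_sd = math.sqrt(wide_var)

print("narrow: variance", narrow_var, "std dev", narrow_sd)
print("wide:   variance", wide_var, "std dev", wide_sd)

# The standard library's sample variance should agree with the manual version.
print("library variance (narrow):", statistics.variance(narrow))
```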
BRFSS data set
Age distribution: Right-skewed (more younger participants).
Gender: Slightly more females than males.
General health: Most participants reported good to excellent health.
Weight: Some extreme values, but none that are biologically implausible.