Summarizing Data Flashcards
Definition of quantitative
Data that can be measured numerically
Definition of continuous
Can take any value within a range
Definition of discrete
Data can only take certain values
Normally integers
Definition of categorical
Individuals fall into different groups
Definition of dichotomous/binary
2 categories
Definition of ordered
More than 2 categories with a natural order
Definition of unordered/nominal
More than 2 categories with no natural order
Why summarize data
Monitor data quality
Check for invalid/missing entries
Describe characteristics of participants in a study
Before a complex analysis
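A minimal sketch of the data quality check described above. The ages, the use of None for a missing entry, and the 0 to 120 valid range are all made-up assumptions for illustration.

```python
# Hypothetical raw ages; None marks a missing entry (an assumption for this sketch).
raw_ages = [34, 51, None, 29, -4, 47, 230, None, 62]


def check_quality(values, low=0, high=120):
    """Return the number of missing entries, the invalid values, and the valid values."""
    missing = sum(1 for v in values if v is None)
    present = [v for v in values if v is not None]
    invalid = [v for v in present if not (low <= v <= high)]
    valid = [v for v in present if low <= v <= high]
    return missing, invalid, valid


missing, invalid, valid = check_quality(raw_ages)
print(f"missing: {missing}, invalid: {invalid}, valid n: {len(valid)}")
```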
What are the 2 types of quantitative data
Continuous
Discrete
What are the 3 types of categorical data
Dichotomous
Ordered
Unordered/nominal
Quantitative data can be reclassified into categories for ease of reporting
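Reclassifying quantitative data into categories might look like the sketch below. The cut-offs (40 and 65) and the group labels are hypothetical, chosen only to show the idea.

```python
def age_group(age):
    """Map an age in years to an ordered category (cut-offs are illustrative)."""
    if age < 40:
        return "under 40"
    if age < 65:
        return "40-64"
    return "65 and over"


ages = [23, 41, 67, 35, 58, 72, 39]  # made-up ages
groups = [age_group(a) for a in ages]
print(groups)
```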
What are the 2 ways of summarizing the center of continuous data
Mean
Median
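A small sketch comparing the two measures of center, using Python's standard statistics module. The values are made up; the single large value shows how the mean is pulled toward an extreme observation while the median is not.

```python
import statistics

values = [2, 3, 3, 4, 5, 6, 40]  # made-up data with one extreme value

mean_value = statistics.mean(values)      # sum of values / number of values
median_value = statistics.median(values)  # middle value of the sorted data

print(f"mean = {mean_value}, median = {median_value}")
```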
What are the 4 ways of summarizing the spread of continuous data
Range
SD
Variance (SD^2)
IQR (used if data skewed)
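The four measures of spread can be sketched with the standard statistics module. The data are made up, and the quartiles come from statistics.quantiles with its default method; other software may use a different quartile convention and give a slightly different IQR.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

data_range = max(data) - min(data)
variance = statistics.variance(data)  # sample variance (divides by n - 1)
sd = statistics.stdev(data)           # sample SD, the square root of the variance

# Quartiles with the module's default ("exclusive") method.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(f"range = {data_range}, variance = {variance:.2f}, SD = {sd:.2f}, IQR = {iqr}")
```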
What is the formula for SD
SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}
where \bar{x} is the sample mean and n is the number of observations
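The SD formula can be followed step by step in code. This is a sketch with made-up data; sample_sd is a hypothetical helper name, and the result is compared against statistics.stdev, which also divides by n - 1.

```python
import math
import statistics


def sample_sd(xs):
    """Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(xs)
    mean = sum(xs) / n
    sum_of_squares = sum((x - mean) ** 2 for x in xs)
    return math.sqrt(sum_of_squares / (n - 1))


xs = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data
manual_sd = sample_sd(xs)
library_sd = statistics.stdev(xs)
print(f"manual = {manual_sd:.4f}, library = {library_sd:.4f}")
```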
How would you summarize categorical nominal data
Frequencies in each category
Proportion or %
Avoid excessive decimal places (DP)
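A sketch of summarizing nominal data as frequencies and percentages, using collections.Counter. The blood group values are made up, and percentages are rounded to 1 decimal place to avoid excess DP.

```python
from collections import Counter

blood_groups = ["A", "O", "B", "O", "A", "O", "AB", "O"]  # made-up nominal data

frequencies = Counter(blood_groups)
total = sum(frequencies.values())

# Percentage in each category, rounded to 1 decimal place.
percentages = {group: round(100 * count / total, 1) for group, count in frequencies.items()}

for group, count in frequencies.most_common():
    print(f"{group}: n = {count} ({percentages[group]}%)")
```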
How would you summarize categorical ordinal data
Frequencies in each category
Proportion or %
Cumulative proportion/%
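For ordinal data the categories have an order, so a running total makes sense. The sketch below assumes a made-up severity scale and made-up counts, and uses itertools.accumulate for the cumulative frequencies.

```python
from itertools import accumulate

# Categories listed in their natural order (hypothetical severity scale and counts).
categories = ["mild", "moderate", "severe"]
counts = [5, 3, 2]

total = sum(counts)
percentages = [100 * c / total for c in counts]
cumulative_pct = [100 * c / total for c in accumulate(counts)]

for name, n, pct, cum in zip(categories, counts, percentages, cumulative_pct):
    print(f"{name}: n = {n}, {pct:.0f}%, cumulative {cum:.0f}%")
```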
What are the 2 graphical ways of displaying continuous data
Histograms
Box plots
How do you interpret histograms for continuous data
Shape of distribution shows the range and the middle of the data
Area of each rectangle = proportional to the number of observations in that interval
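A sketch of the counting behind a histogram. The values, the starting point, and the interval width are made up, and bin_counts is a hypothetical helper. With equal-width intervals, as here, the height of each bar is proportional to its area.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


values = [12, 15, 21, 22, 25, 28, 31, 33, 34, 47]  # made-up continuous data
counts = bin_counts(values, start=10, width=10, n_bins=4)

# Print a simple text histogram, one row per interval.
for i, count in enumerate(counts):
    lower = 10 + i * 10
    print(f"{lower}-{lower + 10}: {'#' * count}")
```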
How do you interpret box plots
Median = horizontal line in box
UQ = top edge of box
LQ = lower edge of box
Max = top of whisker
Min = bottom of whisker
Dots (.) = outliers
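The numbers behind a box plot can be sketched as below. The data are made up. Flagging points more than 1.5 x IQR beyond the quartiles as outliers is a common convention assumed here, not something stated in these cards; under that convention the whiskers end at the most extreme values that are not outliers.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9, 30]  # made-up data with one extreme value

lq, median, uq = statistics.quantiles(data, n=4)  # default quartile method
iqr = uq - lq

# Assumed convention: points beyond 1.5 * IQR from the box are drawn as outliers.
lower_fence = lq - 1.5 * iqr
upper_fence = uq + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]
inside = [x for x in data if lower_fence <= x <= upper_fence]

whisker_bottom, whisker_top = min(inside), max(inside)
print(f"LQ = {lq}, median = {median}, UQ = {uq}")
print(f"whiskers = {whisker_bottom} to {whisker_top}, outliers = {outliers}")
```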
Describe the 3 shapes of distribution
Symmetric
+ve skew
-ve skew (less common)
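One rough way to guess the shape is to compare the mean with the median: a long right tail tends to pull the mean above the median (+ve skew), and a long left tail pulls it below (-ve skew). This is only a rule of thumb, skew_direction is a hypothetical helper, and the data are made up. Equal mean and median do not guarantee a symmetric distribution, so a histogram is still worth checking.

```python
import math
import statistics


def skew_direction(values):
    """Crude heuristic: compare the mean with the median to guess the skew direction."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if math.isclose(mean, median):
        return "symmetric"
    return "+ve skew" if mean > median else "-ve skew"


right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]
left_tailed = [1, 17, 18, 18, 19, 19, 19, 20]
balanced = [1, 2, 3, 4, 5]

for name, sample in [("right", right_tailed), ("left", left_tailed), ("balanced", balanced)]:
    print(name, skew_direction(sample))
```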
Describe the percentages associated with SDs
About 68% of data lie within ± 1 SD of the mean
About 95% of data lie within ± 2 SD of the mean
(Applies when data are approximately normally distributed)
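The percentages can be checked with a quick simulation. This sketch assumes normally distributed data, generated here with random.gauss and a fixed seed; the sample size and the helper name proportion_within are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable
sample = [random.gauss(0, 1) for _ in range(100_000)]  # simulated normal data

mean = statistics.fmean(sample)
sd = statistics.stdev(sample)


def proportion_within(values, center, distance):
    """Proportion of values lying within `distance` of `center`."""
    return sum(1 for v in values if abs(v - center) <= distance) / len(values)


within_1sd = proportion_within(sample, mean, 1 * sd)
within_2sd = proportion_within(sample, mean, 2 * sd)
print(f"within 1 SD: {within_1sd:.1%}, within 2 SD: {within_2sd:.1%}")
```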
Why are bar charts used
Quicker to understand than a table
Shows frequency or % in each category
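A plain-text sketch of a bar chart, where the length of each bar equals the frequency in that category. The counts are made up and the layout is just one possible choice.

```python
counts = {"A": 2, "B": 1, "O": 4}  # made-up frequencies per category

# One line per category: label, a bar of '#' characters, then the frequency.
bar_lines = [f"{label:<3}{'#' * n} {n}" for label, n in counts.items()]

for line in bar_lines:
    print(line)
```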
Why are pie charts used and why are they not ideal
Size of slice = proportional to frequency
Popular, but bar charts and tables are preferred
Hard to compare the sizes of slices