Describing Data Flashcards
What does descriptive statistics do?
Helps to organise and summarise data in an easily communicable manner.
What are measures of central tendency?
Mean
Median
Mode
Is the mean or median more affected by extreme values?
Mean
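A minimal sketch of the card above, using Python's standard statistics module. The scores are made-up illustrative values, not data from the source.

```python
from statistics import mean, median

scores = [2, 3, 4, 5, 6]          # illustrative values
with_outlier = scores + [100]     # same scores plus one extreme value

mean_before, median_before = mean(scores), median(scores)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The mean moves a long way; the median barely shifts.
print("mean:", mean_before, "to", mean_after)
print("median:", median_before, "to", median_after)
```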
What makes the mean more accurate?
A larger number of observations (larger sample size)
What is the unit of mean the same as?
The unit of original measure
What is a geometric mean?
The mean obtained when individual observations are log-transformed, averaged and then back-transformed using the antilog
Advantage of geometric mean?
Will be closer to median if log-transformed data had symmetrical distribution
Difference between mean and geometric mean?
For positive values the geometric mean is never greater than the arithmetic mean; it is lower unless all values are equal
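The log, average, antilog steps can be sketched as follows. The values are illustrative and must be positive, since the logarithm is undefined otherwise; base 10 is an arbitrary choice here, and any base gives the same result.

```python
import math
from statistics import mean

values = [1, 10, 100]                      # illustrative positive values

logs = [math.log10(v) for v in values]     # 1. log-transform each observation
mean_log = mean(logs)                      # 2. average the logs
geometric = 10 ** mean_log                 # 3. back-transform with the antilog

arithmetic = mean(values)
print(geometric, arithmetic)
```

With these values the geometric mean comes out well below the arithmetic mean, because the single large value pulls the arithmetic mean upwards.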
What is weighted mean?
Individual values are multiplied by weights (constants) attached to them before averaging
When is weighted mean used?
When some individual observations are more or less valuable than others
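A short sketch of a weighted mean, assuming the usual convention of dividing the weighted sum by the sum of the weights. The values and weights are hypothetical.

```python
values = [70, 80, 90]    # illustrative observations
weights = [1, 1, 2]      # hypothetical weights: the last observation counts double

weighted_sum = sum(v * w for v, w in zip(values, weights))
weighted_mean = weighted_sum / sum(weights)
unweighted_mean = sum(values) / len(values)

print(weighted_mean, unweighted_mean)
```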
Another name for the median?
50th percentile
What data is median preferable for?
Ordinal data when treated as values (not as counts)
What does 5th percentile mean?
The value below which 5% of observations lie
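One way to check this definition, sketched with statistics.quantiles from the standard library. The data are illustrative, and the exact cut point depends on the interpolation method (the default "exclusive" method is assumed here).

```python
from statistics import quantiles

data = list(range(1, 101))   # illustrative values 1..100

# n=20 splits the data into 20 equal groups; the first cut point is the 5th percentile.
p5 = quantiles(data, n=20)[0]

# Share of observations lying below that cut point.
share_below = sum(1 for x in data if x < p5) / len(data)
print(p5, share_below)
```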
What type of data is mode mostly used for?
Nominal
When can mode be useful for ordinal data?
To understand most common rating obtained
In which type of distribution are the mean, mode and median equal?
Normal, symmetric distribution
Where will median lie in skewed distribution?
Between mean and mode
What happens to mean in positive skew?
Mean will be higher than median
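A small illustration of the ordering in a positively skewed sample. The data are made up, and the mode, median, mean ordering is a rule of thumb that does not hold for every skewed data set.

```python
from statistics import mean, median, mode

data = [1, 1, 1, 2, 3, 4, 10]   # illustrative sample with a long right tail

data_mode = mode(data)
data_median = median(data)
data_mean = mean(data)

# In this sample the median sits between the mode and the mean.
print(data_mode, data_median, round(data_mean, 2))
```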
Name some measures of variability
Range
Variance
SD
SE
What is range?
Difference between highest and lowest scores in a distribution
What is the interquartile range?
Difference between 75th and 25th percentile
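A sketch of the interquartile range using statistics.quantiles. The data are illustrative, and other quartile conventions can give slightly different cut points.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7]          # illustrative values

# n=4 returns the three quartile cut points: 25th, 50th and 75th percentiles.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print(q1, q3, iqr)
```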
Why does variance give more information than the range?
It takes every score in the distribution into account, not just the two extremes
Formula for variance
Sum of squared differences of individual observations from mean/(number of observations - 1)
What is degrees of freedom?
N-1
When is variance high?
When scores are widely scattered
How is variance expressed?
In squared units of the original measure
What is the formula for SD?
Square root of variance
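The variance and SD formulas above can be written out by hand and compared with the standard library. The scores are illustrative.

```python
import math
from statistics import mean, stdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]     # illustrative values
n = len(scores)
m = mean(scores)

# Sum of squared differences from the mean, divided by (n - 1).
sum_sq = sum((x - m) ** 2 for x in scores)
manual_variance = sum_sq / (n - 1)

# SD is the square root of the variance, back in the original units.
manual_sd = math.sqrt(manual_variance)

print(manual_variance, variance(scores))
print(manual_sd, stdev(scores))
```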
What is the most commonly used measure of dispersion?
SD
What is coefficient of variation a measure of?
Relative spread of data
How does one calculate the coefficient of variation?
SD / mean, multiplied by 100 to give a percentage
Unit of coefficient of variation?
Percentage
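A sketch showing why the coefficient of variation is a relative measure: the same heights recorded in centimetres and in metres give the same value. The heights and the helper name coefficient_of_variation are hypothetical.

```python
import math
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Return SD / mean expressed as a percentage (hypothetical helper)."""
    return stdev(values) / mean(values) * 100

heights_cm = [150, 160, 170, 180, 190]        # illustrative values
heights_m = [h / 100 for h in heights_cm]     # same data in different units

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(round(cv_cm, 2), round(cv_m, 2))
```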
Formula of SE?
SD / square root of sample size
What leads to smaller SE?
Larger sample
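A quick sketch of the SE formula, holding the SD fixed and changing only the sample size. The numbers and the helper name standard_error are hypothetical.

```python
import math

def standard_error(sd, n):
    """SE = SD / square root of sample size (hypothetical helper)."""
    return sd / math.sqrt(n)

sd = 10                               # illustrative SD
se_small = standard_error(sd, 25)     # smaller sample
se_large = standard_error(sd, 100)    # four times as many observations

# Quadrupling the sample size halves the SE.
print(se_small, se_large)
```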
What do authors use SE for?
To describe variability of the sample (strictly, SD is the measure that describes this)
What does SE give estimate of?
How the mean of the sample is related to the mean of the population
Precision and uncertainty of how study sample represents population
What does SD estimate?
Variability in study sample
What does SE tell us of the mean?
How precise our estimate of the mean is
Graphs used for categorical and discrete numerical data
Bar chart
Pie chart
Graphs for continuous data
Histogram
Dot plot
Scatter diagram
Difference between bar chart and histogram
A histogram has no gaps between its bars, because the data are continuous
How to draw a dot plot
Dot placed for each observation along one axis
When does dot plot become a scattergram?
When dot plot is extended to two axes
What measures can be plotted on a scattergram?
Two continuous measures
What happens in a stem and leaf plot?
Plot first few digits of numerical observation along vertical axis
Then add numbers to one or both sides to represent individual values of observations
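A text-only sketch of a one-sided stem and leaf plot. It assumes non-negative whole numbers, takes the tens as the stem and the units as the leaf; the function name stem_and_leaf and the values are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return one text line per stem, with leaves listed to the right."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)   # stem = leading digits, leaf = last digit
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    ]

lines = stem_and_leaf([23, 12, 38, 15, 21, 30, 23])   # illustrative values
for line in lines:
    print(line)
```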
What is a box whisker plot?
Rectangle drawn from the 25th to the 75th percentile, enclosing the 2nd and 3rd quarters of observations
Median value is the line cutting through the rectangle
What do whiskers in box whisker plot show?
Minimum and maximum values of observation
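The numbers behind a box whisker plot as described on these cards can be sketched as a five-number summary. The data are illustrative, the quartile method is the standard library default, and the whiskers here follow the cards (minimum and maximum); some plotting conventions draw them differently.

```python
from statistics import quantiles

data = [3, 5, 7, 8, 9, 11, 13, 15, 18, 21, 40]   # illustrative values

q1, med, q3 = quantiles(data, n=4)   # box edges and the line through the box
low, high = min(data), max(data)     # whisker ends

summary = {"min": low, "q1": q1, "median": med, "q3": q3, "max": high}
print(summary)
```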