Chapter 7-8 Flashcards
Compared to the mean, why is the median beneficial?
The median is not strongly influenced by outliers
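A minimal sketch of this card, using made-up numbers (the data and variable names are illustrative assumptions, not from the chapter). Adding a single extreme value pulls the mean far away while the median barely moves.

```python
from statistics import mean, median

# Hypothetical measurements, chosen only for illustration.
values = [10, 11, 12, 13, 14]
with_outlier = values + [100]  # one extreme value added

print(mean(values), median(values))              # both are 12
print(mean(with_outlier), median(with_outlier))  # mean jumps above 26, median is 12.5
```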
The mode
is the value that occurs most commonly; it may not be useful in noninteger continuous data if no values occur more than once.
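A short sketch of why the mode can fail for continuous data, assuming Python's standard `statistics.multimode` (Python 3.8+) and invented example values. With repeated integers there is a clear mode; with noninteger measurements that never repeat, every value ties.

```python
from statistics import multimode

# Hypothetical data for illustration only.
counts = [2, 3, 3, 5, 3, 2]          # integer data: 3 occurs most often
lengths = [2.31, 3.07, 3.52, 4.18]   # continuous data: no value repeats

print(multimode(counts))   # [3]
print(multimode(lengths))  # all four values tie, so the mode is uninformative
```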
Experimental error:
• Imprecision
• Mistakes
• Natural variation
• Any kind of variation not explained by treatments
Recall that bias is caused by
any factor that consistently alters the results.
Biased data are not…
accurate
Precise means
reproducible
The 50th percentile is the
median
The value of the 25th percentile subtracted from the value of the 75th percentile is the
interquartile range
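The last two cards can be checked with a small sketch, assuming Python's standard `statistics.quantiles` with the "inclusive" method and made-up data. Other quartile conventions exist and can give slightly different values, so treat the method choice as an assumption.

```python
from statistics import median, quantiles

# Hypothetical data for illustration only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 returns the three quartile cut points: 25th, 50th, 75th percentiles.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # 75th percentile minus 25th percentile

print(q1, q2, q3)          # 3.0 5.0 7.0
print(q2 == median(data))  # True: the 50th percentile is the median
print(iqr)                 # 4.0
```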
Scatter plot: shows
every value
• More information, as long as there are not too many values
• Shows exactly how the data are distributed
Box-and-whiskers plots
give a sense of the distribution of data without showing every value.
• Good if there are too many values for a scatter plot.
‘Boxes’ usually represent
the 25th to 75th percentiles (i.e. the interquartile range) but can have different meanings.
Whiskers often represent
the 5th and 95th percentiles, but look carefully at figure descriptions.
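As a rough sketch, the numbers behind one common box-and-whiskers convention (box = 25th to 75th percentiles, whiskers = 5th and 95th) can be computed as below. The data are invented, and the "inclusive" percentile method is an assumption; a given figure may use different definitions.

```python
from statistics import quantiles

# Hypothetical data: the integers 0 through 100.
data = list(range(101))

# n=20 returns 19 cut points at 5%, 10%, ..., 95%.
cuts = quantiles(data, n=20, method="inclusive")

p5, p25, p50, p75, p95 = cuts[0], cuts[4], cuts[9], cuts[14], cuts[18]

print("whiskers:", p5, p95)  # 5.0 95.0
print("box:", p25, p75)      # 25.0 75.0
print("median line:", p50)   # 50.0
```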
The cumulative frequency distribution, compared to a histogram,
does not require bins; note that the lines artificially connect the measured values.
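A minimal sketch of building a cumulative frequency distribution without bins. The function name `cumulative_frequency` and the data are hypothetical. Each measured value gets one point; any line drawn between the points is only a visual aid.

```python
def cumulative_frequency(values):
    """Return (value, fraction of data <= this position) pairs in sorted order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]

# Hypothetical data for illustration only.
data = [7, 3, 9, 3, 5]
points = cumulative_frequency(data)

for value, fraction in points:
    print(value, fraction)  # no bin width was chosen anywhere
```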
Any adjustment of the data is a great way to introduce bias.
• Eliminating “impossible” values
• Accounting for biased measurement opportunities
• Normalizing or standardizing data
• Smoothing to make trends more visible (e.g. rolling average)
• OK for graphs but never for statistical calculations
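One way to see why smoothed values may be fine for a graph but not for statistics: a rolling average hides part of the real variation. The sketch below uses invented numbers and a hypothetical helper `rolling_mean`; it compares the spread of the raw and smoothed series.

```python
from statistics import pstdev

def rolling_mean(values, window):
    """Average each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Hypothetical noisy measurements for illustration only.
raw = [2, 8, 3, 9, 4, 10, 5]
smooth = rolling_mean(raw, 3)

print(smooth)
# The smoothed series has a much smaller standard deviation than the raw data,
# so statistics computed from it would understate the variation.
print(pstdev(raw), pstdev(smooth))
```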
Interval Variables
• The difference of one unit means the same thing at all possible values.
• e.g. degrees Celsius
• However, the meaning of zero could be arbitrary.
• e.g. degrees Celsius: 100°C is not twice as hot as 50°C.
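A small sketch of the Celsius example. The Kelvin scale is not mentioned on the card; it is brought in here as an assumption, because its zero is not arbitrary. Differences agree on both scales, but the ratio 100/50 = 2 is an artifact of where Celsius puts zero.

```python
def celsius_to_kelvin(celsius):
    """Convert to Kelvin, a scale with a non-arbitrary zero."""
    return celsius + 273.15

# Differences mean the same thing on both scales.
diff_celsius = 100 - 50
diff_kelvin = celsius_to_kelvin(100) - celsius_to_kelvin(50)

# Ratios do not: the Celsius ratio depends on the arbitrary zero.
ratio_celsius = 100 / 50
ratio_kelvin = celsius_to_kelvin(100) / celsius_to_kelvin(50)

print(diff_celsius, diff_kelvin)    # both 50 degrees apart (up to float rounding)
print(ratio_celsius, ratio_kelvin)  # 2.0 versus roughly 1.15
```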