Research Flashcards
Name two measures of central tendency.
Mean and median.
Write the equation for the mean.
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$, where $x_1, \dots, x_n$ are the $n$ observations.
Write the equation for the median.
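For the sorted values $x_{(1)} \le \dots \le x_{(n)}$: the median is $x_{((n+1)/2)}$ if $n$ is odd, and $\frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right)$ if $n$ is even.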
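As a quick illustration of the mean and median, here is a minimal Python sketch using the standard library's `statistics` module. The sample values are made up for illustration; the single large value pulls the mean upward but leaves the median unchanged.

```python
import statistics

# Hypothetical sample: one large value (100) skews the data.
values = [1, 2, 3, 4, 100]

# Mean: sum of the values divided by how many there are.
mean_value = sum(values) / len(values)

# Median: middle value of the sorted data (the average of the two
# middle values when the count is even).
median_value = statistics.median(values)

print(mean_value)    # 22.0
print(median_value)  # 3
```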
Explain the components of a box plot.
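The box spans the first quartile (Q1) to the third quartile (Q3), so its length is the interquartile range (IQR). A line inside the box marks the median. Whiskers extend from the box toward the smallest and largest values, commonly stopping at the most extreme points within 1.5 × IQR of the box. Points beyond the whiskers are drawn individually as outliers.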
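The numbers a box plot draws can be computed directly. The sketch below is one possible version, not the only one: it assumes the common 1.5 × IQR rule for whiskers and the "inclusive" quartile method from Python's `statistics` module, and other tools may use different conventions. The data is hypothetical.

```python
import statistics

# Hypothetical sample with one unusually large value.
data = sorted([2, 4, 4, 5, 6, 7, 8, 9, 30])

# Quartiles; method="inclusive" is one of several common conventions.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # length of the box

# Fences under the (assumed) 1.5 x IQR rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme points inside the fences;
# anything outside is plotted individually as an outlier.
inside = [v for v in data if lower_fence <= v <= upper_fence]
outliers = [v for v in data if v < lower_fence or v > upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print(q1, median, q3)             # 4.0 6.0 8.0
print(whisker_low, whisker_high)  # 2 9
print(outliers)                   # [30]
```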
Write the equation for variance.
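$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$, where $\bar{x}$ is the mean of the $n$ observations. (The sample variance divides by $n - 1$ instead of $n$.)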
What is variance?
It is the average squared distance from the mean.
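The "average squared distance from the mean" reading translates almost word for word into code. This is a minimal sketch of the population form (dividing by the number of values); the function name and the sample data are illustrative.

```python
def population_variance(values):
    """Average squared distance from the mean (divides by n)."""
    mean_value = sum(values) / len(values)
    squared_distances = [(v - mean_value) ** 2 for v in values]
    return sum(squared_distances) / len(values)


# Hypothetical sample with mean 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(values))  # 4.0
```

The standard library's `statistics.pvariance` computes the same quantity, while `statistics.variance` divides by n - 1 instead.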
Write the equation for standard deviation.
$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$, the square root of the variance.
What is standard deviation?
It tells us how spread out the numbers are: whether they are tightly clustered around the mean or lie far from it.
Why is standard deviation a more desirable measure of spread than variance?
Standard deviation is expressed in the same units as the data rather than in squared units, so it is usually easier to interpret. It can be read directly as a typical distance of the values from the mean.
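The units argument is easy to see with a small example. The sketch below uses hypothetical heights in centimetres: the variance comes out in square centimetres, while the standard deviation is back in centimetres.

```python
import math
import statistics

# Hypothetical heights in centimetres.
heights_cm = [150, 160, 170, 180, 190]

# Population variance: units are centimetres squared.
variance_cm2 = statistics.pvariance(heights_cm)

# Standard deviation: square root of the variance, units are centimetres.
std_dev_cm = math.sqrt(variance_cm2)

print(variance_cm2)         # 200
print(f"{std_dev_cm:.2f}")  # 14.14
```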
In what category does covariance fall?
Measures of Association
What is covariance used for?
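Describes how one sample varies with respect to another: positive when the two tend to increase together, negative when one tends to decrease as the other increases.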
Write the equation for covariance.
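$\operatorname{cov}(x, y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$, where $\bar{x}$ and $\bar{y}$ are the means of the two samples. (The sample covariance divides by $n - 1$ instead of $n$.)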
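A minimal sketch of covariance in Python, assuming the population form (dividing by the number of pairs). The function name and the paired samples are hypothetical.

```python
def population_covariance(xs, ys):
    """Average product of paired deviations from each sample's mean."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical paired samples.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # increases with hours
falling = [10, 8, 6, 4, 2]  # decreases with hours

print(population_covariance(hours, rising))   # 4.0
print(population_covariance(hours, falling))  # -4.0
```

The sign shows the direction of the relationship; the magnitude depends on the units of both samples.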
What type of plot would be useful to visually estimate the amount of linear correlation of a dataset?
A scatterplot.
What is correlation used for?
Describes the strength and direction of the linear relationship between two samples on a unit-free scale from -1 to 1.
Write the equation for correlation.
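$r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$, where $\sigma_x$ and $\sigma_y$ are the standard deviations of the two samples.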
Why is correlation typically better than covariance?
The units of covariance make its value difficult to interpret, and its magnitude changes whenever the data is rescaled. Correlation is unit-free and always lies between -1 and 1, so it is generally easier to interpret and compare.
How do you interpret the results of correlation?
Values close to 1 indicate a strong positive correlation, while values close to -1 indicate a strong negative correlation. Values close to 0 indicate little to no linear correlation.
Qualitatively describe the correlation equation.
The covariance of the two datasets divided by the product of their standard deviations.
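The description above can be sketched in a few lines of Python. This is an illustrative implementation of the Pearson correlation with hypothetical data; it also shows that changing the units of one sample changes the covariance but not the correlation.

```python
import math


def covariance(xs, ys):
    """Population covariance of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations.

    Undefined (division by zero) if either sample is constant.
    """
    std_x = math.sqrt(covariance(xs, xs))  # cov(x, x) is the variance of x
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)


# Hypothetical paired measurements.
length_m = [1, 2, 3, 4, 5]
weight = [2, 4, 5, 4, 5]
length_cm = [100 * v for v in length_m]  # same lengths, different unit

print(covariance(length_m, weight), covariance(length_cm, weight))
print(correlation(length_m, weight), correlation(length_cm, weight))
```

The two covariances differ by a factor of 100, while both correlations come out at roughly 0.77.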
What can a histogram be used for?
To understand how the data is distributed, for example whether it is normal or skewed.
Describe a histogram. How would you know if the data set followed a normal distribution?
A histogram is a frequency diagram that plots the number of times a value (or range of values) occurs. If the data is normally distributed, frequencies are highest near the mean and decrease with distance from the mean in either direction, following the classic “bell curve” pattern.
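A histogram is just a table of counts, so a rough text version takes only a few lines. The sketch below uses hypothetical values and one bin per distinct value; real data would usually be grouped into ranges first.

```python
from collections import Counter

# Hypothetical sample, roughly symmetric around 3.
values = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]

# Count how many times each value occurs.
counts = Counter(values)

# Print one bar per value, with one '#' per occurrence.
for value in sorted(counts):
    print(f"{value}: {'#' * counts[value]}")
```

The bars peak at the centre and shrink on both sides, which is the shape to look for when checking for a bell curve.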