Descriptive Statistics (B) Flashcards
what is the difference between bar graphs and histograms?
- bar graphs are good when data is in categories
- histograms deal with continuous data
what is continuous data?
- can be measured
- has an infinite scale
- e.g. temperature
- opposite of discrete data
what is a histogram?
displays the frequency density of X-values occurring in each class interval of a frequency table
as a series of rectangular bars
what is/how do you work out frequency density?
- measures how many observations there are per unit of X
- frequency density = frequency / class width
- the width of a bar is proportional to the width of the class interval, the height is proportional to the frequency density, and the area is proportional to the frequency
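As a rough illustration of the frequency density calculation, here is a minimal Python sketch. The class intervals and frequencies are made-up values, and `frequency_density` is a hypothetical helper name used only for this example.

```python
# Hypothetical class intervals (lower bound, upper bound) and their frequencies.
intervals = [(0, 10), (10, 20), (20, 40), (40, 80)]
frequencies = [5, 12, 16, 8]


def frequency_density(interval, frequency):
    """Observations per unit of X: frequency divided by class width."""
    lower, upper = interval
    return frequency / (upper - lower)


densities = [frequency_density(i, f) for i, f in zip(intervals, frequencies)]

for (lower, upper), f, d in zip(intervals, frequencies, densities):
    # Bar area = class width * density, which recovers the frequency.
    area = (upper - lower) * d
    print(f"{lower}-{upper}: frequency={f}, density={d}, area={area}")
```

Note that the widest class (40-80) has the lowest bar even though it does not have the lowest frequency, which is why histograms with unequal class widths plot density rather than raw frequency.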
what do you normally use standard deviation for?
- in comparative terms, e.g. to say one data set is more variable than another; on its own it is hard to tell whether a standard deviation is high or low
- to make claims about the proportion of data values we expect to find within a certain number of standard deviations of the mean
what is the empirical rule?
also known as the 68-95-99.7 rule
about 68% of the data fall within one standard deviation of the mean
about 95% fall within two standard deviations
about 99.7% fall within three standard deviations
applies to a bell-shaped frequency distribution
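The three percentages can be checked against a normal distribution. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module (Python 3.8+). The helper name `share_within` is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.4f}")
```

The printed proportions come out at roughly 0.68, 0.95 and 0.997, matching the rule. They only hold when the data are approximately normal.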
when can you use the empirical rule?
only when the data are (approximately) normally distributed
what is the coefficient of variation?
data with larger means normally have larger standard deviations, which makes the standard deviations hard to compare directly
it is the ratio of the standard deviation to the mean
measure of relative variability
it is independent of units of measurement
the higher the coefficient, the greater the level of dispersion around the mean
when is the coefficient of variation useful?
- when you want to compare results from two surveys that use different measures, e.g. tests with different scoring systems
- same variable over time
- international comparisons
how do you work out the coefficient of variation?
standard deviation / mean
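A minimal sketch of this calculation in Python, assuming the sample standard deviation (n - 1 denominator) is wanted. The two sets of test scores are hypothetical, chosen so that one is simply the other rescaled from a 20-point to a 100-point system.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unit-free)."""
    return stdev(values) / mean(values)


# Hypothetical scores from two tests with different scoring systems.
test_out_of_20 = [12, 14, 15, 16, 18]
test_out_of_100 = [60, 70, 75, 80, 90]

cv_a = coefficient_of_variation(test_out_of_20)
cv_b = coefficient_of_variation(test_out_of_100)

print(f"SD (out of 20):  {stdev(test_out_of_20):.2f}, CV: {cv_a:.3f}")
print(f"SD (out of 100): {stdev(test_out_of_100):.2f}, CV: {cv_b:.3f}")
```

The standard deviations differ by a factor of five, but the coefficients of variation agree, which is what makes the measure suitable for comparing results on different scales.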
what is skewness?
measures the degree of asymmetry of a distribution
degree of distortion from the symmetrical bell curve
a histogram is a useful graphical display for spotting skewness, as it plots the frequency of values against a numerical scale
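which way does positive skew go?
the tail extends to the right (the bulk of the data sits on the left)
which way does negative skew go?
the tail extends to the left (the bulk of the data sits on the right)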
what is the relative position of mean and median when distribution is symmetrical?
mean = median
what is the relative position of mean and median when data is positively skewed?
mean is greater than the median
what is the relative position of mean and median when data is negatively skewed?
mean is less than the median
what are the ways in which you measure skewness?
1) Pearson's coefficient of skewness (PCS)
2) Bowley's coefficient of skewness (BCS)
what is PCS?
Pearson's coefficient of skewness
used to work out whether the skewness is positive or negative
how do you work out PCS?
= 3(mean-median) /SD
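The formula can be sketched in Python as follows, assuming SD means the sample standard deviation. The `incomes` data are hypothetical, with one large value added to pull the mean above the median.

```python
from statistics import mean, median, stdev


def pearson_skewness(values):
    """Pearson's coefficient of skewness: 3 * (mean - median) / SD."""
    return 3 * (mean(values) - median(values)) / stdev(values)


# Hypothetical data with one large value in the right-hand tail.
incomes = [20, 22, 23, 25, 26, 28, 30, 90]
pcs = pearson_skewness(incomes)

print(f"mean={mean(incomes)}, median={median(incomes)}, PCS={pcs:.2f}")
```

Here the mean (33) is greater than the median (25.5), so the coefficient is positive, consistent with a positively skewed distribution.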
what are the results of PCS?
ranges from -3 to +3
0 = symmetrical (e.g. normal)
values outside -1 to +1 can be regarded as highly skewed
when can you not use PCS?
when you don't know every value in the formula (mean, median, SD), e.g. you only know the quartiles
when you have extreme outliers
what is BCS?
more robust than PCS
use when you only know the quartiles or have outliers
focuses on middle 50% of data
what is the equation for BCS?
[(Q3 - median) - (median - Q1)] / (Q3 - Q1)
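A minimal Python sketch of the BCS equation. It uses `statistics.quantiles` (Python 3.8+) with its default method to get the quartiles. Quartile conventions differ between textbooks and software, so the exact value may vary slightly with another method. The data are hypothetical.

```python
from statistics import quantiles


def bowley_skewness(values):
    """Bowley's coefficient: [(Q3 - median) - (median - Q1)] / (Q3 - Q1)."""
    q1, q2, q3 = quantiles(values, n=4)  # q2 is the median
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)


# Hypothetical data: the upper quartile is further from the median than the lower.
skewed = [1, 2, 3, 4, 5, 10, 20]
# Same data with the largest value replaced by an extreme outlier.
with_outlier = [1, 2, 3, 4, 5, 10, 2000]

bcs_skewed = bowley_skewness(skewed)
bcs_outlier = bowley_skewness(with_outlier)

print(f"BCS: {bcs_skewed}, BCS with outlier: {bcs_outlier}")
```

Both data sets give the same coefficient because only the quartiles enter the formula, which illustrates why BCS is less affected by extreme outliers than PCS.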
what are the results for BCS?
if the upper quartile is further from the median than the lower quartile, the skew is positive
> 0 = positive skew
< 0 = negative skew
what is a scatter plot?
displays the relationship between two continuous variables
why are scatter plots useful?
useful in the early stage of analysis when exploring data and determining whether a linear regression analysis is appropriate
may show outliers
what are the measures of association?
- scatter plot
- correlation coefficient
what do measures of association show?
the relationship between two variables
they form the next logical step beyond descriptive data analysis
what are the measures of correlation?
- Pearson's correlation coefficient
- Spearman's rank correlation coefficient
what does Pearson's correlation show?
a measure of how closely the points on a scatter plot lie to a straight line (range -1 to +1)
the association between two variables
if the two variables move in the same direction = positive correlation
if one moves up and the other moves down = negative correlation
useful for comparative purposes
unit free
what is the calculation for Pearson's correlation coefficient?
covariance(X, Y) / (SD(X) x SD(Y))
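A minimal Python sketch of this ratio, assuming sample statistics (n - 1 denominators) throughout. The helper names `covariance` and `pearson_r` and the data are hypothetical, chosen so the two extreme cases are easy to see.

```python
from statistics import mean, stdev


def covariance(xs, ys):
    """Sample covariance of two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    products = ((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)


def pearson_r(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / (stdev(xs) * stdev(ys))


x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # moves in the same direction as x
falling = [10, 8, 6, 4, 2]  # moves in the opposite direction to x

print(f"r (rising):  {pearson_r(x, rising):.2f}")
print(f"r (falling): {pearson_r(x, falling):.2f}")
```

Points lying exactly on a rising straight line give a coefficient at the +1 end of the range, and points on a falling line give one at the -1 end. Dividing by the standard deviations is what makes the result unit free.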
what is covariance?
shows how two variables X and Y change with each other (move together)
how do you work out covariance?
Σ [(X - mean of X)(Y - mean of Y)] / (n - 1)
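The calculation can be followed step by step in a short Python sketch. The five (X, Y) pairs are made-up values picked so the arithmetic is easy to check by hand.

```python
from statistics import mean

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

mean_x = mean(xs)  # 3
mean_y = mean(ys)  # 4

# Step 1: multiply each X deviation by the matching Y deviation.
products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]

# Step 2: add the products up and divide by n - 1.
cov = sum(products) / (len(xs) - 1)

print("products of deviations:", products)
print("covariance:", cov)
```

The products are 4, 0, 0, 0 and 2, which sum to 6. Dividing by n - 1 = 4 gives a covariance of 1.5. The positive sign indicates that X and Y tend to move together.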
how can you interpret Pearson's correlation?
theoretical / textbook perspective
(-) 0.7 = strong
(-) 0.5 = moderate
(-) = weak
Cohen's effect sizes
0.1 = small
0.3 = medium
>0.5 = large