Chapter 3 Flashcards
It is the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample
Statistics
It is used in almost every kind of analysis to derive insights, such as projections and how different sets of data affect each other
Statistics
It is a central or typical value for a probability distribution
Measures of central tendency
It may also be called a center or location of the distribution
Measures of central tendency
Measures of central tendency are often called
Averages
Measures of central tendency
mean
median
mode
It can be calculated for either a finite set of values or for a theoretical distribution such as the normal distribution
Central tendency
The tendency of quantitative data to cluster around some central value
Central tendency
It is the sum of all measurements divided by the number of observations
mean
An average of the data
Mean
It is the midpoint of our data that separates the upper and lower half of the data set
Median
These are the only measures of central tendency that can be used for ordinal data, in which values are ranked relative to each other but are not measured absolutely
Median and mode
The most frequent value in the data set
Mode
The only measure of central tendency that can be used with nominal data, which have purely qualitative category assignments
mode
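Worked example: mean, median, and mode. This is a minimal sketch using Python's standard `statistics` module. The list `scores` is a hypothetical sample made up for illustration; it does not come from the chapter.

```python
import statistics

# Hypothetical sample: seven quiz scores (illustrative values only).
scores = [2, 3, 3, 5, 7, 8, 14]

mean_score = statistics.mean(scores)      # sum of all values / number of values
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
```

Here the single large value (14) pulls the mean above the median, while the mode only reflects which value repeats most often.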
Also called the variability, scatter, spread
Measures of dispersion
It is the extent to which a distribution is stretched or squeezed
Dispersion
Measures of dispersion
range
mean absolute deviation (MAD)
variance
standard deviation
The difference between the smallest and largest data point in the set
range
It is the average of the absolute deviations from a central point
Mean absolute deviation
It is a summary statistic of statistical dispersion or variability
Mean absolute deviation
In simpler terms, it tells how far, on average, the data points are from the mean
Mean absolute deviation
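Worked example: range and mean absolute deviation. This is a small sketch under two assumptions: the list `data` is a hypothetical sample, and the "central point" is taken to be the mean (the median is another common choice).

```python
import statistics

# Hypothetical sample (illustrative values only).
data = [2, 4, 4, 6, 9]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Mean absolute deviation, assuming the central point is the mean.
center = statistics.mean(data)
mad = sum(abs(x - center) for x in data) / len(data)

print("range:", data_range)
print("mean absolute deviation:", mad)
```

The absolute deviations from the mean of 5 are 3, 1, 1, 1, and 4, so their average is 2.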
This is another way of measuring the spread between numbers in a data set
Variance
It measures how far each number in the set is from the mean
Variance
It is simply the square root of the variance
Standard deviation
It is the most commonly used measure to express dispersion
Standard deviation
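Worked example: variance and standard deviation. The cards do not say whether the population or the sample formula is meant, so this sketch shows both. The list `data` is a hypothetical sample chosen so the population results come out as round numbers.

```python
import math
import statistics

# Hypothetical sample (illustrative values only); its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: average squared distance from the mean (divide by n).
pop_variance = statistics.pvariance(data)
# Population standard deviation: square root of the population variance.
pop_std = statistics.pstdev(data)

# Sample variance divides by n - 1 instead of n.
sample_variance = statistics.variance(data)
sample_std = statistics.stdev(data)

print("population variance:", pop_variance, "std:", pop_std)
print("sample variance:", sample_variance, "std:", sample_std)

# In both cases the standard deviation is the square root of the variance.
assert math.isclose(pop_std, math.sqrt(pop_variance))
assert math.isclose(sample_std, math.sqrt(sample_variance))
```

Because the standard deviation is in the same units as the data, it is usually easier to interpret than the variance, which is in squared units.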
Are cut points dividing the range of a probability distribution into continuous intervals with equal probability or dividing the observations in a sample in the same way
Quantiles
Quantiles
quartiles
deciles
percentiles
Divide or cut the data into four parts
Quartiles
Are often used as a measure of spread of the data in what is called interquartile range (IQR)
Quartiles
The difference between the third quartile and the first quartile
Interquartile range
Is the median of the lower half of the sorted data set and marks the point at which 25% of the data values are lower and 75% are higher
First quartile
Is the median of the upper half of the sorted data set and marks the point at which 25% of the data values are higher and 75% are lower
Third quartile
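Worked example: quartiles and the interquartile range. This sketch follows the median-of-halves description on the cards. The helper name `quartiles_by_halves` and the sample values are hypothetical, and leaving the middle value out of both halves when the count is odd is an assumed convention (textbooks and software differ on this point).

```python
import statistics

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median of each half of the sorted data.

    Assumption: for an odd number of values, the middle value is excluded
    from both halves. Other conventions give slightly different quartiles.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]    # values below the median position
    upper_half = ordered[-half:]   # values above the median position
    return (
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
    )

# Hypothetical sample (illustrative values only).
data = [7, 1, 13, 5, 15, 3, 9, 11]
q1, q2, q3 = quartiles_by_halves(data)
iqr = q3 - q1  # interquartile range: third quartile minus first quartile

print("Q1:", q1, "median:", q2, "Q3:", q3, "IQR:", iqr)
```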
Cut the data into 10 equal parts
deciles
Divide the data into 100 equal parts (1% segments)
percentiles
are used for larger data sets
deciles and percentiles
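Worked example: deciles and percentiles. This sketch uses `statistics.quantiles` from the Python standard library, which returns the cut points (one fewer than the number of parts). The data set is hypothetical, and the function's default interpolation method is only one of several conventions for computing quantiles.

```python
import statistics

# Hypothetical larger data set: the integers 1 through 100.
data = list(range(1, 101))

# n=10 gives the 9 cut points that split the data into 10 parts.
deciles = statistics.quantiles(data, n=10)

# n=100 gives the 99 cut points that split the data into 100 parts.
percentiles = statistics.quantiles(data, n=100)

print("number of decile cut points:", len(deciles))
print("number of percentile cut points:", len(percentiles))
print("5th decile (same position as the median):", deciles[4])
```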
It is a way of standardizing scores on the same scale by dividing a score's deviation from the mean by the standard deviation of the data set
Standard score (z-score)
It measures the number of standard deviations a given data point is from the mean
Standard score
What it means if it is a negative z score
value is less than the mean
What it means if it is a positive z score
value is greater than the mean
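Worked example: z-scores. This is a minimal sketch of z = (value - mean) / standard deviation. The helper name `z_score` and the sample are hypothetical, and the population standard deviation is an assumption, since the cards do not say which version to use.

```python
import statistics

def z_score(value, data):
    """Number of standard deviations that `value` lies from the mean of `data`.

    Assumption: uses the population standard deviation (pstdev).
    """
    mean = statistics.mean(data)
    std = statistics.pstdev(data)
    return (value - mean) / std

# Hypothetical sample (illustrative values only); mean 5, population std 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

z_high = z_score(9, data)  # above the mean, so the z-score is positive
z_low = z_score(2, data)   # below the mean, so the z-score is negative

print("z-score of 9:", z_high)
print("z-score of 2:", z_low)
```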
It represents the ratio of the standard deviation to the mean and it is a useful statistic for comparing the degree of variation from one data series to another even if the means are drastically different from one another
Coefficient of variation
It allows investors to determine how much volatility or risk is assumed in comparison to the amount of return expected from investments
Coefficient of variation
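Worked example: coefficient of variation. This sketch compares two series with very different means. The helper name `coefficient_of_variation` and both series are hypothetical, and the population standard deviation is an assumed choice.

```python
import statistics

def coefficient_of_variation(data):
    """Ratio of the standard deviation to the mean.

    Assumption: uses the population standard deviation; the mean must be nonzero.
    """
    mean = statistics.mean(data)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(data) / mean

# Hypothetical series (illustrative values only).
series_a = [2, 4, 4, 4, 5, 5, 7, 9]          # mean 5, std 2
series_b = [48, 50, 50, 50, 50, 50, 50, 52]  # mean 50, std 1

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

print("CV of series A:", cv_a)
print("CV of series B:", cv_b)
```

Series A varies far more relative to its mean than series B does, which is the kind of comparison the coefficient of variation is meant to support.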
Refers to distortion or asymmetry in a symmetrical bell curve, or normal distribution, in a set of data
Skewness
What it means if it is a positive skew
The mean is greater than the median
What it means if it is a negative skew
The mean is less than the median
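Worked example: skew direction. This sketch compares the mean and the median, and also computes a moment-based skewness coefficient, for two hypothetical samples. The mean-versus-median comparison is a common rule of thumb rather than a guarantee for every distribution, and the helper name `moment_skewness` is made up for this illustration.

```python
import statistics

def moment_skewness(data):
    """Average cubed deviation from the mean, divided by the cubed population std."""
    mean = statistics.mean(data)
    std = statistics.pstdev(data)
    return sum((x - mean) ** 3 for x in data) / len(data) / std ** 3

# Hypothetical samples (illustrative values only).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]       # long tail toward large values
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]  # long tail toward small values

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

print("right-skewed: mean", right_mean, "median", right_median,
      "skewness", moment_skewness(right_skewed))
print("left-skewed: mean", left_mean, "median", left_median,
      "skewness", moment_skewness(left_skewed))
```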
It is a statistical measure that expresses the extent to which two variables are linearly related
Correlation
It is a common tool for describing simple relationships without making a statement about cause and effect
Correlation
Correlation coefficient of positive one indicates a
perfect positive correlation or direct relationship
Correlation coefficient of negative 1 indicates a
Perfect negative correlation or inverse relationship
Correlation coefficient near zero indicates
Little or no linear correlation
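Worked example: correlation coefficient. The cards do not name a specific coefficient, so this sketch assumes the Pearson correlation coefficient, computed by hand. The helper name `pearson_r` and all data values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data (illustrative values only).
x = [1, 2, 3, 4, 5]
y_direct = [2, 4, 6, 8, 10]   # rises in step with x
y_inverse = [10, 8, 6, 4, 2]  # falls in step with x
y_unrelated = [2, 1, 3, 1, 2] # no linear trend with x

r_direct = pearson_r(x, y_direct)
r_inverse = pearson_r(x, y_inverse)
r_unrelated = pearson_r(x, y_unrelated)

print("direct:", r_direct)
print("inverse:", r_inverse)
print("unrelated:", r_unrelated)
```

A coefficient near zero only rules out a linear relationship; two variables can still be related in a nonlinear way.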