Descriptive Statistics Flashcards
Statistics
both the science of uncertainty and the technology of extracting information from data
- involves collecting, organizing, analyzing, interpreting, and presenting data
- a statistic (singular) is a summary measure of data
Descriptive Statistics
methods of describing and summarizing data using tabular, visual, and quantitative techniques
Discrete Metrics
derived from counting something
- a delivery is on time or not
- takes only distinct, countable values
Continuous Metrics
based on a continuous scale of measurement
- involving dollars, length, time, volume, or weight
- ex: 1.1 hours of time
Frequency Distribution
a table that shows the number of observations in each of several non-overlapping groups
What is a graphical depiction of a frequency distribution?
histogram
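A minimal sketch of building a frequency distribution, assuming made-up delivery times and one-hour groups (the data, the variable names, and the bin width are illustrative and not from the original cards). Each observation lands in exactly one non-overlapping group, and the row of '#' marks is a rough text stand-in for a histogram.

```python
from collections import Counter

# Hypothetical delivery times in hours (illustrative data only).
delivery_hours = [1.1, 2.4, 0.7, 3.9, 2.2, 1.8, 3.1, 0.4, 2.9, 1.5]

bin_width = 1.0  # assumed group width: [0, 1), [1, 2), [2, 3), ...

# Map each observation to the index of the single group that contains it.
counts = Counter(int(x // bin_width) for x in delivery_hours)

# Frequency table keyed by (lower bound, upper bound) of each group.
frequency_table = {
    (k * bin_width, (k + 1) * bin_width): counts[k]
    for k in sorted(counts)
}

for (low, high), count in frequency_table.items():
    # One '#' per observation gives a crude text histogram.
    print(f"[{low:.0f}, {high:.0f}) hours: {count} {'#' * count}")
```

The counts across all groups add up to the number of observations, which is a quick check that the groups do not overlap and that nothing was dropped.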
Frequency Distributions for Categorical Data
categorical variables naturally define the groups in a frequency distribution (displayed as a bar or column chart; histograms are for numerical data)
Population
ALL items of interest
- all subscribers to Netflix
Sample
A subset of the population
- a list of people who watched a true crime show on Netflix in the past year
- used because it is cheaper, quicker, and more readily available than the full population
Averages
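x̄ (sample mean): the sum of the observations divided by n; the population mean is written μ
N=
number of items in the population
n=
number of items (observations) in the sample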
Median
middle value when data are arranged from least to greatest
Mode
observation that occurs most frequently
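As a rough illustration of the three measures above, the following sketch computes the mean, median, and mode of a small made-up sample with Python's standard statistics module. The data and variable names are assumptions chosen for the example.

```python
import statistics

# Hypothetical sample of n = 7 observations (illustrative data only).
sample = [2, 3, 5, 7, 8, 8, 9]

n = len(sample)                      # sample number of observations
x_bar = sum(sample) / n              # sample mean: sum divided by n
median = statistics.median(sample)   # middle value of the sorted data
mode = statistics.mode(sample)       # most frequently occurring value

print(f"n = {n}, mean = {x_bar}, median = {median}, mode = {mode}")
```

With an even number of observations, statistics.median returns the average of the two middle values.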
Measures of Dispersion
degree of variation in the data
- numerical spread of the data
Range
the difference between the maximum and minimum value in the data set
- affected by outliers
Variance
a statistical measurement of the spread between numbers in a data set
- average of the squared deviations from the mean
Standard Deviation
square root of the variance
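The dispersion measures above can be sketched with the standard statistics module. The data set is made up for illustration. Note one assumption made explicit here: "average of the squared deviations" matches the population variance (divide by N), while the sample variance divides by n - 1 instead.

```python
import statistics

# Hypothetical data set (illustrative only); its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: maximum minus minimum, so a single outlier can change it a lot.
data_range = max(data) - min(data)

# Population variance: average of squared deviations from the mean (divide by N).
pop_variance = statistics.pvariance(data)

# Population standard deviation: square root of the population variance.
pop_std_dev = statistics.pstdev(data)

# Sample variance: same squared deviations, divided by n - 1.
sample_variance = statistics.variance(data)

print(data_range, pop_variance, pop_std_dev, sample_variance)
```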
Empirical rule: 68%
For bell-shaped (approximately normal) data, approx. 68% of observations fall within one standard deviation of the mean: μ - σ to μ + σ
Empirical rule: 95%
Approx. 95% fall within two standard deviations of the mean: μ - 2σ to μ + 2σ
Empirical rule: 99.7%
Approx. 99.7% fall within three standard deviations of the mean: μ - 3σ to μ + 3σ
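The empirical rule percentages can be checked against an exact normal distribution. This is a small sketch that assumes the data follow a normal curve (the rule is only approximate for other shapes) and uses statistics.NormalDist from the Python standard library (3.8 or later). The name share_within is illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
# The shares below are the same for any normal mean and standard deviation.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of observations within k standard deviations of the mean,
# computed as P(-k <= Z <= k) = cdf(k) - cdf(-k).
share_within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in share_within.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The three shares come out close to 68%, 95%, and 99.7%, which is where the rule's names come from.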
Measures of Association
Two variables have a strong statistical relationship with one another if they move together
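When two variables appear to be related, you might suspect=
a cause-and-effect relationship (a statistical relationship alone does not prove that one variable causes the other)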
Covariance
a measure of the DIRECTION of the linear relationship between two variables, X and Y
The covariance between X and Y is the average of the product of the deviations
of each pair of observations from their respective means
Correlation
a measure of the STRENGTH and DIRECTION of the linear relationship between two variables, X and Y
- ranges from -1 to +1
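A minimal sketch of both measures, computed by hand on made-up paired data. It assumes the sample versions of the formulas (dividing by n - 1); the population versions divide by N instead, and the correlation is the same either way. All names and data are illustrative.

```python
import math

# Hypothetical paired observations (illustrative data only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sum of the products of each pair's deviations from the respective means.
sum_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Sample covariance: its sign gives the DIRECTION of the linear relationship.
covariance = sum_products / (n - 1)

# Sample standard deviations of X and Y.
std_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
std_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))

# Correlation: covariance rescaled by both standard deviations, so it is
# unit-free and always lies between -1 and +1 (STRENGTH and DIRECTION).
correlation = covariance / (std_x * std_y)

print(f"covariance = {covariance:.2f}, correlation = {correlation:.3f}")
```

Because covariance depends on the units of X and Y, its magnitude is hard to interpret on its own; dividing by the two standard deviations is what makes correlation comparable across data sets.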
Statistical Thinking
philosophy of learning based on:
- all work occurs in a system of interconnected processes
- variation exists in all processes