Benchmark info Flashcards
describe statistics
the science of developing methods for collecting, analyzing, interpreting, and presenting data
what is the main goal of statistics
to use information from a sample to make inferences about the population
what is a variable?
a measurable attribute that can vary across entities and is represented by a column
what is an observation?
the complete set of recorded values for a single entity, typically stored as one row of a dataset
what is correlation?
it measures the strength and direction of the association between two variables
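A minimal sketch of one common way to quantify that association, assuming Pearson's r is the measure meant (the card does not name one). The function name `pearson_r` and the sample data are illustrative only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and test scores.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]
r = pearson_r(hours, scores)
```

The result ranges from -1 (perfect negative linear association) to +1 (perfect positive linear association); a value near 0 indicates little linear association.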
when should you use mean and how is it affected by outliers?
- use with symmetric distributions without extreme values
- mean is highly sensitive to outliers
when should you use median and how is it affected by outliers?
- use for skewed distributions or data with outliers
- median is less affected by outliers
when should you use mode and how is it affected by outliers?
- use when identifying the most common category or score
- mode is not influenced by outliers
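The three cards above can be checked with a small sketch using Python's standard `statistics` module. The data values are made up for illustration; one extreme value is appended to show how each measure reacts.

```python
import statistics

# Hypothetical sample, then the same sample with one extreme value added.
values = [2, 3, 3, 4, 5]
with_outlier = values + [100]

mean_before = statistics.mean(values)
mean_after = statistics.mean(with_outlier)      # pulled strongly toward 100

median_before = statistics.median(values)
median_after = statistics.median(with_outlier)  # shifts only slightly

mode_before = statistics.mode(values)
mode_after = statistics.mode(with_outlier)      # unchanged
```

Here the mean moves from 3.4 to 19.5, the median from 3 to 3.5, and the mode stays at 3.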
what is variance in technical terms?
- the average of the squared differences from the mean
- the square of the standard deviation
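Both statements can be verified with a short sketch. It assumes the population form of variance (dividing by n), which matches "the average of the squared differences"; sample variance divides by n - 1 instead. The data values are illustrative.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

# Average of the squared differences from the mean (population variance).
mean = sum(data) / len(data)
variance_by_hand = sum((x - mean) ** 2 for x in data) / len(data)

# The same quantities from the standard library.
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)
```

For this data the variance is 4 and the standard deviation is 2, so the variance equals the standard deviation squared.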
what statistical methods can be used to find outliers?
the z-score method and the IQR (interquartile range) method
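A rough sketch of both approaches. The cutoffs are assumptions based on common conventions, not part of the card: a z-score threshold (often 2 to 3) and fences at 1.5 times the IQR beyond the quartiles. The function names and data are hypothetical.

```python
import statistics

def zscore_outliers(data, threshold=3.0):
    """Flag values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # population standard deviation (an assumption)
    return [x for x in data if abs(x - mean) / sd > threshold]

def iqr_outliers(data, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

data = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
z_flagged = zscore_outliers(data, threshold=2.5)
iqr_flagged = iqr_outliers(data)
```

Both calls flag only 102. With 10 points and the population standard deviation, no value can exceed a z-score of 3, which is why the example passes a lower cutoff.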
describe histograms
- displays the distribution of continuous data
- has touching bars representing frequency across intervals
what does skewness measure?
measures the asymmetry of a distribution
what does kurtosis measure?
the “tailedness” of a distribution, often loosely described as the sharpness of its peak
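Skewness and kurtosis can both be sketched from central moments. This assumes the simple population (biased) moment formulas; statistical packages often apply bias corrections, so their results may differ slightly. The helper names are illustrative.

```python
def central_moment(data, k):
    """k-th central moment: the average of (x - mean) ** k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    """Third central moment divided by the standard deviation cubed."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Fourth central moment over variance squared, minus 3 so a normal distribution scores 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

skew_symmetric = skewness(symmetric)      # 0 for a symmetric sample
skew_right = skewness(right_skewed)       # positive: long tail to the right
kurt_symmetric = excess_kurtosis(symmetric)
```

A positive skewness indicates a longer right tail and a negative one a longer left tail; a negative excess kurtosis indicates lighter tails than a normal distribution.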
describe a scatterplot
displays the relationship between two numerical variables
what are “measures”?
- quantitative way of representing or summarizing aspects of data
- tools used to describe, analyze, and make sense of data
what is internal consistency?
the extent to which different items that are trying to measure the same variable are truly related and report similar results
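One widely used statistic for internal consistency is Cronbach's alpha. The card does not name it, so treat this as an illustrative sketch under that assumption; the function name and the survey data are hypothetical.

```python
import statistics

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])               # number of items
    items = list(zip(*responses))       # one tuple of scores per item
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(statistics.variance(item) for item in items)
    # Sample variance of each respondent's total score.
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical survey: 3 respondents answering 3 items that agree perfectly.
responses = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
```

Values closer to 1 suggest the items vary together and measure the same underlying variable; in this perfectly consistent example alpha is 1.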