Summer Vocabulary Flashcards
Individuals
Objects described by a set of data. Individuals may be people, animals, or things.
Variable
Any characteristic of an individual. A variable can take different values for different individuals.
Categorical Variable
Places an individual into one of several groups or categories.
Quantitative Variable
Takes numerical values for which it makes sense to find an average.
Discrete Variables
Variable that takes only separate, countable values, so it cannot take on every value between its minimum value and its maximum value.
Continuous
Variable that can take on any value between its minimum value and its maximum value.
Univariate Data
Data gathered from a study that looks at only one variable.
Bivariate Data
Data gathered from a study that looks at two variables.
Population
The total set of observations that can be made.
Sample
A set of observations drawn from a population.
Census
A study that obtains data from every member of a population.
Distribution
Tells us what values the variable takes and how often it takes these values.
Inference
Drawing conclusions about a population based on a sample of the data.
Frequency Table
A table that shows frequency counts for a categorical variable.
Relative Frequency
The number of times an event occurs for a subgroup, divided by the number of times it occurs for the total population; usually written as a fraction, decimal, or percent.
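As a rough illustration of the frequency and relative frequency cards, the sketch below counts each category and divides by the total. The eye-color data and the variable names are made up for the example.

```python
from collections import Counter

# Hypothetical data: eye color recorded for 8 individuals.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "blue"]

# Frequency table: how many individuals fall in each category.
counts = Counter(eye_colors)

# Relative frequency: each count divided by the total number of observations.
total = sum(counts.values())
relative_freq = {category: count / total for category, count in counts.items()}

for category, freq in relative_freq.items():
    print(f"{category}: {counts[category]} of {total} = {freq:.3f}")
```

The relative frequencies always add up to 1 (or 100%), apart from roundoff error.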
Table
An arrangement of data in rows and columns for the use of data analysis.
Roundoff Error
The difference between an approximation of a number used in computation and its exact value.
Pie Chart
Displays the distribution of a categorical variable as percents, or slices, of a pie.
Bar Graph
Displays the distribution of a categorical variable with one bar per category; the height of each bar shows the count or percent in that category.
Two-Way Table
Examines relationships between categorical variables.
Marginal Distribution
The distribution of values of one of the categorical variables in a two-way table among all individuals described by the table.
Conditional
Distribution of relative frequencies of one variable in a two-way table among only the individuals who have a specific value of the other variable.
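One way to see the difference between a marginal and a conditional distribution is to compute both from the same two-way table. The table below (gender by preferred pet) is invented for the example, and the variable names are only illustrative.

```python
# Hypothetical two-way table: rows are gender, columns are preferred pet.
table = {
    "female": {"dog": 30, "cat": 20},
    "male": {"dog": 40, "cat": 10},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of pet preference: use the column totals,
# ignoring gender, and divide by the grand total.
pet_totals = {}
for row in table.values():
    for pet, count in row.items():
        pet_totals[pet] = pet_totals.get(pet, 0) + count
marginal_pet = {pet: count / grand_total for pet, count in pet_totals.items()}

# Conditional distribution of pet preference among females only:
# divide each count in the "female" row by that row's total.
female_total = sum(table["female"].values())
pet_given_female = {pet: count / female_total for pet, count in table["female"].items()}

print("marginal:", marginal_pet)
print("given female:", pet_given_female)
```

In this made-up table, 70% of all individuals prefer dogs, but only 60% of females do, so the two distributions answer different questions.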
Distribution
A listing showing all the possible values of the data and how often they occur.
Segmented Bar Graph
A type of bar graph where the segments of each bar are stacked and total 100% of the group that the bar represents.
Side-by-side Bar
A type of bar graph where bars are placed next to one another in groups to compare the distribution of a categorical variable across two or more groups.
Graph
Diagram showing relation between variables.
Association
Knowing the value of one variable helps predict the value of the other.
Simpson’s Paradox
An effect that occurs when the marginal association between two categorical variables is qualitatively different from the partial association between the same two variables.
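A small numerical sketch can make Simpson's paradox more concrete. The success counts below are entirely made up, and the labels "A", "B", "group 1", and "group 2" are placeholders: treatment A has the higher success rate inside each group, yet treatment B has the higher rate once the groups are combined.

```python
# Hypothetical (made-up) results: (successes, trials) for each treatment in each group.
results = {
    "group 1": {"A": (9, 10), "B": (80, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, trials):
    """Success rate as a proportion."""
    return successes / trials

# Success rate of each treatment within each group (the partial association).
within_group = {
    group: {treatment: rate(*counts) for treatment, counts in arms.items()}
    for group, arms in results.items()
}

# Success rate of each treatment with the groups combined (the marginal association).
overall = {}
for treatment in ("A", "B"):
    successes = sum(arms[treatment][0] for arms in results.values())
    trials = sum(arms[treatment][1] for arms in results.values())
    overall[treatment] = rate(successes, trials)

print(within_group)
print(overall)
```

The reversal happens here because most of treatment A's trials fall in the group with low success rates, while most of treatment B's trials fall in the group with high success rates.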
Dotplot
A graphic display used to compare frequency counts within categories or groups, made up of dots plotted on a graph.
Shape
Describes the distribution of data in terms of symmetry, peaks, and skews.
Mode
The most commonly observed value in a set of data.
Center
The middle of a distribution, often measured by the mean or median of a data set.
Spread
The extent to which a distribution is stretched out, often measured by the range, interquartile range, or standard deviation.
Range
The difference between the lowest and highest values.
Outlier
A data point that differs significantly from other observations.
Symmetric
A type of distribution where the left side of the distribution mirrors the right side.
Skewed Right
An asymmetric distribution in which the data has a longer tail on the right side.
Skewed Left
An asymmetric distribution in which the data has a longer tail on the left side.
Unimodal
Distribution with one clear peak or most frequent value.
Bimodal
Distribution with two clear peaks or most frequent values.
Multimodal
Distribution with more than two clear peaks or most frequent values.
Stemplot
A way to plot data where each value is split into a stem (all digits except the last) and a leaf (the last digit).
Splitting Stems
Dividing each stem in a stem and leaf plot into two rows, one for leaves 0 to 4 and one for leaves 5 to 9, to spread out the display.
Back-to-back Stem
Used for numerical data where two sets of data use the same set of stems.
Plots
A diagram that displays data for data analysis.
Histogram
A graphical display of data using bars of different heights, grouping numbers into ranges.
Mean
The average value of a set of data, found by adding all the values and dividing by the number of values.
Median
A measure of central tendency, represented by the middle value in an ordered set of data (or the average of the two middle values when the number of values is even).
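The mean and median cards can be checked with Python's standard `statistics` module. The data set below is invented, with one unusually large value included to show how the two measures of center can differ.

```python
import statistics

# Hypothetical data set with one unusually large value (40).
data = [2, 3, 3, 5, 7, 40]

# Mean: add all the values and divide by how many there are.
mean_value = statistics.mean(data)

# Median: the middle value of the sorted data; with an even number of
# values it is the average of the two middle values (here 3 and 5).
median_value = statistics.median(data)

print("mean:", mean_value)
print("median:", median_value)
```

In this example the single large value pulls the mean up to 10, while the median stays at 4.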
Interquartile Range
A measure of variability, found by dividing a data set into quartiles and subtracting the lower quartile from the upper quartile.
Five-Number Summary
A summary consisting of the minimum and maximum values in the data set, the lower and upper quartiles, and the median.
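The sketch below computes a five-number summary and the interquartile range for a made-up data set. It assumes one common textbook convention, where the quartiles are the medians of the lower and upper halves of the sorted data (leaving out the overall median when the number of values is odd); software packages often use other conventions and can give slightly different quartiles. The function name `five_number_summary` is just a label chosen for this example.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values.

    Assumes the 'median of each half' convention for the quartiles.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical data set.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]

minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1  # interquartile range: upper quartile minus lower quartile

print(minimum, q1, med, q3, maximum)
print("IQR:", iqr)
```

These five numbers are the same values a boxplot draws: the ends of the whiskers, the edges of the box, and the line inside the box.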
Boxplot
A type of graph used to display patterns of quantitative data that splits the data set into quartiles.
Standard Deviation
A numerical value used to indicate how widely individuals in a group vary, measured by the typical distance of the values from the mean; it is the square root of the variance.
Variance
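A numerical value used to indicate how widely individuals in a group vary, measured by the average squared distance of the values from the mean; it is the square of the standard deviation.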
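To show how variance and standard deviation are related, the sketch below works through the arithmetic by hand on a made-up data set. It assumes the sample versions, which divide by n - 1; dividing by n instead gives the population versions.

```python
import math
import statistics

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n

# Squared distance of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]

# Sample variance: sum of squared deviations divided by n - 1 (an assumption;
# the population variance would divide by n).
variance = sum(squared_deviations) / (n - 1)

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print("variance:", round(variance, 3))
print("standard deviation:", round(std_dev, 3))

# The standard library's sample formulas should agree with the hand calculation.
print(math.isclose(variance, statistics.variance(data)))
print(math.isclose(std_dev, statistics.stdev(data)))
```

Because the standard deviation is in the same units as the data, it is usually the easier of the two to interpret.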