Chapter 1: Exploring data Flashcards by Jasper Harrison

Individuals

Individuals are the objects described by a set of data

Variable

an attribute that describes a person, place, thing, or idea

Categorical Variable

categorical variables take on values that are names or labels

Quantitative Variable

quantitative variables take numerical values for which arithmetic, such as averaging, makes sense

Continuous

a continuous distribution is one in which the data can take on any value within a specified range

Univariate Data

data from a study that looks at only one variable

Bivariate Data

data from a study that examines the relationship between two variables

Population

population refers to the total set of observations that can be made

Sample

a sample refers to a set of observations drawn from a population

Census

a study that obtains data from every member of a population

Distribution

The distribution of a statistical data set (or a population) is a listing or function showing all the possible values (or intervals) of the data and how often they occur

Inference

inference is the process of using data analysis to deduce properties of an underlying probability distribution, such as drawing conclusions about a population from a sample

Frequency Table

when a table shows frequency counts for a categorical variable, it is called a frequency table

Relative Frequency

Relative frequency = Subgroup count / Total count
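
The formula above can be sketched in a few lines of Python. This is only an illustration: the survey responses and the names responses, counts, and relative_frequency are made up for the example.

```python
from collections import Counter

# Hypothetical survey responses (made-up data for illustration only)
responses = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]

# Frequency table: each category mapped to its count
counts = Counter(responses)

# Total count across all categories
total = sum(counts.values())

# Relative frequency = subgroup count / total count
relative_frequency = {category: count / total for category, count in counts.items()}

for category, count in counts.items():
    print(f"{category}: count={count}, relative frequency={relative_frequency[category]:.3f}")
```

The relative frequencies of all categories always add up to 1 (or 100 percent), which is a quick check that no category was missed.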

Table

tables showing the values of cumulative distribution functions, probability functions, or probability density functions

Roundoff Error

the difference between an approximation of a number used in computation and its exact (correct) value
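
One place roundoff error shows up in this chapter is when percentages are rounded before being added. The sketch below uses made-up counts (three hypothetical categories with one observation each) to show rounded percentages summing to 99 instead of 100.

```python
# Hypothetical counts for three categories (made-up data)
counts = {"A": 1, "B": 1, "C": 1}
total = sum(counts.values())

# Exact percentages: each is 33.333...
exact_percents = {k: 100 * v / total for k, v in counts.items()}

# Approximations used in a table: rounded to whole percents
rounded_percents = {k: round(p) for k, p in exact_percents.items()}

# Roundoff error for each category: approximation minus exact value
errors = {k: rounded_percents[k] - exact_percents[k] for k in counts}

# The rounded percentages no longer add up to 100
rounded_sum = sum(rounded_percents.values())
roundoff_error = 100 - rounded_sum

print("rounded percents:", rounded_percents)
print("sum of rounded percents:", rounded_sum)
print("shortfall from 100:", roundoff_error)
```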

Pie Chart

a circular statistical graphic, which is divided into slices to illustrate numerical proportion

Bar Graph

a chart that plots data using rectangular bars or columns

Two-way Table

a statistical table that shows the observed number or frequency for two variables

Marginal Distribution

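in a two-way table, the marginal distribution of one variable is its distribution among all individuals: the row or column totals expressed as percentages of the grand total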

Conditional

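in a two-way table, a conditional distribution is the distribution of one variable among only the individuals with a given value of the other variable: the percentages out of a single row or column total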
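
The difference between marginal and conditional distributions is easiest to see on a small two-way table. The sketch below uses a made-up table (hypothetical class years and snack preferences); the variable names are illustrative only.

```python
# Hypothetical two-way table of counts (made-up data):
# rows = class year, columns = preferred snack
table = {
    "junior": {"chips": 30, "fruit": 20},
    "senior": {"chips": 10, "fruit": 40},
}
columns = ["chips", "fruit"]

# Grand total of all individuals in the table
grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of class year: row totals out of the grand total
marginal_rows = {r: sum(cols.values()) / grand_total for r, cols in table.items()}

# Marginal distribution of snack: column totals out of the grand total
marginal_cols = {c: sum(table[r][c] for r in table) / grand_total for c in columns}

# Conditional distribution of class year given snack == "chips":
# counts in the "chips" column out of that column's total only
chips_total = sum(table[r]["chips"] for r in table)
conditional_given_chips = {r: table[r]["chips"] / chips_total for r in table}

print("marginal (class year):", marginal_rows)
print("marginal (snack):", marginal_cols)
print("class year among chips fans:", conditional_given_chips)
```

In this made-up table, juniors are 50 percent of everyone (marginal) but 75 percent of the students who prefer chips (conditional).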

Segmented Bar Graph

a bar graph with one bar per category, in which each bar is split into colored segments showing how a second categorical variable is distributed within that category

Side-by-side Bar Graph

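a bar graph in which the bars for the different groups are placed next to each other within each category, so the groups can be compared directly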

Association

any relationship between two measured quantities that renders them statistically dependent

Simpson's Paradox

an association that holds within each of several groups may reverse direction when the groups are combined and the data are viewed in aggregate form
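
A small numeric sketch can make the reversal concrete. All numbers below are invented for illustration: two hypothetical methods, A and B, are each tried on "easy" and "hard" cases, and each entry is (successes, trials).

```python
# Hypothetical (successes, trials) for two methods in two groups; numbers are made up
data = {
    "A": {"easy": (9, 10), "hard": (30, 90)},
    "B": {"easy": (80, 90), "hard": (3, 10)},
}

def rate(successes, trials):
    """Success rate as a proportion."""
    return successes / trials

# Success rate of each method within each group
within = {m: {g: rate(*st) for g, st in groups.items()} for m, groups in data.items()}

# Success rate of each method with the groups combined
overall = {
    m: rate(sum(s for s, _ in groups.values()), sum(t for _, t in groups.values()))
    for m, groups in data.items()
}

# A has the higher rate in every group...
a_wins_every_group = all(within["A"][g] > within["B"][g] for g in ("easy", "hard"))
# ...but B has the higher rate in the combined data
b_wins_overall = overall["B"] > overall["A"]

print("within groups:", within)
print("overall:", overall)
print("A better in each group:", a_wins_every_group, "| B better overall:", b_wins_overall)
```

The reversal happens here because method A was mostly used on hard cases and method B mostly on easy ones, so the group sizes are very unbalanced.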

Dot Plot

a graph for displaying the distribution of numerical variables where each dot represents a value

Shape

whether a distribution is symmetric or skewed to the left or right, how many peaks it has, and whether it is uniform

Mode

the value that appears most often in a set of data

Center

mean or median of the data

Spread

how similar or varied the observed values are for a particular variable (data item)

Range

a simple measure of variation in a set of data: the maximum value minus the minimum value

Outlier

a data point that diverges greatly from the overall pattern of data is called an outlier

Symmetric

a symmetric distribution can be divided at the center so that each half is a mirror image of the other

Skewed Right

distributions with fewer observations on the right (toward higher values), forming a long right tail, are said to be skewed right

Skewed Left

distributions with fewer observations on the left (toward lower values), forming a long left tail, are said to be skewed left

Unimodal

distributions with one clear peak are called unimodal

Bimodal

distributions with two clear peaks are called bimodal

Multimodal

a probability distribution with more than one peak, or "mode"

Stemplot

a display of quantitative data in which each value is split into a stem and a leaf; the entries on the left are called stems, and the entries on the right are called leaves

Splitting Stems

stem-and-leaf plots that give each stem more than one row for the same interval (for example, one row for leaves 0-4 and another for leaves 5-9)

Back-to-back Stem

back-to-back stem plots are a graphic option for comparing data from two populations

Plots

a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables

Histogram

columns are positioned over a label that represents a continuous, quantitative variable, and the height of the column indicates the size of the group defined by the column label

Mean

the arithmetic average of the data: the sum of the values divided by the number of values

Median

the middle value of the data when sorted in order; with an even number of values, the average of the two middle values
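
As a quick illustration of the measures of center in this deck, the sketch below computes the mean, median, and mode with Python's standard statistics module. The data list is made up for the example.

```python
from statistics import mean, median, mode

# Hypothetical data set (made-up values); 12 is much larger than the rest
data = [2, 3, 3, 4, 5, 6, 12]

data_mean = mean(data)      # sum of values divided by number of values
data_median = median(data)  # middle value of the sorted data
data_mode = mode(data)      # most frequently occurring value

print("mean:", data_mean)
print("median:", data_median)
print("mode:", data_mode)
```

Here the single large value pulls the mean above the median, which is typical of right-skewed data.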

Interquartile Range

a measure of variability, based on dividing a data set into quartiles: the third quartile minus the first quartile (Q3 - Q1)

Five-number

gives information about the location (from the median), spread (from the quartiles) and range (from the sample minimum and maximum) of the observations

Summary

A summary is a brief statement or restatement of main points

Boxplot

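a type of graph used to display patterns of quantitative data: a box drawn from the first quartile to the third quartile with a line at the median, and whiskers extending toward the minimum and maximum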

Standard deviation

a numerical value used to indicate how widely individuals in a group vary from the mean; the square root of the variance

Variance

the average of the squared deviations of the values from their mean; the square of the standard deviation
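
The relationship between variance and standard deviation can be sketched step by step. The example assumes the sample convention (dividing by n - 1 rather than n) and uses a made-up data set; the results are cross-checked against Python's standard statistics module.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical sample (made-up values)
data = [2, 4, 4, 5, 7, 8]

# Step 1: the mean
x_bar = mean(data)

# Step 2: squared deviation of each value from the mean
squared_deviations = [(x - x_bar) ** 2 for x in data]

# Step 3: sample variance = sum of squared deviations / (n - 1)
sample_variance = sum(squared_deviations) / (len(data) - 1)

# Step 4: standard deviation = square root of the variance
sample_sd = sqrt(sample_variance)

print("mean:", x_bar)
print("variance:", sample_variance)
print("standard deviation:", round(sample_sd, 3))

# Cross-check against the standard library (same n - 1 convention)
print("library variance:", variance(data), "library sd:", round(stdev(data), 3))
```

Because the variance is in squared units, the standard deviation is usually the one reported alongside the mean.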

(51 cards)