Vocabulary Flashcards

1
Q

Individuals

A

The objects described by a set of data. Individuals may be people, animals, or things.

2
Q

Variable

A

Any characteristic of an individual. A variable can take different values for different individuals.

3
Q

Categorical Variable

A

Places an individual into one of several groups or categories.

4
Q

Quantitative Variable

A

Takes numerical values for which it makes sense to find an average.

5
Q

Discrete Variables

A

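A quantitative variable that cannot take on every value between its minimum value and its maximum value; it takes only separate, countable values (for example, the number of siblings). A variable that can take on any value in that range is called a continuous variable.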

6
Q

Continuous Variable

A

A variable that can take on any value between its minimum value and its maximum value (for example, height). A variable that cannot is called a discrete variable.

7
Q

Univariate Data

A

When we conduct a study that looks at only one variable, we say that we are working with univariate data.

8
Q

Bivariate Data

A

When we conduct a study that examines the relationship between two variables, we are working with bivariate data.

9
Q

Population

A

In statistics, population refers to the total set of observations that can be made.

10
Q

Sample

A

In statistics, a sample refers to a set of observations drawn from a population.

11
Q

Census

A

A census is a study that obtains data from every member of a population. In most studies, a census is not practical, because of the cost and/or time required.

12
Q

Distribution

A

The distribution of a variable tells us what values the variable takes and how often it takes these values.

13
Q

Inference

A

The process of using data analysis to infer properties of an underlying distribution of probability.

14
Q

Frequency Table

A

Displays the count of individuals in each category of a categorical variable (for example, the number of radio stations in each format category).

15
Q

Relative Frequency Table

A

Shows the percent of individuals in each category of a categorical variable (for example, the percent of radio stations in each format category).

16
Q

Round-off Error

A

When each percent in a table is rounded (for example, to the nearest tenth), the exact percents add to 100, but the rounded percents may only come close. This small discrepancy is round-off error.
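A minimal Python sketch of a frequency table, a relative frequency table, and round-off error. The fruit data are hypothetical and chosen only so that the rounded percents do not add to exactly 100.

```python
from collections import Counter

# Hypothetical data (not from the cards): favorite fruit of 7 individuals.
fruits = ["apple", "banana", "apple", "cherry", "banana", "apple", "cherry"]

# Frequency table: the count of individuals in each category.
counts = Counter(fruits)

# Relative frequency table: the percent in each category,
# rounded to the nearest tenth.
total = sum(counts.values())
percents = {category: round(100 * n / total, 1) for category, n in counts.items()}

# The exact percents add to 100, but the rounded ones only come close.
rounded_total = round(sum(percents.values()), 1)

print(dict(counts))    # apple: 3, banana: 2, cherry: 2
print(percents)        # apple: 42.9, banana: 28.6, cherry: 28.6
print(rounded_total)   # 100.1, not 100.0
```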

17
Q

Pie Chart

A

Pie charts show the distribution of a categorical variable as a “pie” whose slices are sized by the counts or percents for the categories. A pie chart must include all the categories that make up a whole.

18
Q

Bar Graph

A

Bar graphs represent each category as a bar. The bar heights show the category counts or percents. Bar graphs are easier to make than pie charts and are also easier to read.

19
Q

Two-Way Table

A

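A table of counts that describes two categorical variables at once: the rows give the categories of one variable and the columns give the categories of the other. Examining the counts or percents in the various categories for one of the variables alone uses the row or column totals.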

20
Q

Marginal Distribution

A

The marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table.

21
Q

Conditional Distribution

A

A conditional distribution of a variable describes the values of that variable among individuals who have a specific value of another variable. There is a separate conditional distribution for each value of the other variable.
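A short Python sketch of marginal and conditional distributions computed from a two-way table of counts. The table (grade by way of getting to school) is hypothetical and stored as a nested dictionary purely for illustration.

```python
# Hypothetical two-way table of counts: rows are grades,
# columns are how students get to school.
table = {
    "grade 9": {"bus": 30, "walk": 20},
    "grade 10": {"bus": 10, "walk": 40},
}
modes = ["bus", "walk"]

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of travel mode: column totals divided by the
# number of all individuals described by the table.
marginal = {m: sum(row[m] for row in table.values()) / grand_total for m in modes}

# Conditional distributions of travel mode: one per grade, each row
# divided by its own row total.
conditional = {
    grade: {m: n / sum(row.values()) for m, n in row.items()}
    for grade, row in table.items()
}

print(marginal)      # bus: 0.4, walk: 0.6
print(conditional)   # grade 9: bus 0.6, walk 0.4; grade 10: bus 0.2, walk 0.8
```

Because the two conditional distributions differ, knowing a student's grade helps predict the travel mode in this made-up table.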

22
Q

Distribution

A

A distribution in statistics is a function that shows the possible values for a variable and how often they occur.

23
Q

Segmented Bar Graph

A

A graph of the frequency distribution of a categorical data set. Each category is represented by a segment of the bar, and the length of each segment is proportional to the corresponding frequency or relative frequency.

24
Q

Side-by-side Bar Graph

A

In a side-by-side bar graph, the bars for the groups being compared are placed next to each other, which makes it easy to compare their heights.

25
Q

Association

A

We say that there is an association between two variables if knowing the value of one variable helps predict the value of the other. If knowing the value of one variable does not help you predict the value of the other, then there is no association between the variables.

26
Q

Simpson’s Paradox

A

Simpson’s paradox, which also goes by several other names, is a phenomenon in probability and statistics in which a trend appears in several groups of data but disappears or reverses when the groups are combined.
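A small Python sketch of the reversal. The counts are hypothetical, chosen so that method A has the higher success rate inside each group while method B has the higher rate once the groups are combined.

```python
# Hypothetical (successes, trials) for two methods within two groups.
results = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}


def rate(successes, trials):
    """Success rate as a proportion."""
    return successes / trials


# Within each group, method A has the higher success rate
# (0.8 vs 0.7 in group 1, 0.3 vs 0.2 in group 2).
a_wins_each_group = all(rate(*g["A"]) > rate(*g["B"]) for g in results.values())


def combined_rate(method):
    """Pool successes and trials across groups before dividing."""
    successes = sum(g[method][0] for g in results.values())
    trials = sum(g[method][1] for g in results.values())
    return successes / trials


# After combining, the comparison reverses: A is 38/110, B is 72/110.
b_wins_combined = combined_rate("B") > combined_rate("A")

print(a_wins_each_group, b_wins_combined)  # True True
```

The reversal happens here because each method was used mostly in a different group, so the group sizes act as a lurking variable.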

27
Q

Dotplot

A

Each data value is shown as a dot above its location on a number line.

28
Q

Shape

A

Measures of shape describe the distribution (or pattern) of the data within a dataset.

29
Q

Mode

A

The mode is the most frequently appearing value in a population or sample.

30
Q

Center

A

The “center” of a data set describes its location, that is, a typical value. The two most widely used measures of center are the mean (average) and the median.

31
Q

Spread

A

A measure of how far the values in a data set lie from the mean or the median.

32
Q

Range

A

In statistics, the range of a set of data is the difference between the largest and smallest values.

33
Q

Outlier

A

An individual value that falls outside the overall pattern.

34
Q

Symmetric

A

A distribution is roughly symmetric if the right and left sides of the graph are approximately mirror images of each other.

35
Q

Skewed Right

A

A distribution is skewed to the right if the right side of the graph (containing the half of the observations with larger values) is much longer than the left side.

36
Q

Skewed Left

A

A distribution is skewed to the left if the left side of the graph (containing the half of the observations with smaller values) is much longer than the right side.

37
Q

Unimodal

A

A single peak.

38
Q

Bimodal

A

Two clear peaks.

39
Q

Multimodal

A

More than two clear peaks.

40
Q

Stemplot

A

Another simple graphical display for fairly small data sets is a stemplot (also called a stem-and-leaf plot). Stemplots give us a quick picture of the shape of a distribution while including the actual numerical values in the graph.
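A rough Python sketch of how a stemplot can be built. It assumes non-negative whole numbers, uses everything but the last digit as the stem and the last digit as the leaf, and the function name `stemplot` is made up for this example.

```python
from collections import defaultdict


def stemplot(values):
    """Return stemplot rows for non-negative integers.

    The stem is the value without its last digit; the leaf is the last digit.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    # Include stems with no leaves so gaps in the data stay visible.
    stems = range(min(rows), max(rows) + 1)
    return [f"{stem} | {''.join(str(leaf) for leaf in rows[stem])}" for stem in stems]


# Hypothetical data set.
for row in stemplot([12, 15, 21, 34, 38, 30]):
    print(row)
# 1 | 25
# 2 | 1
# 3 | 048
```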

41
Q

Splitting Stems

A

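Dividing each stem of a stem-and-leaf plot into two or more rows for the same interval (for example, leaves 0 to 4 on one row and leaves 5 to 9 on the next), so that a plot with few stems shows more of the distribution's shape.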

42
Q

Back-to-back Stemplot

A

Back-to-back stemplots are a graphic option for comparing data from two groups. The center of a back-to-back stemplot consists of a column of stems, with a vertical line on each side. Leaves representing one data set extend to the right, and leaves representing the other data set extend to the left.

43
Q

Plots

A

A graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables.

44
Q

Histogram

A

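One very common graph of the distribution of a quantitative variable is a histogram. It groups the values into classes of equal width and draws a bar for each class whose height is the count or percent of values in that class.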

45
Q

Mean

A

A mean score is an average score, often denoted by $\bar{X}$. It is the sum of individual scores divided by the number of individuals. Thus, if you have a set of $N$ numbers ($X_1, X_2, X_3, \dots, X_N$), the mean of those numbers would be defined as:

$$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} = \frac{1}{N}\sum_{i=1}^{N} X_i$$
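The formula above can be checked with a few lines of Python. The scores are hypothetical; `statistics.mean` from the standard library is shown only as a cross-check of the sum-divided-by-N calculation.

```python
from statistics import mean

# Hypothetical scores for N = 4 individuals.
scores = [70, 80, 90, 100]

# Sum of the individual scores divided by the number of individuals.
manual_mean = sum(scores) / len(scores)

# The standard library gives the same result.
library_mean = mean(scores)

print(manual_mean, library_mean)  # 85.0 85
```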

46
Q

Median

A

The median is a simple measure of central tendency. To find the median, we arrange the observations in order from smallest to largest value. If there is an odd number of observations, the median is the middle value. If there is an even number of observations, the median is the average of the two middle values.
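The steps above can be sketched in Python as follows. The function name and sample values are illustrative only; the standard library's `statistics.median` does the same job.

```python
def median(values):
    """Median by the sort-and-take-the-middle rule described above."""
    if not values:
        raise ValueError("median requires at least one observation")
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd count: the middle value
        return ordered[middle]
    # Even count: the average of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([3, 1, 2]))       # 2
print(median([4, 1, 3, 2]))    # 2.5
```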

47
Q

Interquartile Range (IQR)

A

The interquartile range, also called the midspread, middle 50%, or H‑spread, is a measure of statistical dispersion, being equal to the difference between 75th and 25th percentiles, or between upper and lower quartiles, IQR = Q₃ − Q₁.

48
Q

Five-Number Summary

A

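The five-number summary is a set of descriptive statistics that provides information about a dataset. It consists of the minimum, the first quartile (Q₁), the median, the third quartile (Q₃), and the maximum.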
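A hedged Python sketch that computes the minimum, first quartile, median, third quartile, and maximum of a data set, plus the IQR. Quartile conventions differ between textbooks and software; this version assumes the quartiles are the medians of the lower and upper halves, leaving out the overall median when the count is odd. The function name and data are made up for the example.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumes at least two observations.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]             # values below the median position
    upper = ordered[half + n % 2:]     # values above the median position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


# Hypothetical data set.
data = [1, 3, 4, 6, 7, 8, 10]
minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1

print(minimum, q1, med, q3, maximum)  # 1 3 6 8 10
print(iqr)                            # 5
```

Other quartile rules, such as the ones offered by `statistics.quantiles`, can give slightly different values of Q₁ and Q₃ for the same data.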

49
Q

Boxplot

A

A type of chart often used in exploratory data analysis. Box plots visually show the distribution and skewness of numerical data by displaying the quartiles and the median, together with the smallest and largest values.

50
Q

Standard Deviation

A

The standard deviation of a random variable, sample, statistical population, data set, or probability distribution is the square root of its variance.

51
Q

Variance

A

The expectation of the squared deviation of a random variable from its mean.
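A brief Python sketch linking variance and standard deviation for a data set. It treats the hypothetical values as an entire population, so the average squared deviation divides by N; for a sample, the usual estimate divides by N − 1 instead (`statistics.variance`).

```python
from math import sqrt
from statistics import pstdev, pvariance

# Hypothetical data, treated as a whole population.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = sum(data) / len(data)                             # mean = 5.0

# Variance: the average squared deviation from the mean.
variance = sum((x - mu) ** 2 for x in data) / len(data)

# Standard deviation: the square root of the variance.
std_dev = sqrt(variance)

print(variance, std_dev)               # 4.0 2.0
print(pvariance(data), pstdev(data))   # library cross-check
```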