Chapter 1: Exploring data Flashcards

1
Q

Individuals

A

Individuals are the objects described by a set of data

2
Q

Variable

A

an attribute that describes a person, place, thing, or idea

3
Q

Categorical Variable

A

categorical variables take on values that are names or labels

4
Q

Quantitative Variable

A

quantitative variables take numerical values for which arithmetic, such as averaging, makes sense

5
Q

Continuous

A

a continuous distribution is one in which the data can take on any value within a specified range

6
Q

Univariate Data

A

a study that looks at only one variable

7
Q

Bivariate Data

A

a study that examines the relationship between two variables

8
Q

Population

A

population refers to the total set of observations that can be made

9
Q

Sample

A

a sample refers to a set of observations drawn from a population

10
Q

Census

A

a study that obtains data from every member of a population

11
Q

Distribution

A

The distribution of a statistical data set (or a population) is a listing or function showing all the possible values of the data and how often each occurs

12
Q

Inference

A

inference is the process of using data analysis to deduce properties of an underlying distribution of probability

13
Q

Frequency Table

A

when a table shows frequency counts for a categorical variable, it is called a frequency table

14
Q

Relative Frequency

A

Relative frequency = Subgroup count / Total count
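
As a minimal sketch of this formula, the snippet below builds a frequency table and converts each count to a relative frequency. The list of colors is made-up data used only for illustration.

```python
from collections import Counter

# Hypothetical data: favorite colors reported by 10 students.
colors = ["red", "blue", "blue", "green", "red",
          "blue", "red", "blue", "green", "blue"]

counts = Counter(colors)        # frequency table: category -> count
total = sum(counts.values())    # total count

# Relative frequency = subgroup count / total count
relative = {category: count / total for category, count in counts.items()}

for category in sorted(counts):
    print(f"{category:<6} count={counts[category]}  relative={relative[category]:.2f}")
```

The relative frequencies of all categories always add up to 1 (or 100%).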

15
Q

Table

A

tables showing the values of the cumulative distribution functions, probability functions, or probability

16
Q

Roundoff Error

A

the difference between an approximation of a number used in computation and its exact (correct) value

17
Q

Pie Chart

A

a circular statistical graphic, which is divided into slices to illustrate numerical proportion

18
Q

Bar Graph

A

a chart that plots data using rectangular bars or columns

19
Q

Two-way Table

A

a statistical table that shows the observed number or frequency for two variables

20
Q

Marginal Distribution

A

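in a two-way table, the marginal distribution of one variable is its distribution on its own: the row (or column) totals expressed as percentages of the grand total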

21
Q

Conditional Distribution

A

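in a two-way table, a conditional distribution is the distribution of one variable among only the individuals who share a given value of the other variable: the cells of a single row (or column) expressed as percentages of that row's (or column's) total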
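
A small sketch may help contrast marginal and conditional distributions. The two-way table below (class year by preferred subject) uses invented counts, and the variable names are illustrative only.

```python
# Hypothetical two-way table: rows are class year, columns are preferred subject.
table = {
    "junior": {"math": 20, "art": 30},
    "senior": {"math": 40, "art": 10},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of class year: each row total as a share of the grand total.
marginal_year = {year: sum(row.values()) / grand_total
                 for year, row in table.items()}

# Conditional distribution of subject among juniors:
# each cell in the junior row as a share of the junior row total.
junior_total = sum(table["junior"].values())
subject_given_junior = {subject: n / junior_total
                        for subject, n in table["junior"].items()}

print(marginal_year)
print(subject_given_junior)
```

The marginal distribution divides by the grand total, while the conditional distribution divides by the total of one row or column only.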

22
Q

Segmented Bar Graph

A

a bar graph in which each bar represents one category of a variable and is split into colored segments showing how the categories of a second variable are distributed within it

23
Q

Side-by-side Bar Graph

A

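a bar graph that places the bars for each category of one variable next to each other within every group of a second variable, so the groups can be compared directly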

24
Q

Association

A

any relationship between two measured quantities that renders them statistically dependent

25
Q

Simpson’s Paradox

A

an association that holds within each of several groups may reverse direction when the groups are combined and the data are examined in aggregate form
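
The reversal is easier to see with numbers. The sketch below uses illustrative success counts for two treatments, A and B, in two groups; the counts and names are assumptions chosen to show the effect, not data from this deck.

```python
# Illustrative (successes, trials) for two treatments in two groups.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, trials):
    """Success rate as a proportion."""
    return successes / trials

def overall(treatment):
    """Success rate after combining both groups for one treatment."""
    successes = sum(s for s, _ in data[treatment].values())
    trials = sum(t for _, t in data[treatment].values())
    return successes / trials

# Within each group, treatment A has the higher success rate.
for group in ("small", "large"):
    a = rate(*data["A"][group])
    b = rate(*data["B"][group])
    print(f"{group}: A={a:.2f}  B={b:.2f}")

# After combining the groups, treatment B has the higher success rate.
print(f"combined: A={overall('A'):.2f}  B={overall('B'):.2f}")
```

The reversal happens because the two treatments were applied to the groups in very different proportions, so the combined rates are weighted differently.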

26
Q

Dot Plot

A

a graph for displaying the distribution of numerical variables where each dot represents a value

27
Q

Shape

A

whether the distribution is symmetric or skewed to the left or right, how many peaks it has, and whether it is uniform

28
Q

Mode

A

the value that appears most often in a set of data

29
Q

Center

A

mean or median of the data

30
Q

Spread

A

how similar or varied the observed values are for a particular variable (data item)

31
Q

Range

A

a simple measure of variation: the difference between the largest and smallest values in a data set

32
Q

Outlier

A

a data point that diverges greatly from the overall pattern of data is called an outlier

33
Q

Symmetric

A

a symmetric distribution can be divided at the center so that each half is a mirror image of the other

34
Q

Skewed Right

A

distributions with a long tail of fewer observations on the right (toward higher values) are said to be skewed right

35
Q

Skewed Left

A

distributions with a long tail of fewer observations on the left (toward lower values) are said to be skewed left

36
Q

Unimodal

A

distributions with one clear peak are called unimodal

37
Q

Bimodal

A

distributions with two clear peaks are called bimodal

38
Q

Multimodal

A

a probability distribution with more than one peak, or “mode”

39
Q

Stemplot

A

a display that splits each value into a stem (the leading digits) and a leaf (the final digit); the entries on the left are called stems, and the entries on the right are called leaves
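
As a rough sketch, a stemplot for two-digit whole numbers can be built by using the tens digit as the stem and the ones digit as the leaf. The function name `stemplot` and the sample values are made up for illustration.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines for non-negative integers (tens = stem, ones = leaf)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    lines = []
    # Walk every stem from smallest to largest so empty stems still appear.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

for line in stemplot([12, 15, 21, 23, 23, 38, 41]):
    print(line)
```

Keeping empty stems in the display preserves gaps in the data, which matters when judging shape.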

40
Q

Splitting Stems

A

stem-and-leaf plots that have more than 1 space on the stem for the same interval

41
Q

Back-to-back Stemplot

A

back-to-back stem plots are a graphic option for comparing data from two populations

42
Q

Plots

A

a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables

43
Q

Histogram

A

columns are positioned over a label that represents a continuous, quantitative variable, and the height of the column indicates the size of the group defined by the column label

44
Q

Mean

A

the average of the data: the sum of all the values divided by the number of values

45
Q

Median

A

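the middle value of all the data points collected once they are sorted in order; with an even number of points, the average of the two middle values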
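
A short sketch comparing the mean and the median, using Python's standard `statistics` module. The data are hypothetical and include one unusually large value to show how differently the two measures respond.

```python
import statistics

# Hypothetical data with one unusually large value.
values = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(values)      # sum of the values / number of values
median = statistics.median(values)  # middle value of the sorted data

print("mean:", mean)
print("median:", median)
```

The single large value pulls the mean well above most of the data, while the median stays near the middle of the bulk of the values.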

46
Q

Interquartile Range

A

a measure of variability, based on dividing a data set into quartiles: the third quartile minus the first quartile (IQR = Q3 - Q1)

47
Q

Five-number Summary

A

the minimum, first quartile, median, third quartile, and maximum; it gives information about the location (from the median), spread (from the quartiles) and range (from the sample minimum and maximum) of the observations
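
The sketch below computes a five-number summary and the interquartile range. It assumes one common textbook convention: the quartiles are the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. Software packages often use other quartile rules and may give slightly different answers. The function name and data are illustrative.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above the median position
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )

values = [1, 3, 4, 6, 7, 8, 10, 12, 15]   # hypothetical data
minimum, q1, median, q3, maximum = five_number_summary(values)
iqr = q3 - q1                              # interquartile range

print(minimum, q1, median, q3, maximum)
print("IQR:", iqr)
```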

48
Q

Summary

A

A summary is a brief statement or restatement of main points

49
Q

Boxplot

A

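a type of graph used to display patterns of quantitative data by drawing the five-number summary: a box from the first quartile to the third quartile, a line at the median, and whiskers extending toward the minimum and maximum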

50
Q

Standard deviation

A

a numerical value used to indicate how widely individuals in a group vary: the typical distance of the values from their mean, equal to the square root of the variance

51
Q

Variance

A

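a numerical value used to indicate how widely individuals in a group vary: roughly the average of the squared deviations of the values from their mean, equal to the square of the standard deviation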
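
To show how variance and standard deviation relate, the sketch below computes both by hand and checks them against Python's `statistics` module. It assumes the sample versions, which divide by n - 1; the cards above do not say which divisor is intended. The data are hypothetical.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data

mean = sum(values) / len(values)

# Squared distance of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]

# Sample variance: sum of squared deviations divided by n - 1 (an assumption).
sample_variance = sum(squared_deviations) / (len(values) - 1)

# Standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)

print("variance:", round(sample_variance, 3))
print("standard deviation:", round(sample_sd, 3))

# The standard library's sample versions should agree with the manual results.
print(math.isclose(sample_variance, statistics.variance(values)))
print(math.isclose(sample_sd, statistics.stdev(values)))
```

Because the standard deviation is in the same units as the data, it is usually the easier of the two to interpret.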