Unit 6 Vocabulary Flashcards

Question 1

Q

Skewed Left

Answer

A

Also known as negatively skewed. The bulk of the data items are clustered on the upper (right) end of the graph, with the long tail extending to the left.

Question 2

Q

Mean

Answer

A

The average value of all the data in a dataset. Calculated by adding up the values of all data items and then dividing by the number of items in the dataset.
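The calculation described above can be sketched in a few lines of Python. The function name `mean` and the sample list `scores` are illustrative choices, not part of the flashcard set.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean is undefined for an empty dataset")
    return sum(values) / len(values)


scores = [4, 8, 15, 16, 23, 42]  # illustrative data
print(mean(scores))  # 108 / 6 = 18.0
```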

Question 3

Q

z-score

Answer

A

A value indicating the number of standard deviations a data item is from the mean of its dataset.
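As a rough illustration, a z-score can be computed by subtracting the mean and dividing by the standard deviation. This sketch assumes the population standard deviation (`statistics.pstdev`); a course that uses the sample standard deviation would call `statistics.stdev` instead. The name `z_score` and the sample data are hypothetical.

```python
import statistics


def z_score(x, values):
    """Return how many standard deviations x lies from the mean of values."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("z-score is undefined when every value is identical")
    return (x - mu) / sigma


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, population standard deviation 2
print(z_score(9, data))  # (9 - 5) / 2 = 2.0
```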

Question 4

Q

Box and Whisker Plot

Answer

A

A graphical representation of the five number summary.

Question 5

Q

Upper Quartile

Answer

A

The median of the upper half of a dataset.

Question 6

Q

Bivariate

Answer

A

Data made up of two paired variables, commonly used to measure correlation.

Question 7

Q

Strong Positive Correlation

Answer

A

Indicated by a correlation coefficient as defined below:

{ r | 0.7 < r < 1 }

Question 8

Q

Weak Negative Correlation

Answer

A

Indicated by a correlation coefficient as defined below:

{ r | -0.3 < r < -0.1 }

Question 9

Q

Weak Positive Correlation

Answer

A

Indicated by a correlation coefficient as defined below:

{ r | 0.1 < r < 0.3 }

Question 10

Q

Maximum

Answer

A

The largest data value in a dataset.

Question 11

Q

Neutral Positive Correlation

Answer

A

Indicated by a correlation coefficient as defined below:

{ r | 0.4 < r < 0.6 }

Question 12

Q

Correlation Coefficient

Answer

A

A statistical measure of the strength and direction of the linear relationship in a bivariate dataset. Typically represented with a lowercase r:

{ r | -1 ≤ r ≤ 1 }
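One common way to compute r is the Pearson formula, sketched below. The flashcards do not say which formula the unit uses, so treat the choice of Pearson's r, the function name `pearson_r`, and the sample data as assumptions.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable never changes")
    return co_dev / math.sqrt(ss_x * ss_y)


hours = [1, 2, 3, 4, 5]        # illustrative paired data
scores = [52, 58, 61, 70, 74]
print(round(pearson_r(hours, scores), 3))
```

A perfectly linear increasing dataset gives r = 1, and a perfectly linear decreasing dataset gives r = -1.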

Question 13

Q

Lower Quartile

Answer

A

The median value of the lower half of a dataset.

Question 14

Q

Skewed Right

Answer

A

Also known as positively skewed. The bulk of the data items are clustered on the lower (left) end of the graph, with the long tail extending to the right.

Question 15

Q

Histogram

Answer

A

A graphical representation of the clustering of a dataset based on a specified bin width and the number of data items within each bin.

Question 16

Q

Bell Curve

Answer

A

A graphical representation of the spread of a normal dataset indicating 1, 2, and 3 standard deviations from the mean.

Question 17

Q

Median

Answer

A

The middle data item in a dataset when the items are arranged in order. When the number of items is even, the median is calculated by taking the middle 2 terms and averaging them.
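A minimal sketch of this rule, handling both the odd and the even case. The function name `median` is an illustrative choice; note that the data is sorted first, since the middle item only makes sense for ordered values.

```python
def median(values):
    """Return the middle value of the sorted data (average of the two middle values if even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median is undefined for an empty dataset")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: single middle item
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle two


print(median([7, 1, 3]))     # sorted: 1, 3, 7 -> 3
print(median([7, 1, 3, 5]))  # sorted: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```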

Question 18

Q

Standard Deviation

Answer

A

A statistical measure of the typical distance between the data items in a dataset and the mean.
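The usual formula takes the square root of the average squared distance from the mean. The sketch below assumes the population version (dividing by n); the sample version divides by n - 1 instead. The name `population_std_dev` is hypothetical.

```python
import math


def population_std_dev(values):
    """Return the square root of the mean squared deviation from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation is undefined for an empty dataset")
    mu = sum(values) / n
    variance = sum((x - mu) ** 2 for x in values) / n  # assumption: divide by n
    return math.sqrt(variance)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, squared deviations sum to 32
print(population_std_dev(data))  # sqrt(32 / 8) = 2.0
```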

Question 19

Q

Strong Negative Correlation

Answer

A

Indicated by a correlation coefficient as defined below:

{ r | -1 < r < -0.7 }

Question 20

Q

Causation

Answer

A

Causation is when it is proven that a change in one variable causes a change in another. In bivariate data analysis, high correlation is often cited as an indication of a causal relationship, but correlation does not imply causation.

Question 21

Q

No Correlation

Answer

A

Indicated by a correlation coefficient near or equal to zero.

Question 22

Q

Five Number Summary

Answer

A

A measure of a dataset’s spread and distribution accomplished by partitioning the data into quarters:

Minimum
Lower Quartile
Median
Upper Quartile
Maximum
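The five values listed above can be computed as sketched here. Textbooks differ on how to find quartiles; this sketch assumes the "median of each half" method used in the quartile cards and, for an odd number of items, leaves the middle item out of both halves. The function names are hypothetical.

```python
def _median(ordered):
    """Median of an already-sorted, non-empty list."""
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least 2 data items")
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, the middle item belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (ordered[0], _median(lower), _median(ordered), _median(upper), ordered[-1])


print(five_number_summary([1, 3, 5, 7, 9, 11, 13]))  # (1, 3, 7, 11, 13)
```

Subtracting the second value from the fourth gives the interquartile range, here 11 - 3 = 8.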

Question 23

Q

Minimum

Answer

A

The data item with the smallest value in a dataset.

Question 24

Q

Interquartile Range

Answer

A

The difference between the upper and lower quartiles of a dataset.

Question 25

Q

Outlier

Answer

A

A data item within a dataset whose value is far from the bulk of the other data items’ values.

Question 26

Q

Normal Distribution

Answer

A

A dataset whose histogram maps closely to a bell curve.

Question 27

Q

Mode

Answer

A

The number(s) that appear the most in a dataset. If all items appear only once, then there is no mode defined for that dataset.
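A short sketch of this definition, returning every value tied for the highest count and an empty list when no value repeats. The function name `modes` is an illustrative choice.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] if every value appears only once."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # all items appear once: no mode defined
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 3, 3]))  # [2, 3]
print(modes([1, 2, 3]))        # []
```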

Question 28

Q

Neutral Negative Correlation

Answer

A

Indicated by a correlation coefficient as defined below:

{ r | -0.6 < r < -0.4 }

Question 29

Q

Data Spread

Answer

A

A measure of the range of a dataset.

Question 30

Q

Percentile Ranking

Answer

A

The percentage of data items whose values are less than the item being ranked.
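Following this definition literally, a percentile ranking counts the items strictly below the ranked value and divides by the total. Other conventions (for example, counting ties as half) exist, so treat this as one possible reading; the name `percentile_rank` is hypothetical.

```python
def percentile_rank(x, values):
    """Return the percentage of items in values that are strictly less than x."""
    if not values:
        raise ValueError("percentile rank is undefined for an empty dataset")
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


test_scores = [50, 60, 70, 80, 90]  # illustrative data
print(percentile_rank(80, test_scores))  # 3 of 5 items are below 80 -> 60.0
```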

Question 31

Q

Range

Answer

A

The width of a dataset’s values. It is calculated as the difference between the maximum and minimum of a dataset.

Question 32

Q

Data Distribution

Answer

A

A measure of a dataset’s clustering and spread.