Module 3: Data Visualization Flashcards

1
Q

Data visualization

A

A distribution of a variable is a description of the values it can take on, & a count of how often each value occurs

2
Q

Sample distributions

A

The distribution of a sample we obtain from a population - lists the possible outcomes & the number of times each occurs in this sample

3
Q

One observation

A

One unit in the sample (or single coin flip result)

4
Q

Assess the distribution

A

Describe the distribution in a way that supports subsequent analysis, covering its shape, extreme values, centre, & spread

5
Q

Shape of distribution

A

- Left/negatively skewed: the mean is less than the median
- Right/positively skewed: the mean is greater than the median
- Symmetric: the mean & the median are equal
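The mean-versus-median rule on this card can be sketched in a few lines of Python. This is only an illustration: the function name `describe_shape`, the `tolerance` parameter, and the sample lists are made up for the example, and the rule itself is a rule of thumb rather than a formal test of skewness.

```python
from statistics import mean, median

def describe_shape(values, tolerance=0.0):
    """Classify shape with the mean-versus-median rule of thumb.

    `tolerance` is a hypothetical knob: gaps no larger than it
    are treated as "mean and median are equal".
    """
    gap = mean(values) - median(values)
    if gap > tolerance:
        return "right/positively skewed"
    if gap < -tolerance:
        return "left/negatively skewed"
    return "symmetric"

# Made-up samples for illustration only.
right_tail = [1, 2, 2, 3, 3, 3, 20]      # one large value pulls the mean up
left_tail = [1, 18, 19, 19, 20, 20, 20]  # one small value pulls the mean down
balanced = [1, 2, 3, 4, 5]

for name, data in [("right_tail", right_tail),
                   ("left_tail", left_tail),
                   ("balanced", balanced)]:
    print(name, "->", describe_shape(data))
```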

6
Q

Centre of the distribution

A

Use a measure of central tendency: the mean, median or mode. Report the median if you have a skewed distribution - report the mean if you have a symmetric distribution
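A small sketch of why the median is the safer report for a skewed sample: adding one extreme value moves the mean a lot but leaves the median where it was. The numbers below are invented for the example.

```python
from statistics import mean, median, mode

# Made-up sample; every value is close to 35.
typical = [30, 32, 35, 35, 38, 40]
# The same sample with one extreme value added, which skews it to the right.
with_outlier = typical + [250]

for label, data in [("typical", typical), ("with outlier", with_outlier)]:
    print(label,
          "mean:", round(mean(data), 2),
          "median:", median(data),
          "mode:", mode(data))
```

In the first sample the three measures agree, so the mean is a fair summary. In the second only the mean shifts, which is why the median is preferred there.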

7
Q

Spread of distribution

A

Use a measure of variation: the range, mean absolute deviation (MAD), variance, or standard deviation - for a boxplot, use the interquartile range (IQR)
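The measures named on this card can be computed with Python's standard library. Two assumptions are made here that the card does not state: MAD is taken as the mean absolute deviation around the mean, and the variance and standard deviation use the sample (n - 1) form. The function name and data are illustrative.

```python
from statistics import mean, stdev, variance

def spread_measures(values):
    """Return the range, MAD, sample variance and sample standard deviation."""
    centre = mean(values)
    return {
        "range": max(values) - min(values),
        # Assumption: MAD = average absolute distance from the mean.
        "mad": mean([abs(x - centre) for x in values]),
        # Assumption: sample versions, dividing by n - 1.
        "variance": variance(values),
        "std_dev": stdev(values),
    }

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data
measures = spread_measures(sample)
for name, value in measures.items():
    print(name, "=", round(value, 3))
```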

8
Q

Spread (visually)

A

Evaluated visually by looking at the relative heights of the bars

9
Q

The 5 number summary

A

- the minimum (min)
- Q1 (the 25th percentile)
- the median (Q2, the 50th percentile)
- Q3 (the 75th percentile)
- the maximum (max)
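As a rough sketch, the five values can be collected with `statistics.quantiles` from Python's standard library. The choice of `method="exclusive"` is an assumption; other quartile conventions give slightly different Q1 and Q3 on small samples. The function name and data are illustrative.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    # n=4 asks for the three cut points that split the data into quarters.
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)

sample = [2, 4, 4, 5, 7, 8, 9, 11, 12]  # made-up data
summary = five_number_summary(sample)
print(summary)
```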

10
Q

EXCEL percentiles

A

- Q1 = PERCENTILE.EXC(values, 0.25)
- Q2 = PERCENTILE.EXC(values, 0.50)
- Q3 = PERCENTILE.EXC(values, 0.75)
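For checking spreadsheet results by hand, here is a hedged Python sketch of the exclusive percentile rule: find position p x (n + 1) in the sorted data and interpolate between the neighbouring values. The helper name `percentile_exc` is hypothetical, and the sketch is meant to mirror the commonly documented behaviour of Excel's function, not to reproduce it in every edge case.

```python
def percentile_exc(values, p):
    """Exclusive percentile: interpolate at 1-based position p * (n + 1)."""
    data = sorted(values)
    n = len(data)
    rank = p * (n + 1)
    if rank < 1 or rank > n:
        # The exclusive rule cannot interpolate outside the data.
        raise ValueError("p is outside the range this rule supports")
    lower = int(rank)          # whole part of the position
    fraction = rank - lower    # how far to move towards the next value
    if lower == n:
        return float(data[-1])
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])

sample = [10, 20, 30, 40, 50, 60, 70, 80]  # made-up data
q1 = percentile_exc(sample, 0.25)
q2 = percentile_exc(sample, 0.50)
q3 = percentile_exc(sample, 0.75)
print(q1, q2, q3)
```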

11
Q

Percentiles

A

The pth percentile of a data set is the value such that p percent of the observations are less than or equal to the value

12
Q

Quartile

A

The quartiles of a data set are its 25th, 50th & 75th percentiles (Q1, Q2, Q3)

13
Q

The interquartile range (IQR)

A

A measure of the spread of a data set: the difference between Q3 & Q1 - IQR = Q3 - Q1

14
Q

The inner fence

A

Inner fence (IF) = 1.5 x IQR - a sample value greater than Q3 + IF, or less than Q1 - IF, is extreme

15
Q

Lower whisker

A

(Q1 - IF, Q1)

16
Q

Upper whisker

A

(Q3, Q3 + IF)
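The IQR, inner fence (IF), and whisker limits from the last few cards can be tied together in one short sketch. The quartiles below come from `statistics.quantiles` with `method="exclusive"`, which is an assumption about the quartile convention. The function name and sample are made up.

```python
from statistics import quantiles

def boxplot_limits(values):
    """Compute IQR, inner fence, whisker limits and extreme values."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    inner_fence = 1.5 * iqr
    lower_limit = q1 - inner_fence   # far end of the lower whisker range
    upper_limit = q3 + inner_fence   # far end of the upper whisker range
    extremes = [x for x in values if x < lower_limit or x > upper_limit]
    return {
        "iqr": iqr,
        "inner_fence": inner_fence,
        "lower_whisker": (lower_limit, q1),
        "upper_whisker": (q3, upper_limit),
        "extremes": extremes,
    }

sample = [2, 4, 4, 5, 7, 8, 9, 11, 12, 40]  # made-up data with one large value
limits = boxplot_limits(sample)
for name, value in limits.items():
    print(name, "=", value)
```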

17
Q

Histogram

A

Used to portray the frequency of sample values: the number of times a value within a certain interval occurs in the data set

18
Q

Table of relative frequency

A

A numerical summary of a data set that lists each class alongside its relative frequency - the tabular counterpart of a histogram

19
Q

Class

A

Refers to each bin in the corresponding histogram

20
Q

Frequency

A

The number of occurrences of a class in a data set

21
Q

Relative frequency

A

The fraction of the number of occurrences of a class in a data set, out of the total number of elements in the data set, expressed as a decimal

22
Q

Cumulative relative frequency

A

The relative frequency of each class, added to the relative frequency of all previous classes in the table
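The terms class, frequency, relative frequency, and cumulative relative frequency fit together in a single table. The sketch below builds one with the standard library; equal-width classes of the form [start, start + width) are an assumption, and the function name and scores are invented.

```python
from collections import Counter
from itertools import accumulate

def frequency_table(values, bin_width):
    """Build rows of (class start, class end, frequency,
    relative frequency, cumulative relative frequency)."""
    # Assign each value to the class [start, start + bin_width).
    counts = Counter((v // bin_width) * bin_width for v in values)
    total = len(values)
    starts = sorted(counts)  # note: classes with no observations are skipped
    freqs = [counts[s] for s in starts]
    # Accumulate the counts first, then divide, to limit rounding drift.
    running = accumulate(freqs)
    return [
        (s, s + bin_width, f, f / total, c / total)
        for s, f, c in zip(starts, freqs, running)
    ]

scores = [52, 55, 61, 64, 68, 71, 73, 75, 78, 90]  # made-up data
table = frequency_table(scores, bin_width=10)
for row in table:
    print(row)
```

The last cumulative relative frequency is always 1, which is a quick check that no observation was dropped.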

23
Q

Contingency table

A

Used to compare the distributions of 2 or more categorical variables
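A contingency table is a grid of counts, one cell per combination of categories. The sketch below builds one for two categorical variables with the standard library; the variable names, categories, and observations are all hypothetical.

```python
from collections import Counter

def contingency_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

# Made-up observations: (study mode, passed the exam?)
observations = [
    ("full-time", "yes"), ("full-time", "yes"), ("full-time", "no"),
    ("part-time", "yes"), ("part-time", "no"), ("part-time", "no"),
    ("part-time", "no"),
]
table = contingency_table(observations)
for row_label, row in table.items():
    print(row_label, row)
```

Reading across a row shows the distribution of one variable within a single category of the other, which is what makes the comparison possible.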