Week 5 - Visualising Variability Flashcards

1
Q

What is a random variable?

A

A quantity with values not known with certainty.

2
Q

Define variation in statistics.

A

The difference in a variable measured over observations.

3
Q

What does a frequency distribution describe?

A

The values of a variable and how often they appear in the data.

4
Q

What is a categorical variable?

A

Data consisting of labels or names, for which arithmetic operations are not meaningful.

5
Q

What is a quantitative variable?

A

Data consisting of numerical values, for which arithmetic operations are meaningful.

6
Q

What is a sample?

A

A subset of a population.

7
Q

What is relative frequency?

A

The proportion of items belonging to a class: the class frequency divided by the total number of observations.

8
Q

How is percent frequency calculated?

A

Relative frequency multiplied by 100.
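As a minimal sketch of how frequency, relative frequency and percent frequency relate, the snippet below builds a small frequency table. The `drinks` data and the `frequency_table` helper are made up for illustration and are not part of the course material.

```python
from collections import Counter

# Hypothetical sample of a categorical variable (illustrative data only).
drinks = ["tea", "coffee", "coffee", "water", "tea", "coffee", "juice", "coffee"]


def frequency_table(values):
    """Map each class to (frequency, relative frequency, percent frequency)."""
    counts = Counter(values)
    n = len(values)
    # Relative frequency = count / n; percent frequency = relative frequency * 100.
    return {label: (c, c / n, 100 * c / n) for label, c in counts.items()}


table = frequency_table(drinks)
for label, (freq, rel, pct) in sorted(table.items()):
    print(f"{label:<8}{freq:>3}{rel:>8.3f}{pct:>7.1f}%")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any table built this way.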

9
Q

What does a probability distribution characterize?

A

The variability of a random variable: the values it can take and how likely each value is.

10
Q

What does Benford’s law state?

A

In many real-world data sets, the first digit d (1 to 9) is not uniformly distributed: it occurs with proportion log10(1 + 1/d), so about 30.1% of observations start with 1 and only about 4.6% start with 9.
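A short sketch can make Benford's law concrete. It compares the expected first-digit proportions, log10(1 + 1/d), with the observed proportions in the first 1000 powers of 2, a sequence commonly used to illustrate the law. The helper names are hypothetical and the choice of data set is an assumption made for the example.

```python
import math


def benford_proportion(d):
    """Expected proportion of values whose first digit is d (1 to 9)."""
    return math.log10(1 + 1 / d)


def first_digit(x):
    """Leading digit of a nonzero integer."""
    return int(str(abs(x))[0])


def first_digit_proportions(values):
    """Observed proportion of each first digit 1..9 in values."""
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        counts[first_digit(v)] += 1
    n = len(values)
    return {d: c / n for d, c in counts.items()}


# Illustrative data set: 2^1, 2^2, ..., 2^1000.
powers = [2 ** k for k in range(1, 1001)]
observed = first_digit_proportions(powers)

for d in range(1, 10):
    print(d, round(benford_proportion(d), 3), round(observed[d], 3))
```

Data that are expected to follow the law but deviate strongly from these proportions are sometimes flagged for closer inspection.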

11
Q

Define a histogram.

A

A column chart with no gaps between the columns, in which the height of each column represents the frequency of observations in a bin.

12
Q

What is the recommended number of bins in a histogram?

A

Between 5 and 20, depending on the number of observations.

13
Q

How is approximate bin width calculated?

A

(Largest value minus smallest value) divided by the number of bins.
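The bin-width rule can be sketched in a few lines. The example below computes the approximate width and then counts observations per bin, as a histogram would. The data and function names are illustrative assumptions, and the sketch assumes the values are not all identical (otherwise the width would be zero).

```python
def approximate_bin_width(values, n_bins):
    """(largest value - smallest value) / number of bins."""
    return (max(values) - min(values)) / n_bins


def bin_counts(values, n_bins):
    """Count observations in n_bins equal-width bins starting at the minimum."""
    lowest = min(values)
    width = approximate_bin_width(values, n_bins)
    counts = [0] * n_bins
    for v in values:
        # The largest value would land just past the last bin, so clamp it.
        index = min(int((v - lowest) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical observations (illustrative data only).
ages = [12, 15, 21, 22, 25, 28, 30, 31, 34, 37, 40, 42]
print(approximate_bin_width(ages, 5))  # (42 - 12) / 5 = 6.0
print(bin_counts(ages, 5))             # [2, 2, 2, 3, 3]
```

In practice the computed width is often rounded to a convenient number so that the bin edges are easy to read.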

14
Q

What is a kernel density chart?

A

A continuous alternative to a histogram that uses kernel density estimation.

15
Q

What does skewness represent in a distribution?

A

The lack of symmetry in a quantitative distribution.

16
Q

What is a frequency polygon?

A

A visualization tool that uses lines to connect the counts of observations in bins.

17
Q

What is a trellis display?

A

A vertical or horizontal arrangement of individual charts that differ only by the data they display.

18
Q

Define mean.

A

Sum of the values divided by the sample size.

19
Q

How is the median determined?

A

Sort the values; the median is the middle value if the sample size is odd, or the average of the two middle values if it is even.

20
Q

What is mode?

A

The most frequent value in a data set.

21
Q

How is range calculated?

A

Largest value minus smallest value in the set.
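The mean, median, mode and range can each be computed directly from their definitions. The following is a minimal sketch on made-up data; the function names are hypothetical, and `mode` returns only one value even when several values tie for most frequent.

```python
from collections import Counter


def mean(values):
    """Sum of the values divided by the sample size."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def mode(values):
    """Most frequent value (the first one encountered if there is a tie)."""
    return Counter(values).most_common(1)[0][0]


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


# Hypothetical sample (illustrative data only).
scores = [4, 8, 6, 5, 3, 8, 9, 5, 8, 4]
print(mean(scores), median(scores), mode(scores), value_range(scores))
# 6.0 5.5 8 6
```

Python's standard `statistics` module provides `mean`, `median` and `mode` with the same meaning, so the hand-written versions are only there to show the arithmetic.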

22
Q

What is standard deviation?

A

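A measure of spread: the square root of the average squared deviation from the mean (for a sample, the sum of squared deviations is divided by n - 1 rather than n).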
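To make the calculation concrete, here is a sketch of the standard deviation computed step by step. It assumes the usual convention of dividing by n - 1 for a sample and by n for a whole population; the data and the function name are illustrative.

```python
import math


def standard_deviation(values, sample=True):
    """Square root of the average squared deviation from the mean.

    With sample=True the sum of squared deviations is divided by n - 1,
    otherwise by n (population standard deviation).
    """
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    divisor = n - 1 if sample else n
    return math.sqrt(sum(squared_deviations) / divisor)


# Hypothetical data: mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(data, sample=False))  # sqrt(32 / 8) = 2.0
print(standard_deviation(data))                # sqrt(32 / 7)
```

The sample version is slightly larger than the population version for the same data, and the gap shrinks as the sample size grows.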

23
Q

Define percentile.

A

The pth percentile is a value such that approximately p% of the observations in the set fall at or below it.

24
Q

What is the interquartile range?

A

Q3 minus Q1: the third quartile (75th percentile) minus the first quartile (25th percentile).
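Percentiles and the interquartile range can be sketched as follows. Several percentile conventions exist and they can give slightly different answers on small samples; this example assumes linear interpolation between sorted values, which is one common choice. The function names and data are hypothetical.

```python
def percentile(values, p):
    """pth percentile using linear interpolation between sorted values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def interquartile_range(values):
    """Q3 minus Q1, the spread of the middle half of the data."""
    return percentile(values, 75) - percentile(values, 25)


# Hypothetical data (illustrative only).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(percentile(data, 25))       # Q1 = 3.0
print(percentile(data, 50))       # median = 5.0
print(percentile(data, 75))       # Q3 = 7.0
print(interquartile_range(data))  # 4.0
```

Because it ignores the lowest and highest quarters of the data, the interquartile range is less sensitive to extreme values than the range.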

25
Q

What is statistical inference?

A

The process of using sample data to make estimates or draw conclusions about a population.

26
Q

What is a confidence interval?

A

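A range of plausible values for a population parameter, such as a mean or proportion, formed as the point estimate plus or minus the margin of error at a given confidence level.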

27
Q

What does margin of error represent?

A

The uncertainty in the parameter estimate at a given confidence level: the amount added to and subtracted from the point estimate to form the confidence interval.

28
Q

What factors influence the margin of error for a confidence interval on a mean?

A

1. The confidence level
2. The variability of sample values (s.d.)
3. The sample size
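The three factors above can be seen in a small sketch of the margin of error for a mean, computed as z × s / √n. This version assumes a normal approximation for the critical value z; for small samples a t-distribution critical value is generally more appropriate, but the Python standard library does not provide one. The data and function names are made up for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def margin_of_error(sample, confidence=0.95):
    """z * s / sqrt(n), using a normal approximation for z (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * stdev(sample) / math.sqrt(len(sample))


def confidence_interval(sample, confidence=0.95):
    """Point estimate (sample mean) plus or minus the margin of error."""
    estimate = mean(sample)
    error = margin_of_error(sample, confidence)
    return estimate - error, estimate + error


# Hypothetical sample (illustrative data only).
sample = [52, 48, 55, 60, 47, 51, 58, 49, 53, 57]
print(confidence_interval(sample, 0.95))
print(margin_of_error(sample, 0.99))  # wider: higher confidence level
print(margin_of_error(sample * 4))    # narrower: larger sample size
```

A higher confidence level or more variable data widens the interval, while a larger sample narrows it.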
29
Q

What is time series data?

A

A sequence of observations on a variable measured at successive points in time.

30
Q

What is a time series chart?

A

A line chart with time units on the horizontal axis and variable values on the vertical axis.