Organizing, Describing and Visualizing Data Flashcards

1
Q

Values that can be counted or measured are called _____ data.

A

Numerical (or Quantitative)

2
Q

Discrete and Continuous data are types of _____ data.

A

Numerical (or Quantitative)

3
Q

Data that is countable, such as the months, days, or hours in a year, is called _____ data.

A

Discrete

4
Q

Data that can take any fractional value is called _____ data.

A

Continuous

5
Q

Data that consists of labels that can be used to classify a set of data into groups is called _____ data.

A

Categorical (or Qualitative)

6
Q

Nominal and ordinal data are types of _____ data.

A

Categorical (or Qualitative)

7
Q

Labels that cannot be placed in a logical order are called _____ data.

A

Nominal

8
Q

Data that can be ranked in a logical order is called _____ data.

A

Ordinal

9
Q

A set of observations taken periodically, most often at equal intervals over time, is called a _____.

A

Time series

10
Q

A set of comparable observations all taken at one specific point in time is called _____.

A

Cross-sectional data

11
Q

The combination of time series and cross-sectional data, often presented in tables, is called _____.

A

Panel data

12
Q

Time series, cross-sectional, and panel data, organized in a defined way, are examples of _____ data.

A

Structured data (ex: market data, fundamental data, etc.)

13
Q

Information that is presented in a form with no defined structure is referred to as _____ data.

A

Unstructured data (ex: management commentaries; it must be transformed into structured data before it can be analyzed)

14
Q

A time series is an example of a _____ array.

A

One-dimensional

15
Q

For any frequency distribution, the interval with the greatest frequency is referred to as the _____ interval.

A

Modal

16
Q

The _____ frequency is the percentage of total observations falling within each interval.

A

Relative

17
Q

The _____ frequency is the number of observations falling within an interval.

A

Absolute
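
As a rough illustration of the modal interval, relative frequency, and absolute frequency cards, the Python sketch below bins a small made-up sample into equal-width intervals. The data, the interval width, and the helper name `frequency_table` are assumptions chosen for illustration only.

```python
from collections import Counter

def frequency_table(values, start, width):
    """Return {(lo, hi): (absolute, relative)} for equal-width bins [lo, hi)."""
    # Absolute frequency: the count of observations in each interval.
    counts = Counter((v - start) // width for v in values)
    n = len(values)
    table = {}
    for k in sorted(counts):
        lo = start + k * width
        # Relative frequency: the interval's share of all observations.
        table[(lo, lo + width)] = (counts[k], counts[k] / n)
    return table

# Hypothetical sample of ten returns, in percent.
returns = [-3, -1, 0, 1, 2, 2, 3, 4, 6, 8]
table = frequency_table(returns, start=-5, width=5)

# The modal interval is the one with the greatest absolute frequency.
modal_interval = max(table, key=lambda interval: table[interval][0])
```

In this sample the interval from 0 up to (but not including) 5 holds six of the ten observations, so it is the modal interval.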

18
Q

A _____ is a two-dimensional array with which we can analyze two variables at the same time.

A

Contingency table (ex: Accidents by intersection and day of week)
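
A contingency table like the accidents example can be sketched in a few lines of Python. The intersection names and records below are invented for illustration; only the idea of counting joint occurrences of two categorical variables comes from the card.

```python
from collections import Counter

# Hypothetical accident records: (intersection, day of week).
records = [
    ("Main & 1st", "Mon"), ("Main & 1st", "Fri"), ("Main & 1st", "Fri"),
    ("Oak & 3rd", "Mon"), ("Oak & 3rd", "Mon"), ("Oak & 3rd", "Fri"),
]

# Each cell holds the joint frequency of one (row, column) pair.
cells = Counter(records)
rows = sorted({r for r, _ in records})
cols = sorted({c for _, c in records})
contingency = {r: {c: cells[(r, c)] for c in cols} for r in rows}

# Marginal frequencies are the row totals (column totals work the same way).
row_totals = {r: sum(contingency[r].values()) for r in rows}
```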

19
Q

One kind of contingency table is a 2-by-2 array called a _____.

A

Confusion matrix

20
Q

To analyze three variables at the same time, an analyst can create a _____.

A

Scatter plot matrix

21
Q

The most effective chart types for visualizing RELATIONSHIPS are _____.

A

Scatter plots, scatter plot matrices, and heat maps

22
Q

The most effective chart types for COMPARING CATEGORIES are _____.

A

Bar charts, tree maps, and heat maps

23
Q

The most effective chart types for COMPARING OVER TIME are _____.

A

Line charts, dual-scale line charts, and bubble line charts

24
Q

The most effective chart types for visualizing DISTRIBUTIONS of NUMERICAL DATA are _____.

A

Histograms, frequency polygons, and cumulative distribution charts

25
Q

The most effective chart types for visualizing DISTRIBUTIONS of CATEGORICAL DATA are _____.

A

Bar charts, tree maps, and heat maps

26
Q

The mean that excludes a stated percentage of the most extreme observations (ex: discard the lowest 0.5% and the highest 0.5% of the observations) is called the _____ mean.

A

Trimmed

27
Q

The mean that substitutes a value for the highest and lowest observations is called the _____ mean.

A

Winsorized

28
Q

The trimmed and winsorized means are used to control for _____.

A

Outliers
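
To make the difference between the two outlier-resistant means concrete, here is a minimal Python sketch. The function names, the `fraction` parameter (total share of observations affected, split evenly between the two tails), and the sample data are assumptions for illustration, not a standard library API.

```python
def trimmed_mean(values, fraction):
    """Mean after discarding fraction/2 of the observations from each tail."""
    xs = sorted(values)
    k = int(len(xs) * fraction / 2)  # observations dropped per tail
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)

def winsorized_mean(values, fraction):
    """Mean after replacing fraction/2 of each tail with the nearest kept value."""
    xs = sorted(values)
    k = int(len(xs) * fraction / 2)  # observations replaced per tail
    if k:
        xs[:k] = [xs[k]] * k         # pull the low tail up
        xs[-k:] = [xs[-k - 1]] * k   # pull the high tail down
    return sum(xs) / len(xs)

data = [-40, 1, 2, 3, 4, 5, 6, 7, 8, 60]  # hypothetical sample with two outliers
plain = sum(data) / len(data)
trimmed = trimmed_mean(data, 0.20)
winsorized = winsorized_mean(data, 0.20)
```

The trimmed mean drops the extreme observations and averages what remains; the winsorized mean keeps the sample size unchanged but caps the extremes at the nearest retained values.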

29
Q

The midpoint of a data set when the data is arranged in ascending or descending order is called the _____.

A

Median

30
Q

The value that occurs most frequently in a data set is called the _____.

A

Mode

31
Q

The mean to use for estimating the next observation, or the expected value of a distribution, is the _____ mean.

A

Arithmetic

32
Q

The mean to use for finding the compound rate of return over multiple periods is the _____ mean.

A

Geometric
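
A short sketch of the geometric mean return, assuming returns are given as decimals (0.10 for 10%) and none is below -100%. The function name and the sample returns are hypothetical.

```python
import math

def geometric_mean_return(returns):
    """Compound per-period return implied by a series of periodic returns."""
    growth = math.prod(1 + r for r in returns)  # total growth factor
    return growth ** (1 / len(returns)) - 1

returns = [0.10, -0.05, 0.20]  # hypothetical annual returns
geometric = geometric_mean_return(returns)
arithmetic = sum(returns) / len(returns)
```

Compounding at `geometric` for three periods reproduces the total growth of the series, which is why this mean, rather than the arithmetic one, describes the multi-period compound rate.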

33
Q

The mean to use for estimating the mean without the effects of a given percentage of outliers is the _____ mean.

A

Trimmed

34
Q

The mean to use for estimating the mean while decreasing the effects of a given percentage of outliers is the _____ mean.

A

Winsorized

35
Q

The mean to use to calculate the average share cost from periodic purchases in a fixed dollar amount is the _____ mean.

A

Harmonic
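
The link between the harmonic mean and fixed-dollar purchases can be checked numerically. The prices and the per-period budget below are made up; `statistics.harmonic_mean` is part of the Python standard library.

```python
from statistics import harmonic_mean

prices = [10.0, 20.0, 25.0]  # hypothetical share prices at each purchase date
budget = 1000.0              # the same dollar amount is invested each period

# Average cost per share = total dollars spent / total shares acquired.
shares_bought = [budget / p for p in prices]
average_cost = budget * len(prices) / sum(shares_bought)

# The harmonic mean of the prices gives the same figure.
h = harmonic_mean(prices)
```

Because a fixed budget buys more shares when the price is low, low prices carry more weight, and the average cost lands below the simple average of the prices.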

36
Q

The difference between the third quartile (75th percentile) and the first quartile (25th percentile) is known as the _____.

A

Interquartile range
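
A minimal sketch of the interquartile range using the standard library. Quartile conventions vary between textbooks and software; this example assumes the default "exclusive" method of `statistics.quantiles`, and the data are hypothetical.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical observations

# n=4 returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1
```

With this method the quartiles are 2.5, 5.0, and 7.5, so the interquartile range is 5.0. A different interpolation method would generally give slightly different cut points.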

37
Q

To visualize a data set based on quantiles, we can create a _____ plot.

A

Box and whisker

38
Q

The _____ is the distance between the largest and the smallest value in the data set.

A

Range

39
Q

The average of the absolute values of the deviations of individual observations from the arithmetic mean is called the _____.

A

Mean absolute deviation (MAD)
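
As a small worked example of the range and the mean absolute deviation, the sketch below computes both for a made-up sample. The function name and the data are assumptions for illustration.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of each observation from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

data = [2, 4, 4, 6, 9]  # hypothetical observations, mean = 5

value_range = max(data) - min(data)   # 9 - 2 = 7
mad = mean_absolute_deviation(data)   # (3 + 1 + 1 + 1 + 4) / 5 = 2.0
```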

40
Q

The coefficient of variation (CV) is computed as the _____ of X divided by the _____ of X.

A

Standard deviation, Average value
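
A quick sketch of the coefficient of variation, assuming the sample standard deviation (n - 1 divisor) is intended. The return figures are hypothetical.

```python
from statistics import mean, stdev

returns = [4.0, 6.0, 8.0, 10.0, 12.0]  # hypothetical returns, in percent

# CV = standard deviation / mean: dispersion per unit of average value.
cv = stdev(returns) / mean(returns)
```

Because the CV is a ratio, it has no units, which makes it usable for comparing dispersion across data sets with very different means.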

41
Q

One measure of downside risk, which involves choosing a target value against which to measure each outcome and including only the deviations that fall below that target, is called _____.

A

Target downside deviation (or Target semideviation)
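
The sketch below shows one common convention for the sample target semideviation: square only the deviations of outcomes at or below the target, divide by n - 1 (with n the full sample size), and take the square root. The divisor choice, the function name, and the data are assumptions; other texts may use a different divisor.

```python
import math

def target_semideviation(values, target):
    """Sample target semideviation; only outcomes at or below the target count."""
    n = len(values)
    downside = sum((x - target) ** 2 for x in values if x <= target)
    return math.sqrt(downside / (n - 1))

returns = [-4.0, -1.0, 2.0, 5.0, 8.0]  # hypothetical returns, in percent
tsd = target_semideviation(returns, target=0.0)
```

Here only -4.0 and -1.0 fall below the 0% target, so the upside outcomes contribute nothing to the risk measure.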

42
Q

_____ refers to the extent to which a distribution is not symmetrical.

A

Skewness (or Skew)

43
Q

For a _____, unimodal distribution, the mean, median, and mode are equal.

A

Symmetrical

44
Q

For a positively skewed, unimodal distribution, the _____ is less than the _____, which is less than the _____.

A

Mode, median, mean
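
The ordering for a positively skewed sample can be seen on a small made-up data set with a long right tail. The numbers are hypothetical; the functions come from Python's `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed sample: most values are small, a few are large.
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

sample_mode = mode(data)      # most frequent value
sample_median = median(data)  # midpoint of the sorted data
sample_mean = mean(data)      # pulled toward the long right tail
```

In this sample the mode is 2, the median is 3, and the mean is 5: the large observations in the right tail drag the mean up the most.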

45
Q

Among median, mean, and mode, the _____ is the most affected by skewness.

A

Mean

46
Q

_____ is a measure of the degree to which a distribution is more or less peaked than a normal distribution.

A

Kurtosis

47
Q

LEPTOKURTIC describes a distribution that is _____ peaked than a normal distribution, whereas PLATYKURTIC refers to a distribution that is _____ peaked than a normal distribution.

(Fill in the blanks with “more” or “less”)

A

More, less

48
Q

A LEPTOKURTIC return distribution will have _____ returns clustered around the mean and _____ returns with large deviations from the mean.

(Fill in the blanks with “more” or “less”)

A

More, more

49
Q

A distribution is said to exhibit _____ if it has either more or less kurtosis than the normal distribution.

A

Excess kurtosis

50
Q

Excess kurtosis = Sample kurtosis − X

Find “X”.

A

3
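
As an illustration of excess kurtosis, the sketch below uses the simple moment-based kurtosis (fourth central moment divided by the squared variance, both with divisor n) and subtracts 3. Sample formulas with small-sample adjustments give somewhat different numbers; the function name and both data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based kurtosis minus 3 (the kurtosis of a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

flat = [1, 2, 3, 4, 5]                       # evenly spread, thin tails
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 10]      # clustered center, extreme tails

flat_excess = excess_kurtosis(flat)      # negative: platykurtic
peaked_excess = excess_kurtosis(peaked)  # positive: leptokurtic
```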

51
Q

_____ is a measure of HOW two variables move together.

A

Covariance

52
Q

_____ measures the STRENGTH of the linear relationship between two random variables.

A

Correlation
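
Covariance and correlation can be sketched from their definitions. The example assumes sample statistics with an n - 1 divisor; the function names and the paired data are made up for illustration.

```python
import math

def sample_covariance(xs, ys):
    """Average co-movement of two variables around their means (n - 1 divisor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance scaled by both standard deviations; lies between -1 and +1."""
    sx = math.sqrt(sample_covariance(xs, xs))  # covariance of x with itself = variance
    sy = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical paired observations
y = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_xy = sample_covariance(x, y)
rho = correlation(x, y)
```

Covariance carries the units of both variables, so its size is hard to interpret on its own; dividing by the two standard deviations removes the units and yields a measure of strength.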

53
Q

_____ correlation refers to correlation that is either the result of chance or present due to changes in both variables over time that are caused by their association with a third variable.

A

Spurious