Organizing, Visualizing, and Describing Data Flashcards

1
Q

Data that is measured or counted

A

Numerical

2
Q

2 types of numerical data

A

Continuous and discrete

3
Q

Data that can be measured and can take on any value in a range of values

A

Continuous numerical data - FV of an investment

4
Q

Numerical data that result from a counting process

A

Discrete numerical data - the frequency of discrete compounding

5
Q

2 types of data

A

Numerical and Categorical

6
Q

Data that describe a characteristic or quality

A

categorical data

7
Q

Other names for Numerical and Categorical data

A

Quantitative and Qualitative data

8
Q

Categorical data not amenable to a logical order

A

nominal - stock sectors

9
Q

Categorical data able to be logically ordered

A

ordinal data - ratings for investment funds

10
Q

science of dealing with collection, analysis, interpretation, and presentation of numerical data

A

statistics

11
Q
  1. the study of how large datasets can be effectively summarized
  2. studies of central tendency and variation of data
A

descriptive statistics

12
Q

making extrapolations, estimates, forecasts about a large group from a smaller group

A

statistical inference

13
Q

the complete group (objects, persons, items of interest) being studied

A

population

14
Q

a portion of the group being studied

A

sample

15
Q

parameter vs statistic

A

a descriptive measure of a population vs a sample, respectively (p&s)

16
Q

even distance between (consecutive) numbers

comment on zero

A

interval
zero is arbitrary

17
Q
A
18
Q

multiple data units at a given time

A

cross-sectional data

19
Q

one unit of data observed across multiple time periods

A

time-series data

20
Q

data that is patterned vs unpatterned

A

structured vs unstructured data

21
Q

examples of structured data

A

market data - stock prices

fundamental data - financial statement data

analytics - cash flows

22
Q

examples of unstructured data

A

produced by individuals - social media, posts, web searches

23
Q

rank measures from more useful to less useful (interval, nominal, ordinal, ratio)

A

ratio, interval, ordinal, nominal

24
Q

format for representing one variable

A

one-dimensional array

25
Q

format for representing more than one variable via rows/columns

A

two-dimensional array

26
Q

What is another name for a two-dimensional array?

A

data table

27
Q

another name for a frequency distribution

A

one-way table

28
Q

tool for summarizing data into groups or bins for display

A

frequency distribution

29
Q

GICS stands for

A

Global Industry Classification Standard

30
Q

real or actual frequency

A

absolute frequency

31
Q

frequency as a percentage of the total number of observations

A

relative frequency
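
A minimal sketch of absolute versus relative frequency, assuming a small list of sector labels invented purely for illustration. Each relative frequency is the absolute count divided by the total number of observations, so the relative frequencies sum to 1.

```python
from collections import Counter

# Hypothetical sector labels for six stocks (invented for illustration)
sectors = ["Energy", "Tech", "Tech", "Health", "Energy", "Tech"]

absolute = Counter(sectors)   # absolute frequency: raw count per category
n = len(sectors)              # total number of observations
relative = {k: v / n for k, v in absolute.items()}  # relative frequency: share of total

for sector in sorted(absolute):
    print(f"{sector:<8} {absolute[sector]:>2} {relative[sector]:.1%}")
```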

32
Q

interval data with a true (absolute) zero point

A

ratio

33
Q

raw data or non-summarized data

A

ungrouped data

34
Q

data in a frequency distribution

A

grouped data

35
Q

depiction of frequency distribution

A

histogram

36
Q

also known as a 2 way table

A

contingency table

37
Q

displays 2 or more categorical variables

A

contingency table

38
Q

frequency at the intersection of a particular row and column

A

joint frequency

39
Q

sum of the joint frequencies across a row or down a column

A

marginal frequency
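
As a rough illustration of joint and marginal frequencies, the sketch below tallies a tiny two-variable dataset. The (sector, size) pairs are hypothetical and exist only to make the counts easy to check by hand.

```python
from collections import Counter

# Hypothetical (sector, size) observations, invented for illustration
observations = [
    ("Tech", "Large"), ("Tech", "Small"), ("Energy", "Large"),
    ("Tech", "Large"), ("Energy", "Small"),
]

# Joint frequency: count for each (row, column) combination
joint = Counter(observations)

# Marginal frequencies: joint frequencies summed across a row or down a column
row_marginal = Counter(sector for sector, _ in observations)
col_marginal = Counter(size for _, size in observations)

print(joint[("Tech", "Large")], row_marginal["Tech"], col_marginal["Large"])
```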

40
Q

2x2 contingency table in matrix form comparing actual and predicted classes (correct vs. incorrect predictions)

A

confusion matrix
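
A minimal sketch of how the four cells of a 2x2 confusion matrix could be tallied, assuming binary labels where 1 is the positive class. The label lists are made up for illustration.

```python
# Hypothetical actual and predicted class labels (1 = positive, 0 = negative)
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

# Rows are actual classes, columns are predicted classes
confusion = [[tp, fn],
             [fp, tn]]
print(confusion)
```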

41
Q

line graph connecting the bin midpoints of a histogram to show frequencies

A

frequency polygon

42
Q

frequency polygon with cumulative frequencies

A

ogive
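
The values plotted on an ogive are running totals of the bin frequencies. A short sketch, assuming a hypothetical list of absolute frequencies ordered from the smallest bin to the largest:

```python
from itertools import accumulate

# Hypothetical absolute frequencies per bin, smallest bin first
bin_counts = [3, 1, 1, 2, 3]

cumulative = list(accumulate(bin_counts))              # cumulative absolute frequency
total = cumulative[-1]                                 # total number of observations
cumulative_relative = [c / total for c in cumulative]  # cumulative relative frequency

print(cumulative)
print(cumulative_relative)
```

The last cumulative relative frequency is always 1, since every observation falls in some bin.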

43
Q

circular depiction of data as a percent

A

pie chart

44
Q

steps of creating a frequency distribution

A
  1. sort into ascending order
  2. calculate the range (maximum - minimum)
  3. choose # of bins (k)
  4. bin width = range/k
  5. place the observations in the bins
  6. construct a table of bins from smallest to largest
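
The steps above can be sketched as a small function. The name `frequency_distribution` and the sample data are hypothetical, and placing the maximum value in the last bin is one common convention rather than the only option.

```python
def frequency_distribution(data, k):
    """Return [((low, high), count), ...] for k equal-width bins."""
    values = sorted(data)                 # 1. sort into ascending order
    lo, hi = values[0], values[-1]
    data_range = hi - lo                  # 2. range = maximum - minimum
    if data_range == 0:
        raise ValueError("all observations are identical; cannot form bins")
    width = data_range / k                # 3-4. k bins, bin width = range / k
    counts = [0] * k
    for v in values:                      # 5. place observations in the bins
        idx = min(int((v - lo) / width), k - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    # 6. table of bins from smallest to largest
    bins = [(lo + i * width, lo + (i + 1) * width) for i in range(k)]
    return list(zip(bins, counts))


# Hypothetical observations, invented for illustration
table = frequency_distribution([1, 2, 2, 3, 5, 7, 8, 9, 10, 11], k=5)
for (low, high), count in table:
    print(f"{low:>4.1f} to {high:>4.1f}: {count}")
```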
45
Q

test of association between 2 categorical variables

A

chi-square test

46
Q

arranges data by left digit and right digit to present data concentrations

A

stem and leaf plot

47
Q

used in quality control to tally qualitative issues

A

Pareto chart

48
Q

2-variable numeric chart used to show correlation

A

scatter plot

49
Q

measures of where data tends to cluster

A

measures of central tendency

50
Q

mathematical average influenced by outliers

A

mean

51
Q

middle value in an array, not affected by the magnitude of extreme values

A

median

52
Q

most frequent value (a dataset can have two or more such values)

A

mode
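
A short comparison of the three measures using Python's standard `statistics` module. The data are hypothetical and include one outlier to show that the mean moves with extreme values while the median and mode do not.

```python
import statistics

# Hypothetical observations with one large outlier (100)
values = [2, 3, 3, 5, 100]

mean = statistics.mean(values)        # pulled upward by the outlier
median = statistics.median(values)    # middle value, unaffected by the outlier's size
modes = statistics.multimode(values)  # list of all most frequent values

print(mean, median, modes)

# A dataset can have more than one mode
print(statistics.multimode([1, 1, 2, 2, 3]))
```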

53
Q

central tendency measurement commonly used with ordinal data

A

median

54
Q

central tendency measurement commonly used with interval/ratio data

A

mean

55
Q

central tendency measurement commonly used with nominal data

A

mode

56
Q

measures of how spread out data is

A

measures of dispersion

57
Q

average of the absolute values of the differences between each observation and the sample mean

A

mean absolute deviation (MAD)
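
A minimal sketch of the MAD calculation on hypothetical data: take each observation's absolute distance from the mean, then average those distances.

```python
# Hypothetical observations, invented for illustration
values = [2, 4, 6, 8, 10]

n = len(values)
mean = sum(values) / n                           # sample mean
abs_deviations = [abs(x - mean) for x in values] # distance of each observation from the mean
mad = sum(abs_deviations) / n                    # average absolute deviation

print(mean, abs_deviations, mad)
```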

58
Q

average of the squared differences between each observation and the mean (dividing by n - 1 for a sample)

A

variance

59
Q

measures variability of the dataset

A

variance
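
As an illustration, the standard `statistics` module distinguishes population variance (divide by n) from sample variance (divide by n - 1). The data below are hypothetical.

```python
import statistics

# Hypothetical observations, invented for illustration
values = [2, 4, 6, 8, 10]

mean = sum(values) / len(values)
sum_sq = sum((x - mean) ** 2 for x in values)   # sum of squared deviations

pop_var = statistics.pvariance(values)      # sum_sq / n
sample_var = statistics.variance(values)    # sum_sq / (n - 1)
sample_sd = statistics.stdev(values)        # square root of the sample variance

print(sum_sq, pop_var, sample_var, sample_sd)
```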

60
Q

percentage of variation with respect to the mean

A

coefficient of variation

61
Q

if data are roughly normally distributed, then about 68%, 95%, and 99.7% of observations fall within 1, 2, and 3 standard deviations of the mean

A

empirical rule
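
The familiar 68 / 95 / 99.7 percentages of the empirical rule can be checked against the standard normal distribution. This sketch uses `statistics.NormalDist` from the standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Probability that a normal observation lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```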

62
Q

describes how much of a distribution is off center

A

skewness

63
Q

describes relationship of tails of a distribution to its center

A

kurtosis
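
A rough sketch of skewness and excess kurtosis computed from central moments. It assumes the simple population-moment formulas (no small-sample adjustment), which may differ from the adjusted sample formulas used in some textbooks. The function names and data are hypothetical.

```python
def central_moment(values, order):
    """Average of (x - mean) raised to the given power."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)

def skewness(values):
    # Third central moment scaled by the cubed standard deviation
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    # Fourth central moment scaled by the squared variance, minus 3 (the normal benchmark)
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 10]   # hypothetical data with a long right tail

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```

Positive excess kurtosis corresponds to a leptokurtic distribution and negative excess kurtosis to a platykurtic one.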

64
Q

describe the mean and median in a normal distribution

A

they are equal

65
Q

tall and skinny distribution

A

leptokurtic

66
Q

wide and flat distribution

A

platykurtic

67
Q

range between the first and third quartiles (the middle 50% of the distribution)

A

interquartile range
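
A minimal sketch using `statistics.quantiles` with its default "exclusive" method. Other quartile conventions can give slightly different cut points, so treat the exact values as dependent on that choice. The data are hypothetical.

```python
import statistics

# Hypothetical observations, invented for illustration
values = [1, 2, 3, 4, 5, 6, 7]

q1, q2, q3 = statistics.quantiles(values, n=4)  # three cut points: Q1, median, Q3
iqr = q3 - q1                                   # spread of the middle 50% of the data

print(q1, q2, q3, iqr)
```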

68
Q

what is coefficient of variation (CV) used for?

A

to compare datasets with different scales

69
Q

what is population vs sample coefficient of variation

A

pop CV = sigma / mu
sample CV = s / x bar
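
To illustrate why CV is useful across different scales, the sketch below computes the sample CV (s divided by x bar) for two hypothetical datasets where one is simply the other multiplied by 100. The helper name `sample_cv` is an assumption for illustration.

```python
import statistics

def sample_cv(values):
    """Sample coefficient of variation: sample standard deviation divided by sample mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical datasets on very different scales
small_scale = [2, 4, 6, 8, 10]
large_scale = [200, 400, 600, 800, 1000]

print(statistics.stdev(small_scale), statistics.stdev(large_scale))  # very different
print(sample_cv(small_scale), sample_cv(large_scale))                # essentially the same
```

The standard deviations differ by a factor of 100, but the CVs match because the scale cancels in the ratio.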

70
Q
A