Vol. 1 LM2 Data Types Flashcards by John Waller

Concept

are values that represent measured or counted quantities as a number

p. 61

numerical data
OR
quantitative data

Describe

numerical data

p. 61

are values that represent measured or counted quantities as a number

Concept

are data that can be measured and can take on any numerical value in a specified range of values

continuous data

Describe

continuous data

p. 61

are data that can be measured and can take on any numerical value in a specified range of values

Concept

are numerical values that result from a counting process.

p. 61

discrete data

Describe

discrete data

p. 61

are numerical values that result from a counting process.

Concept

are categorical values that are not amenable to being organized in a logical order

p. 61

nominal data

Describe

nominal data

p. 61

are categorical values that are not amenable to being organized in a logical order

Concept

are categorical values that can be logically ordered or ranked

p. 62

ordinal data

Identify data type

Cash dividends per share paid by a public company. Note that cash dividends are a distribution paid to shareholders based on the number of shares owned.

p. 63

Cash dividends per share are continuous data since they can take on any non-negative value.

Identify data type

Credit ratings for corporate bond issues. As background, credit ratings gauge the bond issuer’s ability to meet the promised payments on the bond. Bond rating agencies typically assign bond issues to discrete categories that are in descending order of credit quality (i.e., increasing probability of non-payment or default).

p. 63

credit ratings are ordinal data

Identify data type

Hedge fund classification types. Note that hedge funds are investment vehicles that are relatively unconstrained in their use of debt, derivatives, and long and short investment strategies. Hedge fund classification types group hedge funds by the kind of investment strategy they pursue.

p. 63

Hedge fund classification types are nominal data. Each type groups together hedge funds with similar investment strategies. In contrast to credit ratings for bonds, however, hedge fund classification schemes do not involve a ranking. Thus, such classification schemes are not ordinal data.
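
The contrast between ordinal and nominal data in the two cards above can be sketched in a few lines. This is only an illustration: the rating scale, the bond ratings, and the fund type labels are hypothetical values, not taken from the reading.

```python
# Hypothetical rating scale, listed from highest to lowest credit quality.
RATING_ORDER = ["AAA", "AA", "A", "BBB", "BB", "B"]
rank = {rating: position for position, rating in enumerate(RATING_ORDER)}

# Ordinal data: the categories have a logical order, so ranking is meaningful.
bond_ratings = ["BBB", "AAA", "B", "AA"]
sorted_ratings = sorted(bond_ratings, key=lambda rating: rank[rating])

# Nominal data: the labels only group observations. Alphabetical order is a
# display convenience and does not rank one strategy above another.
fund_types = ["Macro", "Event driven", "Equity hedge"]
alphabetical = sorted(fund_types)

print(sorted_ratings)
print(alphabetical)
```

Sorting the ratings needs an explicit rank lookup because the order lives in the rating scale, not in the label strings themselves.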

Another data classification standard is based on how data are collected, and it categorizes data into three types

p. 63

cross-sectional data
time series data
panel data

Concept

is a characteristic or quantity that can be measured, counted, or categorized and is subject to change.

p. 63

variable

Describe

variable

p. 63

is a characteristic or quantity that can be measured, counted, or categorized and is subject to change

Concept

are a sequence of observations for a single observational unit of a specific variable collected over time and at discrete and typically equally spaced intervals of time

p. 64

time-series data

Describe

time-series data

p. 64

are a sequence of observations for a single observational unit of a specific variable collected over time and at discrete and typically equally spaced intervals of time

Concept

are a list of observations of a specific variable from multiple observational units

p. 64

cross-sectional data

Describe

cross-sectional data

p. 64

are a list of observations of a specific variable from multiple observational units

Concept

are a mix of time-series and cross-sectional data that are frequently used in financial analysis and modeling.
These data consist of observations through time on one or more variables for multiple observational units

p. 64

panel data

Concept

the observational data in this data type are usually organized in a matrix format called a data table

p. 64

panel data
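
A small sketch may help connect the three collection-based types. It assumes a hypothetical panel of quarterly EPS figures for two made-up companies; the helper names `time_series` and `cross_section` are illustrative, not from the reading.

```python
# Hypothetical panel: (company, period) -> EPS. All values are invented.
panel = {
    ("Company A", "Q1"): 1.10, ("Company A", "Q2"): 1.25,
    ("Company B", "Q1"): 0.80, ("Company B", "Q2"): 0.95,
}

def time_series(panel, unit):
    """Observations for one observational unit across all periods."""
    return {period: value for (u, period), value in panel.items() if u == unit}

def cross_section(panel, period):
    """Observations for all observational units at one point in time."""
    return {unit: value for (unit, p), value in panel.items() if p == period}

print(time_series(panel, "Company A"))
print(cross_section(panel, "Q2"))
```

Fixing the company yields a time series, and fixing the period yields a cross-section, which is why panel data are described as a mix of the two.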

Concept

are highly organized in a pre-defined manner, usually with repeating patterns

p. 64

structured data

Describe

structured data

p. 64

are highly organized in a pre-defined manner, usually with repeating patterns

Concept

typical format of this type of data is a one-dimensional array or a two-dimensional table or matrix

p. 64

structured data

# Concept are data that do not follow any conventionally organized forms, such as financial news or company filings. ## Footnote p. 65

**unstructured data**

# Describe **unstructured data** ## Footnote p. 65

* are data that do not follow any conventionally organized forms
* DAGs are a format for dealing with unstructured data
* JSON is a format for semi-structured data

1. Which of the following is most likely to be structured data?
A. Social media posts where consumers are commenting on what they think of a company’s new product.
B. Daily closing prices during the past month for all companies listed on Japan’s Nikkei 225 stock index.
C. Audio and video of a CFO explaining her company’s latest earnings announcement to securities analysts.
## Footnote p. 67

B. Daily closing prices are structured data. Because they cover many companies over many days, they form panel data rather than a single time series.

Which of the following statements describing panel data is most accurate?
A. It is a sequence of observations for a single observational unit of a specific variable collected over time at discrete and equally spaced intervals.
B. It is a list of observations of a specific variable from multiple observational units at a given point in time.
C. It is a mix of time-series and cross-sectional data that are frequently used in financial analysis and modeling.
## Footnote p. 67

C. it is a mix of time-series and cross-sectional data

Which of the following data series is least likely to be sortable by values?
A. Daily trading volumes for stocks listed on the Shanghai Stock Exchange.
B. EPS for a given year for technology companies included in the S&P 500 Index.
C. Dates of first default on bond payments for a group of bankrupt European manufacturing companies.

C. dates are ordinal data that can be sorted by chronological order, but not by value

Which of the following best describes a time series?
A. Daily stock prices of the XYZ stock over a 60-month period.
B. Returns on four-star rated Morningstar investment funds at the end of the most recent month.
C. Stock prices for all stocks in the FTSE100 on 31 December of the most recent calendar year.
## Footnote p. 67

A. a time series is a sequence of observations of a specific variable collected over time (60 months)

# Concept data available in their original format, typically unusable by humans or computers ## Footnote p. 67

**raw data**

# Concept the simplest format for representing a collection of data of the same data type, which is suitable for a single variable ## Footnote p. 68

**one-dimensional array** ex. vectors

# Concept summarizes central tendency and spread variation in the data's distribution ## Footnote p. 68

**descriptive statistics**

# Describe **descriptive statistics** ## Footnote p. 68

summarizes central tendency and spread variation in the data's distribution

# Concept is a tabular display of data constructed either by counting the observations of a variable by distinct values or groups or by tallying the values ## Footnote p. 71

**frequency distribution**

# steps Constructing a frequency distribution of a categorical variable ## Footnote p. 71

1. count the number of observations for each unique value of the variable
2. construct a table listing each unique value and the corresponding counts, and then sort the records
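
The two steps above can be sketched with Python's standard library. The sector labels are hypothetical, and sorting by descending count is just one reasonable choice of sort order.

```python
from collections import Counter

# Hypothetical sector labels for a small portfolio (illustrative values only).
sectors = ["Energy", "Tech", "Tech", "Health", "Energy", "Tech"]

# Step 1: count the number of observations for each unique value.
counts = Counter(sectors)

# Step 2: list each unique value with its count, then sort the records
# (here by count, largest first).
frequency_table = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for value, count in frequency_table:
    print(f"{value:<8}{count}")
```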

# Concept the raw frequency that is the actual number of observations counted for each unique value ## Footnote p. 71

**absolute frequency**

# Describe **absolute frequency**

the raw frequency that is the actual number of observations counted for each unique value

# Concept is calculated as the absolute frequency of each unique value of the variable divided by the total number of observations

**relative frequency**

# Describe **relative frequency**

is calculated as the absolute frequency of each unique value of the variable divided by the total number of observations
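
As a minimal sketch of that calculation, the function below divides each absolute frequency by the total number of observations. The function name `relative_frequencies` and the sample ratings are assumptions made for illustration.

```python
from collections import Counter

def relative_frequencies(observations):
    """Absolute frequency of each unique value divided by the total count."""
    counts = Counter(observations)       # absolute frequencies
    total = len(observations)            # total number of observations
    return {value: count / total for value, count in counts.items()}

# Hypothetical sample of eight categorical observations.
ratings = ["A", "B", "A", "C", "A", "B", "A", "B"]
rel = relative_frequencies(ratings)
print(rel)
```

The relative frequencies of all unique values sum to 1, which is a quick check on the table.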

# pitfalls binning data and constructing intervals ## Footnote p. 74

* if we use too few bins, we will summarize too much and may lose pertinent characteristics
* if we use too many bins, we may not summarize enough, and potentially introduce noise into the data
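
To see the trade-off, the sketch below counts the same observations with different numbers of equal-width bins. It assumes equal-width intervals spanning the minimum to the maximum value and at least two distinct values; the function name `bin_counts` and the return figures are hypothetical.

```python
def bin_counts(values, k):
    """Count observations in k equal-width bins spanning min(values)..max(values)."""
    low, high = min(values), max(values)
    width = (high - low) / k             # assumes high > low
    counts = [0] * k
    for v in values:
        # The maximum value goes in the last bin instead of opening a new one.
        index = min(int((v - low) / width), k - 1)
        counts[index] += 1
    return counts

# Hypothetical monthly returns in percent.
returns = [-4.0, -2.5, -1.0, 0.0, 0.5, 1.0, 2.0, 3.5, 4.0]

print(bin_counts(returns, 2))   # coarse summary
print(bin_counts(returns, 4))   # finer summary
```

With two bins the shape of the distribution is mostly hidden, while a much larger `k` would leave many bins holding zero or one observation.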

# Concept adds up the absolute frequencies as we move from the first bin to the last bin ## Footnote p. 74

**cumulative absolute frequency**

# Describe **cumulative absolute frequency** ## Footnote p. 74

* adds up the absolute frequencies as we move from the first bin to the last bin
* for the last bin, the cumulative absolute frequency will equal the number of observations in the dataset

# Concept is a sequence of partial sums of the relative frequencies ## Footnote p. 74

**cumulative relative frequency**

is a sequence of partial sums of the relative frequencies
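
Both cumulative measures are running totals, so they can be sketched with `itertools.accumulate`. The bin counts below are hypothetical.

```python
from itertools import accumulate

# Hypothetical absolute frequencies, ordered from the first bin to the last.
absolute = [2, 5, 8, 4, 1]
total = sum(absolute)

# Cumulative absolute frequency: running sum of the absolute frequencies.
cumulative_absolute = list(accumulate(absolute))

# Cumulative relative frequency: partial sums of the relative frequencies.
relative = [count / total for count in absolute]
cumulative_relative = list(accumulate(relative))

print(cumulative_absolute)
print([round(c, 2) for c in cumulative_relative])
```

The last cumulative absolute frequency equals the number of observations, and the last cumulative relative frequency equals 1 up to floating-point rounding.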

# Concept is a tabular format that displays the frequency distributions of two or more categorical variables simultaneously and is used for finding patterns between the variables ## Footnote p. 77

**contingency table**

# Concept a **contingency table** for two categorical variables ## Footnote p. 77

two-way table

# Concept A **contingency table** having *R* levels of one variable in rows and *C* levels of the other variable in columns ## Footnote p. 77

*R* x *C* table

# name the data representation ## Footnote p. 78

*5* x *3* contingency table

# Name the data type ## Footnote p. 78

**joint frequencies**

# Name the data type ## Footnote p. 78

**marginal frequencies**
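
A rough sketch of how joint and marginal frequencies arise in a two-way table, assuming a hypothetical sample of (fund style, risk level) pairs. Joint frequencies are the cell counts; marginal frequencies are the row and column totals.

```python
from collections import Counter

# Hypothetical observations: one (fund style, risk level) pair per fund.
observations = [
    ("Growth", "High"), ("Growth", "High"), ("Growth", "Low"),
    ("Value", "Low"), ("Value", "Low"), ("Value", "High"),
    ("Value", "Low"),
]

# Joint frequencies: count of each (row level, column level) combination.
joint = Counter(observations)

# Marginal frequencies: totals for each level of one variable on its own.
row_totals = Counter(style for style, _ in observations)
column_totals = Counter(risk for _, risk in observations)

print(dict(joint))
print(dict(row_totals), dict(column_totals))
```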

# Name the table ## Footnote p. 80

**Confusion Matrix** for Bond Default Prediction Model

# Describe **chi-square test of independence** ## Footnote p. 80

* a way to test for a potential association between categorical variables
* the procedure involves constructing a **contingency table**
* the actual values and expected values are used to derive the chi-square test statistic

# Concept the actual values and expected values from a contingency table are used to derive this value ## Footnote p. 80

chi-square test statistic
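
The statistic can be sketched as follows, using the standard form in which each cell's expected value under independence is (row total × column total) / overall total, and the statistic sums (observed − expected)² / expected over all cells. The function name and the tables are hypothetical, and the sketch assumes every expected value is positive.

```python
def chi_square_statistic(table):
    """Chi-square statistic for an R x C table of observed counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical 2 x 2 tables of observed counts.
associated = [[10, 20], [20, 10]]
independent = [[10, 20], [20, 40]]

print(chi_square_statistic(associated))
print(chi_square_statistic(independent))
```

The result is compared with a chi-square critical value using (R − 1)(C − 1) degrees of freedom; a statistic near zero, as in the second table, is consistent with independence.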

Describe how the contingency table is used to set up a test for independence between fund style and risk level. ## Footnote p. 81

Vol. 1 LM2 Data Types Flashcards

(55 cards)