Vol. 1 LM2 Data Visualization Flashcards
Benefit
word cloud
Data Visualization
allows us to quickly perceive the most frequent terms among the given text to provide information about the nature of the text
identify data visualization
a chart to depict the frequency of unstructured data—particularly, textual data
Data Visualization
word cloud
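A word cloud is driven by term frequencies, so the counting step can be sketched on its own. This is a minimal illustration and not a full word-cloud renderer; the sample sentence, the stopword list, and the helper name `word_frequencies` are assumptions made for the example.

```python
from collections import Counter
import re

# Hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = frozenset({"the", "a", "of", "and", "to", "in"})

def word_frequencies(text, stopwords=STOPWORDS):
    """Count how often each non-stopword term appears in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)

# Made-up sample text, for illustration only.
sample = "Rates rise as inflation rises; inflation pressures the rates outlook."
freqs = word_frequencies(sample)

# In a word cloud, the most frequent terms would be drawn largest.
print(freqs.most_common(3))
```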
identify data visualization
it consists of a set of colored rectangles to represent distinct groups, and the area of each rectangle is proportional to the value of the corresponding group
Data Visualization
tree-map
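The "area proportional to value" rule of a tree-map can be illustrated by computing each group's share of the total area. This is a sketch only: the sector values are hypothetical and `area_shares` is a helper name introduced for the example. It computes the proportions and does not lay out rectangles.

```python
def area_shares(values):
    """Return each group's fraction of the total area in a tree-map."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    return {group: value / total for group, value in values.items()}

# Hypothetical portfolio values by sector, for illustration only.
sector_values = {"Technology": 50.0, "Financials": 30.0, "Energy": 20.0}
shares = area_shares(sector_values)

for group, share in shares.items():
    print(f"{group}: {share:.0%} of the total area")
```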
identify data visualization
alternative form for presenting the joint frequency distribution of two categorical variables
Data Visualization
stacked bar chart
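The joint frequency distribution behind a stacked bar chart can be tabulated before any plotting. The sketch below assumes a small made-up sample of (sector, size) observations: each bar corresponds to one sector, and its stacked segments are the counts per size category.

```python
from collections import Counter

# Hypothetical observations of two categorical variables: (sector, size).
observations = [
    ("Tech", "Large"), ("Tech", "Small"), ("Energy", "Large"),
    ("Tech", "Large"), ("Energy", "Small"),
]

joint = Counter(observations)  # joint frequency of each (sector, size) pair
sectors = sorted({sector for sector, _ in observations})
sizes = sorted({size for _, size in observations})

# One row per bar; each inner value is the height of one stacked segment.
table = {s: {z: joint[(s, z)] for z in sizes} for s in sectors}

# Total bar height is the marginal frequency of the sector.
bar_heights = {s: sum(table[s].values()) for s in sectors}
print(table)
print(bar_heights)
```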
identify data visualization
two categorical variables
Data Visualization
- grouped bar chart, an enhanced version of the bar chart
- also known as a clustered bar chart
Concept
The frequency distribution of categorical data can be plotted in this type of graph
Data Visualization
bar chart
Describe
line chart
Data Visualization
- is a type of graph used to visualize ordered observations
- a line chart is often used to display the change of data series over time
identify data visualization
a type of graph for visualizing the joint variation in two numerical variables
Data Visualization
scatter plot
identify data visualization
is a useful tool for organizing scatter plots between pairs of variables, making it easy to inspect all pairwise relationships in one combined visual
Data Visualization
scatter plot matrix
Identify data visualization
is a type of graphic that organizes and summarizes data in a tabular format and represents them using a color spectrum
Data Visualization
heat map
Evaluating data visuals
You are examining a scatter plot of monthly stock returns, similar to the one in Exhibit 31, for two technology companies: one is a hardware manufacturer, and the other is a software developer. The scatter plot shows a strong positive association among their returns.
Describe what other information the scatter plot can provide.
Data Visualization
Besides the sign and degree of association of the stocks’ returns, the scatter plot can provide a visual representation of whether the association is linear or non-linear, the maximum and minimum values for the return observations, and an indication of which observations may have extreme values (i.e., are potential outliers).
What to explore or present
questions: explore comparison
Data Visualization
comparison among categories vs. over time
What to explore or present
explore comparison: among categories
Data Visualization
- bar chart
- tree map
- heat map
What to explore or present
explore comparison: over time
Data Visualization
- line chart (two variables)
- bubble line chart (three variables)
What to explore or present
question: explore distribution
Data Visualization
is the distribution over:
* numerical data
* categorical data
* unstructured data
What to explore or present
explore distribution: numerical data
Data Visualization
- histogram
- frequency polygon
- cumulative distribution chart
What to explore or present
explore distribution: categorical data
Data Visualization
- bar chart
- tree map
- heat map
What to explore or present
explore distribution: unstructured data
Data Visualization
word cloud
What to explore or present
present a relationship
Data Visualization
p. 98
- scatter plot
- scatter plot matrix
- heat map
Selecting visualization types
Explain which type of chart can best provide a quick view of trading volume for the given period.
A portfolio manager plans to buy several stocks traded on a small emerging market exchange but is concerned whether the market can provide sufficient liquidity to support her purchase order size. As the first step, she wants to analyze the daily trading volumes of one of these stocks over the past five years.
P. 99
The five-year history of daily trading volumes contains a large amount of numerical data. Therefore, a histogram is the best chart for grouping these data into frequency distribution bins and for showing a quick snapshot of the shape, center, and spread of the data’s distribution.
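Grouping numerical data into frequency distribution bins, as a histogram does, can be sketched as follows. The volume figures are made up, and the equal-width binning in `histogram_counts` is one common convention assumed for the example, not necessarily the one used in the source reading.

```python
def histogram_counts(data, num_bins):
    """Count observations in equal-width bins spanning min(data) to max(data)."""
    lo, hi = min(data), max(data)
    if hi == lo:
        # All values identical: everything falls in the first bin.
        return [len(data)] + [0] * (num_bins - 1)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical daily trading volumes (thousands of shares).
volumes = [120, 150, 180, 200, 210, 250, 400, 900]
counts = histogram_counts(volumes, num_bins=3)
print(counts)
```

A distribution concentrated in the lowest bin with a few very high-volume days would suggest right skew, which matters when judging whether typical daily liquidity can absorb a large order.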
Selecting visualization types
An analyst is building a model to predict stock market downturns. According to the academic literature and his practitioner knowledge and expertise, he has selected 10 variables as potential predictors. Before continuing to construct the model, the analyst would like to get a sense of how closely these variables are associated with the broad stock market index and whether any pair of variables are associated with each other.
P. 99
To inspect for a potential relationship between two variables, a scatter plot is a good choice. But with 10 variables, plotting individual scatter plots is not an efficient approach. Instead, utilizing a scatter plot matrix would give the analyst a good overview in one comprehensive visual of all the pairwise associations between the variables.
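To see why individual scatter plots are inefficient here, it helps to count the pairs: 10 predictors plus the market index give 11 series and therefore 55 distinct pairwise panels. The sketch below enumerates those pairs and includes a plain Pearson correlation helper as one possible numeric summary of each panel. The variable names and `pearson` are illustrative assumptions, not part of the source.

```python
from itertools import combinations
from math import sqrt

# Hypothetical labels: the market index plus 10 candidate predictors.
names = ["index"] + [f"x{i}" for i in range(1, 11)]

# Each unordered pair corresponds to one panel of a scatter plot matrix.
pairs = list(combinations(names, 2))
print(len(pairs))  # number of distinct pairwise scatter plots

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)

# Tiny made-up series that move together exactly.
r = pearson([1, 2, 3], [2, 4, 6])
```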
Selecting visualization types
Central Bank members meet regularly to assess the economy and decide on any interest rate changes. Minutes of their meetings are published on the Central Bank’s website. A quantitative researcher wants to analyze the meeting minutes for use in building a model to predict future economic growth.
p. 99
Since the meeting minutes consist of textual data, a word cloud would be the most suitable tool to visualize the textual data and facilitate the researcher’s understanding of the topic of the text as well as the sentiment, positive or negative, it may convey.