Data Exploration and Visualisation Flashcards

1
Q

What is the primary goal of data exploration and visualization?

A

To understand your data.

2
Q

What are the two main categories in data exploration?

A

1. Data visualization
2. Summary statistics

3
Q

Why is data visualization important in AI?

A

It communicates complex information effectively and helps summarize data and reveal its structures, relationships, differences, and abnormalities.

4
Q

What did F.J. Anscombe emphasize in 1973 about data analysis?

A

“Make both calculations and graphs. Both sorts of output should be studied; each will contribute to understanding.”
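
Anscombe's point can be checked with the quartet he published, which ships with base R as the `anscombe` data frame. The sketch below is an illustration added to this card, not part of the original deck: the four x/y pairs have nearly identical summary statistics, and only the graphs show how different they are.

```r
# Anscombe's quartet ships with base R (datasets package).
data(anscombe)

# Summary statistics for each of the four x/y pairs.
idx <- 1:4
stats <- data.frame(
  mean_x = sapply(idx, function(i) mean(anscombe[[paste0("x", i)]])),
  mean_y = sapply(idx, function(i) mean(anscombe[[paste0("y", i)]])),
  cor_xy = sapply(idx, function(i) cor(anscombe[[paste0("x", i)]],
                                       anscombe[[paste0("y", i)]]))
)
print(round(stats, 2))

# The graphs tell the four sets apart where the numbers cannot.
op <- par(mfrow = c(2, 2))
for (i in idx) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i))
}
par(op)
```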

5
Q

Name one experiment related to design elements of graphs.

A

In the 1980s, William Cleveland and Robert McGill measured how accurately humans perceive quantitative information from different graphical cues.

6
Q

What types of chart are suitable for showing relationships?

A

Scatter plots and line charts.

7
Q

When should pie charts be used?

A

To show the composition of categorical data as a proportion of the whole.

8
Q

Why is learning the grammar of graphics important?

A

It helps create and think about new, improved graphics, providing a theoretical foundation instead of relying on special cases.

9
Q

What is the ggplot2 package in R used for?

A

Creating plots using the Grammar of Graphics framework.
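
As a minimal sketch of what that looks like in practice, the plot below is assembled from the grammar's building blocks: a data frame, aesthetic mappings, and a geometry layer. The choice of the `mpg` data frame (bundled with ggplot2) and of these two columns is an assumption made for illustration only.

```r
library(ggplot2)

# Data: the mpg data frame bundled with ggplot2.
# Aesthetics: map engine displacement to x and highway mileage to y.
# Geometry: draw one point per car.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  labs(x = "Engine displacement (litres)", y = "Highway miles per gallon")

print(p)
```

Swapping `geom_point()` for another geometry, or adding a second layer with `+`, changes the chart without touching the data or the mappings.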

10
Q

How can you install the ggplot2 package in R?

A

install.packages("tidyverse")  # installs ggplot2 along with the rest of the tidyverse
or
install.packages("ggplot2")    # installs ggplot2 only

11
Q

What is feature transformation?

A

Mapping a set of values for a feature to a new set of values to simplify data representation for analysis.
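
Two common transformations, sketched in base R on a hypothetical `income` feature (the values are made up for illustration). Which transformation suits a given feature depends on the data and the analysis.

```r
# Hypothetical, right-skewed feature (made-up values for illustration).
income <- c(20000, 35000, 50000, 120000, 1000000)

# Log transformation: compresses large values so the scale is easier to compare.
log_income <- log10(income)

# Min-max scaling: maps the values onto the interval [0, 1].
scaled_income <- (income - min(income)) / (max(income) - min(income))

print(round(log_income, 2))
print(round(scaled_income, 3))
```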

12
Q

What are some common data quality issues?

A

Missing values
Duplicate data
Inconsistent data
Noise
Outliers
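
Several of these issues can be checked with a few lines of base R. The sketch below uses a hypothetical `survey` data frame with made-up values, and the 1.5 * IQR rule is only one common rule of thumb for flagging outliers.

```r
# Hypothetical survey data (made-up values) with several quality problems.
survey <- data.frame(
  id  = c(1, 2, 2, 3, 4, 5),
  age = c(34, 29, 29, NA, 41, 230)
)

# Missing values: count NAs per column.
missing_per_column <- colSums(is.na(survey))

# Duplicate data: rows that repeat an earlier row exactly.
duplicate_rows <- which(duplicated(survey))

# Outliers: values more than 1.5 * IQR beyond the quartiles.
q <- quantile(survey$age, c(0.25, 0.75), na.rm = TRUE)
fence <- 1.5 * (q[2] - q[1])
outlier_rows <- which(survey$age < q[1] - fence | survey$age > q[2] + fence)

print(missing_per_column)
print(duplicate_rows)
print(outlier_rows)
```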

13
Q

Name one example of data wrangling.

A

Removing data with missing values.
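
In base R this can be done with `complete.cases()` or `na.omit()`. The `measurements` data frame below is hypothetical, with made-up values.

```r
# Hypothetical measurements (made-up values); NA marks a missing value.
measurements <- data.frame(
  sample = c("a", "b", "c", "d"),
  weight = c(1.2, NA, 0.9, 1.5),
  length = c(10.1, 9.8, NA, 11.0)
)

# Keep only the rows with no missing value in any column.
complete <- measurements[complete.cases(measurements), ]

# na.omit() keeps the same rows and also records which ones were dropped.
same_rows <- na.omit(measurements)

print(complete)
```

Dropping rows discards the observed values in those rows as well, so it is worth checking how many rows are lost before committing to it.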

14
Q

What is the diamonds dataset?

A

A dataset containing the prices and attributes of diamonds, such as carat, cut, color, clarity, and dimensions.
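
Assuming the card refers to the `diamonds` data frame bundled with ggplot2 (the card itself does not say), a first look might combine a structure listing, summary statistics, and one plot. The choice of columns to plot is an illustration only.

```r
library(ggplot2)

# Structure: one row per diamond, one column per attribute.
str(diamonds)

# Summary statistics for price.
print(summary(diamonds$price))

# Price against carat, coloured by cut quality.
p <- ggplot(diamonds, aes(x = carat, y = price, colour = cut)) +
  geom_point(alpha = 0.3)
print(p)
```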
