Data Analysis week 2 Flashcards

1
Q

Why would we want to visualize data?

A

To show patterns in the data and to summarize large quantities of data.

2
Q

When is a data point considered an outlier in R?

A

If the point lies more than 1.5 interquartile ranges below the first quartile or above the third quartile.
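
The 1.5 × IQR rule can be checked by hand. Below is a minimal Python sketch of the rule, not R's own code: the function name `iqr_outliers` and the sample data are made up for illustration, and the quartiles are computed by linear interpolation, which can differ slightly from the hinges that R's boxplot uses on small samples.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # method="inclusive" interpolates linearly between the sorted data points
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


data = [2, 3, 4, 5, 6, 7, 8, 30]  # made-up sample
print(iqr_outliers(data))  # → [30]
```

Here Q1 = 3.75 and Q3 = 7.25, so the IQR is 3.5 and the fences sit at -1.5 and 12.5; only 30 falls outside them.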

3
Q

What is the difference between a stripchart and a beeswarm?

A

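A beeswarm displays the point density more clearly. The amount of jitter, and so the y of each data point, is adjusted to the local point density: points are shifted only as far as needed to avoid overlapping their neighbours. In a jittered stripchart, the data points get a random y that does not depend on density.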

4
Q

What properties of data does a boxplot show?

A

The range, the quartiles (including the median), and the outliers.

5
Q

What does the box of the boxplot itself represent?

A

The interquartile range (the middle 50% of the data). In a horizontal boxplot, the left edge of the box is the first quartile and the right edge is the third quartile. The line inside the box is the second quartile, the median.

6
Q

What are three ways of transforming data?

A

Binning, log transform, and logit transform.

7
Q

What are four binning methods?

A

Equal-width binning, frequency binning, custom binning, and quantile binning.

8
Q

What is binning?

A

Binning divides continuous numeric data into intervals (bins), which turns the numeric data into ordered data.
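
To make the idea concrete, here is a rough Python sketch of two of the binning methods, equal-width and quantile binning. The function names and the `ages` sample are hypothetical, and the bin index (0, 1, 2, ...) stands in for the ordered category; this is one possible way to compute the bins, not the course's implementation.

```python
from statistics import quantiles


def equal_width_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 using intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins  # assumes the values are not all identical
    # min(...) keeps the maximum value in the last bin instead of opening a new one
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def quantile_bins(values, n_bins):
    """Assign each value a bin index using quantile cut points."""
    cuts = quantiles(values, n=n_bins, method="inclusive")
    # the bin index is the number of cut points the value exceeds
    return [sum(v > c for c in cuts) for v in values]


ages = [1, 2, 3, 4, 5, 6, 7, 100]  # made-up, strongly skewed sample
print(equal_width_bins(ages, 4))  # → [0, 0, 0, 0, 0, 0, 0, 3]
print(quantile_bins(ages, 4))     # → [0, 0, 1, 1, 2, 2, 3, 3]
```

With skewed data, equal-width bins put almost everything into the first bin, while quantile bins each hold the same number of points.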

9
Q

What does a log transform do (especially to the axis)?

A

It shows the data on a log scale. With a base-10 log, if the axis says 3.5, the actual value is 10^3.5 (about 3162).
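
The arithmetic behind the axis labels can be checked with a few lines of Python. This sketch assumes a base-10 logarithm, and the `skewed` list is made up for illustration.

```python
import math

# An axis label of 3.5 on a base-10 log scale stands for the value 10^3.5
value = 10 ** 3.5
print(round(value, 2))               # → 3162.28
print(round(math.log10(value), 6))   # → 3.5

# Values spanning six orders of magnitude become evenly comparable numbers
skewed = [1, 10, 100, 1000, 1_000_000]  # made-up sample
logged = [round(math.log10(v), 6) for v in skewed]
print(logged)  # → [0.0, 1.0, 2.0, 3.0, 6.0]
```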

10
Q

Why would you want to transform data?

A

It can reveal more information about skewed data that would otherwise have a ‘weird’ distribution, for example when one extreme outlier hides the rest of the data.

11
Q

What does a logit transform do?

A

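It is applied to fractions (values between 0 and 1) and maps a fraction p to log(p / (1 - p)). This makes it possible to show very small and very large fractions in the same graph.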
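
A small Python sketch shows how the logit spreads out fractions near 0 and 1. It assumes the natural logarithm, which is the usual convention; the course may use a different base, and the example fractions are made up.

```python
import math


def logit(p):
    """Log-odds of a fraction p, defined only for 0 < p < 1."""
    return math.log(p / (1 - p))


# Fractions crowded near 0 and 1 get spread out symmetrically around 0
for p in [0.001, 0.01, 0.5, 0.99, 0.999]:
    print(p, round(logit(p), 2))
```

On the logit scale, 0.5 maps to 0, and a fraction p and its complement 1 - p end up at the same distance on either side of it.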

12
Q

What are three unwritten rules for visualizing data?

A

Maintain graphical integrity, keep the data-ink ratio high, and avoid chartjunk.

13
Q

What are two points in maintaining graphical integrity?

A

You should preserve proportionality, and bar charts should start at 0.
