Lecture 8 Flashcards

1
Q

Multivariate Data

A

Univariate: the analysis is based on a single variable.
Bivariate: the analysis is based on two variables.
Multivariate: the analysis is based on more than two variables.

2
Q

Point Based Techniques

A

Point-based techniques project records from an n-dimensional data space to an arbitrary k-dimensional display space, so that each data record maps to a k-dimensional point.
For each record, a graphical representation (a mark) is drawn at its associated k-dimensional point.

This can be achieved in two ways:
Scatterplots and scatterplot matrices
Force-based techniques

3
Q

Scatterplots and Scatterplot Matrices

A

When the data has more dimensions than a single scatterplot can show, the options are:

Dimension subsetting: allowing the user to select a subset of the dimensions to display.

Dimension reduction: using techniques such as principal component analysis (PCA) to transform the high-dimensional data into data of lower dimension.

Dimension embedding: mapping dimensions to graphical attributes other than position, such as color, size, or shape.

Multiple displays: showing several plots, either superimposed or juxtaposed, each of which contains some of the dimensions.
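As an illustration of the dimension reduction option, here is a minimal PCA sketch. It assumes NumPy is available and that the data is a numeric array with one record per row; the function name pca_project is hypothetical and chosen only for this example.

```python
import numpy as np

def pca_project(data, k=2):
    """Project records (rows) onto their first k principal components."""
    # PCA works on mean-centered data.
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing
    # singular value (i.e. decreasing variance captured).
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:k].T
    # Fraction of the total variance captured by each kept component.
    variance = singular_values ** 2
    explained = variance[:k] / variance.sum()
    return projected, explained
```

The projected array has k columns, so with k=2 each record can be drawn directly as a point in an ordinary scatterplot. The explained fractions give a rough idea of how much structure the reduced plot can still show.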

4
Q

Scatterplots and Scatterplot Matrices (Continued)

A

A scatterplot matrix uses the multiple-display approach.
It consists of a grid of scatterplots with N² cells, where N is the number of dimensions.
Every pairwise plot therefore appears twice, once on each side of the diagonal, with the two axes swapped.
This can be seen in the scatterplot matrix on the next slide, which plots the well-known Iris dataset. It has four variables: sepal_length, sepal_width, petal_length and petal_width. The dataset is used to distinguish three species of iris flower: Setosa, Versicolor and Virginica.
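The grid layout described above can be sketched in a few lines. This only enumerates the cells and the variables on each axis; it does not draw anything. The function name and the dictionary keys are hypothetical.

```python
def scatterplot_matrix_cells(dimensions):
    """List every cell of an N x N scatterplot matrix."""
    cells = []
    for row, y_dim in enumerate(dimensions):
        for col, x_dim in enumerate(dimensions):
            cells.append({
                "row": row,
                "col": col,
                "x": x_dim,               # variable on the horizontal axis
                "y": y_dim,               # variable on the vertical axis
                "diagonal": row == col,   # a variable paired with itself
            })
    return cells

# The four Iris variables named on the card.
iris_dims = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
cells = scatterplot_matrix_cells(iris_dims)
```

With four variables this gives 16 cells: 4 on the diagonal and 12 off it, so each of the 6 distinct variable pairs appears twice with x and y exchanged.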

5
Q

Study Scatterplots and Scatterplot Matrices Graphic

A

Do it

6
Q

Force Based Method

A

The key goal when projecting high-dimensional points to a 2D or 3D display is to preserve the features and characteristics of the original data through the projection.
This is not always possible when the dimensionality of the data is very high.
Even so, force-based scaling can be used to reduce the data to a displayable number of dimensions.
Multidimensional scaling (MDS) is one such method.
The stress, which measures how much the distances between points in the projected space differ from those in the original space, is calculated at the end of the process.
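The idea of reducing stress step by step can be sketched as follows. This is a minimal illustration, not necessarily the variant shown in the lecture: it assumes Euclidean distances, uses the raw stress (sum of squared distance differences over all pairs), and moves the points by plain gradient descent with an assumed step size of 1/(2n). All function names are hypothetical.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def raw_stress(target, layout):
    """Sum over pairs of (layout distance - original distance) squared."""
    d = pairwise_distances(layout)
    upper = np.triu_indices(len(layout), k=1)
    return float(((d[upper] - target[upper]) ** 2).sum())

def force_based_mds(data, k=2, n_iter=200, seed=0):
    """Place records in k dimensions by iteratively lowering the stress."""
    rng = np.random.default_rng(seed)
    n = len(data)
    target = pairwise_distances(data)      # distances to preserve
    layout = rng.normal(size=(n, k))       # random starting positions
    layout -= layout.mean(axis=0)
    step = 1.0 / (2 * n)                   # assumed step size
    history = [raw_stress(target, layout)]
    for _ in range(n_iter):
        diff = layout[:, None, :] - layout[None, :, :]
        d = np.sqrt((diff ** 2).sum(axis=-1))
        np.fill_diagonal(d, 1.0)           # avoid 0/0; diagonal diff is 0
        # Pairs that are too far apart attract, pairs too close repel.
        coeff = (d - target) / d
        grad = 2.0 * (coeff[:, :, None] * diff).sum(axis=1)
        layout -= step * grad
        history.append(raw_stress(target, layout))
    return layout, history
```

The history list records the stress after every iteration, so the final entry is the value reported at the end of the process. A stress that stays large suggests the low-dimensional layout cannot represent the original distances well.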

7
Q

Study Force Based Method Graphic

A

Do it Slide 8

8
Q

Line Based Technique

A

In line-based methods, the points corresponding to a particular record or dimension are linked together with straight or curved lines.

These lines not only reinforce the relationships among the data values, but also convey perceivable features of the data through slopes, curvature, crossings, etc.

Popular line-based techniques for representing multivariate data are:
Line graphs
Parallel coordinates

9
Q

Line Graphs

A

A line graph is a univariate visualization technique, but it can be extended to multivariate data by either superimposing or juxtaposing the visual representations of the individual variables.

For a modest number of data dimensions, the line plots can be drawn on a common set of axes, differentiating the dimensions by color, line style, width, or other graphical attributes.

As the number of dimensions increases, or when the dimensions overlap significantly, superimposing becomes more problematic.

10
Q

Study Line Graphs Graphics

A

Slides 11 and 12

11
Q

Study Parallel Coordinates Graphic

A

Slide 14

12
Q

Parallel Coordinates

A

Parallel coordinates are used extensively for multivariate data analysis.
The basic idea is that the axes, rather than being orthogonal, are parallel: evenly spaced vertical or horizontal lines, one per dimension, arranged in a chosen ordering of the dimensions.
A data point is plotted as a polyline that crosses each axis at a position proportional to its value for that dimension.
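The mapping from a record to its polyline can be sketched as below. The sketch assumes each axis is scaled independently from its minimum to its maximum, which is a common choice but not stated on the card. The function name and the sample records are made up for illustration.

```python
def parallel_coordinates_polylines(records, dimension_order):
    """Return one polyline per record as (axis_index, position) vertices.

    position runs from 0.0 (axis minimum) to 1.0 (axis maximum).
    """
    lows = {d: min(r[d] for r in records) for d in dimension_order}
    highs = {d: max(r[d] for r in records) for d in dimension_order}
    polylines = []
    for record in records:
        line = []
        for axis_index, dim in enumerate(dimension_order):
            span = highs[dim] - lows[dim]
            # A constant dimension has no range; put it mid-axis.
            pos = (record[dim] - lows[dim]) / span if span else 0.5
            line.append((axis_index, pos))
        polylines.append(line)
    return polylines

# Made-up records with three dimensions, for illustration only.
records = [
    {"a": 1.0, "b": 10.0, "c": 5.0},
    {"a": 3.0, "b": 20.0, "c": 5.0},
    {"a": 2.0, "b": 30.0, "c": 5.0},
]
polylines = parallel_coordinates_polylines(records, ["a", "b", "c"])
```

Changing dimension_order changes which axes are adjacent, and therefore which pairwise relationships show up as line crossings or parallel segments.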

13
Q

Region Based Techniques

A

In region-based techniques, filled polygons are used to convey values, based on shape, size, color, or other attributes.

14
Q

Heat Map

A

A heat map is created by displaying the table of record values using color rather than text.

In this technique, all data values are mapped to the same normalized color space, and each value is rendered as a colored square or rectangle.

A well-chosen range of colors makes the technique more useful.

Rows and columns can be reordered to expose features of the data.
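The steps on this card can be sketched without any plotting library. The sketch assumes a single gray scale shared by the whole table (darker means larger) and uses the row mean as one simple reordering criterion; both are illustrative choices, and all function names are hypothetical.

```python
def normalize_table(table):
    """Scale every value into [0, 1] using one shared min and max."""
    flat = [v for row in table for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    return [[(v - lo) / span if span else 0.0 for v in row] for row in table]

def to_gray(t):
    """Map a normalized value to an RGB gray; larger values are darker."""
    level = round(255 * (1.0 - t))
    return (level, level, level)

def reorder_rows_by_mean(table):
    """Sort rows by their mean so similar rows end up next to each other."""
    return sorted(table, key=lambda row: sum(row) / len(row))

def heatmap_colors(table):
    """Return the grid of cell colors for a reordered, normalized table."""
    ordered = reorder_rows_by_mean(table)
    return [[to_gray(t) for t in row] for row in normalize_table(ordered)]
```

Because the normalization is shared across the table, the same color always means the same value in every cell, which is what makes cells comparable across rows and columns.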
