Exploring Data and R Flashcards

1
Q

a preliminary exploration of the data to better understand its characteristics.

A

Data Exploration

2
Q

are numbers that summarize properties of the data.

A

Summary Statistics

3
Q

is the percentage of the time a given value occurs in the data set.

A

Frequency

4
Q

is the most frequent attribute value.

A

Mode
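
As a rough illustration of the frequency and mode cards, the following R sketch counts how often each value occurs and picks out the most frequent one. The `eye_color` vector is made-up sample data, not from the deck, and base R has no built-in function for the statistical mode, so `table()` is used here as one possible approach.

```r
# Hypothetical sample of a categorical attribute (illustration only)
eye_color <- c("brown", "blue", "brown", "green", "brown", "blue")

# Absolute count of each distinct value
counts <- table(eye_color)

# Frequency: percentage of the time each value occurs
freq_pct <- 100 * prop.table(counts)

# Mode: the value(s) with the highest count (ties return several values)
mode_value <- names(counts)[counts == max(counts)]

print(freq_pct)
print(mode_value)
```

Note that R's own `mode()` function reports the storage mode of an object (for example "numeric"), not the most frequent value.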

5
Q

2 MEASURES OF LOCATION

A
  1. Mean
  2. Median
6
Q

is the most common measure of the location of a set of points.

A

Mean

7
Q

an alternative to the mean, since the mean is very sensitive to outliers.

A

Median
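
To see why the median is preferred when outliers are present, this small R sketch compares both measures of location before and after adding one extreme value. The numbers are made up for illustration.

```r
# Hypothetical data without and with a single outlier
x     <- c(2, 3, 3, 4, 5)
x_out <- c(x, 100)

mean_x       <- mean(x)        # 3.4
median_x     <- median(x)      # 3
mean_x_out   <- mean(x_out)    # 19.5, pulled far upward by the outlier
median_x_out <- median(x_out)  # 3.5, barely moves

print(c(mean = mean_x, median = median_x))
print(c(mean = mean_x_out, median = median_x_out))
```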

8
Q

2 WAYS TO MEASURE SPREAD

A
  1. Range
  2. Variance or Standard Deviation
9
Q

is the difference between max and min.

A

Range

10
Q

is the most common measure of the spread of a set of points.

A

Variance or Standard Deviation
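
The measures of spread on these cards can be computed in base R as sketched below. The data vector is made up. One detail worth knowing: `range()` returns the minimum and maximum rather than their difference, and `var()` and `sd()` use the sample (n - 1) denominator.

```r
# Hypothetical data
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# range() gives c(min, max); diff() turns that into max - min
min_max      <- range(x)       # 2 9
spread_range <- diff(min_max)  # 7

# Sample variance and standard deviation (n - 1 denominator)
spread_var <- var(x)           # 32 / 7
spread_sd  <- sd(x)            # sqrt(32 / 7)

print(c(range = spread_range, variance = spread_var, sd = spread_sd))
```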

11
Q

is the conversion of data into a visual or tabular format so that the characteristics of the data and the relationships among data items or attributes can be analyzed or reported.

A

Visualization

12
Q

12 VISUALIZATION TECHNIQUES / METHODS

A
  1. Representation
  2. Arrangement
  3. Selection
  4. Histogram
  5. Box Plots
  6. Two Dimensional Histograms
  7. Scatter Plots
  8. Contour Plots
  9. Matrix Plots
  10. Parallel Coordinates
  11. Star Plot
  12. Chernoff Faces
13
Q

is a visualization technique which is the mapping of information to a visual format.

A

Representation

14
Q

is the placement of visual elements within a display.

A

Arrangement

15
Q

is the elimination or the deemphasis of certain objects and attributes.

A

Selection

16
Q

usually shows the distribution of values of a single variable.

A

Histogram

17
Q

a simplified version of a probability density function (PDF) or histogram.

A

Box Plots
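
A minimal sketch of a histogram and a box plot of the same variable in base R. The data are simulated (the seed, sample size, and distribution are arbitrary assumptions), so the point is only to show how the box plot condenses the distribution into a five-number summary.

```r
# Simulated values of a single variable (arbitrary assumptions)
set.seed(1)
x <- rnorm(200, mean = 50, sd = 10)

# Histogram: distribution of values across bins
h <- hist(x, breaks = 10, main = "Histogram of x", xlab = "x")

# Five-number summary (min, lower hinge, median, upper hinge, max),
# closely related to what the box plot draws
five_num <- fivenum(x)

# Box plot: compact view of the same distribution
boxplot(x, horizontal = TRUE, main = "Box plot of x")

print(five_num)
```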

18
Q

shows the joint distribution of the values of two attributes.

A

Two Dimensional Histograms

19
Q

attribute values determine the position.

A

Scatter Plots

20
Q

useful when a continuous attribute is measured on a spatial grid. They partition the planes into regions of similar values.

A

Contour Plots
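
As an illustration, R ships with the `volcano` dataset, a matrix of elevations measured on a regular spatial grid, which suits a contour plot. The choice of roughly ten levels below is an arbitrary assumption.

```r
# volcano: built-in 87 x 61 matrix of elevations on a regular grid
grid_dim <- dim(volcano)

# Pick roughly ten "nice" elevation levels spanning the data
levels_used <- pretty(range(volcano), n = 10)

# Each contour line separates regions of similar elevation
contour(volcano, levels = levels_used, main = "Contour plot of volcano")

print(grid_dim)
```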

21
Q

can plot a data matrix.

A

Matrix Plots

22
Q

used to plot the attribute values of high-dimensional data.

A

Parallel Coordinates

23
Q

similar approach to parallel coordinates, but the axes radiate from a central point.

A

Star Plot
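
The following sketch draws both a parallel coordinates plot and a star plot for the built-in `iris` data using base R only. Rescaling every attribute to the interval [0, 1] and picking one flower per species for the star plot are assumptions made for readability; `matplot()` is used here as a simple stand-in for a dedicated parallel coordinates function.

```r
# Four numeric attributes of the built-in iris data
dat <- iris[, 1:4]

# Rescale each attribute to [0, 1] so the axes are comparable
scaled <- apply(dat, 2, function(v) (v - min(v)) / (max(v) - min(v)))

# Parallel coordinates: one line per flower, one vertical axis per attribute
matplot(t(scaled), type = "l", lty = 1, col = as.integer(iris$Species),
        xaxt = "n", xlab = "Attribute", ylab = "Scaled value",
        main = "Parallel coordinates (iris)")
axis(1, at = 1:4, labels = colnames(scaled))

# Star plot: the same attributes on axes radiating from a central point
picked <- c(1, 51, 101)  # one flower from each species
stars(dat[picked, ], labels = as.character(iris$Species[picked]),
      main = "Star plot (iris)")
```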

24
Q

approach associates each attribute with a characteristic of a face.

A

Chernoff Faces

25
Q

is a language and system for statistics. It is an environment within which many classical and modern statistical techniques have been implemented.

A

R

26
Q

Who developed R

A

Ross Ihaka & Robert Gentleman

27
Q

is a powerful and productive third-party user interface for R.

A

RStudio IDE

28
Q

RSTUDIO USER INTERFACE

A
  • Console Pane
  • Source Pane
  • Environment Pane
  • Files Pane
29
Q

this is where you can type and execute commands.

A

Console Pane

30
Q

a text editor or script window where you can edit and save a collection of commands.

A

Source Pane

31
Q

contains objects such as datasets loaded into R, as well as the history of all commands executed.

A

Environment Pane

32
Q

open files, view plots, install and load packages.

A

Files Pane

33
Q

is used for storing data tables. It is a list of vectors of equal length.

A

Data Frames
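
A short sketch of the "list of vectors of equal length" idea. The `students` table and its column names are hypothetical.

```r
# Hypothetical data table: each column is a vector of the same length
students <- data.frame(
  name   = c("Ana", "Ben", "Cara"),
  age    = c(21, 23, 22),
  passed = c(TRUE, FALSE, TRUE)
)

# A data frame really is a list under the hood
frame_is_list  <- is.list(students)   # TRUE
column_lengths <- lengths(students)   # 3 for every column

# Columns are accessed like list elements
mean_age <- mean(students$age)        # 22

str(students)
```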

34
Q

2 PLOTTING COMMANDS

A
  • High-Level Plotting Function
  • Low-Level Plotting Function
35
Q

is a plotting command that creates a new plot on the graphics device.

A

High-Level Plotting Function

36
Q

is a plotting command that adds more information to an existing plot, such as extra points, lines, and labels.

A

Low-Level Plotting Function
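
The difference between the two kinds of plotting command can be sketched in base R as follows. The data and labels are made up. `plot()` is the high-level call that starts a new plot, and `lines()`, `points()`, `text()`, and `abline()` are low-level calls that draw on top of it.

```r
# Hypothetical data
x <- 1:10
y <- x^2

# High-level function: creates a new plot on the graphics device
plot(x, y, main = "y = x^2", xlab = "x", ylab = "y")

# Low-level functions: add to the existing plot
lines(x, y, col = "blue")               # connect the points
points(5, 25, pch = 19, col = "red")    # highlight one point
text(5, 25, labels = "x = 5", pos = 4)  # label it, to the right
abline(h = 50, lty = 2)                 # dashed horizontal reference line
```

Calling a low-level function before any high-level plot exists produces an error, because there is no plot to add to.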

37
Q

is the most frequently used plotting function.

A

plot() Function

38
Q

offers a powerful graphics language for creating elegant and complex plots.

A

ggplot2 Package
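
A minimal ggplot2 sketch, assuming the package is installed (the code is skipped otherwise). The `df` data frame and its column names are hypothetical. A plot is built by combining data, an aesthetic mapping, and one or more layers with `+`.

```r
# Run only if ggplot2 is available on this machine
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Hypothetical data
  df <- data.frame(hours = c(1, 2, 3, 4, 5),
                   score = c(52, 60, 71, 75, 88))

  # Data + aesthetic mapping + a point layer + labels
  p <- ggplot(df, aes(x = hours, y = score)) +
    geom_point() +
    labs(title = "Score by hours studied", x = "Hours", y = "Score")

  print(p)
}
```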

39
Q

Hadley Wickham

A

created the ggplot2 package.

40
Q

is what the ggplot2 package is based on.

A

Grammar of Graphics