Topic 2 - Data and Graphical Summaries Flashcards

1
Q

LO

A

LO3 Produce, interpret and compare graphical and numerical summaries, using base R and ggplot.

2
Q

IDA

A

Initial Data Analysis
- The first general look at the data without formal

**Involves:**
- Data background
- Data structure
- Data wrangling
- Data summaries

3
Q

Variables

A
  • Measures or describes some attribute of the subjects
4
Q

Qualitative (categorical)

A

Ordinal
- Has a natural order (e.g. ratings such as low / medium / high)

Nominal
- Has no natural order (e.g. colours)

5
Q

Quantitative (Numerical)

A

Discrete
- Takes separate, countable values with clear gaps between them (e.g. number of siblings)

Continuous
- Can take any value within a range, with no gaps (e.g. height)

6
Q

Simple barplot

A

Summarizes 1 qualitative variable

7
Q

Double barplot

A

Summarizes 2 qualitative variables

8
Q

Data cleaning

A

Involves changing the format of the data, but not the essence of the data

9
Q

Simple Histogram

A

Used for quantitative data to see how a variable is distributed across different ‘bins’

Two types: standard (frequency) histogram and probability/density histogram

10
Q

Density histogram

A
  • The total area of the bars = 100%
  • The height of each bar is found by dividing the % of subjects in the bin by the length (width) of the bin
  • Density histograms don't need a y-axis, because the areas carry the information
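
The height rule above can be checked with a small worked example. This is a minimal sketch in plain Python rather than R; the function name `density_heights`, the data, and the bin edges are all made up for illustration.

```python
def density_heights(values, bin_edges):
    """Return the height of each bin in % of subjects per unit of the variable."""
    n = len(values)
    last = len(bin_edges) - 2
    heights = []
    for i in range(len(bin_edges) - 1):
        left, right = bin_edges[i], bin_edges[i + 1]
        # Assumed convention: bins include their left edge but not their right,
        # except the last bin, which includes both edges.
        if i == last:
            count = sum(left <= v <= right for v in values)
        else:
            count = sum(left <= v < right for v in values)
        percent = 100 * count / n                 # % of subjects in this bin
        heights.append(percent / (right - left))  # height = % / length of bin
    return heights


values = [1, 2, 2, 3, 5, 6, 7, 9, 12, 18]  # hypothetical data
bin_edges = [0, 5, 10, 20]                 # bins of unequal length

heights = density_heights(values, bin_edges)
# Area of each bar = height x length of bin; the areas should total 100%.
areas = [h * (bin_edges[i + 1] - bin_edges[i]) for i, h in enumerate(heights)]
print(heights, sum(areas))
```

The last bin is twice as long as the others, so its bar is drawn lower even though its area still matches its share of the subjects.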
11
Q

Sliced Histogram

A
  • Slicing by another variable
  • Allows the addition of a qualitative variable

e.g. 2 histograms on one graph

12
Q

Simple Box Plot

A
  • Shows the distribution of a single quantitative variable, based on its percentiles
  • ‘Box’ = the middle 50% of the data (25th to 75th percentile)
  • Lines outside the box = the remaining 50% of the data

IQR (75th percentile - 25th percentile) is the length of the box

Upper threshold = 75th percentile + 1.5 x IQR
Lower threshold = 25th percentile - 1.5 x IQR
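
The two threshold formulas can be worked through on a small data set. This is a minimal sketch in plain Python rather than R, using made-up data; `statistics.quantiles` with `method="inclusive"` is one of several quartile conventions, so other tools may give slightly different quartiles.

```python
import statistics

data = [2, 4, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data

# Quartiles: the 25th, 50th and 75th percentiles.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # length of the box

lower = q1 - 1.5 * iqr  # lower threshold
upper = q3 + 1.5 * iqr  # upper threshold

# Points beyond the thresholds are the ones usually flagged as outliers.
outliers = [x for x in data if x < lower or x > upper]
print(q1, q3, iqr, lower, upper, outliers)
```

Here the quartiles are 4 and 8, so the IQR is 4, the thresholds are -2 and 14, and only the value 30 lies beyond them.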

13
Q

Comparative boxplots

A

Adds a qualitative variable, giving one boxplot per group
- e.g. gender

14
Q

Simple scatterplot

A

Used when there are 2 Quantitative variables
- For x and y axes

15
Q

Filtered scatterplot

A

Adds a qualitative variable to a simple scatterplot; further variables can be added if wanted
