V6 Flashcards

ggplot2

1
Q

advantages of ggplot2 compared to other plotting packages in R

A
  • one of the most elegant and versatile plotting packages in R
  • builds layered, customisable plots
  • implements the grammar of graphics, a coherent system for describing and building graphs
  • you can work faster by learning one system and applying it in many places
2
Q

format needed for ggplot2

A
  • data must be in a data.frame
  • long format: more rows than columns

3
Q

important terms :

  • data
  • aesthetics
  • geometries
  • facets
  • statistics
  • coordinates
  • themes
A
  • data: dataset being plotted
  • aesthetics: scales onto which we map our data
  • geometries: visual elements used for our data
  • facets: plotting small multiples
  • statistics: representation of our data to aid understanding
  • coordinates: space on which the data will be plotted
  • themes: all non-data ink.
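
The seven terms can be seen together in one plot. This is only a sketch: it uses the `mtcars` data set that ships with R, and the choice of variables, axis limits and theme is an arbitrary assumption made for illustration.

```r
library(ggplot2)

# One plot that touches every term on this card
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +       # data + aesthetics
  geom_point() +                                  # geometry
  stat_smooth(method = "lm", formula = y ~ x) +   # statistic (fitted line)
  facet_wrap(~ cyl) +                             # facets (small multiples)
  coord_cartesian(ylim = c(10, 35)) +             # coordinates (zoom, no data dropped)
  theme_minimal()                                 # theme (non-data ink)

print(p)
```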
4
Q

plot basics

A
  • all ggplot2 plots begin with a call to ggplot(), supplying default data and aesthetic mappings, specified by aes()
  • you then add layers, scales, coords and facets with +
  • to save a plot to disk, use ggsave()
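
A minimal sketch of these three steps. The data frame `dat`, its values, and the output file name are made up for illustration and are not part of the original cards.

```r
library(ggplot2)

# Hypothetical toy data, invented for this example
dat <- data.frame(
  time = factor(c("Lunch", "Dinner"), levels = c("Lunch", "Dinner")),
  total_bill = c(15, 17)
)

# 1. ggplot() supplies the default data and the aesthetic mappings via aes()
p <- ggplot(data = dat, aes(x = time, y = total_bill)) +
  geom_bar(stat = "identity") +               # 2. add a layer ...
  scale_y_continuous(limits = c(0, 20)) +     #    ... a scale ...
  coord_flip()                                #    ... and a coord with +

# 3. save to disk; the file name and size are arbitrary choices
out_file <- file.path(tempdir(), "bar_plot.png")
ggsave(out_file, plot = p, width = 4, height = 3)
```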
5
Q

layer: geoms

A
  • a layer combines data, aesthetic mappings, a geom (geometric object), a stat (statistical transformation), and a position adjustment
  • typically you will create layers using a geom_ function, overriding the default position and stat if needed
6
Q

layer: stats

A

stat_ functions
describe a layer by its statistical transformation rather than its visual appearance
e.g. stat_identity() -> leave data as is

7
Q

layer: position adjustment:

A

resolves overlapping geoms; overrides the default of the geom_ or stat_ function
e.g. position_jitter() -> jitter points to avoid overplotting
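
As a rough illustration of overriding a layer's default position, the sketch below draws the same points twice, once with the default and once jittered. It assumes the built-in `mtcars` data; the jitter width and seed are arbitrary.

```r
library(ggplot2)

# Default position ("identity"): points with the same cyl sit in one column
p_overlap <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point()

# Same geom, position overridden: points are spread horizontally only
p_jitter <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point(position = position_jitter(width = 0.2, height = 0, seed = 1))

print(p_jitter)
```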

8
Q

layer: annotations

A

special types of layer: they don’t inherit global settings from the plot. They are used to add fixed reference data to plots
e.g. annotate() -> create an annotation layer
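
A small sketch of an annotation layer, assuming the built-in `mtcars` data. The label text and its position are arbitrary; note that the annotation does not pick up the colour mapping set in ggplot().

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  # Fixed reference text: position and label are given directly, not mapped from data
  annotate("text", x = 4.5, y = 30, label = "reference note")

print(p)
```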

9
Q

layer: scales

A

control the details of how data values are translated to visual properties
e.g. scale_colour_continuous() -> apply a continuous colour scale

10
Q

bar graphs - bar heights ?

A
  • the heights of bars can represent two different things:
  • the count of cases for each group -> stat_count()
  • the value of a column in the data set -> stat_identity() leaves the y values unchanged

-> default for geom_bar() is stat_count() (stat_bin() bins continuous data and is the default for histograms)
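
The two meanings of bar height can be compared side by side. This sketch assumes a current ggplot2 release, where geom_bar() counts cases by default; both data frames are invented for illustration.

```r
library(ggplot2)

# Raw data: one row per case, so bar height = number of rows per group
raw <- data.frame(group = c("a", "a", "a", "b", "b"))
p_count <- ggplot(raw, aes(x = group)) +
  geom_bar()                                  # counts cases by default

# Pre-summarised data: bar height = the value stored in a column
summ <- data.frame(group = c("a", "b"), value = c(10, 4))
p_value <- ggplot(summ, aes(x = group, y = value)) +
  geom_bar(stat = "identity")                 # y values left unchanged

layer_data(p_count)$count   # 3 2
layer_data(p_value)$y       # 10 4
```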

11
Q

make basic bar graph with ggplot (with value not number of cases in each group)

add or remove colour and legend + black outline

add x, y main labels

A

library(ggplot2)

# simple plot
ggplot(data = dat, aes(x = time, y = total_bill)) + geom_bar(stat = "identity")
# add colour and legend + black outline
ggplot(data = dat, aes(x = time, y = total_bill, fill = time)) + geom_bar(colour = "black", stat = "identity")
# remove legend (if redundant)
+ guides(fill = "none")

# add x, y, main labels
+ xlab("")
+ ylab("")
+ ggtitle("")

12
Q

code count of cases in bar graph

A

+ geom_bar(stat = "count")

13
Q

how to code a bar graph with multiple variables

time: x-axis
sex: colour fill
total bill: y-axis

how to change colour

A

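ggplot(data = tips, aes(x = time, y = total_bill, fill = sex)) + geom_bar(stat = "summary", position = position_dodge())
-> draws 2 bars right next to each other, one for each sex (with no summary function given, stat = "summary" defaults to mean_se(), so bar height is the mean)

# change colour
+ scale_fill_manual(values = c("", ""))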
14
Q

code line graph

A

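ggplot(data = tips, aes(x = time, y = total_bill, colour = sex, group = sex)) + geom_line(stat = "summary", linewidth = 1.5) +
geom_point(stat = "summary", size = 3)
-> lines take colour, not fill; group = sex connects the points within each sex (use size instead of linewidth in ggplot2 before 3.4.0)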

15
Q

how to add different symbols into plot

A

+ scale_shape_manual(values = c("º", "Ω"))
-> applies to the variable mapped to shape in aes()

16
Q

how to add error bars

A

+ geom_errorbar(aes(ymin = total_bill - sd, ymax = total_bill + sd),
width = 0.1, position = position_dodge(1))
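
The error-bar layer expects a summary table that already holds a mean and a spread per group. A minimal sketch, assuming a made-up table `summ` with an `sd` column (the values are invented, and the dodge is left out because there is only one bar per x value):

```r
library(ggplot2)

# Hypothetical pre-computed summary: one row per group
summ <- data.frame(
  time = factor(c("Lunch", "Dinner"), levels = c("Lunch", "Dinner")),
  total_bill = c(15, 20),
  sd = c(2, 3)
)

p <- ggplot(summ, aes(x = time, y = total_bill)) +
  geom_bar(stat = "identity", fill = "white", colour = "black") +
  # Bars extend one sd below and above each mean
  geom_errorbar(aes(ymin = total_bill - sd, ymax = total_bill + sd), width = 0.1)

print(p)
```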

17
Q

draw a histogram with black outline and white fill

draw a density curve

A

ggplot(dat, aes(x=rating)) +
geom_histogram(binwidth = 0.5, colour = "black", fill = "white")

ggplot(dat, aes(x=rating)) + geom_density()

18
Q

histogram add mean line

A

ggplot(dat, aes(x=rating)) +
geom_histogram(binwidth = 0.5, colour = "black", fill = "white") + geom_vline(aes(xintercept = mean(rating, na.rm = TRUE)),
colour = "red", linetype = "dashed", linewidth = 1)

19
Q

overlaid histograms (two, see-through)

A

ggplot(dat, aes(x = rating, fill = cond)) + geom_histogram(binwidth = .5, alpha = .5, position = "identity")

20
Q

interleaved histograms

A

ggplot(dat, aes(x = rating, fill = cond)) + geom_histogram(binwidth = .5, position = "dodge")

21
Q

make a basic boxplot

A

ggplot(dat, aes(x=cond, y=rating)) + geom_boxplot()

22
Q

scatterplot: points only, and with a linear regression line

A
  1. ggplot(dat, aes(x = xvar, y = yvar)) + geom_point(shape = 1)
  2. ggplot(dat, aes(x = xvar, y = yvar)) + geom_point(shape = 1) + geom_smooth(method = "lm") -> by default includes a 95% confidence interval
23
Q

how to alter theme (the way it looks / lines)

A

theme(plot.title = element_text(lineheight = .8, face = "bold"))

24
Q

facets

A

can divide the plot into panels by the levels of a variable
-> e.g. levels of sex

sp (the normal plot object) + facet_grid(. ~ sex)

25
Q

how to divide vertically or horizontally

A

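sp + facet_grid(sex ~ .) -> divide vertically (one row of panels per level of sex)
sp + facet_grid(. ~ day) -> divide horizontally (one column of panels per level of day)
sp + facet_grid(sex ~ day) -> both: sex in rows, day in columns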
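
The rows ~ columns convention of facet_grid() can be tried on any data set. The sketch below uses the built-in `mtcars` instead of the cards' `sp` plot, with `am` and `cyl` standing in for sex and day; that substitution is an assumption made only so the example is self-contained.

```r
library(ggplot2)

# Base scatterplot, playing the role of "sp" on the cards
sp <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

sp_rows <- sp + facet_grid(am ~ .)     # panels stacked vertically, one row per am
sp_cols <- sp + facet_grid(. ~ cyl)    # panels side by side, one column per cyl
sp_both <- sp + facet_grid(am ~ cyl)   # am in rows, cyl in columns

print(sp_both)
```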