Data Vis: Plotting Flashcards
plot()
Creates a scatterplot. It is the most widely used plotting function and accepts many optional parameters.
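As a minimal sketch of a plot() call, the example below draws a scatterplot from made-up data. The vectors x and y and the chosen color are illustrative assumptions, not part of the original cards.

```r
# Illustrative data (an assumption for this sketch): ten x values and noisy y values
set.seed(1)
x <- 1:10
y <- 2 * x + rnorm(10)

# High-level call: opens a new plot and draws one point per (x, y) pair
plot(x = x,
     y = y,
     main = "Basic scatterplot",
     xlab = "x",
     ylab = "y",
     pch = 16,            # Filled circles
     col = "steelblue")
```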
pch =
When you create a plot with plot() (or add points with points()), you can specify the type of symbol with the pch argument. You can specify the symbol type in one of two ways: with an integer, or with a string. If you use a string (like “p”), R will use the first character of that string as the plotting symbol. If you use an integer value, you’ll get the symbol that corresponds to that number. See Figure for all the symbol types you can specify with an integer.
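The two ways of setting pch can be compared side by side. This is a small sketch; the particular symbol numbers and letters are arbitrary choices for illustration.

```r
# Hypothetical symbol choices for illustration
int.symbols <- c(1, 16, 17, 22, 8)          # Integer codes select built-in symbols
chr.symbols <- c("a", "b", "p", "*", "+")   # Strings are drawn as the symbol itself

# Top row: integer pch values
plot(x = 1:5,
     y = rep(2, 5),
     pch = int.symbols,
     cex = 2,
     ylim = c(0, 3),
     xlab = "", ylab = "",
     main = "Integer versus character pch")

# Bottom row: character pch values, added to the same plot
points(x = 1:5,
       y = rep(1, 5),
       pch = chr.symbols,
       cex = 2)
```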
hist()
Histograms are the most common way to plot a vector of numeric data. To create a histogram we’ll use the hist() function. The main argument to hist() is x, a vector of numeric data. If you want to specify how the histogram bins are created, you can use the breaks argument.
hist(x = ChickWeight$weight,
     main = "Chicken Weights",
     xlab = "Weight",
     xlim = c(0, 500))
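The breaks argument mentioned above is not used in the call itself. As a hedged sketch, one way to control the bins is to pass explicit bin edges; the 25-unit bin width here is an arbitrary assumption.

```r
# Explicit bin edges from 0 to 400 in steps of 25 (the width is an illustrative choice).
# The edges must span the full range of the data.
bin.edges <- seq(0, 400, by = 25)

h <- hist(x = ChickWeight$weight,
          breaks = bin.edges,
          main = "Chicken Weights (25-unit bins)",
          xlab = "Weight")

# hist() also returns the bin counts invisibly, which can be inspected
h$counts
```

Passing a single number instead (for example breaks = 20) is treated by R as a suggestion for the number of bins rather than an exact count.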
barplot()
A barplot typically shows summary statistics for different groups. The primary argument to barplot() is height: a vector of numeric values that sets the height of each bar.
barplot(height = 1:5, # A vector of heights
        names.arg = c("G1", "G2", "G3", "G4", "G5"), # A vector of names
        main = "Example Barplot",
        xlab = "Group",
        ylab = "Height")
clustered bar plot
If you want to create a clustered barplot, with different bars for different groups of data, you can enter a matrix as the argument to height. R will then plot each column of the matrix as a separate set of bars.
barplot(height = swim.data, # A matrix of group values, defined beforehand
        beside = TRUE, # Put the bars next to each other
        legend.text = TRUE, # Add a legend
        col = c(yarrr::transparent("green", .2), # transparent() comes from the yarrr package
                yarrr::transparent("red", .2)),
        main = "Swimming Speed Experiment",
        ylab = "Speed (in meters / second)",
        xlab = "Clothing Condition",
        ylim = c(0, 4))
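The card does not show how swim.data is built. The sketch below assumes it is a 2 x 2 matrix of speeds with named rows and columns; the numbers and labels are invented for illustration, and plain color names are used so the example runs without the yarrr package.

```r
# Hypothetical data: rows are groups within each cluster, columns are clusters
swim.data <- matrix(c(2.1, 1.7, 3.0, 2.5),
                    nrow = 2,
                    dimnames = list(c("Male", "Female"),
                                    c("No Wetsuit", "Wetsuit")))

barplot(height = swim.data,
        beside = TRUE,           # One cluster of bars per column
        legend.text = TRUE,      # Legend labels are taken from the row names
        col = c("green", "red"),
        ylim = c(0, 4))
```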
pirateplot()
Contained in the “yarrr” package; shows raw data, descriptive statistics, and inferential statistics in one plot.
yarrr::pirateplot(formula = weight ~ Time, # dv is weight, iv is Time
                  data = ChickWeight,
                  main = "Pirateplot of chicken weights",
                  xlab = "Time",
                  ylab = "Weight")
more on pirateplots
These are highly customizable; see chapter 11 for all of the examples.
customization should be for function, not style
low-level plotting functions
Allow you to add elements, such as points or lines, to an existing plot.
low-level plotting functions
points()
To add new points to an existing plot, use the points() function. The points() function shares many arguments with plot(), like x (for the x-coordinates), y (for the y-coordinates), and parameters like col (symbol color), cex (point size), and pch (symbol type).
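A short sketch of points() layered on an existing plot follows. The random data and the styling values are assumptions made for illustration.

```r
set.seed(2)
base.y <- rnorm(10)   # Illustrative data for the initial plot
new.x <- 1:10
new.y <- rnorm(10)    # Illustrative data for the added points

# High-level call creates the plot
plot(x = 1:10,
     y = base.y,
     pch = 16,
     col = "gray50",
     ylim = c(-3, 3),
     xlab = "Index", ylab = "Value",
     main = "Adding points with points()")

# Low-level call adds to the plot that already exists
points(x = new.x,
       y = new.y,
       pch = 17,       # Filled triangles
       col = "red",
       cex = 1.5)      # 1.5 times the default size
```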
low-level plotting functions
abline()
adds straight lines to a plot
library(yarrr) # Provides the pirates data
plot(x = pirates$weight,
     y = pirates$height,
     xlab = "weight",
     ylab = "height",
     main = "Adding reference lines with abline",
     pch = 16,
     col = gray(.5, .2))
# Add horizontal line at mean height
abline(h = mean(pirates$height),
       lty = 2) # Dashed line
low-level plotting functions
segments()
creates straight lines with defined start and end points
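A minimal sketch of segments() is shown below. The coordinates are arbitrary; each row of the hypothetical seg data frame gives one line's start point (x0, y0) and end point (x1, y1).

```r
# Hypothetical start and end coordinates for two line segments
seg <- data.frame(x0 = c(1, 1),
                  y0 = c(1, 9),
                  x1 = c(9, 9),
                  y1 = c(9, 1))

# Empty plotting region to draw on
plot(1,
     xlim = c(0, 10),
     ylim = c(0, 10),
     type = "n",
     xlab = "", ylab = "",
     main = "Adding lines with segments()")

# One segment per row: from (x0, y0) to (x1, y1)
segments(x0 = seg$x0, y0 = seg$y0,
         x1 = seg$x1, y1 = seg$y1,
         lty = c(1, 2),
         col = c("black", "red"))
```

Unlike abline(), which draws a line across the whole plot, segments() stops at the end points you supply.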
low-level plotting functions
lm()
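fits a linear model; passing the fitted model to abline() adds a regression line to a scatterplot
library(yarrr) # Provides the pirates data and transparent()
plot(x = pirates$height,
     y = pirates$weight,
     pch = 16,
     col = transparent("purple", .7),
     main = "Adding a regression line to a scatterplot")
# Add a regression line to the scatterplot
abline(lm(weight ~ height, data = pirates),
       lty = 2)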
low-level plotting functions
text()
With text(), you can add text to a plot. You can use text() to highlight specific points of interest in the plot, or to add information (like a third variable) for every point in a plot
plot(1,
     xlim = c(0, 10),
     ylim = c(0, 10),
     type = "n")
text(x = c(1, 5, 9),
     y = c(9, 5, 1),
     labels = c("Put", "text", "here"))
text() arguments
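As a hedged sketch of the arguments most often passed to text(), the example below uses x, y, labels, pos, cex, and col. The coordinates and label strings are made up for illustration.

```r
# pos places the label relative to the (x, y) coordinate:
# 1 = below, 2 = left, 3 = above, 4 = right
positions <- 1:4
pos.labels <- c("below", "left", "above", "right")

plot(1,
     xlim = c(0, 10),
     ylim = c(0, 10),
     type = "n",
     xlab = "", ylab = "")

# Mark the anchor point so the effect of pos is visible
points(x = 5, y = 5, pch = 16)

# Draw one label per pos value around the same anchor point
for (i in positions) {
  text(x = 5, y = 5, labels = pos.labels[i], pos = i)
}

# cex scales the text size and col sets its color
text(x = 5, y = 9, labels = "large red text", cex = 1.5, col = "red")
```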
paste()
combines text and numbers into a single string, for example to build a plot label
# Create the plot
plot(x = ChickWeight$Time,
     y = ChickWeight$weight,
     col = gray(.3, .5),
     pch = 16,
     main = "Combining text with numeric scalars using paste()")
# Add a reference line at the mean weight
abline(h = mean(ChickWeight$weight),
       lty = 2)
# Add a label that combines text with the rounded mean
text(x = 3,
     y = mean(ChickWeight$weight),
     labels = paste("Mean weight =",
                    round(mean(ChickWeight$weight), 2)),
     pos = 3)
saving plots
# Step 1: Call the pdf command to start the plot
pdf(file = "/Users/ndphillips/Desktop/My Plot.pdf", # The path of the file you want to save
    width = 4, # The width of the plot in inches
    height = 4) # The height of the plot in inches
# Step 2: Create the plot with R code
plot(x = 1:10,
     y = 1:10)
abline(v = 0) # Additional low-level plotting commands
text(x = 0, y = 1, labels = "Random text")
# Step 3: Run dev.off() to close the device and write the file
dev.off()
arranging multiple plots in same plotting space
R makes it easy to arrange multiple plots in the same plotting space. The most common ways to do this are the par(mfrow) parameter and the layout() function.
par(mfrow)
The mfrow and mfcol parameters allow you to create a matrix of plots in one plotting space.
If you have four plots you want to group together, call par(mfrow = c(2, 2)) before creating them; the plots that follow will then fill the grid row by row.
library(yarrr) # Provides the pirates data and pirateplot()
par(mfrow = c(2, 2)) # Create a 2 x 2 plotting matrix
# The next 4 plots created will be plotted next to each other
hist(rnorm(100))
plot(pirates$weight,
     pirates$height, pch = 16, col = gray(.3, .1))
pirateplot(weight ~ Diet,
           data = ChickWeight,
           pal = "info", theme = 3)
boxplot(weight ~ Diet,
        data = ChickWeight)
par(mfrow = c(1, 1)) # Reset to a single plot per page
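The card mentions mfcol but only demonstrates mfrow. As a small sketch, the example below uses mfcol, which fills the grid column by column instead of row by row. The empty placeholder plots are an assumption made to show the fill order.

```r
# par() returns the previous settings, so they can be restored afterwards
old.par <- par(mfcol = c(2, 2))

# Four placeholder plots: with mfcol they fill the first column top to bottom,
# then the second column
for (i in 1:4) {
  plot(1, type = "n", xlab = "", ylab = "", main = paste("Plot", i))
}

# Restore the original layout
par(old.par)
```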
complex layouts with the layout() function
While par(mfrow) allows you to create matrices of plots, it does not allow you to create plots of different sizes. To arrange plots in differently sized plotting spaces, you need to use the layout() function. Unlike par(mfrow), layout() is not a plotting parameter; it is a function in its own right.
We’ll begin by creating the layout matrix, which tells R in which order to create the plots:
layout.matrix <- matrix(c(0, 2, 3, 1), nrow = 2, ncol = 2)
layout.matrix
##      [,1] [,2]
## [1,]    0    3
## [2,]    2    1
Looking at the values of layout.matrix, you can see that we’ve told R to put the first plot in the bottom right, the second plot on the bottom left, and the third plot in the top right. Because we put a 0 in the first element, R knows that we don’t plan to put anything in the top left area.
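The layout matrix described above only takes effect once it is passed to layout(). The sketch below shows one way to do that; the heights and widths values and the three example plots are assumptions chosen for illustration.

```r
layout.matrix <- matrix(c(0, 2, 3, 1), nrow = 2, ncol = 2)

layout(mat = layout.matrix,
       heights = c(1, 2),  # Relative heights: bottom row twice as tall as the top row
       widths = c(2, 2))   # Relative widths: two equally wide columns

# Illustrative data
set.seed(3)
x <- rnorm(100)
y <- rnorm(100)

plot(x, y, main = "Plot 1")   # Drawn in the bottom right
hist(x, main = "Plot 2")      # Drawn in the bottom left
boxplot(y, main = "Plot 3")   # Drawn in the top right; the top left stays empty

layout(1)  # Reset to a single plotting region
```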
adding background colors to plots
To change the background color of a plot, add the command par(bg = col) (where col is the color you want to use) prior to creating the plot. For example, the following code will put a light gray background behind a histogram:
par(bg = gray(.9)) # Create a light gray background
hist(x = rnorm(100), col = "skyblue")