Lesson 2: RStudio Basics Flashcards
console
lower left, with prompt to type commands
workspace
upper right panel; lists the objects in your workspace and keeps a history of your commands
R Markdown
Upper left panel, document
Lower right panel
plots show here, browse files, access help, manage packages
R Packages
Collections of R functions, data, and compiled code in a well-defined format. We use statsr, dplyr, and ggplot2.
Commands for installing packages
install.packages() for packages on CRAN and install_github() (from the devtools package) for packages on GitHub
Command for loading packages in working environment
library
e.g. library(dplyr)
You only need to install a package once, but you must load it each time you relaunch RStudio
How to run commands from either the Rmd file or the console
- click on the green arrow of the code chunk in the R Markdown (Rmd) file, or
- highlight the lines and hit the Run button, or
- type the code in the console
command to load data
data()
e.g. data(arbuthnot)
objects
As you work with R, you will create a series of objects. Sometimes you load them with the data() command; sometimes you create them yourself as the byproduct of a computation or analysis.
How to use the data viewer in the Environment tab
in the upper right window, which lists the objects in your workspace, click on the name of the data set
e.g. arbuthnot
data frame
a kind of spreadsheet or table where R stores data
dim() command
command that reports the dimensions of a data frame, e.g. 82 rows and 3 columns
names()
command for asking for variable names
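A minimal sketch of dim() and names(). It assumes a small stand-in data frame called toy (a hypothetical name, with made-up counts) in place of arbuthnot, which requires the statsr package:

```r
# Hypothetical stand-in for arbuthnot; the counts are made up for illustration.
toy <- data.frame(
  year  = c(1629, 1630, 1631),
  boys  = c(5200, 4900, 4400),
  girls = c(4700, 4500, 4100)
)

dims <- dim(toy)    # number of rows, then number of columns
vars <- names(toy)  # variable (column) names

print(dims)
print(vars)
```

With the real data loaded, the same two calls would be dim(arbuthnot) and names(arbuthnot).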
How is invoking R commands like math class?
Invoking an R command means supplying a function with some number of arguments, much like evaluating f(x); e.g. dim() takes a single argument, the name of a data frame
up and down arrow keys at prompt
scroll through your previous commands
R Markdown is the solution to what problem?
documenting your code
Knit button in RStudio
Type the code for the questions in the code chunks provided in the Rmd document, then knit the document to see the results
$
Accesses the data in a single column of a data frame, e.g. arbuthnot$boys
“go to the data frame that comes before me, and find the variable that comes after me.”
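As a rough illustration of $, the sketch below pulls one column out of a hypothetical stand-in data frame called toy (made-up counts, not the real arbuthnot values):

```r
# Hypothetical stand-in for arbuthnot; the counts are made up for illustration.
toy <- data.frame(
  year  = c(1629, 1630, 1631),
  boys  = c(5200, 4900, 4400),
  girls = c(4700, 4500, 4100)
)

# "Go to the data frame toy and find the variable boys."
boys_counts <- toy$boys
print(boys_counts)
```

The result is a plain vector, no longer a data frame.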
vectors
The most common data structure in R: a one-dimensional array of elements. When R prints a vector, such as a single variable, it adds a number in brackets along the left side to indicate locations within the vector, ordered by entry number.
recursive vector
a list is a recursive vector, a vector that can contain another vector or list in each of its elements
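A short sketch contrasting an ordinary vector with a list. The names counts and record and all values are hypothetical, chosen only for illustration:

```r
# An atomic vector: a one-dimensional collection of elements of one type.
counts <- c(5200, 4900, 4400)

# A list (recursive vector): each element may itself be a vector or a list.
record <- list(
  label  = "toy counts",
  counts = counts,
  nested = list(first_year = 1629, last_year = 1631)
)

print(counts)                   # printed with a bracketed index on the left
print(record$nested$last_year)  # reach into the inner list with $
```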
# or ##
comments out a line in R
ggplot() function, what does it build and what is the format?
function to build plots
ggplot(data = arbuthnot, aes(x = year, y = girls)) + geom_point()
looks like a function, arguments separated by commas
1st argument = the dataset
2nd argument = the variables mapped to the aesthetic elements of the plot, e.g. the x and y axes
3rd = another layer, added with a +, that specifies the geometric object for the plot; for a scatterplot, geom_point()
how to ask for help for a function
?ggplot
the help file opens in the Help tab of the lower right panel, replacing whatever was shown there; use the tabs to switch back
how to add vectors
join the variable names with an addition sign; R computes all the sums simultaneously
e.g. arbuthnot$boys + arbuthnot$girls
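A minimal sketch of element-wise addition, assuming two hypothetical vectors with made-up counts in place of the arbuthnot columns:

```r
# Made-up counts standing in for arbuthnot$boys and arbuthnot$girls.
boys  <- c(5200, 4900, 4400)
girls <- c(4700, 4500, 4100)

# One expression adds the vectors entry by entry.
total <- boys + girls
print(total)  # 9900 9400 8500
```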
how to save a new vector
use mutate() with the piping operator %>%, which takes the output of the current line and pipes it into the following line of code, then assign the result with <-
e.g. arbuthnot <- arbuthnot %>%
  mutate(total = boys + girls)
What does this piping code say in English?
arbuthnot <- arbuthnot %>%
  mutate(total = boys + girls)
Take the arbuthnot dataset and pipe it into the mutate function. Use mutate to create a new variable called total that is the sum of the variables called boys and girls. Then assign the resulting dataset to the object called arbuthnot, i.e. overwrite the old arbuthnot dataset with the new one containing the new variable.
what does the <- symbol do?
assignment: takes the output of a line of code and saves it into an object in your workspace
arbuthnot <- arbuthnot %>%
  mutate(total = boys + girls)
in this example, you already have an object called arbuthnot, so this command updates that data set with the new mutated column
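A runnable sketch of the pipe, mutate, and assignment pattern. It assumes dplyr is already installed and uses a hypothetical stand-in data frame called toy with made-up counts:

```r
library(dplyr)  # assumes dplyr is already installed

# Hypothetical stand-in for arbuthnot; the counts are made up for illustration.
toy <- data.frame(
  year  = c(1629, 1630, 1631),
  boys  = c(5200, 4900, 4400),
  girls = c(4700, 4500, 4100)
)

# Pipe toy into mutate(), add the column, and overwrite toy with the result.
toy <- toy %>%
  mutate(total = boys + girls)

print(names(toy))
```

The pipe passes its left-hand side as the first argument of the next function, so this is equivalent to toy <- mutate(toy, total = boys + girls).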
How to make a line plot instead of a scatter plot
use geom_line() instead of geom_point()
ggplot(data = arbuthnot, aes(x = year, y = total)) +
  geom_line()
How to do both a scatter plot and a line plot?
add both layers, joined with +
ggplot(data = arbuthnot, aes(x = year, y = total)) +
  geom_line() +
  geom_point()
How to use greater than, less than and equality symbols?
arbuthnot <- arbuthnot %>%
  mutate(more_boys = boys > girls)
generates a column of TRUE/FALSE values, TRUE wherever boys outnumber girls; < and == (equality) work the same way
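A small sketch of the three comparisons on plain vectors. The counts are made up, and chosen so that each comparison gives a different answer:

```r
# Made-up counts, chosen so the three comparisons differ.
boys  <- c(5200, 4900, 4400)
girls <- c(4700, 5000, 4400)

more_boys  <- boys > girls   # TRUE FALSE FALSE
fewer_boys <- boys < girls   # FALSE TRUE FALSE
same_count <- boys == girls  # FALSE FALSE TRUE

print(more_boys)
```

Each comparison is evaluated entry by entry, which is why it can fill a whole column inside mutate().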