R Flashcards
assign a value to a variable
use the assignment operator <-
determining that the object (murders dataset) is of the “data frame” class
class(murders)
finding out more about the structure of the object
str(murders)
showing the first 6 lines of the dataset
head(murders)
obtain a specific column (ex population)
murders$population
$ is called the accessor operator
displaying the variable names in the murders dataset
names(murders)
determining how many entries are in a vector
pop <- murders$population
length(pop)
obtaining the levels of a factor
levels(murders$region)
obtaining the number of levels of a factor
nlevels()
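The cards above assume the murders data frame is already loaded, which this deck does not show. As a minimal sketch, the same calls can be tried on a small made-up data frame; the name `toy` and all of its values are hypothetical.

```r
# Hypothetical stand-in for the murders data frame (all values are made up)
toy <- data.frame(
  state = c("A", "B", "C"),
  population = c(100, 250, 175),
  region = factor(c("West", "South", "West")),
  stringsAsFactors = FALSE
)

class(toy)              # "data.frame"
str(toy)                # compact summary of each column and its type
head(toy)               # first rows (at most 6)
names(toy)              # "state" "population" "region"

pop <- toy$population   # $ extracts one column as a vector
length(pop)             # 3

levels(toy$region)      # "South" "West" (sorted alphabetically)
nlevels(toy$region)     # 2
```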
create vectors of class numeric or character
use the concatenate function, c()
ex: codes <- c(380, 124, 818)
name the elements of a numeric vector
codes <- c(italy = 380, canada = 124, egypt = 818)
name the elements of a numeric vector in a different way
codes <- c(380, 124, 818)
country <- c("italy", "canada", "egypt")
names(codes) <- country
access specific elements of a vector
Using square brackets
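A short sketch of the vector cards above, combining creation, naming, and square-bracket access. The numbers and country names are illustrative values only.

```r
codes <- c(380, 124, 818)                  # numeric vector
country <- c("italy", "canada", "egypt")   # character vector
names(codes) <- country                    # attach a name to each element

codes[2]          # access by position
codes[c(1, 3)]    # several positions at once
codes[1:2]        # a range of positions
codes["canada"]   # access by name
```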
The function ____ sorts a vector in increasing order.
sort()
The function _____ produces the indices needed to obtain the sorted vector
order()
The function ____ gives us the ranks of the items in the original vector
rank()
The function ____ returns the largest/smallest value
max() and min()
The function ____ returns the index of the largest/smallest value
which.max() and which.min()
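The difference between sort(), order(), and rank() is easiest to see on one small vector. A minimal sketch, with made-up numbers:

```r
x <- c(31, 4, 15, 92, 65)

sort(x)         # 4 15 31 65 92  (the values, in increasing order)
order(x)        # 2 3 1 5 4      (indices that would sort x)
x[order(x)]     # same result as sort(x)
rank(x)         # 3 1 2 5 4      (position of each original entry in the sorted vector)

max(x)          # 92
which.max(x)    # 4  (index of the largest value)
min(x)          # 4
which.min(x)    # 2  (index of the smallest value)
```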
How do we find the name of the state with the maximum population?
murders$state[which.max(murders$population)]
how to obtain the murder rate
murder_rate <- murders$total / murders$population * 100000
ordering the states by murder rate, in decreasing order
murders$state[order(murder_rate, decreasing=TRUE)]
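A sketch of the rate and ordering cards on made-up numbers, assuming the rate is expressed per 100,000 people. The vectors `state`, `population`, and `total` are hypothetical stand-ins for the murders columns.

```r
# Hypothetical stand-ins for murders$state, murders$population, murders$total
state <- c("A", "B", "C")
population <- c(1000000, 500000, 2000000)
total <- c(20, 30, 10)

# murders per 100,000 people: roughly 2, 6, 0.5
murder_rate <- total / population * 100000

# states from highest to lowest rate
by_rate <- state[order(murder_rate, decreasing = TRUE)]   # "B" "A" "C"

# state with the largest population
largest <- state[which.max(population)]                   # "C"
```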
Creating a logical vector that specifies if the murder rate in that state is less than or equal to 0.71
index <- murder_rate <= 0.71
Determining which states have murder rates less than or equal to 0.71
murders$state[index]
Calculating how many states have a murder rate less than or equal to 0.71
sum(index)
Defining an index and identifying states with both conditions true
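# creating the two logical vectors representing our conditions
west <- murders$region == "West"
safe <- murder_rate <= 1
# defining an index and identifying states with both conditions true
index <- safe & west
murders$state[index]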
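Logical indexing from the cards above can be sketched on a few made-up values. The vectors below are hypothetical, and the second condition reuses the 0.71 threshold only as an example.

```r
# Hypothetical data (all values are made up)
state <- c("A", "B", "C", "D")
region <- c("West", "South", "West", "Northeast")
murder_rate <- c(0.5, 2.1, 0.9, 0.6)

index <- murder_rate <= 0.71   # TRUE FALSE FALSE TRUE
state[index]                   # "A" "D"
sum(index)                     # 2, because TRUE counts as 1

# combine two conditions with the logical AND operator &
west <- region == "West"
low <- murder_rate <= 0.71
both <- state[west & low]      # "A"
```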
The function _______ gives us the indices of the entries of a logical vector that are TRUE.
which()
The function _______ looks for entries in a vector and returns the index needed to access them.
match()
if we want to know whether or not each element of a first vector is in a second vector
the %in% operator
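The three indexing helpers above return different things, which a small sketch makes concrete. The abbreviations are illustrative.

```r
abb <- c("NY", "CA", "TX", "FL")

which(abb == "TX")           # 3: positions where the logical vector is TRUE
which(c(FALSE, TRUE, TRUE))  # 2 3

match(c("FL", "NY"), abb)    # 4 1: position of each entry within abb

c("CA", "ZZ") %in% abb       # TRUE FALSE: is each entry present in abb?
```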
Change a data table by adding a new column, or changing an existing one
dplyr package:
mutate() function
To filter the data by subsetting rows
dplyr package: function filter()
To subset the data by selecting specific columns
dplyr package:
select() function
To perform a series of operations by sending the results of one function to another function
using the pipe operator, %>%.
murders %>% select(state, region, rate) %>% filter(rate <= 0.71)
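The pipeline above relies on a rate column, which would first be added with mutate(). A minimal sketch on a made-up data frame, assuming the dplyr package is installed; `toy` and its values are hypothetical.

```r
library(dplyr)

# Hypothetical stand-in for the murders data frame (values are made up)
toy <- data.frame(
  state = c("A", "B", "C"),
  region = c("West", "South", "West"),
  population = c(1000000, 500000, 2000000),
  total = c(20, 30, 10),
  stringsAsFactors = FALSE
)

# mutate() adds the rate column (murders per 100,000 people)
toy <- mutate(toy, rate = total / population * 100000)

# select() keeps three columns, filter() keeps rows with a low rate
result <- toy %>%
  select(state, region, rate) %>%
  filter(rate <= 0.71)
```

Here only state "C" (rate about 0.5) passes the filter.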
create data frames
data.frame()
In R versions before 4.0.0, the data.frame() function turns characters into factors by default. To avoid this, set the stringsAsFactors argument to FALSE. Since R 4.0.0 the default is already FALSE.
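A short sketch of data.frame() with stringsAsFactors set explicitly, so the result is the same on old and new R versions. The column names and values are made up.

```r
grades <- data.frame(
  names = c("John", "Juan", "Jean"),
  exam_1 = c(95, 80, 90),
  stringsAsFactors = FALSE   # keep the names column as character
)

class(grades$names)    # "character"
class(grades$exam_1)   # "numeric"
```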
create a simple scatterplot. Ex: total murders versus population
function plot()
x <- murders$population
y <- murders$total
plot(x, y)
Histograms. Ex: a histogram of murder rates
hist()
hist(murders$rate)
Boxplots. Ex: the rate per region in the murders dataset
boxplot()
boxplot(rate~region, data = murders)
works on vectors by examining each element of a logical vector and returning the corresponding answer for TRUE or for FALSE
ifelse() function
takes a vector of logicals and returns TRUE if any of the entries are TRUE
any()
takes a vector of logicals and returns TRUE if all of the entries are TRUE
all()
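A minimal sketch of the three cards above on made-up vectors:

```r
a <- c(0, 1, 2, -4, 5)

# for each element: the reciprocal if positive, otherwise NA
result <- ifelse(a > 0, 1 / a, NA)   # NA 1.0 0.5 NA 0.2

z <- c(TRUE, TRUE, FALSE)
any(z)   # TRUE: at least one entry is TRUE
all(z)   # FALSE: not every entry is TRUE
```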
define a new function
function()
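As a sketch, a new function is created by assigning the result of function() to a name. The function `avg` below is a hypothetical example that computes an average.

```r
# avg is a hypothetical helper: it returns the arithmetic mean of x
avg <- function(x) {
  s <- sum(x)
  n <- length(x)
  s / n        # the last expression evaluated is the return value
}

avg(1:10)      # 5.5
```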
general form of a for-loop
For i in [some range], do operations
for (i in 1:5) {
  print(i)
}
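A for-loop is often used to fill a vector one entry at a time. A minimal sketch, where `compute_s_n` is a hypothetical helper that sums the integers from 1 to n:

```r
# hypothetical helper: sum of the integers 1, 2, ..., n
compute_s_n <- function(n) {
  sum(1:n)
}

m <- 5
s_n <- numeric(m)            # pre-allocate an empty numeric vector
for (n in 1:m) {
  s_n[n] <- compute_s_n(n)   # store the n-th result
}

s_n   # 1 3 6 10 15
```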