Functions: Matrices and Dataframes Flashcards

1
Q

exploring the data frame called bsale

A

head(bsale) # Show me the first few rows
str(bsale) # Show me the structure of the data
View(bsale) # Open the data in a new window
names(bsale) # What are the names of the columns?
nrow(bsale) # How many rows are there in the data?

2
Q

calculate descriptives from column vectors

A

mean(bsale$age) # What was the mean age?
table(bsale$color) # How many boats were there of each color?
max(bsale$price) # What was the maximum price?

Note that you have to specify both the data frame and the column you want a statistic for, in the form dataframe$column.

3
Q

adding new columns to a data frame

A

bsale$id <- 1:nrow(bsale)
bsale$age.decades <- bsale$age / 10
bsale$profit <- bsale$price - bsale$cost

4
Q

What was the mean price of green boats?

A

with(bsale, mean(price[color == "green"]))

5
Q

matrix

A

can contain EITHER character OR numeric columns, but not a mix of both

combinations of vectors of the SAME LENGTH
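
A minimal sketch of the single-type rule, using made-up vectors (the names num_mat and mix_mat are hypothetical). Mixing numbers and text in one matrix makes R convert everything to character.

```r
# Two numeric vectors of the same length give a numeric matrix
num_mat <- cbind(a = c(1, 2, 3), b = c(4, 5, 6))

# Mixing numeric and character vectors coerces every value to character
mix_mat <- cbind(a = c(1, 2, 3), b = c("x", "y", "z"))

class(num_mat[1, 1])  # "numeric"
class(mix_mat[1, 1])  # "character"
dim(num_mat)          # 3 rows, 2 columns
```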

6
Q

data frame

A

can contain BOTH character and numeric columns

the more flexible and widely used data structure in R

combinations of vectors
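
A small sketch with a hypothetical data frame called boats (not part of the original cards), showing that each column keeps its own type.

```r
# Hypothetical example data: one numeric and one character column
boats <- data.frame(price = c(100, 250, 80),
                    color = c("green", "red", "green"),
                    stringsAsFactors = FALSE)

# Each column keeps its own class
sapply(boats, class)
```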

7
Q

common functions to create matrices and data frames

A

cbind(), rbind(), matrix(), data.frame()

8
Q

cbind(), rbind()

A

cbind() and rbind() both create matrices by combining several vectors of the same length. cbind() combines vectors as columns, while rbind() combines them as rows
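
A quick sketch with two made-up vectors, x and y, to show the difference in shape.

```r
x <- 1:3
y <- 4:6

by_col <- cbind(x, y)  # x and y become columns: 3 rows, 2 columns
by_row <- rbind(x, y)  # x and y become rows: 2 rows, 3 columns

dim(by_col)
dim(by_row)
```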

9
Q

matrix()

A

The matrix() function creates a matrix from a single vector of data. The function has 4 main inputs: data, a vector of data; nrow, the number of rows you want in the matrix; ncol, the number of columns you want in the matrix; and byrow, a logical value indicating whether you want to fill the matrix by rows (the default, FALSE, fills it by columns).
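
A short sketch of how byrow changes the fill order, using the numbers 1 to 6 as stand-in data (m_col and m_row are hypothetical names).

```r
# Default (byrow = FALSE): values fill down each column in turn
m_col <- matrix(data = 1:6, nrow = 2, ncol = 3)

# byrow = TRUE: values fill across each row in turn
m_row <- matrix(data = 1:6, nrow = 2, ncol = 3, byrow = TRUE)

m_col[1, ]  # 1 3 5
m_row[1, ]  # 1 2 3
```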

10
Q

data.frame()

A

survey <- data.frame("index" = c(1, 2, 3, 4, 5),
                     "sex" = c("m", "m", "m", "f", "f"),
                     "age" = c(99, 46, 23, 54, 23))

11
Q

functions for previewing matrices and data frames

A
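
A minimal sketch of functions often used to preview a data frame, shown on R's built-in ToothGrowth data. The selection is an assumption: it combines the functions from card 1 with a few base R relatives (tail(), dim(), summary()).

```r
head(ToothGrowth)     # first six rows
tail(ToothGrowth)     # last six rows
dim(ToothGrowth)      # number of rows and columns
names(ToothGrowth)    # column names
str(ToothGrowth)      # structure: type of each column
summary(ToothGrowth)  # summary statistics for each column
```
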
12
Q

changing a column name in a data frame

A

names(df)[names(df) == "old.name"] <- "new.name"
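
A worked sketch of the same pattern on a small made-up data frame. The new name age.years is hypothetical.

```r
survey <- data.frame(index = 1:3, age = c(99, 46, 23))

# names(survey) == "age" is TRUE only for the column to rename,
# so only that one name is replaced
names(survey)[names(survey) == "age"] <- "age.years"

names(survey)  # "index" "age.years"
```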

13
Q

add a new column in a data frame

A

survey$sex <- c("m", "m", "f", "f", "m")

14
Q

slicing with [ , ]

A

# Return row 1
df[1, ]

# Return column 5
df[, 5]

# Return rows 1-5 of column 2
df[1:5, 2]

15
Q

slicing example

A

Give me the rows 1-6 and column 1 of ToothGrowth

ToothGrowth[1:6, 1]
##[1] 4.2 11.5 7.3 5.8 6.4 10.0

16
Q

slicing example 2

A

Give me rows 1-3 and columns 1 and 3 of ToothGrowth

ToothGrowth[1:3, c(1,3)]
##    len dose
## 1  4.2  0.5
## 2 11.5  0.5
## 3  7.3  0.5

17
Q

slicing example 3

A

Give me the 1st row (and all columns) of ToothGrowth

ToothGrowth[1, ]
##   len supp dose
## 1 4.2   VC  0.5

18
Q

slicing with logical vectors

A

# Create a new df with only the rows of ToothGrowth where supp equals VC
ToothGrowth.VC <- ToothGrowth[ToothGrowth$supp == "VC", ]

# Create a new df with only the rows of ToothGrowth where supp equals OJ and dose < 1
ToothGrowth.OJ.a <- ToothGrowth[ToothGrowth$supp == "OJ" & ToothGrowth$dose < 1, ]
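
A sketch of what happens inside the brackets: the comparison produces one TRUE or FALSE per row, and only the TRUE rows are kept. The name is_vc is a hypothetical helper.

```r
# One logical value per row of ToothGrowth
is_vc <- ToothGrowth$supp == "VC"
head(is_vc)

# Number of rows that match
sum(is_vc)

# Keep only the matching rows
ToothGrowth.VC <- ToothGrowth[is_vc, ]
nrow(ToothGrowth.VC)
```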

19
Q

subset()

A

Get rows of ToothGrowth where len < 20 AND supp == "OJ" AND dose >= 1

one of the most useful functions for working with data frames

subset(x = ToothGrowth,
       subset = len < 20 &
                supp == "OJ" &
                dose >= 1)
##     len supp dose
## 41 19.7   OJ    1
## 49 14.5   OJ    1
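
A related sketch: subset() also takes a select argument for choosing columns, which the card above does not use. The name oj_high is hypothetical.

```r
# Rows where supp is OJ and dose is at least 1, keeping only len and dose
oj_high <- subset(x = ToothGrowth,
                  subset = supp == "OJ" & dose >= 1,
                  select = c(len, dose))

names(oj_high)  # "len" "dose"
nrow(oj_high)
```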

20
Q

create a subset data frame

A

oj <- subset(x = ToothGrowth,
             subset = supp == "OJ")

21
Q

with()

A

The with() function saves typing when you use multiple columns from a dataframe. You specify the dataframe (or another suitable object, such as a list) once as the first argument; then, for every name in the expression that follows, R looks for it inside that dataframe first, so you can write column names without the dataframe$ prefix.
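
A side-by-side sketch on the built-in ToothGrowth data, showing that both forms give the same result (mean_dollar and mean_with are hypothetical names).

```r
# Without with(): the data frame name is repeated for every column
mean_dollar <- mean(ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 2])

# With with(): the data frame is named once
mean_with <- with(ToothGrowth, mean(len[supp == "OJ" & dose == 2]))

identical(mean_dollar, mean_with)  # TRUE
```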

22
Q

with() examples

A

health <- data.frame("age" = c(32, 24, 43, 19, 43),
                     "height" = c(1.75, 1.65, 1.50, 1.92, 1.80),
                     "weight" = c(70, 65, 62, 79, 85))

# Save typing by using with()
with(health, height / weight ^ 2)
##[1] 0.00036 0.00039 0.00039 0.00031 0.00025