Importing data into R Flashcards
function to load in CSV files
read.csv("data.csv", stringsAsFactors = FALSE)
The file must be in your working directory, or the path must be specified. stringsAsFactors defaulted to TRUE before R 4.0.0 and defaults to FALSE since; when TRUE, strings are imported as factors
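A minimal sketch of the card above, assuming a small made-up CSV written to a temporary file (the file contents and column names are illustrative, not from the deck):

```r
# Write a tiny illustrative CSV to a temporary file (hypothetical data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "ann,90", "bob,85"), csv_path)

# Keep text columns as character vectors.
as_chr <- read.csv(csv_path, stringsAsFactors = FALSE)

# Convert text columns to factors instead.
as_fct <- read.csv(csv_path, stringsAsFactors = TRUE)

class(as_chr$name)  # "character"
class(as_fct$name)  # "factor"
```

Setting the argument explicitly keeps the result the same regardless of which R version runs the code.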
List the files in your working directory
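The card does not name the function. As a sketch, base R offers list.files() (dir() is an alias for it), and getwd() shows which directory is being listed:

```r
# Show the current working directory.
wd <- getwd()

# List the files in it; dir() returns the same result.
files <- list.files()

# Optionally restrict the listing to CSV files with a regular expression.
csv_files <- list.files(pattern = "\\.csv$")
```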
import tab delimited data
read.delim(x, sep = "\t", header = TRUE) ("\t" is the tab character; both values shown are the defaults)
import any tabular data
read.table(x, sep = "", header = FALSE, stringsAsFactors = TRUE, col.names = "")
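To compare the two base functions, here is a small sketch that reads the same tab-delimited file both ways. The file and its contents are invented for illustration:

```r
# Write a small tab-delimited file (hypothetical data) to a temporary path.
tsv_path <- tempfile(fileext = ".txt")
writeLines(c("type\tweight", "russet\t1.5", "fingerling\t0.4"), tsv_path)

# read.delim assumes a tab separator and a header row by default.
d1 <- read.delim(tsv_path)

# read.table defaults to header = FALSE and sep = "" (any whitespace),
# so both are spelled out here.
d2 <- read.table(tsv_path, sep = "\t", header = TRUE)

names(d1)
names(d2)
```

read.delim is a wrapper around read.table with different defaults, so either call works once the separator and header are set.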
returns the index of the smallest value in a vector
ex: cars[which.min(cars$MPG), ] returns the row of the cars data frame that has the lowest MPG
an argument in the read.delim and read.table functions. Use it to specify the class of each column you are importing
ex: read.delim(x, sep = "", colClasses = c("character", "logical", "numeric"))
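A short sketch of the indexing pattern, using a made-up data frame (cars_df and its values are hypothetical stand-ins for the cars data on the card):

```r
# Hypothetical data frame standing in for the "cars" data on the card.
cars_df <- data.frame(
  automaker = c("A", "B", "C"),
  MPG = c(31, 18, 24)
)

idx <- which.min(cars_df$MPG)  # position of the smallest MPG value
lowest <- cars_df[idx, ]       # the full row at that position
```

Here idx is 2, because 18 is the smallest MPG, and lowest holds the whole row for automaker "B".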
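As an illustration of why column classes matter, the sketch below reads an invented file whose first column would lose its leading zeros if R guessed the type:

```r
# Hypothetical tab-delimited file with an ID column that has leading zeros.
path <- tempfile(fileext = ".txt")
writeLines(c("id\tflag\tvalue", "001\tTRUE\t3.5", "002\tFALSE\t4"), path)

# One class per column, in column order.
typed <- read.delim(path, colClasses = c("character", "logical", "numeric"))

sapply(typed, class)
```

Without colClasses, the id column would be read as a number and "001" would become 1.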
Hadley’s data import package
readr version of read.csv
loads data as a “tibble”
read.delim for readr
read_tsv (tab-separated values). ex: read_tsv("potatoes.txt", col_names = c("type", "weight"))
argument to specify the variable classes in readr package
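The card does not name the argument. In readr it is col_types, and one way to use it is a compact string with one letter per column. A sketch, assuming readr is installed and using an invented file:

```r
library(readr)  # assumes the readr package is installed

# Hypothetical CSV file written to a temporary path.
path <- tempfile(fileext = ".csv")
writeLines(c("id,flag,value", "001,TRUE,3.5", "002,FALSE,4"), path)

# "c" = character, "l" = logical, "d" = double, in column order.
typed <- read_csv(path, col_types = "cld")
```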
the main import function in the readr package. Similar to read.table
Must specify the file and delim arguments
ex: read_delim("cars.txt", delim = "\t", col_names = c("automaker", "mpg"))
skip rows in your import functions.
ex: skip = 5 will skip the first 5 rows and then begin reading in data
specifies the number of rows you want to read in, often used with skip
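ex: read_delim("cars.txt", delim = "/", skip = 2, n_max = 3)
skips the first two lines and reads the next three (lines 3, 4, and 5). With the default col_names = TRUE, line 3 is instead used as the header and lines 4 to 6 are read as data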
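A sketch of skip and n_max working together, assuming readr is installed. The file, its two note lines, and the car names are made up for illustration, and a tab delimiter is used here:

```r
library(readr)  # assumes the readr package is installed

# Hypothetical file: two note lines, then four tab-delimited data lines.
path <- tempfile(fileext = ".txt")
writeLines(c("note line 1", "note line 2",
             "ford\t21", "fiat\t32", "audi\t27", "saab\t25"), path)

# Supplying col_names means no line is consumed as a header,
# so lines 3 to 5 of the file become the three data rows.
cars_tbl <- read_delim(path, delim = "\t",
                       col_names = c("automaker", "mpg"),
                       skip = 2, n_max = 3)
```

The result is a tibble with three rows (ford, fiat, audi); the saab line is left unread because n_max caps the number of rows.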
Hadley’s Excel data import package
function in the readxl package that lists the sheets in an Excel file