V3 Flashcards

Data analysis using R

1
Q

data from R packages

A
  • can be loaded with the data() function

- example: data(USArrests)

2
Q

how to look at a small part of a data set (code)

A
  • head(dataset) - shows the first 6 rows by default
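
A minimal sketch of previewing a data set, using the built-in USArrests data. The n argument and tail() are additions that the card does not mention.

```r
# Load a data set that ships with R (datasets package).
data(USArrests)

# head() returns the first 6 rows by default.
first_rows <- head(USArrests)

# n controls how many rows are returned; tail() does the same from the end.
top3  <- head(USArrests, n = 3)
last2 <- tail(USArrests, n = 2)

print(top3)
```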
3
Q

read data (code)

A
  • read.table() and read.csv()

my data
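
A small sketch of both functions reading the same file. The file contents and the names csv_path, d1 and d2 are made up for illustration; a temporary file keeps the example self-contained.

```r
# Write a small CSV to a temporary file (hypothetical example data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,height,smoker",
             "Ann,1.70,TRUE",
             "Bob,1.82,FALSE"), csv_path)

# read.csv() assumes a header line and comma-separated fields.
d1 <- read.csv(csv_path)

# read.table() is the general function: header and sep are given explicitly.
d2 <- read.table(csv_path, header = TRUE, sep = ",")

str(d1)
```

read.csv() is essentially read.table() with defaults suited to comma-separated files, so both calls should give the same data frame here.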

4
Q

row.names (argument of read.table())

A

either a vector containing the row names,

or a single number giving the column of the table which contains the row names
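
A sketch of the two forms of row.names, assuming a small whitespace-separated file. The file contents and variable names are invented for illustration.

```r
# Hypothetical table with an id column followed by two value columns.
tab_path <- tempfile(fileext = ".txt")
writeLines(c("id a b",
             "r1 1 2",
             "r2 3 4"), tab_path)

# A single number: take the row names from that column of the file.
by_column <- read.table(tab_path, header = TRUE, row.names = 1)

# A vector: supply the row names directly; "id" stays an ordinary column.
by_vector <- read.table(tab_path, header = TRUE,
                        row.names = c("first", "second"))

print(by_column)
print(by_vector)
```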

5
Q

col.names (argument of read.table())

A

a vector giving the column names

6
Q

check.names (argument of read.table())

A

if TRUE, the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names
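
A sketch of what the check does, assuming a header with a space and a leading digit. The file contents are invented for illustration.

```r
# Hypothetical CSV whose column names are not valid R variable names.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("sample id,2nd value",
             "a,1",
             "b,2"), csv_path)

# check.names = TRUE is the default: names are repaired with make.names().
checked <- read.csv(csv_path)

# check.names = FALSE keeps the names exactly as they appear in the file.
unchecked <- read.csv(csv_path, check.names = FALSE)

names(checked)
names(unchecked)
```

Unchecked names have to be accessed with backticks or quotes, for example unchecked[["sample id"]].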

7
Q

skip (argument of read.table())

A

the number of lines of the data file to skip before beginning to read data
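
A sketch of skip, combined with col.names, assuming a file that starts with two lines of free text and has no header row. The file contents and column names are invented for illustration.

```r
# Hypothetical export: two descriptive lines, then the data without a header.
raw_path <- tempfile(fileext = ".txt")
writeLines(c("Exported by instrument X",
             "units: cm kg",
             "170 65",
             "182 80"), raw_path)

# Skip the two leading lines and name the columns ourselves.
d <- read.table(raw_path, skip = 2, header = FALSE,
                col.names = c("height", "weight"))

print(d)
```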

8
Q

how to define which classes we expect in the different columns (code)

A
  • numeric, logical, factor, character

- colClasses = c("character", "factor", …)
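
A sketch of colClasses in a full call, assuming a four-column CSV. The file contents are invented for illustration; class names are lower case.

```r
# Hypothetical CSV with an id that must keep its leading zeros.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,group,score,passed",
             "001,a,1.5,TRUE",
             "002,b,2.5,FALSE"), csv_path)

# One class per column, in column order.
# Without colClasses, "001" would be converted to the integer 1.
d <- read.csv(csv_path,
              colClasses = c("character", "factor", "numeric", "logical"))

sapply(d, class)
```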

9
Q

how to get the dimensions of a table (code)

A

dim()
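
A short sketch using the built-in USArrests data. nrow() and ncol() are additions that the card does not mention.

```r
data(USArrests)

# dim() returns the number of rows, then the number of columns.
d <- dim(USArrests)
print(d)

# nrow() and ncol() return the two values separately.
n_rows <- nrow(USArrests)
n_cols <- ncol(USArrests)
```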

10
Q

load text files on a line-by-line basis: why, and how (code)

A

readLines("my text.txt", n = 10) - read the first 10 lines
readLines("my text.txt", n = -1) - read everything
- for large amounts of row-wise data, the data might be too big to load at once
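
A sketch of reading a file in chunks, assuming an open file connection so that each readLines() call continues where the previous one stopped. The file, the chunk size of 10 and the counter are invented for illustration.

```r
# Hypothetical text file with 25 lines.
txt_path <- tempfile(fileext = ".txt")
writeLines(paste("line", 1:25), txt_path)

# Open a connection so that successive reads advance through the file.
con <- file(txt_path, open = "r")
total <- 0
repeat {
  chunk <- readLines(con, n = 10)   # next (up to) 10 lines
  if (length(chunk) == 0) break     # nothing left to read
  total <- total + length(chunk)    # placeholder for real per-chunk work
}
close(con)

# Reading by file name always starts from the top of the file.
first10    <- readLines(txt_path, n = 10)
everything <- readLines(txt_path, n = -1)
```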

11
Q

function to read a gzip-compressed (.gz) file

A

gzfile("data.gz", "r") - "r" for read ("w" for write)
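
A sketch of a write and read round trip through a gzfile() connection. The temporary file and its two lines are invented for illustration.

```r
gz_path <- tempfile(fileext = ".gz")

# "w": open for writing; the text is compressed on the way out.
con <- gzfile(gz_path, "w")
writeLines(c("first line", "second line"), con)
close(con)

# "r": open for reading; the text is decompressed on the way in.
con <- gzfile(gz_path, "r")
lines_back <- readLines(con)
close(con)

print(lines_back)
```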

12
Q

how to use a URL as a connection

A

url()
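
A sketch of passing a url() connection to a reading function. To stay runnable offline it assumes a file:// URL built from a temporary file with a Unix-style path; an http:// or https:// address would be used the same way. The file contents are invented for illustration.

```r
# Hypothetical local file standing in for a remote resource.
local_path <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2", "3,4"), local_path)

# Build a file:// URL (assumes an absolute Unix-style path).
file_url <- paste0("file://", normalizePath(local_path, winslash = "/"))

# url() creates the connection; read.csv() opens it, reads, and closes it.
con <- url(file_url)
d <- read.csv(con)

print(d)
```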

13
Q

how to read an Excel file

A
  • no native support for xlsx in R
  • available through libraries: xlsx - uses Java to read and write xlsx files
  • slow and unreliable
  • just export xlsx files as csv
14
Q

loading binary files

A
readBin()
- example: an image is a 2-dimensional array of pixels
- each pixel is defined by 3 bytes (B, G, R)
- remove the header before interpreting the bytes read with readBin()
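
A sketch of the idea with a made-up binary file. The 4-byte header, the 2 x 2 image size and the pixel values are assumptions for illustration and do not follow any real image format.

```r
bin_path <- tempfile(fileext = ".bin")

# Fake file: a 4-byte header followed by 4 pixels of 3 bytes each (B, G, R).
header <- as.raw(c(0x42, 0x4d, 0x00, 0x00))
pixels <- as.raw(c(255, 0, 0,    0, 255, 0,
                   0, 0, 255,    10, 20, 30))
writeBin(c(header, pixels), bin_path)

# Read the whole file as raw bytes.
bytes <- readBin(bin_path, what = "raw", n = file.size(bin_path))

# Drop the header, then convert the remaining bytes to integers 0..255.
body <- as.integer(bytes[-(1:4)])

# One row per pixel, columns B, G, R.
px <- matrix(body, ncol = 3, byrow = TRUE,
             dimnames = list(NULL, c("B", "G", "R")))
print(px)
```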
15
Q

%in%

A
  • allows you to filter: A %in% B
  • returns a logical vector telling which elements of A are also in B
    AinB <- A %in% B
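
A sketch with two made-up character vectors; the gene names are invented for illustration.

```r
A <- c("geneA", "geneB", "geneC", "geneD", "geneE")
B <- c("geneD", "geneA", "geneC", "geneX")

# One TRUE/FALSE per element of A: is it found anywhere in B?
AinB <- A %in% B
print(AinB)

# Use the logical vector to keep only the matching elements of A.
shared <- A[AinB]
print(shared)
```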
16
Q

which function

A

returns the indices of the TRUE elements of a logical vector

before: TRUE, FALSE, TRUE, TRUE, FALSE
code: which(AinB)
after: 1, 3, 4
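
The same example as runnable code, plus a made-up numeric vector x to show which() applied to a condition.

```r
AinB <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

# Positions of the TRUE values.
idx <- which(AinB)
print(idx)

# which() is often applied directly to a condition (x is hypothetical data).
x <- c(5, 12, 7, 20)
big <- which(x > 10)
print(big)
```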

17
Q

subset()

A
  • creates a subset

subset(airquality, Temp > 80, select = c(Ozone, Temp))
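
A runnable sketch with the built-in airquality data. The bracket-indexing line is an addition for comparison and is not on the card.

```r
data(airquality)

# Rows where Temp > 80, keeping only the Ozone and Temp columns.
hot <- subset(airquality, Temp > 80, select = c(Ozone, Temp))
head(hot)

# Roughly the same selection with bracket indexing.
hot2 <- airquality[airquality$Temp > 80, c("Ozone", "Temp")]
```

One difference worth knowing: subset() drops rows where the condition is NA, while bracket indexing keeps them as NA rows.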

18
Q

saving data

A

write.table(data, file = "data.txt", sep = "\t", row.names = FALSE)
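
A sketch of a write and read round trip. The data frame, the temporary output path and quote = FALSE are assumptions added for illustration.

```r
# Hypothetical data frame to save.
data <- data.frame(id = 1:3, name = c("a", "b", "c"))

# Write to the session's temporary directory rather than the working directory.
out_path <- file.path(tempdir(), "data.txt")

# Tab-separated, no row names; quote = FALSE leaves strings unquoted.
write.table(data, file = out_path, sep = "\t",
            row.names = FALSE, quote = FALSE)

# Read the file back to check the result.
back <- read.table(out_path, header = TRUE, sep = "\t")
print(back)
```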

19
Q

biomaRt

A
  • a community-driven project to provide unified access to distributed research data to facilitate the scientific discovery process
  • most, if not all, biologically relevant databases
20
Q

important concepts in biomaRt

A

mart - link to a database: listMarts()
attributes - what do we want to retrieve: listAttributes()
filters - what do our values mean: listFilters()