Week 10 Flashcards
Which of the following tools are used for differential gene expression analysis?
A. DESeq2
B. edgeR
C. gprofiler2
D. MAGIC
A and B
Which command is used to load a package in R?
A. call
B. import
C. library
D. View
C
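As a minimal sketch of the answer above, library() attaches an installed package so its functions can be called by name. The stats package is used here only because it ships with R; any installed package name is assumed to work the same way.

```r
# Attach a package so its functions can be called by name.
# 'stats' ships with base R, so this runs without installing anything.
library(stats)

# Check that the package is now on the search path.
attached <- "package:stats" %in% search()
print(attached)
```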
Which command is used to set the working directory in R?
A. getwd
B. os.chdir
C. setwd
D. pathwd
C
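A short sketch of how setwd() and getwd() work together. The use of tempdir() is an assumption made only so the example has a directory that is guaranteed to exist.

```r
# Remember the current working directory so it can be restored later.
old_wd <- getwd()

# Switch to a directory that is known to exist (the session temp folder).
setwd(tempdir())
new_wd <- getwd()
print(new_wd)

# Restore the original working directory.
setwd(old_wd)
```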
Which command can be used to list only the .csv files in a folder in R?
A. list.files()
B. list.files(pattern = "*.csv")
C. dir
D. dir.files()
B
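The sketch below illustrates filtering with list.files(). The pattern argument is interpreted as a regular expression, so an anchored pattern such as "\\.csv$" is a stricter way to express the same filter. The folder and file names are hypothetical and created only for the demonstration.

```r
# Create a scratch folder with mixed file types (hypothetical file names).
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("counts.csv", "meta.csv", "notes.txt")))

# 'pattern' is a regular expression; "\\.csv$" matches names ending in ".csv".
csv_files <- list.files(demo_dir, pattern = "\\.csv$")
print(csv_files)
```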
Which command reports the version of an installed package in R?
A. packageVersion
B. version
C. version.package
D. package --version
A
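A brief sketch of packageVersion() in use. The stats package and the "3.0.0" threshold are arbitrary choices for illustration.

```r
# Query the installed version of a package ('stats' ships with R).
v <- packageVersion("stats")
print(v)

# The returned object can be compared against a version string.
is_recent <- v >= "3.0.0"
print(is_recent)
```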
You have a CSV file named 'data.csv', and you want to read it in R using the 1st column as row names. What
should be the command? (Assume that you are already in the working directory, so you don't need to give
the full path.)
A. read.csv('data.csv', row.names = 1)
B. read.csv('data.csv', col.names = 2)
C. pd.read_csv('data.csv', usecols = 1)
D. read_csv('data.csv', rownames = 1)
A
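To see what row.names = 1 does, the sketch below writes a tiny CSV to a temporary location and reads it back. The gene and sample names are made up for illustration.

```r
# Write a small CSV to a temporary path (contents are hypothetical).
csv_path <- file.path(tempdir(), "data.csv")
writeLines(c("gene,sample1,sample2",
             "geneA,10,12",
             "geneB,0,3"), csv_path)

# row.names = 1 uses the first column as row names instead of as data.
df <- read.csv(csv_path, row.names = 1)
print(df)
```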
Which of these tools is used for GC-content normalization?
A. edgeR
B. clusterProfiler
C. EDASeq
D. DESeq2
C
Suppose we want to know the expression difference between healthy and cancerous samples, and we also
know that the major sources of variation include sex and age. What should the design formula be for
differential gene expression analysis?
A. design = ~ sex + age + Sample type
B. design = ~ Sample type
C. design = ~ sex + age
D. design = ~ sex + Sample type
A
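One way to see what such a design formula encodes is to expand it with base R's model.matrix(). This is only a sketch: the sample table is invented, and the column is named sample_type because a name containing a space is not syntactic in an R formula. Placing the variable of interest last follows the usual DESeq2 convention.

```r
# Hypothetical sample table; column names and values are assumptions.
coldata <- data.frame(
  sex = factor(c("F", "M", "F", "M")),
  age = c(34, 51, 47, 62),
  sample_type = factor(c("healthy", "healthy", "cancer", "cancer"),
                       levels = c("healthy", "cancer"))
)

# Known sources of variation first, variable of interest last.
design <- ~ sex + age + sample_type

# Expand the formula into the columns a linear model would estimate.
mm <- model.matrix(design, data = coldata)
print(colnames(mm))
```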
Suppose you have drawn an MA plot and want to zoom in along the y axis, restricting log2(fold change) to
values between -2 and 2 so you can see the points in more detail. Which of the options below is correct to
run in R?
A. plotMA(res, ylim=c(-2,2))
B. plotMA(res, xlim=c(-2,2))
C. plotMA(res, ylim=c(-2,2), xlim=c(-2,2))
D. plotMA(res)
A
Which of these tools is used to remove unwanted variation from count data?
A. DESeq2
B. RUVSeq
C. SCDE
D. edgeR
B
You have a data file named 'count_data.txt', which contains the count data of 5000 yeast genes and some
ERCC spike-in genes. Suppose you want to see all the ERCC spike-in names that are mixed in with the gene
names. You have read count_data.txt and saved it in a variable named 'data', and all the spike-in and
gene names are present as the row names of 'data'. How will you extract only the ERCC spike-in names from
'data'? [Spike-in names start with "ERCC".]
Choose the correct option:
A. rownames(data)
B. rownames(data)[grep("ERCC", rownames(data))]
C. grep("ERCC")
D. rownames(data)[grep("ERCC")]
B
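The selection above can be tried on a toy matrix. The row names below are invented stand-ins for yeast genes and ERCC spike-ins; the anchored variant with "^ERCC" is an optional stricter form, not part of the original answer.

```r
# Toy count matrix with hypothetical gene and spike-in row names.
data <- matrix(0, nrow = 4, ncol = 2,
               dimnames = list(c("YAL001C", "ERCC-00002", "YBR020W", "ERCC-00130"),
                               c("s1", "s2")))

# grep() returns the indices of matching row names; use them to subset.
spike_names <- rownames(data)[grep("ERCC", rownames(data))]
print(spike_names)

# Stricter variant: require the name to start with "ERCC" and return values directly.
spike_names_anchored <- grep("^ERCC", rownames(data), value = TRUE)
```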
Which R package provides the heatmap.2 function?
A. heatmap
B. gplots
C. pheatmap
D. heatmap.2
B
Suppose you are drawing a volcano plot without the help of any plotting package, and you want all points
that are significant and have |log2(FoldChange)| > 2 (in both the up and down directions) shown in blue
with transparency. What should the color argument be when using the plot function in R?
A. col = rgb(1,0,0,0.3)
B. col = rgb(0,1,0,0.3)
C. col = rgb(0,0,1,0.3)
D. col = rgb(1,0,0,0)
C
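The arguments of rgb() are red, green, blue and alpha, each between 0 and 1 by default. A small sketch that decodes the chosen color back into its channels:

```r
# Full blue, no red or green, 30% opacity.
blue_transparent <- rgb(0, 0, 1, 0.3)

# Decode the color into 0-255 channel values, including alpha.
channels <- col2rgb(blue_transparent, alpha = TRUE)
print(channels)
```

An alpha of 0, as in option D, would make the points fully invisible.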
In principal component analysis, which principal component captures the most variance in the data?
A. PC2
B. PC1
C. UMAP1
D. Both PC1 and PC2
B
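This ordering can be checked with base R's prcomp(). The data below are simulated purely for illustration, with one column deliberately scaled up so that a dominant direction exists.

```r
# Simulated data: 50 observations, 4 variables (illustrative only).
set.seed(1)
x <- matrix(rnorm(200), nrow = 50, ncol = 4)
x[, 1] <- x[, 1] * 5  # give one variable much larger spread

# Principal component analysis; components are ordered by variance.
pca <- prcomp(x)

# Proportion of total variance captured by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))
```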
Which package is used for the classification and clustering of RNA sequencing data using the Poisson
model?
A. RUVSeq
B. PoiClaClu
C. EDASeq
D. DESeq2
B