Translational Bioinformatics (Sandra Hellberg) Flashcards

1
Q

RNA-sequencing analysis pipeline

A
2
Q

Raw count matrix

A
3
Q

Filtering and normalization of RNA-seq data

A

Filtering: We need to remove genes that show no expression in our data. The count matrix will contain many genes with 0 counts in every sample. Removing genes with low counts also reduces the multiple testing problem: with 20,000 genes you run 20,000 statistical tests, which can produce many false positives. Genes that are unexpressed in all samples carry no biological information for the analysis, so low-count genes are removed.
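
The filtering step and the multiple testing argument can be sketched in a few lines of Python. This is only an illustration: the toy count matrix, the gene names, the function names, and the `min_total=10` cutoff are assumptions made for the example, not values from the course. The 0.05 significance level is likewise an assumed, conventional choice.

```python
# Hypothetical toy count matrix: gene name -> raw read counts per sample.
counts = {
    "geneA": [0, 0, 0, 0],          # unexpressed in all samples
    "geneB": [5, 0, 1, 2],          # low counts
    "geneC": [120, 98, 143, 110],   # clearly expressed
    "geneD": [0, 1, 0, 0],          # low counts
}


def filter_low_counts(counts, min_total=10):
    """Keep genes whose summed count across samples is at least min_total.

    min_total is an assumed cutoff chosen for illustration only.
    """
    return {gene: c for gene, c in counts.items() if sum(c) >= min_total}


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha


kept = filter_low_counts(counts)
print(sorted(kept))                      # genes that survive filtering
print(expected_false_positives(20000))   # tests run on all 20,000 genes
print(expected_false_positives(len(kept)))
```

With 20,000 tests at a 0.05 threshold, roughly 1,000 false positives are expected when no gene is truly differentially expressed, so every gene removed before testing lowers that number.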

4
Q

Batch effects and biological confounders

A
5
Q

Batch correction and covariates

A
6
Q

Properties of RNA-seq data

A
7
Q

Differential expression analysis

A
8
Q

Multiple testing problem

A
9
Q

False positives (calculate fraction of false positives)

A
10
Q

Multiple testing correction (FWER, Bonferroni, Benjamini Hochberg, FDR)

A
11
Q

Nominal p-values, adjusted p-values, q-values

A
12
Q

High-dimensional data analysis (PCA, MDS, SVD, t-SNE, k-means, hierarchical clustering)

A

High-dimensional data refers to a dataset in which the number of features is larger than the number of observations. Such data are very large and computationally heavy to process, and they cannot be handled in Excel. Much of this sequencing data cannot be understood by inspecting it directly.
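
As a minimal sketch of the definition above, the check below compares the number of features (genes) with the number of observations (samples). The matrix layout (one row per sample, one column per gene), the function name, and the toy values are assumptions made for illustration.

```python
def is_high_dimensional(matrix):
    """Return True when there are more features (columns) than observations (rows).

    Assumes a non-empty, rectangular list of rows, one row per sample.
    """
    n_observations = len(matrix)
    n_features = len(matrix[0])
    return n_features > n_observations


# Hypothetical expression matrix: 3 samples (rows) x 5 genes (columns).
expression = [
    [10, 0, 3, 7, 1],
    [12, 1, 2, 9, 0],
    [8, 0, 4, 6, 2],
]

print(is_high_dimensional(expression))
```

A typical RNA-seq experiment fits this definition easily: tens of thousands of genes are measured in a handful of samples.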

13
Q

Biological pathways

A
14
Q

Gene enrichment analysis (pathway and gene set)

A
15
Q

Disease enrichment

A
16
Q

Different annotation databases (KEGG, GO, DisGeNET, Reactome)

A
17
Q

Measures of enrichment (effect size, fold enrichment, odds ratio)

A
18
Q

Fisher’s exact test

A