Microarray Flashcards
Details about the microarray chosen
Technique for transcriptome profiling –> mRNA
Thermo Fisher Gene 2.0 ST array
- whole-transcript array, which means it looks at 28,000 coding and 7,000 non-coding transcripts
-different types of non-coding RNA (including the more recently characterized long intergenic non-coding RNA)
-databases: RefSeq, Ensembl, lncRNA, GenBank
-median of 22 unique probes per transcript. Each probe is 25-mer (bases). Total 698,000 probes
-controls:
Poly-A controls: dap, lys, phe, thr genes
Hybridization controls: bioB, bioC, bioD, creX genes
How is the microarray data statistically analyzed?
Uses LIMMA (Linear Models for Microarray Data) –> reads, normalizes, and does differential expression analysis for large-scale studies; hierarchical model that operates on a matrix of expression values
○ Statistical test –> Empirical Bayes method (limma's eBayes function): used to rank genes in order of evidence for differential expression
What is a gene ontology analysis?
“Given a list of genes found to be differentially expressed in my phenotype (e.g. disease) vs. control (e.g. healthy), what are the biological processes, cellular components and molecular functions that are implicated in this phenotype?”
The premise here is that if many of the genes associated with a given biological process are differentially expressed in the given disease, that biological process is implicated in that disease.
gene ontology analysis: Over-representation analysis (ORA) or enrichment analysis
gene ontology analysis: Functional Class Scoring (FCS)
gene ontology methods: elim and weight
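The over-representation idea can be sketched in Python for a single GO term. This is a minimal sketch, assuming a one-sided hypergeometric test (a common choice for ORA, not necessarily what any particular tool uses). The function name, argument names, and gene counts are hypothetical and only for illustration.

```python
from math import comb


def ora_p_value(n_background, n_in_term, n_de, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_background: genes measured on the array (the "universe")
    n_in_term:    background genes annotated to the GO term
    n_de:         differentially expressed (DE) genes
    n_overlap:    DE genes that are annotated to the GO term
    """
    max_overlap = min(n_in_term, n_de)
    total = comb(n_background, n_de)  # all ways to pick the DE list
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 genes measured, 5 in the term,
# 4 DE genes, 3 of which fall in the term.
p = ora_p_value(n_background=20, n_in_term=5, n_de=4, n_overlap=3)
print(f"{p:.4f}")  # 0.0320
```

A small p-value means the term contains more DE genes than expected by chance. In a real analysis this is repeated for every term and the p-values are corrected for multiple testing.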
Fold change vs. log2 fold change
Prefer log2 fold change because of the symmetry: +1 is twofold up, -1 is twofold down, etc. But many biologists are not comfortable thinking in log space and prefer plain fold changes. Either way, it’s the same information.
If you want to report non-log fold changes but still preserve the symmetry, you can convert “2” to “2-fold up” and “0.5” to “2-fold down”.
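A small Python sketch of that symmetric labelling, assuming fold changes are positive ratios. The function name `describe_fold_change` and the "no change" label are hypothetical choices made for this example.

```python
def describe_fold_change(fc):
    """Turn a plain fold change into a symmetric 'N-fold up/down' label."""
    if fc <= 0:
        raise ValueError("fold change must be a positive ratio")
    if fc == 1:
        return "no change"
    if fc > 1:
        return f"{fc:g}-fold up"
    # Ratios below 1 are reported as the reciprocal, going down.
    return f"{1 / fc:g}-fold down"


print(describe_fold_change(2))     # 2-fold up
print(describe_fold_change(0.5))   # 2-fold down
print(describe_fold_change(0.25))  # 4-fold down
```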
A fold change describes the ratio of two values (not the difference). i.e. (expression condition 1)/(expression condition 2)
The log2 fold changes are the log-of-the-fold-changes i.e. log2(condition1/condition2)
Because log(A/B) = log (A) - log(B), many statistical programs will calculate the Log2FC = log2(condition1) - log2(condition2), but this is mathematically identical to Log2FC = log2(condition1/condition2)
Biostats programs will often estimate log2(condition1) using mean(log2(condition1)). This is equivalent to taking the geometric mean of the original data. Thus, Log2FC = mean(log2(condition1)) - mean(log2(condition2)) is the same as Log2FC = log2(geo_mean(condition1)/geo_mean(condition2)).
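The equivalence between the mean of the logs and the log of the geometric means can be checked with a short Python sketch. The expression values are made up for illustration, and the function names are hypothetical; `statistics.geometric_mean` requires Python 3.8 or later.

```python
from math import log2
from statistics import geometric_mean, mean


def log2fc_from_logs(cond1, cond2):
    """Log2FC as a difference of mean log2 values."""
    return mean([log2(x) for x in cond1]) - mean([log2(x) for x in cond2])


def log2fc_from_geo_means(cond1, cond2):
    """Log2FC as log2 of the ratio of geometric means."""
    return log2(geometric_mean(cond1) / geometric_mean(cond2))


# Made-up replicate expression values for one gene.
condition1 = [4, 16, 64]
condition2 = [2, 4, 8]

a = log2fc_from_logs(condition1, condition2)
b = log2fc_from_geo_means(condition1, condition2)
print(a, b)  # the two values agree up to floating-point rounding
```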
Log2 fold changes are usually what gets plotted in graphs, because they center around 0: reductions get a negative value and increases a positive value.
Log2 fold change values (e.g. 1, 2, or 3) can be converted to fold changes by taking 2^1, 2^2, or 2^3 = 2, 4, or 8.
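The conversion from log2 fold change back to fold change is just a power of two. A minimal Python sketch follows; the function name is hypothetical.

```python
def log2fc_to_fold_change(log2fc):
    """Convert a log2 fold change back to a plain fold change (ratio)."""
    return 2.0 ** log2fc


for value in (1, 2, 3, -1):
    print(value, log2fc_to_fold_change(value))
# 1 2.0
# 2 4.0
# 3 8.0
# -1 0.5
```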
You can interpret fold changes as follows:
If there is a twofold increase (fold change = 2, Log2FC = 1) between A and B, then A is twice as big as B (or A is 200% of B).
If there is a two fold decrease (fold change = 0.5, Log2FC= -1) between A and B, then A is half as big as B (or B is twice as big as A, or A is 50% of B).