Tonon: Lecture III Flashcards
Hierarchical Clustering and Partitioning Methods
What are the 2 types of clustering?
hierarchical clustering and partitioning
What is hierarchical clustering?
a clustering method that builds a tree (dendrogram) in which the length of the branches reflects the degree of similarity between objects
What is a partitioning method?
a clustering method in which the number of clusters is chosen in advance
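A minimal sketch of a partitioning method, assuming k-means on 1-D values. The flashcards do not name a specific algorithm, so k-means is used here only as one common example. The function name `kmeans_1d` and the data are hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Partition 1-D values into k clusters; k is chosen up front."""
    # Deterministic start: spread the initial centers across the sorted values.
    ordered = sorted(values)
    step = (len(ordered) - 1) / max(k - 1, 1)
    centers = [ordered[round(i * step)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments no longer change
        centers = new_centers
    return clusters


# Hypothetical expression values forming two obvious groups.
clusters = kmeans_1d([1.0, 1.2, 0.8, 5.0, 5.3, 4.9], k=2)
```

Unlike hierarchical clustering, the result depends directly on the value of `k` passed in.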
What is a node?
the point in the tree where 2 very similar rows are combined to form a single entity
What do the shortest branches of the tree represent?
genes with the highest correlation
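A minimal sketch of the first merge in hierarchical clustering, assuming the distance between two genes is 1 minus their Pearson correlation (a common choice, not confirmed by the flashcards). The gene names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def closest_pair(expression):
    """Return the pair of genes with the smallest distance 1 - r."""
    return min(
        combinations(sorted(expression), 2),
        key=lambda pair: 1 - pearson(expression[pair[0]], expression[pair[1]]),
    )


# Hypothetical rows: one expression profile per gene.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 1.0, 3.5, 0.5],
}
first_merge = closest_pair(expression)
```

The pair returned here is the one that would be joined by the shortest branch; a full tree repeats this step, treating each merged pair as a single entity.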
What is Genomic Nonnegative Matrix Factorization?
a tool originally developed for face recognition; the Professor applied it to genomic data to identify the number of subsets in a specific dataset
In relation to genomic data, what did Nonnegative Matrix Factorization provide?
a measure of reliability (the cophenetic correlation), i.e. how robust the chosen number of clusters is
How do we define whether 2 genes are differentially expressed?
fold-change
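A minimal sketch of a fold-change calculation, assuming it is reported on the log2 scale as the ratio of mean expression between two conditions. The function name and the values are hypothetical.

```python
from math import log2


def log2_fold_change(treated, control):
    """log2 of the ratio of mean expression (treated over control)."""
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return log2(mean_treated / mean_control)


# Hypothetical values: expression is 4 times higher in the treated samples.
lfc = log2_fold_change([8.0, 8.0], [2.0, 2.0])
```

A positive value means up-regulation and a negative value means down-regulation; any cutoff applied to it is a choice made by the analyst.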
What is Student's t-test?
a statistical significance test that compares the means of two groups
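A minimal sketch of the t statistic for two independent groups, assuming the classic pooled-variance (equal variance) form. Only the statistic is computed; turning it into a p-value needs the t distribution, which the Python standard library does not provide. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance t statistic for two independent samples."""
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical expression values for one gene in two conditions.
t = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

The further the statistic is from zero, the less likely the difference in means is due to chance.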
Why would we adjust the p-value?
to make it more reliable by accounting for the false positives that arise by chance when many tests are run
What is the most brutal adjusted p-value method?
Bonferroni, because its strict threshold often leaves no significant results
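A minimal sketch of the Bonferroni adjustment: each p-value is multiplied by the number of tests and capped at 1. The p-values below are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1.0."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]


# Hypothetical raw p-values from four tests; three are below 0.05.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = bonferroni(raw)
significant = [p <= 0.05 for p in adjusted]
```

With only four tests, two of the three raw hits are already lost; with thousands of genes the multiplier is so large that often nothing survives.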
What are the most used p-value thresholds for t-tests?
0.05 (normal)
0.01 (strict)
What is peculiar about this graph?
all the red dots are on the same side, which means those genes are all down-regulated; this is very rare, because every pathway has activators and inhibitors that would normally produce a mixed distribution
Explain the Graph
it is used to show differences in gene expression
in enriched, all the genes are skewed to one of the ends
in unenriched, all the genes stay closer to the line
the graph is obtained by placing the genes of a set (e.g. DNA repair or cell cycle) along the ranked list, without applying a threshold
Why is GSEA important?
we can obtain meaningful results that we would not get with fold-change alone
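A minimal sketch of the running sum behind a GSEA plot, assuming the simplified unweighted form: walk down the ranked gene list, step up at each gene in the set and down at each gene outside it, and report the largest deviation from zero. Published GSEA weights each step by the gene's ranking score, which is omitted here. The gene names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Largest deviation from zero of an unweighted running sum."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up for a gene in the set, down for a gene outside it.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list, most up-regulated gene first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
enriched = enrichment_score(ranked, {"g1", "g2"})    # set skewed to one end
unenriched = enrichment_score(ranked, {"g1", "g4"})  # set spread along the list
```

Every gene in the ranked list contributes, so no fold-change threshold is needed; a set skewed to one end gives a larger score than a set spread along the list.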