Omics experiments and algorithms Flashcards

Question 1

Q

Timeseries experiments

Answer

A

Take a cell or tissue sample
Apply some change to the environment
Take n samples at given time points
Measure each sample
Analyse how things changed over time
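
As a rough illustration of the final analysis step, one common way to express change over time is the log2 fold change of each time point relative to the first sample. The sketch below assumes a single gene, made-up expression values, and a pseudocount of 1 to avoid division by zero; none of these come from the lecture.

```python
import math

def log2_fold_changes(series, pseudocount=1.0):
    """Return log2((value_t + pc) / (value_0 + pc)) for every time point."""
    baseline = series[0] + pseudocount
    return [math.log2((value + pseudocount) / baseline) for value in series]

# Hypothetical measurements for one gene at successive time points.
expression = [7.0, 15.0, 31.0, 7.0]
changes = log2_fold_changes(expression)
print(changes)
```

A value of 1 means the measurement doubled relative to the first time point, and 0 means it returned to baseline.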

Question 2

Q

Cell type experiments

Answer

A

Take one or more tissue samples
Extract similar cells by morphology or fluorescent tagging
Measure each cell group (proteome, metabolome, transcriptome)
Analyse how things are different between cells

Question 3

Q

Spatial analysis experiments

Answer

A

Take a tissue sample
Either carry out in-situ hybridisation with probes, or microdissection followed by sequencing/hybridisation
Measure each sample
Categorise where each sample came from

Question 4

Q

Applications of dendrograms

Answer

A

Phylogeny
Clustering biological entities

Question 5

Q

How do we measure distance?

Answer

A

Number of substitutions
Estimate the distance given observed differences and apply a nucleotide/amino acid substitution model
Euclidean/Hamming/cosine distance between feature vectors
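
The measures listed above can be sketched in a few lines of Python. The card does not name a particular substitution model, so the Jukes-Cantor correction is used here purely as one well-known example; the function names and the sequences are illustrative assumptions.

```python
import math

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def jukes_cantor(a, b):
    """Estimated substitutions per site: d = -3/4 * ln(1 - 4p/3),
    where p is the observed proportion of differing sites."""
    p = hamming(a, b) / len(a)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def euclidean(u, v):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def cosine_distance(u, v):
    """1 minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical aligned sequences differing at two of eight sites.
seq1, seq2 = "ACGTACGT", "ACGTTCGA"
observed = hamming(seq1, seq2)
corrected = jukes_cantor(seq1, seq2)
print(observed, corrected)
```

The corrected distance is larger than the observed proportion of differences because the model allows for multiple substitutions at the same site.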

Question 6

Q

Distance matrix

Answer

A

An all-against-all matrix which catalogues all scores and measures how far apart all pairs of entities are. All scores on the diagonal must be zero.
A distance measure can be used as is
A similarity measure must first be converted into a distance (e.g. 1 - s for similarities scaled to [0, 1])
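
A minimal sketch of building such a matrix follows. It assumes the distance function is symmetric, and that similarities are scaled to [0, 1] with a self-similarity of 1, so that 1 - s gives zeros on the diagonal. The helper names and example sequences are hypothetical.

```python
def distance_matrix(items, distance):
    """All-against-all matrix for a symmetric distance function."""
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]  # the diagonal stays at zero
    for i in range(n):
        for j in range(i + 1, n):
            d = float(distance(items[i], items[j]))
            matrix[i][j] = matrix[j][i] = d
    return matrix

def similarity_to_distance(similarity):
    """Turn similarities in [0, 1] into distances via 1 - s."""
    return [[1.0 - s for s in row] for row in similarity]

# Hamming distance used as the example measure.
sequences = ["ACGT", "ACGA", "TCGA"]
by_hamming = distance_matrix(
    sequences, lambda a, b: sum(x != y for x, y in zip(a, b))
)
from_similarity = similarity_to_distance([[1.0, 0.75], [0.75, 1.0]])
print(by_hamming)
print(from_similarity)
```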

Question 7

Q

Tree clustering algorithms

Answer

A

Distance-based (UPGMA)
Maximum parsimony trees
Maximum likelihood trees

Question 8

Q

What is UPGMA?

Answer

A

Unweighted Pair-Group Method with Arithmetic mean
Unweighted: All pairwise distances contribute equally
Pair-group: Groups are combined in pairs
Arithmetic mean: The distance between two groups is the mean of the pairwise distances between all their members

Question 9

Q

How does UPGMA work?

Answer

A

Form a cluster for each leaf node
Find the two closest clusters, C1 and C2, using the average distance between clusters
Merge C1, C2 into a single cluster C
Form a node for C, connecting it to C1 and C2. Set the age of C as Davg(C1,C2)/2
Eliminate the rows and columns for C1 and C2 in the distance matrix D, add a row/column for C and compute the average distances from C to every remaining cluster
Repeat from the second step until a single cluster containing all leaves remains
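
The steps above can be sketched in plain Python as follows. This is a simple illustrative implementation, not an optimised one: the dictionary-based node layout, the `to_newick` helper and the example matrix are all assumptions made for the sketch. The size-weighted update in the last step is equivalent to averaging over all pairs of members.

```python
def upgma(labels, dist):
    """Build a UPGMA tree from labels and a symmetric distance matrix."""
    n = len(labels)
    # One cluster per leaf, each with age 0.
    clusters = {i: {"name": labels[i], "children": [], "age": 0.0, "size": 1}
                for i in range(n)}
    # Average distances between current clusters, keyed by (smaller id, larger id).
    d = {(i, j): float(dist[i][j]) for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Find the two closest clusters.
        (a, b), d_ab = min(d.items(), key=lambda kv: kv[1])
        ca, cb = clusters.pop(a), clusters.pop(b)
        # Merge them into a new node whose age is half their distance.
        merged = {"name": None, "children": [ca, cb],
                  "age": d_ab / 2.0, "size": ca["size"] + cb["size"]}
        # Remove entries for a and b; add averaged distances for the new cluster.
        del d[(a, b)]
        for k in clusters:
            d_ak = d.pop((min(a, k), max(a, k)))
            d_bk = d.pop((min(b, k), max(b, k)))
            d[(k, next_id)] = (ca["size"] * d_ak + cb["size"] * d_bk) / merged["size"]
        clusters[next_id] = merged
        next_id += 1
    return next(iter(clusters.values()))

def to_newick(node):
    """Serialise the tree; branch length = parent age - child age."""
    if not node["children"]:
        return node["name"]
    parts = [f"{to_newick(child)}:{node['age'] - child['age']:g}"
             for child in node["children"]]
    return "(" + ",".join(parts) + ")"

# Hypothetical distances between four entities.
labels = ["A", "B", "C", "D"]
matrix = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
root = upgma(labels, matrix)
print(to_newick(root) + ";")
```

In this example A and B merge first, C joins them next, and D joins last, so the root age is half the largest distance in the matrix.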

Question 10

Q

UPGMA properties

Answer

A

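Time complexity O(n^2 log n) when a priority queue is used to find the closest pair (O(n^3) for a naive implementation)
A unique tree (provided there are no ties between distances)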
A rooted tree
An ultrametric tree (all the leaves are equidistant from the root)
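
Because the resulting tree is ultrametric, the distances it implies satisfy the three-point condition: for every triple of leaves, the two largest of the three pairwise distances are equal. The sketch below checks whether an input matrix already has this property, which indicates whether a UPGMA tree can reproduce its distances exactly. The function name, tolerance and matrices are illustrative assumptions.

```python
from itertools import combinations

def is_ultrametric(dist, tol=1e-9):
    """True if, for every triple, the two largest pairwise distances are equal."""
    n = len(dist)
    for i, j, k in combinations(range(n), 3):
        _, middle, largest = sorted((dist[i][j], dist[i][k], dist[j][k]))
        if abs(largest - middle) > tol:
            return False
    return True

# Distances consistent with a tree whose leaves are equidistant from the root.
clock_like = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
# Distances that fit a tree, but not one with equidistant leaves.
not_clock_like = [
    [0, 2, 7],
    [2, 0, 5],
    [7, 5, 0],
]
print(is_ultrametric(clock_like), is_ultrametric(not_clock_like))
```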

Question 11

Q

Node

Answer

A

A vertex which represents an entity that we wish to model that can have a defined relationship with other nodes

Question 12

Q

Edge

Answer

A

A connection between two nodes that specifies some relationship between them

Question 13

Q

Adjacency

Answer

A

Two nodes are adjacent if connected by an edge
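
The three definitions above (node, edge, adjacency) map naturally onto an adjacency-set data structure. The sketch below assumes an undirected network; the class, its method names and the gene names are hypothetical.

```python
class Graph:
    """Undirected graph stored as a mapping from node to its set of neighbours."""

    def __init__(self):
        self.adjacency = {}

    def add_node(self, node):
        # A node may exist without any edges.
        self.adjacency.setdefault(node, set())

    def add_edge(self, u, v):
        # An edge records a relationship in both directions.
        self.add_node(u)
        self.add_node(v)
        self.adjacency[u].add(v)
        self.adjacency[v].add(u)

    def are_adjacent(self, u, v):
        # Two nodes are adjacent if an edge connects them.
        return v in self.adjacency.get(u, set())

network = Graph()
network.add_edge("geneA", "geneB")
network.add_edge("geneB", "geneC")
network.add_node("geneD")
print(network.are_adjacent("geneA", "geneB"))
print(network.are_adjacent("geneA", "geneC"))
```

For a directed regulatory network, `add_edge` would record only the regulator-to-target direction.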

Question 14

Q

Typical experimental design

Answer

A

Time-series transcriptomics
Data pre-processing
Inference methods
Network inference
Validation
Modelling/simulations

Question 15

Q

CLR algorithm

Answer

A

Take all transcription data
Calculate mutual information between expression levels of all pairs of genes
Build MI matrix
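For each gene pair, calculate two z-scores: the pair's MI against the background MI distribution of the putative transcription factor, and against that of the putative target
Combine the two into a joint z-score
Accept any joint score z_ij that is above a given threshold as indicating regulation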
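
The steps above can be sketched as follows for CLR (context likelihood of relatedness). Several details are assumptions made to keep the example small: mutual information is estimated with simple equal-width binning, negative z-scores are clipped to zero, the joint score is taken as sqrt(z_i^2 + z_j^2), and every gene is treated as both a putative regulator and a putative target. The data and gene names are made up.

```python
import math
from collections import Counter
from statistics import mean, pstdev

def discretise(values, bins):
    """Assign each value to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant profiles
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y):
    """Mutual information (in bits) between two discrete profiles."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def clr(expression, threshold=1.0, bins=3):
    """Return {(gene_i, gene_j): joint z-score} for pairs above the threshold."""
    genes = list(expression)
    binned = {g: discretise(expression[g], bins) for g in genes}
    # Build the MI matrix for all pairs of genes.
    mi = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            mi[g][h] = mi[h][g] = mutual_information(binned[g], binned[h])

    def z(g, h):
        # Compare MI(g, h) with the background of all MI values involving g.
        background = list(mi[g].values())
        sd = pstdev(background)
        return max(0.0, (mi[g][h] - mean(background)) / sd) if sd > 0 else 0.0

    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            joint = math.sqrt(z(g, h) ** 2 + z(h, g) ** 2)
            if joint > threshold:
                edges[(g, h)] = joint
    return edges

# Hypothetical expression profiles across eight samples.
expression = {
    "geneA": [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0],
    "geneB": [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0],
    "geneC": [1.0, 1.0, 5.0, 5.0, 1.0, 1.0, 5.0, 5.0],
    "geneD": [1.0, 5.0, 1.0, 5.0, 1.0, 5.0, 1.0, 5.0],
}
edges = clr(expression, threshold=1.0, bins=2)
print(edges)
```

Only the geneA/geneB pair stands out from its background here, since the other profiles are pairwise independent after binning.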

Week 8 Lecture 3 (15 cards)