W7L2 Systems Genomics Flashcards
Breaking biological systems into single components
- Biological systems are complex, but they consist of different components that are individually simpler and that resemble or share properties with each other
- The dynamics of complex systems are not easily understood by humans, who need to “single out” the effect of each independent feature
- They exhibit emergent properties, whose attributes cannot always be understood through decomposition and a reductionist approach
Modelling the biological properties of complex systems
- Biological systems exhibit non-linear dynamics
- Non-linear dynamics can emerge from the interaction of multiple elements
- Multiscale structure combining molecular, cellular, physiological and behavioural components
- Not always understood through the reductionist approach
Basic information that one should consider for systems genomics
- Complexity is not proportional to size: the US highway network can be easily understood, but the structure of a spider web is still unknown
- Multiscale responses in a biological system, and emergent properties
Properties of biological networks: pattern of gene expression
- Analysis of transcriptomic data: differential expression using RNA-seq
- Identify differentially expressed genes over a period of time
- Separate them into gene clusters that have similar expression: track how similar the expression of two genes is, building a co-expression network
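A minimal sketch of the co-expression step above, assuming Pearson correlation as the similarity measure and a hard threshold to decide which gene pairs become edges. The gene names, expression values and the 0.9 threshold are hypothetical toy choices, not taken from the lecture.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical expression values at four time points (toy numbers, not real data).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises together with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
    "geneD": [1.0, 3.0, 1.0, 3.0],  # only loosely related to the others
}

def coexpression_edges(expr, threshold=0.9):
    """Return gene pairs whose profiles correlate at or above the threshold."""
    genes = sorted(expr)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            if pearson(expr[g1], expr[g2]) >= threshold:
                edges.append((g1, g2))
    return edges

edges = coexpression_edges(expression)
print(edges)
```

Only positively correlated pairs are linked here; using the absolute value of the correlation instead would also connect genes with opposite expression patterns.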
Properties of biological networks: network analysis, graph theory
- Pairwise differences/similarities in gene expression can be formally represented as adjacency matrices (and can be represented visually as a network graph)
- Some core topological properties of a network include: Degree: the number of edges connected to a node. Centrality: how important nodes and edges are for the connectivity of the network
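As a rough illustration of these properties, the sketch below stores a small network as an adjacency matrix and computes each node's degree. Closeness centrality is used as one possible centrality measure; that choice is an assumption, since the notes do not name a specific one. The node names and edges are hypothetical.

```python
from collections import deque

nodes = ["g1", "g2", "g3", "g4", "g5"]
# Symmetric adjacency matrix: adjacency[i][j] == 1 means nodes i and j share an edge.
adjacency = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]

def degree(adj, i):
    """Degree of node i: the number of edges connected to it."""
    return sum(adj[i])

def closeness(adj, start):
    """Closeness centrality: (n - 1) / sum of shortest-path lengths from start.

    Assumes the graph is connected; distances are found by breadth-first search.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, connected in enumerate(adj[u]):
            if connected and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(adj) - 1) / sum(dist.values())

degrees = {name: degree(adjacency, i) for i, name in enumerate(nodes)}
closeness_scores = {name: closeness(adjacency, i) for i, name in enumerate(nodes)}
most_central = max(closeness_scores, key=closeness_scores.get)
print(degrees, most_central)
```

In this toy network g1 has both the highest degree and the highest closeness, but the two measures do not have to agree in general.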
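Different types of network graph
- Scale-free network: a few highly connected hubs, while most nodes have a low number of edges
- Transitivity: the neighbours of a node tend to be connected to each other, so nodes are more connected internally and form clusters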
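To make transitivity concrete, here is a small sketch that measures it as the fraction of connected triples (two edges sharing a node) that are closed into triangles. The two toy graphs are hypothetical: a pure hub-and-spoke star, and a triangle with one extra node attached.

```python
from itertools import combinations

def transitivity(neighbours):
    """Fraction of connected triples that are closed into triangles.

    `neighbours` maps each node to the set of nodes it shares an edge with.
    Each triangle is counted once per corner, matching the usual
    3 * triangles / triples definition.
    """
    closed = 0
    triples = 0
    for nbrs in neighbours.values():
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            if b in neighbours[a]:
                closed += 1
    return closed / triples if triples else 0.0

# Star: one hub, no links between the spokes, so no triple is closed.
star = {"hub": {"a", "b", "c", "d"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}

# Triangle a-b-c with an extra node d attached to c.
clustered = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}

print(transitivity(star), transitivity(clustered))
```

The star has a hub but zero transitivity, while the second graph has no dominant hub but most of its triples are closed.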
Properties of biological networks: multi-omics data
- The framework can integrate multiple analytical technologies: metabolomics, proteomics, transcriptomics and genomics
Descriptive analysis
- Using resources from public databases, homogenised annotations can be retrieved for a great number of genes. These are Gene Ontology (GO) terms, structured hierarchically as trees (strictly, directed acyclic graphs)
- These GO terms have been slimmed down to GO-slim terms for a broader, easier-to-interpret overview
Descriptive analyses: dimensionality reduction
- Each molecule can be characterised by a large number of factors or features, potentially with complex effects
- Check whether the dataset can be represented in a low-dimensional space with a single metric, then identify which variables drove the flattening (important genes)
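One common way to do this is principal component analysis (PCA); the notes do not name a method, so PCA is an assumption here. The sketch below finds the first principal component by power iteration on the covariance matrix, gives each sample a single score along it, and uses the loadings to see which gene drives the component. The sample values and gene names are hypothetical.

```python
from math import sqrt

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for the first principal component.

    rows: list of samples, each a list of feature (gene) values.
    Uses power iteration on the sample covariance matrix.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    vec = [1.0] * p
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Project each centred sample onto the component: one number per sample.
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
    return vec, scores

genes = ["geneX", "geneY", "geneZ"]
# Toy data: five samples; geneX varies most, geneZ is almost constant.
samples = [
    [1.0, 2.0, 5.0],
    [2.0, 2.5, 5.1],
    [3.0, 3.0, 4.9],
    [4.0, 3.5, 5.0],
    [5.0, 4.0, 5.0],
]

loadings, scores = first_principal_component(samples)
top_gene = genes[max(range(len(genes)), key=lambda j: abs(loadings[j]))]
print(top_gene, scores)
```

The gene with the largest absolute loading contributes most to the single summary axis, which is one way to read "which variable drove the flattening".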
Machine learning in systems biology: decision trees
- Predictive modelling in systems biology uses a collection of supervised learning approaches to connect complex input data to a simple output
- Mostly uses decision trees to identify which class a gene belongs to. The final model is the average of many trees, each built using a resampled version of the dataset and a random subset of the predictors (as in a random forest)
- Not very sensitive to overfitting (the model fits too much detail in the training data, including noise, which leads to poor predictive power on new data)
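The averaging idea can be sketched in a few lines. This is a simplified illustration, not a full random forest: each "tree" is a depth-1 decision stump, fitted on a bootstrap resample of the data using a random subset of the predictors, and the final class is the majority vote across trees. All data values, function names and settings are hypothetical.

```python
import random

def fit_stump(X, y, features):
    """Depth-1 tree: pick the (feature, threshold, left_label) with fewest errors."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in X}):
            for left_label in (0, 1):
                preds = [left_label if row[f] <= threshold else 1 - left_label
                         for row in X]
                errors = sum(p != t for p, t in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, left_label)
    return best[1:]

def predict_stump(stump, row):
    f, threshold, left_label = stump
    return left_label if row[f] <= threshold else 1 - left_label

def fit_forest(X, y, n_trees=25, n_features=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]           # bootstrap resample
        feats = rng.sample(range(len(X[0])), n_features)   # random predictor subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Majority vote: class 1 if the average vote across trees exceeds 0.5."""
    votes = sum(predict_stump(s, row) for s in forest)
    return 1 if votes * 2 > len(forest) else 0

# Toy data: features 0 and 1 separate the classes, feature 2 is noise.
X = [[1.0, 0.1, 1], [2.0, 0.2, 3], [3.0, 0.3, 5], [4.0, 0.4, 7],
     [7.0, 0.7, 2], [8.0, 0.8, 4], [9.0, 0.9, 6], [10.0, 1.0, 8]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(X, y)
low = predict_forest(forest, [0.5, 0.05, 4.5])
high = predict_forest(forest, [11.0, 1.1, 4.5])
print(low, high)
```

Because every tree sees a slightly different dataset and predictor set, individual trees can be wrong in different ways, and the vote smooths those errors out.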
Machine learning model: deep learning
- Uses artificial neural networks: an input layer, followed by many hidden layers that transform the input, followed by the output layer
- Used to predict the state of the output layer; flexible and robust
- Lacks biological specificity and parameter interpretability
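The layer structure described above can be illustrated with a forward pass through a tiny network. This is only a sketch: the weights are hand-picked rather than trained, the sigmoid activation is an assumed choice, and the layer sizes (3 inputs, 2 hidden units, 1 output) are arbitrary.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum per unit, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(unit_w, inputs)) + b)
            for unit_w, b in zip(weights, biases)]

def forward(inputs, layers):
    """Pass the input through each layer in turn and return the output layer."""
    activation = inputs
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation

# Hypothetical, untrained weights: 3 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[0.5, -0.5, 0.0], [0.0, 0.5, -0.5]], [0.0, 0.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                              # output layer
]

baseline = forward([0.0, 0.0, 0.0], layers)
shifted = forward([1.0, 0.0, 0.0], layers)
print(baseline, shifted)
```

Even here the individual weights have no direct biological meaning, which is the interpretability problem noted in the card.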
Machine learning features: developing features
- Development of a very large number of features that are used as predictors and are descriptive of the system
- Overfitting is well handled
- But correlation between features is a problem: ML works better with independent features
Problems of ML
- No established framework for parameter tuning
- Ad hoc design (number of layers, convolution, pooling)
- Multiple optimisation criteria (leading to changes in accuracy and loss)