Network Biology Flashcards
Quantitative Measurements, Comparative Statistics, Clustering, Gene sets, Pathways, Network
Isolated data points, isolated lists, isolated groups, functional groups, functional organization, systems organization
Experimental data set for cancer
Tissue samples of tumor and healthy tissue -> raw gene expression counts -> differential gene expression: comparison between groups
GeneID, GeneName
Identifier in online database and official gene symbol
Mapping of online identifier to official gene symbol for simplicity
log2FC
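log2 of the fold change, where the fold change is the ratio of expression in the cancer sample to expression in the healthy sample. The log2 scale is easier to interpret because up- and down-regulation become symmetric around zero.
Is the gene more or less expressed in the cancer sample?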
negative: down-regulation in cancer
positive: up-regulation in cancer
zero: no change
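A minimal sketch of the sign rule above. The function names, the pseudocount of 1 (added to avoid division by zero) and the expression values are illustrative assumptions, not part of the flashcards.

```python
import math

def log2_fold_change(cancer_mean, healthy_mean, pseudocount=1.0):
    """log2 of the cancer/healthy expression ratio.

    The pseudocount is an assumption of this sketch; it keeps the
    ratio defined when one group has zero counts.
    """
    return math.log2((cancer_mean + pseudocount) / (healthy_mean + pseudocount))

def direction(log2fc):
    """Translate the sign of log2FC into the interpretation used above."""
    if log2fc > 0:
        return "up-regulated in cancer"
    if log2fc < 0:
        return "down-regulated in cancer"
    return "no change"

# Hypothetical mean expression values for one gene.
example = log2_fold_change(cancer_mean=7, healthy_mean=1)  # (7+1)/(1+1) = 4, log2(4) = 2
print(example, direction(example))
```

A fourfold increase gives +2 and a fourfold decrease gives -2, which is the symmetry that makes the log2 scale convenient.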
p value
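Statistical significance of the comparison: probability of observing a difference at least this large if there were no real difference between the groups. adj: p value corrected for multiple testing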
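The flashcard does not say which correction produces the adjusted p value. As an illustration only, the sketch below assumes the Benjamini-Hochberg procedure, one widely used correction in differential expression analysis; the function name and the p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the input order.

    Assumption of this sketch: the 'adj' column was produced by this
    procedure; the source does not name the method.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p values for four genes.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Adjusted values are never smaller than the raw ones, so fewer genes pass a fixed cutoff after correction.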
Biological Pathways
Model for computational analysis and interpretation of large-scale experimental data
Puts data in a biological context -> analyze on a functional level.
More efficient than analyzing one gene at a time: groups genes, proteins, etc. Intuitive and simple
Perform pathway statistics
Nodes: genes/proteins (black in PathVisio), metabolites (blue in PathVisio)
Edges: interactions
Signaling pathway: transmits a signal and is the starting point of all other processes/pathways
Metabolic pathway: energy produced/stored/…
Gene regulation pathway: transcription factors are activated to control the production of proteins
Genes database
Protein coding genes
Disease genes
Metabolic process genes
We know the most about metabolic process genes and the least about protein-coding genes
ORA
Overrepresentation Analysis -> Pathway analysis method
Input list (R): significantly up- or down-regulated genes
Background list (N): all measured genes
Genes in pathway (n)
Changed genes in pathway (r)
Statistical test
Z-score > 1.96 (corresponds to p < 0.05, two-sided), or inspect the pathway itself to draw a conclusion!
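The flashcard does not state how the Z-score is computed. A minimal sketch, assuming the standardized hypergeometric statistic (observed minus expected changed genes in the pathway, divided by the standard deviation under random sampling without replacement); the function name and the counts are hypothetical.

```python
import math

def ora_z_score(N, R, n, r):
    """Z-score for overrepresentation of changed genes in one pathway.

    N: all measured genes (background list)
    R: significantly changed genes (input list)
    n: measured genes in the pathway
    r: changed genes in the pathway
    Assumption of this sketch: standardized hypergeometric statistic.
    """
    fraction_changed = R / N
    expected = n * fraction_changed
    variance = (n * fraction_changed * (1 - fraction_changed)
                * (1 - (n - 1) / (N - 1)))
    if variance <= 0:
        raise ValueError("Z-score undefined: no variation expected")
    return (r - expected) / math.sqrt(variance)

# Hypothetical counts: 1000 measured genes, 100 changed,
# a pathway with 50 genes of which 15 changed (5 expected by chance).
z = ora_z_score(N=1000, R=100, n=50, r=15)
print(z, z > 1.96)
```

When r equals the number expected by chance (n * R / N) the score is 0; larger positive values mean the pathway holds more changed genes than expected.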
Biological Networks
Study biological complexity, more efficient than tables, good data integration, intuitive visualization
Broad coverage but low resolution: we don’t know whether everything shown is relevant
Molecular networks: protein-protein interaction (always undirected), metabolic network, regulatory network
Cell-cell communication
Nervous system
Human disease
Graph: path
sequence of edges to go from node A to B
Can pass through the same nodes and edges repeatedly
path length: number of edges in the path connecting 2 nodes
shortest path: path with the minimum number of edges
distance: length of the shortest path between 2 nodes
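A minimal sketch of finding the shortest path length in an unweighted graph with breadth-first search. The graph, the node names and the function name are illustrative assumptions.

```python
from collections import deque

def shortest_path_length(adjacency, start, goal):
    """Minimum number of edges between start and goal, or None if unreachable."""
    if start == goal:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

# Hypothetical undirected graph: A-B, B-C, C-D, A-D; E is isolated.
toy_graph = {
    "A": ["B", "D"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["A", "C"],
    "E": [],
}
print(shortest_path_length(toy_graph, "A", "C"))  # 2 (via B or via D)
print(shortest_path_length(toy_graph, "A", "E"))  # None: E is unreachable
```

Breadth-first search visits nodes in order of increasing number of edges from the start, so the first time the goal is reached the count is the minimum.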
Graph: adjacency matrix
a_ij = 1 if there is an edge between node i and node j, 0 if there is no edge
if the links are weighted, the weights are stored in the matrix instead of 1. Weights can represent strength of flow, so taking a path with more edges might become the better option.
if undirected: matrix symmetrical
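A minimal sketch of building an adjacency matrix as a list of lists and checking the symmetry rule. The node names, edge weights and function names are illustrative assumptions.

```python
def adjacency_matrix(nodes, edges, directed=False):
    """Build a matrix where entry [i][j] holds the weight of edge i -> j (0 = no edge)."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target, weight in edges:
        i, j = index[source], index[target]
        matrix[i][j] = weight
        if not directed:
            # An undirected edge is stored in both directions.
            matrix[j][i] = weight
    return matrix

def is_symmetric(matrix):
    size = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(size) for j in range(size))

# Hypothetical graph with three nodes; a weight of 1 means an unweighted edge.
example_nodes = ["A", "B", "C"]
example_edges = [("A", "B", 1), ("B", "C", 1)]

undirected = adjacency_matrix(example_nodes, example_edges)
directed = adjacency_matrix(example_nodes, example_edges, directed=True)
print(undirected, is_symmetric(undirected))
print(directed, is_symmetric(directed))
```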
Centrality measure
Indication of the most important nodes/edges
Degree centrality: node degree = number of edges connected to a node. If directed: in-degree and out-degree. High-degree nodes tend to be essential -> hub nodes
Betweenness centrality: number of shortest paths that pass through a node. When normalized to a fraction: 0 means none, 1 means all go through it. Indicates the information load on a node, or a node that connects 2 subnetworks.
Clustering coefficient: fraction of the possible connections between a node's neighbors that actually exist (how many of the neighbors are connected to each other)
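A minimal sketch of two of these measures, degree and local clustering coefficient, on an undirected graph stored as a dict of neighbor sets. The graph and the function names are illustrative assumptions.

```python
from itertools import combinations

def degree(adjacency, node):
    """Number of edges connected to the node (undirected graph)."""
    return len(adjacency[node])

def clustering_coefficient(adjacency, node):
    """Fraction of neighbor pairs that are themselves connected."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # No neighbor pairs exist; 0 by convention in this sketch.
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))

# Hypothetical graph: triangle A-B-C plus a pendant node D attached to A.
small_graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
for name in sorted(small_graph):
    print(name, degree(small_graph, name), clustering_coefficient(small_graph, name))
```

Node A has the highest degree (3) but a clustering coefficient of only 1/3, because of its three neighbor pairs only B and C are connected.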
Hub
Node whose degree greatly exceeds the average degree. Found in scale-free networks, not in random ones. The larger the network, the higher the degree of its largest hub.
Essential proteins tend to cluster in densely interconnected subnetworks that are hub-rich
Important for structure: if hubs are removed, the mean shortest path and the diameter can increase, and more nodes might become unreachable. Also functionally important: hubs tend to be essential for survival
average degree
Random network
Attempts to replicate a scale-free network
Degree distribution follows a Poisson distribution: bell-shaped curve.
No hub nodes, most nodes have same degree.
Has more essential nodes and edges than a scale-free network
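A minimal sketch of generating a random network and inspecting its degrees. It assumes the Erdős-Rényi construction, where every pair of nodes is connected with the same probability; the flashcards do not name the model, and the network size, edge probability and seed are arbitrary illustrative choices.

```python
import random
from collections import Counter

def random_network(n_nodes, p_edge, seed=0):
    """Connect each pair of nodes independently with probability p_edge."""
    rng = random.Random(seed)
    adjacency = {i: set() for i in range(n_nodes)}
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if rng.random() < p_edge:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

def degree_distribution(adjacency):
    """Map each degree value to the number of nodes that have it."""
    return Counter(len(neighbors) for neighbors in adjacency.values())

network = random_network(n_nodes=200, p_edge=0.05)
degrees = [len(neighbors) for neighbors in network.values()]
average_degree = sum(degrees) / len(degrees)

print("average degree:", average_degree)  # expected near p_edge * (n_nodes - 1) = 9.95
print("maximum degree:", max(degrees))
for k, count in sorted(degree_distribution(network).items()):
    print(k, "#" * count)
```

The printed histogram is roughly bell-shaped around the average degree, and the maximum degree stays within a small multiple of the average, which is what "no hub nodes" means here.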