Complexity of Biological Systems & Molecular Functions (1) Flashcards
What is the equation for complexity?
C = f(S; E)
S: adjacency matrix
E: entropy matrix
Measured in cbit: complexity bit
What is Maxam and Gilbert DNA sequencing?
Chemical sequencing: based on chemical modification and scission
Involved multiple steps, including radio-labelling
Dependent on other DNA technology
What is Sanger DNA sequencing?
Chain termination: based on a biological activity (DNA replication)
Used a processive enzyme (DNA polymerase)
Simple and amenable to improvement
Amenable to automation
First of “sequence-by-synthesis” (SBS) technologies
How has Sanger sequencing been improved and automated?
Fluorescently labelled chain terminators (single lane)
Capillary electrophoresis separation (extension)
Automated data transfer to computer (collection)
Incorporation of PCR technology (sensitivity)
What are examples of first generation genomic sequencing?
Maxam & Gilbert sequencing
Sanger sequencing
How is sequence data from Maxam and Gilbert sequencing interrogated?
Identify what you need to sequence then sequence it
Restriction mapping
Fragment isolation
End-labelling
Two secondary cleavages
Chemical sequencing
Find sequence overlap
How is sequence data from Sanger sequencing interrogated?
Sequence randomly and puzzle data together
Sequencing
Sequence assembly
What is pairwise sequence alignment?
Compare 2 similar sequences
Simple dot matrix: match/mismatch -> scoring system
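A minimal sketch of the dot matrix idea, assuming the simplest possible scoring (1 for a match, 0 for a mismatch). The function names `dot_matrix` and `render` and the example sequences are illustrative only and do not come from the flashcards.

```python
def dot_matrix(a, b):
    """Return a grid with 1 where a[i] == b[j] and 0 otherwise."""
    return [[1 if x == y else 0 for y in b] for x in a]


def render(a, b):
    """Draw the grid as text: '*' marks a match, '.' a mismatch."""
    grid = dot_matrix(a, b)
    lines = ["  " + " ".join(b)]
    for x, row in zip(a, grid):
        lines.append(x + " " + " ".join("*" if cell else "." for cell in row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Similar sequences show up as a (possibly broken) diagonal of '*'.
    print(render("GATTACA", "GATCACA"))
```

Runs of matches along a diagonal indicate regions where the two sequences agree; a scoring system then turns such diagonals into numbers that can be compared.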
What algorithms are used in bioinformatics?
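Dynamic programming (recursive) algorithms: break the problem down into smaller steps and consider all possible alignments, so the best-scoring alignment is guaranteed to be found
Heuristic algorithms: find a good (not guaranteed best) answer quickly using a scoring system (variable)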
Scoring system: values for matches and mismatches, penalties for gaps
Must be efficient: computing time and memory
How does global alignment with a linear gap penalty work?
Matrix-based algorithm, recursive for each cell in matrix
Alignment of two sequences over entire length
Fill table from top left and read from bottom right
Contrast with local alignment, where the sequences are broken up and only regions of similarity are identified
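The fill-and-read procedure on this card can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `needleman_wunsch` and the scores (match = 1, mismatch = -1, linear gap penalty g = 2) are assumptions chosen for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=2):
    """Global alignment with a linear gap penalty (gap is a positive cost)."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]

    # Initialisation: first row and column carry cumulative gap penalties.
    for i in range(1, n + 1):
        H[i][0] = -i * gap
    for j in range(1, m + 1):
        H[0][j] = -j * gap

    # Fill from the top left: each cell is the best of three moves.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s,   # diagonal
                          H[i - 1][j] - gap,     # vertical
                          H[i][j - 1] - gap)     # horizontal

    # Traceback: read from the bottom right back to the top left.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and H[i][j] == H[i - 1][j] - gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return H[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    score, row_a, row_b = needleman_wunsch("ACGT", "ACT")
    print(score)
    print(row_a)
    print(row_b)
```

When several moves tie, this sketch prefers the diagonal, then the vertical move; other tie-breaking rules give equally valid alignments with the same score.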
How does local alignment with a linear gap penalty work?
Looks for high-scoring matches between regions of two sequences
Fill from top left and read from the highest value
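A minimal sketch of local alignment (the approach usually called Smith-Waterman), assuming illustrative scores of match = 2, mismatch = -1 and a linear gap penalty g = 2. The function name `smith_waterman` and the example sequences are not from the flashcards.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=2):
    """Local alignment with a linear gap penalty (gap is a positive cost)."""
    n, m = len(a), len(b)
    # First row and column stay at 0 for local alignment.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)

    # Fill from the top left; negative scores are reset to 0.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # diagonal
                          H[i - 1][j] - gap,     # vertical
                          H[i][j - 1] - gap)     # horizontal
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback: start at the highest value and stop at the first 0.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] - gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(smith_waterman("TTTACGTTT", "GGACGGG"))
```

Only the shared region is reported; the unrelated flanking residues are left out of the alignment because any path through them would drop the score to 0.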
What are the terms of the Needleman-Wunsch recurrence?
H (i,j) = maximum of:
diagonal: H (i-1, j-1) + S (ai, bj)
vertical: H (i-1, j) - g
horizontal: H (i, j-1) - g
IN LOCAL ONLY (Smith-Waterman): start again: 0
What are the differences between global (Needleman-Wunsch) and local (Smith-Waterman) alignment?
Initialisation:
In global, the first row and column are normally filled with cumulative gap penalties (-g, -2g, ...)
In local, first row and column are 0
Scoring:
In global, scores can be negative
In local, any negative scores are set to 0
Traceback:
In global, read from bottom right to top left
In local, start at highest score and end at 0
What is the PAM matrix?
Point accepted mutation matrix
Used for aligning protein sequences
What is combined opening and extending gap penalty?
W (l) = g open + g ext (l-1), where l is the gap length
g open > g ext usually (opening a gap costs more than extending one)
Trace between 3 matrices
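A minimal sketch of the combined gap penalty and of a three-matrix global score, assuming the common Gotoh-style formulation (the card only says that three matrices are used). The names `gap_penalty`, `affine_global_score`, `M`, `X`, `Y` and the values g open = 3, g ext = 1, match = 1, mismatch = -1 are illustrative assumptions.

```python
NEG_INF = float("-inf")


def gap_penalty(length, g_open=3, g_ext=1):
    """W(l) = g_open + g_ext * (l - 1) for a gap of `length` positions."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return g_open + g_ext * (length - 1)


def affine_global_score(a, b, match=1, mismatch=-1, g_open=3, g_ext=1):
    """Best global alignment score using three matrices.

    M: alignments ending with a[i] paired with b[j]
    X: alignments ending with a[i] opposite a gap
    Y: alignments ending with b[j] opposite a gap
    """
    n, m = len(a), len(b)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -gap_penalty(i, g_open, g_ext)
    for j in range(1, m + 1):
        Y[0][j] = -gap_penalty(j, g_open, g_ext)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a gap costs g_open; staying in the same gap costs g_ext.
            X[i][j] = max(M[i - 1][j] - g_open,
                          X[i - 1][j] - g_ext,
                          Y[i - 1][j] - g_open)
            Y[i][j] = max(M[i][j - 1] - g_open,
                          Y[i][j - 1] - g_ext,
                          X[i][j - 1] - g_open)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(gap_penalty(3))
    print(affine_global_score("AAAGGGTTT", "AAATTT"))
```

Because extending is cheaper than opening, one gap of three positions (penalty 5 here) is preferred over three separate single gaps (penalty 9), which is the behaviour the combined penalty is designed to produce.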