Sequencing Flashcards
Genome Sequencing Methodology
5-10 times as many anonymous participants as needed provided DNA samples
Taken from local sites. DNA extracted from blood
Sequenced from a composite of the genomes of a fraction of the participants; nobody knows whose samples were used
BAC libraries
Bacterial Artificial Chromosomes
Sorted chromosomes from which DNA is isolated
Restriction Enzymes cut specific palindromic sequences
Restriction enzymes cut isolated DNA into multiple fragments
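A minimal sketch of a restriction digest, assuming EcoRI as the example enzyme (it recognises the palindromic site GAATTC and cuts between the G and the first A). The function names and the sample sequence are made up for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


def digest(seq, site="GAATTC", cut_offset=1):
    """Cut seq at every occurrence of site.

    cut_offset is the cut position inside the site (1 = after the G for EcoRI).
    Only one strand is modelled, so sticky ends are not represented.
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


site = "GAATTC"
# A palindromic site reads the same 5'->3' on both strands.
is_palindromic = reverse_complement(site) == site

sample = "AAGAATTCCGGGAATTCTT"  # hypothetical sequence with two EcoRI sites
fragments = digest(sample)
print(is_palindromic, fragments)
```

Joining the fragments back together reproduces the original sequence, which is a quick check that no bases were lost at the cut points.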
Creation of BAC libraries
DNA fragments inserted into circular DNA vectors (BACs) and introduced into bacteria
Overlapping fragments assemble into single contiguous sequences called contigs
BAC clones
Dilute solution of bacteria can be cultured on an agar plate and the colonies produced are clones
A single colony contains clones of one DNA sequence
Clones then used for sequencing
BAC automation
Automated, massively parallel creation of BACs
Copied DNA isolated and sequenced
Computational tools applied to obtain the physical map
Production of physical map
Select clones for sequencing (overlapping)
Sequence to at least draft coverage
Merge data
Order and orient with mRNA, paired-end reads and other data
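The "draft coverage" step can be reasoned about with the standard relation coverage = (number of reads x read length) / size of the region sequenced. A rough sketch; the clone size, read length and read count below are hypothetical inputs, not project figures.

```python
import math


def coverage(num_reads, read_length, region_size):
    """Average number of times each base is sequenced."""
    return num_reads * read_length / region_size


def reads_needed(target_coverage, read_length, region_size):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * region_size / read_length)


clone_size = 150_000   # hypothetical BAC insert size in bp
read_length = 500      # hypothetical read length in bp

print(coverage(1500, read_length, clone_size))
print(reads_needed(5, read_length, clone_size))
```

This gives an average only; real reads land unevenly, so some bases are covered more often than others and some not at all.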
Genetic mapping
Produced using a physical map by assessing the location of the genes.
Genes on same chromosome are ‘linked’.
More recently, the position of genes is determined from the exact frequency at which recombination has occurred between them.
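A small sketch of how recombination frequency turns into a map distance, assuming the usual convention that a 1% recombination frequency corresponds to 1 centimorgan (a good approximation only for closely linked genes). The offspring counts are invented for the example.

```python
def recombination_frequency(recombinant, total):
    """Fraction of offspring that show a recombinant combination of two genes."""
    if total <= 0:
        raise ValueError("total offspring must be positive")
    return recombinant / total


def map_distance_cm(recombinant, total):
    """Approximate genetic distance in centimorgans (1% recombination = 1 cM)."""
    if total <= 0:
        raise ValueError("total offspring must be positive")
    return 100 * recombinant / total


# Hypothetical cross: 120 recombinant offspring out of 1000 scored.
rf = recombination_frequency(120, 1000)
distance = map_distance_cm(120, 1000)
print(rf, distance)
```

Genes that are far apart on a chromosome approach a recombination frequency of 50%, at which point they behave as if unlinked and this simple conversion no longer holds.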
FISH mapping
Fluorescence in situ hybridization
Attach fluorescent labels to DNA sequences
Chromosomes are processed on a glass slide so that the location of specific genes within the chromosome can be identified
Sequencing developments
Can read 20 kb with 99.5% accuracy
Can sequence mRNA directly
Only suitable for a single strand of DNA
Current sequencing methods
PacBio HiFi - Mid length, Mid accuracy
Illumina - Low length, High accuracy
Oxford Nanopore - High length, Low accuracy
Not available during Human Genome Mapping Project
PacBio HiFi
Polymerase enzyme, nano-sized hole
Single strand of DNA introduced
Fluorescent nucleotides emit light as they are ‘stitched’ into the complementary double strand
Colour of light emission provides accurate sequencing
Illumina Sequencing
Individual pieces of DNA attached to glass surface
Sequencing by synthesis
As each complementary nucleotide is attached, fluorescence is produced
Oxford Nanopore
Double strand of DNA unzipped
Single strand inserted into protein nanopore
Flow of ions through the pore creates an electric current whose size is a function of the base passing through
Current as a function of time provides sequence information
Linkage distance
Distance in bp between genes on the same chromosome
Smaller linkage distance = more likely to be inherited together
Makeup of Human Genome
Only 2% is exons
26% introns
The role of the remaining sequence (much of it repetitive) has only recently started to be understood
Sequence reassembly - Reducing computational efforts
Sequencing a large array of overlapping short fragments (contigs) created from the BACs
Short sequences are called reads
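Reassembly relies on the overlaps between reads. The sketch below uses a simple greedy strategy (repeatedly merge the pair with the longest suffix-to-prefix overlap); this is a teaching simplification, not the method used by any particular assembler, and the reads are invented.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    max_len = min(len(a), len(b))
    for size in range(max_len, min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads pairwise by best overlap until no overlaps remain."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # nothing left to merge; remaining pieces stay separate
        merged = reads[i] + reads[j][size:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]  # hypothetical overlapping reads
assembled = greedy_assemble(reads)
print(assembled)
```

Comparing every pair of reads is quadratic in the number of reads, which is one reason restricting the problem to one BAC at a time reduces the computational effort.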
Gel electrophoresis
Comparing size of fragments/contigs
Fragments migrate in an applied electric field
Shortest move the fastest
Digital Trees/Trie
Multiway tree often used for storing large sets of words
Trees with a possible branch for every letter of an alphabet
Words end with $
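A minimal trie sketch using nested dictionaries, with "$" as the end-of-word marker described above. The class and method names are illustrative, and it assumes stored words never contain "$" themselves.

```python
END = "$"  # marker stored at the node where a word ends


class Trie:
    def __init__(self):
        self.root = {}

    def insert(self, word):
        """Add word, creating one branch per letter as needed."""
        node = self.root
        for letter in word:
            node = node.setdefault(letter, {})
        node[END] = True

    def contains(self, word):
        """True only if word was inserted as a whole word, not just as a prefix."""
        node = self.root
        for letter in word:
            if letter not in node:
                return False
            node = node[letter]
        return END in node


trie = Trie()
for word in ["ACG", "ACT", "AC"]:
    trie.insert(word)
print(trie.contains("ACG"), trie.contains("A"))
```

Without the "$" marker there would be no way to tell that "AC" is a stored word while "A" is only a prefix of stored words.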
Trie usage
Implementation of sets
Quicker insertion, deletion and find
Can be quicker than binary search trees and hash tables (search time depends on the length of the word, not the number of words stored)
Spell checkers, completion algorithms, longest-prefix matching, hyphenation
Search finds longest match between words in set and query
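Longest-prefix matching can be sketched as a single walk down the trie that remembers the last point where a stored word ended. This self-contained example builds its own small dictionary-based trie; the function names and word list are hypothetical, and it assumes neither the words nor the query contain "$".

```python
END = "$"  # marker stored at the node where a word ends


def build_trie(words):
    """Build a nested-dictionary trie from an iterable of words."""
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node[END] = True
    return root


def longest_prefix_match(root, query):
    """Return the longest stored word that is a prefix of query ('' if none)."""
    node = root
    best = ""
    for i, letter in enumerate(query):
        if letter not in node:
            break
        node = node[letter]
        if END in node:
            best = query[: i + 1]  # a stored word ends here; remember it
    return best


words = build_trie(["GAT", "GATTA", "GATTACA", "CAT"])
print(longest_prefix_match(words, "GATTAGG"))
```

The walk stops as soon as the query leaves the trie, so the cost is bounded by the length of the query rather than by how many words are stored.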