Lecture 22 - Genomics and Bacterial Evolution Flashcards
Functional genetics
Working out how a protein functions from its genetic code and experimental data.
Study of a single genome
Genomic analysis
Study of several genomes
Comparative genomics
What did Fred Sanger initially sequence?
PhiX174 bacteriophage
Size of PhiX
~5 kb (5,386 bp)
Sanger sequencing method (4 steps)
1) Break sequence of interest into fragments
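2) Place in a test tube with DNA polymerase, a primer, normal nucleotides (dNTPs) and dideoxynucleotides (ddNTPs), each ddNTP labelled with a different dye. ddNTPs terminate chain elongation because they lack the 3' OH needed to add the next nucleotide
3) Run the fragments on a polyacrylamide gel, which can resolve fragments that differ in length by a single base
4) The dye colour of the terminating nucleotide on each fragment is read, from shortest fragment to longest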
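The four steps above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: `new_strand` stands for the strand being synthesised, `p_terminate` is a made-up probability that a ddNTP is incorporated at any position, and the function names are hypothetical.

```python
import random

def sanger_fragments(new_strand, n_molecules=5000, p_terminate=0.1, seed=0):
    """Simulate chain termination; returns {fragment length: terminal base}."""
    rng = random.Random(seed)
    fragments = {}
    for _ in range(n_molecules):
        for length, base in enumerate(new_strand, start=1):
            # Assumption: at each position a dye-labelled ddNTP is
            # incorporated with probability p_terminate, ending the chain.
            if rng.random() < p_terminate:
                fragments[length] = base
                break
    return fragments

def read_gel(fragments):
    """Read terminal bases in order of fragment length (shortest first)."""
    return "".join(fragments[length] for length in sorted(fragments))

if __name__ == "__main__":
    print(read_gel(sanger_fragments("ATGCGTACCTGA")))
```

With enough molecules, every position is represented by at least one terminated fragment, so sorting the fragments by length and reading the terminal dye recovers the sequence.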
Read size of Sanger sequencing
~600 bp
Read
The length of a single piece of DNA that can be sequenced by a particular method
Read assembly
Reads are joined wherever their sequences overlap.
This forms a contig, a continuous sequence built from overlapping reads
Contig
A continuous consensus sequence built from reads whose sequences overlap
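Read assembly into contigs can be illustrated with a minimal greedy overlap sketch. It assumes error-free reads from one strand and an arbitrary minimum overlap of 3 bases; real assemblers are far more sophisticated, and the names `overlap` and `assemble` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0  # no overlap of at least min_len bases

def assemble(reads, min_len=3):
    """Greedily merge the best-overlapping pair of reads until none overlap."""
    contigs = list(reads)
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            # More than one contig left means the assembly has gaps.
            return contigs
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)]
        contigs.append(merged)

if __name__ == "__main__":
    print(assemble(["ATGCGT", "CGTACC", "ACCTGA"]))
    print(assemble(["ATGCGT", "TTTAAA"]))
```

The first call merges three overlapping reads into a single contig; the second returns two separate contigs because the reads share no overlap.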
Gap
A region where the assembly software can’t find overlapping reads to extend or join contigs
Why can gaps occur? (2 reasons)
1) DNA polymerase can’t extend sequence for some reason
2) If there is a repeated region and the read size is smaller than the size of the repeat, reads from inside the repeat cannot be placed unambiguously.
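The repeat problem can be shown with a small sketch. The genome and repeat below are invented for illustration, and `placements` is a hypothetical helper: a read shorter than the repeat matches both copies, while a read that spans the repeat plus unique flanking sequence matches only one place.

```python
def placements(genome, read):
    """Return every start position where the read matches the genome exactly."""
    return [i for i in range(len(genome) - len(read) + 1)
            if genome.startswith(read, i)]

# Hypothetical genome containing the same 9-base repeat twice.
REPEAT = "GATTACAGG"
GENOME = "CCT" + REPEAT + "TTCA" + REPEAT + "AGC"

if __name__ == "__main__":
    # Read lies entirely inside the repeat: two possible placements.
    print(placements(GENOME, "TTACA"))
    # Read spans the whole repeat plus flanking bases: one placement.
    print(placements(GENOME, "CTGATTACAGGT"))
```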
Automated Sanger sequencing method
Capillary electrophoresis
Illumina sequencing (9 steps)
1) Break DNA of interest into fragments
2) Adaptors of known sequence are ligated to the ends of the dsDNA fragments
3) A glass slide is prepared, with short sequences complementary to the adaptors adhered to the surface; these act as primers
4) Adaptor-tagged fragments hybridise to the complementary sequences adhered to the slide
5) Add unlabelled nucleotides and DNA polymerase. Bridge amplification
6) DNA synthesis, bridges become double stranded
7) Denaturation, to ssDNA
8) PCR to make high-density DNA clusters
9) Bases tagged with fluorescent dyes are added. When a base is incorporated, it emits fluorescence, which is detected.
Key difference between Sanger and Illumina
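Illumina sequencing can continue on the same strand after a dye-tagged base is added, whereas in Sanger sequencing a ddNTP terminates the chain permanently.
The fluorescent part is cleaved off once the incorporated base has been detected, so it doesn’t interfere with further elongation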
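The cycle of incorporate, detect and cleave can be sketched as a loop. This is a toy model only: the dye colours in `DYE` are hypothetical, strand direction is ignored, and the function names are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Hypothetical dye colours; the real assignments depend on the chemistry.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}

def sequence_by_synthesis(template, cycles):
    """One dye-tagged base is incorporated and detected per cycle."""
    signals = []
    for position in range(min(cycles, len(template))):
        base = COMPLEMENT[template[position]]  # base pairing with the template
        signals.append(DYE[base])              # fluorescence is detected
        # The dye is then cleaved off, so the same strand can be
        # extended again in the next cycle (unlike a Sanger ddNTP).
    return signals

def base_calls(signals):
    """Translate the recorded colours back into bases."""
    colour_to_base = {colour: base for base, colour in DYE.items()}
    return "".join(colour_to_base[colour] for colour in signals)

if __name__ == "__main__":
    print(base_calls(sequence_by_synthesis("TACG", cycles=4)))
```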
MiSeq output per run
15 Gb
NextSeq500 output per run
120 Gb
HiSeq2500 output per run
1000 Gb
MiSeq read number
25 million
NextSeq500 read number
400 million
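The output and read-number figures above can be combined with simple arithmetic. The sketch below assumes 1 Gb means 10^9 bases and uses a hypothetical 5 Mb genome for the coverage example; the dictionary and function names are made up for illustration.

```python
GB = 10**9  # assumption: 1 Gb = 10^9 bases

# Figures taken from the cards above.
PLATFORMS = {
    "MiSeq": {"output_bases": 15 * GB, "reads": 25_000_000},
    "NextSeq500": {"output_bases": 120 * GB, "reads": 400_000_000},
}

def bases_per_read(platform):
    """Average number of bases per read implied by output / read number."""
    p = PLATFORMS[platform]
    return p["output_bases"] / p["reads"]

def fold_coverage(platform, genome_size_bases):
    """How many times one run could cover a genome of the given size."""
    return PLATFORMS[platform]["output_bases"] / genome_size_bases

if __name__ == "__main__":
    for name in PLATFORMS:
        print(name, bases_per_read(name))
    # Hypothetical 5 Mb genome, chosen only as an example size.
    print(fold_coverage("MiSeq", 5_000_000))
```

Dividing output by read number gives the average bases per read implied by the card figures, and dividing output by genome size gives the theoretical fold coverage of a single run.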