Chaudhari Flashcards
give 2 examples of first generation sequencing and whether they are used or not
Maxam-Gilbert (no longer used), Sanger (still used)
give 2 examples of second generation sequencing technologies
illumina
ion torrent
give 2 examples of 3rd gen sequencing technologies
pacific biosciences
oxford nanopore
what is sanger good at sequencing
low-volume targeted sequences (not good for genome sequencing but useful for smaller reads)
what kind of reads does illumina produce
many short (50-300bp) reads
what kind of reads do oxford nanopore generate compared to illumina
fewer reads but very long reads (>10kb)
what are the 3 advantages of second gen sequencing (illumina) over sanger
massively parallel - a single run generates millions of sequences
much cheaper per base
built-in shotgun sequencing without a cloning step
what are the disadvantages of second gen sequencing technologies compared to sanger
library prep is expensive and slow
amplification of DNA fragments is required which can introduce biases
read lengths are quite short
what is good about third gen
good for finished genomes (gives no gaps) and base modifications (epigenetics)
individual reads in third gen have a quite high error rate T/F
T
what are the applications of illumina
draft genome sequencing
resequencing
functional genomics
what are the applications of Pac bio
complete genome sequencing
detection of DNA methylation
what are the applications of oxford nanopore
complete genome sequencing
epigenetics
direct RNA-seq
metagenomics
what are the key steps to illumina sequencing
- extract genomic DNA
- fragment genomic DNA
- add linkers
- add input library to flow cell
- Amplification (bridge amplification)
- sequencing
- image taken
each flow cell cluster corresponds to what
a separate read
patterned flow cells make the system less prone to what
overclustering
what is a patterned flow cell
there are individual nanowells that contain the primers for bridge amplification. A cluster is generated within each well and cannot spread outside the well, so overlapping clusters do not form
what is meant by phasing when referring to illumina
In sequencing-by-synthesis chemistry like Illumina (sorry, Solexa!) phasing is the rate at which single molecules within a cluster lose sync with each other. Phasing is falling behind, pre-phasing is going ahead, and together they describe how well the chemistry is performing
once you get above about 100 bases in illumina, phasing builds up and the signal from a cluster becomes confused. this is why illumina reads are limited to being short T/F
T
phasing problems result in a reduction of quality towards the end of reads T/F
T
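A minimal sketch of why phasing lowers quality towards the end of reads. It is a toy model, not Illumina's actual chemistry or base caller: the function name `simulate_cluster` and the per-cycle rates `p_phase` and `p_prephase` are assumptions made up for illustration. Each molecule in a cluster normally adds one base per cycle, but occasionally adds none (phasing) or two (pre-phasing), and the fraction of molecules still in sync with the cycle number is tracked.

```python
import random


def simulate_cluster(n_molecules=1000, n_cycles=150,
                     p_phase=0.005, p_prephase=0.005, seed=0):
    """Return the fraction of molecules in sync after each cycle.

    p_phase and p_prephase are hypothetical per-cycle error rates.
    """
    rng = random.Random(seed)
    positions = [0] * n_molecules  # bases incorporated so far per molecule
    in_sync = []
    for cycle in range(1, n_cycles + 1):
        for i in range(n_molecules):
            r = rng.random()
            if r < p_phase:
                step = 0  # phasing: no base added, molecule falls behind
            elif r < p_phase + p_prephase:
                step = 2  # pre-phasing: two bases added, molecule runs ahead
            else:
                step = 1  # normal: one base per cycle
            positions[i] += step
        # A molecule is in sync when it has added exactly one base per cycle.
        in_sync.append(sum(1 for p in positions if p == cycle) / n_molecules)
    return in_sync


if __name__ == "__main__":
    fractions = simulate_cluster()
    for cycle in (1, 50, 100, 150):
        print(cycle, round(fractions[cycle - 1], 3))
```

The in-sync fraction falls as cycles accumulate, so the cluster's signal becomes a mixture of neighbouring bases and base calls late in the read are less reliable.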
what is 2 colour illumina sequencing
4 bases are sequenced using only 2 colours, e.g. red and green
- 1st base - green
- 2nd base - red
- 3rd base - both
- 4th base - none
what are the benefits of 2 colour illumina sequencing
it allows simplified optics so has lower costs
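The two-colour scheme can be sketched as a lookup from (red, green) signal pairs to bases. The card only gives the pattern (one base green, one red, one both, one none); the specific base-to-colour assignment below and the names `TWO_COLOUR_CODE` and `call_bases` are assumptions chosen for illustration.

```python
# Assumed assignment for illustration only:
# T = green only, C = red only, A = both colours, G = no signal.
TWO_COLOUR_CODE = {
    (False, True): "T",
    (True, False): "C",
    (True, True): "A",
    (False, False): "G",
}


def call_bases(signals):
    """Turn per-cycle (red_on, green_on) pairs into a base string."""
    return "".join(
        TWO_COLOUR_CODE[(bool(red), bool(green))] for red, green in signals
    )


if __name__ == "__main__":
    # Four cycles: green only, red only, both, neither.
    print(call_bases([(0, 1), (1, 0), (1, 1), (0, 0)]))
```

Because two images per cycle are enough to tell four bases apart, only two detection channels are needed instead of four.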
how do oxford nanopore sequence
a DNA strand is threaded through the pore; the few bases sitting in the pore at any moment change the electrical current, and the sequence is determined from that signal
what is the highly portable version of the oxford nanopore known as
minION
what is the promethION
essentially lots of minIONs in a suitcase
what is GridION
somewhere in between minION and promethION - 5 minION flow cells
when was the first bacterial genome sequence published
1995
what was the first bacterial genome sequencing project to be INITIATED
E.coli K-12
which were the first bacterial genomes to be sequenced and how was this done
Haemophilus influenzae and Mycoplasma genitalium
shotgun sequencing
what does shotgun sequencing rely on
computational assembly of sequence from random clone libraries: random parts of the genome are sequenced and then pieced together into contigs
what is de novo assembly
process of merging overlapping sequence reads into contiguous sequences (contigs) without the use of any reference genome as a guide
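A minimal sketch of the idea of merging overlapping reads into contigs, assuming error-free reads all from the same strand. This is a toy greedy merge, not how production assemblers work (those typically use overlap-layout-consensus or de Bruijn graph methods); the function names `overlap` and `greedy_assemble` and the `min_len` cutoff are assumptions for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a matching a prefix of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Reads with no sufficient overlap stay as separate contigs, which is how gaps arise; repeats longer than the reads can also cause wrong merges.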
what is the best reference to use to order contigs
usually the most closely related bacterium with a ‘finished’ genome
Due to evolutionary differences between the reference and novel genome, the presence of (often mobile) repeat elements such as prophages, and the very nature of short-read assemblers, there will almost certainly be assembly errors present within the contigs T/F
T
Once the ordered set of contigs has been obtained, what is the next step
to annotate the draft genome - the process of ‘gene’ finding
Multi-locus sequence typing (MLST) is a widely used, sequence-based method for typing of bacterial species and plasmids T/F
T
what are gaps in the draft genome from shotgun sequencing due to
- repeats
how can you fill in the gaps in a draft genome
PCR across the gaps, using primers designed from the ends of the contigs
how were genes that are toxic to bacteria identified
Looked at data from previous experiments on multiple microbial genomes that used sanger in a clone-by-clone approach
The gaps in these sequences must have been there for a reason (a clone containing that bit of genome may be toxic to the cell you're growing the plasmid up in)
Identified compounds from these gaps that were toxic to e.coli
where was e.coli k12 originally isolated from
a convalescent diphtheria patient in 1922 (part of the gut flora)
how many protein coding genes does e.coli k12 have
4288
how were regions of low GC content acquired in e.coli k12
horizontal transfer
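Regions of unusually low GC content can be spotted with a sliding window over the genome sequence. A minimal sketch, assuming the genome is available as a plain string; the function names, window size, step and the 0.40 threshold are assumptions for illustration, not values from the original analysis.

```python
def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def low_gc_windows(genome, window=1000, step=500, threshold=0.40):
    """Return (start, end, gc) for windows with GC fraction below threshold."""
    hits = []
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = genome[start:start + window]
        gc = gc_content(chunk)
        if gc < threshold:
            hits.append((start, start + len(chunk), gc))
    return hits


if __name__ == "__main__":
    # Toy sequence: GC-rich flanks around an AT-rich island.
    toy = "GC" * 10 + "AT" * 10 + "GC" * 10
    print(low_gc_windows(toy, window=20, step=20))
```

Windows that fall well below the genome-wide average are candidates for horizontally acquired DNA, though low GC alone is only a hint and not proof.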