Lecture 1 Flashcards
what was the first human genome sequence composed of
DNA from many donors combined
when do we care about insects enough to sequence their genomes
- model organisms
- or ones that spread disease/ affect us
- or have agricultural relevance (like bee pollinators)
- pest control
why were the first dna sequencing machines so expensive
- because the technology had to be developed first
- later sequencing was cheaper since a reference genome already existed
true/false the health and ancestry commercial dna analysis kits are for whole genome sequencing
- false
- they’re for genotyping
how do the health and ancestry dna analysis kits genotype
- they use “gene chips” that detect single nucleotide polymorphisms (SNPs)
- the more SNPs two people have in common, the more closely related they are
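A minimal sketch of the "more SNPs in common, more related" idea. The SNP names (`rs1` to `rs4`), the genotype calls, and the function `shared_snp_fraction` are all made up for illustration; real kits type far more markers and use more careful statistics than a simple match count.

```python
def shared_snp_fraction(genotype_a, genotype_b):
    """Fraction of SNP sites typed in both people where the calls match."""
    shared_sites = set(genotype_a) & set(genotype_b)
    if not shared_sites:
        return 0.0
    matches = sum(1 for site in shared_sites if genotype_a[site] == genotype_b[site])
    return matches / len(shared_sites)


# Hypothetical genotype calls at four hypothetical SNP sites
person_1 = {"rs1": "AA", "rs2": "AG", "rs3": "CC", "rs4": "TT"}
person_2 = {"rs1": "AA", "rs2": "AG", "rs3": "CT", "rs4": "TT"}
person_3 = {"rs1": "GG", "rs2": "GG", "rs3": "TT", "rs4": "TT"}

close_pair = shared_snp_fraction(person_1, person_2)    # 3 of 4 sites match
distant_pair = shared_snp_fraction(person_1, person_3)  # 1 of 4 sites match
```

In this toy data person_1 and person_2 would be called more related than person_1 and person_3.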
true/false we can sequence DNA without breaking it up
false
briefly describe the shotgun strategy
- dna extraction
- dna fragmentation
- clone into vectors
- transform bacteria, grow and isolate vector dna
- sequence the library
- assemble contiguous fragments
how do we sequence the library in the shotgun strategy
- randomly
- we’ll figure out how they all relate to each other later on
what strategy requires assembly of reads into contigs
shotgun
what is a contig
a series of overlapping DNA sequences used to make a physical map that reconstructs the original DNA sequence of a chromosome or a region of a chromosome
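A rough sketch of how overlapping reads can be merged into a contig. This is a simple greedy approach chosen for illustration, not necessarily what real assemblers do; the function names, the `min_len` cutoff, and the example reads are assumptions.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i != j:
                    size = overlap(left, right, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left: the remaining pieces are separate contigs (gaps)
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Toy reads taken in random order from the made-up sequence ATGCGTACGTTAGC
contigs = greedy_assemble(["GTTAGC", "ATGCGTAC", "GTACGTTA"])
gapped = greedy_assemble(["ATGCGT", "TTTAAA"])  # no overlap, so two contigs remain
```

When reads do not overlap, more than one contig is left over, which is the kind of gap a follow-up strategy has to close.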
what strategy is often used to close the gaps in shotgun sequencing
“primer walking” strategy
what strategy is often used to obtain the sequence of a short region of DNA
“primer walking” strategy
if you only want to sequence 1 kb of DNA, what should you do
use “primer walking” strategy
true/false “primer walking” strategy is often used to sequence full genomes
false
briefly describe “primer walking” strategy
- start sequencing from specific site in genomic DNA or chromosome
- design primer at a site based on sequence info obtained
- start sequencing w newly designed primer
- repeat 2 and 3
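The loop above can be sketched as a toy simulation. The template, `read_len`, and `primer_len` are made-up values, and the code keeps track of the position directly instead of modelling real primer annealing, which is a simplifying assumption.

```python
def primer_walk(template, first_primer, read_len=10, primer_len=4):
    """Walk along `template`, designing each new primer from the newest sequence."""
    pos = template.find(first_primer)
    if pos == -1:
        raise ValueError("first primer does not anneal to the template")
    known = first_primer               # sequence we know so far
    primers = [first_primer]
    end = pos + len(first_primer)      # index of the first base not yet read
    while end < len(template):
        read = template[end:end + read_len]      # one sequencing run from the primer
        known += read
        end += len(read)
        if end < len(template):
            primers.append(known[-primer_len:])  # design the next primer "as you go"
    return known, primers


# Made-up 26-base template, starting from a made-up known site
template = "ATGCGTACGTTAGCCTAGGATCCGAT"
known, primers = primer_walk(template, "ATGC")
```

Each new primer comes from the end of the previous read, so the order of the pieces is always known.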
describe the relationship between the “shotgun” and “primer walking” strategies
- shotgun is done to get most of it
- primer walking comes in to fill in the gaps
which is more orderly between “shotgun” and “primer walking” strategies
primer walking
true/false in primer walking you always know what came before it and what comes after
true
how do you decide primers for primer walking
as you go
sanger sequencing is based on what kind of synthesis of DNA
in vitro
true/false sangers sequencing is still frequently used today
- false
- has nasty chemicals and hard to scale up
- hardly used
what would happen in sanger sequencing if too much ddA is present in the A sample
all the resultant DNA strands would be very short since the chain termination would occur very early in the reaction
what happens when modified nucleotides are added during DNA synthesis in sanger sequencing
causes chain termination
describe sanger sequencing
- you’ll have a pool of normal ATGC
- and a tiny amount of the dideoxy (modified) ones
- anneal a primer and add polymerase to extend it
- once the ddATP gets added, we know what base is there (cause if we only have modified As, then an A must have gone there)
- these strands are separated by size via gel electrophoresis
- then repeat with the other nucleotides to see the other bases
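A small sketch of the logic behind the four tubes and the gel. It lists every possible terminated fragment instead of simulating random termination, and the strand `GATTACA` and the function names are made up for illustration.

```python
def sanger_fragments(new_strand):
    """For each ddNTP tube, list the lengths of the chain-terminated fragments."""
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        # A fragment of this length exists in the tube whose ddNTP matches this base
        tubes[base].append(length)
    return tubes


def read_gel(tubes):
    """Read the bands from the shortest fragment to the longest."""
    bands = [(length, base) for base, lengths in tubes.items() for length in lengths]
    return "".join(base for length, base in sorted(bands))


tubes = sanger_fragments("GATTACA")  # made-up newly synthesized strand
sequence = read_gel(tubes)
```

Sorting by fragment length is the code version of reading the gel one band at a time, starting with the shortest fragments.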
is DNA sequencing read from the bottom up or top down
bottom up
is the gel used for sanger sequencing the same as what we use in lab
- no
- this gel (polyacrylamide) can separate the strands by just one nucleotide
- way more sensitive
how many primers are annealed to the DNA strand in sanger sequencing
one only
when can we not design a primer in sanger sequencing
if no previous sequence info is known about the dna template
when do we need a new primer in sanger sequencing
for every new DNA template to be sequenced
how will we know what the primer should be for the plasmid vector in sanger sequencing
we’ll know the entire sequence of the plasmid so we can make primers for whatever region we’re interested in
how do we go from the plasmid vector to the DNA of interest
- cut the plasmid with RE X
- insert DNA
- denature plasmid DNA for sequencing by heating it up
- anneal one primer to the region of interest (that we’ll know because we know the entire sequence of the plasmid)
- split samples into 4 tubes and do the steps
why can’t we anneal more than 1 primer to the plasmid vector in sanger sequencing
- because you won’t be able to identify which fragments came from which primer
- they might be the same length which will confuse you when you try to run the gel
- overlapping bands
what are the 2 primers for the plasmid vector in sanger sequencing
- M13 forward
- M13 reverse
what would happen if you anneal forward and reverse primers at the same time in a single sample
you get sequences from both strands of the DNA template
why would you anneal forward and reverse primers at the same time in a single sample
can verify that they’re the reverse complement of each other, if they’re not… smth isn’t right
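The reverse complement check can be sketched like this. The reads are made up, and the sketch assumes both reads cover exactly the same region.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def reads_agree(forward_read, reverse_read):
    """True if the reverse read is the reverse complement of the forward read."""
    return reverse_complement(reverse_read) == forward_read


ok = reads_agree("ATGC", "GCAT")   # consistent pair
bad = reads_agree("ATGC", "GCAA")  # one base disagrees, so something isn't right
```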
what is the problem w manual sanger sequencing
- can only read 150-200 nucleotides per gel
- it’s okay for short sequences but not long ones
- very labour intensive and time consuming
what is the main difference in automated sanger sequencing over manual
they use a diff colour fluorescent dye to tag each ddNTP
how many tubes are used in manual sanger sequencing
4
how many tubes are used in automated sanger sequencing
1
how does automated sanger sequencing work
- each terminated chain carries the fluorescent dye of its ddNTP, so it fluoresces in a colour specific to that base
- the fluorescence detector relays its signals to the computer, which translates the colours into bases
what does it mean in the chromatogram for automated sanger sequencing when a peak has two colours
could be evidence of a heterozygote (one allele is G, the other A)
what are the advantages of automated over manual sanger sequencing
- Can read up to 900 nucleotides per Rx
- Allows automated reading and recording of results
- Cost effective
- Can perform all 4 ddNTPs Rx in one sample and load in same lane of gel
- Can sequence up to 384 diff DNA samples simultaneously by using gels formed within capillary tubes
what is the most popular 2nd gen dna sequencing method
sequencing by synthesis, SBS (Illumina)
what are the disadvantages of 2nd gen methods over sanger
- geared towards large # of samples, highly impractical for small samples
- requires high computing and data storage capacity
in next gen sequencing, what can be eliminated
- inserting/cloning the DNA into a vector
- transformation of vector into bacteria
- isolation of plasmid from transformed bacteria
what is done instead of all the plasmid steps in next gen sequencing
ligate “adapter sequences” to each end of the DNA fragment, then PCR amplify
how do you prep a genomic DNA library for next gen sequencing
- isolate genomic dna
- fragment genomic dna
- ligate dna primers/ adapters/ tags to each end of genomic dna fragments
- attach tagged dna fragments to slide
- pcr amplify to obtain large # of molecules of the same fragment in clusters or spots on the slide
- perform DNA sequencing reaction directly on the slide by passing diff solutions over slide
- record signal after each cycle
why is a pcr step added in the prep of a genomic DNA library for next gen sequencing
we need more material to build up the signal
why are adapters added in the prep of a genomic DNA library for next gen sequencing
- PCR needs primers, and since you know the adapter sequences you can design primers against them
- also helps you anchor your dna samples onto a solid support
true/false in illumina sequencing, only modified nucleotides are used
true
what dna sequencing technique is done by synthesis, but without permanent chain termination
illumina
where on the nucleotides in illumina is the fluorescence added
on the base
where on the nucleotides in illumina is the block added
the sugar
does illumina have multiplexing ability
yes, up to 50 million spots can be analyzed at the same time
how many altered nucleotides are added for illumina sequencing
all 4 are added in each cycle
what is the label for the 3 phosphates on the nucleotides
- alpha (closest to the rest of nucleotide)
- beta
- gamma
when the phosphates get cleaved (for the addition of the nucleotide) which one remains
alpha
in each cycle of illumina, how many nucleotides are detected
one
would a signal with 2x the average intensity be likely due to 2x incorporation of the same nucleotide in the same cycle in illumina sequencing?
- no
- the 3’ block stops more than one nucleotide from being added during a single cycle
- this cannot be done
the identity of each base of a cluster/spot is read off what for illumina
read off sequential images/ photos taken after each sequencing cycle
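A toy sketch of reading clusters off sequential images. Each "image" here is just a list with one base call per cluster, which is a big simplification, and the data and the name `read_clusters` are made up.

```python
def read_clusters(images):
    """images[cycle][cluster] is the base seen at that spot in that cycle."""
    n_clusters = len(images[0])
    # One base per cluster per cycle, so each read grows by one base per image
    return ["".join(image[c] for image in images) for c in range(n_clusters)]


# Three cycles (images) of three hypothetical clusters
images = [
    ["A", "G", "T"],  # cycle 1
    ["C", "G", "A"],  # cycle 2
    ["T", "T", "C"],  # cycle 3
]
reads = read_clusters(images)
```

The read length equals the number of cycles, and every cluster is read in parallel.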
what are 2 3rd gen dna sequencing methods
- single molecule real-time sequencing (SMRT)
- nanopore sequencing
what sequencing method doesn’t require any dna synthesis
nanopore sequencing
what is one instance where you can bring a sequencing device to the scene
the small portable nanopore sequencing device (MinION)
how does nanopore sequencing work
- we monitor changes to an electrical current as a single-stranded DNA moves through a tiny pore in a membrane
- each nucleotide base causes a characteristic change to the current
why do 3rd gen sequencing methods have easier genome/contig assembly
because they can do really long reads (it’s like doing a 10 piece puzzle rather than a 1000 piece one)
what are the disadvantages of SMRT over illumina method
- higher error rate
- increased cost
- not as many DNA sequencing centers have access to this technology
what are 3rd gen very useful for
- tracking outbreaks
- used for covid to see what outbreaks are out there
what is HiFi PacBio
- like SMRT but you sequence many times, rather than just once
- makes it easier to detect errors
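Why sequencing the same molecule many times helps can be sketched with a majority vote. This assumes the passes are already aligned, equal in length, and only contain substitution errors, which is a simplification; the passes are made up.

```python
from collections import Counter


def consensus(passes):
    """Take the most common base at each position across repeated passes."""
    return "".join(
        Counter(column).most_common(1)[0][0]  # majority base in this column
        for column in zip(*passes)
    )


# Four made-up passes over the same molecule, two of them with one error each
passes = ["ATGCGT", "ATGAGT", "ATGCGT", "TTGCGT"]
corrected = consensus(passes)
```

A random error in one pass gets outvoted by the other passes.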
why is SMRT very fast
- it is uninterrupted synthesis by a single DNA polymerase
- no pcr amplification of template DNA is required
how does SMRT work
- we have a bunch of microwells, that have a single DNA polymerase attached to the bottom
- the DNA template is gonna be sequenced, and is held by the polymerase
- there are pools of dNTPs in the wells
- each labeled w a diff colour fluorescent tag attached to the gamma phosphate
- only the dNTP held by the polymerase as it’s about to be added will be detected by the laser/detector at the bottom
- they record the pulse of light coming from the fluorescence
- when the dNTP is added, the phosphate (and fluorescence) is cleaved off, and the light is lost
- the next dNTP is added and it continues
what do the pulses of diff colour correspond to in SMRT
diff dNTPs being incorporated
briefly describe this
yellow means C is being held, and therefore incorporated, blue is the A being held and incorporated after
explain the diff peaks
no correlation to nucleotide being added
- very random
what does the personal genome project do
- sequenced the genomes of 100 000 people
- to correlate genotypes w health, physical and ancestry info provided by participants
what does the cancer genome project do
- sequenced the dna from primary tumours and normal genomic dna from the same people for 1000 cancer patients
- evaluated 350 human cancer cell lines and their response to 18 drugs to correlate drug sensitivity w specific genotypes
what does the pediatric cancer genome project do
sequenced and compared normal and tumor tissue samples from 600 pediatric cancer patients to find genetic causes of childhood cancers
true/false dna sequences for both strands of a double stranded dna template are usually obtained to check for any technical errors in the sequencing procedure
true
how many potential protein sequences can be derived from a double-stranded dna fragment
6
once you have obtained the sequence of a DNA fragment, how do you locate the genes/which of the reading frames is the right one
look for the longest “open” reading frame within DNA sequence
what is an open reading frame
the sequence of DNA that encodes a continuous stretch of AA before encountering a stop codon
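A minimal sketch of scanning all six reading frames for the longest open stretch. Stop codons are written in DNA letters (TAA, TAG, TGA), the example sequence is made up, and no start codon is required, which follows the definition above but is simpler than real gene finding.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # DNA versions of UAA, UAG, UGA
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def longest_open_stretch(seq, frame):
    """Longest run of codons without a stop codon in one frame (0, 1 or 2)."""
    best = current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0
        else:
            current += 1
            best = max(best, current)
    return best


def six_frame_scan(seq):
    """Three frames on the given strand (+) and three on the other strand (-)."""
    results = {}
    for name, strand in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            results[f"{name}{frame + 1}"] = longest_open_stretch(strand, frame)
    return results


frames = six_frame_scan("ATGAAATAGCCC")  # made-up fragment
best_frame = max(frames, key=frames.get)
```

In a fragment this short the differences between frames are small; in a real sequence the correct frame usually stands out because wrong frames hit stop codons often.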
what are the 3 stop codons
- UAA
- UAG
- UGA
what fraction of codons are stop codons
3/64, or about 1 in 21.3
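The 3/64 figure can be checked by listing every possible codon. The variable names are just for illustration.

```python
from itertools import product

STOP_CODONS = {"UAA", "UAG", "UGA"}

# All 4 x 4 x 4 = 64 possible three-base codons
codons = ["".join(bases) for bases in product("ACGU", repeat=3)]
stops = [codon for codon in codons if codon in STOP_CODONS]

fraction = len(stops) / len(codons)  # 3/64
one_in = len(codons) / len(stops)    # about 21.3
```

So in a random sequence, a wrong reading frame would be expected to hit a stop codon roughly every 21 codons.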
which reading frame is best
2
which reading frame is best
3
what do the green boxes mean
stop codons
what do the red bars mean
continuous open reading frames
1
which reading frame is best
3
true/false we can assume that adjacent genes are in the same reading frame
false
true/false genes can be coded on either strand of the dna
true
what is the other name for pacbio
SMRT