Chapter 4: Sequencing Genomes Flashcards

1
Q

Module 4

The techniques for DNA sequencing in use today can be divided into two categories:

A
  • The chain-termination method first devised by Fred Sanger and colleagues in the mid-1970s
  • Next-generation sequencing, which is a collection of methods, each of which utilizes a massively parallel strategy in order to generate millions of sequences at the same time
2
Q

Module 4

Nowadays, genome projects rely much more on next-generation techniques, which enable vast amounts of sequence to be obtained much more rapidly, but the chain-termination method is still performed in most molecular biology labs as a means of sequencing short DNA molecules such as _____ products and _____ _____ cloned in _____ or _____ vectors

A
  • PCR
  • small inserts
  • plasmid
  • bacteriophage
3
Q

polyacrylamide gel electrophoresis

A
  • single-stranded DNA molecules that differ in length by just a single nucleotide can be separated from one another
  • carried out in a capillary tube 50–80 cm in length w/a bore of 0.1 mm
  • possible to resolve a family of molecules representing all lengths up to 1500 nucleotides
  • single-stranded molecules emerging one after another from the end of the capillary
4
Q

Module 4

Chain-termination sequencing

A
  • Early version of dideoxy sequencing
  • Used Klenow polymerase (no 5’→3’ exonuclease); later used Sequenase (no 5’→3’ exonuclease, no 3’→5’ exonuclease, and high processivity)
  • ONE short oligonucleotide is annealed to the template DNA and acts as a primer for a DNA polymerase
  • The four deoxynucleotide triphosphates (dNTPs: dATP, dCTP, dGTP, and dTTP) are added for strand synthesis
  • Small amounts of the four dideoxynucleotide triphosphates (ddNTPs: ddATP, ddCTP, ddGTP, and ddTTP) are also added with a fluorescent marker
  • Four separate reactions had to be carried out, each with a different ddNTP but the same label
  • One primer, NOT PCR, single run, little product synthesized
  • Single-stranded template needed to avoid stem-loop formation
5
Q

Module 4

dideoxynucleotide

A
  • inhibitors of chain elongation by DNA polymerase (chain terminators), used in the Sanger method for DNA sequencing
  • known as 2’,3’-dideoxynucleotides because both the 2’ and 3’ positions on the ribose lack hydroxyl groups; abbreviated as ddNTPs
6
Q

Module 4

Chain-termination sequencing

DNA polymerase doesn’t discriminate between dNTPs and ddNTPs. Once a ddNTP is incorporated, it blocks further ___ ____ because it lacks the ___ group needed to form a connection w/the next nucleotide. Because ___ are present in larger amounts, strand synthesis doesn’t always terminate close to the ___. The result is a set of molecules of different lengths, each ending in a _____ whose identity indicates the nucleotide at the equivalent position in the template

A
  • strand elongation
  • 3ʹ-hydroxyl
  • dNTPs
  • primer
  • dideoxynucleotide
7
Q

Module 4

Chain-termination sequencing

To identify the _____ at the end of each chain-terminated molecule, the DNA mixture is loaded onto the _____ _____, and _____ is carried out to separate the molecules by length. Afterward, the molecules are run past a _____ _____ to identify the dideoxynucleotides, and thus whether each molecule ends in A, C, G, or T. The sequence can be printed or entered directly into a storage device for future analysis.

A
  • dideoxynucleotide
  • capillary gel
  • electrophoresis
  • fluorescence detector
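
The read-out logic in the chain-termination cards above can be sketched in Python. This is a minimal, idealized illustration: it assumes termination happens exactly once at every position, and the function names and the example template are hypothetical, not part of the course material.

```python
# Complement table used to build the newly synthesized strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def chain_terminated_fragments(template_3to5):
    """Return (length, terminal ddNTP) for every possible termination point.

    template_3to5 is the template written 3'->5', i.e. in the order the
    polymerase copies it. Idealized: exactly one fragment per position.
    """
    new_strand = [COMPLEMENT[base] for base in template_3to5]
    return [(i + 1, new_strand[i]) for i in range(len(new_strand))]


def read_sequence(fragments):
    """Mimic capillary electrophoresis: order fragments by length, then
    read the label of the terminal dideoxynucleotide on each one."""
    return "".join(label for _, label in sorted(fragments))


fragments = chain_terminated_fragments("TACGGT")
# The input order does not matter; the shortest fragment is detected first.
print(read_sequence(reversed(fragments)))  # ATGCCA
```

Sorting by length stands in for the gel, and reading the terminal label stands in for the fluorescence detector.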
8
Q

Module 3

Three criteria in particular must be fulfilled by a sequencing enzyme (polymerase)

A
  • High processivity
    • length of polynucleotide that is synthesized before the polymerase terminates through natural causes
    • so that it does not dissociate from the template before incorporating a dideoxynucleotide
  • Negligible or zero 5ʹ → 3ʹ exonuclease activity
    • exonuclease activity is a disadvantage
    • removal of nucleotides from the 5ʹ-ends of the newly synthesized strands alters the lengths of these molecules, making it impossible to determine the correct sequence
  • Negligible or zero 3ʹ → 5ʹ exonuclease activity
    • so the polymerase does not remove the dideoxynucleotide at the end of a completed strand, which would allow the strand to be extended further
9
Q

Module 3

most sequencing today makes use of the Taq DNA polymerase, which has _____ _____ and _____ _____ ______ enabling sequences of _____ bp and longer

A
  • high processivity
  • no exonuclease activity
  • 750
10
Q

Module 4

The chain-termination method that uses Taq polymerase is called _____ ______ ______

A

thermal cycle sequencing

11
Q

Module 4

thermal cycle sequencing

A
  • carried out in a similar way to PCR
  • reaction mixture includes the four dideoxynucleotides
  • just one primer is used
  • Because only one primer is used, only one strand of the starting molecule is copied, and the product accumulates in a linear fashion, not exponentially
12
Q

Module 4

thermal cycle sequencing

If two separate reactions are carried out, one with each of the two PCR primers, then _____ and _____ sequences are obtained. This is an advantage if the PCR product is more than _____ bp and hence too long to be sequenced completely in one experiment

A
  • forward
  • reverse
  • 750
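
A rough Python sketch of why forward and reverse reads help with longer PCR products. It assumes idealized reads of a fixed length that start at the very ends of the product (primer sites and read quality are ignored); the function names and the short example sequence are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def forward_and_reverse_reads(product, read_len=750):
    """Idealized reads primed from each end of a PCR product.

    The forward read copies the start of the top strand. The reverse read
    copies the start of the bottom strand, so it is the reverse complement
    of the end of the top strand.
    """
    forward = product[:read_len]
    reverse = reverse_complement(product)[:read_len]
    return forward, reverse


def fully_covered(product_len, read_len=750):
    """True if the two reads together span the whole product."""
    return product_len <= 2 * read_len


fwd, rev = forward_and_reverse_reads("AACCGGTTAC", read_len=4)
print(fwd, rev)
print(fully_covered(1200), fully_covered(1600))
```

Under these assumptions, two 750 bp reads can span a product of up to 1500 bp; anything longer needs internal primers.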
13
Q

Module 4

thermal cycle sequencing

Forward, reverse, and internal primers enable _____ ______ of a PCR product to be sequenced.

A

different sections

14
Q

Module 4

thermal cycle sequencing

universal primers

A
  • anneals to the vector DNA adjacent to the position at which new DNA is inserted
  • A single universal primer can be used to sequence any DNA insert
  • used when sequencing an entire clone
  • when a genomic library is created using the same vector for every DNA insert, one universal primer can be used to sequence all the clones
  • mainly used for contig clone approach
  • clones anchored to genome and then sequenced
15
Q

thermal cycle sequencing

Automated sequencers with multiple capillary gels working in parallel can read up to _____ different sequences in a _____ period

A
  • 384
  • one-hour
16
Q

Module 3

No sequencing method is entirely accurate, so it is necessary to sequence each region of a genome multiple times in order to identify errors present in individual sequence _____. With the chain-termination method, to ensure that errors are identified, at least 5× sequence _____ or _____ is needed, meaning that every nucleotide is present in _____ different reads.

A
  • reads
  • depth
  • coverage
  • five
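
The arithmetic behind sequence depth can be sketched as follows. This uses the common average-depth estimate (number of reads × read length ÷ genome size); real coverage is uneven, so it is only a planning figure. The 4.6 Mb genome size in the example is an assumed value chosen for illustration.

```python
import math


def sequence_depth(num_reads, read_len, genome_size):
    """Average number of reads covering each nucleotide."""
    return num_reads * read_len / genome_size


def reads_needed(depth, read_len, genome_size):
    """Smallest number of reads that reaches the requested average depth."""
    return math.ceil(depth * genome_size / read_len)


# Hypothetical example: 750 bp chain-termination reads, 4.6 Mb genome, 5x depth.
n = reads_needed(5, 750, 4_600_000)
print(n, sequence_depth(n, 750, 4_600_000))
```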
17
Q

Module 4

Next-generation sequencing

A

the term applied to a variety of methods that enable thousands or millions of DNA fragments to be sequenced in parallel in a single experiment

18
Q

Module 4

Next-generation sequencing

The preparation and use of this sequencing library is the distinctive feature that distinguishes these methods from _____-______ _____, which is able to sequence only individual DNA fragments, each one obtained by a different _____ or from a different _____. Next-generation methods therefore enable the vast amounts of data needed to assemble an entire genome sequence to be obtained much more rapidly than with the chain-termination approach.

A
  • chain-termination sequencing
  • PCR
  • clone
19
Q

Module 4

Next-generation sequencing methods

The common feature of the various next-generation sequencing methods is the prior preparation of a _____ of DNA fragments that have been _____ on a solid support in such a way that multiple sequencing reactions can be carried out side by side in a massively _____ _____ format. The fragments are usually ___ - ___ bp in length

A
  • library
  • immobilized
  • parallel array
  • 100–500
20
Q

Module 4

Next-generation sequencing methods

sonication

A
  • most popular way of breaking genomic DNA down into fragments of 100–500 bp
  • uses high-frequency sound waves to make random cuts
  • Random breakage is important because each fragment will be sequenced from its ends
  • With next-generation methods it is not possible to direct the sequencing toward the middle of a fragment so the ends must be randomly distributed throughout the starting DNA molecule in order to ensure that the entire molecule is sequenced
21
Q

Module 4

Next-generation sequencing methods

Two different immobilization methods commonly used

A
  • solid support is a glass slide
  • small metallic beads
22
Q

Module 4

Next-generation sequencing methods

immobilization methods: glass slide used for solid support

A
  • glass slide used for solid support
  • has been coated with many copies of a short oligonucleotide
  • Adaptors are ligated to the ends of the DNA fragments
  • DNA is denatured
  • resulting single-stranded molecules attach to the glass slide by base pairing between their adaptor sequences and the immobilized oligonucleotides
  • adaptors provide the annealing sites for the primers for PCR
  • PCR products become attached to adjacent oligonucleotides, so each starting fragment is amplified into an immobilized cluster of identical fragments
23
Q

Module 4

Next-generation sequencing methods

adaptors

A

short pieces of double-stranded DNA whose sequences match that of an oligonucleotide immobilized on the solid support

24
Q

Module 4

Next-generation sequencing methods

immobilization methods: small metallic beads

A
  • solid support is provided by small metallic beads that are coated with the protein streptavidin
  • DNA fragments are ligated to adaptors that carry a biotin label attached to their 5ʹ-ends
  • Biotin binds strongly to streptavidin, so the fragments become attached to the metallic beads by biotin–streptavidin linkages
  • just one fragment becomes attached to each bead
  • beads are then shaken in an oil–water mixture to generate an emulsion resulting in one bead in each aqueous droplet within the emulsion
  • Each aqueous droplet is then transferred into a different well in a multiple array on a plastic strip
  • adaptors provide the annealing sites for the primers for this PCR
  • PCR is carried out in the oil emulsion
  • PCR products are retained within their own water droplet, prior to deposition of those droplets into the wells on the plastic strip
25
Q

next-generation sequencing

reversible terminator sequencing / Illumina sequencing method

A
  • Currently the most popular method
  • w/next-generation sequencing, process is initiated by a primer that anneals to the adaptor sequence
  • There are no normal deoxynucleotides present in the reaction mixture
  • uses modified nucleotides w/a removable chemical group attached to the 3ʹ-carbon of the modified nucleotide
  • removable blocking group is a fluorescent label, a different one for each of the four nucleotides
  • the modified nucleotides block strand synthesis when incorporated at the end of synthesized molecule
  • the attached chemical group is removed from the modified nucleotide once the identity of the nucleotide has been confirmed, making the termination step reversible
  • each step in strand synthesis is accompanied by a pause and an optical device detects the fluorescent label
  • enzyme then removes the label
  • the next terminator nucleotide is added and the detection process repeats
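
The cycle described in this card (incorporate one blocked nucleotide, detect its label, remove the block, repeat) can be sketched in Python. This is a toy model, assuming error-free chemistry and templates written 3'->5'; the function name and example templates are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reversible_terminator_run(cluster_templates, cycles):
    """Sequence every cluster in parallel, one nucleotide per cycle.

    Each template is written 3'->5', the order in which it is copied.
    """
    reads = ["" for _ in cluster_templates]
    for cycle in range(cycles):
        for i, template in enumerate(cluster_templates):
            if cycle < len(template):
                # One blocked, labeled nucleotide is incorporated and its
                # fluorescent label detected; the block is then removed so
                # the next cycle can extend the strand by one more position.
                reads[i] += COMPLEMENT[template[cycle]]
    return reads


print(reversible_terminator_run(["TACG", "GGCA"], cycles=3))
```

Because every cluster advances by exactly one position per cycle, the read length equals the number of cycles, while throughput comes from the number of clusters.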
26
Q

next-generation sequencing

in reversible terminator sequencing every cluster of fragments in the library is sequenced at the same time generating relatively short sequence reads with a maximum length of _____ bp but is so massively parallel that up to _____ Mb of sequence can be obtained per run.

A
  • 300
  • 2000
27
Q

Module 4

next-generation sequencing

pyrosequencing method

A
  • reaction mixture contains only deoxynucleotides
  • there is no artificial termination
  • Each deoxynucleotide is added individually in a repetitive series, along with a nucleotidase enzyme that degrades the deoxynucleotide if it is not incorporated into the strand being synthesized.
  • A flash of chemiluminescence is generated by the enzyme sulfurylase when pyrophosphate is released each time DNA polymerase adds a deoxynucleotide
  • this flash signals the successful copying of one position in the template molecule
  • pattern of light emissions is used to deduce the order in which nucleotides are incorporated
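
A small Python sketch of how a pattern of light emissions maps back to a sequence. It assumes nucleotides are flowed in a fixed repeating order and that signal strength scales with the number of identical nucleotides added in one flow (an assumption of this model, not stated in the card). Names and the example template are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrosequence(template_3to5, flow_order="ACGT"):
    """Return a list of (nucleotide, signal) pairs, one per flow.

    signal is the number of nucleotides incorporated during that flow;
    0 means the nucleotide was not incorporated and would be degraded.
    """
    pos = 0
    flows = []
    while pos < len(template_3to5):
        for nt in flow_order:
            count = 0
            while pos < len(template_3to5) and COMPLEMENT[template_3to5[pos]] == nt:
                count += 1
                pos += 1
            flows.append((nt, count))
            if pos == len(template_3to5):
                break
    return flows


def decode(flows):
    """Deduce the synthesized sequence from the pattern of signals."""
    return "".join(nt * count for nt, count in flows)


flows = pyrosequence("TTACG")
print(flows)
print(decode(flows))
```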
28
Q

Module 4

next-generation sequencing

ion torrent

A
  • similar approach to pyrosequencing
  • uses repetitive series of nucleotides flowed over an immobilized fragment library on acrylamide beads
  • detection system uses hydrogen ions that, along with pyrophosphate, are released every time a nucleotide is incorporated into the growing strand
  • each bead sits in a well lined with an ion-sensitive field-effect transistor
  • the transistor generates an electronic pulse each time it detects hydrogen ions
  • the pulses are related to the flow of nucleotides over the well in order to deduce the sequence of the immobilized fragments
  • Read lengths of up to 400 bp are possible
  • electronic detection system has lower construction and running costs compared with the optical detectors
29
Q

next-generation sequencing

sequencing by oligonucleotide ligation and detection (SOLiD)

A
  • a primer is attached to the template DNA
  • DNA ligase and a set of 1024 oligonucleotides, representing each of the possible five-nucleotide sequences, are added
  • one oligonucleotide hybridizes adjacent to the primer and is joined to it by DNA ligase
  • The process of hybridization–ligation continues for a set number of cycles until 50–75 nucleotides of the template have been covered
  • computationally intensive
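
The figure of 1024 follows from four possible nucleotides at each of five positions (4^5). A short Python check, using a hypothetical helper name, enumerates them; it only illustrates the count and does not model the actual probe chemistry.

```python
from itertools import product


def all_oligonucleotides(length=5, alphabet="ACGT"):
    """Enumerate every possible sequence of the given length."""
    return ["".join(p) for p in product(alphabet, repeat=length)]


oligos = all_oligonucleotides()
print(len(oligos))  # 1024
```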
30
Q

Module 4

next-generation sequencing

  • One limitation of the three sequencing-by-synthesis methods described above is _____
  • With the reversible terminator method, the delay is caused by _____
  • during pyrosequencing and ion torrent sequencing, the delay occurs because _____
  • This _____ increases the period of time needed to complete a sequence read and also decreases the _____ of the polymerase, limiting the _____ of those reads.
A
  • that detection of each nucleotide addition requires a brief delay in the DNA polymerization process
  • the need to remove the 3ʹ-blocking group after each nucleotide addition, and
  • each nucleotide is presented individually to the polymerase.
  • delay
  • processivity
  • lengths
31
Q

Module 4

third-generation sequencing

A
  • methods that avoid a delay at the nucleotide detection step and enable a sequence to be read during the normal, unimpeded progression of the polymerase along the template
  • sequencing in real time
32
Q

Module 4

third-generation sequencing

single-molecule real-time sequencing / PacBio sequencing method

A
  • most promising
  • uses zero-mode waveguide, a sophisticated optical system to observe the copying of a single DNA template
  • nucleotide substrates are labeled with fluorescent markers w/out a blocking group
  • the optical system detects the sequence during synthesis
  • the fluorescent marker is removed immediately after nucleotide incorporation
  • strand synthesis progresses without interruption
  • Read lengths of up to 20,000 bp have been reported
33
Q

Module 4

fourth-generation sequencing

A
  • dispenses with the strand-synthesis step
  • reads the sequence of a DNA molecule directly without copying that molecule in any way
34
Q

Module 4

fourth-generation sequencing

nanopore sequencing

A
  • uses a synthetic membrane with small pores just large enough for DNA to pass through
  • an electrical potential is set up across the membrane, positive on one side and negative on the other
  • electrophoresis causes dsDNA to approach a nanopore
  • a helicase enzyme at the nanopore breaks the base pairs so that one ssDNA strand passes through the pore
  • the sequence is read because each nucleotide has a different shape and occludes the pore in a different way, resulting in a slightly different perturbation of the flow of ions passing through the membrane
  • perturbations are measured to get sequence
  • Because no synthesis is involved, the length of the sequence is not limited by polymerase processivity
  • reads of up to 50 kb have been reported
  • limited because accuracy of sequence identification is hampered by the speed at which ssDNA passes through
  • improvements are being sought by modifying the pore structure so that the passage of DNA is slowed down
35
Q

Module 4

Since the 1990s, when the chain-termination method was first automated, the actual generation of sequence data has not been a limiting factor in genome sequencing projects. Instead, the main challenge lies with _____ ______

A

sequence assembly

36
Q

Module 4

shotgun method

A
  • involves breaking the genome into a collection of small DNA fragments that are sequenced individually
  • A computer program looks for overlaps in the DNA sequences and uses them to place the individual fragments in their correct order to reconstitute the genome
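
A toy Python sketch of overlap-based assembly. It uses a simple greedy merge of the best-overlapping pair of reads, which is one possible strategy and not necessarily what any real assembler does; it also assumes error-free reads from a single strand. The function names and example reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left; the remaining contigs stay separate
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


print(greedy_assemble(["ATGCGT", "CGTACC", "ACCGGA"]))
```

Repeated sequences longer than the overlap can mislead this kind of merge, which is one reason assembly rather than data generation is the hard part.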
37
Q

reference genome

A
  • the existing sequence is used as a reference genome for assembly of additional genome sequences from the same species
  • Rather than looking for overlaps among the sequence reads for the genome that is being assembled, individual reads are simply placed on to the reference sequence by looking for regions of sequence identity or similarity
  • recombination can cause a problem when a reference genome is used
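
Placing reads on a reference by identity or similarity can be sketched as a brute-force scan. This is a deliberately simple illustration (real mappers use indexes); the mismatch threshold, function names, and sequences are assumptions made for the example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, max_mismatches=1):
    """Return the reference position where the read fits best, or None
    if no position has at most max_mismatches differences."""
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        mm = hamming(read, reference[pos:pos + len(read)])
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos


reference = "ATGCGTACCGGATTA"
print(map_read("GTACC", reference))  # exact match
print(map_read("GTTCC", reference))  # one mismatch, still placed
print(map_read("AAAAA", reference))  # no acceptable placement
```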
38
Q

Module 4

chromosome walking

A
  • begins with one clone from a library
  • the insert from this clone is used as a hybridization probe to screen and identify a 2nd clone whose insert overlaps with the insert in the 1st clone
  • if match is made, this 2nd clone DNA insert can be used as a new probe to continue the walk
  • then identify a 3rd clone whose insert overlaps with the 2nd clone, and so on.
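
The walk can be sketched in Python, with a string-overlap test standing in for hybridization screening. This is a loose analogy under strong assumptions: the walk goes in one direction only, each probe has at most one relevant hit, and the clone names and inserts are hypothetical.

```python
def shares_end(probe, candidate, min_len=4):
    """Crude stand-in for hybridization: True if the end of the probe
    insert matches the start of the candidate insert over min_len bases
    or more."""
    longest = min(len(probe), len(candidate))
    return any(probe.endswith(candidate[:n]) for n in range(min_len, longest + 1))


def chromosome_walk(library, start):
    """Order clone names by repeatedly probing the library with the insert
    of the most recently identified clone."""
    walk = [start]
    unused = set(library) - {start}
    while True:
        probe = library[walk[-1]]
        hits = sorted(name for name in unused if shares_end(probe, library[name]))
        if not hits:
            return walk
        walk.append(hits[0])
        unused.remove(hits[0])


library = {
    "c1": "ATGCGTAC",
    "c2": "GTACCGGA",
    "c3": "CGGATTAG",
    "c4": "TTTTTTTT",  # unrelated clone, never reached by the walk
}
print(chromosome_walk(library, "c1"))
```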
39
Q

Module 4

The effect of heterozygosity on sequence assembly

A
  • In this region of the assembly, some of the reads covering a particular position identify the nucleotide as C, whereas a similar number of reads identify the nucleotide as T
  • This is because of heterozygosity: one member of the pair of homologous chromosomes has a C at this position, and the other has a T.
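
A minimal Python sketch of spotting such a position from the reads that cover it. The 30% threshold for treating a base as a real allele rather than a sequencing error is an arbitrary assumption for illustration, and the function name is hypothetical.

```python
from collections import Counter


def call_position(bases, min_fraction=0.3):
    """Return the bases supported by at least min_fraction of the reads
    covering one position. Two bases suggest a heterozygous site."""
    counts = Counter(bases)
    total = sum(counts.values())
    return sorted(b for b, n in counts.items() if n / total >= min_fraction)


print(call_position("CCCCTTTT"))  # similar numbers of C and T reads
print(call_position("CCCCCCCT"))  # a lone T looks like a read error
```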