01: DNA Sequencing Flashcards

1
Q

true/false The “health and ancestry” commercial DNA analyses available to the public use whole-genome sequencing rather than genotyping

A
  • false
  • it is the other way around: they use genotyping rather than whole-genome sequencing
2
Q

true/false Most current methods of manipulating DNA, RNA, and proteins rely on prior knowledge of the nucleotide sequence of the genome of interest

A

true

3
Q

what is the most widely used method to determine nucleotide sequences in a genome of interest

A
  • dideoxy sequencing
  • aka sanger sequencing
4
Q

what is used in sanger sequencing

A
  • DNA polymerase
  • dideoxyribonucleoside triphosphates (special chain-terminating nucleotides)
5
Q

how does sanger sequencing work

A
  • DNA polymerase and the chain-terminating nucleotides produce a collection of DNA copies that terminate at every position in the original DNA sequence
  • these copies are then separated by size and visualized to reveal which nucleotide occupies each position
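Note: the idea in this answer can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of a real protocol: it ignores strand polarity, assumes exactly one terminated copy per position, and the function names are made up for illustration.

```python
def sanger_fragments(template):
    """Return every chain-terminated copy of the new strand, one per position."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    # The polymerase builds the strand complementary to the template.
    new_strand = "".join(complement[base] for base in template)
    # A dideoxy nucleotide can be incorporated at any position, so the
    # reaction yields copies ending at position 1, 2, ..., len(new_strand).
    return [new_strand[:end] for end in range(1, len(new_strand) + 1)]


def read_ladder(fragments):
    """Sort fragments by length (as a gel does) and read each terminal base."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))


template = "TACGGATC"  # made-up example template
fragments = sanger_fragments(template)
sequence = read_ladder(fragments)
print(sequence)
```

Reading the last base of each fragment, shortest to longest, recovers the newly synthesized strand one position at a time, which is what the ladder of bands encodes.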
6
Q

what is the key difference between how sanger sequencing used to work, and how it does now

A
  • originally, 4 different sequencing reactions were performed, each with a different dideoxyribonucleotide
  • the DNA copies were labeled with radioactivity and separated on polyacrylamide gels
  • these were then exposed to film to produce four ladders of bands that were read manually to reveal the sequence
  • now robotic devices mix the reagents, including the four different chain-terminating dideoxyribonucleotides
  • each dideoxyribonucleotide is tagged with a different-coloured fluorescent dye
  • the reactions are loaded onto capillary gels, which separate the reaction products into a series of distinct bands
  • a detector then records the colour of each band, and a computer translates the information into a nucleotide sequence
7
Q

Automated dideoxy sequencing was used to determine the nucleotide sequences of which genomes

A
  • E. coli
  • fruit flies
  • nematode worms
  • humans
  • many others
8
Q

Due to _______ the cost of sequencing DNA has decreased dramatically, and the number of sequenced genomes has increased enormously

A

“second-generation sequencing technologies”

9
Q

what do second-generation sequencing technologies allow us to do

A
  • sequence multiple genomes in a matter of weeks
  • catalog the variation in nucleotide sequences from people around the world
  • uncover the mutations that increase the risk of various diseases, from cancer to autism
  • determine the genome sequences of extinct species
  • understand the molecular basis of key evolutionary events in the tree of life
10
Q

what is the most common second-generation sequencing method

A

Illumina sequencing

11
Q

how does Illumina sequencing work

A
  • begins with the construction of libraries of small DNA fragments that represent the entire genome
  • each fragment is amplified by PCR
  • the amplification is done in a way that keeps all of the copies close to the original fragment
  • sequencing is done with chain-terminating nucleotides carrying uniquely coloured fluorescent tags
  • DNA polymerase adds one fluorescent nucleotide
  • a photo of the reaction records the colour, revealing the identity of the nucleotide that was added
  • the coloured label and chain-terminating group are removed, allowing the polymerase to add the next nucleotide
  • this cycle is repeated hundreds of times
  • a computer stitches together all the fragment sequences, using the overlaps between them as guides, to reconstruct the full genome sequence
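Note: the final "stitching" step can be illustrated with a toy greedy overlap merge. This is only a sketch under simplifying assumptions (error-free reads, a single strand, no repeats longer than the overlap); it is not how production assemblers work, and the function names and `min_len` threshold are hypothetical.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; leave separate contigs
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Made-up short reads covering one made-up sequence.
reads = ["CGTACGTT", "CGTTAGC", "ATGGCGTA"]
contigs = greedy_assemble(reads)
print(contigs)
```

The reads are given out of order on purpose: the overlaps alone are enough to put them back in the right order.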
12
Q

true/false as in conventional dideoxy sequencing, the fluorescent tag and the chemical group that blocks elongation are both removable in Illumina sequencing

A
  • false
  • both are removable in Illumina sequencing, but not in conventional dideoxy sequencing
13
Q

what is special about third-generation sequencing methods

A

capable of sequencing much longer DNA molecules

14
Q

what are the 2 promising third-generation sequencing methods

A
  • single-molecule real-time (SMRT) sequencing
  • Nanopore sequencing
15
Q

describe single-molecule real-time (SMRT) sequencing

A
  • carried out in an array of tiny wells, each containing a single DNA polymerase anchored to the bottom
  • it uses deoxyribonucleoside triphosphates in which the fluorescent dye is attached to the terminal phosphate
  • as the polymerase copies the template DNA, the binding of a fluorescent nucleotide generates a colour signal that identifies it
  • the signal disappears when the terminal phosphate is released during its incorporation
16
Q

true/false it is possible to use circular DNA templates that are sequenced repeatedly on both strands with single-molecule real-time (SMRT) sequencing

A
  • true
  • this greatly improves the accuracy of the resulting sequence
17
Q

describe nanopore sequencing

A
  • involves the transport of a single-stranded DNA molecule through a tiny protein pore in a membrane
  • voltage is applied across the membrane, resulting in current through the pore
  • the passage of the nucleotides through the pore results in tiny shifts in electrical current across the membrane
  • measurement of these tiny current changes reveals the identity of each nucleotide
18
Q

which form of sequencing does not require DNA synthesis

A

nanopore sequencing

19
Q

using which sequencing methods can very long DNAs be sequenced

A
  • SMRT
  • nanopore
20
Q

what are unique advantages to nanopore sequencing, that do not exist with SMRT

A
  • can identify modified nucleotides
  • the effect of modified nucleotides on the current differs slightly from that of unmodified nucleotides
  • can be performed with portable, handheld instruments that can be taken into the field
21
Q

in SMRT sequencing, how are circular DNA templates used

A
  • by attaching hairpin adaptor DNAs to each end of the DNA to be sequenced
  • a primer is used that matches the adaptor
  • a strand-displacing polymerase separates the double-stranded DNA as it moves along the template, allowing it to continue around the entire circular molecule many times
22
Q

in SMRT sequencing, what allows the experimenter to eliminate sequence errors that arise from random mistakes made by the polymerase

A

the fact that both strands of the DNA are sequenced repeatedly
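Note: why repeated passes remove random errors can be shown with a simple majority vote. This is a hedged sketch, assuming the passes are already aligned, equal in length, converted to the same strand orientation, and contain substitution errors only; the reads are made up and `consensus` is a hypothetical helper, not part of any sequencing software.

```python
from collections import Counter


def consensus(passes):
    """Majority vote at each position across repeated reads of one molecule."""
    return "".join(
        Counter(column).most_common(1)[0][0]  # most frequent base in this column
        for column in zip(*passes)
    )


# Five made-up passes over the same molecule; three contain one random error.
passes = [
    "ATGGCGTACG",
    "ATGCCGTACG",  # error at position 3
    "ATGGCGTTCG",  # error at position 7
    "ATGGCGTACG",
    "TTGGCGTACG",  # error at position 0
]
corrected = consensus(passes)
print(corrected)
```

Because random mistakes rarely hit the same position in most passes, the correct base wins the vote at every position.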

23
Q

true/false sequencing genomes has gotten more expensive with these new methods

A
  • false
  • it has gotten cheaper
24
Q

how is RNA sequencing done as of right now

A
  • by converting the RNA to cDNA (using reverse transcriptase)
  • and then sequencing the cDNA with one of the DNA sequencing methods described above
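Note: the RNA-to-cDNA conversion is base pairing applied to the RNA template. A minimal sketch, assuming a single first-strand cDNA written 5'→3' and ignoring primers and second-strand synthesis; `reverse_transcribe` is a hypothetical name.

```python
def reverse_transcribe(rna):
    """Return the first-strand cDNA (5'->3') complementary to an RNA (5'->3')."""
    pairing = {"A": "T", "U": "A", "G": "C", "C": "G"}
    # The cDNA is antiparallel to the RNA, so read the RNA backwards.
    return "".join(pairing[base] for base in reversed(rna))


cdna = reverse_transcribe("AUGGCC")  # made-up RNA fragment
print(cdna)
```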
25
Q

what is a valuable tool for annotating genomes

A

RNA-seq

26
Q

true/false long strings of nucleotides, at first glance, reveal nothing about how this genetic information directs the development of a living organism

A

true

27
Q

what does the process of genome annotation attempt to do

A
  • attempts to mark out all the genes (both protein-coding and noncoding) in a genome and ascribe a role to each
  • also tries to understand the more subtle types of genome information
28
Q

what are examples of the more subtle types of genome information

A
  • the cis-regulatory sequences that specify the time and place that a given gene is expressed
  • whether its mRNA undergoes alternative splicing to produce different protein isoforms
29
Q

what is the first step in trying to make sense of a genome sequence

A

to translate in silico the entire genome into protein

30
Q

how many different reading frames are there for any piece of double-stranded DNA

A

6

31
Q

how many different reading frames are there for any piece of single-stranded DNA

A

3
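Note: the counts on the last two cards (3 frames per strand, 6 for double-stranded DNA) follow from the three possible starting offsets on each strand. A small sketch with a made-up sequence; the frame labels such as "+1" and "-1" are just a convention chosen here.

```python
def reverse_complement(dna):
    """Return the opposite strand, written 5'->3'."""
    pairing = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in reversed(dna))


def codons(strand, offset):
    """Split one strand into complete codons starting at `offset` (0, 1, or 2)."""
    return [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]


def reading_frames(dna):
    """Return all six reading frames of a double-stranded DNA."""
    frames = {}
    for label, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):  # three offsets per strand
            frames[f"{label}{offset + 1}"] = codons(strand, offset)
    return frames


frames = reading_frames("ATGGCCTAA")  # made-up sequence
for name, frame in frames.items():
    print(name, frame)
```

Translating each of these six codon lists with the genetic code is what "translating the genome in silico" amounts to.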

32
Q

what are open reading frames (ORFs)

A

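stretches of sequence that, read in frame, run much longer without a stop codon than expected by chance; random sequence hits a stop codon about once every 20 amino acids, so these long stop-free stretches mark likely protein-coding regions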
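Note: the "about 20 amino acids" figure follows from the standard genetic code, where 3 of the 64 codons are stops, so random sequence meets a stop roughly every 64/3 ≈ 21 codons. The sketch below scans one reading frame for stop-free runs. It is a simplification: it does not require a start codon, and the `min_codons` threshold is an arbitrary value chosen for the toy example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

# Average spacing between stop codons in random sequence, in codons.
expected_gap = 64 / len(STOP_CODONS)


def stop_free_runs(strand, offset=0):
    """Return (start_index, codon_count) for each stop-free run in one frame."""
    runs = []
    run_start, count = offset, 0
    for i in range(offset, len(strand) - 2, 3):
        if strand[i:i + 3] in STOP_CODONS:
            if count:
                runs.append((run_start, count))
            run_start, count = i + 3, 0  # next run begins after the stop
        else:
            count += 1
    if count:
        runs.append((run_start, count))
    return runs


# Made-up sequence: 6 codons, a stop, 2 codons, a stop.
strand = "ATG" + "GCC" * 5 + "TAA" + "GCC" * 2 + "TGA"
runs = stop_free_runs(strand)
min_codons = 5  # hypothetical cutoff; real cutoffs are far longer
long_runs = [run for run in runs if run[1] >= min_codons]
print(runs, long_runs)
```

On a real genome the same scan would be run over all six reading frames, keeping only runs far longer than the chance expectation.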

33
Q

open reading frames (ORFs) often signify what

A

bona fide protein coding genes

34
Q

how is the determination of an ORF typically double-checked

A
  • by comparing the ORF amino acid sequence to the many databases of documented proteins from other species
  • if a match is found (even an imperfect one), then it is very likely that the ORF codes for a functional protein
35
Q

when does the “double-checking” strategy work best

A

for compact genomes (where introns are rare and ORFs often extend for many hundreds of amino acids)

36
Q

when does the “double-checking” strategy not work too well

A
  • it is less effective for genomes that are not compact, where genes are broken up by introns
  • the average exon size is 150–200 nucleotide pairs for many animals and plants, and additional information is usually required to unambiguously locate all the exons of a gene
37
Q

what do we do when the genome is not compact, and we want to annotate it

A
  • can search the genome for splicing signals and other features that help identify exons
  • the most powerful method, though, is to sequence all the RNA produced
38
Q

what can RNA-seq information be used to accurately locate

A

all introns and exons of even complex genes

39
Q

true/false RNA-seq identifies noncoding RNAs produced by a genome

A

true

40
Q

what is the main reason we know only the approximate number of genes in the human genome

A

The existence of the many noncoding RNAs and our relative ignorance of their function

41
Q

we know from __________ that many organisms share the same basic set of proteins

A

comparative genomics

42
Q

true/false the functions of few identified proteins remain unknown

A
  • False
  • a very large number are unknown
43
Q

approximately how many proteins encoded by a sequenced genome do not clearly resemble any protein that has been studied biochemically

A

approx. one third

44
Q

what is a key limitation regarding the emerging field of genomics

A
  • comparative analysis of genomes reveals a great deal of information about the relationships between genes and organisms
  • BUT it often does not provide immediate information about how these genes function or what roles they have in the physiology of an organism
45
Q
A