Lecture 1 - Sequencing Technologies Flashcards

1
Q

What are nucleotides? What do they have on the 5’, 3’ and 1’ carbon? What are purines? What are pyrimidines?

A

-individual subunits of DNA
-5’ carbon has a phosphate group
-3’ carbon has a hydroxyl (OH) group
-1’ carbon has a nitrogenous base
-purines - adenine and guanine
-pyrimidines - cytosine and thymine

2
Q

What enzyme carries out DNA synthesis?

A

DNA polymerase

3
Q

What enzyme carries out ligation, in which you take two pieces of DNA and join them together?

A

DNA ligase

4
Q

What are protein nanopores?

A

a single-molecule technique in which a polymer is threaded electrophoretically through a nanopore (a nanometer-sized protein channel) while a sensor measures changes in ionic current as the molecule moves through the pore, from which the sequence is inferred

5
Q

How are nucleotides polymerized?

A

via phosphodiester bonds linking a 3’ OH to a 5’ phosphate group

6
Q

What is DNA polymerase?

A

an enzyme that, given a template strand of DNA and a supply of nucleotides, makes the reverse-complement strand, so there is a template strand and a growing strand of DNA; synthesis starts from a primer sequence, which is made from RNA and later replaced with DNA by DNA polymerase
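The reverse-complement idea can be sketched in a few lines of Python. This is only an illustration of the base-pairing rule (A with T, C with G), not a model of the enzyme; the function name and the example sequence are made up for this card.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(template: str) -> str:
    """Return the strand that pairs with `template`, written 5'->3'."""
    # Reverse first (the two strands run antiparallel), then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(template.upper()))


print(reverse_complement("ATGCCG"))  # CGGCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check that the pairing table is symmetric.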

7
Q

How does DNA polymerase work?

A

-forms a new phosphodiester bond between the growing strand of a dsDNA fragment and a free nucleotide during DNA replication
-the added nucleotide is complementary to the template base adjacent to the last base pair

8
Q

What is a primer?

A

a short piece of DNA (11-17 bases) that is complementary to part of a longer single-stranded DNA molecule; once annealed, it forms a short double-stranded region that can be extended by polymerization

9
Q

What does DNA ligase do?

A

links two DNA fragments together

10
Q

How do you melt DNA?

A

heat double-stranded DNA in solution: the two strands are held together by hydrogen bonds (forces between the strands), not covalent bonds, so when the solution is heated the two strands separate

11
Q

How do you anneal DNA?

A

lower the temperature: complementary single-stranded DNA anneals naturally because dsDNA is more energetically stable at the lower temperature

12
Q

What is PCR or polymerase chain reaction?

A

-exponentially creates copies of DNA using DNA polymerase and primers and cycles of heating and cooling
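To make "exponentially" concrete, here is a small sketch of the expected copy number after a number of heating and cooling cycles. The `efficiency` parameter is an assumption added for illustration (1.0 means every template is copied in every cycle); the card itself gives no efficiency figure.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of copies after `cycles` rounds of PCR.

    Each cycle multiplies the copy number by (1 + efficiency), so a perfect
    reaction (efficiency = 1.0) doubles the DNA every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


print(pcr_copies(1, 30))  # 1073741824.0, about a billion copies from one molecule
```

Lowering `efficiency` slightly (say to 0.9) shows how quickly imperfect cycles reduce the final yield.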

13
Q

What is a nucleotide analog?

A

shares the deoxyribose and base structure of natural nucleotides
-the analog carries a modification that adds another functional group; different functional groups are key to multiple sequencing technologies

14
Q

What has changed to improve sequencing?

A

-human genome can be sequenced in one day for $100-200
-high throughput sequencing took over - can sequence multiple DNA fragments in parallel enabling hundreds of DNA molecules to be sequenced at the same time

15
Q

What is a high throughput sensing device we all have now?

A

the phone

16
Q

What has enabled sequencing of hundreds of millions of fragments of DNA at a time?

A

-digital imaging, with the exception of certain sequencing technologies
-digital devices measure sequencing reactions that have been shrunk down to pico-scale; this needs (1) a reaction the digital sensor is sensitive to and (2) reactions small enough for the camera to capture

17
Q

What are the three major steps in high-throughput shotgun sequencing?

A
  1. Break DNA from many copies of a genome into many small fragments
  2. select millions of fragments randomly
  3. read the sequences of fragments in parallel
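The three steps can be mimicked on a toy string. This is a minimal sketch, assuming fragments of one fixed length drawn at uniformly random positions; the toy genome, the fragment length and the seed are all made up, and real fragmentation produces a spread of lengths.

```python
import random


def shotgun_fragments(genome: str, n_fragments: int, fragment_len: int, seed: int = 0) -> list[str]:
    """Draw `n_fragments` random fixed-length pieces of `genome`."""
    last_start = len(genome) - fragment_len
    if last_start < 0:
        raise ValueError("fragment_len is longer than the genome")
    rng = random.Random(seed)  # seeded so the example is repeatable
    # Steps 1 and 2: breaking many genome copies and picking fragments at random
    # is modelled as choosing random start positions.
    starts = [rng.randint(0, last_start) for _ in range(n_fragments)]
    # Step 3: every fragment is "read" independently, which is what runs in parallel.
    return [genome[start:start + fragment_len] for start in starts]


toy_genome = "ACGTTGCAAGGCTTACCGATAGCTAGGATCC"
for fragment in shotgun_fragments(toy_genome, n_fragments=5, fragment_len=8):
    print(fragment)
```

Because start positions are random, some regions are sampled several times and others not at all, which is why many more fragments than strictly necessary are sequenced.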
18
Q

What are some other terms for high-throughput sequencing?

A

also called massively parallel sequencing, second-gen sequencing, third-gen sequencing, or next-gen sequencing

19
Q

What are the three properties of the analog (the reversible dye terminator) used in Illumina sequencing?

A

-a dye is attached to the nucleotide (in Illumina chemistry, to the base via a cleavable linker) and can be detected by light
-the terminator prevents incorporation of an additional analog, so only one analog is added at a time
-the dye and terminator can be cleaved off, restoring a native nucleotide and allowing another polymerization reaction to happen

20
Q

What happens in Illumina sequencing?

A

-use DNA polymerase to add a reaction-terminating analog, then use digital imaging to record which base was added
-run a reaction to remove the terminator
-repeat
-add all four bases and take a picture for each one; whichever one fluoresces is the base
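The add, image, cleave, repeat loop can be written out as a toy simulation. It is a sketch under simplifying assumptions: every cycle works perfectly, and the dye colours are hypothetical placeholders rather than the real Illumina dyes.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Hypothetical dye colours, one per base (not the real instrument channels).
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def sequence_by_synthesis(template_3_to_5: str, cycles: int) -> str:
    """Simulate error-free cycles; the template is given in the 3'->5' direction."""
    read = []
    for position in range(min(cycles, len(template_3_to_5))):
        # 1. Polymerase adds the single terminator analog that pairs with the template.
        incorporated = COMPLEMENT[template_3_to_5[position]]
        # 2. Imaging: the camera only sees which dye lights up.
        observed_dye = DYE_FOR_BASE[incorporated]
        # 3. Base calling: translate the dye back into a base.
        read.append(BASE_FOR_DYE[observed_dye])
        # 4. Dye and terminator are cleaved off, so the next cycle can extend again.
    return "".join(read)


print(sequence_by_synthesis("TACGGA", cycles=4))  # ATGC
```

The read length is set by the number of cycles, not by the fragment length, which is why a longer template still yields only as many bases as cycles were run.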

21
Q

What is the reversible dye terminator analog used in Illumina sequencing?

A

-a chimeric molecule: one side is a normal nucleotide, and the other side carries an extra molecular structure that does some work for sequencing the DNA
-often a dye is added onto the end; there are 4 different dyes, so each nucleotide has a specific dye
-DNA polymerase adds the dye-labelled molecule, then the camera zooms in closely and takes a picture of that dye
-the dye comes with a terminator that prevents further polymerization
-it is known as the reversible dye terminator because you can use chemistry to remove the terminator once you have taken a picture of the dye

22
Q

How do the fluorophores work?

A

by absorbing light at one frequency and emitting it at another frequency

23
Q

What are some technological challenges that had to be solved to enable Illumina sequencing?

A
  1. need to be able to take a picture of the molecule multiple times - if the molecules were in solution they would be diffusing randomly
    solution: chemically link DNA to a glass slide and run the sequencing reaction on the slide
  2. one molecule fluoresces too little for digital imaging
    solution: amplify molecules using bridge PCR - 1000 copies of the DNA sequence mounted at the same position on the glass slide magnify the light (the signal-to-noise ratio), so you can be sure of the light you saw or did not see
24
Q

What is the first step in bridge PCR?

A

ligate the Y adapter which is a pair of sequences that are partially complementary but partially not
-the index 1 and index 2 are used to identify reads pooled from different experiments (barcodes)

25
Q

What is the first step in the amplification of bridge PCR?

A

the DNA fragment with Y adapters is added, melted on the chip, and re-annealed to the turf
-float DNA over a slide covered with adapters that are complementary to the Y adapter you just ligated onto the sequence; the DNA is only semi-fixed, meaning that if you melt the DNA it will go away, because the strand is not covalently bound to the slide

26
Q

What is the second step in the amplification of bridge PCR?

A

-the reverse complement is replicated, making a new copy of the DNA that is polymerized onto the chip, which means the new copy is bound; the original fragments are not, so if you heat them up they go away

27
Q

What is the third step in the amplification of bridge PCR?

A

-the slide is heated to melt the double stranded DNA so only the covalently bound strand remains and the original strand is gone

28
Q

What is the fourth step in bridge PCR?

A

-the DNA is cooled so that the strand chemically ligated to the chip bends over and its free adapter anneals to another adapter on the chip, forming a bridge

29
Q

What is the fifth step in bridge PCR?

A

-the single-stranded DNA is replicated starting from the red primer once polymerase is added

30
Q

What is the final step in bridge PCR?

A

-heat again to un-anneal the noncovalently bound bridge; now there are two strands on the slide, and the next rounds yield four, then eight, and so on
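Since each round doubles the number of bound strands, the number of rounds needed to build a cluster follows directly. The sketch below assumes perfect doubling from a single seed strand, which is an idealisation; the target of 1000 copies is the figure quoted on the earlier challenges card.

```python
def bridge_cycles_needed(target_copies: int) -> int:
    """Count the doubling rounds needed to reach at least `target_copies` strands."""
    copies, cycles = 1, 0  # start from the single strand bound to the slide
    while copies < target_copies:
        copies *= 2  # one bridge, extend, melt round doubles the bound strands
        cycles += 1
    return cycles


print(bridge_cycles_needed(1000))  # 10
```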

31
Q

What is the Illumina sequencing approach for one read?

A

-a cluster of DNA molecules that are all copies of the one fragment at one position on the chip; adapter sequences can be added back in solution, and polymerization with fluorophores sequences the DNA

32
Q

How many bases are run on the longest read of Illumina sequencing?

A

300 bases (150 bp x 2), or 0.3 Tb
-higher models can run more fragments at the same time
-both ends of the fragments may be sequenced

33
Q

What are some probability considerations you need to take into account for Illumina sequencing?

A

-each base is the result of extension, wash, imaging, cleavage, and terminal modification steps
-reactions that are not 100% efficient on all copies of the template create molecules that lag behind
-eventually all molecules are out of sync
-given the probability of not extending at a particular base, the plot shows the fraction of molecules not in sync after a certain number of iterations: less than 50% of the molecules are out of sync, and they give different signals as the length of one read increases
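The out-of-sync argument can be put in numbers with a deliberately simple model: assume every copy in a cluster independently fails to extend with the same probability on each cycle and never catches up. The 0.5% failure probability below is an illustrative assumption, not a measured Illumina figure.

```python
def fraction_in_sync(p_fail: float, cycles: int) -> float:
    """Fraction of template copies that have extended on every cycle so far."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability")
    # A copy stays in sync only if it succeeds on all `cycles` independent steps.
    return (1.0 - p_fail) ** cycles


for n_cycles in (50, 100, 150):
    in_sync = fraction_in_sync(0.005, n_cycles)
    print(f"after {n_cycles} cycles: {in_sync:.1%} in sync, {1 - in_sync:.1%} lagging")
```

Even a small per-cycle failure rate compounds, so the in-sync fraction shrinks steadily as the read gets longer.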

34
Q

After what base in Illumina sequencing does fidelity drop off due to the error rate?

A

-after the 100th base added, since the molecules get out of sync

35
Q

What are the quality values for DNA sequencing?

A

-a quality score is assigned to each base, representing the predicted probability that the base is wrong
-the higher the Phred score, the greater the accuracy; Phred values are rounded to integers and capped at 60
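The link between a Phred score and the error probability is the standard definition Q = -10 log10(P). The sketch below converts in both directions; the function names are made up, and the default cap of 60 is taken from the card.

```python
import math


def phred_from_error(p_error: float, max_q: int = 60) -> int:
    """Convert an error probability into an integer Phred score, capped at `max_q`."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be in (0, 1]")
    return min(round(-10 * math.log10(p_error)), max_q)


def error_from_phred(q: int) -> float:
    """Convert a Phred score back into the predicted error probability."""
    return 10 ** (-q / 10)


print(phred_from_error(0.001))  # 30
print(error_from_phred(20))
```

So Q20 corresponds to a 1 in 100 chance that the base is wrong, and Q30 to 1 in 1000.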

36
Q

What are the sequence file formats?

A

FASTA format - most widely used format - has header and sequence
FASTQ format - has the header, sequence, and phred score values
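A FASTQ record is four lines: a header starting with `@`, the sequence, a separator starting with `+`, and one quality character per base. The sketch below assumes the common Phred+33 ASCII encoding, which the cards do not specify; the read name and sequence are made up.

```python
def parse_fastq_record(lines: list[str]) -> dict:
    """Parse one four-line FASTQ record into name, sequence and Phred scores."""
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines[:4])
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    return {
        "name": header[1:],
        "sequence": sequence,
        # Phred+33 (assumed): the score is the character's ASCII code minus 33.
        "phred": [ord(char) - 33 for char in quality],
    }


record = parse_fastq_record(["@read1", "ACGT", "+", "II5!"])
print(record["sequence"], record["phred"])  # ACGT [40, 40, 20, 0]
```

A FASTA record is the same idea without the last two lines, with `>` instead of `@` in the header.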

37
Q

What are the errors you get with Illumina sequencing?

A

substitution errors, not deletions or insertions: it goes only 150 bp, adding one base per cycle, so it will not miss a base or add an additional one

38
Q

What is the content of a sequencing read?

A

-a read starts from the 5’ end of a molecule; after amplification, the dsDNA can also be read from the 5’ end of its other strand to get the 3’ end of the original molecule
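In sequence terms, the first read is the start of the fragment and the second read is the start of its reverse complement, which covers the far end of the original molecule. A minimal sketch, assuming error-free reads of a fixed length; the fragment and read length are made up.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def paired_end_reads(fragment: str, read_len: int) -> tuple[str, str]:
    """Return (read 1, read 2) for a fragment given 5'->3'."""
    read1 = fragment[:read_len]                      # from the 5' end of this strand
    read2 = reverse_complement(fragment)[:read_len]  # from the 5' end of the other strand
    return read1, read2


read1, read2 = paired_end_reads("AACCGGTTAGGA", read_len=4)
print(read1, read2)  # AACC TCCT
```

Reverse-complementing read 2 gives back `AGGA`, the last four bases of the original fragment.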

39
Q

What is the solution to limited read lengths in sequencing?

A

single molecule sequencing

40
Q

What is the primary benefit of single molecule sequencing?

A

there is no collection of molecules whose signals can get out of phase

41
Q

What is the old drawback of single molecule sequencing and how has this now been fixed?

A

-the signal is from one molecule, not 1000, so accuracy is lower (around 80-95% or 70-80%), and there are insertion and deletion errors, not just substitutions, which complicates analysis compared to Illumina, which just has substitutions
-this is now fixed by reading the same molecule more than once via circular DNA

42
Q

What is PacBio sequencing?

A

sequencing using single molecule fluorescence

43
Q

How does PacBio sequencing work?

A

-also a light-based fluorophore sequencer: a thin layer of metal has tiny holes in it, each on the scale of the polymerase and smaller than the wavelength of the light, so light only penetrates the very bottom of the holes; the laser light shone onto an analog there is emitted back into a camera

44
Q

What are the fluorophores of the nucleotide analogs attached to in PacBio sequencing?

A

the phosphate group that is cleaved off during incorporation

45
Q

Where is the DNA polymerase tethered in PacBio sequencing?

A

-to the bottom of a nanoscale hole in a thin metal sheet

46
Q

Where is the sheet immersed in PacBio sequencing?

A

a solution of analogs, all of which emit light

47
Q

Which analogs are illuminated by the laser in PacBio sequencing?

A

the ones at the very bottom of the cell

48
Q

What takes a short amount of time to accomplish the polymerization reaction in PacBio?

A

the polymerase

49
Q

How might you miss a pulse in PacBio sequencing?

A

two identical bases next to each other that are incorporated quickly can give one larger pulse of light instead of two distinct pulses - you can also get extra insertions and deletions

50
Q

Why is single strand sequencing bad?

A

the error rate is greater for repeated sequences and homopolymers (runs of the same base), which the human genome has a lot of

51
Q

What stops the polymerase from working in PacBio sequencing?

A

-you shine a pulse of light to start the reaction and film in real time; in theory the polymerase could go on forever, but eventually enough light has hit the polymerase that it breaks down and stops working
-the light also damages the nucleotides, which can negatively affect the polymerase as well

52
Q

What is HiFi workflow?

A

a protocol in which a fragment of DNA is turned into a circle by adding hairpin ends, so the polymerase can keep reading around the circle and produce multiple reads of the same segment; the errors are then cross-correlated and corrected using the copies you have, and the probability of seeing the same error in every copy of the segment drops off geometrically, which gives greater accuracy
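The error-correction idea can be sketched as a per-position majority vote over repeated passes of the same molecule. This is a simplified stand-in, not PacBio's actual consensus algorithm: it assumes the passes are already aligned, have equal length, and contain only substitution errors, whereas real passes also contain insertions and deletions.

```python
from collections import Counter


def consensus(passes: list[str]) -> str:
    """Majority vote at each position across aligned passes of one molecule."""
    if len({len(p) for p in passes}) != 1:
        raise ValueError("this sketch needs at least one pass, all of equal length")
    # zip(*passes) yields one column of bases per position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


def prob_same_error_in_all(p_error: float, n_passes: int) -> float:
    """Chance that an independent error hits the same position in every pass."""
    return p_error ** n_passes


toy_passes = ["ACGTAC", "ACGAAC", "ACGTAC", "TCGTAC", "ACGTAC"]
print(consensus(toy_passes))  # ACGTAC
```

With independent errors, `prob_same_error_in_all` shrinks geometrically as passes are added, which is the drop-off the card describes.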

53
Q

What is the current pacbio read length?

A

15-18 kb per read (the human genome, at roughly 3 billion bases, is far longer)

54
Q

What is single molecule sequencing with nanopores?

A

-a membrane contains many nanopores, which are proteins that form a hole and allow single-stranded DNA to pass through; a current-measuring device centered on the pore measures the changes in the flow of ions through the pore as the nucleotides pass through it

55
Q

Why is ONT tech better than PacBio HiFi?

A

reads are an order of magnitude longer than PacBio's, so you can get much longer genome assemblies than could be done before
-the individual accuracy of Oxford Nanopore is better but the consensus accuracy of PacBio is better
-analysis is easier with PacBio and ONT; Illumina still runs more sequences of DNA
