Genome-wide DNA and RNA Analysis Flashcards

1
Q

What is a DNA microarray?

A

Known DNA fragments corresponding to specific genes are laid out in microscopic quantities on a solid surface at defined positions at a very high density

2
Q

What is a Spotted Microarray?

A

cDNA fragments or synthetic oligonucleotides are spotted or “printed” onto a solid medium, often glass slides
Printed DNAs are 50-200 µm apart, with up to 20,000 per slide

3
Q

What is the GeneChip Microarray?

A

Oligonucleotides are synthesized directly on the surface of a solid support
More than 400,000 oligonucleotide sequences can be placed in a 1.28 cm by 1.28 cm area (features about 20 µm apart, centre to centre)

4
Q

How does the spotted microarray work?

A

1) Isolate RNAs from control and experimental cells or tissues
2) Synthesize cDNAs labeled with fluorescent dyes by RT-PCR (control: green fluorescence; experimental: red fluorescence)
3) Mix equal amounts of the cDNAs and hybridize to the microarray
4) Record the microarray result with a laser scanner at dye-specific wavelengths and analyze the data with appropriate computer software
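
The analysis in step 4 usually comes down to comparing the red and green intensity of each spot. Below is a minimal sketch of that comparison, assuming a simple log2 ratio with a pseudocount and a two-fold cutoff. The gene names, intensities, pseudocount, and threshold are all hypothetical illustration values, not part of the original card.

```python
import math


def log2_ratio(red, green, pseudocount=1.0):
    """log2(experimental / control) for one spot.

    The pseudocount (an assumption) avoids division by zero on empty spots.
    """
    return math.log2((red + pseudocount) / (green + pseudocount))


def classify(red, green, threshold=1.0):
    """Label a spot using a two-fold (log2 = 1) cutoff, chosen for illustration."""
    ratio = log2_ratio(red, green)
    if ratio >= threshold:
        return "up in experimental (red)"
    if ratio <= -threshold:
        return "down in experimental (green)"
    return "unchanged (yellow)"


# Hypothetical scanner intensities: gene -> (red, green)
spots = {"geneA": (4000, 500), "geneB": (300, 2400), "geneC": (1000, 1100)}

for name, (red, green) in spots.items():
    print(name, round(log2_ratio(red, green), 2), classify(red, green))
```

A spot with similar red and green signal appears yellow on the merged image, which is why the "unchanged" class is labelled that way.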

5
Q

How does the GeneChip work?

A

Synthesize oligonucleotide probes on glass.
1) The surface is coated with a reactive group that is blocked by a photosensitive agent
2) This blocking agent can be removed with light
3) A mask can also cover selected spots, preventing light from removing their blocking agent
Let’s look at an example:
In the 1st cycle, 4 out of 6 spots are masked, so the light can only reach the 2 unmasked spots, and remove the blocking agent. Then nucleotides blocked by a photo-sensitive agent can be chemically coupled to the unblocked spots.
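
The masking cycles can be pictured as a loop in which each cycle couples one nucleotide only to the spots left unmasked. This is a rough sketch of that idea, not the manufacturer's actual process. The six-spot layout follows the example above, while the specific mask pattern and bases per cycle are made up for illustration.

```python
def synthesize(n_spots, cycles):
    """Grow one probe per spot.

    cycles is a list of (base, unmasked_spots) pairs. In each cycle light
    deprotects only the unmasked spots, so only they receive the base.
    """
    probes = [""] * n_spots
    for base, unmasked in cycles:
        for spot in unmasked:
            probes[spot] += base  # coupling happens only where light reached
    return probes


# Hypothetical mask pattern: in the first cycle 4 of 6 spots are masked,
# so only spots 0 and 1 are deprotected and receive an A.
cycles = [
    ("A", {0, 1}),
    ("C", {2, 3}),
    ("G", {4, 5}),
    ("T", {0, 2, 4}),
]

probes = synthesize(6, cycles)
print(probes)
```

Because every spot sees its own sequence of masks, different oligonucleotides are built side by side on the same surface.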

6
Q

When would a GeneChip be used?

A

To genotype for SNPs

7
Q

What are the main applications of the DNA microarray Technology?

A

1) Transcription profiling of an entire genome

2) Genotyping

8
Q

What is ChIP-chip?

A

A genome-wide search for DNA-protein interactions (here in yeast) by chromatin immunoprecipitation (ChIP) followed by GeneChip array analysis (chip)

9
Q

How does ChIP-chip work?

A

1) Cross-link proteins to DNA in wild-type and mutant cells
2) Extract and shear the cross-linked DNA
3) Immunoprecipitate with an antibody specific to the protein
4) Reverse the cross-links, amplify (PCR), and label the DNA
5) Hybridize to microarray containing all intergenic regions

10
Q

What is Next-Generation Sequencing?

A

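First-generation sequencing method: automated dideoxynucleotide (Sanger) sequencing
Next-generation sequencing (NGS): massively parallel methods in which many sequencing reactions are run simultaneously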

11
Q

How do Next Generation Sequencing methods differ?

A

They can differ by template preparation, sequencing and imaging, and data analysis (genome alignment and assembly) methods

12
Q

What are the types of Next Generation Sequencing?

A

1) Illumina
2) Roche 454
3) Ion Torrent
4) SOLiD

13
Q

What are the types of Third Generation Sequencing?

A

1) PacBio
2) Nanopore
3) SLR (synthetic long reads)

14
Q

Explain how Illumina prepares its library, template, and how the sequencing works?

A

Library Prep: DNA or RNA
Template Prep: Bridge amplification
Sequencing: By synthesis, fluorescent detection

15
Q

Explain how Roche prepares its library, template, and how the sequencing works?

A

Library Prep: DNA or RNA
Template Prep: Emulsion PCR
Sequencing: By synthesis (pyrosequencing), light detection via luciferase

16
Q

Explain how Ion Torrent prepares its library, template, and how the sequencing works?

A

Library Prep: DNA or RNA
Template Prep: Emulsion PCR
Sequencing: By synthesis, ΔpH detection

17
Q

Explain how SOLiD prepares its library, template, and how the sequencing works?

A

Library Prep: DNA or RNA
Template Prep: Emulsion PCR
Sequencing: By ligation, fluorescent detection

18
Q

What is library preparation?

A

DNA/RNA samples are randomly fragmented and platform specific adaptors are added to the flanking ends to produce a library

19
Q

How are DNA/RNA libraries prepared?

A

1) Ligation of genomic DNA to linkers at both ends and circularization of the DNA via the linkers
2) Fragmentation of the DNA by restriction enzymes
3) Enrichment of fragments containing the linker
4) Preparation of a sequencing library by addition of adaptors

20
Q

What is template preparation?

A

Amplify the template DNA
Original template DNA is not present in high enough quantities, so amplify the template multiple times
Amplify the same region, so there are identical sequences

21
Q

Why is template preparation required?

A

Most imaging systems cannot detect single fluorescent events, so amplified templates are required

22
Q

What is an issue with template preparation?

A

Template amplification may introduce bias:

  • Amplification bias against AT-rich and GC-rich regions (can be corrected by adding PCR additives)
  • Overrepresentation of smaller fragments (may be ameliorated by fewer PCR cycles)
23
Q

How does Solid phase amplification/bridge amplification occur?

A

This is drawn to help explain

24
Q

How does Emulsion PCR occur?

A

1) DNA fragmentation and adaptor ligation
2) DNA fragments are added to an oil mixture containing beads
3) Emulsion PCR results in multiple copies of the fragment
4) Beads are deposited on plate wells ready for sequencing
This is also drawn to help explain

25
Q

What is Massively Parallel Sequencing?

A

In Sanger sequencing, DNA synthesis and detection are two separate steps, which makes it slow
NGS couples DNA synthesis with detection and runs many sequencing reactions simultaneously
For most NGS platforms, desynchronization of reads across the sequencing and detection cycles is the main cause of sequencing errors and shorter reads

26
Q

What is Cyclic Reversible Termination?

A

It is a sequencing-by-synthesis method in which single DNA molecule templates are clonally amplified

27
Q

How does Cyclic Reversible Termination occur?

A

1) All four nucleotides, each labeled with a different dye and carrying a terminating group, are added; one is incorporated
2) Wash, four-colour imaging
3) Cleave dye and terminating groups, wash
4) Next cycle begins
One fluorescently labelled nucleotide is added to each strand per cycle
The signal always has to be picked up from the same spot
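
The cycle above can be sketched as a loop that extends the growing strand by exactly one base per cycle and records the colour imaged at that step. This is a toy model of a single spot under ideal conditions. The dye-to-colour assignment is arbitrary, and the template is assumed to be written in the order in which the polymerase reads it.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Arbitrary, hypothetical dye colours: one distinct colour per nucleotide.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def crt_read(template, n_cycles):
    """Simulate cyclic reversible termination on one spot.

    Each cycle: the terminator allows exactly one complementary nucleotide
    to be incorporated, its colour is imaged, then dye and terminating
    group are cleaved so the next cycle can start.
    """
    read, colours = [], []
    for position in range(min(n_cycles, len(template))):
        base = COMPLEMENT[template[position]]  # the one base incorporated
        colours.append(DYE[base])              # four-colour imaging
        read.append(base)                      # cleavage frees the 3' end
    return "".join(read), colours


read, colours = crt_read("TACG", 3)
print(read, colours)
```

The read length is set by the number of cycles, not by the template, which is why these platforms produce reads of a fixed maximum length.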

28
Q

How does Illumina use Sequencing by Synthesis?

A

1) Add dye-labeled nucleotides and then wash
2) Scan and detect nucleotide specific fluorescence as it binds to template DNA
3) Remove 3’ blocking group: Reversible termination
4) Cleave fluorescent group (dye)
5) Rinse and Repeat

29
Q

What are the advantages of Illumina?

A

1) High throughput relative to cost

2) Suitable for a wide range of applications, most notably whole-genome sequencing

30
Q

What are the disadvantages of Illumina?

A

1) Substitution errors are the characteristic error type

2) Dephasing/desynchronization causes sequence quality deterioration towards the end of the read

31
Q

What is Roche 454?

A

Uses pyrosequencing
Beads are loaded into wells, and light is emitted from each well in which a nucleotide is incorporated
However, it cannot reliably resolve nucleotide repeats (homopolymers)

32
Q

How does Pyrosequencing work?

A

Makes use of the pyrophosphate released when a nucleotide is incorporated
Also uses luciferase, which emits light
If a T is added and a signal is given, a T was incorporated, which indicates that the nucleotide on the template sequence was an A
When it works, it works really well
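
One way to picture this is as a flowgram: nucleotides are flowed over the template one type at a time, and each flow gives a light signal only if that nucleotide is incorporated. The sketch below is an idealized model in which the signal equals the number of bases incorporated. The flow order "TACG" and the template are assumptions for illustration. In real data the signal does not scale cleanly with repeat length, which is the homopolymer problem.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def flowgram(template, flow_order="TACG", n_flows=8):
    """Return (flowed base, signal) pairs for an idealized pyrosequencing run.

    The template is given in the order the polymerase reads it. A flow
    produces signal only when the flowed base pairs with the next template
    base; a run of identical template bases is filled in within one flow.
    """
    position = 0
    flows = []
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        incorporated = 0
        while position < len(template) and COMPLEMENT[template[position]] == base:
            incorporated += 1
            position += 1
        flows.append((base, incorporated))
    return flows


def read_from_flowgram(flows):
    """Rebuild the synthesized strand from ideal integer signals."""
    return "".join(base * signal for base, signal in flows)


flows = flowgram("AATGC", n_flows=4)
print(flows)
print(read_from_flowgram(flows))
```

Here the first T flow yields a signal of 2 because the template starts with two A's, so the read depends on telling a signal of 2 apart from 1 or 3.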

33
Q

How does SOLiD perform sequencing by ligation?

A

Through 2-base encoding
Instead of the typical single-dNTP addition, probes that match two bases are used, for a total of 16 probes
Colour Space: Four colour sequencing encoding further increases accuracy

34
Q

What are the steps involved in sequencing by ligation?

A

1) Anneal the primer and hybridize the probe
2) Ligation and Detection
3) Cleave fluorescent tail (3-mer)
4) Repeat ligation cycle
Only the first 2 bases of the probe have to base pair perfectly with the unknown target sequence, and an adaptor needs to be added
The adaptor has a known sequence, to which the primer base pairs perfectly
Repeat steps 1-4 with a primer offset by one base (n-1)

35
Q

How can the sequence be determined using the colour space in SOLiD?

A

There are 16 possible two-base combinations, represented by four colours
The first base is known because it comes from the adaptor sequence
Each following base is worked out with the overlap method: consecutive colours share a base

36
Q

Explain how the sequence will be determined if blue, green, green, blue, yellow is produced?

A
Blue: AA (from adaptor) 
Green: AC (from overlap method) 
Green: CA (from overlap method) 
Blue: AA (from overlap method) 
Yellow: AG (from overlap method) 
Overall Sequence: AACAAG (the last colour contributes both of its bases) 
Slide 30 has an example you can look at.
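
The overlap method can be written as a small decoder. In this sketch each base is given a 2-bit number and a colour is the XOR of two neighbouring bases, which reproduces the pairs used above (blue = AA, green = AC or CA, yellow = AG). The assignment of red to the remaining pairs follows the usual SOLiD convention and is an assumption here, since this card does not show it.

```python
BASES = "ACGT"  # A=0, C=1, G=2, T=3
COLOUR_CODE = {"blue": 0, "green": 1, "yellow": 2, "red": 3}
COLOUR_NAME = {code: name for name, code in COLOUR_CODE.items()}


def encode(sequence):
    """Turn a base sequence into colours, one per overlapping pair of bases."""
    return [
        COLOUR_NAME[BASES.index(a) ^ BASES.index(b)]
        for a, b in zip(sequence, sequence[1:])
    ]


def decode(first_base, colours):
    """Recover the bases from the known first base plus the colour calls."""
    current = BASES.index(first_base)
    sequence = [first_base]
    for colour in colours:
        current ^= COLOUR_CODE[colour]  # known base + colour gives next base
        sequence.append(BASES[current])
    return "".join(sequence)


colours = ["blue", "green", "green", "blue", "yellow"]
print(decode("A", colours))
```

Because every base is covered by two colours, a single wrong colour call changes all downstream bases, which makes isolated errors easy to spot against a reference.
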
37
Q

What is special about second Gen sequencing?

A

Everything needs to be in DNA form (RNA must first be copied to cDNA)

38
Q

What is Ion Torrent?

A

Similar to pyrosequencing, but uses a semiconductor chip instead of light to detect dNTP incorporation
The chip measures changes in pH: a peak is registered for each incorporation, but the peak is not proportional to how many nucleotide repeats there are
Higher output and longer reads
Cannot reliably resolve nucleotide repeats (homopolymers)

39
Q

What is PacBio SMRT?

A

3rd-Gen Real Time Sequencing

40
Q

What is special about PacBio SMRT?

A

Unlike reversible termination methods the DNA synthesis process is never halted. Detection occurs in real time

41
Q

How does PacBio SMRT work?

A

Library Prep:
1) DNA template is circularized by the use of “bell” shaped adaptors
2) As long as the polymerase is stable, this allows for continuous sequencing of both strands
There is a size limit and a single stranded loop is used as the template for DNA synthesis
Synthesis:
1) DNA polymerase is fixed on the surface
2) Looped DNA comes in and synthesis occurs, each nucleotide added gives a signal
No template amplification is required

42
Q

What system does PacBio use to detect nucleotides?

A

Single Molecule Sequencing:
Instead of sequencing clonally amplified templates from beads (pyrosequencing) or clusters (Illumina), DNA synthesis is detected on a single DNA strand
Uses Zero-mode waveguide (ZMW)
1) DNA polymerase is affixed to the bottom of a tiny hole
2) Only the bottom portion of the hole is illuminated, allowing for detection of incorporation of dye-labeled nucleotide
Every nucleotide has a different property, such as wavelength/current that the machine can register, and output
This happens in real-time

43
Q

What is Nanopore?

A

3rd-Gen real-time sequencing that uses single-molecule sequencing technology
Based on the principle that each nucleotide has a different size and different electrical properties
Bases are guided through the nanopores by a molecular motor protein across an electrically resistant membrane (which has numerous nanopores)
Directly sequences single-stranded DNA or RNA molecules in a massively parallel fashion by measuring characteristic current changes as the bases pass through the nanopores

44
Q

How does Nanopore work?

A

There is no DNA synthesis, and reading happens in real time
Motor proteins at the pore thread the DNA strand through the protein channel, and the machine monitors the ionic current flowing through that channel
An unwinding enzyme also separates double-stranded DNA into single strands so that the machine can read the DNA
As each nucleotide goes through the pore, it gives a signal that identifies the nucleotide
In theory there is no size limit, but in practice errors become more likely the longer the DNA strand is

45
Q

Which sequencing system is better for direct RNA sequencing?

A

Nanopore, because it does not require synthesis, so the RNA does not need to be copied to cDNA
Nanopore can sequence RNA directly because the RNA strand can simply be threaded through the pore and read as it is.

46
Q

Which sequencing system is better for De novo genome assembly?

A

3rd Gen Sequencing because very long reads are needed to put together a genome

47
Q

What are the advantages of 3rd Gen Sequencing?

A

No amplification required

Extremely long reads

48
Q

What are the disadvantages of 3rd Gen Sequencing?

A

Higher error rates than 2nd Gen Sequencing (NGS)

Rate of 15% for indels, 1% for substitutions

49
Q

Which system would you use for short sequences and why?

A

NGS, because 3rd-Gen reads are prone to apparent deletions and duplications (indel errors), which makes it difficult to put the genome together

50
Q

Which system would you use for a sequence that is unknown/lot of variation/or complex?

A

3rd Gen, because it can sequence reads hundreds of kb long, which makes it easier to put the genome together from overlapping sequences (re-sequencing)

51
Q

Which system would you use for each of the following: little variation, a small genome, SNPs, structural changes, no reference genome, a virus, and the human genome?

A
Little variation: NGS
Small Genome: NGS 
SNPs: NGS 
Structural Changes: 3rd Gen 
No reference: 3rd Gen 
Virus: NGS, because not complex 
Human: NGS because there is a reference genome given
52
Q

Explain whole genome sequencing fold of coverage

A

1X coverage: Most places only read once, could have duplications, deletions, or read error
5X coverage: Each nucleotide was read 5 times in the sequence
10X coverage is a good starting point because this is where SNPs start to be picked up
More coverage: More accurate and needed for larger, more complex genomes
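
Fold coverage can be estimated with the standard relation coverage = (number of reads × read length) / genome size. The sketch below also uses the common Poisson idealization (reads landing uniformly at random) to estimate how much of the genome is missed at a given coverage. The read counts, read length, and genome sizes are hypothetical illustration values.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Average number of times each base is read."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target fold coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


def fraction_uncovered(coverage):
    """Expected fraction of bases never read, assuming Poisson-distributed
    read placement (an idealization, not true of real libraries)."""
    return math.exp(-coverage)


# Hypothetical run: 1 million 150-base reads on a 5 Mb genome
print(mean_coverage(1_000_000, 150, 5_000_000))

# Reads needed for 10X coverage of a 3 Gb genome with 150-base reads
print(reads_needed(10, 150, 3_000_000_000))

for c in (1, 5, 10):
    print(f"{c}X: about {fraction_uncovered(c):.4%} of bases unread")
```

Under this model roughly a third of bases are never read at 1X, which is why 1X data cannot separate real variants from read errors.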

53
Q

Where are indels an issue?

A

Indels are problematic in 3rd Gen Sequencing

54
Q

Looking at Roche, Illumina, SOLiD, Ion Torrent, and PacBio, what are the advantages/disadvantages?

A

Roche: Long read length, but a high error rate in homopolymers and an 8-hour run time
Illumina: High throughput relative to cost, but short reads only and a long run time
SOLiD: Low error rate, but short reads and a long run time
Ion Torrent: Short run times
PacBio: No PCR and the longest read length, but a high error rate