L3: Genome Sequencing Flashcards

1
Q

What were the objectives of the genome project?

A
  1. whole genome sequence
  2. establish an interface
  3. identify + annotate genes
  4. characterize DNA diversity
  5. more resources
2
Q

Go through the evolution of DNA sequencing technologies

A
  1. 1977: Sanger sequencing, by Frederick Sanger
  2. Next Generation: Roche 2005, Illumina 2007
  3. Third Generation: from 2010 includes PacBio
3
Q

What is Sanger sequencing?

A
  1. Use a single primer to synthesize a single DNA strand
  2. ss molecules are made from templates using dNTPs and randomly terminated by adding dideoxynucleotides (ddATP, ddTTP, ddCTP, ddGTP)
    ==> separated by size on polyacrylamide gels
  3. the ddNTPs are labelled with fluorescent dyes
  4. fragments are read with a capillary sequencer
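A minimal sketch of the chain-termination idea above, assuming an idealized reaction in which one terminated fragment exists for every position of the newly synthesized strand. The function names `sanger_fragments` and `read_gel` are made up for illustration; this is a toy model, not how real base-calling software works.

```python
import random

def sanger_fragments(new_strand):
    """Return one (length, terminal_base) pair per position.

    Models random ddNTP termination: across many molecules, synthesis
    stops at every possible position, and the dye on the last base
    identifies which ddNTP ended the fragment.
    """
    return [(i + 1, base) for i, base in enumerate(new_strand)]

def read_gel(fragments):
    """Sort fragments by length and read off the terminal dye labels."""
    return "".join(base for _, base in sorted(fragments))

new_strand = "ATGCGTAC"           # hypothetical synthesized strand
fragments = sanger_fragments(new_strand)
random.shuffle(fragments)         # fragments are mixed together in the tube
print(read_gel(fragments))        # size separation recovers the order
```

Sorting by length plays the role of the gel or capillary: the shortest fragment gives the first base, the next shortest gives the second, and so on.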
4
Q

How does high-throughput Next Gen sequencing work, and what are its PROS?

A

Make millions of short sequences in a single run ==> overlapping sequences can be assembled into stretches up to 100K base pairs long
PROS
- almost complete genome, but pricey
- mix populations for biodiversity measure
- can use RNA population for gene expression analysis

5
Q

Illumina Solexa

A
  • producing single-stranded DNA
  • ligate the adapter oligos to DNA fragments
  • use a microfluidic cluster station to add fragments to the surface of a glass flow cell; each flow cell is separated into 8 lanes
  • interior surface covalently attached to oligos
  • oligos complementary to these (the adapters) are ligated onto the fragments
6
Q

Compare Illumina vs 454 Sequencing

A

454
* can make longer reads than Illumina because it can do multiple reads at once

Illumina
* DNA/RNA fragments are shorter
* adapters are added
* fragments amplified by PCR w/ adapter primers

7
Q

What is third generation sequencing

A
  • PacBio SMRT (single molecule real-time)
  • Illumina TruSeq
  • Oxford Nanopore
  • produces very long reads
8
Q

How does PacBio work (easy then hard)

A

Generate amplicon => ligate adapters => sequence => data analysis
1. SMRTbell adapters ligated to each amplicon
2. Sequencing primer annealed to the SMRTbell template and polymerase bound to the complex
3. complex loaded into zero-mode waveguides (ZMWs), where replication produces nucleotide-specific fluorescence
4. circular consensus sequencing lets the polymerase repeatedly replicate the circular template => one long read
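The payoff of repeated passes in step 4 can be sketched as a per-position majority vote. This is a simplified illustration, assuming the passes are already aligned, have equal length, and contain only substitution errors; the function name `circular_consensus` and the example passes are hypothetical, and real consensus calling is considerably more involved.

```python
from collections import Counter

def circular_consensus(passes):
    """Return the most common base at each position across all passes.

    Assumes every pass is already aligned and has the same length.
    """
    return "".join(
        Counter(column).most_common(1)[0][0]  # majority base in this column
        for column in zip(*passes)
    )

# Four hypothetical passes over the same circular template,
# two of them carrying a single random error each.
passes = [
    "ACGTTAGC",
    "ACGTAAGC",  # error at position 5
    "ACCTTAGC",  # error at position 3
    "ACGTTAGC",
]
print(circular_consensus(passes))
```

Because errors fall at different positions in different passes, each one is outvoted by the correct base from the other passes.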

9
Q

What is nanopore sequencing?

A
  1. determine sequence of DNA fragments by passing DNA through protein pore in membrane
10
Q

Shotgun sequencing

A
  • Contigs are built by overlapping reads
  • There are always gaps between contigs where software cannot extend any further
  • Always many reads left over that cannot be assigned to contig

Why: when reads contain parts of repetitive sequences, they may overlap thousands of other reads, making it impossible to uniquely determine overlaps. Consequently: Very few eukaryotic genomes are ever completely sequenced.

11
Q

What are key DNA sequencing NGS steps

A
  1. fragment genomic DNA into small fragments of a few hundred bases
  2. immobilize individual DNA molecules onto a solid surface
  3. amplify each molecule by PCR many thousands of times
  4. perform DNA synthesis using nucleotides that emit a characteristic wavelength of light each time a base is added
  5. read the sequence by imaging the emission of light in real time
  6. from the thousands or millions of reads, assemble overlapping reads into long contiguous segments known as “contigs”
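Step 6 can be sketched with a greedy overlap merge: repeatedly join the two sequences that share the longest suffix/prefix overlap. This is a minimal sketch assuming error-free reads from one strand and no repeats; `overlap`, `greedy_assemble`, and the example reads are hypothetical, and production assemblers use far more sophisticated methods.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlap of min_len remains."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]

# Hypothetical reads sampled from the sequence ATGGCGTACCTGA
reads = ["ATGGCG", "GCGTAC", "TACCTG", "CCTGA"]
print(greedy_assemble(reads))
```

When the reads share no sufficient overlap, the function returns several contigs instead of one, which corresponds to the gaps between contigs seen in real assemblies.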