Sequencing genomes, NGS and bioinformatics Flashcards

1
Q

Describe chromosome maps

A
  • different types of map have different resolutions
  • lower resolution maps can be used to generate higher resolution maps
2
Q

cM

A

centimorgan: unit of genetic map distance, proportional to percentage recombination in a single generation (1 cM ≈ 1% recombination)
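
As a rough illustration of this card, the sketch below converts offspring counts into a map distance, assuming the simple approximation that 1 cM corresponds to 1% recombination (reasonable only over short distances). The function name and the counts are hypothetical.

```python
def map_distance_cm(recombinants: int, total_offspring: int) -> float:
    """Approximate map distance in cM as the percentage of recombinant offspring."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinants <= total_offspring:
        raise ValueError("recombinants must lie between 0 and total_offspring")
    return 100.0 * recombinants / total_offspring


# Hypothetical cross: 12 recombinant offspring out of 200 scored.
distance = map_distance_cm(recombinants=12, total_offspring=200)
print(f"{distance} cM")
```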

3
Q

Describe a karyotypic map

A

microscopic observation of chromosomal spreads

4
Q

Describe linkage maps

A
  • genetic maps derived from monitoring recombination between markers
  • cM units
5
Q

Describe physical maps

A
  • measured in bp
  • tiling path of overlapping BAC clones
6
Q

Describe a sequence map

A

sequence of bases along the chromosome

7
Q

Describe hierarchical genome sequencing

A
  • target genome cloned into highly redundant BAC vector library
  • creates contigs
  • identify minimal set of overlapping clones by restriction mapping and hybridisation
  • shotgun
  • sequencing and assembly
8
Q

contig

A
  • tiling path of BACs
  • approximately 100kb each
9
Q

plasmid subclones

A

approximately 2kb each

10
Q

Describe shotgun in hierarchical genome sequencing

A

fragment BAC and subclone pieces into plasmid vector

11
Q

Describe sequencing and assembly in hierarchical genome sequencing

A
  • compile sequences of individual overlapping plasmid subclones to produce sequence of entire BAC
  • compile sequences of individual overlapping BAC clones to produce sequence of entire chromosome / genome
12
Q

Describe shotgun genome sequencing

A
  • fragment entire genome and clone pieces directly into plasmid vector
  • forms plasmid clones
  • sequencing of plasmid clones at random
  • computational assembly
  • individual reads & sequence contigs not anchored
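
The "computational assembly" step above can be sketched with a toy greedy overlap merger. This is only an illustration under strong assumptions (error-free reads, a single strand, no repeats); real assemblers are far more elaborate, and the function names and reads here are hypothetical.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for length in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences that share the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    olen = overlap_length(a, b, min_overlap)
                    if olen > best_len:
                        best_len, best_i, best_j = olen, i, j
        if best_len == 0:
            # No remaining overlaps: what is left are separate, unanchored contigs.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads sampled from one short sequence.
print(greedy_assemble(["ATGCGTAC", "GTACGTTA", "GTTAGC"]))
```

Reads that share no overlap stay as separate contigs, which mirrors the point that shotgun contigs are not anchored to a map.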
13
Q

Describe NGS

A

enable “massively parallel sequencing”

14
Q

NGS

A

next generation sequencing

15
Q

What is massively parallel sequencing

A

analysis of millions of fragments from a single sample in parallel

16
Q

Describe ‘454’ pyrosequencing

A
  • pyrophosphate (PPi) released upon nucleotide incorporation by DNA polymerase
  • PPi used to fuel a downstream set of reactions that ultimately produces light
17
Q

How does pyrosequencing produce light

A

action of luciferase on luciferin

18
Q

Describe library preparation for ‘454’ pyrosequencing

A
  • shear genomic DNA to 300-800bp fragments
  • ligate oligonucleotide adapters
  • amplify fragments by PCR
19
Q

Describe ‘454’ pyrosequencing emulsion PCR

A
  • anneal DNA fragments to an excess of agarose beads that have oligonucleotides complementary to the A/B adaptors attached to them
  • 1 fragment per bead
  • disperse beads and PCR reagents in oil to form an emulsion
  • each water droplet carries a single bead
  • PCR amplifies the unique sequence on the surface of each bead
  • release beads
  • add beads to a sequencing plate
20
Q

Describe the functioning of water droplets in emulsion PCRs

A

each droplet functions as a discrete microreactor, eliminating cross-talk during PCR

21
Q

What does emulsion PCR produce?

A

millions of copies of an identical sequence on each of hundreds of thousands of beads

22
Q

Describe the sequencing plate of emulsion PCR

A

1.6 million wells

23
Q

Describe the pyrosequencing element of ‘454’ pyrosequencing

A
  • smaller enzyme beads added to each well to surround the DNA-carrying beads
  • sequencing primer, DNA polymerase, APS and luciferin added
  • different dNTPs added sequentially to the wells in repeated cycles
  • nucleotide incorporation results in light emission
  • light intensity recorded
  • CCD camera identifies which wells have incorporated a new nucleotide, producing a signal image
24
Q

APS

A

adenosine 5’ phosphosulphate

25
Describe the error of the pyrosequencing method
  • 2 or more consecutive identical bases generate proportionally greater light intensity, which is difficult to measure accurately
  • results in homopolymer errors
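
To make the homopolymer issue concrete, here is a small idealised simulation of a pyrosequencing flowgram: each nucleotide flow yields a signal equal to the number of identical bases incorporated. It assumes noise-free signals and a repeating "TACG" flow order (an assumption for this sketch); the function names are hypothetical.

```python
def simulate_flowgram(synthesised: str, flow_order: str = "TACG", n_cycles: int = 1):
    """Return (nucleotide, signal) pairs; signal = bases incorporated in that flow."""
    position = 0
    flows = []
    for _ in range(n_cycles):
        for base in flow_order:
            run = 0
            # A homopolymer run is incorporated in one flow, giving one larger signal.
            while position < len(synthesised) and synthesised[position] == base:
                run += 1
                position += 1
            flows.append((base, run))
    return flows


def decode_flowgram(flows) -> str:
    """Rebuild the sequence, trusting each signal as an exact base count."""
    return "".join(base * signal for base, signal in flows)


flows = simulate_flowgram("TTTACG")
print(flows)
print(decode_flowgram(flows))
```

In a real instrument the signal for a run of 3 versus 4 identical bases must be told apart from a measured intensity, which is where miscounts arise.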
26
Describe the pyrophosphate reaction
  • ATP sulfurylase converts PPi and APS into ATP
  • ATP drives the luciferase-catalysed oxidation of luciferin, producing light and oxyluciferin
27
Describe Illumina sequencing
  • reversible terminator sequencing
  • bridge amplification
28
Compare Illumina and Sanger sequencing
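  • similar in principle: both use chain-terminating nucleotides
  • Illumina terminators are reversible, so the block can be removed and synthesis can continue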
29
Describe the library preparation for Illumina sequencing
  • fragment DNA
  • repair ends
  • add A overhang
  • ligate adaptors
  • select ligated DNA
30
Compare Illumina and 454 sequencing
bridge amplification functionally analogous to emulsion PCR
31
Describe bridge amplification in Illumina sequencing
  • flow cell surface covered with a lawn of attached oligos that are complementary to the adaptors
  • library fragments bound to the flow cell by hybridisation to the oligos
  • PCR amplification uses only the bound oligos as primers
  • denature and wash original strand away
  • denature clusters and cleave to wash away reverse strands
  • ready for sequencing
32
What is the effect of using only the bound oligos as primers
constrains the distribution of the products, producing clusters ('colonies') of numerous, single-stranded identical template fragments that form bridges
33
polony
polymerase-generated colony
34
Describe the sequencing element of Illumina sequencing
  • clusters supplied with polymerase and all 4 nucleotides, each tagged with a different fluorophore
  • because the nucleotides have their 3'OH chemically blocked, only one is incorporated per cycle, i.e. the primer is extended by a single base
  • after each incorporation cycle, the flow cell is imaged to identify the new nucleotide incorporated at each cluster
  • chemical step removes the fluorescent tag and the 3' block
  • generates base calls
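
The one-base-per-cycle behaviour described above can be sketched as a loop over cycles and clusters. This is a simplified model that assumes perfect chemistry (no dephasing), with each cluster's template written 3'→5' so the read comes out 5'→3'; the names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(templates: list[str], n_cycles: int) -> list[str]:
    """Record the single base incorporated at every cluster in each cycle."""
    reads = ["" for _ in templates]
    for cycle in range(n_cycles):
        for idx, template in enumerate(templates):
            if cycle < len(template):
                # The 3' block allows exactly one incorporation, then imaging.
                reads[idx] += COMPLEMENT[template[cycle]]
        # Here the fluorescent tag and 3' block would be removed before the next cycle.
    return reads


print(sequence_by_synthesis(["TACG", "GGCA"], n_cycles=3))
```

Because a homopolymer is read one base per cycle, run length never has to be inferred from signal intensity.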
35
base calls
the identity of the base at each cluster, determined by imaging the clusters after each cycle
36
Sanger sequencing aka
chain termination
37
Summarise Sanger sequencing
  • 400-900bp read length
  • 99.999% accurate
  • 96 reads per run
  • takes 20mins-3hrs
  • $5000 per million bases
38
Summarise 454 pyrosequencing
  • 700bp read length
  • >99% accuracy
  • 1 million reads per run
  • takes 24hrs
  • $10 per million bases
39
Summarise Illumina sequencing
  • 150-300bp read length
  • >99% accuracy
  • billions of reads per run
  • 1 to 10 days per run, depending on the sequencer
  • much less than $0.5 per million bases
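
As a worked comparison of the per-base costs quoted on the three summary cards, the sketch below estimates the cost of a project. The genome size and coverage are hypothetical inputs, and the Illumina figure uses $0.5 per million bases as an upper bound since the card says "much less than".

```python
# Cost per million bases (USD), taken from the summary cards.
COST_PER_MB_USD = {"Sanger": 5000.0, "454": 10.0, "Illumina": 0.5}


def sequencing_cost_usd(genome_size_mb: float, coverage: float, cost_per_mb: float) -> float:
    """Total cost = million bases sequenced (genome size x coverage) x cost per million bases."""
    return genome_size_mb * coverage * cost_per_mb


# Hypothetical project: a 100 Mb genome sequenced to 10x coverage.
for platform, cost_per_mb in COST_PER_MB_USD.items():
    total = sequencing_cost_usd(100, 10, cost_per_mb)
    print(f"{platform}: ${total:,.0f}")
```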
40
Illumina sequencing aka
sequencing by synthesis
41
Advantages of Sanger sequencing
  • long, accurate individual reads
  • cost effective for very small projects
42
Disadvantages for Sanger sequencing
  • low throughput
  • expensive and impractical for large projects
43
Advantages for 454 pyrosequencing
  • long read size
  • fast
44
Disadvantages for 454 pyrosequencing
  • expensive
  • homopolymer errors
45
Advantages for Illumina sequencing
extremely high sequence yield
46
Disadvantages for Illumina sequencing
  • short reads difficult to assemble
  • equipment very expensive
47
What are the second generation technologies
  • 454 pyrosequencing
  • Illumina
48
Describe second generation technologies
rely on the parallel, phased sequencing of huge numbers of identical fragments to generate detectable signal
49
Describe third generation technologies
  • can sequence single molecules
  • dephasing not an issue; much longer reads can be achieved
50
List some third generation technologies
  • Pacific Biosciences: SMRT sequencing
  • Oxford Nanopore Technologies: MinION, PromethION
51
SMRT sequencing
single molecule real-time
52
Describe the characteristics of third generation sequencing
  • up to 100kb read length
  • high error rate
53
Describe RNA-Seq
provides information on the transcriptome: which genes are expressed and their relative transcript levels
54
Summarise shotgun
  • sequence genome
  • computational assembly
55
Summarise hierarchical shotgun
  • create BAC clone map
  • sequence BACs
  • computational assembly
56
Summarise cDNA sequencing
  • extract mRNA
  • generate and sequence cDNAs
  • computational assembly
57
Summarise resequencing
  • sequence genome
  • genome alignment
58
Describe genome alignment
  • align to previously derived sequence
  • aids assembly
59
What are the advantages of resequencing?
detect polymorphisms and mutations in different individuals, strains and mutants
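
A minimal sketch of how polymorphisms fall out of resequencing once a sample has been aligned to the reference. It assumes the two sequences are already aligned without gaps and have equal length, so only substitutions are reported; the function name and sequences are hypothetical.

```python
def call_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference base, sample base) for every mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Hypothetical reference and resequenced individual.
print(call_substitutions("ACGTAC", "ACCTAT"))
```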
60
Describe sequence annotation
  • origin
  • background information
  • important regions of the sequence
  • links to protein sequence, and other information
  • can contain errors (dependent on researchers for entry)
61
What is important origin data
species, variety/strain, tissue, cell line, clone, etc.
62
What is important background information for annotation
literature, researcher, etc.
64
What are important regions of the sequence to annotate?
promoter, introns, coding sequence, motifs, etc.
65
How to annotate
  • identify ORFs
  • search databases using ORFs as queries
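
The "identify ORFs" step can be illustrated with a simple scanner. This sketch assumes the standard start codon ATG and stop codons TAA/TAG/TGA, and scans only the three forward-strand reading frames; the function name and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for ORFs running from ATG to the first in-frame stop."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                end = i + 3  # end index is exclusive and includes the stop codon
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None
    return orfs


print(find_orfs("GGATGGCCTAAGG"))
```

Each ORF found this way could then be translated and used as a database query.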
66
What will databases supply on searching for ORFs as queries
  • related genes (potentially of known function)
  • conserved functional domains or motifs
  • protein targeting sequences, TMDs, etc.
67
Finding related genes
BLAST
68
BLAST
Basic Local Alignment Search Tool
69
BLASTN
nucleotide query sequence searched against a nucleotide database
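
For intuition about a nucleotide-against-nucleotide search, the toy sketch below indexes short exact words in a database and looks up the query's words to find "seed" hits, which is the first stage of BLAST-style heuristics. It is not the BLAST implementation (no extension, scoring or statistics); the function names, word size and sequences are hypothetical.

```python
from collections import defaultdict


def build_word_index(database: dict[str, str], word_size: int = 4) -> dict:
    """Map every word of length `word_size` to the (sequence name, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in database.items():
        for pos in range(len(seq) - word_size + 1):
            index[seq[pos:pos + word_size]].append((name, pos))
    return index


def find_seed_hits(query: str, index: dict, word_size: int = 4) -> list[tuple[int, str, int]]:
    """Return (query position, database sequence name, database position) for exact word matches."""
    hits = []
    for qpos in range(len(query) - word_size + 1):
        for name, dpos in index.get(query[qpos:qpos + word_size], []):
            hits.append((qpos, name, dpos))
    return hits


# Hypothetical two-sequence nucleotide database.
database = {"seqA": "TTACGTGG", "seqB": "CCCCCC"}
index = build_word_index(database)
print(find_seed_hits("ACGT", index))
```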