Genome Sequencing Flashcards
Sequencing a genome can be viewed as…
Obtaining the parts list of the cell/organism.
What 4 things are required for DNA synthesis?
DNA polymerase, primer, template DNA and dNTPs.
Describe DNA sequencing by the chain termination method (aka Sanger Sequencing)
- Template DNA, primer, DNA polymerase, dNTPs and ddNTPs are combined in one reaction; the products are then separated by size using gel electrophoresis
- Different size fragments are generated during DNA synthesis depending on the location of ddNTP incorporation/termination.
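- Chain extension stops wherever a ddNTP is incorporated, because a ddNTP lacks the 3'-OH group needed to add the next nucleotide. Reading the terminal base of each fragment in order of fragment size gives the order of nucleotides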
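The chain termination idea above can be sketched as a small simulation. This is a toy model under stated assumptions, not a description of real reaction chemistry: the function names, the 20% ddNTP incorporation chance and the example sequence are all hypothetical, and the input is written as the newly synthesized strand rather than the template.

```python
import random

def simulate_sanger(new_strand, n_molecules=2000, dd_fraction=0.2, seed=0):
    """Return {fragment length: terminal base} for terminated fragments.

    Assumption: at every position there is a fixed chance (dd_fraction)
    that a ddNTP is incorporated instead of a dNTP, which ends the chain.
    """
    rng = random.Random(seed)
    terminal_base_by_length = {}
    for _ in range(n_molecules):
        for position, base in enumerate(new_strand, start=1):
            if rng.random() < dd_fraction:
                # ddNTP incorporated: this molecule stops at this length.
                terminal_base_by_length[position] = base
                break
    return terminal_base_by_length

def read_sequence(terminal_base_by_length):
    """Order fragments by size (as electrophoresis does) and read the ends."""
    return "".join(
        terminal_base_by_length[length]
        for length in sorted(terminal_base_by_length)
    )

fragments = simulate_sanger("GATTACAGTC")
print(read_sequence(fragments))
```

With enough molecules, every fragment length is represented, so reading the terminal bases from shortest to longest fragment recovers the synthesized sequence.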
Describe fluorescent-labelled ddNTPs and capillary electrophoresis
- Each of the four ddNTPs carries a different fluorescent dye, so the colour of a fragment identifies which ddNTP was incorporated at its end in the sequencing reaction
- As the DNA fragments exit the capillary, they pass through a laser detection system. The laser excites the fluorescent dye attached to the ddNTP at the end of each fragment, causing it to emit light at a specific wavelength corresponding to one of the four bases (A, T, C, or G). The emitted light is detected and recorded as a peak in a chromatogram.
Why is capillary electrophoresis better for fluorescent-labelled ddNTPs?
Sequencing reactions are run on capillary gel electrophoresis because it gives better heat dissipation and resolution, requires less sample, and allows more reactions to be run in parallel
What is automated base calling?
- Used in automated Sanger sequencing: a scanner records the coloured signal of the different-sized termination fragments for each fluorescent-labelled ddNTP
- Computer processes fluorescent signals to generate an electropherogram, assigning a base to each peak.
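As a rough illustration of assigning a base to each peak, the sketch below picks the dye channel with the strongest signal at each peak. The function name and the intensity values are hypothetical, and real base-calling software does considerably more (peak detection, noise handling, overlapping peaks), so treat this only as the core idea.

```python
def call_bases(peaks):
    """Assign one base per peak by choosing the strongest dye channel.

    peaks: list of dicts mapping base -> fluorescence intensity
    (hypothetical, already peak-detected values).
    """
    return "".join(max(peak, key=peak.get) for peak in peaks)

# Four made-up peaks, in the order the fragments leave the capillary.
peaks = [
    {"A": 40, "C": 35, "G": 910, "T": 20},
    {"A": 870, "C": 25, "G": 60, "T": 30},
    {"A": 15, "C": 50, "G": 45, "T": 900},
    {"A": 30, "C": 640, "G": 55, "T": 70},
]
print(call_bases(peaks))
```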
In automated base calling, what is Phred?
- Phil’s Read Editor
- Electropherograms are usually messy, so Phred estimates a probability of error for each base call in the electropherogram
- The error probability is based on parameters such as the shape of a peak, the spacing between peaks and the height of a peak.
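Phred reports each error probability as a quality score, defined as Q = -10 * log10(P), where P is the estimated probability that the base call is wrong. The short sketch below converts between the two; the function names are illustrative only.

```python
import math

def phred_quality(error_probability):
    """Quality score Q for an estimated base-call error probability P."""
    return -10 * math.log10(error_probability)

def error_probability(quality):
    """Inverse conversion: error probability P for a quality score Q."""
    return 10 ** (-quality / 10)

for p in (0.1, 0.01, 0.001):
    print(f"P(error) = {p} gives Q = {phred_quality(p):.0f}")
```

So a score of 20 means a 1 in 100 chance that the call is wrong, and a score of 30 means 1 in 1000.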
What are automated sequencers?
- All steps from sample loading to base calling are automated
- Sequencing reactions are usually performed manually in 96-well microplates in a thermal cycler (denaturing, annealing, extension)
- Using machines like the “Applied Biosystems 3730xl DNA analyzer”, we can obtain up to 800 bp of sequence/reaction.
When doing Sanger sequencing, why is each reaction limited to about 800 bp of sequence?
The polymerase eventually falls off the template (it has limited processivity), and longer fragments become harder to resolve at single-base resolution during electrophoresis
Describe the Human Genome Sequencing Project, specifically the public and private sectors involved in the project
- Advances in automated sequencing allowed for genomic projects such as the human genome project.
- Public: project formally proposed in 1985 with the NIH and the US Department of Energy, as a 15-year, $3 billion plan involving international genome sequencing centers
- Private: Celera Genomics started a second project in 1998, aiming to complete the genome sequence in three years (strongly profit-driven, even though you can’t patent anything made by nature)
In the Human Genome project, the DNA came from anonymous donors of diverse ethnic backgrounds. Why?
Using several donors was better than picking one random person’s DNA by lottery, because it helped determine how similar human genomes actually are (unknown at the time, since the genome hadn’t been sequenced yet). The finding was that human genomes are very similar, with only some differing alleles accounting for our different phenotypes.
Describe the shotgun approach to sequencing
The shotgun approach breaks the genome into smaller fragments (clones) and sequences these fragments
Random shearing/sonication in sequencing
The chromosome is broken at random positions into fragments, and the fragments are sequenced independently
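A minimal sketch of random shearing, assuming fragments start at uniformly random positions and have lengths drawn from a fixed range. The function name, the fragment length range and the made-up genome are all hypothetical.

```python
import random

def shear(genome, n_fragments, min_len, max_len, seed=0):
    """Return (start, fragment) pairs cut at random positions."""
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_fragments):
        length = rng.randint(min_len, max_len)
        start = rng.randint(0, len(genome) - length)
        fragments.append((start, genome[start:start + length]))
    return fragments

# A made-up 500-base "genome" for illustration.
genome = "".join(random.Random(1).choice("ACGT") for _ in range(500))
fragments = shear(genome, n_fragments=20, min_len=30, max_len=60)

# Average number of times each base is covered by a fragment.
total_bases = sum(len(fragment) for _, fragment in fragments)
print(f"mean coverage: {total_bases / len(genome):.1f}x")
```

Because the breaks are random, fragments overlap one another, which is what later allows them to be ordered.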
You need many copies of the same fragment to perform Sanger sequencing and to accurately see fluorescence. What can be used (in general) to accomplish this?
Cloning vectors
What are the 5 common features of a vector?
- Promoter: constitutive (always on)/inducible
- Multicloning site: unique restriction sites for inserting a gene
- Epitope tag: protein purification/localization
- Origin of replication: determines copy number (also ensures that both daughter cells have the vector)
- Selectable marker: antibiotic resistance (used to identify which E. coli cells actually have the vector)
Phagemid
1 kb insert
Plasmid
up to 10 kb insert
P1 clone
100 kb insert
Bacterial Artificial Chromosome
up to 300 kb insert
What are the steps of hierarchical shotgun sequencing (3 steps)?
- Chromosome is fragmented by partial restriction digest or shearing (sonication)
- Clone the unique fragments into BACs (300 kb), PACs (100 kb) and cosmids (50 kb), and transform into E. coli. The resulting collection of colonies is the DNA library; each colony contains one vector
- Map the correct order of the cloned fragments to select BACs for sequencing, so that the whole genome is represented.
What is the goal when mapping the correct order of BAC clones?
Sequence the minimum number of nucleotides to cover the entire genome to cut costs (i.e. don’t want to sequence multiple BACs containing the same region of the genome)
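Choosing the fewest clones that still cover the genome can be sketched as a greedy interval-cover problem. The sketch assumes each clone's position on the genome is already known, whereas in practice positions are inferred from overlaps. The clone names, coordinates (in kb) and function name are hypothetical.

```python
def minimal_tiling_path(clones, genome_length):
    """Pick a minimal set of clones covering positions 0..genome_length.

    clones: list of (name, start, end) tuples, end exclusive.
    Greedy rule: among clones that start within the region already
    covered, always take the one that extends furthest.
    """
    ordered = sorted(clones, key=lambda clone: clone[1])
    path, covered, i = [], 0, 0
    while covered < genome_length:
        best = None
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best[0])
        covered = best[2]
    return path

clones = [
    ("BAC1", 0, 150), ("BAC2", 50, 180), ("BAC3", 100, 300),
    ("BAC4", 140, 260), ("BAC5", 280, 450), ("BAC6", 250, 400),
]
print(minimal_tiling_path(clones, genome_length=450))
```

Here BAC2, BAC4 and BAC6 lie entirely within regions covered by the selected clones, so sequencing them would add cost without adding coverage.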
What are two ways of detecting BACs with overlapping genome sequences?
- BAC library screening by hybridization
- Restriction fingerprinting BAC clones
Describe the steps for BAC library screening by hybridization
- Rapid identification of overlapping clones using a random sequence/probe (single stranded DNA)
1. BAC colonies are robotically transferred to nitrocellulose/nylon membrane and screened with a radioactive probe
2. Probe will only hybridize to BAC colonies with overlapping fragments. Black spots show where the probe is bound (black due to the radioactivity of the probe)
3. The sequence at the end of a clone can be used as a probe in a subsequent screen to look for overlapping fragments: “chromosome walking”
Describe restriction fingerprinting of BAC clones
- Complete restriction digest of BAC clones followed by gel electrophoresis to determine restriction fragment profile for each BAC clone
- Identify BAC clones with common restriction fragments
- Overlapping patterns indicate that two BAC clones share common DNA sequences, allowing researchers to identify overlaps between different clones
- By comparing the overlap of many clones, scientists can begin to determine the relative positions of BAC clones along the genome.
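Comparing fingerprints can be sketched as counting restriction fragments of matching size between two clones. This is a simplified illustration: the 2% size tolerance, the threshold of three shared fragments, the clone names and the fragment sizes (in bp) are all assumptions, and real fingerprint analysis scores matches statistically rather than with a fixed cutoff.

```python
from itertools import combinations

def shared_fragments(sizes_a, sizes_b, tolerance=0.02):
    """Count fragments of A that match a distinct fragment of B in size."""
    unmatched_b = sorted(sizes_b)
    shared = 0
    for size in sorted(sizes_a):
        for candidate in unmatched_b:
            if abs(size - candidate) <= tolerance * size:
                unmatched_b.remove(candidate)  # each band is matched once
                shared += 1
                break
    return shared

def likely_overlaps(fingerprints, min_shared=3):
    """Return (clone_a, clone_b, n_shared) for pairs sharing enough bands."""
    overlaps = []
    for a, b in combinations(fingerprints, 2):
        n_shared = shared_fragments(fingerprints[a], fingerprints[b])
        if n_shared >= min_shared:
            overlaps.append((a, b, n_shared))
    return overlaps

fingerprints = {
    "BAC_A": [1200, 3400, 5600, 8000, 950],
    "BAC_B": [3400, 5600, 8000, 2100, 700],
    "BAC_C": [410, 2600, 7300, 1500],
}
print(likely_overlaps(fingerprints))
```

In this made-up data, BAC_A and BAC_B share three bands and are flagged as likely overlapping, while BAC_C shares none with either.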