PREFI LEC: DNA SEQUENCING Flashcards
Refers to the order of the nucleotides in the DNA molecule
DNA SEQUENCE
Applications of DNA sequencing in medical laboratory:
1. Detection of mutation
2. Typing microorganisms
3. Identifying human haplotypes
4. Designating polymorphisms
5. Treatment strategies
DNA SEQUENCE
Sequencing Methods:
- Direct sequencing: manual and automated
- Pyrosequencing
- Bisulfite DNA sequencing
- RNA sequencing
- Next-Generation sequencing
Direct determination of the order, or sequence of nucleotides in a DNA polymer
DIRECT SEQUENCING
Most specific & direct method for identifying genetic lesions (mutations)/ polymorphisms
DIRECT SEQUENCING
2 TYPES OF DIRECT SEQUENCING
- Manual sequencing (chemical/Maxam-Gilbert & Sanger sequencing)
- Automated fluorescent sequencing (dye primer & dye terminator sequencing)
2 PROCESSES IN MANUAL DNA SEQUENCING
- Chemical (Maxam-Gilbert) Sequencing
- Dideoxy Chain Termination (Sanger) Sequencing
Requires a ds/ss version of the DNA region to be sequenced, with 1 end radioactively labeled (32P)
- Chemical (Maxam-Gilbert) Sequencing
Allan M. Maxam & Walter Gilbert
Chemical (Maxam-Gilbert) Sequencing
Sequencing proceeds in 4 separate reactions
Chemical (Maxam-Gilbert) Sequencing
Template used in Chemical (Maxam-Gilbert) Sequencing
labeled fragment
Addition of a ______ = ssDNA would break at specific nucleotides
strong reducing agent (10% piperidine)
After reactions: fragments will be separated by size on a ______
denaturing polyacrylamide gel (6-20%)
Short fragments (up to 50 bp) =
1-2 hours
Long fragments (>150 bp) =
7-8 hours
Frederick Sanger
Uses dideoxynucleotides (ddNTPs) to determine the order/sequence of nucleotides in a nucleic acid
Primer complementary to DNA to be sequenced
Dideoxy Chain Termination (Sanger) Sequencing
Product detection of sequencing of Dideoxy Chain Termination (Sanger) Sequencing
- Primer is attached at the 5’ end to a 32P/fluorescent dye-labeled nucleotide
- Incorporate 32P/35S-labeled dNTPs in the nucleotide sequencing reaction mix (internal labeling)
are added, terminating the DNA synthesis (chain termination)
ddNTPs
5’-3’ phosphodiester bond cannot be established to incorporate a subsequent nucleotide
Lack OH
Components: Mixed in 4 reaction tubes
- DNA template (PCR product)
- Radioactively-labeled primer
- Enzyme (DNA polymerase)
- dNTPs (all 4)
- Buffer (20 mM EDTA, formamide, gel tracking/loading dyes)
- Different ddNTPs in each of the 4 tubes
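The four-tube setup above can be sketched as a small simulation. This is a simplified, deterministic model, not a laboratory protocol: the function names are hypothetical, fragment lengths ignore the primer, and every possible termination product is assumed to be present in each tube.

```python
# Simplified model of the four Sanger reaction tubes (hypothetical helper names).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sanger_ladder(template_3to5):
    """Return {ddNTP base: fragment lengths} for a template written 3'->5'."""
    # The polymerase extends the primer 5'->3', building the complement of the template.
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        # In the tube containing the matching ddNTP, some chains terminate here.
        tubes[base].append(length)
    return tubes


def read_ladder(tubes):
    """Read the ladder from the shortest to the longest fragment."""
    base_by_length = {
        length: base for base, lengths in tubes.items() for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))


tubes = sanger_ladder("TACGGT")
print(tubes)               # fragment lengths found in each of the 4 tubes
print(read_ladder(tubes))  # sequence of the newly synthesized strand, 5'->3'
```

Reading the shortest fragment first mirrors reading a gel from the bottom up: each fragment is one nucleotide longer than the one before it, and the tube (lane) it appears in names that nucleotide.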
Sequencing reaction of Dideoxy Chain Termination (Sanger) Sequencing
thermal cycler (cycle sequencing)
Automated reading of DNA sequence ladder requires fluorescent dyes (4 distinct colors) to label primers / sequencing events
1. Fluorescein
2. Rhodamine
3. Bodipy (4,4-difluoro-4-bora-3a,4a-diaza-s-indacene)
AUTOMATED FLUORESCENT SEQUENCING
Fluorescent dyes can be distinguished by
automated sequencers
Approaches (to label fragments according to their terminal ddNTP):
dye primer & dye terminator sequencing
Fragments ending in ddATP, read as A in the sequence =
green dye
Fragments ending in ddCTP, read as C in the sequence =
blue dye
Fragments ending in ddGTP, read as G in the sequence =
black/yellow dye
Fragments ending in ddTTP, read as T in the sequence =
red dye
4 different fluorescent dyes are attached to 4 separate aliquots of the sample
Dye Primer Sequencing
Dye molecules are attached to the 5’ end of the primer = 4 versions of the same primer w/ different dye labels
Dye Primer Sequencing
Products are labeled at the 5’ end using the dye color associated w/ the ddNTP at the end of the fragment
Dye Primer Sequencing
1 of the 4 fluorescent dyes attached to each of the ddNTPs
Dye Terminator Sequencing
All 4 sequencing reactions are performed in the same tube
Dye Terminator Sequencing
Product fragments are labeled at the 3’ end
Dye Terminator Sequencing
4 sets of sequencing products in each reaction are loaded onto a single gel lane/capillary
Fluorescent dye colors distinguish which nucleotide is at the end of each fragment
Fluorescent detection equipment yields results as electropherogram
Base calling: identification of the bases in a sequence by the sequencing software. If a base is not clear, N will replace A, C, T, or G
Automated Electrophoresis
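Base calling from dye colors can be illustrated with a toy example. This is only a sketch: the peak intensities, the `min_ratio` threshold, and the function names are assumptions made for illustration, and real sequencing software uses far more elaborate signal processing. The dye-to-base mapping follows the colors listed in the cards above (black is used for G).

```python
# Dye colors as listed in the cards: green = A, blue = C, black = G, red = T.
DYE_TO_BASE = {"green": "A", "blue": "C", "black": "G", "red": "T"}


def call_base(peak, min_ratio=2.0):
    """Call one base from a dict of dye intensities; return N if ambiguous.

    min_ratio is a hypothetical threshold: the strongest dye must be at
    least this many times brighter than the second strongest.
    """
    ranked = sorted(peak.items(), key=lambda item: item[1], reverse=True)
    (top_dye, top_signal), (_, second_signal) = ranked[0], ranked[1]
    if top_signal <= 0 or top_signal < min_ratio * second_signal:
        return "N"
    return DYE_TO_BASE[top_dye]


def call_sequence(peaks):
    """Call every peak of an electropherogram in order."""
    return "".join(call_base(peak) for peak in peaks)


peaks = [
    {"green": 900, "blue": 40, "black": 30, "red": 20},   # clear A
    {"green": 35, "blue": 50, "black": 25, "red": 870},   # clear T
    {"green": 400, "blue": 380, "black": 20, "red": 10},  # mixed signal
    {"green": 10, "blue": 15, "black": 950, "red": 30},   # clear G
]
print(call_sequence(peaks))  # ATNG
```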
Determines a DNA sequence without having to make a sequencing ladder
PYROSEQUENCING
Relies on the generation of light (luminescence) when nucleotides are added to a growing DNA strand
No gels, fluorescent dyes, ddNTPs
PYROSEQUENCING
Reaction mix components PYROSEQUENCING
- ssDNA template
- Sequencing primer
- Sulfurylase
- Luciferase
- Substrates: adenosine-5’-phosphosulfate (APS) & luciferin
- 1 of the 4 dNTPs is added to the reaction in a predetermined order
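The one-dNTP-at-a-time idea can be sketched as follows. The model is idealized: it assumes the light signal is exactly proportional to the number of nucleotides incorporated in a dispensation, and the function names and the `max_cycles` limit are hypothetical.

```python
def pyrogram(strand_to_synthesize, dispensation_order="ACGT", max_cycles=50):
    """Return (nucleotide, light units) for each dispensation.

    strand_to_synthesize is the strand that grows from the primer,
    i.e. the complement of the ssDNA template.
    """
    signals = []
    position = 0
    for _ in range(max_cycles):
        for nucleotide in dispensation_order:
            if position >= len(strand_to_synthesize):
                return signals
            incorporated = 0
            # A run of identical bases is filled in a single dispensation,
            # giving a proportionally stronger light signal.
            while (position < len(strand_to_synthesize)
                   and strand_to_synthesize[position] == nucleotide):
                incorporated += 1
                position += 1
            signals.append((nucleotide, incorporated))
    return signals


def read_pyrogram(signals):
    """Rebuild the sequence from the recorded light signals."""
    return "".join(nucleotide * units for nucleotide, units in signals)


signals = pyrogram("GGATC")
print(signals)
print(read_pyrogram(signals))  # GGATC
```

A dispensation that gives no light (0 units) means that nucleotide was not the next base, which is why no sequencing ladder is needed.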
A.K.A. methylation-specific sequencing
BISULFITE DNA SEQUENCING
Chain termination sequencing designed to detect methylated cytosine nucleotides
2-4 µg of genomic DNA is cut with restriction enzymes to facilitate denaturation
BISULFITE DNA SEQUENCING
DNA is denatured (97ºC for 5 mins) & exposed to bisulfite solution (sodium bisulfite, NaOH, hydroquinone) for 16-20 hrs
Cytosines are deaminated to uracil; 5-methylcytosines are unchanged. The difference can be detected by Sanger sequencing/pyrosequencing
BISULFITE DNA SEQUENCING
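The logic of bisulfite conversion can be sketched in a few lines. This is a simplified single-strand model with hypothetical function names; it assumes complete conversion and that uracil is read as T in the sequencing result.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Deaminate unmethylated C to U; 5-methylcytosines stay as C."""
    converted = []
    for index, base in enumerate(sequence):
        if base == "C" and index not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def infer_methylated(original, sequenced_after_treatment):
    """Positions (0-based) that were C before treatment and still read as C."""
    return [
        index
        for index, (before, after) in enumerate(zip(original, sequenced_after_treatment))
        if before == "C" and after == "C"
    ]


original = "ACGTCCGA"
treated = bisulfite_convert(original, methylated_positions={1})
read = treated.replace("U", "T")  # uracil is read as T after sequencing
print(treated)                           # ACGTUUGA
print(infer_methylated(original, read))  # [1]
```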
Early approaches: used RNase to cut end-labeled RNA at specific nucleotides
Other approaches:
- based on amino acid sequence
- based on sequencing of its complementary DNA
RNA SEQUENCING
Based on single-molecule sequencing technology & virtual terminator nucleotides
mRNA is captured by immobilized polydT oligomers (through their polyA tails)
RNA w/o polyA tails: initial treatment w/ polyA polymerase
4 reversibly dye-labeled nucleotides are sequentially added
DIRECT RNA SEQUENCING
A.K.A. massively parallel sequencing
NEXT-GENERATION SEQUENCING (NGS)
Designed to sequence large numbers of templates carrying millions of bases
Powerful computer data assembly systems (bioinformatics, computer software and support) are required
NEXT-GENERATION SEQUENCING (NGS)
Require the preparation of a sequencing library (sets of DNA fragments representing the regions to be sequenced)
NEXT-GENERATION SEQUENCING (NGS)
Collection of genes that have been grouped for testing, enabling simultaneous sequencing of all genes (2 to >1000 genes)
Focuses on targeted selection of specific genes for a specific purpose
Gene Panels
3 TYPES OF Gene Panels
HOT SPOT PANEL
TARGETED PANEL
VERY LARGE PANEL
target regions of specific genes known to affect treatment response, disease state, or clinical condition
“Hot-spot” panels
critical genes in particular diseases (hematological-cancer specific, solid-tumor specific)
Targeted panels
diagnostic, prognostic, discovery purposes
Very large panels (>3000 genes)
Collection of DNA fragments (100-1000 bp) to be sequenced
Represents a broad, comprehensive collection of DNA sequences (e.g., genomic or cDNA), allowing for genome-wide/large-scale analyses
NGS Library
synthetic short dsDNA carrying sequences complementary to a single primer pair, which may contain short sequences that will ID the sample (indexing / bar coding)
Adapters
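Indexing (bar coding) lets reads from several samples be pooled and sorted afterwards. The sketch below assumes, for simplicity, that the index sits at the start of each read and must match exactly; the barcodes, sample names, and function name are hypothetical, and actual platforms may read the index separately.

```python
def demultiplex(reads, barcode_to_sample):
    """Assign each read to a sample using the index at the start of the read."""
    barcode_length = len(next(iter(barcode_to_sample)))
    assigned = {sample: [] for sample in barcode_to_sample.values()}
    assigned["unassigned"] = []
    for read in reads:
        barcode = read[:barcode_length]
        if barcode in barcode_to_sample:
            # Keep only the insert; the index has done its job.
            assigned[barcode_to_sample[barcode]].append(read[barcode_length:])
        else:
            assigned["unassigned"].append(read)
    return assigned


barcodes = {"ACGT": "patient_1", "TGCA": "patient_2"}  # hypothetical indexes
reads = ["ACGTGGGAAA", "TGCATTTCCC", "ACGTCCCTTT", "GGGGAAAATT"]
print(demultiplex(reads, barcodes))
```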
The regions to be sequenced are enriched by:
1. Probe hybridization
Probes = biotinylated oligonucleotides complementary to specific gene regions
Targeted Libraries
loss of library fragments from the sequenced regions
Allele dropout
4 NGS Platforms
- Ion-conductance
- Reversible dye terminator sequencing
- Sequencing by ligation
- Nanopore sequencing
Indexed libraries (gene panels) are amplified using primers immobilized on microparticles (beads) in aqueous oil emulsion using adapters on the library fragments complementary to the immobilized primers
Ion-Conductance
Captured/amplified fragments are hybridized to immobilized primers on a solid surface (flow cell)
Labeled nucleotides are applied to the flow cell & incorporated into growing chains by DNA polymerase at each polony location
Reversible Dye Terminator Sequencing
Uses a pool of labeled oligonucleotide probes & DNA ligase to identify the template sequence through the known probe sequences
Sequencing by Ligation
Does not require fragmentation & amplification of the template DNA
Each nucleotide can be identified by a disruption in current as it passes through the pore
Also used for direct RNA sequencing
Nanopore Sequencing
Optical signals are translated to a nucleotide sequence (base calling); the quality of each call is measured by the Phred score, acceptable = 20-30 (100-1000-fold certainty of a correct call)
Each sequence is compared to a reference sequence (“normal”) through read alignment
Data Analysis
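The Phred score Q is related to the probability P of a wrong base call by Q = -10 × log10(P), so a 1-in-100 error rate corresponds to Q = 20 and a 1-in-1000 error rate to Q = 30. A minimal sketch of the conversion (function names are hypothetical):

```python
import math


def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred score for a base call with error probability p."""
    return -10 * math.log10(p)


for q in (10, 20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: 1 wrong call expected in about {round(1 / p)} bases")
```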
based on comparison w/ the reference sequence (SNVs, indels, rearrangement sequences, CNVs)
Variant ID
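Comparison with the reference can be illustrated for the simplest case, single-nucleotide variants (SNVs). The sketch assumes the read is already aligned to the reference and has the same length, so indels, rearrangements, and CNVs are out of scope; the function name is hypothetical.

```python
def find_snvs(reference, aligned_read, start=1):
    """Return (position, reference base, read base) for each mismatch.

    Positions are 1-based, counted from `start`. Uncalled bases (N) are skipped.
    """
    if len(reference) != len(aligned_read):
        raise ValueError("this sketch only handles equal-length alignments")
    return [
        (start + offset, ref_base, read_base)
        for offset, (ref_base, read_base) in enumerate(zip(reference, aligned_read))
        if ref_base != read_base and read_base != "N"
    ]


print(find_snvs("ACGTACGT", "ACGAACGT"))  # [(4, 'T', 'A')]
```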
Sequence variations from the reference are arranged in a
variant call format (VCF) file
performed to identify critical variants
Annotations
True or false:
Confidence in the variant call is determined by sequence quality & coverage = at least 500x (recommended)
True
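Coverage at a position is the number of reads that overlap it. A minimal sketch, assuming each read is described by a hypothetical (start, length) pair in reference coordinates and using the 500x figure from the card as the threshold:

```python
def depth_at(position, reads):
    """Count reads, given as (start, length), that overlap a position."""
    return sum(1 for start, length in reads if start <= position < start + length)


def meets_coverage(position, reads, minimum=500):
    """Check the position against a minimum depth (500x, as in the card)."""
    return depth_at(position, reads) >= minimum


# Hypothetical pile of reads: 300 starting at base 1, 250 starting at base 50.
reads = [(1, 100)] * 300 + [(50, 100)] * 250
print(depth_at(75, reads), meets_coverage(75, reads))  # 550 True
print(depth_at(10, reads), meets_coverage(10, reads))  # 300 False
```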
Involves using computer technology (in silico) to collect, store, analyze, & disseminate biological data & information (computational biology)
BIOINFORMATICS
System for homology searches
Searches GenBank
Searches can be made of NA & amino acid sequences
Limits & parameters can be added (type of organisms)
Matches/hits = diagram showing alignments & color code
BASIC LOCAL ALIGNMENT SEARCH TOOL (BLAST)
Assigned a universal nomenclature for mixed, degenerate, or wobble bases
International Union of Pure and Applied Chemistry and the International Union of Biochemistry and Molecular Biology (IUB)
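The nomenclature for mixed bases assigns one letter to each set of possible nucleotides (for example, R = A or G, Y = C or T, N = any base). A short sketch that expands a degenerate sequence into every sequence it can stand for; the function name is hypothetical.

```python
from itertools import product

# Standard single-letter codes for mixed (degenerate) bases.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(sequence):
    """List every plain A/C/G/T sequence matched by a degenerate sequence."""
    choices = (IUPAC_CODES[base] for base in sequence)
    return ["".join(combination) for combination in product(*choices)]


print(expand_degenerate("ARY"))  # ['AAC', 'AAT', 'AGC', 'AGT']
```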
to decipher the sequence of the complete human genetic material (entire genome), identify all genes contained within the genome, & provide research tools to analyze all this genetic information
THE HUMAN GENOME PROJECT (HGP) primary mission
THE HUMAN GENOME PROJECT (HGP) established and headed by?
National Institutes of Health (NIH) headed by James Watson
1st complete genome sequence of a clinically important organism (1984)
Epstein-Barr virus
_____ (The Institute for Genomic Research, TIGR) completed the:
1st sequence of a free-living organism (Haemophilus influenzae)
Sequence of the smallest free-living organism (Mycoplasma genitalium)
Craig Venter & colleagues
1st sequence of a free-living organism
Haemophilus influenzae
Sequence of the smallest free-living organism
Mycoplasma genitalium