week 3: WES Flashcards
purpose of training - whole exome sequencing
learn more
give more solutions to customers
increase sales opportunity
central dogma of biology
DNA -> transcription -> pre-mRNA -> processing (splicing, capping, polyadenylation) -> mRNA -> translation (at the ribosome) -> protein
WES/WGS variant analyses are performed on DNA
mRNA-seq expression analysis is performed on mRNA
Exon vs intron
exon - a segment of a DNA or RNA molecule containing coding information for a protein or peptide sequence
intron - a segment of a DNA or RNA molecule which does not code for protein and interrupts the coding sequence of a gene
introns are spliced out of the pre-mRNA, so they are absent from the mature mRNA that is translated into protein
service overview - review
The exome is composed of the exons within the genome: the sequences which, when transcribed, remain within the mature RNA after introns are removed by RNA splicing and contribute to the final protein product encoded by that gene
1-2% of the genome
Can be done with any species, but a capture kit must exist.
Novogene only performs WES for mouse and human. We use Agilent for both
research use only (RUO)
Agilent SureSelect V6 58M for Human
Agilent SureSelect Mouse All Exon for Mouse
Clinical
Human Clinical WES (CAP/CLIA)
- US CLIA Certified (Agilent SureSelect V6 58M)
- US CLIA Certified (Twist V2)
- China CAP Certified (IDT V1, xGen Panel)
Lab Locations:
Human WES -> China Lab & U.S. Lab (preferred)
Mouse WES -> China Lab only
Talking Points
WES vs. WGS
Whole exome sequencing cannot:
- Look at introns
- Detect epistatic interactions (gene-gene interactions)
- Look at structural variants (SV, CNV), i.e. chromosome-level changes
When is whole genome sequencing better:
- The client is looking for novel mutations
- The client needs more uniform sequencing coverage
Exome sequencing allows higher sequencing coverage at a lower cost, because only the exome is sequenced
Workflow
Starting with genomic DNA, samples are sheared into small DNA fragments
Libraries are prepared with Illumina-compatible adapters and indices
Biotinylated cRNA baits are incubated with the library for 16 hours
Targeted regions are selected using magnetic streptavidin beads
Targeted regions are amplified, producing a sequencing-ready library
The library is sequenced on the NovaSeq 6000
Sample Requirements
human WES
for genomic DNA
- amount (Qubit): ≥ 300 ng
- volume: ≥ 15 µL
- concentration: ≥ 15 ng/µL
- purity: OD 260/280 = 1.8-2.0, no degradation, no contamination
for FFPE
- amount (Qubit): ≥ 400 ng
- volume: ≥ 20 µL
- concentration: ≥ 20 ng/µL
- purity: fragments longer than 1000 bp
for cfDNA/ccDNA
- amount (Qubit): ≥ 35 ng
- volume: ≥ 20 µL
- concentration: ≥ 0.5 ng/µL
- purity: fragments of 170 bp or its multiples, no genomic DNA contamination
mouse WES
for genomic DNA
- amount (Qubit): ≥ 300 ng
- volume: ≥ 15 µL
- concentration: ≥ 15 ng/µL
- purity: OD 260/280 = 1.8-2.0, no degradation, no contamination
for FFPE
- amount (Qubit): ≥ 400 ng
- volume: ≥ 20 µL
- concentration: ≥ 20 ng/µL
- purity: fragments longer than 1000 bp
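The quantity minimums on the sample requirement cards can be collected into a small lookup for a quick check before quoting. This is only a sketch: the names `Minimums`, `REQUIREMENTS`, and `check_sample` are hypothetical, the `cfDNA` key stands in for the cfDNA/ccDNA card, and the purity criteria (OD ratio, fragment size, contamination) are not checked here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Minimums:
    amount_ng: float        # total amount measured by Qubit
    volume_ul: float        # sample volume in microliters
    conc_ng_per_ul: float   # concentration in ng per microliter


# Values copied from the sample requirement cards; the keys are hypothetical labels.
REQUIREMENTS = {
    ("human", "gDNA"): Minimums(300, 15, 15),
    ("human", "FFPE"): Minimums(400, 20, 20),
    ("human", "cfDNA"): Minimums(35, 20, 0.5),
    ("mouse", "gDNA"): Minimums(300, 15, 15),
    ("mouse", "FFPE"): Minimums(400, 20, 20),
}


def check_sample(species, material, amount_ng, volume_ul, conc_ng_per_ul):
    """Return a list of unmet minimums; an empty list means all three are met."""
    key = (species.lower(), material)
    if key not in REQUIREMENTS:
        raise ValueError(f"no WES requirement listed for {species} {material}")
    m = REQUIREMENTS[key]
    problems = []
    if amount_ng < m.amount_ng:
        problems.append(f"amount {amount_ng} ng is below {m.amount_ng} ng")
    if volume_ul < m.volume_ul:
        problems.append(f"volume {volume_ul} ul is below {m.volume_ul} ul")
    if conc_ng_per_ul < m.conc_ng_per_ul:
        problems.append(
            f"concentration {conc_ng_per_ul} ng/ul is below {m.conc_ng_per_ul} ng/ul"
        )
    return problems


# A human gDNA sample that meets every minimum, then a mouse FFPE sample that does not.
print(check_sample("human", "gDNA", amount_ng=350, volume_ul=20, conc_ng_per_ul=17.5))
print(check_sample("mouse", "FFPE", amount_ng=300, volume_ul=20, conc_ng_per_ul=15))
```

A material that has no card, such as mouse cfDNA, raises a `ValueError` instead of silently passing.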
Analysis Pipeline
- Data quality control: filtering out reads that contain adapter sequence or are of low quality
- Alignment with reference, statistics of sequencing depth and coverage
- SNP and InDel calling, annotation and statistics
- Somatic variant detection (applies only to tumor-normal paired samples):
  - SNP calling, annotation and statistics
  - InDel calling, annotation and statistics
  - CNV calling, annotation and statistics
coverage
Data Output – Raw vs. On-Target
6Gb/sample = 50X
12Gb/sample = 100X
How many Gb of data are needed to get 200X coverage?
data amount (Gb) = coverage (X) * 0.12, so 200X = 24Gb/sample
Because of the lack of uniformity in WES capture, a raw coverage of 100X does not guarantee that every exon is covered at 100X.
If the client requires a MINIMUM of 100X, you must encourage them to sequence to a higher depth (usually to 150X)
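A minimal sketch of the coverage arithmetic, assuming the 0.12 Gb per 1X rule of thumb from the card applies to the kit being quoted. The function names are hypothetical and only illustrate the calculation in both directions.

```python
# Rule of thumb from the card: data amount (Gb) = coverage (X) * 0.12
GB_PER_X = 0.12


def data_gb_for_coverage(coverage_x):
    """Return the raw data amount in Gb per sample for a target raw coverage."""
    if coverage_x <= 0:
        raise ValueError("coverage must be positive")
    return round(coverage_x * GB_PER_X, 2)


def coverage_for_data_gb(data_gb):
    """Return the raw coverage (X) that a given data amount in Gb corresponds to."""
    if data_gb <= 0:
        raise ValueError("data amount must be positive")
    return round(data_gb / GB_PER_X, 1)


# Print the data amount for a few common coverage targets.
for x in (50, 100, 150, 200):
    print(f"{x}X -> {data_gb_for_coverage(x)} Gb/sample")
```

Under this rule, 200X works out to 24 Gb/sample, and the 150X usually suggested for a 100X minimum works out to 18 Gb/sample.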
Human Whole Exome Sequencing Pricing
Mouse Whole Exome Sequencing
Quotation Checklist
PI Name/Name for Quote:
Species:
Sample Number:
Coverage:
Material Provided:
Bioinformatics Analysis: Yes/No
Project Design
FFPE samples: a higher depth of coverage is recommended because the DNA is of lower quality
Whole exome sequencing is species-specific: a different reference genome cannot be used, because the analysis is based on the specific reference used to design the capture probes
For paired comparisons of tumor samples, a matched normal sample is needed (e.g. for CNV calling)
what is WES
Whole Exome Sequencing (WES) is a high-throughput genetic sequencing technique used to capture and analyze the protein-coding regions of an individual’s genome, known as the exome. WES studies only the exome, while WGS looks at the whole genome, which includes the exome, introns, and intergenic regions.
The exome constitutes only about 1-2% of the entire human genome, but it contains approximately 85% of the known disease-causing genetic variants, making it a cost-effective approach for identifying genetic variations that may be associated with diseases, particularly rare genetic disorders and cancers.
WES vs. WGS
The targeted approach of WES allows deep sequencing of specific genetic regions, reducing the amount of data generated and therefore the cost compared to WGS
WES
- Scope: WES targets and sequences the protein-coding regions of the genome, known as exons, which make up about 1-2% of the entire genome.
- Coverage: It does not capture or analyze non-coding regions of the genome, including introns, intergenic regions, and regulatory elements.
- Cost: WES is generally more cost-effective than WGS since it focuses on a smaller portion of the genome. The reduced data volume can result in lower sequencing and analysis costs (~$200-$300).
- Application: WES is commonly used in clinical settings to identify genetic mutations associated with various diseases, especially rare genetic disorders.
WGS
- Scope: WGS sequences the entire genome, including exons, introns, intergenic regions, and regulatory elements.
- Coverage: It provides comprehensive coverage of the entire genome, making it suitable for detecting variations in both coding and non-coding regions.
- Cost: WGS is more expensive than WES due to the larger amount of data generated and the broader scope of sequencing ($400-$900).
- Application: WGS is suitable for a wide range of applications, including identifying coding and non-coding variants, structural variations, and copy number variations. It is also more suitable for population genetic studies.