Next-Generation Sequencing of Cancer Genomes Flashcards
What does the central dogma of molecular biology describe?
The flow of genetic information within a biological system
What are the three forms of DNA sequencing?
- Whole genome sequencing
Sequencing of the entire genome
Can be either de novo sequencing or resequencing
- Whole exome sequencing
Sequencing of the exome, formed by the exons (coding regions)
- Targeted or hot-spot sequencing
Sequencing of a panel of a select number of genes
The human genome is approximately 3.2 Giga base pairs (Gbp)
True or false
True
What is required to minimise sequencing costs?
Careful planning of sequencing objective
Of gene panel, whole genome and whole exome sequencing, which has the highest depth of coverage?
Gene panel
What are the advantages and disadvantages of whole genome and whole exome sequencing?
Whole genome sequencing: low depth of coverage, high breadth of coverage
Whole exome sequencing: high depth of coverage, low breadth of coverage
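The depth versus breadth trade-off can be illustrated with the usual estimate of mean depth: total sequenced bases divided by the size of the target. This is a minimal sketch; the read count, read length and exome target size below are illustrative assumptions, not figures from these cards.

```python
def mean_depth(num_reads, read_length, target_size):
    """Estimate mean depth of coverage as total sequenced bases / target size."""
    return num_reads * read_length / target_size

GENOME_BP = 3_200_000_000  # ~3.2 Gbp human genome
EXOME_BP = 60_000_000      # assumed exome target size, for illustration only
READS = 400_000_000        # assumed number of reads in the run
READ_LEN = 150             # assumed read length in bp

# Same sequencing effort spread over the whole genome vs. the exome only
wgs_depth = mean_depth(READS, READ_LEN, GENOME_BP)
wes_depth = mean_depth(READS, READ_LEN, EXOME_BP)
print(f"WGS depth: {wgs_depth:.2f}x, WES depth: {wes_depth:.2f}x")
```

With the same number of reads, the smaller target receives proportionally more reads per base, which is why exome and panel sequencing reach a higher depth.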
What is the difference between single and paired end sequencing?
Single-end (SE)
- Only one end sequenced
Paired-end (PE)
- Both ends sequenced
- Precise alignment in repeat regions
- Mapping of structural variations
1. What format is raw sequencing data stored in?
2. What does this contain?
1. FASTQ format: text-based storage of biological sequences
2. Phred score: quality measure for a given nucleotide
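As a worked example, a Phred score Q relates to the base-calling error probability P by Q = -10 log10(P). The sketch below decodes the quality line of a FASTQ record, assuming the common Phred+33 ASCII encoding; the record itself is made up for illustration.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]

def error_probability(q):
    """Convert a Phred score into the probability that the base call is wrong."""
    return 10 ** (-q / 10)

# A made-up four-line FASTQ record: header, sequence, separator, qualities
record = ["@read1", "GATTACA", "+", "IIII5+!"]
scores = phred_scores(record[3])

for base, q in zip(record[1], scores):
    print(base, q, error_probability(q))
```

A score of 20 corresponds to a 1 in 100 chance of a wrong base call, and a score of 40 to 1 in 10,000.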
Why is quality control needed?
To remove poor quality reads
- Adapter contamination
- Poor quality base calling at the start or towards the end of sequencing reads
1. What does quality control involve?
2. What tool enables this?
1. Trimming of sequencing reads
- Removal of adapter sequences
- Removal of low-quality bases
2. Trimmomatic (Bolger et al. Bioinformatics. 2014)
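The two trimming steps can be sketched on a single read as follows. This is a simplified illustration, not Trimmomatic's algorithm; the function name `trim_read`, the quality threshold and the adapter sequence are all assumptions made for the example.

```python
def trim_read(seq, quals, adapter, min_q=20):
    """Remove an adapter (and everything after it), then trim low-quality ends."""
    # Adapter removal: cut the read at the first exact adapter occurrence
    idx = seq.find(adapter)
    if idx != -1:
        seq, quals = seq[:idx], quals[:idx]

    # Quality trimming: drop bases below min_q from the end, then the start
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    start = 0
    while start < end and quals[start] < min_q:
        start += 1
    return seq[start:end], quals[start:end]

adapter = "AGATCGG"                      # hypothetical adapter sequence
seq = "ACGTACGT" + adapter               # read that runs into the adapter
quals = [10, 30, 30, 30, 30, 30, 15, 12] + [30] * len(adapter)

trimmed_seq, trimmed_quals = trim_read(seq, quals, adapter)
print(trimmed_seq, trimmed_quals)
```

Real tools also handle partial and mismatched adapters and use sliding-window quality criteria, which this sketch leaves out.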
1. After quality control what steps are pursued?
2. What is the most common tool to facilitate this?
1. Resequencing
- Requires a pre-built reference genome
Aligning sequencing reads
- Match each read to its most probable location in the reference genome
- Allow an error margin for single nucleotide polymorphisms (SNPs) and single nucleotide variations (SNVs): do not expect all sequencing reads to align perfectly to the genome
2. Burrows-Wheeler Aligner (Li and Durbin. Bioinformatics. 2009)
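The idea of matching a read to its most probable location while tolerating a few mismatches can be sketched with a brute-force scan. This is only an illustration of the concept: the Burrows-Wheeler Aligner uses an index based on the Burrows-Wheeler transform rather than scanning, and the function name, mismatch limit and sequences here are assumptions.

```python
def best_alignment(read, reference, max_mismatches=2):
    """Return (position, mismatches) of the best ungapped match, or None."""
    best = None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        # Count positions where the read differs from the reference window
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return best

reference = "TTGACCGATTACAGGCTA"   # toy reference genome
read = "GATTTCA"                   # differs from the reference by one base

print(best_alignment(read, reference))
```

The single mismatch reported for this read is the kind of difference that a variant caller would later evaluate as a candidate SNV.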
What are variations described as?
A difference in a nucleotide between a sequencing read and the reference genome
Variations are described by what?
What is the most common tool that facilitates this?
A measure of confidence and the variant allele frequency
Mutect2
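As a worked example, the variant allele frequency (VAF) at a position is the fraction of reads covering that position that carry the variant allele. The read counts below are made up, and the function is an illustrative sketch rather than Mutect2's calculation.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads at a position that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

# Hypothetical position: 80 reads cover it, 12 of them carry the variant
vaf = variant_allele_frequency(alt_reads=12, total_reads=80)
print(f"VAF = {vaf:.2f}")
```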
What is a substitution?
Replacement of nucleotide base
1. What are insertions and deletions?
2. What are they commonly referred to as?
1. Addition or removal of nucleotide bases
2. Indels
What does a synonymous (silent) mutation result in?
No change in the amino acid
What does a non-synonymous (non-silent) mutation result in?
A change in the amino acid
- Missense: different amino acid
- Nonsense: early stop codon
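The three outcomes of a substitution within a codon can be sketched by translating the codon before and after the change with the standard genetic code. The function name `classify_substitution` and the example codons are chosen for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" is a stop
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_substitution(ref_codon, alt_codon):
    """Classify a single-codon change as synonymous, missense or nonsense."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"

print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("GAG", "GTG"))  # Glu -> Val
print(classify_substitution("TGG", "TGA"))  # Trp -> stop
```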
Insertions and deletions in a coding region can result in what?
A frameshift, if the length is not a multiple of 3
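The multiple-of-3 rule can be checked directly from the reference and alternate alleles of an indel. A minimal sketch, with a hypothetical function name and made-up alleles:

```python
def is_frameshift(ref_allele, alt_allele):
    """True if the net inserted or deleted length is not a multiple of 3."""
    return abs(len(alt_allele) - len(ref_allele)) % 3 != 0

print(is_frameshift("A", "AT"))      # 1 bp insertion
print(is_frameshift("A", "ATGC"))    # 3 bp insertion
print(is_frameshift("ACGTA", "A"))   # 4 bp deletion
```

An in-frame indel still adds or removes amino acids, but the codons downstream of it are read unchanged.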
The cancer genome can contain structural variations and single nucleotide variations
True or false
True
What are structural variations?
Changes of large (> 1 kbp) regions of a chromosome or even whole chromosomes
Includes duplications, deletions and other rearrangements (e.g. translocations)
What tool can be used to detect structural variations?
Delly (Rausch et al. Bioinformatics. 2012)
How is structural variation identified?
What are examples?
By paired-end reads
- Read pair: The distance (insert size) between read 1 and read 2 can be informative. If the two reads map further apart on the reference than the expected insert size, this indicates a deletion; if they map closer together than expected, this indicates an insertion
- Split read: Can be used to refine the exact breakpoint within the genome. Refers to partial alignment of a sequencing read, e.g. a read that is split after alignment to the reference genome can indicate the exact location of a deletion
- Read-depth: If there is an increase in the number of sequencing reads observed over the average coverage for a particular genomic locus, this would indicate that a duplication is present. In contrast if number of sequencing reads is lower than the average coverage a deletion has occurred
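The read-depth signal can be sketched by comparing the mean depth of a locus with the sample's average coverage. This is a minimal illustration, not how Delly works; the ratio thresholds (1.5 and 0.5) and the depth values are assumptions chosen for the example.

```python
def classify_depth(locus_depths, average_coverage, gain=1.5, loss=0.5):
    """Flag a locus by the ratio of its mean depth to the average coverage."""
    locus_mean = sum(locus_depths) / len(locus_depths)
    ratio = locus_mean / average_coverage
    if ratio >= gain:
        return "duplication"
    if ratio <= loss:
        return "deletion"
    return "normal"

AVERAGE_COVERAGE = 30  # assumed genome-wide mean depth

print(classify_depth([58, 61, 63, 60], AVERAGE_COVERAGE))  # about twice average
print(classify_depth([14, 16, 15, 13], AVERAGE_COVERAGE))  # about half average
print(classify_depth([29, 31, 30, 32], AVERAGE_COVERAGE))  # close to average
```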
What are driver alterations?
Somatic alterations that are causally implicated in oncogenesis
Confer selective advantages during the evolution of a cancer
What are passenger alterations?
Do not confer any selective advantage in cancer development
Have no functional implications