Module 7.2 Cancer Genome Sequencing 1 Flashcards

Tissue Biopsy

1
Q

cancer genome sequencing

features (3)

A
  • direct sequencing of archived tumor tissues, tumor microenvironments (connective tissue cells), or cell-free DNA samples from blood
  • multiple sequencing types (targeted, RNA, epigenetic, microbial)
  • goal is to assemble a parts list of cancer
2
Q

cancer genome sequencing

parts

A

genetic structures including both DNA and RNA that are altered in cancer

3
Q

The Cancer Genome Atlas
(TCGA)

A
  • landmark cancer genomics program
  • molecularly characterized over 20,000 primary cancers and matched normal samples spanning different cancer types
  • genomic, epigenomic, transcriptomic and proteomic data publicly available through Genomic Data Commons portal
4
Q

tissue biopsy

sampling methods (3)

A
  1. excisional biopsy: the entire lump or suspicious area (possibly with some healthy tissue from the same area) is removed
  2. incisional biopsy: a small cut is made into the area of abnormal tissue and a small sample is removed
  3. needle biopsy: a sample of tissue or fluid is removed with a needle
    - wide needle: core biopsy
    - thin needle: fine needle biopsy
    - tumor purity can be as low as 10-20%
5
Q

tissue biopsy

storage methods (2)

A
  1. Formalin-fixed paraffin-embedded (FFPE)
  2. Fresh frozen (FF)
6
Q

tissue storage method

Formalin-fixed paraffin-embedded
(FFPE)

benefits and drawbacks

A

Benefits
- most common source of archived material
- can be stored in a cabinet at room temperature
- cheap to create
- can be stable for a very long time
- most widely available sample type for tumor sequencing

Drawbacks
- DNA fragmentation and formalin-induced DNA damage lead to sequencing artifacts

7
Q

tissue storage method

Formalin-fixed paraffin-embedded
(FFPE)

features

A
  • formalin and wax preserve fragile structures inside and between the cells in tissue
  • proteins are preserved in denatured form
  • nucleic acids can be isolated but are not preserved very well and may not be ideal for molecular analysis
8
Q

tissue storage method

Fresh frozen
(FF)

benefits and drawbacks

A

Benefits
- works very well for molecular genetic analysis
- best if dipped in liquid nitrogen (flash freezing) and stored at -80 °C

Drawbacks
- surgeons may not have access to liquid nitrogen for flash freeze
- biobanks have smaller frozen tissue collections

9
Q

FFPE

Extraction process

A
  1. Paraffin blocks containing tumor tissue are cut with a microtome into thin slices (5-10 micrometers)
  2. One slice is mounted on a glass slide and stained with H&E to confirm the presence of tumor tissue
  3. A pathologist estimates the tumor content of the tissue as the percentage of nuclei that come from cancer cells, based on cell pathology
  4. If the tumor fraction is low (<20%), microdissection can be performed by superimposing each unstained slice on the H&E template to enrich for tumor content. If the tumor fraction is high, microdissection is not needed.
  5. Dissected areas are deparaffinized and used for DNA extraction
10
Q

sequencing targets

A

whole genome
- most expensive
- covers the whole genome but at limited depth
- hard to detect variants present in a small fraction of cells

whole exome
- only protein-coding genes
- sequencing depths of 100-200x

targeted
- coverage in the thousands (1000x or more)
- may not capture large structural variants with high sensitivity
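
The trade-off between target size and depth on this card can be made concrete with a little arithmetic. The sketch below is illustrative only: the target sizes and depths are assumed round numbers (real exome kits and panels vary), and `required_bases` is a hypothetical helper name.

```python
def required_bases(mean_depth: float, target_size_bp: float) -> float:
    """Bases that must be sequenced to reach a mean depth over a target.

    Ignores duplicates, off-target reads, and uneven coverage.
    """
    return mean_depth * target_size_bp


# Assumed target sizes (bp) and depths; illustrative values, not from the card.
designs = {
    "whole genome": (30, 3.1e9),
    "whole exome": (150, 40e6),
    "targeted panel": (1000, 1.5e6),
}

for name, (depth, size) in designs.items():
    gigabases = required_bases(depth, size) / 1e9
    print(f"{name}: {depth}x over {size:.2e} bp needs about {gigabases:.1f} Gb")
```

Under these assumptions the whole genome needs far more sequencing at 30x than a small panel does at 1000x, which is why deep coverage is usually only affordable on small targets.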

11
Q

cancer genomic sequencing

workflow

A
  1. Extract DNA and convert to sequencing library
  2. Perform paired-end whole-genome sequencing (WGS)
  3. Assess QC metrics
12
Q

matched normal

A
  • normal sample of non-cancer cells originating in the same tissue from the same patient
  • the patient's blood sample (typically white blood cells) can also be used
13
Q

Quality Control
Pre-alignment

metrics (6)

A
  1. % duplicate reads
  2. Base quality scores
  3. % Reads aligned
  4. % Paired GC content
  5. Insert size distribution
  6. PCR duplicates
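
A few of these metrics can be computed directly from raw reads. The sketch below assumes Phred+33 quality encoding (the common FASTQ convention) and uses exact sequence identity as a crude stand-in for duplicate detection; the function names are hypothetical, and real QC tools use more careful definitions.

```python
def mean_base_quality(quality_string: str, offset: int = 33) -> float:
    """Mean Phred score of one read, assuming Phred+33 encoding."""
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


def gc_fraction(sequence: str) -> float:
    """Fraction of bases in the read that are G or C."""
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def duplicate_fraction(sequences: list[str]) -> float:
    """Fraction of reads whose exact sequence was already seen (crude proxy)."""
    return 1 - len(set(sequences)) / len(sequences)


reads = ["ACGT", "ACGT", "TTTT", "GGGG"]
print(gc_fraction("GCAT"), duplicate_fraction(reads), mean_base_quality("IIII"))
```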
14
Q

Quality Control
Post-alignment

sources of mapping errors (6)

A
  • inappropriate reference genome
  • polymorphisms
  • sequencing errors
  • segmental duplications
  • repetitive sequences
  • incomplete reference genome
15
Q

Factors affecting observed VAF

3

A

1. Tumor purity (fraction of tumor cells in the tissue sample; these carry the somatic variants)
2. Intra-tumor heterogeneity (different subclones or wild type normal cells within tumor)
3. Copy number (at locus)
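
One way to see how these three factors combine is a simple mixture formula. This is a sketch under simplifying assumptions: all tumor cells share one total copy number at the locus, normal cells are diploid by default, and the parameter names (`cancer_cell_fraction`, `mutated_copies`) are illustrative rather than taken from the card.

```python
def expected_vaf(
    purity: float,
    cancer_cell_fraction: float,
    mutated_copies: int = 1,
    tumor_copy_number: int = 2,
    normal_copy_number: int = 2,
) -> float:
    """Expected variant allele frequency at one locus.

    purity: fraction of cells in the sample that are tumor cells.
    cancer_cell_fraction: fraction of tumor cells carrying the mutation
        (below 1 for subclonal mutations, i.e. intra-tumor heterogeneity).
    mutated_copies: copies of the mutant allele per mutated cell.
    """
    mutant_alleles = purity * cancer_cell_fraction * mutated_copies
    total_alleles = purity * tumor_copy_number + (1 - purity) * normal_copy_number
    return mutant_alleles / total_alleles


# Clonal heterozygous mutation in a pure diploid tumor
print(expected_vaf(purity=1.0, cancer_cell_fraction=1.0))
# Same mutation, but only 40% of the cells in the sample are tumor cells
print(expected_vaf(purity=0.4, cancer_cell_fraction=1.0))
```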

16
Q

Cancer genomic sequencing

Variant allele frequency

features

A
  • VAF = # of reads supporting candidate mutation / read depth at position
  • key determinant in finding a somatic variant
  • example: a heterozygous subclonal mutation present in 20% of diploid tumor cells gives 10% VAF in a pure tumor sample; at 60x depth that is about 6 variant reads (about 3 reads if tumor purity is 50%)
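
The read-count arithmetic in the example can be checked with a short script. The binomial model and the minimum of 3 supporting reads are assumptions made here for illustration (real callers use more elaborate models), and the function names are hypothetical.

```python
from math import comb


def expected_variant_reads(vaf: float, depth: int) -> float:
    """Expected number of reads supporting the variant at a given depth."""
    return vaf * depth


def prob_at_least(k: int, depth: int, vaf: float) -> float:
    """P(X >= k) for X ~ Binomial(depth, vaf): chance of seeing at least k variant reads."""
    below_k = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i) for i in range(k)
    )
    return 1 - below_k


depth = 60
for vaf in (0.10, 0.05):  # pure tumor vs 50% purity, as in the card's example
    reads = expected_variant_reads(vaf, depth)
    chance = prob_at_least(3, depth, vaf)  # assumed threshold of 3 reads
    print(f"VAF {vaf:.2f}: ~{reads:.0f} expected reads, P(>=3 reads) = {chance:.2f}")
```

The second number shows why low-VAF variants are easy to miss: the expected count is only an average, and sampling variation can push the observed count below a caller's threshold.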
17
Q

candidate somatic mutations

A
  • genomic positions where the alternate allele supported by tumor reads is not present in the matched normal sample
  • SNVs and indels are the most common types
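
At its simplest, the definition above is a set difference between tumor and normal calls. The sketch below is a toy version: the `Variant` type and the example coordinates are made up for illustration, and real somatic callers compare read evidence statistically rather than comparing finished call sets.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def candidate_somatic(tumor_calls: set, normal_calls: set) -> set:
    """Variants seen in the tumor but absent from the matched normal."""
    return tumor_calls - normal_calls


# Hypothetical calls for illustration only
tumor = {Variant("chr1", 100, "C", "T"), Variant("chr2", 200, "G", "A")}
normal = {Variant("chr2", 200, "G", "A")}  # shared call, so likely germline
print(candidate_somatic(tumor, normal))
```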
18
Q

subclonal mutation

A

mutation that is present in a subset of tumor cells in a tumor sample or biopsy

19
Q

transversion

A

point mutation that changes purine to pyrimidine and vice versa
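
A small sketch of the definition: a substitution is a transversion when the reference and alternate bases belong to different classes (purine vs pyrimidine), and a transition otherwise. The function name is illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


print(substitution_class("C", "A"))  # pyrimidine to purine
print(substitution_class("C", "T"))  # pyrimidine to pyrimidine
```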

20
Q

tumor ploidy

A
  • amount of DNA in tumor cell
  • diploid: normal amount of DNA; tends to grow more slowly
  • aneuploid: abnormal amount of DNA
  • helps determine how malignant a tumor is
21
Q

removing germline variants in sequencing without matched normal

A
  • normal tissue not always available in clinical applications
  • filter annotated SNPs found in databases such as dbSNP and gnomAD (SNVs may still be incorrectly identified because of workflow-specific artifacts or patient-specific germline variants)
  • use panel of normals collected from different individuals, but processed in same way as tumor samples
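
The two filters can be sketched as simple lookups. Everything here is a hypothetical stand-in: `population_db` represents a set of variants loaded from a resource such as dbSNP or gnomAD, `panel_of_normals` maps each variant to the number of normal samples in which it was seen, and the threshold is an assumed value.

```python
def remove_likely_germline(
    calls: list,
    population_db: set,
    panel_of_normals: dict,
    max_normals_with_variant: int = 0,
) -> list:
    """Keep calls that are absent from the population database and not
    recurrent in the panel of normals (assumed threshold, illustrative only)."""
    kept = []
    for variant in calls:
        if variant in population_db:
            continue  # annotated polymorphism, likely germline
        if panel_of_normals.get(variant, 0) > max_normals_with_variant:
            continue  # recurs in normals: germline or workflow artifact
        kept.append(variant)
    return kept


# Hypothetical variant keys of the form "chrom:pos:ref>alt"
calls = ["chr1:100:C>T", "chr2:200:G>A", "chr3:300:A>G"]
population_db = {"chr2:200:G>A"}
panel_of_normals = {"chr3:300:A>G": 4}
print(remove_likely_germline(calls, population_db, panel_of_normals))
```

Because the panel of normals is processed the same way as the tumor samples, the second filter also removes recurrent artifacts of the workflow, not only germline variants.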
22
Q

Variant error sources

4

A
  1. Library preparation: DNA polymerase for DNA synthesis and amplification can induce artifacts
  2. Oxidative damage (C>A): guanine oxidation during fragmentation by shearing can lead to low-frequency G>T transversions, which appear as C>A on the complementary strand
  3. FFPE-induced DNA damage (C>T): DNA fragmentation and base changes induced by formaldehyde, especially deamination of cytosine, which is read as thymine, produce high noise levels of C>T changes
  4. Sample contamination: the matched normal sample may be contaminated by tumor cells, or a tumor sample may be contaminated by normal DNA from a different patient
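
One common way to look for the damage patterns in items 2 and 3 is to tally substitutions after collapsing each one onto its pyrimidine reference base, so that G>T counts as C>A and G>A counts as C>T. The sketch below only does the tallying; the helper names and the example calls are hypothetical, and an excess of one class is a hint to investigate rather than proof of an artifact.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def collapse_substitution(ref: str, alt: str) -> str:
    """Express a substitution relative to the pyrimidine (C or T) reference base."""
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def substitution_spectrum(snvs: list) -> Counter:
    """Count SNVs in each of the six collapsed substitution classes."""
    return Counter(collapse_substitution(ref, alt) for ref, alt in snvs)


# Hypothetical (ref, alt) pairs for illustration only
example_calls = [("C", "T"), ("G", "A"), ("G", "T"), ("C", "T"), ("A", "G")]
print(substitution_spectrum(example_calls))
```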
23
Q

artifacts

A

variations introduced by non-biological processes

24
Q

CNV detection

A
  1. segment genome into regions with distinct copy numbers using statistical techniques
  2. use matched control or normalization to statistically remove bias
  3. More advanced methods incorporate minor allele frequencies inferred from heterozygous SNPs for segmentation and to detect allele specific copy number variations
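
The normalization in step 2 is often expressed as a log2 ratio of tumor to normal read depth per genomic bin. The sketch below covers only that step, with a simple library-size normalization assumed here for illustration; segmentation (step 1) and allele-specific analysis (step 3) are not implemented, and the bin counts are made up.

```python
import math


def log2_ratios(tumor_counts: list, normal_counts: list) -> list:
    """Per-bin log2(tumor/normal) after scaling each sample by its total reads.

    Bins with zero reads in either sample are returned as None.
    """
    tumor_total = sum(tumor_counts)
    normal_total = sum(normal_counts)
    ratios = []
    for tumor, normal in zip(tumor_counts, normal_counts):
        if tumor == 0 or normal == 0:
            ratios.append(None)
            continue
        ratios.append(math.log2((tumor / tumor_total) / (normal / normal_total)))
    return ratios


# Hypothetical read counts in four equal-sized bins; the last bin looks gained
print(log2_ratios([100, 100, 100, 200], [100, 100, 100, 100]))
```

Values near 0 suggest no change relative to the normal, positive values suggest gains, and negative values suggest losses; a segmentation method would then group adjacent bins with similar ratios.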
25
Q

B allele frequency
(BAF)

A
  • BAF is an estimate of the frequency of B allele of a given SNP in population of cells from which DNA was extracted
  • In normal cell, BAF at any locus is either 0 (AA), 0.5 (AB) or 1 (BB) and the expected log R ratio is 0.
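
Both quantities on this card are simple ratios. In the sketch below, BAF is estimated from allele-specific read counts and the log R ratio as log2 of observed over expected signal; this follows the usual definitions, but the function names and example counts are illustrative.

```python
import math


def b_allele_frequency(a_reads: int, b_reads: int) -> float:
    """Fraction of reads at a SNP that support the B allele."""
    total = a_reads + b_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return b_reads / total


def log_r_ratio(observed_signal: float, expected_signal: float) -> float:
    """log2 of observed over expected total signal; 0 means no copy-number change."""
    return math.log2(observed_signal / expected_signal)


print(b_allele_frequency(30, 30))  # heterozygous AB in normal cells
print(b_allele_frequency(40, 0))   # homozygous AA
print(log_r_ratio(100.0, 100.0))   # normal copy number
```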
26
Q

Structural variant detection

A
  • identified by split reads and clusters of discordant read pairs
  • breakpoint junctions often show complex patterns -> poor alignment
  • structural variant algorithms include local assembly step
  • contigs assembled from raw reads improve read mapping and characterization of inserted sequences at breakpoints
  • read-depth data can provide additional information to improve detection of deletions and amplifications
  • somatic structural variants are harder to detect because of low VAF
  • number of supporting reads will fluctuate due to non-uniform read coverage across genome and sampling variation
  • dynamic determination of appropriate threshold (eg. number of supporting split reads) depending on local context + various filters to increase detection sensitivity
27
Q

split read

A
  • a read in which only one end aligns to a given location in the reference genome; the rest aligns elsewhere or not at all
  • provide evidence of a breakpoint and type of structural variant present in sample genome
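
In aligned data, such reads typically show up as alignments with a clipped end. The sketch below parses a SAM-style CIGAR string and reports clipping at either end; the `min_clip` threshold of 20 bases is an assumed value and the function names are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_ends(cigar: str) -> tuple:
    """Return (left_clip, right_clip) lengths from the outermost CIGAR operations."""
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    if not ops:
        return (0, 0)
    left = ops[0][0] if ops[0][1] in "SH" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] in "SH" else 0
    return (left, right)


def is_split_read_candidate(cigar: str, min_clip: int = 20) -> bool:
    """True when one end of the read is clipped by at least min_clip bases."""
    return max(clipped_ends(cigar)) >= min_clip


print(clipped_ends("60M40S"), is_split_read_candidate("60M40S"))
print(clipped_ends("100M"), is_split_read_candidate("100M"))
```

The position where the aligned part ends and the clipped part begins marks a candidate breakpoint.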
28
Q

discordant read pair

A

read pair whose two reads map to different chromosomes, in incompatible orientations, or at a distance outside the expected insert-size range of the sequencing library
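
The three criteria can be written as a small check. This sketch assumes a standard paired-end library in which the upstream mate is on the forward strand and the downstream mate on the reverse strand, approximates insert size by the distance between mate start positions, and uses an assumed 100-600 bp range; `Mate` and `discordance_reasons` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Mate:
    chrom: str
    pos: int
    is_reverse: bool


def discordance_reasons(
    mate1: Mate, mate2: Mate, min_insert: int = 100, max_insert: int = 600
) -> list:
    """List the reasons a read pair is discordant (empty list means concordant)."""
    if mate1.chrom != mate2.chrom:
        return ["different chromosomes"]
    reasons = []
    left, right = sorted((mate1, mate2), key=lambda mate: mate.pos)
    if left.is_reverse or not right.is_reverse:
        reasons.append("unexpected orientation")
    distance = right.pos - left.pos  # rough proxy for insert size
    if not min_insert <= distance <= max_insert:
        reasons.append("distance outside expected insert-size range")
    return reasons


print(discordance_reasons(Mate("chr1", 1000, False), Mate("chr1", 1300, True)))
print(discordance_reasons(Mate("chr1", 1000, False), Mate("chr8", 5000, True)))
print(discordance_reasons(Mate("chr1", 1000, False), Mate("chr1", 50000, True)))
```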

29
Q

Copy-neutral LOH

A
  • copy-neutral loss of heterozygosity
  • one allele is lost and replaced by a copy of the other allele
  • same copy number but now homozygous for allele 1 or 2
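
In terms of allele-specific copy number, copy-neutral LOH is the state where the total is still 2 but one allele has 0 copies. The sketch below assumes a diploid baseline and uses illustrative labels; the function name is hypothetical.

```python
def allele_state(major_copies: int, minor_copies: int) -> str:
    """Label a segment from its allele-specific copy numbers (diploid baseline assumed)."""
    if minor_copies > major_copies or minor_copies < 0:
        raise ValueError("expected major_copies >= minor_copies >= 0")
    total = major_copies + minor_copies
    if major_copies == 1 and minor_copies == 1:
        return "heterozygous, normal copy number"
    if minor_copies == 0 and total == 2:
        return "copy-neutral LOH"
    if minor_copies == 0 and total > 0:
        return "LOH with copy-number change"
    return "other"


print(allele_state(1, 1))  # one copy of each allele
print(allele_state(2, 0))  # two copies of one allele, none of the other
print(allele_state(1, 0))  # one allele deleted without replacement
```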
30
Q

tumor profiling

A

Guide patient care
a. Predictive biomarkers for therapy selection (can be either tumor-subtype-specific or tumor-agnostic)
b. Assist with cancer subtype diagnosis
c. Identify variants that confer increased heritable cancer risk

Basic clinical research
a. Cancer pathogenesis and progression
b. Biomarker discovery
c. New drug development

31
Q

EGFR Leu858Arg

A
  • epidermal growth factor receptor
  • leucine replaced with arginine at position 858
  • predictive biomarker for use of EGFR tyrosine kinase inhibitors in treating non-small cell lung cancer patients
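
The name on this card follows the three-letter protein notation (reference amino acid, position, new amino acid); the same change is often written in one-letter form as L858R. The sketch below converts between the two for simple substitutions only; the function name is hypothetical and other variant types are not handled.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Optional "p." prefix, three-letter residue, position, three-letter residue
SUBSTITUTION = re.compile(r"^(?:p\.)?([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def short_protein_change(change: str) -> str:
    """Convert e.g. 'Leu858Arg' to one-letter form (simple substitutions only)."""
    match = SUBSTITUTION.match(change.strip())
    if match is None:
        raise ValueError(f"unsupported notation: {change}")
    ref, position, alt = match.groups()
    return f"{THREE_TO_ONE[ref]}{position}{THREE_TO_ONE[alt]}"


print(short_protein_change("Leu858Arg"))
```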
32
Q

DNA biomarker

A

molecule that indicates a normal or abnormal process taking place in the body and may be a sign of an underlying condition or disease

33
Q

MSI-H and TMB-H

A

Biomarkers associated with tumor instability and response to anti-PD-1 immune checkpoint inhibitors for treating multiple cancer types

  • Microsatellite Instability-High
    associated with deficiency in mismatch repair
  • Tumor Mutation Burden High
    >10 mutations per megabase of DNA
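
The TMB definition is a rate, so it depends on how much sequence was actually assayed. The sketch below uses the card's cutoff of more than 10 mutations per megabase; the mutation count and panel size in the example are made up, and the function names are hypothetical.

```python
def tumor_mutation_burden(mutation_count: int, sequenced_region_bp: float) -> float:
    """Mutations per megabase of sequenced region."""
    return mutation_count / (sequenced_region_bp / 1e6)


def is_tmb_high(tmb: float, threshold: float = 10.0) -> bool:
    """Apply the card's cutoff: more than 10 mutations per megabase."""
    return tmb > threshold


# Hypothetical example: 45 somatic mutations found across a 1.5 Mb panel
tmb = tumor_mutation_burden(45, 1.5e6)
print(tmb, is_tmb_high(tmb))
```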