7: Investigating Cancer Genomes Flashcards

Question 1

Q

What is a main motivation for large-scale cancer genome sequencing studies?

Answer

A

Identify cancer driver genes
They provide cells with a growth advantage when mutated

Question 2

Q

Why are cancer biobanks paired with blood samples from the same patient so valuable to clinicians?

Answer

A

They show a patient's progression or regression, and can reveal what is happening from diagnosis through to follow-up or relapse

Question 3

Q

At what points through the cancer patient’s journey are blood samples taken?

Answer

A

Diagnosis
Surgery
Chemo start
Chemo end
Follow-up
Relapse, if applicable

Question 4

Q

Outline the steps of the NGS data analysis workflow

Answer

A

Assessment of Quality
Aligning sequences
Identifying variants
Annotating variants
Visualizing NGS data

Question 5

Q

What is NGS?

Answer

A

Next-generation sequencing

An emerging technology that determines DNA/RNA sequences for the whole genome or for specific regions of interest

Question 6

Q

What happens during assessment of quality in NGS data analysis?

Answer

A

NGS reads are evaluated so that reads that don’t meet quality standards can be removed, corrected, or trimmed.
Problems include base-calling errors and poor-quality reads.
This is mostly automated.
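
A minimal sketch of the trimming idea, assuming FASTQ-style quality strings with the common Phred+33 encoding. The function names, the quality threshold of 20, and the minimum length of 5 are illustrative assumptions, not values from the course material.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into per-base Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20):
    """Remove bases from the 3' end until one meets the quality threshold."""
    scores = phred_scores(quality_string)
    end = len(sequence)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


def passes_length_filter(sequence, min_length=5):
    """Discard reads that are too short after trimming (threshold is hypothetical)."""
    return len(sequence) >= min_length


# Example read: the last two bases have very low quality ('#' = 2, '!' = 0).
read_seq = "ACGTACGTAC"
read_qual = "IIIIIIII#!"
trimmed_seq, trimmed_qual = trim_3prime(read_seq, read_qual)
print(trimmed_seq, passes_length_filter(trimmed_seq))
```

In practice this step is handled by dedicated quality-control tools rather than hand-written code.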

Question 7

Q

What are NGS reads?

Answer

A

Short sequences produced by randomly fragmenting the genome and sequencing the fragments; the reads are then aligned or re-assembled

Question 8

Q

What happens during the aligning sequences phase of NGS data analysis?

Answer

A

Reads are aligned to a reference genome, e.g. one from the GRC (Genome Reference Consortium).

Question 9

Q

What happens during the identifying variants phase of NGS data analysis?

Answer

A

Compares the difference between patient tumour and reference.
Sequence coverage is important at this stage, as identified mutations should be supported by multiple reads.
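
The idea of requiring multiple supporting reads can be sketched as a toy pileup. This assumes ungapped reads given as (0-based start, sequence) pairs; the function name and the threshold of 3 supporting reads are hypothetical, and real variant callers use statistical models rather than a fixed count.

```python
from collections import Counter


def call_variants(reference, aligned_reads, min_supporting_reads=3):
    """Report positions where a non-reference base is seen in enough reads.

    aligned_reads: list of (start, sequence), 0-based start, no indels (assumption).
    Returns tuples of (position, ref_base, alt_base, supporting_reads, depth).
    """
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    variants = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        for base, n in sorted(counts.items()):
            if base != reference[pos] and n >= min_supporting_reads:
                variants.append((pos, reference[pos], base, n, depth))
    return variants


reference = "ACGTACGTAC"
reads = [
    (0, "ACGTAC"),
    (1, "CGTACG"),
    (2, "GTTCGT"),  # carries T instead of A at position 4
    (3, "TTCGTA"),  # carries T instead of A at position 4
    (4, "TCGTAC"),  # carries T instead of A at position 4
]
print(call_variants(reference, reads))
```

With a stricter threshold (for example 4 supporting reads) the same site would not be reported, which is why low coverage limits what can be called.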

Question 10

Q

What is Coverage in NGS?

Answer

A

The average number of reads that align to/cover known reference bases
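
As a worked example, average coverage is often estimated as (number of reads × read length) / genome size. The sketch below assumes that standard estimate; the read length of 150 bp and the genome size of 3 billion bases are illustrative round numbers, not figures from the course material.

```python
def average_coverage(num_reads, read_length, genome_size):
    """Estimate average coverage as total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest whole number of reads that reaches the target average coverage."""
    total_bases = target_coverage * genome_size
    return -(-total_bases // read_length)  # integer ceiling division


# Illustrative values: 600 million reads of 150 bp over a 3 Gb genome.
print(average_coverage(600_000_000, 150, 3_000_000_000))
print(reads_needed(60, 150, 3_000_000_000))
```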

Question 11

Q

What is depth of coverage?

Answer

A

The number of reads of a given nucleotide in an experiment
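
A minimal sketch of per-base depth, assuming each aligned read is summarised as a half-open (start, end) interval on the reference with 0-based coordinates. The function name and the toy alignments are hypothetical.

```python
def depth_per_base(reference_length, alignments):
    """Count how many aligned reads overlap each reference position.

    alignments: list of (start, end) half-open intervals, 0-based (assumption).
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (reference_length + 1)
    for start, end in alignments:
        start = max(start, 0)
        end = min(end, reference_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    depth = []
    running = 0
    for i in range(reference_length):
        running += diff[i]
        depth.append(running)
    return depth


alignments = [(0, 4), (2, 6), (3, 8)]
depth = depth_per_base(8, alignments)
print(depth)                    # depth at each position
print(sum(depth) / len(depth))  # average coverage across the region
```

Averaging the per-base depths over a region gives the average coverage for that region.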

Question 12

Q

Why is coverage an important factor for variant detection?

Answer

A

Determines whether a discovery can be made with a certain degree of confidence at particular base positions
The higher the depth of coverage, the more likely you are to find all mutations

Question 13

Q

What are the preferred depths of coverage for normal versus cancer DNA?

Answer

A

Normal: 30x
Cancer: 60x

Question 14

Q

Why is the preferred depth of coverage for Cancer DNA higher than for normal DNA?

Answer

A

Due to tumour heterogeneity.

Some parts of the tumour may carry a mutation whilst other areas do not, whereas normal tissues tend to be more homogeneous. More coverage is therefore needed to detect mutations that are present in only some tumour cells.
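
One way to see why depth matters for a heterogeneous sample is a simple binomial model: if a mutation is present in only a fraction of reads, the chance of seeing enough supporting reads rises with depth. The sketch below assumes independent reads, a hypothetical variant allele fraction of 10%, and a hypothetical requirement of at least 3 supporting reads; it ignores sequencing error.

```python
from math import comb


def detection_probability(depth, variant_allele_fraction, min_supporting_reads=3):
    """Probability of observing at least min_supporting_reads variant reads.

    Models the number of variant reads as Binomial(depth, variant_allele_fraction).
    """
    p = variant_allele_fraction
    p_too_few = sum(
        comb(depth, k) * p**k * (1 - p) ** (depth - k)
        for k in range(min_supporting_reads)
    )
    return 1 - p_too_few


# A subclonal mutation seen in 10% of reads (hypothetical value).
for depth in (30, 60):
    expected_reads = depth * 0.1
    print(depth, expected_reads, detection_probability(depth, 0.1))
```

Under these assumptions, doubling the depth from 30x to 60x doubles the expected number of supporting reads from 3 to 6 and raises the probability of detection.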

Question 15

Q

What are the 3 different groups of genomic changes in cancer?

Answer

A

small variants (SNPs, indels; < 50 bp change)
copy number alterations (amplifications, deletions)
structural variations (inversions, translocations)

Question 16

Q

What happens during the annotating variants phase of NGS data analysis?

Answer

A

Identifies disease-causing variants. SNPs and indels are annotated using computational annotation tools.

Question 17

Q

What happens during the visualizing NGS data phase of NGS data analysis?

Answer

A

Use visualization tools and genome browsers to visualize variants.
Obtain information about variants.

Question 18

Q

What information can we obtain by visualizing variants?

Answer

A

Mapping quality
Aligned reads
Annotation information (consequence, impact of the variant, scores from annotation tools)

Question 19

Q

How may a genome be visualized using the UCSC Genome Browser?

Answer

A

Gene as a long horizontal line, and exons as small vertical lines along it.
Arrows to denote direction of gene from promoter to 3’ end.

Question 20

Q

What is the overall idealized pipeline for cancer genome analysis?

Answer

A

Sequence data prep and processing (sequencing of matched tumour/normal DNA, alignment to reference genome)
Dissect and catalogue genomic changes (Nucleotide changes, copy number alterations, structural variation)
Consequence analysis (recurrent changes, significantly altered genes, biological pathways)

Question 21

Q

What is the challenge of genome sequencing glioblastomas?

Answer

A

There is strong intra- and inter-tumoural heterogeneity

Question 22

Q

What % of the mutations in cancer occur in non-coding parts of the genome?

Question 23

Q

Outline the idea of evolutionary conservation

Answer

A

Genome positions that have been conserved for 100 million years are likely to be important and to have a specific function
We can observe conserved invariant sequences across species and evolution
We can use this to identify novel candidate driver genes