mixed SAQs Flashcards
a) Name three file types used in an NGS analysis pipeline (3)
3 from: FASTQ
BAM or SAM or CRAM
VCF
BED
b) For each of these file types describe their contents and use. (6) fastq, bam, vcf, bed
FASTQ- Text file containing sequence reads and associated quality information
Standard format containing all reads from sequencing. Can be analysed to generate quality metrics, and used as input for read alignment tools.
BAM or SAM or CRAM- aligned/mapped reads and associated quality information
Output of read alignment. Can be analysed to generate quality metrics.
VCF - data lines containing information about a position in the genome, usually variants. May also include annotations
Output of variant calling. Annotations may be added prior to variant filtering and analysis.
BED - Genomic regions (chromosome, start and end)
Used to define the regions of interest for the assay.
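A minimal sketch of how two of these formats are read in practice, assuming Phred+33 quality encoding for FASTQ and the standard 0-based, end-exclusive coordinates for BED. The function names and the example record are made up for illustration.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred quality of one read, assuming Phred+33 encoding."""
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


def bed_interval_length(bed_line):
    """Length of a BED interval: start is 0-based, end is exclusive."""
    chrom, start, end = bed_line.split("\t")[:3]
    return int(end) - int(start)


# A made-up four-line FASTQ record: header, sequence, separator, qualities.
record = ["@read1", "ACGT", "+", "IIII"]
print(mean_phred(record[3]))                  # 'I' is ASCII 73, so every base is Q40
print(bed_interval_length("chr1\t100\t200"))  # 100 bases
```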
c) NGS analysis often involves aligning short DNA sequences (reads) to a reference genome. Give two reasons why a read might not align correctly to the reference. (2)
Two from:
Read maps to multiple locations in the reference genome (e.g. pseudogene)
Reference genome is incomplete so sequence is missing (e.g. centromeric regions)
Errors introduced during sequencing
Variants in the sequence compared to reference
d) Reads that do not map uniquely to the reference genome (i.e. map to more than one location) are given a mapping quality (MAPQ) score of 0 and may be excluded from downstream analysis. Explain possible reasons for non-unique mapping and what impact this might have on the clinical use of NGS. (3)
Duplicated regions of the genome (segmental duplications, pseudogenes) can result in the same sequence being present in 2 or more locations in the genome. NGS sequence reads that map to these duplicated regions will not have unique mapping and therefore may be removed from downstream analyses. If clinically relevant genes have a pseudogene it may be difficult to get sufficient coverage of the gene for variant calling. Alternatively, called variants may be in the pseudogene and not the gene itself. An alternate method may be required to confirm results in these genes such as long range PCR.
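The effect on coverage can be sketched with a toy example. The mapping quality values below are invented, and the function name and the cut-off of 1 are assumptions for illustration, not settings from any particular pipeline.

```python
def usable_depth(mapq_values, min_mapq=1):
    """Count reads at a position whose mapping quality is at least min_mapq."""
    return sum(1 for q in mapq_values if q >= min_mapq)


# Hypothetical mapping qualities for ten reads overlapping one position in a
# gene with a near-identical pseudogene: most reads map ambiguously (0).
mapqs = [0, 0, 60, 0, 60, 0, 0, 60, 0, 0]
print(len(mapqs), usable_depth(mapqs))  # raw depth 10, usable depth 3
```

Only three of the ten reads survive the filter, which may be too few for reliable variant calling at that position.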
e) Give an example of a gene and an associated genetic disorder that might be difficult to analyse by NGS because reads do not map uniquely to the reference (2)
Possible examples: SMN1 and Spinal Muscular Atrophy or PMS2 and Lynch Syndrome
(both have pseudogenes)
Briefly describe paired-end sequencing and explain the advantages of paired-end over single-end sequencing for detecting variants associated with human disease. (4)
paired-end sequencing- Sequence both ends of the DNA fragment.
Paired-end sequencing can be useful for detecting structural variants (deletions, insertions or inversions)- read pairs mapping to different locations in the genome give information about the position of that sequence. This is not possible with single-end sequencing. Structural variants are a common cause of genetic variation and therefore genetic disease.
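As a rough sketch of how read pairs hint at structural variants: a pair is compared against the expected orientation and insert size for the library. The expected range of 200-600 bp, the function name and the labels are assumptions for illustration only; real callers combine many pairs and other evidence.

```python
def classify_pair(same_chromosome, inward_facing, insert_size,
                  expected_range=(200, 600)):
    """Label one read pair using simple orientation and insert-size rules."""
    low, high = expected_range
    if not same_chromosome:
        return "possible translocation"
    if not inward_facing:
        return "possible inversion"
    if insert_size > high:
        return "possible deletion"   # mates map further apart than expected
    if insert_size < low:
        return "possible insertion"  # mates map closer together than expected
    return "concordant"


print(classify_pair(True, True, 400))    # concordant
print(classify_pair(True, True, 5000))   # possible deletion
print(classify_pair(True, False, 400))   # possible inversion
```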
Describe the underlying genetic cause of Fragile X syndrome?
FRAX is an X-linked dominant (with reduced penetrance in females) triplet repeat
expansion disorder caused by a CGG repeat expansion within the 5’ UTR of the FMR1
gene on the X-chromosome. When the triplet repeat expands beyond a threshold
(>200 repeats), this causes hypermethylation of the FMR1 promoter and silencing of the gene
describe PCR for sizing?
The sizing PCR is a standard PCR with a F & R primer (one of which is fluorescently
labelled). Products are separated by capillary electrophoresis and sized against a
molecular ladder.
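A worked sketch of turning a sized product into a repeat number, in the fragile X context above. The flanking length of 221 bp is a made-up value (it depends on the primers used). Only the >200 full-mutation threshold comes from these notes; the other size bands are commonly quoted ranges and should be checked against local guidelines.

```python
def cgg_repeats(fragment_bp, flanking_bp):
    """Estimate triplet-repeat number from a sized PCR product.

    flanking_bp is the non-repeat sequence between the primers (assay-specific).
    """
    return round((fragment_bp - flanking_bp) / 3)


def fmr1_category(repeats):
    """Assign a size band; cut-offs below 200 are assumed, commonly quoted values."""
    if repeats > 200:
        return "full mutation"
    if repeats >= 55:
        return "premutation"
    if repeats >= 45:
        return "intermediate"
    return "normal"


repeats = cgg_repeats(fragment_bp=311, flanking_bp=221)  # hypothetical sizes
print(repeats, fmr1_category(repeats))  # 30 normal
```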
describe TP-PCR
TP-PCR uses F & R primers (again one of which is fluorescently labelled) and also a third
primer which is specific to the triplet repeat. The third primer is added in a limited
manner so that it is exhausted in early PCR rounds. This is to avoid preferential
amplification of smaller alleles. The products from the TP-PCR are also separated by
capillary electrophoresis and sized. A full expansion allele gives a classic ‘ski-slope’
pattern which tails off towards the larger end of the repeat.
a) List three differences between the nuclear and mitochondrial genomes
The mitochondrial genome (~16.5kb) is a fraction of the size of the nuclear genome
The mitochondrial genome is a small circular molecule
Mitochondrial DNA is maternally-inherited only.
The mitochondrial genome has no introns and very few genes (37)
Describe the inheritance patterns associated with mitochondrial disease
Mitochondrial disease can be caused by pathogenic variants in the mtDNA itself (maternally
inherited only) or by pathogenic variants in nuclear genes involved in mitochondrial DNA
maintenance which can be autosomal dominant or recessive
Define the terms heteroplasmy, homoplasmy and mitochondrial bottleneck
Heteroplasmy – where two or more different variants of mtDNA exist within a cell
Homoplasmy – where all copies of the mtDNA are identical within a cell.
Mitochondrial bottleneck – a random shift of mtDNA mutational load between generations
(and even siblings) due to unequal transfer of mtDNA molecules during oogenesis
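Heteroplasmy is usually reported as a percentage. A minimal sketch, assuming the level is estimated from sequencing read counts at the variant position; the function name and the counts are illustrative.

```python
def heteroplasmy_percent(variant_reads, total_reads):
    """Percentage of mtDNA reads at a position that carry the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100 * variant_reads / total_reads


print(heteroplasmy_percent(350, 1000))   # 35.0, heteroplasmic
print(heteroplasmy_percent(1000, 1000))  # 100.0, homoplasmic for the variant
```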
Describe 3 considerations for interpretation of pathogenicity unique to mtDNA variants
There are currently no mitochondrial DNA specific guidelines for interpreting variants.
Inheritance pattern (maternal or nuclear)
Population databases used (Mitomap instead of gnomAD for example)
Heteroplasmy/homoplasmy levels in the proband compared with the mother – a homoplasmic variant inherited from a homoplasmic, unaffected mother is unlikely to be disease-causing
Clinicians have referred an adult presenting with optic neuropathy to the highly specialised mitochondrial diagnostic service. Describe the appropriate testing pathway and any relevant candidate genes and variants for targeted analysis
Optic neuropathy is a generic term and can be caused by pathogenic variants in mtDNA
(such as Leber’s hereditary optic neuropathy (LHON)) or nuclear DNA. There are common
LHON variants which can be easily identified/excluded such as m.11778G>A (MT-ND4),
m.3460G>A (MT-ND1) and m.14484T>C (MT-ND6).
If these are negative, full gene screens of the three genes above can follow.
If the full gene screens are negative, a nuclear gene eye panel may be appropriate.
Name the gene responsible for encoding mitochondrial DNA polymerase
POLG (polymerase gamma)
What is copy number variation?
A loss or gain of a region of the genome (could be single exon, multi-exon, whole
gene or multiple genes).
What types of genetic/genome abnormalities can oligoarray NOT detect
Uniparental disomy
Balanced translocations
Triploidy
Describe the differences between a SNP and oligo array?
An oligo array uses the patient and a sex-matched control sample which compete for
hybridisation to the probes on the array slide. The patient and the control DNAs are
labelled with different fluorescent dyes and the captured image is converted to show
whether the patient has a gain or loss compared to the control sample.
SNP arrays use thousands of known SNP positions across the genome. The patient is
genotyped at each SNP position as homozygous (AA or BB) or heterozygous (AB), and
the copy number of a region is determined from the ratio of heterozygous to
homozygous SNPs across it
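For the oligo array, the patient-versus-control comparison is commonly expressed as a log2 ratio. The sketch below uses idealised copy numbers in place of measured fluorescence signals, which is an assumption for illustration; real data are noisier.

```python
import math


def log2_ratio(patient_signal, control_signal):
    """log2 of patient signal over control signal for one probe."""
    return math.log2(patient_signal / control_signal)


# Idealised signals proportional to copy number (control has 2 copies).
print(log2_ratio(2, 2))            # 0.0, no change
print(log2_ratio(1, 2))            # -1.0, heterozygous loss
print(round(log2_ratio(3, 2), 3))  # 0.585, single-copy gain
```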
Briefly explain the use of the 3 resources/databases that you would use to aid interpretation of the clinical significance of a copy number change.
Database of Genomic Variants (DGV) – the DGV ‘gold standard’ track provides
information on the frequency of your copy number variant in the population. For
example, a CNV with a frequency of 0.80% in a population of 17,000 would be too
high to be disease causing.
ClinGen – This resource provides information on dosage pathogenicity and gives a
haploinsufficiency score and a triplosensitivity score for each gene in the CNV call.
For example, a haploinsufficiency score of 3 (sufficient evidence) for a gene within a
deletion would strongly support pathogenicity.
DECIPHER – Large database with an international patient cohort. This can be used to
determine if your CNV has been seen before, the phenotype of the patient/s with
this CNV, the reporting laboratory and any overlapping features with similar
patients.
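The DGV example above can be checked with simple arithmetic: a frequency of 0.80% in 17,000 people corresponds to 136 carriers. The function name is illustrative.

```python
def expected_carriers(frequency_percent, cohort_size):
    """Number of individuals carrying a variant at the given population frequency."""
    return round(cohort_size * frequency_percent / 100)


print(expected_carriers(0.80, 17000))  # 136
```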
Briefly describe a known microdeletion syndrome region involving chromosome 16; include location, key gene(s) involved and provide two clinical features.
16p11.2 microdeletion syndrome which includes the TBX6 gene. Patients with this
disorder have intellectual disability, developmental delay and some also have autistic
behaviours.
Many newly described microdeletion or microduplication syndromes detected by
microarray are subject to reduced penetrance and variable expressivity. Define these
terms.
Reduced penetrance – Not all people with the genetic change will display the
features associated with that disorder.
Variable expressivity – The phenotype of the disorder is variable amongst
individuals, even those within the same family.
a) Give 3 clinical features of Prader Willi syndrome.
Intellectual disability
Obesity
Hypotonia in infancy
Hyperphagia
Short stature
Strabismus
Give 3 clinical features of Angelman syndrome.
Seizures
Characteristic hand movements
Inappropriate laughter
Define the chromosomal region associated with Prader-Willi and Angelman syndromes.
15q11.2-q13
List the different mechanisms that may result in Angelman syndrome together with
their recurrence risk for future pregnancies.
Paternal chromosome 15 UPD – <1-2%
Maternal deletion of the 15q11.2-q13 region (likely de novo, low recurrence unless
germline mosaicism)
UBE3A pathogenic variant – 50%
Imprinting control centre deletion (up to 50% recurrence)
Imprinting control centre defect (<1%)
Describe two testing methods by which a Prader-Willi case may be genetically
confirmed.
Methylation-specific PCR
Methylation-specific MLPA
Name 3 other imprinting disorders.
Beckwith-Wiedemann syndrome
Silver-Russell syndrome
Temple syndrome
Transient neonatal diabetes
A 5 month old infant girl presents with a diagnosis of B-lineage Acute Lymphoblastic
Leukaemia (ALL)
Describe the priority setting and testing strategy for this sample?
This would be classed as an urgent referral
FISH for the common t(4;11) (MLL gene and AF4) translocation found in infant ALL – if
negative proceed to ETV6/RUNX1 FISH which will also detect iAMP21 and BCR/ABL1
FISH
As BCR-ABL1, ETV6-RUNX1, iAMP21 and MLL rearrangements are thought
to be mutually exclusive, if one abnormality is detected it is not mandatory to
exclude others.
If there is normal/failed result, ALL BPGs suggest additional FISH to detect hidden
hyper/hypodiploidy.
Extract DNA for SNP array
RNA extraction for fusion panel analysis
A 5 month old infant girl presents with a diagnosis of B-lineage Acute Lymphoblastic
Leukaemia (ALL)
What would be the most common/likely chromosome abnormality detected in this
patient and give the prognosis. Include ISCN for abnormality and name the genes
involved.
46,XX,t(4;11)(q21;q23) – poor prognosis, MLL gene and AF4
c) The patient is negative for the above abnormality. The ETV6-RUNX1 dual fusion probe for this patient shows two fusion, one red, and one green signal pattern (2F1R1G). What does this pattern suggest the patient carries and what prognosis does it confer?
They are positive for the ETV6/RUNX1 fusion which confers a favourable prognosis.
If the ETV6-RUNX1 dual fusion FISH probe showed a loss of the green ETV6 signal
(2F1R0G), what would this indicate loss of and how would this affect the prognosis?
Loss of the normal chromosome 12 – prognosis does not change.
e) Give two other chromosomal abnormalities seen in ALL that could be detected using the ETV6-RUNX1 probe other than the rearrangement in part C).
Loss of the red RUNX1 signal would indicate loss of the normal chromosome 21
Amplification of the red signal would indicate iAMP 21 (intrachromosomal amplification of chromosome 21)
describe the process of QF-PCR?
amplification of STRs on chromosomes of interest used to determine copy number
● STRs (short tandem repeats or microsatellites) are a pattern of 2 - 6 bp that are repeated directly adjacent to each other.
● STRs are known highly polymorphic markers; a patient is therefore likely to have different numbers of repeat units on each allele.
- 4 markers for each chromosome of interest and sex chromosome markers if referral indicates sex chromosome abnormality eg. AMEL, SRY and DXYS218(Xp) and X22
- quantitative PCR - extracted DNA is added to a multiplex of fluorescently labelled primers and amplified. The reaction must be quantitative to detect copy number, so the PCR is stopped while still in the exponential phase: “During the exponential phase of the reaction, the amount of product is directly proportional to amount of template”
- rapid, cheap, small quantities of DNA needed
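A sketch of how allele peak ratios for one STR marker might be interpreted. The numeric thresholds (0.8-1.4 for normal, 0.65 or below and 1.8 or above for trisomy) are illustrative assumptions; a laboratory would use its own validated ranges. The function name and labels are also hypothetical.

```python
def classify_marker(peak_areas, normal_range=(0.8, 1.4)):
    """Interpret one STR marker from its allele peak areas (shorter allele first)."""
    if len(peak_areas) == 1:
        return "uninformative"  # a single peak gives no ratio to assess
    if len(peak_areas) == 3:
        # Three alleles; this sketch does not check that they are close to 1:1:1.
        return "triallelic, consistent with trisomy"
    ratio = peak_areas[0] / peak_areas[1]
    if normal_range[0] <= ratio <= normal_range[1]:
        return "normal diallelic"
    if ratio >= 1.8 or ratio <= 0.65:
        return "diallelic 2:1 or 1:2, consistent with trisomy"
    return "inconclusive"


print(classify_marker([1000, 950]))       # normal diallelic
print(classify_marker([2000, 1000]))      # diallelic 2:1 or 1:2, consistent with trisomy
print(classify_marker([900, 950, 1000]))  # triallelic, consistent with trisomy
```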
following a positive qf-PCR result, what other tests would you offer?
- sample identity must be confirmed prior to reporting (N.B. by a repeat test of the original sample or genotype comparison with a maternal blood sample)
- karyotype to visualize abnormal rearrangements
- normal results based on a single marker should be confirmed by a second method, e.g. karyotype or FISH.
what are Characteristics of cffDNA?
- originates from the placenta, which sheds highly fragmented DNA into the maternal circulation during normal apoptosis
- makes up only part of the total cell free DNA in maternal plasma (up to 20%)
- reliably detected from 7 weeks
- increases with increasing gestation
- rapidly cleared from circulation within an hour after delivery
- fetal DNA is shorter -approximately 200bp for fetal fragments and larger for maternal fragments.
how do you calculate fetal fraction?
For a male fetus: divide the number of reads mapped to chromosome Y by the total number of reads mapped to the autosomes. This approach cannot be used for a female fetus.
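The ratio described above can be sketched as follows. The read counts are invented and the function name is illustrative. Turning this raw ratio into a reported fetal fraction percentage would need calibration, which is assumed here and not shown.

```python
def chr_y_read_ratio(chr_y_reads, autosomal_reads):
    """Reads mapped to chromosome Y divided by reads mapped to the autosomes."""
    if autosomal_reads <= 0:
        raise ValueError("autosomal_reads must be positive")
    return chr_y_reads / autosomal_reads


# Hypothetical counts from one maternal plasma sample (male fetus).
print(chr_y_read_ratio(2_000, 10_000_000))
```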
define heteroplasmy
where two or more mtDNA variants exist within the same cell