SAQs Flashcards
SAQ
Question
Answer
a) Name three file types used in an NGS analysis pipeline (3)
Three from: FASTQ; BAM, SAM or CRAM; VCF; BED
b) For each of these file types describe their contents and use. (6) fastq, bam, vcf, bed
FASTQ: text file containing sequence reads and associated quality information. Standard format containing all reads from sequencing. Can be analysed to generate quality metrics, and used as input for read alignment tools.
BAM, SAM or CRAM: aligned/mapped reads and associated quality information. Output of read alignment. Can be analysed to generate quality metrics.
VCF: data lines containing information about a position in the genome, usually variants. May also include annotations. Output of variant calling. Annotations may be added prior to variant filtering and analysis.
BED: genomic regions (chromosome, start and end). Used to define the regions of interest for the assay.
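As an illustration of the FASTQ and BED contents described above, here is a minimal Python sketch. It assumes the common Phred+33 quality encoding and the standard three-column BED layout (0-based start, end not included); the function names and the example read are made up for illustration.

```python
def parse_fastq_record(lines):
    """Parse one 4-line FASTQ record into (read_id, sequence, phred_scores)."""
    header, sequence, separator, quality = [line.rstrip("\n") for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    # Assumption: Phred+33 encoding, so score = ASCII code - 33.
    phred_scores = [ord(ch) - 33 for ch in quality]
    return header[1:], sequence, phred_scores


def parse_bed_line(line):
    """Parse the first three BED columns into (chrom, start, end)."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return chrom, int(start), int(end)


# Hypothetical example data
record = ["@read1\n", "GATTACA\n", "+\n", "IIIIII#\n"]
read_id, sequence, phred_scores = parse_fastq_record(record)
mean_quality = sum(phred_scores) / len(phred_scores)

chrom, start, end = parse_bed_line("chr7\t100\t250\n")
region_length = end - start  # BED intervals are half-open
```

A mean quality per read, as computed here, is one of the simple metrics a QC step could report before alignment.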
c) NGS analysis often involves aligning short DNA sequences (reads) to a reference genome. Give two reasons why a read might not align correctly to the reference. (2)
Two from: read maps to multiple locations in the reference genome (e.g. pseudogene); reference genome is incomplete so sequence is missing (e.g. centromeric regions); errors introduced during sequencing; variants in the sequence compared to the reference.
d) Reads that do not map uniquely to the reference genome (i.e. map to more than one location) are given a mapping quality (MAPQ) of 0 and may be excluded from downstream analysis. Explain possible reasons for non-unique mapping and what impact this might have on the clinical use of NGS. (3)
Duplicated regions of the genome (segmental duplications, pseudogenes) can result in the same sequence being present in two or more locations in the genome. NGS reads that map to these duplicated regions will not have unique mapping and may therefore be removed from downstream analyses. If a clinically relevant gene has a pseudogene, it may be difficult to get sufficient coverage of the gene for variant calling. Alternatively, called variants may be in the pseudogene and not the gene itself. An alternative method, such as long-range PCR, may be required to confirm results in these genes.
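The effect of excluding reads with a mapping quality of 0 can be sketched in Python as below. This is an illustrative sketch, not a production filter: it reads SAM-format text lines directly (MAPQ is the fifth tab-separated column), and the read names and coordinates are invented.

```python
def split_by_mapq(sam_lines, min_mapq=1):
    """Return (kept, dropped) read names based on the SAM MAPQ column."""
    kept, dropped = [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        read_name = fields[0]
        mapq = int(fields[4])  # column 5 of a SAM record is MAPQ
        if mapq >= min_mapq:
            kept.append(read_name)
        else:
            dropped.append(read_name)
    return kept, dropped


# Hypothetical alignments: read_b sits in a duplicated region (MAPQ 0)
sam_lines = [
    "@HD\tVN:1.6\tSO:coordinate\n",
    "read_a\t0\tchr7\t1000\t60\t7M\t*\t0\t0\tGATTACA\tIIIIIII\n",
    "read_b\t0\tchr7\t5000\t0\t7M\t*\t0\t0\tGATTACA\tIIIIIII\n",
]
kept, dropped = split_by_mapq(sam_lines)
```

Counting the dropped reads per target region is one way to see where a pseudogene is reducing usable coverage.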
e) Give an example of a gene and an associated genetic disorder that might be difficult to analyse by NGS because reads do not map uniquely to the reference (2)
Possible examples: SMN1 and spinal muscular atrophy, or PMS2 and Lynch syndrome (both have pseudogenes).
Briefly describe paired-end sequencing and explain the advantages of paired-end over single-end sequencing for detecting variants associated with human disease. (4)
Paired-end sequencing: both ends of each DNA fragment are sequenced. It can be useful for detecting structural variants (deletions, insertions or inversions), because read pairs that map at an unexpected distance, orientation or location in the genome give information about the position of that sequence. This is not possible with single-end sequencing. Structural variants are a common cause of genetic variation and therefore genetic disease.
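How read pairs hint at structural variants can be sketched as follows. This is a simplified illustration only: the expected insert-size window (200 to 600 bp), the function name and the labels are assumptions, and real structural variant callers combine many pairs with split-read and depth evidence.

```python
def classify_pair(chrom1, pos1, strand1, chrom2, pos2, strand2,
                  min_insert=200, max_insert=600):
    """Label a read pair by comparing its mapping with the expected library layout."""
    if chrom1 != chrom2:
        return "interchromosomal"   # mates on different chromosomes
    if strand1 == strand2:
        return "inversion_signal"   # mates should map to opposite strands
    span = abs(pos2 - pos1)
    if span > max_insert:
        return "deletion_signal"    # mates further apart than the fragment size
    if span < min_insert:
        return "insertion_signal"   # mates closer together than expected
    return "concordant"


# Hypothetical read pairs
normal_pair = classify_pair("chr1", 1000, "+", "chr1", 1400, "-")
far_pair = classify_pair("chr1", 1000, "+", "chr1", 51000, "-")
```

A single discordant pair is not evidence on its own; the signal comes from clusters of pairs showing the same discordance at one locus.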
Describe the underlying genetic cause of fragile X syndrome.
Fragile X syndrome (FRAX) is an X-linked triplet repeat expansion disorder caused by a CGG repeat expansion within the 5' UTR of the FMR1 gene on the X chromosome. When the triplet repeat expands beyond a threshold (>200 repeats), this causes hypermethylation of the FMR1 promoter and silencing of the gene.
Describe PCR for sizing.
The sizing PCR is a standard PCR with a forward and a reverse primer (one of which is fluorescently labelled). Products are separated by capillary electrophoresis and sized against a molecular size ladder.
Describe TP-PCR.
TP-PCR uses forward and reverse primers (again, one of which is fluorescently labelled) and also a third primer which is specific to the triplet repeat. The third primer is added in a limited amount so that it is exhausted in early PCR rounds. This is to avoid preferential amplification of smaller alleles. The products from the TP-PCR are also separated by capillary electrophoresis and sized. A full expansion allele gives a classic 'ski-slope' pattern which tails off towards the larger end of the repeat.
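Converting a sized PCR product into a repeat count is simple arithmetic, sketched below. The flanking length of 221 bp is a made-up placeholder (it depends on the primers used), and the normal, intermediate and premutation boundaries are commonly quoted category cut-offs that are assumptions here; only the >200 full-mutation threshold is stated in the answer above.

```python
FLANKING_BP = 221  # hypothetical: non-repeat sequence in the amplicon, primer dependent


def cgg_repeats(fragment_bp, flanking_bp=FLANKING_BP):
    """Estimate the CGG repeat number from a sized PCR product."""
    return round((fragment_bp - flanking_bp) / 3)  # each repeat unit is 3 bp


def classify_cgg(repeats):
    """Assign an allele category from its repeat count."""
    if repeats > 200:   # threshold given in the answer above
        return "full mutation"
    if repeats >= 55:   # assumed boundary
        return "premutation"
    if repeats >= 45:   # assumed boundary
        return "intermediate"
    return "normal"


# Hypothetical 311 bp product
repeats = cgg_repeats(311)
category = classify_cgg(repeats)
```

Large expansions are often not amplified by a standard sizing PCR, which is one reason TP-PCR is run alongside it.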
a) List three differences between the nuclear and mitochondrial genomes
- The mitochondrial genome is a fraction of the size of the nuclear genome (~16.5 kb)
- The mitochondrial genome is a small circular molecule
- Mitochondrial DNA is maternally inherited only
- The mitochondrial genome has no introns and very few genes (~37)
Describe the inheritance patterns associated with mitochondrial disease
Mitochondrial disease can be caused by pathogenic variants in the mtDNA itself (maternally inherited only) or by pathogenic variants in nuclear genes involved in mitochondrial DNA maintenance, which can be autosomal dominant or recessive.
Define the terms heteroplasmy, homoplasmy and mitochondrial bottleneck.
- Heteroplasmy: where two or more different variants of mtDNA exist within a cell
- Homoplasmy: where all copies of the mtDNA are identical within a cell
- Mitochondrial bottleneck: a random shift of mtDNA mutational load between generations (and even siblings) due to unequal transfer of mtDNA molecules during oogenesis
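In sequencing data, the heteroplasmy level at a position is usually expressed as the percentage of reads carrying the variant. The sketch below shows that arithmetic; the 98% cut-off used to call a variant apparently homoplasmic is an arbitrary assumption for illustration, as real cut-offs depend on the assay's error rate and depth.

```python
def heteroplasmy_percent(variant_reads, total_reads):
    """Percentage of reads at an mtDNA position that carry the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * variant_reads / total_reads


def describe_level(percent, homoplasmy_cutoff=98.0):
    """Label a variant level; the cut-off is an assumed, assay-dependent value."""
    if percent >= homoplasmy_cutoff:
        return "apparently homoplasmic"
    if percent > 0:
        return "heteroplasmic"
    return "variant not detected"


# Hypothetical read counts for a proband and their mother
proband_level = heteroplasmy_percent(350, 1000)
mother_level = heteroplasmy_percent(40, 1000)
```

Comparing the two levels in this way illustrates the shift in mutational load that the bottleneck can produce between generations.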
Describe 3 considerations for interpretation of pathogenicity unique to mtDNA variants
There are currently no mitochondrial DNA-specific guidelines for interpreting variants.
- Inheritance pattern (maternal or nuclear)
- Population databases used (Mitomap instead of gnomAD, for example)
- Check heteroplasmy/homoplasmy levels in the proband versus the mother: a homoplasmic variant inherited from a homoplasmic unaffected mother is unlikely to be disease-causing
Clinicians have referred an adult presenting with optic neuropathy to the highly specialised mitochondrial diagnostic service. Describe the appropriate testing pathway and any relevant candidate genes and variants for targeted analysis
Optic neuropathy is a generic term and can be caused by pathogenic variants in mtDNA (such as in Leber's hereditary optic neuropathy (LHON)) or nuclear DNA. There are common LHON variants which can be easily identified or excluded, such as m.11778G>A (MT-ND4), m.3460G>A (MT-ND1) and m.14484T>C (MT-ND6). If these are negative, full gene screens can commence for each of the three genes mentioned above. If the full gene screens are negative, a nuclear-based eye panel may be appropriate.
Name the gene responsible for encoding mitochondrial DNA polymerase
POLG (polymerase gamma)
What is copy number variation?
A loss or gain of a region of the genome (could be a single exon, multiple exons, a whole gene or multiple genes).
What types of genetic/genome abnormalities can an oligo array NOT detect?
Uniparental disomy; balanced translocations; triploidy
Describe the differences between a SNP array and an oligo array.
An oligo array uses the patient sample and a sex-matched control sample, which compete for hybridisation to the probes on the array slide. The patient and control DNAs are labelled with different fluorescent dyes and the captured image is converted to show whether the patient has a gain or loss compared to the control sample. SNP arrays use thousands of known SNP positions across the genome, and each SNP is genotyped as AA homozygous, BB homozygous, or AB/BA heterozygous. The patient is genotyped at each SNP position, and the copy number is determined from the ratio of heterozygous to homozygous SNP calls.
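The two array read-outs can be illustrated with a little arithmetic. The sketch below gives idealised expected values only: a log2 patient/control ratio for an oligo array, and the possible B-allele fractions at a SNP for a given copy number. Real data are noisy and shifted by mosaicism and normalisation; the function names are invented.

```python
import math


def expected_log2_ratio(patient_copies, control_copies=2):
    """Idealised oligo array signal: log2 of patient over control copy number."""
    return math.log2(patient_copies / control_copies)


def expected_b_allele_fractions(copy_number):
    """Possible B-allele fractions at a SNP for a region with this copy number."""
    return [b_count / copy_number for b_count in range(copy_number + 1)]


# Heterozygous deletion, normal, and single-copy gain
deletion_signal = expected_log2_ratio(1)
normal_signal = expected_log2_ratio(2)
gain_signal = expected_log2_ratio(3)

# SNP array: a deleted region loses its heterozygous (0.5) cluster
normal_clusters = expected_b_allele_fractions(2)
deleted_clusters = expected_b_allele_fractions(1)
```

The disappearance of the 0.5 cluster is what the answer above describes as a change in the ratio of heterozygous to homozygous SNPs.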
Briefly explain the use of the 3 resources/databases that you would use to aid interpretation of the clinical significance of a copy number change.
- Database of Genomic Variants (DGV): the DGV 'gold standard' track provides information on the frequency of your copy number variant in the population. For example, a CNV with a frequency of 0.80% in a population of 17,000 would be too high to be disease-causing.
- ClinGen: this resource provides information on dosage pathogenicity and gives a haploinsufficiency score and a triplosensitivity score for each gene in the CNV call. For example, a haploinsufficiency score of 3 for a gene within a deletion would strongly support classifying the CNV as pathogenic.
- DECIPHER: large database with a national patient cohort. This can be used to determine whether your CNV has been seen before, the phenotype of the patient(s) with this CNV, the reporting laboratory and any overlapping features with similar patients.
Briefly describe a known microdeletion syndrome region involving chromosome 16; include location, key gene(s) involved and provide two clinical features.
16p11.2 microdeletion syndrome, which includes the TBX6 gene. Patients with this disorder have intellectual disability and developmental delay, and some also have autistic behaviours.
Many newly described microdeletion or microduplication syndromes detected by microarray are subject to reduced penetrance and variable expressivity. Define these terms.
- Reduced penetrance: not all people with the genetic change will display the features associated with that disorder.
- Variable expressivity: the phenotype of the disorder is variable amongst individuals, even those within the same family.
a) Give 3 clinical features of Prader Willi syndrome.
Intellectual disability; obesity; hypotonia in infancy; hyperphagia; strabismus
Give 3 clinical features of Angelman syndrome.
Seizures; characteristic hand movements; inappropriate laughter
Define the chromosomal region associated with Prader-Willi and Angelman syndromes.
15q11.2-q13
List the different mechanisms that may result in Angelman syndrome together with their recurrence risk for future pregnancies.
- Paternal chromosome 15 UPD: <1-2%
- Maternal deletion of the 15q11.2-q13 region: likely de novo, low recurrence unless germline mosaicism
- UBE3A pathogenic variant: 50% (if maternally inherited)
- Imprinting control centre deletion: up to 50%
- Imprinting control centre defect: <1%
Describe two testing methods by which a Prader-Willi case may be genetically confirmed.
Methylation-specific PCR; methylation-specific MLPA
Name 3 other imprinting disorders.
Beckwith-Wiedemann syndrome; Silver-Russell syndrome; Temple syndrome; transient neonatal diabetes
A 5-month-old infant girl presents with a diagnosis of B-lineage acute lymphoblastic leukaemia (ALL). Describe the priority setting and testing strategy for this sample.
- This would be classed as an urgent referral.
- FISH for the common t(4;11) (MLL gene and AF4) translocation found in infant ALL. If negative, proceed to ETV6/RUNX1 FISH (which will also detect iAMP21) and BCR/ABL1 FISH.
- As BCR-ABL1, ETV6-RUNX1, iAMP21 and MLL rearrangements are thought to be mutually exclusive, if one abnormality is detected it is not mandatory to exclude the others.
- If there is a normal or failed result, the ALL best practice guidelines (BPGs) suggest additional FISH to detect hidden hyperdiploidy or hypodiploidy.
- Extract DNA for SNP array.
- RNA extraction for fusion panel analysis.
A 5-month-old infant girl presents with a diagnosis of B-lineage acute lymphoblastic leukaemia (ALL). What would be the most common/likely chromosome abnormality detected in this patient? Give the prognosis, include the ISCN for the abnormality and name the genes involved.
46,XX,t(4;11)(q21;q23) – poor prognosis, MLL gene and AF4
c) The patient is negative for the above abnormality. The ETV6-RUNX1 dual fusion probe for this patient shows two fusion, one red, and one green signal pattern (2F1R1G). What does this pattern suggest the patient carries and what prognosis does it confer?
They are positive for the ETV6/RUNX1 fusion which confers a favourable prognosis.
If the ETV6-RUNX1 dual fusion FISH probe showed a loss of the green ETV6 signal (2F1R0G), what would this indicate loss of and how would this affect the prognosis?
Loss of the normal chromosome 12 – prognosis does not change.
e) Give two other chromosomal abnormalities seen in ALL that could be detected using the ETV6-RUNX1 probe other than the rearrangement in part C).
- Loss of the red RUNX1 signal would indicate loss of the normal chromosome 21
- Amplification of the red signal would indicate iAMP21 (intrachromosomal amplification of chromosome 21)
Describe the process of QF-PCR.
Amplification of STRs on the chromosomes of interest, used to determine copy number.
- STRs (short tandem repeats or microsatellites) are a pattern of 2-6 bp repeated directly adjacent to each other.
- STRs are known highly polymorphic markers; a patient is therefore likely to have different numbers of repeat units on each allele.
- 4 markers for each chromosome of interest, plus sex chromosome markers if the referral indicates a sex chromosome abnormality, e.g. AMEL, SRY, DXYS218 (Xp) and X22.
- Quantitative PCR: extracted DNA is added to a fluorescent primer multiplex and amplified. The reaction must be quantitative to detect copy number, so the PCR is stopped while in the exponential phase. "During the exponential phase of the reaction, the amount of product is directly proportional to the amount of template."
- Rapid, cheap, small quantities of DNA needed.
Following a positive QF-PCR result, what other tests would you offer?
- Sample identity must be confirmed prior to reporting (N.B. by a repeat test of the original sample or genotype comparison with a maternal blood sample)
- Karyotype to visualise abnormal rearrangements
- Normal results based on a single marker should be confirmed by a second method, e.g. karyotype or FISH
What are the characteristics of cffDNA?
- Originates from the placenta, which sheds highly fragmented DNA into the maternal circulation during normal apoptosis
- Makes up part of the total cell-free DNA in maternal plasma (up to 20%)
- Reliably detected from 7 weeks
- Increases with increasing gestation
- Rapidly cleared from the circulation within an hour after delivery
- Fetal DNA is shorter: approximately 200 bp for fetal fragments and larger for maternal fragments
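Because the reaction is quantitative, copy number at each STR marker can be read from the number of peaks and the ratio of their areas. The sketch below shows that logic for one marker. The numeric windows (0.8 to 1.4 for a normal diallelic ratio, 0.65 or below and 1.8 or above for a 2:1 pattern) are assumptions taken from commonly quoted practice and should be checked against local laboratory guidelines; the function name is invented.

```python
def interpret_marker(peak_areas, normal_range=(0.8, 1.4),
                     trisomy_low=0.65, trisomy_high=1.8):
    """Interpret one STR marker from peak areas ordered shortest allele first."""
    peaks = [area for area in peak_areas if area > 0]
    if len(peaks) == 1:
        return "uninformative"          # one peak: homozygous, dosage unknown
    if len(peaks) == 3:
        return "triallelic"             # three alleles: consistent with trisomy
    if len(peaks) == 2:
        ratio = peaks[0] / peaks[1]     # shorter allele area / longer allele area
        if normal_range[0] <= ratio <= normal_range[1]:
            return "normal diallelic"
        if ratio <= trisomy_low or ratio >= trisomy_high:
            return "diallelic 2:1"      # dosage consistent with trisomy
        return "inconclusive"
    return "failed"


# Hypothetical peak areas for four markers on one chromosome
results = [
    interpret_marker([1000, 950]),
    interpret_marker([2000, 1000]),
    interpret_marker([1000, 1000, 1000]),
    interpret_marker([1200]),
]
```

A result for the chromosome would normally rest on several informative markers agreeing, not on any single marker.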
How do you calculate fetal fraction?
One approach (applicable to male fetuses only): divide the number of reads mapped to chromosome Y by the total number of reads mapped to the autosomes.
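The chromosome Y calculation described above can be sketched in Python as below. It computes the raw ratio of chrY reads to autosomal reads exactly as stated; turning that ratio into a percentage fetal fraction needs calibration that is not covered here. The `chr1` to `chr22` and `chrY` key names and the read counts are assumptions for illustration.

```python
AUTOSOMES = [f"chr{n}" for n in range(1, 23)]  # assumed chromosome naming


def chr_y_read_ratio(read_counts):
    """Reads mapped to chrY divided by reads mapped to all autosomes."""
    autosomal_reads = sum(read_counts.get(chrom, 0) for chrom in AUTOSOMES)
    if autosomal_reads == 0:
        raise ValueError("no autosomal reads")
    return read_counts.get("chrY", 0) / autosomal_reads


# Hypothetical per-chromosome read counts from a maternal plasma sample
read_counts = {chrom: 100_000 for chrom in AUTOSOMES}
read_counts["chrX"] = 90_000
read_counts["chrY"] = 2_200
ratio = chr_y_read_ratio(read_counts)
```

With a female fetus the chrY count is close to zero, which is why this approach only applies to male pregnancies.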