Session 2: Laboratory techniques and bioinformatics Flashcards
describe the ARMS process?
paired PCR reaction involving common primer and allele specific primers
• The allele-specific primers differ in their 3’ end nucleotides. Amplification does not occur if there is a mismatch, allowing different alleles to be distinguished (by size or different fluorescence)
useful for screening large number of samples for known panel of mutations
describe advantages of ARMS
quick, cheap, simple, sensitive, detects SNVs, ins/dels, can multiplex, doesn’t require specialized equipment
describe disadvantages of ARMS
unable to detect rare/unknown variants, cross-reactivity e.g. CF p.(Phe508del)/p.(Phe508Cys), NAFNAP, often needs commercial kit, confirmation via another method often necessary, MCC in prenatals,
describe restriction enzyme digest
• Restriction enzymes make DS breaks in DNA at specific recognition sites
• Can be used when a base substitution creates or abolishes a recognition site of a restriction endonuclease
- fragment amplified, product is digested with the relevant restriction enzyme and the products separated by gel or capillary electrophoresis
• Variation includes methylation specific restriction digests eg. Methylation-Sensitive Restriction Enzymes are not able to cut methylated-cytosine residues
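A minimal sketch of the create/abolish check described above, assuming a complete digest and modelling only the top strand. The amplicon sequences are made up for illustration; the EcoRI site (GAATTC, cut after the G) is used as the example enzyme, and the function names are hypothetical.

```python
def fragment_lengths(seq: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths (top strand) after complete digestion at every site."""
    seq, site = seq.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)  # cut position within the amplicon
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def site_change(ref_seq: str, alt_seq: str, site: str) -> str:
    """Say whether the variant allele gains or loses a recognition site."""
    ref_sites = len(fragment_lengths(ref_seq, site, 0)) - 1
    alt_sites = len(fragment_lengths(alt_seq, site, 0)) - 1
    if alt_sites > ref_sites:
        return "creates site"
    if alt_sites < ref_sites:
        return "abolishes site"
    return "no change"


# Hypothetical 14 bp amplicon: a G>A substitution creates an EcoRI site.
ref = "ACGTGAGTTCAGGT"
alt = "ACGTGAATTCAGGT"
print(site_change(ref, alt, "GAATTC"))       # creates site
print(fragment_lengths(ref, "GAATTC", 1))    # [14] (uncut)
print(fragment_lengths(alt, "GAATTC", 1))    # [5, 9]
```

The fragment sizes are what would be compared on the gel or capillary trace; a heterozygote would show both patterns.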
what are the advantages of restriction enzyme digests?
cheap, simple, quick, no specialist equipment
what are the disadvantages of restriction enzyme digests?
NAFNAP, only confident of exact variant if it creates a site, if it destroys a site then we don't know the exact change, need r.enzyme to be available for site of interest, enzyme may be expensive or poor quality if rare site, partial or over digestion can affect interpretation, non-specific activity, heteroduplexes (fragment re-anneals to non-complementary fragment eg. G:T) will not cut
what is FRET (Fluorescence resonance energy transfer) hybridisation?
fluorogenic Minor groove binder probes for rtPCR are specific for SNVs.
Polymerisation of a new DNA strand is initiated from the primers, and once the polymerase reaches the probe, its 5’-3’-exonuclease degrades the probe, physically separating the fluorescent reporter from the quencher, resulting in an increase in fluorescence which is detected and measured in the real-time PCR machine
eg. JAK2 V617F mutation in myeloid disease
what are advantages of FRET?
highly sensitive, highly stable, can detect very low levels of mutant DNA in a background of wild-type genomic DNA, useful for large sample numbers
what are disadvantages of FRET?
costly, no multiplexing, pH-sensitive
describe Droplet digital (ddPCR)?
sample fractionated into thousands of droplets, each containing single DNA template, run PCR using WT and mutant TaqMan probes, software then reads + and - droplets distinguishing WT from mutant.
eg. BRAF V600E and EGFR mutation testing
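A rough sketch of how positive and negative droplet counts can be turned into a quantitative result. It assumes the usual Poisson correction (some droplets hold more than one template, so copies per droplet = -ln(fraction negative)); this correction is not stated in the card itself, and the droplet counts below are invented.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Mean template copies per droplet from the fraction of negative droplets."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1 - positive / total)


def fractional_abundance(mutant_positive: int, wt_positive: int, total: int) -> float:
    """Mutant copies as a fraction of all (mutant + wild-type) copies."""
    mutant = copies_per_droplet(mutant_positive, total)
    wild_type = copies_per_droplet(wt_positive, total)
    if mutant + wild_type == 0:
        raise ValueError("no positive droplets for either probe")
    return mutant / (mutant + wild_type)


# Invented run: 15,000 droplets, 30 mutant-positive, 6,000 WT-positive.
print(f"{fractional_abundance(30, 6000, 15000):.4%}")
```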
what are advantages of Droplet digital (ddPCR)?
detects low level mutations, quantitative
what are disadvantages of Droplet digital (ddPCR) for FFPE testing?
DNA obtained from tumour blocks is often of poor quality and can result in poor amplification
- Accuracy of the results depends on the quality of sample (e.g. containing a high percentage of tumour cells).
- Fixation of the sample causes DNA damage and can result in PCR artefacts.
what is pyrosequencing?
“sequencing by synthesis” principle in which polymerase extends the DNA one dNTP at a time. When a dNTP is added to the open 3’ end of a DNA strand, pyrophosphate is released. A cocktail of enzymes couples this pyrophosphate to light emission by luciferase. The amount of light is proportional to the number of bases incorporated. eg. Mitochondrial point mutation analysis in MELAS, MERRF, NARP and Leber’s Hereditary Optic Neuropathy (LHON)
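An idealised sketch of the "light proportional to bases incorporated" idea: each dispensed nucleotide gives a peak whose height equals the number of consecutive matching bases added. The sequence, dispensation order and peak heights are made up, and real pyrograms are noisier than this.

```python
def pyrogram(strand_to_synthesise: str, dispensation_order: str) -> list[tuple[str, int]]:
    """Ideal peak height (bases incorporated) for each dispensed nucleotide."""
    position = 0
    peaks = []
    for nucleotide in dispensation_order:
        incorporated = 0
        # A homopolymer run is extended in one dispensation, giving a taller peak.
        while (position < len(strand_to_synthesise)
               and strand_to_synthesise[position] == nucleotide):
            incorporated += 1
            position += 1
        peaks.append((nucleotide, incorporated))
    return peaks


def variant_percentage(variant_peak: float, reference_peak: float) -> float:
    """Variant level (e.g. heteroplasmy) from the two allele peak heights."""
    return 100 * variant_peak / (variant_peak + reference_peak)


print(pyrogram("GGCTA", "GACTAG"))
# [('G', 2), ('A', 0), ('C', 1), ('T', 1), ('A', 1), ('G', 0)]
print(variant_percentage(30, 70))  # 30.0
```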
advantages of pyrosequencing?
Quick and cheap
Detects low level – quantifiable down to ~5% variant level
Detect heteroplasmy
Can be used to detect methylation status
disadvantages of pyrosequencing?
Short length of sequence is sequenced
Data can be complex to genotype depending on type of variant analysis required
SNPs can affect primer binding sites
what is the function of the centromere?
chromosome segregation - Microtubules of the spindle attach to the centromere via the kinetochore. acentric chromosomes fail to attach to spindle and are lost from the cell
describe the structure of the centromere?
constitutive heterochromatin consisting of repetitive satellite DNA. pericentric heterochromatin facilitates sister chromatid cohesion
describe the 2 groups of centromeric proteins (CENPs)?
1) constitutively associated with the centromere such as CENPA, CENPB and CENPC, which are thought to have structural roles in kinetochore formation
2) passenger proteins associate transiently throughout the cell cycle
which diseases are associated with centromere malfunction?
- Premature centromere division (PCD) – age-dependent phenomenon occurring in women. may result in an increase in X chr aneuploidy
- Premature chromatid separation (PCS) - separate chromatids with no discernable centromere. no known phenotypic effect
- Roberts syndrome (chr breakage) - LOF mutations in ESCO2 cause delayed cell division and increased cell death. growth retardation, Limb malformations (reduction), craniofacial (microcephaly, clefting), intellectual disability and renal and cardiac abnormalities
what is the kinetochore?
multiprotein complex that assembles on the centromere, acting as point of contact for spindle fibres. inner kinetochore tightly associated with centromere DNA. outer kinetochore interacts with microtubules. a pair of kinetochores (one per sister chromatid) appears on each chromosome in late prophase.
what is a neocentromere?
new centromere that forms in a non-standard centromere location as a result of disruption to the natural centromere. they lack repetitive α satellite DNA sequences.
what are telomeres?
highly conserved gene-poor DNA-protein complexes that cap the ends of chromosomes. if lost the chr is unstable and can fuse with other chrs. they prevent the chr shortening that leads to cell death and aid chr pairing.
describe the structure of telomeres
consists of tandem repeats associated with telomere-binding proteins. next to these are telomere-associated repeats (unknown function). also has a ss-DNA 3’ overhang that protects the chr end, particularly during replication of the lagging strand (back-stitching synthesis creates Okazaki fragments)
what is telomerase?
RNA-protein enzyme that extends synthesis of leading strand to use as template for lagging strand. has 2 subunits: protein subunit and RNA subunit containing a hexanucleotide sequence complementary to the telomere repeat. in cells that lack telomerase, telomeres shorten progressively > cell death and ageing
what is cri du chat?
5p deletion. features include cat-like cry, microcephaly, distinct facies and palmar creases. Deleted region includes the hTERT gene – telomerase reverse transcriptase, which helps maintain telomere ends
what is the Nucleolar organizing region (NOR)
on short arm of acrocentrics. contains rRNA genes 5.8S, 18S and 28S. responsible for organising the nucleolus structure and contain the approx. 200 rRNA genes necessary for protein synthesis, if transcriptionally active it stains dark with Ag-NOR staining
what info is required about the gene for a uv investigation?
- does patient phenotype fit gene?
- MOI
- mutational mechanism (eg. haploinsufficient, dominant neg?)
- protein structure & function
- strength of the gene-disease relationship
how do you carry out a SNV investigation?
- check pop freq (beware of later onset & lower penetrance)
- in silico (splice, conservation, AA substitution),
- lit/database search - previously reported variants, functional data, hotspots,
- segregation data (watch out for LD, phenocopies & penetrance, non-paternity)
- de novo - test parents, literature
- allelic data - in trans with dominant pathogenic or in cis with recessive = benign / in trans with pathogenic for recessive = pathogenic
- phenotype- MDT (assess variant in context of phenotype data)/referral info is specific
- ACMG/ACGS guidelines - evolving through ClinGen & monthly Webex meetings
pathogenic = >99% disease-causing
likely path = >90%
class 3 (subdivided)
likely benign < 10%
benign <0.1%
- class 4 & 5 variants are clinically actionable
how can RNA studies help UV investigation and describe limitations?
- looks at splicing effects by studying cDNA generated from mRNA
considerations: - normal isoforms may complicate results
- sample type - is the RNA expressed in the blood?
- quality - RNA degrades quickly
- PTC can cause NMD of mRNA so product may not be present or be very low to sequence. biallelic expression of the variant rules out NMD
mini-gene assays can overcome some issues
what is LOH in tumour tissue & how does it influence pathogenicity investigation?
what limitations are there to this?
normal allele lost in tumour tissue usually due to large deletion suggests hemi variant is pathogenic
- if variant allele is lost this suggests it is benign
- be aware of variant in cis (ie. variant studied is not the causal variant)
- presence of normal tissue may skew results
what is NGS library preparation?
fragmenting starting material and ligating adapters and indices to allow sequencing.
what is NGS enrichment?
Enrichment is needed to capture regions of interest for single genes, panels and exomes but not WGS. It may be amplicon (PCR) based or hybridisation based
Describe amplicon (PCR-based) enrichment for NGS?
eg. Nextera XT illumina: transposomes randomly cleave ds-DNA and ligate adaptor oligos with different sequences to 5’ end. A limited PCR cycle then adds indexes and full adapter sequences to the fragmented DNA for sequencing.
eg. Qiagen - reduces bias by integrating unique molecular indices (UMI). genomic DNA is fragmented and ligated with UMI and adapter. Target enrichment performed by targeted PCR with gene specific primer and universal primer to the adapter. universal PCR amplifies the library. after sequencing, reads with same UMI are pcr duplicates and are removed to identify artefacts and CNV
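A simplified sketch of the UMI idea: reads sharing a UMI (and start position) are treated as PCR duplicates of one original molecule and collapsed to a consensus, so a base change seen in only a minority of a family is more likely an artefact. The read tuples and function names are hypothetical, and this is not the vendor's actual pipeline.

```python
from collections import Counter, defaultdict


def group_by_umi(reads):
    """Group read sequences into families keyed by (UMI, start position)."""
    families = defaultdict(list)
    for umi, start, sequence in reads:
        families[(umi, start)].append(sequence)
    return dict(families)


def consensus_per_family(reads):
    """Keep one sequence per family: the most common one among its duplicates."""
    return {
        key: Counter(sequences).most_common(1)[0][0]
        for key, sequences in group_by_umi(reads).items()
    }


# Hypothetical reads: (UMI, alignment start, read sequence)
reads = [
    ("AACGT", 100, "ACGT"),
    ("AACGT", 100, "ACGT"),
    ("AACGT", 100, "ACTT"),  # minority read within the family: likely PCR error
    ("GGTCA", 100, "ACGT"),
]
print(len(reads), "reads ->", len(group_by_umi(reads)), "original molecules")
print(consensus_per_family(reads))
```

Counting families rather than raw reads is also what makes the depth usable as a less biased measure of how many original molecules covered a target.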
describe hybridisation-based enrichment for NGS?
eg. Agilent SureSelect -fragment DNA, tag with adaptors and barcodes and capture libraries with RNA or DNA-based oligos. oligos anneal to specific regions of genome. hybridise sample with biotinylated RNA library baits and select target region by magnetic streptavidin beads. amplify and sequence
eg. agilent Haloplex - restriction digest, anneal ds-biotinylated oligos and capture with streptavidin coated magnetic beads. PCR with common primers generates library of enriched fragments
what are the advantages of amplicon (PCR) based enrichment for NGS?
cheap
low quantity needed
faster
useful for smaller regions
suitable for FFPE
what are the disadvantages of amplicon (PCR) based enrichment for NGS?
- preferential amplification leads to non-uniform enrichment
- artefacts
- difficult to multiplex primers to study larger targets
- sometimes unnecessary introns included
- no read depth for CNV
- not scalable
what are the advantages of hybridisation-based enrichment for NGS?
- more uniform coverage
- better coverage of GC-rich regions
- PCR duplicates easily removed, reducing artefacts
- read depth can be used for CNV
- suitable for FFPE
what are the disadvantages of hybridisation-based enrichment for NGS?
high quantity required
- higher cost
- longer prep time
- difficult to distinguish pseudogenes
what is the main difference between second and 3rd generation NGS platforms?
2nd generation platforms utilize amplification step prior to sequencing library molecules unlike single-molecule sequencing performed by 3rd generation platforms
describe the general 2nd generation NGS process?
sequencing platform uses a series of automatically coordinated, repeating chemical reactions typically carried out in a flow cell or compartment which houses the immobilized templates and necessary reagents. Most platforms (with the exception of SOLiD) use ‘sequencing by synthesis’ - a repeated cyclical process which occurs within the flow cell and consists of nucleotide addition, washing and signal detection.
what are advantages of WGS compared to WES?
- SNVs, indels , SV and CNVs in coding and non-coding regions ~3.5 million variants (WES omits promoters and enhancers & limited to coding and splice variants ~20 000 variants)
- more uniform coverage
- easier to capture low sequence complexity
- pcr amplification not required reducing GC bias
- not limited by sequencing read length (WES needs smaller target probes)
- no reference bias (WES preferentially enriches reference alleles at het sites producing false calls)
- WGS captures everything whereas WES is limited to current targeted genes
- wgs suitable for complex trait gene identification as well as sporadic phenotypes caused by de novo variants (WES suitable for highly penetrant mendelian disease gene identification)
what are advantages of WES compared to WGS?
- cheaper, less storage and analysis costs
- targets relevant regions - only small proportion of WGS data is clinically relevant
- reduced costs mean more samples sequenced
- WES is good for mendelian disorders which present with atypical manifestations and are difficult to confirm with previous genetic testing (eg. heterogeneity or a long list of candidate genes)
what is targeted NGS?
used for disease-specific targeted tests for hereditary disorders and therapeutic decision making. Uses gene panels which are specific to certain disease types eg. clinical exome or mendeliome or custom-designed panels. only known genes included with established phenotype. can be used for tumour profiling, MRD (can see emergence of clones and allelic ratios), microbiology (disease outbreak, resistance, screening), NIPT and NIPD,
what are the advantages of targeted NGS over WES/WGS?
- cheaper
- better coverage as can sanger fill
- can fully interpret every variant whereas many WES/WGS variants are filtered
- less chance of incidental findings
what is the main disadvantage of targeted NGS over WES/WGS?
- inflexible, need to redesign capture for new targets
what is a virtual exome and what are the advantages?
sequencing an exome and masking all but the desired data. reduces incidental findings, gives flexible analysis and addition of genes at no extra cost. can analyse primary genes first, then broader analysis if negative
what are the disadvantages of a virtual exome?
coverage
depth is sacrificed for breadth
what is ChIP-seq (chromatin immunoprecipitation followed by sequencing)?
copy number changes, for single nucleotide polymorphism (SNP) genotyping, but can also be used to study DNA methylation, alternative splicing miRNAs and protein-DNA interactions
technique for genome-wide profiling of DNA-binding proteins, histone modifications or nucleosomes used for studying transcriptional regulation and epigenetic mechanisms.
It is an in-vivo protein-DNA binding assay where antibodies are used to select specific proteins or nucleosomes which enriches for DNA fragments that bind. These DNA fragments are sequenced directly.
what is the transcriptome?
the complete set of transcripts in a cell and their quantity at a specific developmental stage or physiological condition
what is transcriptomics?
cataloguing all species of RNA transcript, including mRNAs, non-coding RNAs and small RNAs to determine the transcriptional structure of genes, start sites, 5’ and 3’ ends, splicing patterns and other post-transcriptional modifications and to quantify changing expression levels of each transcript during development and under different conditions. This may be done by hybridisation techniques (incubating fluorescently labelled cDNA with microarrays) or by sequencing cDNA libraries
how does RNA sequencing work?
RNA > cDNA with adaptors at one or both ends. Each molecule is sequenced and reads are aligned to a reference to produce a transcription map that includes the transcriptional structure and/or level of expression of each gene
what are the advantages of RNA sequencing?
- reveals precise location of transcription boundaries
- gives info on connectivity between different exons
- reveals SNPs in transcribed regions
- highly accurate for quantifying expression levels (as validated by qPCR)
- less sample required as no amplification step in some technologies and no cloning step
- can identify novel transcribed regions
- high throughput and quantitative and lower cost than large scale sanger sequencing
- can identify novel gene fusions in cancer and subtyping
what are challenges of RNA-sequencing?
- larger RNA molecules must be fragmented to <500bp to be compatible with deep-sequencing technologies which may introduce bias
- may need strand-specific libraries which yield information about the orientation of transcripts valuable for transcriptome annotation especially for regions with overlapping transcription from opposite directions
how is NGS used for ct-DNA?
mutations on ct-DNA can act as a cancer biomarker to identify cancer patients from a group of healthy individuals. more sensitive than tissue biopsy. eg. SEPT9 methylation has been approved by FDA for blood-based screening test for CRC. NGS of ctDNA can also be used for treatment selection, prognosis and MRD monitoring
how is NGS used for HLA typing?
knowledge of an individual's polymorphisms in the HLA region is essential for organ and stem-cell transplantation
what is the purpose of the new CNV (2019) guidelines?
the guidelines introduce a semi-quantitative, evidence-based framework with a 5-tier classification system with separate schemes for gains and losses. The aim was to provide consistent, evidence-based clinical classification across labs.
Pathogenic 0.99 or higher
Likely pathogenic 0.90 to 0.98
Uncertain significance -0.89 to 0.89
Likely benign -0.90 to -0.98
Benign -0.99 or lower
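The score bands above can be written as a small lookup. This is only a sketch of the cut-offs listed in this card (scores rounded to two decimal places); how the individual evidence points are assigned is not covered, and the function name is hypothetical.

```python
def classify_cnv_score(total_score: float) -> str:
    """Map a summed CNV evidence score to the 5-tier classification."""
    score = round(total_score, 2)  # bands are quoted to two decimal places
    if score >= 0.99:
        return "Pathogenic"
    if score >= 0.90:
        return "Likely pathogenic"
    if score > -0.90:
        return "Uncertain significance"
    if score > -0.99:
        return "Likely benign"
    return "Benign"


for example in (1.0, 0.90, 0.45, -0.90, -1.0):
    print(example, classify_cnv_score(example))
```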
what is involved in CNV assessment?
CNV genomic content: size, overlap with established triplosensitive (TS), haploinsufficient (HI) or benign genes/genomic regions; gene number and content; population frequency; evaluation of literature, public databases, internal lab data; inheritance pattern/family history for patient being studied
give examples of recurrent CNV syndromes caused by low-copy repeats which may misalign during recombination at meiosis
Prader-Willi / Angelman syndromes 15q11q13
DiGeorge/velocardiofacial syndrome 22q11.21 deletion including TBX1
Williams syndrome 7q11.23 deletion including ELN
Smith-Magenis syndrome 17p11.2 deletion including RAI1
Miller-Dieker syndrome 17p13.3 deletion including PAFAH1B1 and YWHAE
why should number of protein-coding genes in a CNV be assessed?
• CNVs containing no protein-coding genes or other known functionally important elements are more likely to be benign.
• a small CNV in a gene-rich region could have a larger impact than a much larger CNV in a gene-poor region. ACMG guidelines suggest adding evidence if ≥25 genes for copy number loss and ≥35 genes for copy number gains. Consideration should be given for clusters of genes where a relatively large number of genes may be present with little phenotype impact e.g. olfactory receptor gene clusters
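As a sketch, the gene-count thresholds quoted above reduce to a simple check. Only the two thresholds from this card are encoded (not the number of points awarded), and the caveat about gene clusters such as olfactory receptors still needs manual judgement. The function name is hypothetical.

```python
# Thresholds as quoted in the card: >=25 genes for losses, >=35 genes for gains.
GENE_COUNT_THRESHOLD = {"loss": 25, "gain": 35}


def gene_count_adds_evidence(protein_coding_genes: int, cnv_type: str) -> bool:
    """True if the number of protein-coding genes reaches the quoted threshold."""
    if cnv_type not in GENE_COUNT_THRESHOLD:
        raise ValueError("cnv_type must be 'loss' or 'gain'")
    return protein_coding_genes >= GENE_COUNT_THRESHOLD[cnv_type]


print(gene_count_adds_evidence(30, "loss"))  # True
print(gene_count_adds_evidence(30, "gain"))  # False
```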
why might pathogenic CNVs also be present at low frequency in control populations?
recessive inheritance, incomplete penetrance, parent of origin effect, late onset phenotypes and affected individuals in poorly phenotyped control cohorts
name sources for population frequency of CNVs
DGV - The DGV Gold is a curated set of variants from selected high resolution and high quality studies in DGV, clustering variants and removing low-resolution studies with high false-positive rates
gnomAD SV
o In-house laboratory database of normal control samples can be useful to identify common variants in a local population and/or technical artefacts that may be platform specific.
what kinds of evidence might aid CNV interpretation?
- multiple unrelated patients with a similar CNV and a consistent phenotype (more likely pathogenic)
- de novo supports pathogenicity. Additional weight is given if the phenotype is highly specific and relatively unique to the gene/genomic region.
- Segregation amongst similarly affected family members supports pathogenicity
• Non-segregation argues against pathogenicity, EXCEPT where it can be explained by incomplete penetrance, phenocopies (variation caused by environment that resembles a genetic variation), the accuracy of phenotyping of family members, age of onset or inheritance pattern. for a CNV identified in a recessive gene, look for an SNV on the other allele
what is haploinsufficiency? how can it be found for a gene?
loss of one allele resulting in a single normal allele at a particular locus is inadequate for normal function resulting in a phenotype. Regions of copy number loss containing established HI genes are more likely to be pathogenic.
HI info available on ClinGen, DECIPHER, knockout mouse models, literature, ClinVar, HGMD, OMIM
pLI available on gnomAD
what are intergenic deletions and could they be pathogenic?
do not affect the coding sequence of a gene. may be pathogenic if disrupt regulation of expression of nearby genes An intergenic deletion in close proximity to a known OMIM morbid gene with a relevant phenotype may warrant further investigation.
the significance of intragenic dels that affect the 5’ or 3’UTR, or are entirely within an intron, is less clear, but it is possible that such deletions could affect RNA transcription, splicing, stability, or translation.
what is Triplosensitivity (TS)? how is it assessed?
where an additional copy of a gene/genomic region results in a phenotype. Fewer genes appear to exhibit TS than HI and therefore copy number gains are more likely to result in a milder phenotype or to be benign than the reciprocal deletion. TS evidence can be found through ClinGen, literature and database searching.
why might 5’ or 3’ duplications be pathogenic?
may disrupt regulatory elements
why is it important to determine the source of a homozygous dup?
Homozygous duplications manifest as a copy number of 4, which could also represent a heterozygous triplication. Parental studies are therefore necessary to determine whether the imbalance is monoallelic or biallelic. A biallelic gene-disrupting duplication can be causative of a recessive disorder, whereas a monoallelic triplication cannot be associated with a recessive disorder but may cause disease through a different mechanism.
why is parental testing important for CNV analysis?
If a variant detected in a patient is found to be inherited from an unaffected parent, it is unlikely to be pathogenic through a fully-penetrant, autosomal dominant inheritance pattern – but incomplete penetrance, recessive inheritance, parent-of-origin dependent effects etc are not excluded. De novo origin of a variant supports its pathogenicity but is not by itself sufficient to class a variant as pathogenic, particularly if relationships have not been confirmed. X-linked variants inherited from carrier mothers can be tested for in male relatives to help determine clinical significance.
should CNV carrier status be reported?
Reporting of carrier status for recessive conditions (i.e. unrelated to the reason for testing) is at the discretion of the laboratory. However, carrier status for X-linked recessive conditions in females should be reported due to her own future reproductive risk, the implications for other family members, as well as the possibility that the patient may be a manifesting carrier.
how is a reciprocal unbalanced translocation identified by array and what follow up is required?
terminal deletion of one chromosome and a terminal duplication of a different chromosome is suggestive of an unbalanced product of a parental balanced translocation. This can be confirmed by G-band analysis in the proband, or FISH if the translocated segment is too small to be reliably detected by G-band.
what is an isochromosome?
Isochromosomes are recognizable by copy number gain of one chromosome arm. For example, Pallister Killian syndrome is caused by mosaic isochromosome of 12p, resulting in mosaic tetrasomy of 12p.
when should a CNV be reported?
o account for patient’s phenotype.
o have other implications for the patient
o have implications for the patient’s family
a VUS should be reported if it may be pathogenic and further investigations may clarify pathogenicity
when should a PRENATAL CNV be reported if not related to referral reason?
o Report high penetrance neuro-susceptibility loci that are associated with a risk of a severe phenotype
o Report neuro-susceptibility loci associated with an increased incidence of anomalies detectable on scan, as reporting these may help direct further scanning
o Report deletions of cancer susceptibility genes.
o Report a DMD deletion in a female fetus.
o Do not report low penetrance neuro-susceptibility loci eg 15q11.2 deletions.
o Do not report heterozygous deletions associated with recessive conditions unrelated to the fetal anomalies.
o Do not report unsolicited pathogenic variants for which there is no available intervention.
o Do not report variants of uncertain significance that cannot be linked to a potential phenotype on the basis of genes involved.
how is DNA fragmented before sizing if it is not being amplified by PCR?
restriction enzyme digest, sonication, transposases (Transposases fragment DNA by cleaving and inserting a short double-stranded oligonucleotide to the ends of the newly cleaved molecule. The inserted oligonucleotide must contain a sequence that is specific to the particular transposase being used. While this method is fast and has low input requirements, the known sequence bias associated with transposases make them incompatible with some applications. )
what are issues with PCR amplification before sizing?
- allele drop-out (SNPs)
- secondary structures
- preferential amplification
- lack of allele heterozygosity
may hinder estimation of fragment sizes
describe how gel electrophoresis is used to size DNA fragments?
- DNA loaded onto gel and current applied
- negative DNA moves to positive anode and separated by size
- afterwards, DNA can be visualized by UV light
how is capillary electrophoresis used to size DNA fragments?
- DNA denatured and ssDNA migrates through charged capillary containing polyacrylamide gel
- rate of migration is dependent on the size of the fragment and requires an internal size standard to be run for each sample
- applications include MLPA, genotyping such as QF-PCR for the analysis of aneuploidies and microsatellite analysis of tumour samples (HNPCC / Lynch syndrome), and Sanger sequencing
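A simplified sketch of why the internal size standard is needed: an unknown peak is sized by interpolating between the two standard peaks that flank it. Plain linear interpolation is an assumption here (analysis software may use other sizing methods), and the migration values are invented.

```python
from bisect import bisect_left


def estimate_size(migration: float, size_standard: list[tuple[float, float]]) -> float:
    """Size (bp) of a peak from its migration, using flanking standard peaks.

    size_standard: (migration, size_bp) pairs sorted by migration.
    """
    migrations = [m for m, _ in size_standard]
    if not migrations[0] <= migration <= migrations[-1]:
        raise ValueError("peak lies outside the range of the size standard")
    i = bisect_left(migrations, migration)
    if migrations[i] == migration:
        return float(size_standard[i][1])
    (m0, s0), (m1, s1) = size_standard[i - 1], size_standard[i]
    # Linear interpolation between the two flanking standard fragments.
    return s0 + (s1 - s0) * (migration - m0) / (m1 - m0)


# Invented standard: (scan number, fragment size in bp)
standard = [(1000, 100), (1500, 150), (2000, 200)]
print(estimate_size(1250, standard))  # 125.0
```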