Session 2 - testing methods Flashcards
What factors influence method of mutation detection?
Is the mutation known?
Type of mutation
Tissue type tested
Cost
Hazardous materials needed?
Specialist equipment needed?
Turnaround time?
Throughput?
Polymorphic region?
Name some methods that can be used to scan DNA when a known mutation is not present.
Protein Truncation Test (PTT)
Restriction Fragment Length Polymorphism (RFLP)
SSCP
CSGE
dHPLC
High Resolution Melt Curve Analysis (HRM)
MALDI-TOF
MutS
What are the advantages of MALDI-TOF?
High throughput
Rapid
Can determine base composition of DNA
What are the disadvantages of MALDI-TOF?
Expensive
Need a huge machine
What are the advantages of HRM?
Cheap
Easy
High throughput
Can use leftover PCR products
Low risk of contamination
What are the disadvantages of HRM?
Doesn’t detect actual mutation.
Not 100% sensitive
Need all DNA samples to be prepped in the same way
All DNA samples must be run at the same concentration
Not suitable for highly polymorphic genes
Lots of optimisation needed
What are the advantages of the PTT?
Fast
Cheap
Good for genes with nonsense mutations
Large coding regions covered in one fragment
Can detect mutations at 5-10%
Give some disadvantages of PTT
Costly reagents
Won't detect missense variants
Time consuming
Radiolabels required
Large deletions may be missed
Give some methods of known mutation detection
ARMS
Allele-specific PCR
OLA
Pyrosequencing
Minisequencing
Restriction Enzyme digest (needs mutation to create a new restriction site)
What methods are available to size DNA fragments?
Long range PCR
Electrophoresis
Southern blotting
Fluorescent PCR
TP-PCR
Molecular Combing
Nanochannel technology
Name the 6 types of electrophoresis that can be used for fragment length detection
Capillary
Agarose gel
Polyacrylamide gel
Nanowire
Pulsed-field
Bioanalyzer
What is chimeric PCR?
Principle is the same as TP-PCR, but there is a single reverse primer. 5’ binds to region outside of the repeat and the 3’ binds to the repeat. 3’ is more likely to bind at lower temperatures.
What method can be used to detect changes in fragment size due to inversions?
Inverse Shifting PCR (IS-PCR).
This is used to detect the intron 22 inversion in Haemophilia A.
Describe the basic method of Southern blotting.
Restriction enzyme digest of gDNA
Gel electrophoresis to separate
Transfer to membrane
Apply label
Hybridise
Wash
Detect
List some bisulphite-dependent methods of methylation detection
ms-MLPA
ms-PCR
Pyrosequencing (uses MIP primers)
COBRA - bisulphite modification introduces or retains a restriction enzyme site depending on methylation status; digestion then shows which form is present.
ms-HRM (the conversion of C -> U in the unmethylated strand causes a reduction in the GC content and melting temperature of that strand).
MethyLight and HeavyMethyl are both real-time quantitative techniques reliant on the binding of methylation-specific TaqMan probes.
List some bisulphite independent methods of methylation detection
Restriction enzyme digest at unmethylated sites
Southern blotting
Which two types of primer can be used for methylation detection, and what is the difference between the two?
Methylation-Independent PCR primers - these amplify both methylated and unmethylated sequence
Methylation-Specific primers - these primers are specific to the methylated target DNA
What are the sizes of the digested fragments seen in a normal female’s FRAX blot?
5.2kb and 2.8kb.
Draw expected patterns of bands for normal/pre-mutation/full mutation females and males.
Andrew: see revision notes :) 14m2.02 Methylation Detection page 5
What are the two band sizes you would expect to see on a normal Prader-Willi Angelman msPCR gel?
What is amplified to produce these products?
164bp for the paternal allele
131bp for the maternal allele.
Exon 1 of SNRPN has been amplified.
List methods of copy number detection.
G-banding
aCGH
FISH
MLPA
MAPH
BAC array
Oligo array
SNP array
NGS
QF-PCR
Real Time QPCR
What are the three main components of SNP arrays?
Slide labelled with allele-specific probes
Fragmented nucleic acid labelled with fluorescent dye
Detection system
What does the signal intensity of a SNP array depend on?
The copy number of the target sequence.
The affinity between the DNA and the probe.
What are the two types of SNP array?
Illumina.
Affymetrix
What are the differences between the Affymetrix array and an Illumina array?
Affymetrix arrays have oligos of 25bp in length; the oligos contain the SNP site. DNA with any SNP at a single position can bind that oligo. Presence of a SNP reduces the affinity of binding between the labelled DNA and the oligo so fluorescent signal is reduced.
Illumina arrays have 50bp oligos; these oligos are complementary to the sequence ADJACENT to the SNP. Single base-pair extension of the next nucleotide incorporates a fluorescently-labelled A/T/G/C. Fluorescence determines next base/SNP in sequence.
What can SNP arrays be used to detect?
Loss of Heterozygosity (LOH) and copy number changes.
List some applications of SNP arrays
LOH detection in imprinted regions (e.g. Prader Willi)
DNA copy number changes
Methylation analysis - DNA is fragmented using methylation sensitive or methylation specific enzymes.
Gene expression analysis - use cDNA
Why are SNP arrays good for prenatal testing?
A sex-matched control isn’t required.
What are some limitations of SNP arrays?
Can’t detect balanced rearrangements.
Not highly sensitive.
List the three types of microarray
BAC
Oligo
SNP
What are the advantages of BAC arrays?
Low res, so fewer variants of uncertain significance
High signal to noise ratio due to long fragment length
Cheap
Dye-swap to check results
Follow-up FISH is easy and FISH probes are also BACs
What are the disadvantages of BAC arrays?
Low res - may miss smaller abnormalities
Fiddly
Expensive as only 1 patient/slide
Array design limited by BAC availability
Determining precise breakpoints is not possible
Lots of batch-to-batch variation
List advantages of oligo arrays
Higher res than BAC arrays
Cheaper to run - multiple samples per slide
Calls are dependent on multiple consecutive probes - more accurate
List disadvantages of oligo arrays
No UPD/LOH
Need sex-matched control
Increased number of variants of uncertain significance
Poor signal to noise ratio.
Some abnormalities are too small to be verified by FISH.
List some applications of arrays.
CNV detection
LOH detection
Gene expression - compare tissues or tumour/normal
Epigenetic expression
What kit is available for expression profiling in tumours, and what is it used to profile?
Mammaprint, breast tumours.
What array methods can be used to investigate epigenetic expression in tumours?
ChIP-chip (any proteins currently interacting with the DNA are covalently cross-linked to it, and these fragments are selected by immunoprecipitation. The DNA is then released from the protein and put on the array).
DamID
Bisulphite modification and appropriate oligos on the array
What factors influence classification of CNVs?
Gene content
Any known microdeletions/duplications in the region
Size
Presence in a control population
Inheritance
Presence in a disease population
Zygosity
CNV location - imprinting regions etc.
What does the EACH study recommend is reported in prenatal aCGH?
Only report de novo, fully penetrant CNVs, or those which correspond to a significant imbalance.
Avoid reporting VUSs.
What factors need to be taken into consideration when analysing prenatal arrays?
Prenatal phenotypes of disease may be different or poorly characterised compared with postnatal phenotypes
Mosaicism could be maternal cell contamination or confined placental mosaicism.
Name the two forms of detection chemistry used in RQ-PCR, and give examples of each.
Non-specific fluorescent dyes (SYBR green)
Specific fluorescent dyes (TaqMan, FRET, Hairpin, Scorpion).
How does SYBR green work?
Small dye that binds to the dsDNA groove. Fluorescence emitted when bound to dsDNA.
What can SYBR green be used to detect?
Accumulation of dsDNA during PCR cycles.
How does TaqMan work?
Specific probes labelled with a fluorophore and quencher bind target DNA. When the extending polymerase reaches the probe, its 5' exonuclease activity cleaves the probe, separating the fluorophore from the quencher dye so that light is emitted.
How does FRET work?
Two probes are needed.
They bind adjacent to each other on the target sequence.
When the probes are in close proximity, energy is transferred by resonance from the donor probe to the acceptor probe and light is emitted.
Probes are removed once the DNA sequence extends up to them.
How is material quantified using RQ-PCR?
The Ct is the cycle at which the signal crosses the threshold, i.e. rises above the baseline measurement and can be quantified.
Product is measured during the exponential phase.
What types of quantification exist?
Absolute - test sample measured against set standards.
Relative - test sample measured against internal control.
List some applications of RQ-PCR
Determining viral/bacterial/fungal load
Identifying and quantifying fusion genes in cancer, e.g. CML (BCR-ABL1 transcript), AML (PML-RARA), ALL (ETV6-RUNX1)
Single base mutation detection
SNP genotyping
Copy number detection
NIPD
List applications of low-level mutation detection.
NIPD
Detection of ctDNA
MRD monitoring
heteroplasmy in mitochondrial disease.
List some methods of enrichment used for known low-level mutations.
Get rid of WT allele:
AIRE-RFLP - artificially introduce a new restriction site into WT sequence and digest WT.
REMS-PCR - digest WT sequence using naturally occurring restriction site.
Amplify mutant allele:
Allele-specific PCR
ARMS PCR
TaqMan RQ-PCR
MAMA
Spatially separate WT and mutation allele:
ddPCR - very sensitive. Can be used to detect IDH1 mRNA in glioma patients’ cerebrospinal fluid.
List some methods of enrichment used for unknown low-level mutations
COLD-PCR - change the denaturing temperature to preferentially allow annealing of the primer to the mutant sequence (at lower temperatures the WT sequence will not denature so the primer cannot bind)
NGS - need very deep sequencing to detect low level mutations - can be prohibitively expensive
What types of COLD-PCR are there?
Full - enriches all possible mutations
Fast - enrichment of known mutations
ICE
How deep does sequencing need to be to accurately detect a variant present at 0.1%?
10,000 reads.
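As a rough check on this figure: at 10,000 reads, a variant at 0.1% is expected in about 10 reads. The sketch below is only illustrative. It assumes reads are sampled independently (a binomial model) and ignores sequencing error; the function names and the threshold of 5 supporting reads are assumptions, not part of these notes.

```python
from math import comb

def expected_mutant_reads(depth, vaf):
    """Expected number of reads carrying the variant at a given depth."""
    return depth * vaf

def prob_at_least(depth, vaf, min_reads):
    """Binomial probability of seeing at least min_reads variant reads."""
    # Sum the probabilities of seeing 0 .. min_reads-1 variant reads,
    # then take the complement.
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_reads)
    )
    return 1 - p_fewer

# Hypothetical calling rule: require at least 5 supporting reads.
print(expected_mutant_reads(10_000, 0.001))
print(prob_at_least(10_000, 0.001, 5))  # high at 10,000x
print(prob_at_least(1_000, 0.001, 5))   # very low at 1,000x
```

In practice the error rate of the platform also limits what can be called, which is why the tagging methods in the next card matter.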
What processes can be used to improve accuracy of NGS for detecting low-level mutations?
Exogenous spike-ins (tags all reads originating from a single source with one tag - they should all be the same)
Duplex sequencing - used for mitochondrial variants
Tagged amplicon sequencing.
List some protein based methods of genetic investigation.
Western blotting MALDI-TOF Mass spectrometry Immunoprecipitation assays IHC
Describe the process of Western blotting
Lyse cells
Run components on a polyacrylamide gel
Stain protein to check it’s worked
Transfer to membrane
Block background using BSA
Hybridise antibodies specific to target antigen (direct or indirect)
Induce reaction allowing detection - chemiluminescence, radiolabelling, peroxidase reaction.
How are proteins separated during Western blotting?
By molecular weight
What modifications can be made to separate further?
Run a 2D blot
Step 1: run gel in a pH gradient to separate by charge/size
Step 2: rotate 90 degrees in acidic buffer and separate by weight
What considerations need to be made during Western blotting?
Run buffer type - some can change the protein structure and prevent antibodies binding their epitopes.
Describe the principle of immunoprecipitation. What can it be used for?
Use antibodies bound to beads to ‘capture’ proteins of interest.
Beads are magnetic so can be pulled out of solution for analysis
Protein eluted from antibody
Ready for Western blot, ELISA etc.
Uses: investigate protein half life in BRCA1/2 to determine effect of missense mutations
How can ChIP be used to study gene regulation?
Antibodies to bind chromatin associated proteins, selecting them for analysis, then can study DNA to which they were bound.
What are the limitations of IHC?
No sequence-level information
Need biopsy/sample on which to perform test
Need pathologist to prepare and interpret result
Cross reactivity of antibodies
Not quantitative
Low throughput
Truncated proteins or abnormal proteins with intact epitopes are not detected.
Give some examples of when IHC is used in clinical practice.
Tumour diagnosis and classification - expression of HER2 in breast tumours
Detection/absence of proteins in biopsies, e.g. dystrophinopathies
Identify loss of MMR protein in Lynch syndrome - can direct gene sequencing.
What is tandem mass spectrometry and give an example of when it can be used.
Two MS machines, in tandem - only selected target material passes to second. It can be used to screen for metabolite markers in metabolic disease, e.g. MCAD, PKU.
Name two types of protein microarray
Those that use protein capture agents, e.g. antibodies
Those that use protein-protein interactions to capture target sequence.
Why is proteomic data useful?
Provides comprehensive picture of the localisation, quantity, expression, modification of proteins in the patient.
How can proteomic data be used?
Identification of new genes, biomarkers, signalling pathways, drug targets.
List some RNA-based mutation detection techniques. State how they can be used.
RT-PCR - fusion gene detection, monitoring MRD
RNA-Seq - can detect novel transcripts, doesn’t need known targets or specific probes. New transcript discovery.
RNA expression arrays - tissue profiling. Need to know target sequences.
Minigene assays (insert exon into a vector and culture in HEK293T cells) - e.g. insert an exon of PMP33 containing a missense variant into the vector, grow, harvest and purify RNA, amplify by RT-PCR, run on a gel and see if there’s a difference in size between WT and mutant.
Northern blotting - gene expression in a small number of genes.
List PCR-based NGS enrichment methods
Ampliseq LR-PCR TruSeq Access arrays ddPCR/Raindance Multiplicom
List hybridisation methods of NGS target enrichment
Sureselect
Haloplex
Trusight
Describe a general hybridisation enrichment capture
DNA fragmented (sheared by sonication or digestion)
Bound to capture probes in solution or on an array
Non-target sequence removed
Adapters/barcodes added and target sequences amplified
What additional prep is needed for Illumina sequencing?
Bridge amplification of target sequences to produce foci for sequencing.
List some differences between Illumina chemistry and Ion Torrent chemistry.
Ion Torrent measures H+ release in individual reaction chambers.
Illumina measures light emission in the established foci on the flow cell.
Ion Torrent chemistry requires the addition of a single base at a time.
Illumina chemistry labels each nucleotide a different colour and therefore all bases can be added in a single reaction.
Illumina chemistry uses reversible terminators - a fluorescently labelled nucleotide with a blocked 3-prime OH is incorporated, fluorescence is emitted and measured, the slide is washed, the label and the 3-prime block are cleaved, and a new batch of nucleotides is added.
List some applications of NGS
WGS
WES
High throughput amplicon sequencing
Monitoring MRD
Tumour profiling
Prenatal diagnosis - NIPD, NIPT, PAGE
ChIP-seq - study chromatin states, epigenetics etc. - allows direct sequencing (no hybridization required) - used for the ENCODE project
RNA-seq - transcriptomics, gene expression - RNA is converted to cDNA library for sequencing, can be assembled without a reference sequence. Library construction can be a challenge as some lncRNAs need to be fragmented first - this can introduce bias.
List some potential causes of error in NGS data
Base calling errors
Alignment errors/mis-mapping
Low coverage
What is the phred score used for?
To determine the probability that the call is a true positive.
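For reference, the standard Phred definition is Q = -10 x log10(P), where P is the probability that the base call is wrong. A minimal sketch of the conversion in both directions (the function names are illustrative, not from any particular toolkit):

```python
import math

def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

for q in (10, 20, 30):
    p = phred_to_error_prob(q)
    print(f"Q{q}: error probability {p:.4f}, accuracy {1 - p:.2%}")
```

So Q20 corresponds to a 1 in 100 chance of a wrong call and Q30 to 1 in 1,000.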
For accurate detection of variants using NGS what is required?
Correct alignment against the correct reference genome - determined by mapping quality
Correct variant calling - determined by phred score
Coverage depth is sufficient
Check for allele bias
Check for strand bias
Set required sensitivity of assay - setting this too low will result in a lot of false positive calls, but too high may mean variants are missed.
What is the difference between single-end and paired-end reads?
Single-end reads sequence each fragment from one end only. Paired-end reads sequence each fragment from both ends, one after the other. This improves alignment accuracy.
Name three different 3rd generation sequencing approaches - give examples of each.
Single Molecule Real Time Sequencing- PacBio
Nanopore sequencing - Oxford Nanopore, Graphene Nanopore
Synthetic long read sequencing - 10X/Chromium
How does SMRT work?
Monitors polymerase addition of each base into the extending sequence.
dNTPs are fluorescently labelled.
Fragments are processed and adapters added
Fragments bind to hairpin loop and DNA circularises
Use zero-mode waveguide (ZMW) to guide detector to point of nucleotide incorporation into the strand. The DNA polymerase is bound to the ZMW.
Circularised DNA can be fed through the DNA pol multiple times to enhance sequencing accuracy.
How does Nanopore sequencing work?
DNA prepared and a hairpin loop incorporated. This DNA is linked to a helicase to unzip the strands.
ssDNA is pulled through a pore (biological or synthetic) by electrical current.
The ions in the buffer solution are pulled through the pore, and the presence of nucleotides changes their flow - the direct composition of the sequence is detected - multiple bases at a time. Slowing the passage through the pore and using shorter pores can improve accuracy.
How does synthetic long read sequencing work?
Take a long DNA fragment (~10kb)
Create an emulsion
Fragment each droplet further
Add barcodes - fragments with the same barcode lay close together in the original sequence
Sequence using standard methods
Can be low throughput - Chromium kit can now handle 4 million barcodes
What are the disadvantages of SMRT sequencing?
Limited throughput
Expensive
Need a massive machine.
What machines are available for Oxford nanopore sequencing?
MinION
PromethION
What factors need to be taken into consideration when designing an NGS panel?
Sample type
Sample concentration
Cost
Sensitivity/Specificity
Throughput
Choice of genes
Choice of transcripts
Read depth/mosaicism
Type of variant
Data volume
Data storage
Need pipeline adjustments?
What factors should be taken into account when planning delivery of a new service?
TAT
Validation - needs to show repeatability, identify range of expected mutations
QC and NEQAS
Variant confirmation
Report format - what to include? Technical reports and Scientific reports.
Pricing
What changes in the patient pathway may be needed alongside a new test?
Changes in counselling - secondary findings, more VUSs, may be no clinical utility of finding a disease-causing variant
Pre-test counselling - discussion of testing methods and possible results.
Post-test counselling - reanalysis of data and how to communicate new findings, how to handle findings related to a future disease
Truly informed consent may be difficult to get.
What factors should be taken into consideration when assessing VUSs?
Literature reports
Population datasets
LSDBs
Disease databases
Inheritance/segregation - use SISA analysis to determine if likely pathogenic
Type of mutation - LoF or GoF
In-silico analyses
Gene content
Size of variant
Cosegregation with disease
Co-occurrence with a known pathogenic mutation in dominant disorders
RNA studies
Functional analysis
IHC
Enzyme analysis - or measure other metabolites
What are the limitations of external databases?
Data accuracy
Data curation
Keeping the dataset up to date
Patient Confidentiality
Cost
Intellectual property on unpublished information
Amount of data - some have lots, some have none.
Variants can be in normal population datasets if disease is late onset, has low penetrance, or a variable phenotype.
Population-specific data; lacking minority datasets
List some useful population databases
ExAC
gnomAD
EVS
dbSNP
dbVar
1000G
UK10K
DGV
List some useful LSDBs
BIC
InSIGHT
LOVD
HVP
IPNN
Retina International
List some useful disease databases
DECIPHER
ClinVar
HGMD
dMuDB
OMIM
Orphanet
NIH Genetics Home Reference
ECARUCA
List some useful Oncology-specific databases
COSMIC
Mitelman
Atlas of genetics and cytogenetics in Cancer
What are the three types of cell culture systems available?
Direct preparation - no culturing performed
Suspension - cells in media in tubes/flasks
In-situ - cells cultured on a coverslip and colonies form
What supplements can be added to the culture medium to promote growth?
L-glutamine
Foetal calf serum
Antibiotics
Antifungals
What mitogens are available to stimulate cell growth? Which cell types do they stimulate?
PHA - T cell stimulation
PWM - B cell and T-cell stimulation
LPS - B-cell stimulation (used for patients with chronic lymphoproliferative disorders)
TPA - B cell stimulation
How long are Bone Marrow cultures routinely cultured for?
12/48 hours
How long are AF/CVS samples cultured for?
5-7 days, undisturbed
What additional step needs to be performed prior to BM culture?
White Cell Count
Name three Chromosome Breakage syndromes. State which gene is mutated in each.
Fanconi Anaemia - 15 genes
Ataxia Telangiectasia - ATM
Bloom syndrome - BLM
Xeroderma Pigmentosum - multigenic
What chemicals/treatments can be used to highlight chromosome breakage in culture?
MMC or DEB - they crosslink the DNA and enhance the rate of breaks. This can be used to diagnose FA or AT.
UV-light - induces increased SCE in Bloom syndrome. Detected by adding BrdU to culture and G-banding.
How can the identification of a balanced translocation lead to new gene discovery?
Gene interruption (DMD, SOTOS)
Result in submicroscopic deletions/duplications (CHARGE syndrome)
Gene removed from cis-acting elements by balanced translocation (PAX6 in Aniridia)
Gene translocated next to an enhancer (BCR-ABL1, t(9;22)(q34;q11))
How can recurrent deletions/duplications lead to gene identification?
Microdeletions can narrow down a region linked to disease. Study genes in these regions to identify possible cause. - Miller Dieker (17p13.3), LIS1
What are the two types of linkage analysis? What is the difference?
Parametric (aka LOD score analysis) - needs large family, good disease model, but prone to errors and needs multiple polymorphic markers. Not good for common disease
Non-parametric - A strong genetic model is not required. Used for analysis of sib-pairs
Define autozygosity mapping
Tracks a region of homozygosity through a pedigree.
Used to identify genes in smaller cohorts of patients, particularly those with consanguinity or in populations where there may be a founder effect.
Large number of SNPs are needed.
Uses the inbreeding coefficient.
How were CF and DMD genes identified?
Positional cloning.
What two methods can be used for positional cloning? Which is better?
Chromosome walking
Chromosome jumping (better)
How can WES be used to identify new disease genes?
Characterising monogenic disorders - new disease genes
Characterising complex traits - GWAS
Tumour vs Normal characterisation - new drug targets, new biomarkers, new prognostic markers.
What are confidence intervals? How are they calculated?
They define the range of values that you are 95% confident the true value lies in.
Calculated by sample mean +/-1.96 x standard error.
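A worked sketch of the formula above, taking the standard error as the sample standard deviation divided by the square root of n. The replicate values are made up for illustration, and 1.96 is the normal approximation; for very small samples a t-distribution multiplier would normally be used instead.

```python
import math
import statistics

def confidence_interval_95(values):
    """95% CI for the mean: sample mean +/- 1.96 x standard error."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error = sample standard deviation / sqrt(n)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    margin = 1.96 * standard_error
    return mean - margin, mean + margin

# Hypothetical replicate measurements (illustrative only)
measurements = [24.1, 23.8, 24.5, 24.0, 24.3, 23.9]
low, high = confidence_interval_95(measurements)
print(f"95% CI: {low:.2f} to {high:.2f}")
```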
What are odds ratios? How are they calculated? What do the following results mean?
- OR >1
- OR<1
- OR = 1
Used to assess association between exposure and outcome.
OR is calculated by multiplying (outcome+/exposure+) x (outcome-/exposure-) and dividing by (outcome+/exposure-) x (outcome-/exposure+).
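The calculation can be sketched from a 2x2 table as below. The counts are invented for illustration and the function and parameter names are assumptions. The interpretation strings give the usual reading of OR > 1, OR < 1 and OR = 1.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """(outcome+/exposure+ x outcome-/exposure-) / (outcome+/exposure- x outcome-/exposure+)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

def interpret(or_value):
    """Usual reading of an odds ratio (ignores confidence intervals)."""
    if or_value > 1:
        return "exposure associated with higher odds of the outcome"
    if or_value < 1:
        return "exposure associated with lower odds of the outcome"
    return "no association between exposure and outcome"

# Hypothetical counts: 30 exposed cases, 10 unexposed cases,
# 20 exposed controls, 40 unexposed controls.
value = odds_ratio(30, 10, 20, 40)
print(value, "-", interpret(value))
```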
Define sensitivity. How is it calculated?
The likelihood that a mutation will be detected if present. The True Positive rate.
TP/(TP+FN)
Define specificity. How is it calculated?
The true negative rate - the chance that someone negative will have a negative result.
TN/(TN+FP)
Define PPV. How is it calculated?
The proportion of positives that are true positives.
TP/(TP+FP)
Define NPV. How is it calculated?
The proportion of negatives that are true negative.
TN/(TN+FN).
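The four formulas above can be computed together from the counts of true/false positives and negatives. A minimal sketch with invented validation counts; the function name is illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # proportion of positives that are true
        "npv": tn / (tn + fn),          # proportion of negatives that are true
    }

# Hypothetical validation set: 100 mutation-positive and 100 negative samples
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and specificity are properties of the test, whereas PPV and NPV also depend on how common the mutation is in the tested group.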
What are the boundaries for DQ?
DQ = 0 homozygous deletion
0.4