Session 2: Laboratory techniques and bioinformatics Flashcards

1
Q

describe the ARMS process?

A

paired PCR reaction involving common primer and allele specific primers
• The allele specific primers differ in their 3’ end nucleotides. Amplification does not occur if there is a mismatch, allowing different alleles to be distinguished (by size or by different fluorescence)
useful for screening large number of samples for known panel of mutations

2
Q

describe advantages of ARMS

A

quick, cheap, simple, sensitive, detects SNVs, ins/dels, can multiplex, doesn’t require specialized equipment

3
Q

describe disadvantages of ARMS

A

unable to detect rare/unknown variants, cross-reactivity e.g. CF p.(Phe508del)/p.(Phe508Cys), NAFNAP, often needs commercial kit, confirmation via another method often necessary, maternal cell contamination (MCC) in prenatal samples

4
Q

describe restriction enzyme digest

A

• Restriction enzymes make double-stranded (ds) breaks in DNA at specific recognition sites
• Can be used when a base substitution creates or abolishes a recognition site of a restriction endonuclease
- fragment amplified, product is digested with the relevant restriction enzyme and the products separated by gel or capillary electrophoresis
• Variation includes methylation specific restriction digests eg. Methylation-Sensitive Restriction Enzymes are not able to cut methylated-cytosine residues
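
The create-or-abolish logic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both amplicon sequences are invented, EcoRI (site GAATTC, cut after the first G) is used purely as a familiar example, and `digest` is a hypothetical helper name rather than part of any library.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the position inside the site where the strand is cut
    (1 for EcoRI: G^AATTC).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]


# Invented 40 bp amplicon; the variant changes one base and abolishes the site
wild_type = "ACGTACGTACGTGAATTCACGTACGTACGTACGTACGTAC"
variant = "ACGTACGTACGTGAATCCACGTACGTACGTACGTACGTAC"

wt_fragments = digest(wild_type, "GAATTC", 1)   # site present: two fragments
var_fragments = digest(variant, "GAATTC", 1)    # site abolished: one uncut fragment
print(wt_fragments, var_fragments)
```

A heterozygote would show both patterns at once (cut and uncut bands), which is why partial digestion can be misread as heterozygosity.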

5
Q

what are the advantages of restriction enzyme digests?

A

cheap, simple, quick, no specialist equipment

6
Q

what are the disadvantages of restriction enzyme digests?

A

NAFNAP, only confident of exact variant if it creates a site, if it destroys a site then we don't know the exact change, need restriction enzyme to be available for site of interest, enzyme may be expensive or poor quality if rare site, partial or over-digestion can affect interpretation, non-specific activity, heteroduplexes (fragment re-anneals to non-complementary fragment eg. G:T) will not cut

7
Q

what is FRET (Fluorescence resonance energy transfer) hybridisation?

A

fluorogenic minor groove binder (MGB) probes for real-time PCR are specific for SNVs.
Polymerisation of a new DNA strand is initiated from the primers, and once the polymerase reaches the probe, its 5’-3’ exonuclease activity degrades the probe, physically separating the fluorescent reporter from the quencher, resulting in an increase in fluorescence which is detected and measured in the real-time PCR machine
eg. JAK2 V617F mutation in myeloid disease

8
Q

what are advantages of FRET?

A

highly sensitive, highly stable, can detect very low levels of mutant DNA in a background of wild-type genomic DNA, useful for large sample numbers

9
Q

what are disadvantages of FRET?

A

costly, no multiplexing, pH-sensitive

10
Q

describe droplet digital PCR (ddPCR)?

A

sample fractionated into thousands of droplets, each ideally containing at most a single DNA template, run PCR using WT and mutant TaqMan probes, software then reads + and - droplets distinguishing WT from mutant.
eg. BRAF V600E and EGFR mutation testing
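
How counts of + and - droplets become a quantitative result can be sketched as follows. This assumes the standard Poisson correction used in digital PCR (some droplets hold more than one template, so positives are not counted one-for-one). The droplet counts are invented and the function names are hypothetical.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per droplet."""
    if not 0 <= positive < total:
        raise ValueError("need at least one negative droplet")
    return -math.log((total - positive) / total)


def fractional_abundance(mut_positive: int, wt_positive: int, total: int) -> float:
    """Mutant copies as a fraction of all (mutant + wild-type) copies."""
    mut = copies_per_droplet(mut_positive, total)
    wt = copies_per_droplet(wt_positive, total)
    if mut + wt == 0:
        raise ValueError("no positive droplets")
    return mut / (mut + wt)


# Invented run: 20,000 accepted droplets, 100 mutant-positive, 5,000 WT-positive
fa = fractional_abundance(mut_positive=100, wt_positive=5000, total=20000)
print(f"mutant fractional abundance: {fa:.2%}")
```

Dividing the copies per droplet by the droplet volume would give a concentration; the volume is instrument-specific and is left out here.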

11
Q

what are advantages of Droplet digital (ddPCR)?

A

detects low level mutations, quantitative

12
Q

what are disadvantages of Droplet digital (ddPCR) for FFPE testing?

A

DNA obtained from tumour blocks is often of poor quality and can result in poor amplification
- Accuracy of the results depends on the quality of sample (e.g. containing a high percentage of tumour cells).
- Fixation of the sample causes DNA damage and can result in PCR artefacts.

13
Q

what is pyrosequencing?

A

“sequencing by synthesis” principle in which polymerase extends the DNA one dNTP at a time. When a dNTP is added to an open 3’ DNA strand, pyrophosphate is released. A cocktail of enzymes is used in pyrosequencing which couples this pyrophosphate to light emission by luciferase. amount of light is proportional to number of bases incorporated eg. Mitochondrial point mutation analysis in MELAS, MERRF, NARP and Leber’s Hereditary Optic Neuropathy (LHON)

14
Q

advantages of pyrosequencing?

A

Quick and cheap
Detects low level – quantifiable down to ~5% variant level
Detect heteroplasmy
Can be used to detect methylation status

15
Q

disadvantages of pyrosequencing?

A

Short length of sequence is sequenced
Data can be complex to genotype depending on type of variant analysis required
SNPs can affect primer binding sites

16
Q

what is the function of the centromere?

A

chromosome segregation - Microtubules of the spindle attach to the centromere via the kinetochore. acentric chromosomes fail to attach to spindle and are lost from the cell

17
Q

describe the structure of the centromere?

A

constitutive heterochromatin consisting of repetitive satellite DNA. pericentric heterochromatin facilitates sister chromatid cohesion

18
Q

describe the 2 groups of centromeric proteins (CENPs)?

A

1) constitutively associated with the centromere such as CENPA, CENPB and CENPC, which are thought to have structural roles in kinetochore formation
2) passenger proteins associate transiently throughout the cell cycle

19
Q

which diseases are associated with centromere malfunction?

A
  • Premature centromere division (PCD) – age-dependent phenomenon occurring in women. may result in increased X chromosome aneuploidy
  • Premature chromatid separation (PCS) - separate chromatids with no discernible centromere. no known phenotypic effect
  • Roberts syndrome (chr breakage) - LOF mutations in ESCO2 cause delayed cell division and increased cell death. growth retardation, limb malformations (reduction), craniofacial anomalies (microcephaly, clefting), intellectual disability and renal and cardiac abnormalities
20
Q

what is the kinetochore?

A

multiprotein complex that assembles on centromere acting as point of contact for spindle fibres. inner kinetochore tightly associated with centromere DNA. outer kinetochore interacts with microtubules. a pair of kinetochores (one per sister chromatid) appears on each chromosome in late prophase.

21
Q

what is a neocentromere?

A

new centromere that forms in a non-standard location as a result of disruption to the natural centromere. they lack repetitive α satellite DNA sequences.

22
Q

what are telomeres?

A

highly conserved gene-poor DNA-protein complexes that cap the ends of chromosomes. if lost the chr is unstable and can fuse with other chrs. prevent shortening > cell death. aid chr pairing.

23
Q

describe the structure of telomeres

A

consists of tandem repeats associated with telomere-binding proteins. next to these are telomere associated repeats (unknown function). also has ss-DNA 3’ overhang which protects chr end, particularly in replication of lagging strand (back-stitching synthesis creates Okazaki fragments)

24
Q

what is telomerase?

A

RNA-protein enzyme that extends synthesis of leading strand to use as template for lagging strand. has 2 subunits: protein subunit and RNA subunit containing a hexanucleotide sequence complementary to the telomere repeat. telomeres in cells that lack telomerase shorten progressively > cell death and ageing

25
Q

what is cri du chat?

A

5p deletion. features include cat-like cry, microcephaly, distinct facies and palmar creases. Deleted region includes the hTERT gene – telomerase reverse transcriptase, which helps maintain telomere ends

26
Q

what is the Nucleolar organizing region (NOR)

A

on short arm of acrocentrics. contains rRNA genes 5.8S, 18S and 28S. responsible for organising the nucleolus structure and contain the approx. 200 rRNA genes necessary for protein synthesis, if transcriptionally active it stains dark with Ag-NOR staining

27
Q

what info is required about the gene for a UV (unclassified variant) investigation?

A
  • does patient phenotype fit gene?
  • MOI
  • mutational mechanism (eg. haploinsufficient, dominant neg?)
  • protein structure & function
  • strength of the gene-disease relationship
28
Q

how do you carry out a SNV investigation?

A
  • check pop freq (beware of later onset & lower penetrance)
  • in silico (splice, conservation, AA substitution),
  • lit/database search - previously reported variants, functional data, hotspots,
  • segregation data (watch out for LD, phenocopies & penetrance, non-paternity)
  • de novo - test parents, literature
  • allelic data - in trans with dominant pathogenic or in cis with recessive = benign / in trans with pathogenic for recessive = pathogenic
  • phenotype- MDT (assess variant in context of phenotype data)/referral info is specific
  • ACMG/ACGS guidelines - evolving through ClinGen & monthly WebEx meetings

pathogenic = >99% disease-causing
likely path = >90%
class 3 (subdivided)
likely benign < 10%
benign <0.1%

  • 4 & 5s clinically actionable
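
The probability bands above can be written as a small lookup. This is only a sketch: in practice a class is reached by combining evidence criteria, not by plugging in a single measured probability, and the handling of values that fall exactly on a boundary is an assumption made here. Class 5 = pathogenic and class 4 = likely pathogenic follows the usual five-class convention.

```python
def classify_variant(prob_pathogenic: float) -> str:
    """Map a probability of pathogenicity (0-1) to the five classes listed above."""
    if not 0.0 <= prob_pathogenic <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if prob_pathogenic > 0.99:
        return "class 5: pathogenic"
    if prob_pathogenic > 0.90:
        return "class 4: likely pathogenic"
    if prob_pathogenic < 0.001:
        return "class 1: benign"
    if prob_pathogenic < 0.10:
        return "class 2: likely benign"
    return "class 3: uncertain significance"


def clinically_actionable(classification: str) -> bool:
    """Classes 4 and 5 are the clinically actionable ones."""
    return classification.startswith(("class 4", "class 5"))


for p in (0.995, 0.95, 0.5, 0.05, 0.0005):
    label = classify_variant(p)
    print(p, label, clinically_actionable(label))
```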
29
Q

how can RNA studies help UV investigation and describe limitations?

A
  • looks at splicing effects by studying cDNA generated from mRNA
    considerations:
  • normal isoforms may complicate results
  • sample type - is the RNA expressed in the blood?
  • quality - RNA degrades quickly
  • a premature termination codon (PTC) can cause nonsense-mediated decay (NMD) of the mRNA, so the product may be absent or too low to sequence. biallelic expression of the variant rules out NMD

mini-gene assays can overcome some issues

30
Q

what is LOH in tumour tissue & how does it influence pathogenicity investigation?
what limitations are there to this?

A

loss of the normal allele in tumour tissue (usually due to a large deletion), leaving the variant hemizygous, suggests the variant is pathogenic
- if the variant allele is lost this suggests it is benign

  • be aware of variant in cis (ie. variant studied is not the causal variant)
  • presence of normal tissue may skew results
31
Q

what is NGS library preparation?

A

fragmenting starting material and ligating adapters and indices to allow sequencing.

32
Q

what is NGS enrichment?

A

Enrichment is needed to capture regions of interest for single genes, panels and exomes but not WGS. It may be amplicon (PCR) based or hybridisation based

33
Q

Describe amplicon (PCR-based) enrichment for NGS?

A

eg. Illumina Nextera XT: transposomes randomly cleave ds-DNA and ligate adaptor oligos with different sequences to the 5’ ends. A limited-cycle PCR then adds indexes and full adapter sequences to the fragmented DNA for sequencing.

eg. Qiagen - reduces bias by integrating unique molecular indices (UMI). genomic DNA is fragmented and ligated with UMI and adapter. Target enrichment performed by targeted PCR with gene specific primer and universal primer to the adapter. universal PCR amplifies the library. after sequencing, reads with the same UMI are PCR duplicates and are collapsed, which helps identify artefacts and supports CNV analysis
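
The UMI step can be sketched as grouping reads by (UMI, position) and keeping one consensus per group. This is a simplified illustration, not the vendor's pipeline: the reads are invented, real tools also tolerate sequencing errors within the UMI itself, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Collapse PCR duplicates.

    reads: iterable of (umi, start_position, sequence) tuples.
    Returns {(umi, start_position): consensus_sequence}, where the consensus
    is the most common sequence in each UMI family.
    """
    families = defaultdict(list)
    for umi, pos, seq in reads:
        families[(umi, pos)].append(seq)
    return {key: Counter(seqs).most_common(1)[0][0] for key, seqs in families.items()}


# Invented reads: three copies of one original molecule (one carrying a PCR error)
# plus a single read from a second molecule
reads = [
    ("AACG", 100, "ACGT"),
    ("AACG", 100, "ACGT"),
    ("AACG", 100, "ACTT"),  # minority sequence within the family: likely artefact
    ("GGTA", 100, "ACGT"),
]
consensus = collapse_by_umi(reads)
print(len(consensus), "unique molecules:", consensus)
```

Counting unique molecules, not raw reads, is what makes depth usable for copy number, since amplification bias no longer inflates the count.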

34
Q

describe hybridisation-based enrichment for NGS?

A

eg. Agilent SureSelect - fragment DNA, tag with adaptors and barcodes and capture libraries with RNA or DNA-based oligos. oligos anneal to specific regions of genome. hybridise sample with biotinylated RNA library baits and select target region by magnetic streptavidin beads. amplify and sequence

eg. Agilent HaloPlex - restriction digest, anneal ds-biotinylated oligos and capture with streptavidin coated magnetic beads. PCR with common primers generates library of enriched fragments

35
Q

what are the advantages of amplicon (PCR) based enrichment for NGS?

A

cheap
low quantity needed
faster
useful for smaller regions
suitable for FFPE

36
Q

what are the disadvantages of amplicon (PCR) based enrichment for NGS?

A
  • preferential amplification leads to non-uniform enrichment
  • artefacts
  • difficult to multiplex primers to study larger targets
  • sometimes unnecessary introns included
  • no read depth for CNV
  • not scalable
37
Q

what are the advantages of hybridisation-based enrichment for NGS?

A
  • more uniform coverage
  • better coverage of GC-rich regions
  • PCR duplicates easily removed, reducing artefacts
  • read depth can be used for CNV
  • suitable for FFPE
38
Q

what are the disadvantages of hybridisation-based enrichment for NGS?

A

- high quantity of DNA required
- higher cost
- longer prep time
- difficult to distinguish pseudogenes

39
Q

what is the main difference between second and 3rd generation NGS platforms?

A

2nd generation platforms utilize amplification step prior to sequencing library molecules unlike single-molecule sequencing performed by 3rd generation platforms

40
Q

describe the general 2nd generation NGS process?

A

sequencing platform uses a series of automatically coordinated, repeating chemical reactions typically carried out in a flow cell or compartment which houses the immobilized templates and necessary reagents. Most platforms (with the exception of SOLiD) use ‘sequencing by synthesis’ - a repeated cyclical process which occurs within the flow cell and consists of nucleotide addition, washing and signal detection.

41
Q

what are advantages of WGS compared to WES?

A
  • SNVs, indels, SVs and CNVs in coding and non-coding regions ~3.5 million variants (WES omits promoters and enhancers & limited to coding and splice variants ~20 000 variants)
  • more uniform coverage
  • easier to capture low sequence complexity
  • PCR amplification not required, reducing GC bias
  • not limited by sequencing read length (WES needs smaller target probes)
  • no reference bias (WES preferentially enriches reference alleles at het sites, producing false calls)
  • WGS captures everything whereas WES is limited to current targeted genes
  • WGS suitable for complex trait gene identification as well as sporadic phenotypes caused by de novo variants (WES suitable for highly penetrant mendelian disease gene identification)
42
Q

what are advantages of WES compared to WGS?

A
  • cheaper, less storage and analysis costs
  • targets relevant regions - only small proportion of WGS data is clinically relevant
  • reduced costs mean more samples sequenced
  • WES is good for mendelian disorders which present with atypical manifestations and are difficult to confirm with previous genetic testing (eg. heterogeneity or a long list of candidate genes)
43
Q

what is targeted NGS?

A

used for disease-specific targeted tests for hereditary disorders and therapeutic decision making. Uses gene panels which are specific to certain disease types eg. clinical exome or mendeliome or custom-designed panels. only known genes with an established phenotype are included. can be used for tumour profiling, MRD (can see emergence of clones and allelic ratios), microbiology (disease outbreak, resistance, screening), NIPT and NIPD.

44
Q

what are the advantages of targeted NGS over WES/WGS?

A
  • cheaper
  • better coverage as gaps can be filled by Sanger sequencing
  • can fully interpret every variant whereas many WES/WGS variants are filtered
  • less chance of incidental findings
45
Q

what is the main disadvantage of targeted NGS over WES/WGS?

A
  • inflexible, need to redesign capture for new targets
46
Q

what is a virtual exome and what are the advantages?

A

sequencing an exome and masking all but the desired data. reduces incidental findings, gives flexible analysis and addition of genes at no extra cost. can analyse primary genes first, then broader analysis if negative

47
Q

what are the disadvantages of a virtual exome?

A

coverage
depth is sacrificed for breadth

48
Q

what is ChIP-seq (chromatin immunoprecipitation followed by sequencing)?

A

copy number changes, for single nucleotide polymorphism (SNP) genotyping, but can also be used to study DNA methylation, alternative splicing miRNAs and protein-DNA interactions

technique for genome-wide profiling of DNA-binding proteins, histone modifications or nucleosomes used for studying transcriptional regulation and epigenetic mechanisms.

It is an in-vivo protein-DNA binding assay where antibodies are used to select specific proteins or nucleosomes which enriches for DNA fragments that bind. These DNA fragments are sequenced directly.

49
Q

what is the transcriptome?

A

the complete set of transcripts in a cell and their quantity at a specific developmental stage or physiological condition

50
Q

what is transcriptomics?

A

cataloguing all species of RNA transcript, including mRNAs, non-coding RNAs and small RNAs to determine the transcriptional structure of genes, start sites, 5’ and 3’ ends, splicing patterns and other post-transcriptional modifications and to quantify changing expression levels of each transcript during development and under different conditions. This may be done by hybridisation techniques (incubating fluorescently labelled cDNA with microarrays) or by sequencing cDNA libraries

51
Q

how does RNA sequencing work?

A

RNA > cDNA with adaptors at one or both ends. Each molecule is sequenced and reads are aligned to reference to produce transcription map that includes transcriptional structure and/or level of expression of each gene

52
Q

what are the advantages of RNA sequencing?

A
  • reveals precise location of transcription boundaries
  • gives info on connectivity between different exons
  • reveals SNPs in transcribed regions
  • highly accurate for quantifying expression levels, as confirmed by qPCR
  • less sample required as no amplification step in some technologies and no cloning step
  • can identify novel transcribed regions
  • high throughput and quantitative and lower cost than large scale Sanger sequencing
  • can identify novel gene fusions in cancer and subtyping
53
Q

what are challenges of RNA-sequencing?

A
  • larger RNA molecules must be fragmented to <500bp to be compatible with deep-sequencing technologies which may introduce bias
  • may need strand-specific libraries which yield information about the orientation of transcripts valuable for transcriptome annotation especially for regions with overlapping transcription from opposite directions
54
Q

how is NGS used for ct-DNA?

A

mutations on ct-DNA can act as a cancer biomarker to identify cancer patients from a group of healthy individuals. more sensitive than tissue biopsy. eg. SEPT9 methylation has been approved by FDA for blood-based screening test for CRC. NGS of ctDNA can also be used for treatment selection, prognosis and MRD monitoring

55
Q

how is NGS used for HLA typing?

A

knowledge of an individual's polymorphisms in the HLA region is essential for organ and stem-cell transplantation

56
Q

what is the purpose of the new CNV (2019) guidelines?

A

the guidelines introduce a semi-quantitative, evidence-based framework with a 5-tier classification system with separate schemes for gains and losses. The aim was to provide consistent, evidence-based clinical classification across labs.
Pathogenic 0.99 or higher
Likely pathogenic 0.90 to 0.98
Uncertain significance -0.89 to 0.89
Likely benign -0.90 to -0.98
Benign -0.99 or lower
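
The score bands above translate directly into a lookup on the summed evidence points. In this sketch the thresholds are the ones listed on this card, the example point values are invented, and rounding to two decimal places is an assumption added to avoid floating-point noise when summing.

```python
def classify_cnv(total_score: float) -> str:
    """Map a summed evidence score to the five tiers listed above."""
    score = round(total_score, 2)  # guard against floating-point noise from summing
    if score >= 0.99:
        return "pathogenic"
    if score >= 0.90:
        return "likely pathogenic"
    if score > -0.90:
        return "uncertain significance"
    if score > -0.99:
        return "likely benign"
    return "benign"


# Invented evidence points for a single copy number loss
evidence_points = [0.45, 0.30, 0.15]
tier = classify_cnv(sum(evidence_points))
print(tier)
```

Gains and losses are scored with separate schemes, so the evidence list would be built from whichever scheme applies before this final step.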

57
Q

what is involved in CNV assessment?

A

CNV genomic content: size; overlap with established triplosensitive (TS), haploinsufficient (HI) or benign genes/genomic regions; gene number and content; population frequency; evaluation of literature, public databases, internal lab data; inheritance pattern/family history for patient being studied

58
Q

give examples of recurrent CNV syndromes caused by low-copy repeats which may misalign during recombination at meiosis

A

Prader-Willi / Angelman syndromes 15q11q13
DiGeorge/velocardiofacial syndrome 22q11.21 deletion including TBX1
Williams syndrome 7q11.23 deletion including ELN
Smith-Magenis syndrome 17p11.2 deletion including RAI1
Miller-Dieker syndrome 17p13.3 deletion including PAFAH1B1 and YWHAE

59
Q

why should number of protein-coding genes in a CNV be assessed?

A

• CNVs containing no protein-coding genes or other known functionally important elements are more likely to be benign.
• a small CNV in a gene-rich region could have a larger impact than a much larger CNV in a gene-poor region. ACMG guidelines suggest adding evidence if ≥25 genes for copy number loss and ≥35 genes for copy number gains. Consideration should be given for clusters of genes where a relatively large number of genes may be present with little phenotype impact e.g. olfactory receptor gene clusters

60
Q

why might pathogenic CNVs also be present at low frequency in control populations?

A

recessive inheritance, incomplete penetrance, parent of origin effect, late onset phenotypes and affected individuals in poorly phenotyped control cohorts

61
Q

name sources for population frequency of CNVs

A

DGV - The DGV Gold is a curated set of variants from selected high resolution and high quality studies in DGV, clustering variants and removing low-resolution studies with high false-positive rates
gnomAD SV
In-house laboratory database of normal control samples - can be useful to identify common variants in a local population and/or technical artefacts that may be platform specific.

62
Q

what kinds of evidence might aid CNV interpretation?

A
  • multiple unrelated patients with a similar CNV and a consistent phenotype (more likely pathogenic)
  • de novo supports pathogenicity. Additional weight is given if the phenotype is highly specific and relatively unique to the gene/genomic region.
  • Segregation amongst similarly affected family members supports pathogenicity
    • Non-segregation argues against pathogenicity, EXCEPT where it can be explained by incomplete penetrance, phenocopies (variation caused by environment that resembles a genetic variation), inaccurate phenotyping of family members, age of onset or inheritance pattern
  • for a CNV identified in recessive gene - look for SNV on other allele
63
Q

what is haploinsufficiency? how can it be found for a gene?

A

a single normal allele at a locus (after loss of the other allele) is inadequate for normal function, resulting in a phenotype. Regions of copy number loss containing established HI genes are more likely to be pathogenic.
HI info available on ClinGen, DECIPHER, knockout mouse models, literature, ClinVar, HGMD, OMIM
pLI available on gnomAD

64
Q

what are intergenic deletions and could they be pathogenic?

A

do not affect the coding sequence of a gene. may be pathogenic if they disrupt regulation of expression of nearby genes. An intergenic deletion in close proximity to a known OMIM morbid gene with a relevant phenotype may warrant further investigation.
the significance of intragenic deletions that affect the 5’ or 3’ UTR, or lie entirely within an intron, is less clear, but it is possible that such deletions could affect RNA transcription, splicing, stability, or translation.

65
Q

what is Triplosensitivity (TS)? how is it assessed?

A

where an additional copy of a gene/genomic region results in a phenotype. Fewer genes appear to exhibit TS than HI and therefore copy number gains are more likely to result in a milder phenotype or to be benign than the reciprocal deletion. assessed through ClinGen, literature and database searching for evidence of TS.

66
Q

why might 5’ or 3’ duplications be pathogenic?

A

may disrupt regulatory elements

67
Q

why is it important to determine the source of a homozygous dup?

A

Homozygous duplications manifest as a copy number of 4, which could also represent a heterozygous triplication. Parental studies are therefore necessary to determine whether the imbalance is monoallelic or biallelic. A biallelic gene-disrupting duplication can be causative of a recessive disorder, whereas a monoallelic triplication cannot be associated with a recessive disorder but may cause disease through a different mechanism.

68
Q

why is parental testing important for CNV analysis?

A

If a variant detected in a patient is found to be inherited from an unaffected parent, it is unlikely to be pathogenic through a fully-penetrant, autosomal dominant inheritance pattern – but incomplete penetrance, recessive inheritance, parent-of-origin dependent effects etc are not excluded. De novo origin of a variant supports its pathogenicity but is not by itself sufficient to class a variant as pathogenic, particularly if relationships have not been confirmed. X-linked variants inherited from carrier mothers can be tested for in male relatives to help determine clinical significance.

69
Q

should CNV carrier status be reported?

A

Reporting of carrier status for recessive conditions (i.e. unrelated to the reason for testing) is at the discretion of the laboratory. However, carrier status for X-linked recessive conditions in females should be reported due to her own future reproductive risk, the implications for other family members, as well as the possibility that the patient may be a manifesting carrier.

70
Q

how is a reciprocal unbalanced translocation identified by array and what follow up is required?

A

terminal deletion of one chromosome and a terminal duplication of a different chromosome is suggestive of an unbalanced product of a parental balanced translocation. This can be confirmed by G-band analysis in the proband, or FISH if the translocated segment is too small to be reliably detected by G-band.

71
Q

what is an isochromosome?

A

Isochromosomes are recognizable by copy number gain of one chromosome arm. For example, Pallister Killian syndrome is caused by mosaic isochromosome of 12p, resulting in mosaic tetrasomy of 12p.

72
Q

when should a CNV be reported?

A

report CNVs that:
• account for patient’s phenotype
• have other implications for the patient
• have implications for the patient’s family

a VUS should be reported if it may be pathogenic and further investigations may clarify pathogenicity

73
Q

when should a PRENATAL CNV be reported if not related to referral reason?

A

• Report high penetrance neuro-susceptibility loci that are associated with a risk of a severe phenotype
• Report neuro-susceptibility loci associated with an increased incidence of anomalies detectable on scan, as reporting these may help direct further scanning
• Report deletions of cancer susceptibility genes.
• Report a DMD deletion in a female fetus.
• Do not report low penetrance neuro-susceptibility loci eg 15q11.2 deletions.
• Do not report heterozygous deletions associated with recessive conditions unrelated to the fetal anomalies.
• Do not report unsolicited pathogenic variants for which there is no available intervention.
• Do not report variants of uncertain significance that cannot be linked to a potential phenotype on the basis of genes involved.

74
Q

how is DNA fragmented before sizing if it is not being amplified by PCR?

A

restriction enzyme digest, sonication, transposases (Transposases fragment DNA by cleaving and inserting a short double-stranded oligonucleotide to the ends of the newly cleaved molecule. The inserted oligonucleotide must contain a sequence that is specific to the particular transposase being used. While this method is fast and has low input requirements, the known sequence bias associated with transposases makes them incompatible with some applications.)

75
Q

what are issues with PCR amplification before sizing?

A
  • allele drop-out (SNPs)
  • secondary structures
  • preferential amplification
  • lack of allele heterozygosity

may hinder estimation of fragment sizes

76
Q

describe how gel electrophoresis is used to size DNA fragments?

A
  • DNA loaded onto gel and current applied
  • negatively charged DNA migrates towards the positive electrode (anode) and is separated by size
  • afterwards, DNA stained with an intercalating dye (eg. ethidium bromide) can be visualized under UV light
77
Q

how is capillary electrophoresis used to size DNA fragments?

A
  • DNA denatured and ssDNA migrates through charged capillary containing polyacrylamide gel
  • rate of migration is dependent on the size of the fragment and requires an internal size standard to be run for each sample
  • Applications for this include MLPA, genotyping such as QF-PCR for the analysis of aneuploidies and microsatellite analysis of tumour samples (HNPCC / Lynch syndrome), and Sanger sequencing
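
The role of the internal size standard can be illustrated with a simple interpolation: each sample peak is sized from the two standard peaks that bracket it. Straight-line interpolation is a simplification chosen for this sketch (analysis software typically fits a more elaborate curve), and the migration times and standard sizes below are invented.

```python
from bisect import bisect_left


def size_from_migration(peak_time: float, standard: list[tuple[float, float]]) -> float:
    """Estimate fragment size (bp) from migration time.

    standard: (migration_time, size_bp) pairs for the internal size standard,
    sorted by migration time.
    """
    times = [t for t, _ in standard]
    if not times[0] <= peak_time <= times[-1]:
        raise ValueError("peak lies outside the range of the size standard")
    i = max(1, bisect_left(times, peak_time))
    (t0, s0), (t1, s1) = standard[i - 1], standard[i]
    return s0 + (s1 - s0) * (peak_time - t0) / (t1 - t0)


# Invented size standard run in the same capillary as the sample
standard = [(1000.0, 100.0), (1500.0, 150.0), (2000.0, 200.0), (3000.0, 300.0)]
allele_size = size_from_migration(1750.0, standard)
print(f"estimated allele size: {allele_size:.1f} bp")
```

Because the standard is run with every sample, capillary-to-capillary differences in migration cancel out, which is the reason it is required per sample.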
78
Q

describe fluorescent PCR for sizing? what are its limitations?

A
  • PCR with one primer with a fluorescent tag
  • Products are analysed by capillary electrophoresis
    • able to resolve products 1 bp apart
    • limited by size of fragment able to be amplified by PCR ~ 5kb
    • Preferential amplification of smaller fragments means that large alleles may not be detected when present with smaller ones
79
Q

describe long range PCR for sizing?

A
  • larger fragments >5kb
  • additives can be used to overcome high GC eg. DMSO which destabilises secondary structure and weakens base pairing so primers can bind
  • Traditionally, long range PCR has been performed using a blend of Taq DNA polymerase (for fast elongation) combined with a small amount of proofreading polymerase eg. Pfu (for accuracy). The proofreading enzyme repairs DNA mismatches incorporated at the 3’ end of the growing strand, allowing Taq polymerase to continue to elongate the DNA much further than it would otherwise, resulting in longer DNA amplification.
80
Q

describe TP-PCR for sizing? what are disadvantages?

A

  • used to detect expansions that are too big to amplify with conventional PCR
  • three primers: one binds flanking the repeat, one binds to the repeat, and one is a reverse primer
  • amplifies from multiple priming sites within the repeat giving rise to a mixture of products
  • 5’ end of reverse primer is complementary to repeat primer
  • Large expansions (>100 repeats) are not accurately sized, however this method will still show if an allele is in the affected range
81
Q

describe southern blotting for sizing?

A
  • large amounts of DNA are subjected to restriction digest to isolate fragments of interest (can be a double digest with a methylation-sensitive enzyme)
  • digested DNA undergoes electrophoresis and is then denatured
  • DNA transferred to a membrane (usually positively charged nylon) by blotting
  • labelled probe (radioactive or chemiluminescent) hybridised to DNA
  • wash to remove unbound probe
  • radiolabelled blots visualised by autoradiography
82
Q

describe how MLPA works?

A
  • DNA denatured
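  • hybridised with probe pairs, which are ligated only if both halves bind adjacent to each other on the target. if the target is deleted the probes cannot bind and no ligation happens; if it is duplicated, more ligated product is formed
  • ligated probes amplified and separated based on size of stuffer fragment
  • amount of ligated probe is proportional to copy number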
  • compared to control probes to indicate copy number
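The comparison to control probes can be sketched as a dosage quotient calculation. This is a minimal illustration, not any vendor's algorithm: the probe names and peak heights are hypothetical, and the 0.7 / 1.3 cut-offs are assumed for illustration only.

```python
def dosage_quotients(patient, reference, control_probes):
    """Return {probe: dosage quotient} for the non-control probes.

    Each peak is divided by the mean control-probe peak in the same sample,
    then the patient value is divided by the reference-sample value.
    """
    def normalise(sample):
        control_mean = sum(sample[p] for p in control_probes) / len(control_probes)
        return {p: height / control_mean for p, height in sample.items()}

    pat, ref = normalise(patient), normalise(reference)
    return {p: pat[p] / ref[p] for p in patient if p not in control_probes}

def call(dq, low=0.7, high=1.3):
    """Classify a dosage quotient; the cut-offs are illustrative assumptions."""
    if dq < low:
        return "deletion"
    if dq > high:
        return "duplication"
    return "normal"

# Hypothetical peak heights
reference = {"ctrl1": 1000, "ctrl2": 1200, "exon1": 800, "exon2": 900}
patient = {"ctrl1": 500, "ctrl2": 600, "exon1": 200, "exon2": 675}

for probe, dq in dosage_quotients(patient, reference, ["ctrl1", "ctrl2"]).items():
    print(probe, round(dq, 2), call(dq))
```

A heterozygous deletion is expected to give a quotient near 0.5 and a heterozygous duplication near 1.5, relative to 1.0 for two copies.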
83
Q

what is reverse transcriptase MLPA?

A
  • used for mRNA expression profiling
    reverse transcriptase creates cDNA from RNA and MLPA continues as usual using cDNA
84
Q

What are the 4 NGS methods of detecting CNVs?

A
  1. read-pair - able to identify almost all types of SVs but it is unable to detect the exact breakpoints. accuracy of RP methods is largely dependent on the insert size. poor performance for dups
  2. split reads - detect the exact breakpoints of SVs >1 kb. poor performance for dups
  3. read depth - RD is more reliable for regions with deletions and duplications and can also count the number of CNVs but difficult to identify the exact breakpoints in RD. enriched in segmental duplications
  4. Assembly- poor performance for dups
85
Q

what is a benefit of using NGS for CNV instead of MLPA/array?
what is the disadvantage?

A

MLPA /array is costly (array) and time-consuming and only a subset of genes tested (MLPA)

NGS is high resolution, genome wide, provides positional info, detects UPD and LOH, high throughput, detects balanced and unbalanced rearrangements

detection of large rearrangements such as copy-number variants (CNV) from NGS data is still challenging due to issues intrinsic to the technology including short read lengths and GC-content bias. need to confirm CNVs. the challenge is to identify a tool able to detect CNVs from NGS panel data at a single-exon resolution with sufficient sensitivity to be used as a screening step in a diagnostic setting

86
Q

what two things does a SNP array look for?

A
  • copy number
  • presence or absence of an allele eg. UPD or LOH
87
Q

what are the general principles of a SNP array?

A
  • array with immobilised oligos that are allele-specific targeted to SNPs
  • fragmented target DNA
  • hybridisation signal detection system measures signal intensity of the probe following hybridisation to the target sequence- depends on amount of target (CNV) and affinity between DNA and probe (less affinity if SNPs present).
    • The fluorescence emitted is dependent on which alleles are present in the patient at the SNP site targeted by the probe
  • In contrast to aCGH, in which CNV calling is done by direct comparison with a control sample, SNP arrays use in silico based methods for copy number calling. Copy number changes are calculated by comparison of signal intensity for a probe to that of a set of in silico reference samples within the analysis software
88
Q

what is the theory behind the B-allele SNP array chart?

A
  • B-allele frequency = B/(A+B)
  • AA hom: 0/(0+2) = 0
  • AB het: 1/(1+1) = 0.5
  • BB hom: 2/(2+0) = 1

duplication of B (ABB) = 2/(2+1) = 0.66
duplication of A (AAB) = 1/(1+2) = 0.33

deletion of B (A only) = 0/(0+1) = 0
deletion of A (B only) = 1/(1+0) = 1

mosaic cases have data values that lean more towards the middle 0.5 as there are more normal cells
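The B-allele frequency arithmetic can be sketched in a few lines. The function names are hypothetical, and the mosaic model is a simplifying assumption (a simple mixture of two cell populations weighted by cell fraction).

```python
def b_allele_frequency(genotype):
    """B-allele frequency = B / (A + B) for a genotype string such as 'AAB'."""
    a, b = genotype.count("A"), genotype.count("B")
    if a + b == 0:
        raise ValueError("no alleles present: BAF is undefined")
    return b / (a + b)

def mosaic_baf(normal, abnormal, abnormal_fraction):
    """Expected BAF for a mixture of a normal and an abnormal cell population."""
    w_normal, w_abnormal = 1 - abnormal_fraction, abnormal_fraction
    b = w_normal * normal.count("B") + w_abnormal * abnormal.count("B")
    total = w_normal * len(normal) + w_abnormal * len(abnormal)
    return b / total

for genotype in ["AA", "AB", "BB", "ABB", "AAB", "A", "B"]:
    print(genotype, round(b_allele_frequency(genotype), 2))

# 50% of cells carry a duplication of A on an AB background
print(round(mosaic_baf("AB", "AAB", 0.5), 2))
```

The mosaic value falls between the non-mosaic duplication value (0.33) and the normal heterozygous value (0.5), which is the "leaning towards the middle" described above. ABB evaluates to 2/3, which the card writes as 0.66.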

89
Q

is SNP array able to detect triploidy?

A

yes

90
Q

what ratios will SNP array give for triploidy?

A

ABB = 0.66 AAB = 0.33 AAA = 0 and BBB = 1

91
Q

can snp array detect MCC and non amplification?

A

yes - the logR ratios are skewed and the AB (0.5) cluster on the B-allele frequency chart is diluted out.

As the proportion of maternal cells present increases, the greater the divergence away from the expected 1.0, 0.5 and 0.0 values on the chart will be for all chromosomes. MCC % can be calculated.

92
Q

can nullisomy be detected with SNP array?

A

yes - seen as a large drop in the LogR ratio and so spots/features appear randomly between 0.0 and 1.0

93
Q

how is UPD detected on SNP array?

A

isodisomy -both copies inherited from one parent so every SNP is homozygous. looks the same as a deletion in the B-allele frequency chart but no copy number change.

heterodisomy - both homologues from same parent. detected when you have a trio for comparison of mat and pat alleles. mixture of AA AB and BB alleles eg. dad has BB mum has AA and proband has AA but copy number is normal. also used to identify sample mix-up

94
Q

what might Copy number neutral LOH indicate on a SNP array?

A
  • isodisomy UPD
  • consanguinity - identical by descent. some software calculates the % LOH across the genome giving an indication of familial relationships but this is usually switched off. It may be useful for malignancy arrays.
95
Q

how does SNP array coverage compare to aCGH? what can be done to compensate?

what are other limitation of SNP compared to aCGH?

A
  • SNPs do not occur evenly across genome leading to uneven coverage and so combined SNP and oligo arrays can be used where standard oligos are added to the SNP design to fill gaps and increase coverage

-SNP less accurate for determining CNVs, parental samples required for heterodisomy and lower signal to noise ratio

96
Q

what is the first-line test for dev delay/dysmorphism/ID and congenital abnormalities in most uk labs?

what is the diagnostic yield?

A

microarray. (aCGH and SNP) - Although the probes in these arrays are distributed across the genome, coverage is variable and highest in genes linked to developmental delay and known microdeletion/duplication regions.
long stretches of homozygosity may be useful to unmask recessive disease but further testing is needed to confirm a homozygous mutation in the suspected gene.

diagnostic yield is between 5-20%

97
Q

what are the considerations for prenatal SNP array testing?

A

in-house experience, minimum false positive calls, fast TAT, internal and external database access, extensive literature searching and communication with clinical geneticists. combined SNP/oligoarray platform provides extra reassurance and SNP arrays do not require sex matched controls and so the sex of the prenatal sample is not needed,

98
Q

how are SNP array and CNV analysis useful for tumour studies?

A

LOH - used to differentiate small cell lung cancer from non-small cell lung cancer
CNV gain - 8q TPD52 prostate cancer overexpression androgen regulated gene
combined LOH & CNV - AML 20% have UPD
methylation - restriction enzymes can be applied before hybridisation and compare DNA sequences between methylation sensitive enzyme samples
allele-specific gene expression - using cDNA instead of genomic DNA as starting material to look at allele specific gene expression which is implicated in tumourigenesis

99
Q

what are the limitations of SNP arrays?

A
  • unable to detect balanced rearrangements/gene fusions/inversions
  • mosaicism not reliable <30% so not good for MRD detection or specific nucleotide mutations
  • coverage limited to where there is SNP variation
100
Q

how does array CGH work?

A

collection of probes attached to solid surface. each probe is a known sequence to which complementary DNA binds. patient and control DNA are fluorescently labelled and compete to hybridise to the array. Once bound, non-specific DNA is washed away. scanner measures fluorescence - this allows quantification of sequence within a sample.
Differences in Cy3 and Cy5 fluorescence for each spot will indicate loss or gain of material in the respective chromosomal region.
The log2 ratios of the test DNA (for example, Cy5) divided by the reference DNA (Cy3) are then plotted against the chromosome position.
can be used for DNA or RNA.
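The log2 ratio can be illustrated with idealised intensities. This sketch assumes signal is exactly proportional to copy number, which real arrays only approximate; the function name is hypothetical.

```python
import math

def log2_ratio(test_intensity, reference_intensity):
    """log2 of test (e.g. Cy5) signal divided by reference (e.g. Cy3) signal."""
    return math.log2(test_intensity / reference_intensity)

# Idealised signals proportional to copy number, against a 2-copy reference
print(log2_ratio(2, 2))            # two copies in the test sample
print(log2_ratio(1, 2))            # heterozygous deletion
print(round(log2_ratio(3, 2), 2))  # heterozygous duplication
```

Under this idealisation a single-copy loss sits at -1 and a single-copy gain at about +0.58, so losses are easier to separate from baseline than gains.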

101
Q

what are different arrays used for?

A
  • aCGH - Determine the copy number of genomic regions across the genome for CNV detection
  • SNP - genotype multiple genomic loci
  • expression profiling - measure expression levels of multiple genes to assess gene activity eg. at different time points in embryonic development
  • methylation profiling - epigenetics
  • exon array - detect different splicing isoforms. probes are located within exons for measuring gene expression or detecting gene fusions or located at exon junctions to detect different splicing isoforms.
  • fusion gene microarray - detect fusion transcripts
102
Q

what are the first and second line tests for prenatal ab scan and pregnancy loss?

A
  • rapid aneuploidy testing
  • array
103
Q

what are the advantages of an oligo array?

A
  • multiple patients on one slide
  • Genome-wide test - not targeted
  • reducing cost
  • high resolution (50 - 200kb), sensitivity, specificity
  • better reproducibility and less batch-to-batch variation than BAC arrays
  • Multiple consecutive probes indicating the same copy-number change are required to determine a gain or loss. This enhances the accuracy of the interpretation
  • Oligonucleotide based arrays offer more flexibility in terms of probe selection, which facilitates higher probe density and customisation of array content
  • Accuracy of copy number variant detection is higher than in Next Generation Sequencing-based assays, including Whole Genome Sequencing
  • can be custom designed or purchased from vendors with a pre-determined probe coverage eg. ISCA 8 x 60K array which has probes targeted in disease causing regions and the remaining probes spread across the genome to give an overall resolution of 70 kb.
104
Q

what are the disadvantages of an oligo array?

A
  • cannot detect balanced translocations or inversions
  • poor for mosaicism <30%
  • cannot provide structural or positional information - may need karyotype or FISH follow-up
  • small CNVs not detected
  • cannot detect sequence changes
  • large number of probes needed for accuracy
  • Poor signal to noise ratio due to small probe size, which can result in a significant number of false-positive outliers
  • Cannot detect UPD or LOH or triploidy
  • marker chromosomes may be missed depending on size, composition and array coverage for the region on the marker
  • variants of uncertain clinical significance difficult to interpret
  • In prenatal diagnosis, more expensive and slower than QF-PCR
105
Q

how does an expression array work?

A
  • RNA reverse transcribed to form cDNA which is labelled and hybridised to an array
  • can have different RNA populations examined on same chip
  • may compare cancer cDNA to healthy cDNA to see which genes are over or under expressed in the cancer cells
106
Q

how are expression arrays used in tumour profiling?

A
  • Expression profiling of tumours using microarrays has been used as an alternative approach, to identify genes that are upregulated or downregulated compared to the normal tissue, and may be used to predict clinical outcome or response to a given treatment
  • miRNAs expression is abnormal in cancer cells and can be used to discriminate different tumour types
107
Q

what methods can be used to explore epigenetic changes in tumours?

A

ChIP - chromatin immunoprecipitation. allows researchers to examine the interactions between epigenetic regulators and DNA in their natural context. treat with formaldehyde to covalently link protein to DNA. cross-linked chromatin is isolated and fragmented and an antibody is used to precipitate the protein of interest (immunoprecipitation) with DNA. To identify attached DNA fragments, cross-links are reversed and DNA fragments are labelled fluorescently and hybridised to array. allows identification of protein binding sites that help identify functional elements in genome eg. transcription factor binding sites

bisulphite modification - C> U except methylated cytosine. CpG islands in promoters often hypermethylated in cancer genomes. probes on microarray hybridise to specifically either converted or unconverted sequence to understand if promoter is hypermethylated.

108
Q

what is quantitative real-time PCR?

A
  • technique used to quantitate levels of DNA or RNA. Continually measures the amount of PCR product during a PCR by means of fluorescence (eg. SYBR Green for ds-DNA or sequence-specific fluorescent probe with attached quencher which is released - eg. TaqMan)
    Key point: During the exponential phase of the reaction, the amount of product is directly proportional to amount of template
  • The number of PCR cycles required to reach a set fluorescence threshold is inversely related to the amount of starting material (Ct falls linearly as the log of the starting amount rises)
    Cycle threshold (Ct) = this is the number of cycles required for the fluorescent signal to be detected above the background/baseline level, after which an exponential increase is seen, and quantification can occur
  • A lower number of cycles to reach the fluorescence threshold correlates to a higher amount of starting material
  • standards of known concentration are run on the plate with the unknowns in order to create a standard curve
  • a calibrator sample is run on each qPCR plate and expression levels are given proportional to the calibrator sample
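The standard-curve step can be sketched as a straight-line fit of Ct against log10 of the known concentrations, then reading unknowns off that line. The dilution series and Ct values below are hypothetical, and the function names are made up for illustration.

```python
import math

def fit_standard_curve(concentrations, cts):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def quantify(ct, slope, intercept):
    """Read an unknown's starting quantity off the standard curve."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical 10-fold dilution series (copies per reaction) and measured Ct values
standards = [1e5, 1e4, 1e3, 1e2]
standard_cts = [18.0, 21.32, 24.64, 27.96]

slope, intercept = fit_standard_curve(standards, standard_cts)
efficiency = 10 ** (-1 / slope) - 1  # 1.0 would mean product doubles every cycle
print(round(slope, 2), round(intercept, 2), round(efficiency, 2))
print(round(quantify(22.98, slope, intercept)))
```

The negative slope encodes the point made above: a lower Ct corresponds to more starting material. A slope near -3.32 corresponds to roughly 100% amplification efficiency.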
109
Q

what are the applications for quantitative real-time PCR?

A

Used in monitoring of minimal residual disease (MRD) with rt-cDNA, Counting bacterial, viral, or fungal loads, SNV detection with melt curve analysis and SNP genotyping with labelling two probes with different fluorophores. confirm copy number changes from microarray

110
Q

how do sequence-specific DNA probes labelled with a fluorescent reporter (eg. TaqMan) work in quantitative PCR?

A
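  • Probe is labelled with a fluorescent reporter at 5’ end and a quencher at 3’ end, and is non-extendable at 3’ end
  • Reporter (5’) emits wavelength absorbed by quencher (3’ end) while the probe is intact
  • DNA polymerase extends primer moving towards the probe
  • probe is degraded by the 5’ to 3’ exonuclease activity of the polymerase, reporter is released and emits fluorescence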
111
Q

what are applications for low level mutation detection?

A
  • somatic mutations in tumour samples which have WT and tumour DNA
  • early cancer detection in ctDNA eg. KRAS in lung, CRC and adenocarcinoma
  • MRD after surgery or radiochemo & emergence of resistance
  • disease staging and molecular profiling for prognosis/therapy
  • NIPD cffDNA <6%
  • heteroplasmic mutations in mtDNA where mutant mtDNA is only a proportion of total mtDNA
112
Q

what is Enrichment of low-level mutations?

A

the process that increases mutant allele concentration relative to wt alleles. may be for known or unknown mutants. known is easier to design for.

113
Q

by what 3 methods do Allele-Specific Amplification (ASA) methodologies preferentially amplify the known variant? give examples

A
  • destroying or blocking WT allele eg. restriction enzyme PCR cuts WT allele and amplifies mutant allele; RFLP: primer binds to WT and introduces an RE site during PCR and so it is digested and smaller than mutant products when separated using electrophoresis
  • preferential amplification of KNOWN mutant allele eg. ARMS - primer has 3’ end that matches mutant but not WT allele; TaqMan RT-PCR - probe binds specifically to mutant and quencher is separated from reporter.
  • spatially separating variant from WT eg. digital PCR - DNA diluted into multi-well plates and fluorescent PCR performed from single templates. individual wells are analysed for the presence of PCR products or mutant & WT sequences using fluorescent probes. can detect known mutations with allele specific fluorescent probe and unknown mutations where NGS is used to sequence products.
114
Q

what methods can be used to enrich for unknown mutations?

A
  • cold PCR - full: induces formation of heteroduplexes after denaturation (mutant + WT bind) and by using a lower denaturation temperature, heteroduplexes denature first. amplification of homoduplexes is suppressed.
    fast: selectively denatures only variant sequences to be amplified,
    improved and complete: oligo complementary to sense strand of WT has 3’ non-extendable phosphate and so PCR of WT is inhibited and only the variant sequence is amplified
  • NGS but beware of preferential amplification, FFPE and sonication errors, polymerase mistakes such as FoSTeS, sequencing errors in platforms, need sufficient read depth. molecular identifiers or barcodes can be used to trace back the strands of origin for variant detected.
115
Q

what is the purpose of a western blot?

A
  • confirm presence of protein
  • give protein level
  • assess purity
  • estimate relative molecular mass
116
Q

what is the procedure for western blot?

A
  • extract protein
  • gel electrophoresis
  • blotting
  • use antibody to detect antigen
  • visualization
117
Q

what is immunoprecipitation?

A

used to enrich or purify a specific protein (or a group of proteins) from a complex sample using an antibody immobilized on a solid support (usually agarose resin beads).

118
Q

what is IHC?

A

Method for localising specific antigens (commonly proteins) in tissues based on antibody-antigen binding. This interaction is typically visualised using an antibody conjugated to an enzyme (e.g. peroxidase) that catalyses a colour-producing reaction (detectable via light microscopy), marking the sites of antibody binding. Provides information on the presence or absence and localisation of proteins and tissue structure/cellularity

119
Q

what is Mass spectrometry (MS)?

A

Mass spectrometry (MS) measures the mass-to-charge ratio (m/z) of one or more molecules present in a sample (and calculates the exact molecular weight of sample components). It can be used to identify unknown compounds (via molecular weight determination), to quantify known compounds, and to determine structure and chemical properties of molecules.

120
Q

what are advantages of IHC?

A

fast and provides positional info

121
Q

what are disadvantages of IHC?

A
  • cross-reactivity leading to false positive results
  • variability in fixation and processing
  • not quantitative
  • not high throughput (low level of automation possible)
  • need pathologist expertise
  • cannot distinguish truncated or abnormal proteins that retain an intact epitope from normal protein
122
Q

give examples for the use of IHC?

A
  • overexpression of HER2 protein in breast tumour predicts response to Herceptin
  • detection of DMD protein in dystrophinopathy
  • MMR protein detection in Lynch syndrome and loss of MMR staining as evidence for mutation pathogenicity
  • EGFR detection in lung adenocarcinoma - good for low tumour cell content but false positives and negs
123
Q

how does RNA differ from DNA?

A
  • U instead of T
  • single stranded
  • ribose replaces deoxyribose as the sugar
124
Q

what 3 forms of RNA are there?

A

mRNA, tRNA and rRNA

125
Q

what is northern blotting used for?
what are pros and cons?

A
  • identify splicing variants
  • gene expression levels
  • analysis of new RNA species
  • ncRNAs

eg. looking for over expression of certain genes in tumour cells

PROS: assess size and abundance of RNA - microarrays cannot do this. looks for high or over-expression of genes
CONS: only single transcripts
not high throughput

126
Q

what is the process of northern blot?

A
  1. isolate RNA
  2. gel electrophoresis
  3. transfer to membrane
  4. detection with hybridised probe (often cDNA)
127
Q

describe the process of RNA sequencing?

PROS and CONS?

A
  • isolate RNA
  • break into short fragments
  • convert RNA fragments into dsDNA through reverse transcription
  • add adaptors to allow sequencer to recognise fragment and sequence multiple samples
  • pcr amplifies DNA
  • QC- verify concentration and fragment lengths
  • PROS: high throughput, can detect SVs and gene fusions, quantitative view of gene expression, alternate splicing and allele-specific expression
  • CONS: high quality RNA difficult to isolate, particularly from FFPE
128
Q

what are the pros and cons of hybridisation-based microarray for RNA?

A

PROS: high throughput, low cost, reads gene expression compared to other samples

CONS: need prior knowledge of sequence, hybrid artefacts, very low or high gene expression genes are difficult to detect

129
Q

what are pros and cons of quantitative RT-PCR for RNA analysis?

A

PROS: can use mRNA or total RNA (total RNA best for relative quantification of targets). mRNA more sensitive

CONS: different yields for different mRNAs

130
Q

what are pros and cons of sanger for RNA analysis?

A
  • PROS: fast, can directly determine transcript sequence
  • CONS: low throughput, not good for quantification of transcripts, can't measure splice isoforms and can't be used for novel gene discovery
131
Q

what are pros and cons of RNA-in situ hybridisation for RNA analysis?

A

PRO: can be performed on FFPE eg. detect ER and PR expression in breast cancer. fully automated, can view RNA expression in cells with cellular morphology and background intact

CON: not quantitative

132
Q

what is the principles of sanger sequencing?

A
  • uses ddNTPs which lack OH group on 3’ C and so extension is inhibited. each ddNTP is fluorescently labelled and fragments of varying length are produced.
  • chains are denatured and separated using electrophoresis
133
Q

what are cons of sanger sequencing?

A
  • lower sensitivity than NGS or qPCR
  • low level variants may be missed
134
Q

what are applications of sanger sequencing? what are pros and cons?

A
  • confirm NGS
  • gap filling
  • founder mutations
  • familial mutations
PROS:
  • generates longer reads than NGS for repetitive regions for repeat expansions
  • less reliant on computational tools than NGS
  • easier to score indels or pseudogenes
  • less space to store data than NGS

CONS: NAFNAP, poor sequence quality near primer binding site, NGS better at mosaicism - sanger = 15% threshold

135
Q

what is a phred score?

A

a quality score for each base call, reflecting the probability that the call is wrong. >30 is good

30 = 1/1000 error rate so 99.9% accuracy
20 = 1/100 99% accuracy
10 = 1/10 90% accuracy
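The values above follow from the standard Phred definition, Q = -10 x log10(P), where P is the probability that the base call is wrong. A minimal sketch with hypothetical function names:

```python
import math

def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def error_probability_to_phred(p):
    """Phred score for a given error probability."""
    return -10 * math.log10(p)

for q in (10, 20, 30):
    p = phred_to_error_probability(q)
    print(q, p, f"{(1 - p) * 100:.1f}% accuracy")
```

Each step of 10 in the score is a tenfold drop in the error rate.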

136
Q

what is a basic bioinformatic pipeline process?

A

quality control > alignment (data mapped to reference) > variant calling > annotation

137
Q

why is it important that reads are aligned correctly?

A
  • incorrect alignment leads to errors in variant detection and genotype calling. need to be able to cope with sequencing errors and real differences
138
Q

why is cluster density important?

A

• Low cluster density can give very high quality data but causes a lower depth of coverage. Higher cluster density gives a better depth of coverage but can lead to lower quality reads. If cluster density is too high, the clusters become difficult to read and data can be lost.

139
Q

what is a FASTQ file?

A

text-based format for storing both a nucleotide sequence and its corresponding quality scores.

This is generally the input for most bioinformatic pipelines.
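A FASTQ record is four lines: an `@` header, the sequence, a `+` separator and a quality string with one character per base. The sketch below assumes the common Phred+33 encoding (quality = ASCII code minus 33) and unwrapped records; the read shown is made up.

```python
import io

def read_fastq(handle):
    """Yield (read_id, sequence, qualities) from an unwrapped FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        # Phred+33: subtract 33 from each character's ASCII code
        yield header[1:], sequence, [ord(c) - 33 for c in quality]

# Hypothetical single-read file
example = "@read1\nGATTACA\n+\nIIII5+!\n"
for read_id, seq, quals in read_fastq(io.StringIO(example)):
    print(read_id, seq, quals)
```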

140
Q

what is a BED file?

A

text file format used to store genomic regions as coordinates
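A BED line is tab-separated: chromosome, start, end, then optional fields such as a name. Starts are 0-based and ends are exclusive, so the region length is end minus start. The coordinates and region name below are made up for illustration.

```python
def parse_bed_line(line):
    """Parse the first fields of a BED line into (chrom, start, end, name)."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    name = fields[3] if len(fields) > 3 else None
    return chrom, start, end, name

def region_length(start, end):
    """Length in bases of a 0-based, half-open interval."""
    return end - start

# Hypothetical region of interest
chrom, start, end, name = parse_bed_line("chr7\t1000\t1250\texample_roi\n")
print(chrom, start, end, name, region_length(start, end))
```

The 0-based start is a frequent source of off-by-one errors when converting to 1-based coordinates such as those used in HGVS or VCF.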

141
Q

what is a BAM file?

A

aligned/mapped reads and associated quality information

A BAM file (or Binary Alignment Map) is a binary format for storing aligned sequence data. Once a set of FASTQs has been aligned to a reference genome using an alignment algorithm, the output is stored as a BAM file. These can be used in the analysis process to visualise variants or to check quality/coverage of an area, eg. by viewing BAM files in IGV.

142
Q

what is a CRAM file?

A
  • A very compressed version of a BAM file
143
Q

what is a BCL file?

A

base calls per cycle, a binary file containing base call and quality for each tile in each cycle. The raw file produced by Illumina platforms (other than MiSeqs). These must be converted into FASTQs for bioinformatic analysis.

144
Q

what things affect NGS alignment quality?

A
  • sequencing artefacts
  • poor quality reads
  • repetitive regions
  • homologous regions
145
Q

give examples of annotation in the bioinformatics pipeline?

A

gene symbols, the transcript exon numbers, HGVS nomenclature and the variant consequence

146
Q

what are quality steps used for in bioinformatics?

A
  • reject low quality reads
  • trim low quality bases
  • improve alignment accuracy
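Trimming low quality bases can be sketched as removing bases from the 3’ end until one meets a minimum quality. This is a deliberately simple illustration: the threshold of 20 is an assumption, and real trimming tools generally use more elaborate rules.

```python
def trim_3prime(sequence, qualities, min_quality=20):
    """Remove trailing bases whose Phred quality is below min_quality."""
    end = len(sequence)
    while end > 0 and qualities[end - 1] < min_quality:
        end -= 1
    return sequence[:end], qualities[:end]

def passes_filter(qualities, min_mean_quality=20):
    """Reject a read if it is empty or its mean quality is too low."""
    return bool(qualities) and sum(qualities) / len(qualities) >= min_mean_quality

seq, quals = trim_3prime("GATTACA", [40, 40, 40, 40, 20, 10, 0])
print(seq, quals, passes_filter(quals))
```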
147
Q

why are paired-end reads better for repetitive regions and structural rearrangements eg. insertions, deletions and inversions?

A

the distance between each paired read is known and alignment algorithms can use this info to map the reads over repetitive regions more precisely

148
Q

why is read length important?

A

if too short they will not accurately align.
Long reads good for structural variation, repetitive STRs and pseudogenes

Longer reads can provide more information about relative locations of specific base pairs. However, long read technology is expensive and is currently not common place in the NHS. Oxford nanopore long-read technology is becoming more affordable but currently has an error rate that would be considered too high in most diagnostic settings.

149
Q

how can you validate a bioinformatic pipeline?

A
  • assess sensitivity and specificity against genome in a bottle
  • at least 10 individuals (not genome in a bottle alone)
  • sensitivity > 0.95
  • 3 independent runs for reproducibility
  • all validation samples should be downsampled to test limit of detection eg. 20x, 30x
  • specificity >0.95
  • known sanger confirmed insertions, deletions and delins should be run through pipeline to assess complex variants
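Sensitivity and specificity come from comparing pipeline calls with a truth set: sensitivity = TP/(TP+FN) and specificity = TN/(TN+FP). The counts below are invented purely to show the arithmetic against the 0.95 thresholds.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of true variants that the pipeline called."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Fraction of non-variant positions correctly left uncalled."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical comparison against a truth set
sens = sensitivity(true_positives=485, false_negatives=15)
spec = specificity(true_negatives=9900, false_positives=100)
print(round(sens, 3), round(spec, 3), sens > 0.95 and spec > 0.95)
```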
150
Q

how would you select genes for a panel?

A
  • published evidence
  • diagnostic yield should be at least equal to sequential sanger
  • tenuous evidence genes will lead to more VUS, increase cost and TAT
  • balance between number of genes and coverage. fewer genes means higher coverage, increased confidence and low heterogeneity
  • commercial gene panels are available
  • many labs have virtual panels derived from clinical exome sequencing but need validation
  • PanelApp - crowdsourcing tool to allow panels including STRs, CNVs and genes to be downloaded which encourages standardisation based on expert knowledge. traffic light system to reflect quality/quantity of evidence
151
Q

what different types of target enrichment are available for NGS?

A
  • PCR based eg. RainDance
  • hybridisation based eg. Agilent SureSelect
  • Amplicon based - eg. Agilent HaloPlex
152
Q

how do you include transcripts in design of a new NGS panel?

A
  • Alamut contains transcripts that encompass all required exons for a gene
  • NGS validation should include justification for selecting a transcript. if 2 transcripts have something unique (eg. unique exons) both can be joined together in the BED file
  • LRG is universally accepted reference standard containing fixed section and updatable section where biological info can be updated
  • list of transcripts fed into software using BED file with ROIs. these are tiled with RNA baits.
  • ROIs checked on alamut to ensure they span exon +- 50 bases
  • pseudogenes may result in poorer tiling across some regions. During mapping of reads, more than one alignment usually results in both being discarded by mapping software and so may need to sanger-fill.
153
Q

what are the main steps of designing an ngs panel?

A
  1. target enrichment
  2. gene selection
  3. transcript selection
  4. design - ROIs tiled with RNA baits
  5. DNA quality checks
  6. barcoding samples - allows multiplexing which decreases cost
  7. virtual panels? sub-panel analysis eg. HCM within CM panel
  8. polymorphism list - gnomad data can be excluded. should be reviewed and updated
154
Q

describe validation for an NGS panel?

A
  • required on all aspects of the testing process including method, sequencing and analysis
  • need to understand technical weaknesses eg. homopolymer tract errors,
  • need to assess reproducibility and robustness eg. horizontal coverage, 3 independent runs for validation samples, run-to-run comparisons helps to determine level of multiplexing for adequate coverage, include positive controls. quality scores per base or read depth should be monitored
  • sensitivity: <5% error rate at 95% confidence, which requires 60 unique variants to be compared using the new method in an independent blinded analysis
  • validation should be documented in laboratory-controlled document system
  • UKGTN requires that new panels and addition of genes to existing panels should be validated using a ‘known normal control’ from the 1000 genomes project.
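The figure of 60 variants is consistent with the statistical "rule of three": if n variants are tested and none is missed, the upper 95% confidence bound on the error rate is roughly 3/n. The link to that rule is an interpretation rather than something stated on the card, and the function names are hypothetical.

```python
def rule_of_three_bound(n):
    """Approximate upper 95% confidence bound on error rate after 0 errors in n tests."""
    return 3 / n

def exact_upper_bound(n, confidence=0.95):
    """Exact one-sided upper bound on error rate after 0 errors in n tests."""
    return 1 - (1 - confidence) ** (1 / n)

for n in (30, 60, 300):
    print(n, round(rule_of_three_bound(n), 4), round(exact_upper_bound(n), 4))
```

With 60 concordant variants both bounds are at or below 5%; with only 30 they are not, so a smaller validation set cannot support the same claim.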
155
Q

describe IQC and EQA for panel validation?

A
  • IQC for each run - record QC metrics such as cluster density, number of reads and coverage
  • EQA - participate in GenQA
156
Q

do NGS variants require confirmation according to BPG?

A
  • in the absence of barcode scanning or SNP assay to assure sample identity, it is essential to confirm variants using a new DNA dilution with sanger or MLPA. this involves designing new primers but makes them available for familial testing.
157
Q

what should be included in an NGS report according to BPG?

A
  • ACGS standards,
  • sequence data and clinical info
  • HGVS reporting
  • diagnostic yield for negative reports
  • panel, reference sequence, OMIM#, splice & promoter ROI, method used including library prep, analyser and bioinformatics pipelines and software, coverage, VUS with clinical relevance, ?secondary findings according to local policy, dosage
158
Q

how can NGS costs be reduced?

A
  • batching patients (but decreases read depth)
  • barcoding
  • NGS instead of sequential sanger is cheaper for heterogenous disease in most cases
  • reanalysis with sub-panels is also cheaper than sanger
159
Q

what are challenges of NGS for counselling?

A
  • clinical utility?
  • VUS, incidental findings, variable penetrance, lack of literature
  • lack of data sharing (however CVA is good resource)
  • ethical issues for relatives
  • resources for resequencing, VUS follow-up, counselling and medical follow-up
  • detection rates need to be weighed against risks of VUS (50-100 het variants per patient)
  • more errors
  • previously pathogenic variants will need to be downgraded as true variants are found
  • risk estimates difficult for polygenic diseases
  • negative reports - but still useful to rule out a diagnosis
160
Q

what should pre-test genetic counselling involve?

A
  • information range gained from test including VUS and incidental findings
  • implication of the genes involved
  • who will be able to view the report (an issue in the USA)
  • whether they can choose not to know disease status but still take part in trio
  • truly informed consent with explicit statements for genetic handling
161
Q

what should post-test genetic counselling involve?

A
  • updates on mutations and treatments
162
Q

what is 3rd generation sequencing? how does it compare to 2nd generation?

A

sequencing of single DNA molecules without PCR amplification.
3rd gen generates reads of over 10 000 bp, giving higher resolution of large-scale structure, and is better at detecting structural variants

163
Q

what are the advantages of 3rd generation sequencing?

A
  • small amount of starting material
  • higher throughput - hundreds to millions of reactions carried out
  • lower cost per base
  • longer read lengths >10 000 bp giving better mapping, phasing, CNV detection, insertions, dels and translocations, novel alternate splicing isoforms, chimeric transcripts
    -de novo assembly (without ref sequence)
  • better for repetitive sequence
  • better for pseudogenes
  • more uniform coverage and less sensitive to GC-content
  • potential to detect epigenetic modifications such as methylation
164
Q

what are the 3 types of 3rd generation technologies?

A
  • sequencing by synthesis eg. Pacific Biosciences SMRT
  • nanopore
  • synthetic long read eg. Illumina TruSeq
165
Q

describe 3rd generation sequencing by synthesis and give examples? eg. Pacific Biosciences SMRT

A
  • reads the original DNA molecule directly rather than PCR-amplified copies of it

-eg. Single molecule real time (SMRT) sequencing:

  • single molecule template per well
  • polymerase incorporates fluorescently labelled nucleotides, which are visualized with a laser and camera

ADVANTAGES: fast, template sequenced multiple times, can detect methylated bases

DISADVANTAGES: expensive and limited throughput
determine large scale sequence structure of DNA without sequencing every base
Eg. FRET sequencing (life technology) fluorescence resonance energy transfer

166
Q

describe 3rd generation nanopore sequencing?

A
  • A single DNA molecule is threaded through a nanopore (biological or synthetic) and individual bases are detected as they pass through the nanopore
  • reads of up to 200 kb
  • each base alters the current to a different degree
167
Q

what is Synthetic Long Read 3rd gen sequencing? eg. Illumina

A
  • DNA is fragmented into large segments and then partitioned into microtitre wells so that very few molecules exist in each well
  • fragments are sheared and barcoded, then run on existing short-read sequencers eg. HiSeq
  • fragments with same barcode are generated from the same original large fragment and the sequence can be reassembled
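
The barcode step can be illustrated with a minimal Python sketch. The barcodes and sequences below are made up, and the function name is hypothetical; it only shows short reads being grouped by barcode so that each group can then be assembled back into its original long fragment (the assembly itself is not shown).

```python
from collections import defaultdict

# Hypothetical short reads as (barcode, sequence) pairs; values are illustrative only.
short_reads = [
    ("BC01", "ACGT"),
    ("BC02", "TTGA"),
    ("BC01", "GTCA"),
    ("BC02", "GACC"),
    ("BC01", "CAGG"),
]

def group_by_barcode(reads):
    """Collect short reads that share a barcode, i.e. came from the same well."""
    groups = defaultdict(list)
    for barcode, sequence in reads:
        groups[barcode].append(sequence)
    return dict(groups)

groups = group_by_barcode(short_reads)
for barcode, sequences in groups.items():
    # Each group would be passed to an assembler to rebuild one long fragment.
    print(barcode, sequences)
```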
168
Q

what is 3rd generation mapping?

A

determine the large-scale sequence structure of DNA without sequencing every base. eg. BioNano

optical mapping system using fluorescently tagged probes attached at “nicked” restriction digest sites to fingerprint long DNA molecules.

maps can be compared to a sequence assembly to construct scaffolds of how the sequences should be ordered and oriented along the chromosome, or compared to a reference genome to reveal structural changes, e.g. rearrangement/fusion of two chromosomes

169
Q

give examples of referrals for which a karyotype may be needed?

A
  • infertility, sex chromosome abnormality or prenatal trisomy detected on QF-PCR.
  • array follow-up where a derivative chromosome may be indicated
  • chromosome instability syndromes eg. Bloom syndrome, Fanconi anaemia, Ataxia telangiectasia
170
Q

why might a balanced translocation carrier have a phenotype? how can you investigate this?

A
  1. submicroscopic imbalance - can be investigated with FISH/Array-CGH/Optical genome mapping (OGM)

eg. Miller-Dieker Syndrome (MDS)- 17p13.3 deletion- Type 1 lissencephaly with facial dysmorphism. Patients with isolated lissencephaly had smaller deletions. LIS1 gene identified, is deleted in the disease

  2. gene disruption such as inversion
    eg. GOF - splicing exons together creating novel chimeric gene such as BCR-ABL1 translated into tyrosine kinase
    eg. LOF - coding sequence disrupted in haploinsufficient gene such as DMD in x;autosome translocation

constitutional translocations may give cancer risk if a TSG is disabled or an oncogene is separated from its controlling region eg. RUNX1 disruption can give rise to Familial Platelet Disorder with predisposition to AML and myelodysplastic syndrome. often a second RUNX1 hit leads to leukaemia progression.
- identified by sequencing, FISH, RNA sequencing, array CGH

  3. gene separated from cis regulatory elements such as promoter
171
Q

how can autozygosity mapping identify a disease gene?

A
  • identify mutations in recessive conditions in isolated populations or consanguineous families
  • affecteds are homozygous for markers around disease locus & size is progressively reduced due to recombination over generations
  • genotype SNPs or microsatellites to identify shared regions between affecteds
  • fine-map by including more polymorphic markers
  • identify candidate genes within the region and sequence them - easier if known protein function or expression studies have been done in affected tissues

AutoMap can be applied to WES or WGS VCFs to perform homozygosity mapping
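
A toy Python sketch of the core idea, finding runs of consecutive markers where all affecteds are homozygous for the same allele. This is not AutoMap's algorithm; the genotype strings, individual names and `min_markers` threshold are hypothetical.

```python
def shared_homozygous_runs(genotypes, min_markers=3):
    """Return (start, end) marker index ranges where every affected individual
    is homozygous for the same allele at each consecutive marker."""
    n_markers = len(next(iter(genotypes.values())))
    runs, start = [], None
    for i in range(n_markers):
        calls = {g[i] for g in genotypes.values()}
        # Shared homozygosity: one identical call across individuals, both alleles equal.
        is_shared_hom = len(calls) == 1 and len(set(next(iter(calls)))) == 1
        if is_shared_hom and start is None:
            start = i
        elif not is_shared_hom and start is not None:
            if i - start >= min_markers:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and n_markers - start >= min_markers:
        runs.append((start, n_markers - 1))
    return runs

# Hypothetical genotypes at 7 ordered markers for two affected relatives.
affecteds = {
    "affected_1": ["AB", "AA", "GG", "CC", "TT", "AG", "CC"],
    "affected_2": ["AA", "AA", "GG", "CC", "TT", "AA", "CT"],
}
runs = shared_homozygous_runs(affecteds)
print(runs)  # [(1, 4)]
```

In practice the shared region would then be fine-mapped with denser markers before candidate genes are sequenced.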

172
Q

what are potential issues with autozygosity mapping?

A

homozygous regions unrelated to disease locus and
inflated LOD scores due to underestimating inbreeding extent

173
Q

what NGS methods can be used to identify disease genes.

A
  • targeted panels for clinically defined heterogeneous disease. requires known candidate genes, 100% coverage
  • WES - 2% of genome. useful for genetically diverse cases or multiple inheritance patterns. less biased approach than targeted NGS, cheaper than WGS and quicker to analyse; however non-coding regions are not covered, coverage is rarely 100% due to poor enrichment & mapping issues, coverage of repetitive and GC-rich regions is poor, and it is not as good at detecting structural variation
  • WGS - unbiased, includes non-coding regions, less bias in GC-rich and repetitive regions, detects balanced chromosomal rearrangements and mosaic variants. HOWEVER it is costly, has limited coverage of STRs, and raises data storage, security and sharing issues.
174
Q

what should a pipeline take into account for filtering NGS variants?

A
  • quality - depth and call quality
  • frequency
  • inheritance pattern
  • penetrance - used to identify de novo variants in trio with unaffected parents
  • mosaicism by comparing affected to unaffected tissue
  • predicted consequence - LOF, missense, splice etc
  • variants may be filtered by known biological pathway or protein interaction with other genes associated with phenotype
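
A minimal Python sketch of how quality, frequency and predicted-consequence filters might be chained. The variant records, field names and thresholds are all hypothetical and chosen only for illustration; a real pipeline would also apply inheritance and phenotype-based filters.

```python
# Hypothetical variant records; field names and thresholds are illustrative only.
variants = [
    {"id": "var1", "depth": 120, "qual": 60, "pop_af": 0.00001, "consequence": "stop_gained"},
    {"id": "var2", "depth": 8, "qual": 15, "pop_af": 0.00002, "consequence": "missense"},
    {"id": "var3", "depth": 95, "qual": 55, "pop_af": 0.12, "consequence": "missense"},
    {"id": "var4", "depth": 80, "qual": 50, "pop_af": 0.0003, "consequence": "synonymous"},
    {"id": "var5", "depth": 70, "qual": 45, "pop_af": 0.0002, "consequence": "splice_donor"},
]

# Consequence classes retained for review (an assumption, not a standard list).
RETAINED = {"stop_gained", "frameshift", "splice_donor", "splice_acceptor", "missense"}

def passes_filters(v, min_depth=20, min_qual=30, max_pop_af=0.001):
    """Keep variants that are well supported, rare and of a retained consequence."""
    good_quality = v["depth"] >= min_depth and v["qual"] >= min_qual
    rare = v["pop_af"] <= max_pop_af
    relevant = v["consequence"] in RETAINED
    return good_quality and rare and relevant

shortlist = [v["id"] for v in variants if passes_filters(v)]
print(shortlist)  # ['var1', 'var5']
```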
175
Q

what are limitations of NGS for gene discovery?

A
  • interpretation is challenging eg. non-coding variants, functional studies are expensive and complex
  • cohort size - rare variants in single families difficult to corroborate and relies on data sharing
  • HPO terms essential
  • incidental findings - informed consent
176
Q

what future possibilities are there of using NGS for gene discovery?

A
  • RNA sequencing - validate results
  • NGS-based methylation profiling
  • ChIP-seq - analyse protein interactions with DNA
  • GTEx to look at gene expression in relevant tissues
  • more understanding of regulatory non-coding RNA
  • improved data sharing
  • improved complex disease understanding eg. later onset and reduced penetrance
177
Q

what is the calculation for posterior probability?

A

a / (a + b)

where a = prior probability (carrier) x conditional probability of the mutation not being detected by the test in a carrier

b = prior probability (not a carrier) x conditional probability of a negative test result in a non-carrier (approximately 1)
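
A worked sketch of this calculation in Python for a negative test result. The prior (1 in 25) and the test detection rate (90%) are hypothetical numbers, and the function name is illustrative; the conditional probability for a non-carrier is assumed to be 1.

```python
def posterior_carrier_risk(prior_carrier, detection_rate):
    """Residual carrier risk after a negative test, using a / (a + b)."""
    # a: prior probability of being a carrier x probability the test misses the mutation
    a = prior_carrier * (1.0 - detection_rate)
    # b: prior probability of not being a carrier x probability of a negative result (assumed 1)
    b = (1.0 - prior_carrier) * 1.0
    return a / (a + b)

# Hypothetical example: prior carrier risk 1 in 25, test detects 90% of mutations.
risk = posterior_carrier_risk(1 / 25, 0.90)
print(f"posterior risk = {risk:.4f} (about 1 in {1 / risk:.0f})")
```

With these numbers a = 0.004 and b = 0.96, so the residual risk is 0.004 / 0.964, roughly 1 in 241.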

178
Q

what is the confidence interval?

A

gives an indication of how uncertain we are about that measurement with regards to the true population value, usually 95%.

if we were to repeat an experiment 100 times and calculate the 95% confidence interval each time, we would expect about 95 of the intervals to contain the true population value (e.g. the population mean).
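
The repeated-experiment idea can be checked with a small simulation in Python. This is a sketch under stated assumptions: normally distributed data, a normal-approximation interval (mean ± 1.96 standard errors), and arbitrary values for the true mean, standard deviation and sample size.

```python
import random
import statistics

def ci_coverage(n_experiments=1000, sample_size=50, true_mean=10.0, sd=2.0, seed=1):
    """Fraction of simulated 95% confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(true_mean, sd) for _ in range(sample_size)]
        mean = statistics.mean(sample)
        # Standard error of the mean from the sample standard deviation.
        sem = statistics.stdev(sample) / sample_size ** 0.5
        lower, upper = mean - 1.96 * sem, mean + 1.96 * sem
        if lower <= true_mean <= upper:
            hits += 1
    return hits / n_experiments

coverage = ci_coverage()
print(f"intervals containing the true mean: {coverage:.1%}")
```

The printed fraction should be close to 95%, not exactly 95%, because the simulation is itself a random sample of experiments.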

179
Q

what does it mean if the 95% confidence interval doesn’t span 1 for an odds ratio

A

there is statistically significant association between exposure and outcome

180
Q

ADD TO CARDS how do you calculate the odds ratio?

A

                   outcome +   outcome -
exposed   (+)          a           b
unexposed (-)          c           d

Where:
a = Number of exposed cases
b = Number of exposed non-cases
c = Number of unexposed cases
d = Number of unexposed non-cases

Odds ratio (OR)= (a/c)/(b/d) which can be re-written as ad/bc

An OR > 1 suggests that exposure is positively associated with the adverse outcome, i.e. the odds of the outcome are higher in the exposed than in the unexposed
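
A short Python sketch of the ad/bc calculation. The 95% confidence interval uses the standard log (Woolf) method, which is an addition not stated on the card, and the counts are hypothetical.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with an approximate 95% CI from the log (Woolf) method."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 20 exposed cases, 80 exposed non-cases,
# 10 unexposed cases, 90 unexposed non-cases.
odds_ratio, lower, upper = odds_ratio_with_ci(20, 80, 10, 90)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Here the OR is 2.25 but the interval spans 1, so with these counts the association would not be called statistically significant.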

181
Q

define test sensitivity? how do you calculate it

A

Sensitivity is the ability of a test to correctly identify individuals who are affected by a disease, (the true positive rate)

true positives / (true positives + false negatives)

182
Q

define test specificity? how do you calculate it

A

the ability of a test to correctly identify individuals who are not affected by a disease, (the true negative rate)

true negatives / (true negatives + false positives)

183
Q

how do you calculate positive predictive value? (PPV)

A

true positives / (true positives + false positives)

184
Q

how do you calculate negative predictive value? (NPV)

A

true negatives / (true negatives + false negatives)
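
The four formulas in cards 181 to 184 can be computed together from one set of counts. A minimal Python sketch with hypothetical counts and an illustrative function name:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # affected individuals correctly identified
        "specificity": tn / (tn + fp),  # unaffected individuals correctly identified
        "ppv": tp / (tp + fp),          # positive results that are true positives
        "npv": tn / (tn + fn),          # negative results that are true negatives
    }

# Hypothetical counts: 100 affected (90 detected), 1000 unaffected (50 false positives).
metrics = diagnostic_metrics(tp=90, fp=50, tn=950, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```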

185
Q

how does disorder prevalence affect PPV and NPV of a test?

A

higher prevalence means higher PPV and lower NPV
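
A small Python sketch showing this effect for a fixed test. The sensitivity and specificity (both 95%) and the prevalence values are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV for a test applied to a population with the given prevalence."""
    # Expected proportions of the population in each cell of the 2x2 table.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

results = []
for prevalence in (0.001, 0.01, 0.1, 0.5):
    ppv, npv = predictive_values(0.95, 0.95, prevalence)
    results.append((prevalence, ppv, npv))
    print(f"prevalence {prevalence}: PPV = {ppv:.3f}, NPV = {npv:.4f}")
```

As prevalence rises across the loop, PPV increases and NPV falls, even though the test itself is unchanged.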

186
Q

what might a dosage quotient outside of defined range indicate? how can this be checked?

A
  • fail eg. poor quality DNA >repeat with new DNA dilution
  • mosaicism > repeat to see if get same result
187
Q

what is a polygenic score?

A

sum of the number of trait-associated alleles in an individual weighted by per-allele effect sizes from a discovery GWAS

quantifies an individual’s genetic predisposition to a trait
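
The weighted sum can be sketched in a few lines of Python. The variant names, effect sizes and genotype dosages below are hypothetical; real scores use per-allele effect sizes from a published discovery GWAS and many more variants.

```python
# Hypothetical per-allele effect sizes (e.g. log odds ratios) from a discovery GWAS.
effect_sizes = {"variant_A": 0.12, "variant_B": -0.05, "variant_C": 0.30}

# Hypothetical dosages: number of trait-associated alleles carried (0, 1 or 2).
dosages = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

def polygenic_score(dosages, effect_sizes):
    """Sum of allele counts weighted by per-allele effect size."""
    return sum(dosages[v] * effect_sizes[v] for v in effect_sizes)

score = polygenic_score(dosages, effect_sizes)
print(f"polygenic score = {score:.2f}")  # 2*0.12 + 1*(-0.05) + 0*0.30 = 0.19
```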

188
Q

give an example of a risk prediction model

A

BOADICEA (Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm)

  • FH
  • lifestyle
  • rare pathogenic variants
  • polygenic risk score
  • mammography density
189
Q

what are limitations of polygenic risk scores?

A
  • no single score, varies according to source
  • European ancestry, not adequate data for other populations
190
Q

how does CRISPR-Cas9 work?

A
  • Cas9 is a nuclease that is directed by a guide RNA to cut both DNA strands at a targeted location
  • By providing a DNA repair template with the desired modification, homology-directed repair can use this template, which can result in the modification being incorporated into the endogenous genome
191
Q

what are the three main delivery strategies that could be used for clinical genome-editing applications?

A
  • nanoparticles
  • viruses
  • purified ribonucleoproteins (RNPs)
192
Q

what are the limitations of CRISPR-Cas9 gene editing?

A
  • Accuracy - the ratio of on- versus off-target genetic changes
  • precision - the fraction of on-target edits that produce the desired genetic outcome
  • has the potential to create rearrangements that lead to cancer
  • an immune response to bacterially derived editing proteins
  • pre-existing antibodies against CRISPR components may cause inflammation
  • unknown long-term safety and stability of genome-editing outcomes
193
Q

what are possible ethical controversies of germline gene-editing?

A
  • could be avoided by PGD and embryo selection
  • unforeseeable risks
  • “slippery slope” - eugenics - is there a clear distinction between therapy and enhancement
194
Q

what are the advantages of using array over karyotype?

A
  • higher resolution: <200 kb for array vs approximately 5 Mb for karyotype
  • SNP arrays also detect UPD, LOH, mosaicism and parental origin
  • DNA from uncultured cells so processed quicker
  • custom arrays are targeted, which limits IFs, especially in parental testing
  • array files are stored and can be reanalysed in future for newly identified conditions
  • enables identification of novel conditions
  • able to detect cryptic imbalances that may look balanced on karyotype