Impact Of NGS Flashcards
What are the steps involved in NGS
DNA extraction
Purification
Fragmentation/shearing
If WES - target baits
Adapters
Flow cell loading
Bridge PCR
Sequencing by synthesis
Why is NGS used over Sanger sequencing - and what is one reason to not use NGS
NGS is faster and cheaper
However, it may give too much information, so Sanger sequencing is used for simpler tests, e.g. a single-gene test
What are the uses of NGS in the diagnostic lab
It can be used for diagnosis, management, treatment
It can inform clinical trials
It can be used to predict pathogenicity and inform life choices
It can be used for prenatal testing
What are the benefits of the 100,000 genomes project
It sequences rare disease and cancer genomes
This brings benefits of genetics to patients to aid diagnosis and treatment
It can facilitate new discoveries and medical insights, for example through molecular research and clinical trials
It facilitates personalised medicine
What is the Genomics England PanelApp
It is an app where professionals suggest gene panels and the wider community reviews them
What are Genomics England’s variant classification tiers
Tier 1 – known pathogenic, protein truncating
Tier 2 – protein altering (missense), intronic (splice site)
Tier 3 – loss of function in genes not on the panel
Why use long read technology/What are the drawbacks of short read technology
Short read relies on PCR
PCR performs poorly on sequences with high GC content, and deletions and repeat regions can be poorly sequenced
Therefore there is a lot of DNA missing that long read technology has been able to identify
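To make "high GC content" concrete, GC content is the fraction of bases in a sequence that are G or C. A minimal Python sketch follows; the function names, the window size and the threshold are illustrative assumptions, not values from these cards.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def high_gc_windows(seq: str, window: int = 10, threshold: float = 0.7):
    """Return start positions of sliding windows whose GC fraction meets the threshold.

    The window size and threshold are hypothetical defaults for illustration.
    """
    return [
        start
        for start in range(0, len(seq) - window + 1)
        if gc_content(seq[start:start + window]) >= threshold
    ]
```

Windows flagged this way are the kind of region where PCR-based short-read coverage would be expected to drop.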
Describe PACBIO SMRT library construction
Fragmentation – selection of large fragments around 30 to 80 kbp
End repair and adapter ligation
Adapters are circular hairpins at the end, containing primer binding sites at which DNA polymerase attaches
Describe PACBIO SMRT Zero Mode Waveguide sequencing
Zero mode waveguides guide light into very small spaces so that it dissipates
Each DNA base has a fluorescent tag
As bases are incorporated the tag is cleaved and fluorescence diffuses out
The light is detected, resulting in a base call
What are some of the applications of PACBIO
It helps fill in the missing regions of DNA that short reads could not identify, e.g. high GC regions and repeat regions
It can detect structural variants such as copy number variations, repeats and inversions
It can identify mobile genetic elements and identify the alleles across a chromosome
Describe what a repeat expansion disease is and give some examples
These occur in UTRs, coding exons and introns
Examples include
Spinocerebellar ataxia – hereditary, progressive, degenerative, fatal
Huntington’s disease – CAG repeat in the huntingtin gene (HTT)
Normal = <27, intermediate = 27 to 35, pathogenic = > 35
Expanded protein is toxic and accumulates in neurons causing cell death
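The repeat-size ranges on this card can be written as a small lookup. A minimal Python sketch, assuming only the cut-offs stated above; the function name classify_htt_cag is hypothetical.

```python
def classify_htt_cag(repeats: int) -> str:
    """Classify an HTT CAG repeat count using the ranges given on the card."""
    if repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if repeats < 27:
        return "normal"
    if repeats <= 35:
        return "intermediate"
    return "pathogenic"
```

For example, classify_htt_cag(30) falls in the intermediate range, while 36 or more is classed as pathogenic.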
How is Huntington’s disease identified
Traditionally via PCR and electrophoresis
However this shows size but not the sequence info or structure
PACBIO
Can find interruptions in the sequence for example CAG > CAA
Can measure size and identify new somatic repeat expansions (somatic instability)
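Because a long read gives the actual sequence, both the size and any interruptions (such as CAA within a CAG tract) can be read off directly. A rough Python sketch, assuming the read has already been trimmed to the repeat tract and is in frame; the function name and return format are hypothetical.

```python
def summarise_repeat_tract(tract: str, unit: str = "CAG"):
    """Count repeat units in a tract and list any interrupting triplets.

    Assumes the tract starts in frame and contains only the repeat region.
    """
    tract = tract.upper()
    if len(tract) % len(unit) != 0:
        raise ValueError("tract length must be a multiple of the repeat unit length")
    # Split the tract into consecutive units (triplets for CAG).
    units = [tract[i:i + len(unit)] for i in range(0, len(tract), len(unit))]
    # Any unit that differs from the expected repeat is an interruption.
    interruptions = [(index, u) for index, u in enumerate(units) if u != unit]
    return {
        "total_units": len(units),
        "pure_units": len(units) - len(interruptions),
        "interruptions": interruptions,
    }
```

PCR with electrophoresis would report only the overall fragment size for the same allele, not where the interruption sits.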
Describe how PACBIO is used in RNA sequencing
It can sequence the entire RNA at once
Different isoforms of mRNA can be sequenced, showing alternative splice sites
In contrast, short-read NGS only shows a reconstruction of one long consensus of all isoforms
What is the dark genome and what are the camouflaged regions
The dark genome can be defined
By depth - poor sequence depth
By quality - poor sequence quality
Camouflaged regions are usually due to some kind of repeat region, which makes variants difficult to recover
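The two definitions (dark by depth, dark by quality) can be expressed as a simple rule on the reads covering a position. A minimal Python sketch; the thresholds below are illustrative assumptions, since the cards give no numbers, and the function name is hypothetical.

```python
def classify_position(mapqs, max_dark_depth=5, low_mapq=10, low_fraction=0.9):
    """Label a position from the mapping qualities of the reads covering it.

    All three thresholds are hypothetical defaults, not values from the cards.
    """
    depth = len(mapqs)
    # Dark by depth: too few reads cover the position.
    if depth <= max_dark_depth:
        return "dark by depth"
    # Dark by quality: reads are present but most map with low confidence.
    low = sum(1 for q in mapqs if q < low_mapq)
    if low / depth >= low_fraction:
        return "dark by quality"
    return "covered"
```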
What is Oxford nanopore sequencing
This is ultra-long-read sequencing, with reads up to 2 Mb
DNA passes through a nanopore and the base sequence is detected as an electrical signal
Each base has its own electrical pattern
What are the advantages and disadvantages of Oxford nanopore sequencing
Advantages – it is one single machine, no other expensive machines are needed and it is scalable by using multiple flow cells
Disadvantages – it is expensive and has a high error rate
Describe GBA mutations
The GBA gene codes for the enzyme glucocerebrosidase
Biallelic mutations can cause Gaucher’s disease (rare)
This causes increased deposition of glucocerebroside in macrophages because of enzyme deficiency
Monoallelic mutations are a risk factor for Parkinson’s disease
It is difficult to study via NGS because it has a pseudogene (GBAP1) and a duplicated region, so there is a lot of repeated sequence
What does genetic association mean
This is when a variant allele appears at a higher frequency in unrelated subjects with a disease (cases) versus controls
Describe the GWAS study design
It consists of equal numbers of matched cases and controls, genotyped via SNP microarray
Statistical tests include chi-squared, run with tools such as PLINK
Quality control must be done on SNP and sample data, batch effect and population structure
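The chi-squared test for one SNP can be sketched from a 2x2 table of allele counts in cases and controls. This is a minimal Python illustration using only the standard library, not how PLINK is implemented; the function name and argument names are hypothetical.

```python
import math


def allelic_chi_squared(case_alt, case_ref, control_alt, control_ref):
    """Chi-squared test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

With 60 variant and 40 reference alleles in cases against 40 and 60 in controls, the statistic is 8.0. In a real GWAS this test is repeated for every SNP, so a much stricter significance threshold is needed than for a single test.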
Describe SNP quality control
Remove SNPs not typed in many subjects (unreliable genotypes)
Remove SNPs with low minor allele frequencies (these SNPs have little detection power, while genotyping errors have a large effect)
Exclude deviations from Hardy-Weinberg
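The three SNP filters can be sketched for a single SNP from its genotype counts. A minimal Python example; the thresholds are hypothetical defaults chosen for illustration, and the function name snp_qc is an assumption.

```python
import math


def snp_qc(hom_ref, het, hom_alt, missing,
           min_call_rate=0.95, min_maf=0.01, min_hwe_p=1e-6):
    """Return a list of reasons to remove a SNP (empty list means keep it).

    Thresholds are illustrative assumptions, not values from the cards.
    """
    typed = hom_ref + het + hom_alt
    if typed == 0:
        return ["low call rate"]
    reasons = []
    # Filter 1: SNP not typed in many subjects.
    if typed / (typed + missing) < min_call_rate:
        reasons.append("low call rate")
    # Filter 2: low minor allele frequency.
    p = (2 * hom_ref + het) / (2 * typed)
    q = 1 - p
    maf = min(p, q)
    if maf < min_maf:
        reasons.append("low MAF")
    # Filter 3: deviation from Hardy-Weinberg (chi-squared, 1 degree of freedom).
    if maf > 0:
        observed = (hom_ref, het, hom_alt)
        expected = (p * p * typed, 2 * p * q * typed, q * q * typed)
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        if math.erfc(math.sqrt(chi2 / 2)) < min_hwe_p:
            reasons.append("HWE deviation")
    return reasons
```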
Describe NGS sample data quality control
Remove samples with high levels of missing genotype
Remove samples with very high or very low heterozygosity (unlikely genotypically)
Check that reported sex is correct by checking X heterozygosity and comparing it to the recorded data
(If 10 males, then all 10 should show no X heterozygosity, because males have only one X chromosome)
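The sample-level checks can be sketched as one pass over per-sample summary statistics. A minimal Python example; the dictionary keys, the 3 standard deviation rule for heterozygosity and the X heterozygosity cut-off are all assumptions made for illustration.

```python
from statistics import mean, stdev


def sample_qc(samples, max_missing=0.05, het_sd=3.0, max_male_x_het=0.02):
    """Return {sample id: [reasons]} for samples that fail QC.

    Each sample is a dict with hypothetical keys: id, missing_rate,
    het_rate, x_het_rate and reported_sex ("male" or "female").
    Needs at least two samples to estimate the spread of heterozygosity.
    """
    het_rates = [s["het_rate"] for s in samples]
    mu, sd = mean(het_rates), stdev(het_rates)
    flagged = {}
    for s in samples:
        reasons = []
        if s["missing_rate"] > max_missing:
            reasons.append("high missingness")
        # Very high or very low heterozygosity relative to the cohort.
        if sd > 0 and abs(s["het_rate"] - mu) > het_sd * sd:
            reasons.append("heterozygosity outlier")
        # Males carry one X, so they should show almost no X heterozygosity.
        inferred = "male" if s["x_het_rate"] <= max_male_x_het else "female"
        if inferred != s["reported_sex"]:
            reasons.append("sex mismatch")
        if reasons:
            flagged[s["id"]] = reasons
    return flagged
```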
What is meant by population structure
If one ethnicity is over-represented among the cases, any SNP that is common in that population may become falsely associated with the disease
This affects the Q-Q plot
What is a Q-Q plot
This is a plot of the observed versus expected p-values (usually as -log10)
Most values follow the x = y line, with deviations at the lowest p-values
If the curve lifts off the line early, that indicates a population structure problem (genomic inflation)
This inflation can be calculated and accounted for, e.g. by multidimensional scaling
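Genomic inflation is commonly summarised as lambda, the median observed chi-squared statistic divided by the median expected under no association (about 0.455 for 1 degree of freedom). A minimal Python sketch that also builds the Q-Q plot points; the function names are hypothetical and plotting is left out.

```python
import math
from statistics import median

# Median of a chi-squared distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Lambda: values well above 1 suggest inflation, e.g. population structure."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def qq_points(p_values):
    """Return (expected, observed) -log10 p-value pairs, sorted ascending.

    Expected p-values are evenly spaced quantiles, (i - 0.5) / n.
    All p-values must be greater than 0.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))
```

Points that sit on the x = y line match the null expectation; a lambda close to 1 corresponds to a curve that only lifts at the far right.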
What are batch effects
Variations in sample processing that can induce variation in genotype calls
This causes false associations, especially when cases and controls are processed in separate batches