Impact Of NGS Flashcards by Riyah Kay

What are the steps involved in NGS

DNA extraction
Purification
Fragmentation/shearing
If WES - target capture with baits
Adapter ligation
Flow cell loading
Bridge PCR
Sequencing by synthesis

Why is NGS used over Sanger sequencing - and what is one reason to not use NGS

NGS is faster and cheaper

However, it may generate more information than is needed, so Sanger sequencing is still used for simpler tests, e.g. a single-gene test

What are the uses of NGS in the diagnostic lab

It can be used for diagnosis, management, treatment

It can inform clinical trials

It can be used to predict pathogenicity and inform life choices

It can be used for prenatal testing

What are the benefits of the 100,000 genomes project

It sequences rare disease and cancer genomes

This brings benefits of genetics to patients to aid diagnosis and treatment

It can facilitate new discoveries and medical insights, for example through molecular research and clinical trials

It facilitates personalised medicine

What is the Genomics England PanelApp

It is an app where professionals suggest gene panels and the wider community reviews them

What are Genomics England’s variant classification tiers

Tier 1 – high likelihood of being pathogenic, e.g. protein truncating variants in genes on the panel
Tier 2 – protein altering (missense), intronic (splice site)
Tier 3 – loss of function in genes not on the panel

Why use long read technology/What are the drawbacks of short read technology

Short-read sequencing relies on PCR amplification
PCR performs poorly on sequences with high GC content, and deletions and repeat regions can be poorly sequenced

Therefore a lot of sequence is missing from short-read data, which long-read technology has been able to identify

Describe PACBIO SMRT library construction

Fragmentation – selection of large fragments around 30 to 80 kbp

End repair and adapter ligation
Adapters are hairpin loops ligated to each end, which make the template circular; they contain primer binding sites at which DNA polymerase attaches

Describe PACBIO SMRT Zero Mode Waveguide sequencing

The zero mode waveguide guides light into a very small space, where it dissipates

Each type of nucleotide carries its own fluorescent tag

As each base is incorporated the tag is cleaved and the fluorescence diffuses out
The light is detected, resulting in a base call

What are some of the applications of PACBIO

It helps fill in the missing regions of DNA that short reads could not identify, e.g. high GC regions and repeat regions

It can detect structural variants such as copy number variation, repeats and inversions

It can identify mobile genetic elements and identify which alleles lie together across a chromosome (phasing)

Describe what a repeat expansion disease is and give some examples

These are diseases caused by the expansion of a short repeated DNA sequence; the repeats occur in UTRs, coding exons and introns

Examples include
Spinocerebellar ataxia – hereditary, progressive, degenerative, fatal

Huntington’s disease – CAG repeat in the huntingtin gene (HTT)
Normal = <27, intermediate = 27 to 35, pathogenic = >35
The expanded protein is toxic and accumulates in neurons, causing cell death
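
The size thresholds on this card can be sketched in Python. This is a minimal illustration and not a clinical tool: both function names are hypothetical, and the repeat counter is a naive assumption that only finds the longest uninterrupted run of CAG triplets in a read.

```python
import re


def classify_cag_repeats(repeat_count: int) -> str:
    """Classify an HTT CAG repeat count using the thresholds on the card."""
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_count < 27:
        return "normal"
    if repeat_count <= 35:
        return "intermediate"
    return "pathogenic"


def longest_cag_run(sequence: str) -> int:
    """Return the number of triplets in the longest uninterrupted CAG run.

    Naive assumption: an interruption such as CAA ends the run.
    """
    runs = re.finditer(r"(?:CAG)+", sequence.upper())
    return max((len(m.group(0)) // 3 for m in runs), default=0)


if __name__ == "__main__":
    read = "AT" + "CAG" * 40 + "CCG"
    n = longest_cag_run(read)
    print(n, classify_cag_repeats(n))
```

Because an interruption splits the run, two reads of the same total length can give different counts, which is one reason sequence-level information matters alongside size.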

How is Huntington’s disease identified

Traditionally via PCR and electrophoresis
However, this shows the repeat size but not the sequence content or structure

PACBIO
Can find interruptions in the sequence for example CAG > CAA
Can measure size and identify new somatic repeat expansions (somatic instability)

Describe how PACBIO is used in RNA sequencing

It can sequence the entire RNA transcript in one read
Different isoforms of mRNA can be sequenced, showing alternative splice sites

In contrast, short-read NGS only shows a reconstruction of one long consensus of all isoforms

What is the dark genome and what are the camouflaged regions

Regions of the genome can be dark
By depth - poor sequencing depth
By quality - poor sequence quality

Camouflaged regions are usually due to some kind of repeat region, which makes variants difficult to recover

What is Oxford nanopore sequencing

This is ultra-long-read sequencing, with reads of up to 2 Mb
DNA passes through a nanopore and the base sequence is detected as an electrical signal
Each base has its own electrical pattern

What are the advantages and disadvantages of Oxford nanopore sequencing

Advantages – it is one single machine, no other expensive machines are needed and it is scalable by using multiple flow cells

Disadvantages – it is expensive and has a high error rate

Describe GBA mutations

The GBA gene codes for the enzyme glucocerebrosidase

Biallelic mutations can cause Gaucher’s disease (rare)

This causes increased deposition of glucocerebroside in macrophages because of the enzyme deficiency

Monoallelic (heterozygous) mutations are a risk factor for Parkinson’s disease

It is difficult to study via NGS because it has a pseudogene (GBAP1) and a duplicated region, so there is a lot of repeated sequence

What does genetic association mean

This is when a variant allele appears at a higher frequency in unrelated subjects with a disease (cases) than in controls

Describe the GWAS study design

It compares matched cases and controls, in roughly equal numbers, genotyped by SNP microarray

Association is tested with statistics such as the chi-squared test, often run in software such as PLINK

Quality control must be done on SNP and sample data, batch effects and population structure
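
As a rough sketch of the chi-squared test mentioned on this card, the following Python compares allele counts in cases and controls for a single SNP. The function name and the example counts are hypothetical, and the 1 degree of freedom allelic test is only one of several tests a real GWAS might use.

```python
import math


def allelic_chi_squared(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-squared test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value). Counts are alleles, not people.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one allele")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: the variant allele is more common in cases.
    print(allelic_chi_squared(60, 40, 40, 60))
```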

Describe SNP quality control

Remove SNPs not typed in many subjects (unreliable genotyping)

Remove SNPs with low minor allele frequencies (these SNPs have little detection power, while genotyping errors have a large effect)

Exclude SNPs that deviate from Hardy-Weinberg equilibrium
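
The three SNP filters on this card can be sketched as one Python function. This is an illustration under stated assumptions: genotypes are coded as 0, 1 or 2 copies of the alternate allele with `None` for a missing call, the function name is hypothetical, and the default cut-offs are placeholder values rather than thresholds taken from the cards.

```python
import math


def snp_passes_qc(genotypes, min_call_rate=0.95, min_maf=0.01, min_hwe_p=1e-6):
    """Apply call-rate, minor allele frequency and Hardy-Weinberg filters.

    genotypes: list of 0/1/2 alternate-allele counts, None for missing.
    The default thresholds are illustrative assumptions only.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False

    # 1. SNP not typed in many subjects
    if len(called) / len(genotypes) < min_call_rate:
        return False

    # 2. Low minor allele frequency
    n = len(called)
    alt_freq = sum(called) / (2 * n)
    maf = min(alt_freq, 1 - alt_freq)
    if maf < min_maf:
        return False

    # 3. Deviation from Hardy-Weinberg (chi-squared, 1 df)
    p, q = 1 - alt_freq, alt_freq
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    hwe_p = math.erfc(math.sqrt(chi2 / 2))
    return hwe_p >= min_hwe_p


if __name__ == "__main__":
    print(snp_passes_qc([0] * 25 + [1] * 50 + [2] * 25))
```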

Describe NGS sample data quality control

Remove samples with high levels of missing genotypes

Remove samples with very high or very low heterozygosity (biologically unlikely)

Check that reported sex matches the data by checking X chromosome heterozygosity
(Males have a single X, so if 10 males are reported, 10 samples should show no X heterozygosity)
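
A minimal sketch of the sex check in Python, assuming X chromosome genotypes coded 0/1/2 with `None` for missing calls. The function names are hypothetical, and the 5% cut-off is an illustrative assumption that leaves a little room for genotyping error; it is not a value from the cards.

```python
def x_heterozygosity(x_genotypes):
    """Fraction of called X chromosome SNPs that are heterozygous."""
    called = [g for g in x_genotypes if g is not None]
    if not called:
        raise ValueError("no called X chromosome genotypes")
    return sum(1 for g in called if g == 1) / len(called)


def sex_mismatch(reported_sex, x_genotypes, male_max_het=0.05):
    """Return True when the genotype data disagree with the reported sex.

    Males have one X, so they should show (almost) no heterozygous calls.
    male_max_het is an assumed tolerance for genotyping error.
    """
    het = x_heterozygosity(x_genotypes)
    inferred_sex = "male" if het <= male_max_het else "female"
    return inferred_sex != reported_sex


if __name__ == "__main__":
    print(sex_mismatch("male", [0, 2, 0, 2, 2, 0]))  # consistent with male
    print(sex_mismatch("male", [0, 1, 1, 2, 1, 0]))  # heterozygous calls present
```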

What is meant by population structure

It is the concern that if one ethnicity is over-represented among the cases, any SNP common in that population may appear to be associated with the disease

This affects the Q-Q plot

What is a Q-Q plot

This is a plot of the observed versus expected values of the test statistic
Most values follow the x = y line, with deviations only where the p-values are lowest

If the curve lifts off the line early, this indicates a population structure problem = genomic inflation

This inflation can be calculated and accounted for, e.g. by multidimensional scaling
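
One common way to calculate the inflation is the genomic inflation factor (lambda): the median observed chi-squared statistic divided by the median expected under the null, which is about 0.455 for 1 degree of freedom. The sketch below assumes 1 df association statistics; the function name and the example values are hypothetical.

```python
import statistics

# Median of a chi-squared distribution with 1 degree of freedom.
EXPECTED_MEDIAN_CHI2_1DF = 0.4549364


def genomic_inflation_factor(chi2_statistics):
    """Lambda = median observed chi-squared / median expected chi-squared.

    A value near 1 suggests little inflation; a value clearly above 1
    suggests inflation, for example from population structure.
    """
    if not chi2_statistics:
        raise ValueError("need at least one test statistic")
    return statistics.median(chi2_statistics) / EXPECTED_MEDIAN_CHI2_1DF


if __name__ == "__main__":
    observed = [0.2, 0.9098728, 5.0]  # hypothetical statistics
    print(round(genomic_inflation_factor(observed), 3))
```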

What are batch effects

Variations in sample processing which can induce variation in genotype calls

This causes false associations, especially when cases and controls are processed in separate batches

What is a Manhattan Plot and Regional Plot

Manhattan plot: X axis - chromosomal location, Y axis - -log10(p-value)
Regional plot zooms into a region of interest, showing gene labels, individual SNPs and their p-values
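
The coordinates of a Manhattan plot can be sketched in Python without any plotting library. The function names and input layout are hypothetical, and the 5e-8 cut-off is the conventional genome-wide significance threshold, included here as an assumption because the cards do not state one.

```python
import math

# Conventional genome-wide significance threshold (an assumption here).
GENOME_WIDE_SIGNIFICANCE = 5e-8


def manhattan_points(results):
    """Turn (chromosome, position, p_value) tuples into plot coordinates.

    Returns (chromosome, position, -log10(p)) sorted by genomic location,
    i.e. the X ordering and Y value of a Manhattan plot.
    """
    points = []
    for chrom, pos, p_value in sorted(results, key=lambda r: (r[0], r[1])):
        if not 0 < p_value <= 1:
            raise ValueError("p-values must be in the interval (0, 1]")
        points.append((chrom, pos, -math.log10(p_value)))
    return points


def significant_hits(results, threshold=GENOME_WIDE_SIGNIFICANCE):
    """Return (chromosome, position) for SNPs below the threshold."""
    return [(chrom, pos) for chrom, pos, p in results if p < threshold]


if __name__ == "__main__":
    gwas = [(2, 100, 1e-3), (1, 500, 1e-8), (1, 20, 0.5)]
    print(manhattan_points(gwas))
    print(significant_hits(gwas))
```

A regional plot would use the same Y values but restrict the input to one window of positions on one chromosome.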

What are the advantages of GWAS

Strong associations are found and easily replicated
It has led to identifying new genes
It has impacted clinical treatment, for example Metformin and Type II diabetes

What are the disadvantages of GWAS

It describes associations and not the cause
Most associated SNPs have no known effect, are in genes of unknown function or are NOT in genes
There is little clinical relevance

What is Mendelian Randomisation

It is a method for testing causal effects in studies with confounding factors
It uses a genetic variant as a stand-in for the exposure, where the variant has no effect on other traits or exposures related to the disease
For example, using a genetic variant which has the same effect as a drug would, e.g. a variant causing lower LDL rather than using a statin

Give an example of how to investigate if low LDL-C levels cause less cardiovascular events

Instead of using statins you can find a genotype that reduces LDL
The control group would be people homozygous for the normal variant
The study group is people homozygous for the beneficial variant
Fewer CV events in the study group show that lower LDL, rather than the drug, reduces CV events
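
One standard way to turn this design into a number is the Wald ratio, which is not named on the cards and is shown here only as an illustration: the variant's effect on the outcome divided by its effect on the exposure. The function name and all numbers below are hypothetical.

```python
import math


def wald_ratio(beta_variant_exposure, beta_variant_outcome):
    """Estimate the effect of the exposure on the outcome.

    beta_variant_exposure: effect of the variant on the exposure
        (e.g. change in LDL-C per beneficial allele).
    beta_variant_outcome: effect of the variant on the outcome
        (e.g. log odds of a cardiovascular event per beneficial allele).
    """
    if beta_variant_exposure == 0:
        raise ValueError("the variant must affect the exposure")
    return beta_variant_outcome / beta_variant_exposure


if __name__ == "__main__":
    # Hypothetical: each allele lowers LDL-C by 0.5 units and lowers the
    # log odds of a cardiovascular event by 0.2.
    log_odds_per_unit_ldl = wald_ratio(-0.5, -0.2)
    print(log_odds_per_unit_ldl, math.exp(log_odds_per_unit_ldl))
```

A positive ratio here means that higher LDL goes with higher odds of an event, so lowering LDL would be expected to lower the odds, provided the assumptions of Mendelian randomisation hold.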

What are the assumptions of Mendelian randomisation

Whether someone is bb or BB is completely random (unrelated to mate choice)
The variant is not a biomarker that causes the disease directly
There is no population stratification
GWAS has provided suitable SNPs to select from

Give an example of how to investigate cannabis and schizophrenia with Mendelian randomisation

Cannabis use is the exposure and schizophrenia is the outcome
You find a genotype that increases cannabis use and compare it with the other allele, which causes no change
The study found an odds ratio of 1.37

What is a polygenic risk score and how is it calculated

It is calculated by adding up the risk alleles present across SNPs and combining them into one score
Allele load at each SNP = 0, 1 or 2, adjusted (weighted) for effect size
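
The calculation can be sketched in Python as a weighted sum. The function name, SNP identifiers and effect sizes are all hypothetical, and real scores usually involve further steps (such as SNP selection and normalisation) that are not shown.

```python
def polygenic_risk_score(risk_allele_counts, effect_sizes):
    """Sum of (risk allele count x effect size) over all SNPs.

    risk_allele_counts: dict of SNP id -> 0, 1 or 2 risk alleles.
    effect_sizes: dict of SNP id -> effect size from a GWAS
        (e.g. a log odds ratio).
    """
    score = 0.0
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"allele load for {snp} must be 0, 1 or 2")
        score += count * effect_sizes[snp]
    return score


if __name__ == "__main__":
    # Hypothetical SNP ids and effect sizes.
    counts = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
    effects = {"snp_a": 0.10, "snp_b": 0.30, "snp_c": 0.50}
    print(polygenic_risk_score(counts, effects))
```

The plain sum is where the additivity assumption enters: each risk allele contributes independently of the others.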

What are the limitations of polygenic risk score

It is sensitive to ethnicity
The effect size is affected by the environment
It is limited by the power of association of the SNPs
It assumes that the effects are additive
It does not take into account other variants that may directly cause the disease

How can polygenic risk scores be used in clinical trials and how is a threshold determined

You split the group into clear case and control groups
If the cases have a higher polygenic risk score that does not overlap with controls, this is a good score to use as a threshold
If the controls have a very low polygenic risk score that does not overlap with cases, this can be a threshold below which you can be more sure that there is no concern
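
A minimal sketch of this idea in Python, assuming the thresholds are simply the edges of the region where case and control scores overlap. The function names, labels and scores are hypothetical, and a real study would choose thresholds with far more data and validation.

```python
def non_overlapping_thresholds(case_scores, control_scores):
    """Return (low, high) thresholds from the two score distributions.

    low: lowest score seen in cases; below it only controls were seen.
    high: highest score seen in controls; above it only cases were seen.
    """
    if not case_scores or not control_scores:
        raise ValueError("need scores for both cases and controls")
    return min(case_scores), max(control_scores)


def risk_band(score, low, high):
    """Place one score relative to the two thresholds."""
    if score > high:
        return "high risk"
    if score < low:
        return "low concern"
    return "overlap"


if __name__ == "__main__":
    cases = [0.4, 0.7, 0.9]      # hypothetical scores
    controls = [0.1, 0.3, 0.5]   # hypothetical scores
    low, high = non_overlapping_thresholds(cases, controls)
    print(low, high, risk_band(0.8, low, high))
```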

When should you and shouldn’t you use Polygenic risk scores

You can use it to separate cases from controls in large groups
You can use it to see if certain biomarkers are increased in higher risk groups
You cannot use it to identify cases in individuals
It is not useful when the disease has other important variants
It cannot substitute for family history risk

(35 cards)