DNA variant analysis Flashcards
What are the steps of next generation sequencing? (4)
- Library preparation (DNA fragmentation and adding adapters)
- Sequencing
- Read mapping and assembly (using reference genome)
- Data analysis
What is the process of NGS? (10)
- Fragment the patient DNA
- Add adapters containing a unique DNA barcode per patient and a PCR primer site
- Adapters contain sequences that are complementary to the oligonucleotides coating the flow cell
- Oligonucleotides are of forward and reverse sequence types
- DNA fragments hybridise to the flow cell oligonucleotides via the complementary sequences in their adapters
- Then a polymerase duplicates the attached fragment = double strand
- The double strand is denatured and the original template is washed away
- Fragments are amplified by bridge amplification
- The attached strand bends over to hybridise to a neighbouring oligonucleotide (a bridge); polymerase duplicates it = double stranded, then denatured
- Process repeats resulting in 100s of fragments derived from the original fragment
What is sequencing by synthesis? (4)
- The original fragment is sequenced 100s of times
- Nucleotides are specifically fluorescently labelled
- A laser excites each incorporated nucleotide, which emits its colour; a scanner detects it and builds the sequence from the order of emissions
- All DNA fragments are sequenced at the same time = parallel sequencing
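As a rough illustration of how the order of emissions becomes a sequence, the sketch below decodes one colour per cycle into a base. The colour-to-base mapping and the function name `decode_emissions` are assumptions made for this example, not the chemistry of any particular instrument.

```python
# Hypothetical colour-to-base mapping, for illustration only.
COLOUR_TO_BASE = {"green": "A", "red": "C", "blue": "G", "yellow": "T"}


def decode_emissions(colours):
    """Build a read from the colour detected at each sequencing cycle."""
    # Unrecognised signals become "N" (base not called).
    return "".join(COLOUR_TO_BASE.get(colour, "N") for colour in colours)


print(decode_emissions(["green", "yellow", "blue", "dim", "red"]))  # ATGNC
```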
What is depth of coverage? (2)
- The number of fragments that align to a specific location on the reference genome
- Minimum depth of coverage of 30 is required for diagnostic purposes to exclude mistakes and prove accuracy of variant identity
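The definition above can be sketched as a simple count. This is a minimal example that assumes each aligned read is given as a hypothetical `(start, end)` pair of 1-based, inclusive reference coordinates; real pipelines read this information from alignment files instead.

```python
MIN_DIAGNOSTIC_DEPTH = 30  # threshold stated on the card above


def depth_at(position, reads):
    """Count the reads whose aligned interval covers `position`."""
    return sum(1 for start, end in reads if start <= position <= end)


def meets_diagnostic_depth(position, reads, minimum=MIN_DIAGNOSTIC_DEPTH):
    """True if the depth at `position` reaches the diagnostic minimum."""
    return depth_at(position, reads) >= minimum


# Toy alignment: four reads, three of which overlap position 210.
reads = [(100, 250), (180, 330), (200, 350), (260, 400)]
print(depth_at(210, reads))                # 3
print(meets_diagnostic_depth(210, reads))  # False
```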
How is genomic sequencing targeted? (5)
- Make biotinylated RNA probes that match the DNA sequences of the genes of interest
- Fragment patient DNA, mix with RNA probes to hybridise
- Pass the mix over streptavidin which binds to the biotin label
- Magnetically separate RNA-bound DNA and sequence it
- Reduces cost and amount of data generated
What proportion of pathogenic DNA variants are in acceptor and donor splice sites?
15%
What are the boundaries of an intron? (2)
- 5’ = donor site (GT in DNA, GU in RNA)
- 3’ = acceptor site (AG)
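A minimal sketch of checking these boundaries on an intron sequence. The function name `has_canonical_splice_sites` is hypothetical, and the check covers only the two dinucleotides named above, not the wider splice consensus.

```python
def has_canonical_splice_sites(intron):
    """True if the intron starts with GT (GU in RNA) and ends with AG."""
    seq = intron.upper().replace("U", "T")  # treat RNA and DNA alike
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


print(has_canonical_splice_sites("GTAAGTCCCTTTCAG"))  # True
print(has_canonical_splice_sites("GCAAGTCCCTTTCAG"))  # False (donor site altered)
```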
What evidence is required to classify a variant? (8)
- Type of variant
- What is the frequency of the variant in the normal population
- How is the protein function/structure affected (loss of function? alter splicing?)
- Has it been seen in other patients with the same disease
- Is it inherited or de novo
- Do other family members have it and do they have disease (familial segregation)
- Are the patient’s clinical symptoms in keeping with the disease associated with the gene
- Is the variant in a mutational hotspot or vital functional protein domain
Why is it useful to know the type of variant? (2)
- Nonsense vs synonymous: truncating changes are more likely to be pathogenic
- Missense: is the amino acid replacement one with different properties that could be predicted to alter the shape/function of the protein
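To make the variant types concrete, the sketch below labels a single-codon substitution using the standard genetic code. The function name `classify_codon_change` is an assumption for this example, and it does not cover frameshifts or splice variants.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Label a DNA codon substitution by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"   # new stop codon, truncating
    if ref_aa == "*":
        return "stop-loss"  # original stop codon removed
    return "missense"


print(classify_codon_change("CTG", "CTA"))  # synonymous (Leu -> Leu)
print(classify_codon_change("CGA", "TGA"))  # nonsense (Arg -> stop)
print(classify_codon_change("GAG", "GTG"))  # missense (Glu -> Val)
```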
Why is it useful to know the variant frequency in the normal population? (4)
- Variant frequencies can be found in the gnomAD database
- gnomAD aims to exclude individuals with severe (paediatric) disease
- If the variant is reported in gnomAD it is less likely to be pathogenic, because it has been seen in healthy individuals
- If it is seen in 5% or more of the population it is unlikely to be pathogenic
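The frequency rule above can be sketched as a threshold check. The function names and the allele-count inputs are hypothetical; the 5% cut-off is the figure from the card.

```python
COMMON_VARIANT_THRESHOLD = 0.05  # 5%, as stated on the card above


def allele_frequency(allele_count, allele_number):
    """Fraction of sequenced alleles that carry the variant."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def too_common_to_be_pathogenic(frequency, threshold=COMMON_VARIANT_THRESHOLD):
    """True if the population frequency argues against pathogenicity."""
    return frequency >= threshold


af = allele_frequency(600, 10_000)
print(too_common_to_be_pathogenic(af))                           # True
print(too_common_to_be_pathogenic(allele_frequency(2, 10_000)))  # False
```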
Why is it useful to know if an amino acid change is likely to alter protein function? (3)
- Evolutionary conservation: degree of similarity of amino acid sequence across different species
- If the amino acid changed by the variant is highly conserved, the change is likely to affect function
- REVEL scores are used to predict the likely pathogenicity of DNA variants; the higher the score (0-1), the more likely the variant is to be pathogenic
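Evolutionary conservation can be sketched as the fraction of species that share the reference amino acid at one aligned position. The function name `conservation` and the toy alignment columns are assumptions for illustration; this is not how REVEL scores are calculated.

```python
def conservation(column):
    """Fraction of residues in an alignment column matching the first (reference) residue."""
    if not column:
        raise ValueError("column must not be empty")
    reference = column[0]
    return sum(1 for residue in column if residue == reference) / len(column)


# Each string is one aligned position across five species, reference first.
print(conservation("RRRRR"))  # 1.0 (fully conserved)
print(conservation("RRKRQ"))  # 0.6
```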
Why is it useful to know if the variant is loss of function? (3)
- DNA variants resulting in premature truncation (nonsense and frameshifts) are likely to disrupt important functions and be pathogenic
- LoF = DNA variants resulting in proteins with reduced or absent function
- E.g. nonsense, frameshift, donor and acceptor splice site variants
What should be considered when assessing LoF variants? (2)
- Likelihood of nonsense-mediated decay (NMD)
- mRNA that escapes NMD may produce proteins with some function
Which variants may escape NMD? (3)
- DNA variant is present in the last exon
- DNA variant is located in the last 50 nucleotides of the penultimate exon
- These variants are likely to give a milder phenotype because the protein may still retain some function
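The two positional rules above can be sketched as follows. The function name `may_escape_nmd` and the input layout (exon lengths plus 0-based indices) are assumptions for this example, and the result is only a prediction from position.

```python
def may_escape_nmd(exon_lengths, exon_index, offset_in_exon, window=50):
    """Apply the last-exon and penultimate-exon rules to a truncating variant.

    exon_lengths   : exon lengths in transcript order (nucleotides)
    exon_index     : 0-based index of the exon containing the variant
    offset_in_exon : 0-based position of the variant within that exon
    """
    last = len(exon_lengths) - 1
    if exon_index == last:
        return True  # variant is in the last exon
    if exon_index == last - 1:
        # variant is within the last `window` nucleotides of the penultimate exon
        return offset_in_exon >= exon_lengths[exon_index] - window
    return False


exons = [120, 200, 150, 300]
print(may_escape_nmd(exons, 3, 10))   # True  (last exon)
print(may_escape_nmd(exons, 2, 120))  # True  (last 50 nt of penultimate exon)
print(may_escape_nmd(exons, 2, 50))   # False
```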
Why is it useful to know if the variant is seen in patients with the same disease?
Further evidence linking the variant to the disease
Why is it useful to understand the inheritance of the variant? (2)
- Test parents to find out if the variant is inherited or de novo
- A de novo variant in an individual with disease features consistent with the gene in question is evidence of pathogenicity
What is familial segregation analysis? (3)
- Looking for inheritance of the variant in the patient’s relatives in relation to who has/hasn’t got disease features
- Those with the disease have the variant = evidence for pathogenic (segregation of the variant with the disease)
- Uses Sanger sequencing
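A minimal sketch of the segregation check, assuming each relative is recorded as a hypothetical `(name, affected, carries_variant)` tuple. It only tabulates who has the variant and who has the disease; it does not calculate any statistical score.

```python
def segregation_summary(relatives):
    """Summarise whether the variant tracks with disease in a family."""
    affected_non_carriers = [
        name for name, affected, carrier in relatives if affected and not carrier
    ]
    unaffected_carriers = [
        name for name, affected, carrier in relatives if carrier and not affected
    ]
    return {
        # Evidence for pathogenicity: every affected relative has the variant.
        "segregates": not affected_non_carriers,
        "affected_non_carriers": affected_non_carriers,
        "unaffected_carriers": unaffected_carriers,
    }


family = [
    ("mother", True, True),
    ("father", False, False),
    ("sister", True, True),
    ("brother", False, False),
]
print(segregation_summary(family)["segregates"])  # True
```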
What are mutational hotspots?
Regions of a gene where pathogenic variants cluster; they are often found in protein domains with vital functions