Investigating the Genome Flashcards
Definition of genome annotation
Identifying the locations of genes and all coding regions and determining what they do
Definition of actionable genes
Genes in which small variants have reported therapeutic, prognostic, or clinical trial associations
Why do we use whole genome sequencing
Only sequencing technique that can account for all of
- large scale structural changes
- balanced translocations
- distant consanguinity
- uniparental disomy
- novel/known coding/non-coding variants
What are the 4 benefits of whole genome sequencing
Whole genome is complete
The individual’s genome doesn’t change
Potential to collect once, store, refer to again for clinical use
Only need to analyse each time for a specific question
Describe the key properties of Next Gen Sequencing
- cost
- how many fragments are sequenced and how
- fragment length?
- how many times is each position sequenced on average
Allows for cheap whole genome sequencing
Billions of random fragments sequenced in parallel
Fragment length = 150 bases x2
Each position is sequenced 30x on average
2 copies of genome in each cell, each sequenced 15x on average
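The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, not a pipeline step: the function names are hypothetical, and the toy genome size and fragment count in the usage lines are assumptions chosen only so the numbers come out to 30x.

```python
def mean_coverage(n_fragments, read_length, reads_per_fragment, genome_size):
    """Average number of times each base position is sequenced."""
    total_bases_sequenced = n_fragments * read_length * reads_per_fragment
    return total_bases_sequenced / genome_size


def per_copy_depth(coverage, copies=2):
    """Average depth for each copy of the genome in a cell."""
    return coverage / copies


# Toy numbers (assumed): 300 fragments, each read as 150 bases x2,
# over a 3,000-base "genome".
coverage = mean_coverage(300, 150, 2, 3000)
print(coverage)                  # 30.0
print(per_copy_depth(coverage))  # 15.0
```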
Why is it important to sequence each position several times
Any variant between the two copies of the genome is detected many times, so it is less likely to be a sequencing mistake
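A rough way to see why repeated sequencing helps is a binomial calculation. The sketch below assumes a per-read error rate of 1% at a given position and that errors in different reads are independent; both are illustrative assumptions, not figures from these notes. It also counts any error, not only errors that produce the same wrong base, so it overestimates how often mistakes mimic a real variant.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability that at least k of n reads are wrong at one position,
    if each read is wrong independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


ERROR_RATE = 0.01  # assumed per-read error rate, for illustration only

single_read = prob_at_least(1, 1, ERROR_RATE)     # one read, one error
five_of_fifteen = prob_at_least(5, 15, ERROR_RATE)  # 5+ errors in 15 reads

# Many agreeing erroneous reads are far less likely than one erroneous read.
print(five_of_fifteen < single_read)  # True
```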
What are the 2 main limitations of current technology
Short reads of NGS make characterisation of large variants hard
- most genomes sequenced with NGS => knowledge of normal structural variants is limited
Accuracy lower than older more costly sequencing tech
- variants detected by NGS, verified with Sanger
- Involves use of primers to target variant
What are the 2 sources of info about variants
Functional annotation of the reference genome
Occurrence in affected versus unaffected individuals
What is genome annotation
Identifying the locations of genes and coding regions and determining what they do
How would you identify causal variants in rare disease
Filter out commonly observed variants
Look for variants identified as pathogenic
Look for variants in genes linked to condition
Look for variants that affect functional elements
Look for variants at positions that are normally conserved
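The steps above can be sketched as a simple ranking: drop common variants first, then order what remains by how many of the other lines of evidence apply. This is only an illustration under stated assumptions. The boolean field names are hypothetical, and treating every criterion as equally weighted is a simplification, not an established scoring scheme.

```python
# Hypothetical boolean annotations, one per criterion in the list above.
EVIDENCE = (
    "reported_pathogenic",
    "in_condition_gene",
    "affects_functional_element",
    "at_conserved_position",
)


def evidence_count(variant):
    """Number of supporting criteria that are true for this variant."""
    return sum(bool(variant.get(key)) for key in EVIDENCE)


def rank_candidates(variants):
    """Remove common variants, then sort by supporting evidence (most first)."""
    rare = [v for v in variants if v.get("is_rare")]
    return sorted(rare, key=evidence_count, reverse=True)


variants = [
    {"id": "v1", "is_rare": True, "in_condition_gene": True},
    {"id": "v2", "is_rare": False, "reported_pathogenic": True},
    {"id": "v3", "is_rare": True, "in_condition_gene": True,
     "affects_functional_element": True, "at_conserved_position": True},
]
print([v["id"] for v in rank_candidates(variants)])  # ['v3', 'v1']
```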
How would you filter out commonly observed variants
Remove variants with a minor allele frequency (MAF) >= 1%
Use data from gnomAD (has data on exomes and whole genomes)
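A minimal sketch of this filter, assuming each variant has already been annotated with a population allele frequency. The dictionary layout and the `af` key are hypothetical and do not represent the gnomAD API or file format. Variants with no recorded frequency are kept here, on the assumption that they were never observed in the reference population.

```python
MAF_THRESHOLD = 0.01  # 1%, as in the card above


def filter_common(variants, threshold=MAF_THRESHOLD):
    """Keep variants whose allele frequency is below the threshold
    or that have no recorded frequency at all."""
    return [
        v for v in variants
        if v.get("af") is None or v["af"] < threshold
    ]


variants = [
    {"id": "v1", "af": 0.25},    # common: filtered out
    {"id": "v2", "af": 0.0004},  # rare: kept
    {"id": "v3", "af": None},    # never observed: kept
    {"id": "v4", "af": 0.01},    # exactly 1%: filtered out
]
print([v["id"] for v in filter_common(variants)])  # ['v2', 'v3']
```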
What do you need to look out for when looking for variants identified as pathogenic
Frequency of variant occurrence has only recently been surveyed
=> many variants reported as pathogenic are false positives
Where can you look to find variants that affect functional elements
Protein coding sequence
Splicing
Regulatory element
What are the 2 uses of WGS in
- research
- clinical diagnostics
Research
rare disease discovery
- sequences of groups of affected individuals
- look for genes sharing variants
Clinical diagnostics
rare disease diagnostics
- sequences of affected + other family members (affected/unaffected)
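Both uses come down to set comparisons, sketched below. The function names, gene labels and variant labels are hypothetical. The family comparison assumes the simplest possible model (the causal variant is present in every affected relative and absent from every unaffected one), which real analyses relax for inheritance pattern and incomplete penetrance.

```python
from collections import Counter


def genes_shared(cohort, min_individuals):
    """Research: genes carrying a candidate variant in at least
    min_individuals affected people. cohort maps person -> set of genes."""
    counts = Counter(gene for genes in cohort.values() for gene in genes)
    return sorted(g for g, n in counts.items() if n >= min_individuals)


def segregating_variants(affected, unaffected):
    """Diagnostics: variants seen in every affected family member
    and in no unaffected family member."""
    if not affected:
        return set()
    in_all_affected = set.intersection(*affected)
    in_any_unaffected = set().union(*unaffected)
    return in_all_affected - in_any_unaffected


cohort = {
    "p1": {"GENE_A", "GENE_B"},
    "p2": {"GENE_A"},
    "p3": {"GENE_A", "GENE_C"},
}
print(genes_shared(cohort, 3))  # ['GENE_A']

family_affected = [{"v1", "v2", "v3"}, {"v1", "v3"}]
family_unaffected = [{"v3"}]
print(segregating_variants(family_affected, family_unaffected))  # {'v1'}
```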
What are the current limited uses of WGS in clinical care
Limited to
- monogenic disease
- patients with a clear phenotype
- patients who are ill
Reporting limited to
- variants in protein coding sequences, easier to predict mutation effect