Databases and Web-tools Flashcards
Why are online databases, resources and tools becoming increasingly important in clinical diagnostics?
- A massive increase in the amount of DNA sequence information generated globally has been driven by the advent of massively parallel sequencing.
- Huge amounts of data make manual manipulation alone impractical, so in silico tools are needed to analyse the data efficiently.
- Very rare disorders require separate groups to share data to aid interpretation of such rare cases.
- Databases of variants in unaffected people are also needed to interpret genome-wide data.
When using databases, clinical laboratories should consider what factors?
- Determine how frequently the database is updated, whether data curation is supported, and what methods were used for curation
- Confirm the use of HGVS nomenclature and determine the genome build and transcript references used for naming variants
- Determine the degree to which data are validated for analytical accuracy (e.g., low-pass next generation sequencing versus Sanger-validated variants) and evaluate any quality metrics that are provided to assess data accuracy, which may require reading associated publications
- Determine the source and independence of the observations listed
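As a rough illustration of the nomenclature check above, the following sketch tests whether a variant name has the overall shape of an HGVS description: a versioned reference accession, a coordinate type, and a change. The pattern is a deliberate simplification and an assumption of this example, not the full HGVS grammar (it does not cover LRG-style references, for instance), and the variant string is a made-up placeholder.

```python
import re

# Simplified shape check only: <accession>.<version>:<type>.<change>
# This is an illustrative assumption, NOT a complete HGVS parser.
HGVS_SHAPE = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+)\.(?P<version>\d+):(?P<kind>[cgmnpr])\.(?P<change>\S+)$"
)

def check_variant_name(name):
    """Return the parsed parts if the name has an HGVS-like shape, else None."""
    match = HGVS_SHAPE.match(name)
    return match.groupdict() if match else None

# Hypothetical placeholder names, not real database entries.
print(check_variant_name("NM_000000.1:c.100A>G"))  # versioned transcript: parsed
print(check_variant_name("NM_000000:c.100A>G"))    # no version number: None
```

A name that lacks a transcript version, or that uses only a gene symbol, fails the check, which is the kind of ambiguity a laboratory would want to resolve before relying on a database entry.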
What are the general limitations of using external databases?
- Accuracy of the data
- Patient consent and confidentiality associated with public sharing of clinical data
- Intellectual property rights of information
- Duplicate entries being counted as independent entries
- Frequency of update/curation
- Clinical status of participants may be unclear (affected or unaffected?)
- Free or license required?
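The duplicate-entry limitation can be sketched as follows. This minimal example assumes each entry carries a variant name, an individual identifier and a source; these field names are hypothetical, and in practice individual identifiers are often unavailable, which is exactly why duplicates are hard to detect.

```python
def count_independent_observations(entries):
    """Count distinct (variant, individual) pairs per variant.

    The same individual reported through two sources is counted once.
    The keys 'variant' and 'individual_id' are assumptions of this sketch.
    """
    seen = set()
    counts = {}
    for entry in entries:
        key = (entry["variant"], entry["individual_id"])
        if key in seen:
            continue  # same individual already counted for this variant
        seen.add(key)
        counts[entry["variant"]] = counts.get(entry["variant"], 0) + 1
    return counts

# Made-up entries: individual P1 appears twice, via two different sources.
entries = [
    {"variant": "var1", "individual_id": "P1", "source": "publication"},
    {"variant": "var1", "individual_id": "P1", "source": "lab submission"},
    {"variant": "var1", "individual_id": "P2", "source": "lab submission"},
]
print(count_independent_observations(entries))  # {'var1': 2}
```

Counting the three rows naively would suggest three independent observations, whereas only two individuals carry the variant.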
List commonly used tools for browsing reference genome sequences.
- Ensembl Genome Browser
- UCSC Genome Bioinformatics
- NCBI Genome
- Locus Reference Genomic (LRG): a joint venture between the NCBI and EBI. Records contain internationally recognized stable reference sequences designed specifically for reporting clinically relevant sequence variants.
List commonly used databases of population variant data.
- Exome Aggregation Consortium (ExAC): 60,706 unrelated individuals
- NHLBI Exome Variant Server (EVS): created to discover genes contributing to heart, lung and blood disorders. 6,503 samples taken from cardiovascular and respiratory disease cohorts (can be used to rule out severe congenital variants).
- 1000 Genomes Project
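- Genome Aggregation Database (gnomAD): integrates ExAC, EVS and 1000 Genomes data. 123,136 whole-exome sequences (WES) and 15,496 whole-genome sequences (WGS) from unrelated individuals
- dbSNP - SNVs - direct submissions, so quality is variable and many entries are unvalidated
- dbVar - CNVs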
- Database of Genomic Variants (DGV) - CNVs - non-direct submission - good quality
What are locus-specific mutation databases (LSDBs)?
- Repositories that contain variation information for genes and proteins that have disease relevance
- Usually the primary and most trusted variation information source as they are curated and maintained by experts in the gene and disease
- Lists of LSDBs are available from:
- HGVS: http://www.hgvs.org/locus-specific-mutation-databases
- LOVD: http://grenada.lumc.nl/LSDB_list/lsdbs
What were the recommendations for LSDBs and classification of variants proposed by Greenblatt et al., 2008?
- LSDBs should only report a conclusion related to pathogenicity if a consensus has been reached by an expert panel representing different areas of expertise (clinical, diagnostic, molecular, and computational).
- The system used to classify variants should be standardised.
- Evidence supporting a conclusion of pathogenic or neutral should be reported in the database, including the source and the criteria used for assignment.
- Variants should only be classified as pathogenic or not if more than one type of evidence has been considered.
- All instances of all variants should be recorded.
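One way to picture how a database might enforce these recommendations is a record that stores its supporting evidence and only permits a pathogenicity conclusion when an expert panel has reached consensus and more than one type of evidence has been considered. The class and field names below are hypothetical and are not taken from Greenblatt et al. or from any existing LSDB.

```python
from dataclasses import dataclass, field

@dataclass
class VariantRecord:
    """Hypothetical LSDB record; all names are illustrative assumptions."""
    name: str
    # Each item is an (evidence_type, source) pair, so the source is recorded.
    evidence: list = field(default_factory=list)
    panel_consensus: bool = False

    def evidence_types(self):
        """Return the distinct types of evidence recorded for this variant."""
        return {etype for etype, _source in self.evidence}

    def may_report_conclusion(self):
        """Allow a conclusion only with consensus and more than one evidence type."""
        return self.panel_consensus and len(self.evidence_types()) > 1

record = VariantRecord(
    name="var1",
    evidence=[("segregation", "family study"), ("functional", "assay report")],
    panel_consensus=True,
)
print(record.may_report_conclusion())  # True
```

Two pieces of evidence of the same type, or two types without panel consensus, would both leave `may_report_conclusion()` returning `False`.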
What is the Human Variome Project (HVP)?
- Was initiated in 2006 to foster discussion around how the disparate work of genetic variation database curators could be connected globally.
- Aimed to standardise nomenclature and data presentation
- Developed Country Nodes to document variation within a specific population.
- The HVP, HGVS and the GEN2PHEN project are all working toward standardized variation and pathogenicity data presentation.
What are the most commonly used public databases for rare diseases?
- DECIPHER: DatabasE of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources
- ClinVar (SNVs/CNVs)
- Human Gene Mutation Database (HGMD)
- OMIM (Online Mendelian Inheritance in Man)
- ECARUCA (European Cytogeneticists Association Register of Unbalanced Chromosome Aberrations)
What are the most commonly used public databases for oncology?
- Catalogue of Somatic Mutations in Cancer (COSMIC)
- Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer
- Atlas of Genetics and Cytogenetics in Oncology and Haematology