20.02.20 External databases/ Bioinformatic resources Flashcards
What are external bioinformatic databases
Databases that store biological data
Types of bioinformatic databases
- Genome/sequence
- Gene expression
- Transcriptomic
- Proteomic
- Epigenetic
What two types of DNA databases are there
- Primary: contain experimentally derived data
- Secondary: data produced from the analysis of primary data
Examples of Primary DNA databases
- EMBL-EBI: European Molecular Biology Laboratory, European Bioinformatics Institute. Hosts the European Nucleotide Archive (ENA). Ensembl= genome browser, BLAST/BLAT= sequence search
- GenBank (NCBI, National Center for Biotechnology Information). The NCBI Nucleotide database contains DNA sequences from a group of sources (GenBank, RefSeq) for 300,000 organisms
- DNA Data Bank of Japan (DDBJ)= nucleotide sequence and evolution data.
Examples of secondary DNA databases
- OMIM (Online Mendelian Inheritance in Man)= contains genotype-phenotype information on mendelian disorders.
- RefSeq= annotated references for genomic, transcriptomic and protein data.
- 1000 Genomes Project= data is available on the Ensembl platform
- HapMap= Map of haplotype regions and SNPs within them.
What are phastCons and phyloP scores
- Used for phylogenetic and evolutionary conservation predictions. The models are used in the UCSC Genome Browser and in variant classification software (e.g. Alamut).
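As a rough illustration of how the two scores are read: phastCons gives a probability between 0 and 1 that a base lies in a conserved element, while phyloP gives a signed score where positive values suggest conservation and negative values suggest accelerated evolution. The sketch below is only a reading aid. The function names are hypothetical and the 0.9 phastCons cutoff is an assumption chosen for the example, not a published threshold.

```python
def describe_phastcons(score: float) -> str:
    """phastCons: probability (0 to 1) that a base lies in a conserved element."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("phastCons scores lie between 0 and 1")
    # Assumption: 0.9 is an illustrative cutoff, not an official threshold.
    if score >= 0.9:
        return "likely conserved element"
    return "weak or no evidence of a conserved element"


def describe_phylop(score: float) -> str:
    """phyloP: signed score; the sign gives the direction of the deviation."""
    if score > 0:
        return "conserved (evolving slower than expected under neutrality)"
    if score < 0:
        return "accelerated (evolving faster than expected under neutrality)"
    return "consistent with neutral evolution"


if __name__ == "__main__":
    print(describe_phastcons(0.95))
    print(describe_phylop(-1.2))
```

In practice both scores are usually considered together with other evidence rather than against a single fixed cutoff.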
What is Human Splicing Finder
A tool to predict the effects of mutations on splicing signals or to identify splicing motifs in any human sequence.
What is gnomAD
- Genome aggregation database
- Large population dataset from unrelated individuals: 125,748 exomes and 15,708 genomes (v2.1 release)
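The cohort sizes above set the upper limit on how many alleles can be observed at an autosomal site, which is what makes population allele frequencies interpretable. The following is a minimal sketch of that arithmetic, assuming diploid autosomal sites and the usual allele count (AC) divided by allele number (AN) definition of allele frequency. The function names are hypothetical and the example counts are made up for illustration.

```python
EXOMES = 125_748
GENOMES = 15_708
TOTAL_INDIVIDUALS = EXOMES + GENOMES  # 141,456 individuals


def max_allele_number(n_individuals: int, ploidy: int = 2) -> int:
    """Upper bound on AN at an autosomal site if every individual is genotyped."""
    return n_individuals * ploidy


def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Allele frequency as AC / AN."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    if not 0 <= allele_count <= allele_number:
        raise ValueError("allele_count must lie between 0 and allele_number")
    return allele_count / allele_number


if __name__ == "__main__":
    an_max = max_allele_number(TOTAL_INDIVIDUALS)
    print(f"Maximum autosomal AN: {an_max}")
    # Hypothetical variant seen 3 times among fully genotyped individuals.
    print(f"Example AF: {allele_frequency(3, an_max):.2e}")
```

Real sites often have a lower AN than this maximum because not every individual has a confident genotype call at every position.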
What is Alamut Visual
- Software that incorporates multiple datasets from different sources to allow user-friendly and efficient variant classification and genome interrogation.
Examples of gene expression databases
- ArrayExpress: archived functional genomic data from microarray and sequencing platforms.
- Human Protein Atlas: expression profiles of human protein-coding genes at the mRNA and protein levels across multiple tissues.
Examples of transcriptomic databases
- miRBase (microRNA database) from the University of Manchester= published miRNA sequences and annotation.
- Rfam= collection of RNA families, represented by multiple sequence alignments and consensus secondary structures.
Examples of protein sequence databases
- DisProt: database of manually curated experimental evidence of intrinsic protein disorder.
- InterPro: provides functional analysis of proteins by classifying them into families and predicting domains and important sites.
- Pfam (EMBL-EBI): collection of protein families shown as multiple sequence alignments and hidden Markov models. Enables identification of protein domains.
- UniProt and Swiss-Prot (EMBL-EBI): curated protein sequence information
- NCBI (National Center for Biotechnology Information): database of nucleotides, genomes, SNPs, proteins.
- Protein Data Bank (PDB): archive of macromolecular structure data (X-ray, NMR).
Protein interaction databases
- BioGRID (Biological General Repository for Interaction Datasets): archive of protein interaction data from model organisms and human studies.
- RBPDB: database of RNA-binding protein specificity.
What is The Cancer Genome Atlas (TCGA)
- Genomic, epigenomic, transcriptomic and proteomic data from over 20,000 primary cancer and matched normal samples spanning 33 cancer types.
What is MethBase
Reference methylomes from different organisms.