Introduction Flashcards
What is Bioinformatics?
Application of computer science to biology
What are the main computational challenges?
- Data acquisition, tracking and preliminary analysis
- Large requirements for storage, computing power and specialized software
- Model creation
Name some research areas of Bioinformatics
Sequence analysis and function prediction
Protein structure analysis and prediction
Comparative genomics, evolutionary biology
Gene and protein expression
Protein-protein interactions (PPI)
Proteomics
What are the types of tools used in bioinformatics?
- Databases
- Software
- Servers
Main Service centers
- NCBI
- EBI
- CIB
What are the properties of model organisms?
- Easy maintenance and breeding
- Selected for specific purposes:
  - Experimental
  - Genetic
  - Genomic
- Good for specific purposes
What is a database?
A collection of related data that is structured, searchable, periodically updated and cross-referenced. It includes the tools necessary for access, updating and information management. Each database has a specific format for data storage.
Most important databases
See table, page 8
What is a server?
A computer at an institution that provides services to other computers (db storage and associated tools)
Main servers?
Expasy, UniProt, NCBI, EBI, GenomeNet (Japan)
What types of life sciences databases are there?
- Nucleotide sequences (DNA, RNA)
- Genomics
- Mutation/polymorphism
- Protein sequences
- Protein domain/family
- Proteomics
- 3D structure
- Metabolism/pathways
- Bibliography
What are the main nucleotide sequence databases?
EMBL/ENA (Europe)
GenBank (USA)
DDBJ (Japan)
What are genomic db?
Databases that contain information on the chromosomal location of genes (mapping) and their nomenclature, and that provide links to sequence db. They usually contain no sequences.
Examples of genomic db
MIM, GDB (human), MGD (mouse), FlyBase (Drosophila), SGD (yeast), MaizeDB.
Examples of genome browsers
Ensembl, UCSC
What are SNPs?
Single nucleotide polymorphisms are single-base genetic differences between individuals that contribute to human variation (physical traits, behavior, disease, response to therapy).
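To make the idea of a single-base difference concrete, here is a minimal Python sketch that compares two already aligned sequences position by position. The function name `find_snps` and both sequences are hypothetical, made up for illustration. Real SNP identification works on many individuals and on allele frequencies, so this only shows what "one nucleotide differs" means.

```python
def find_snps(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# Two made-up aligned sequences from two hypothetical individuals
individual_1 = "ATGCCGTA"
individual_2 = "ATGTCGTA"
snps = find_snps(individual_1, individual_2)
print(snps)
```

With these two sequences the only difference is at position 4 (C in the first individual, T in the second).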
Mutation/ polymorphism db
They contain data on SNPs, usually related to a specific disease. One general db: dbSNP
Protein sequence db
UniProtKB: Swiss-Prot + TrEMBL
NCBI-nr: Swiss-Prot + GenPept + PIR + PDB + PRF + RefSeq
What are protein domains and what do domain/family db contain?
Most proteins are built from conserved modular structures (domains). These databases contain information on domain signatures, modelled from sequence alignments.
Some protein domain db?
PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, DOMO, BLOCKS, CDD
Proteomics db examples
Swiss2Dpage, Eco2Dbase, maize2Dbase
Name the database of 3D structures
Protein Data Bank (PDB); it is accessible through several servers
What does a 3D structure db contain?
Spatial coordinates (x, y, z) of the atoms of macromolecules whose structure has been determined experimentally. Most (90%) are proteins, but there are also polysaccharides, RNA, viruses...
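As an illustration of how such coordinates are stored, the sketch below reads x, y and z from an ATOM line in the legacy PDB text format, where they occupy fixed columns 31-38, 39-46 and 47-54. The helper names `format_atom` and `parse_xyz` and the coordinate values are assumptions made for this example. The line is simplified: it stops after the coordinates and omits occupancy, temperature factor and element.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a simplified ATOM line using the PDB fixed-column layout.

    Columns: 1-6 record name, 7-11 serial, 13-16 atom name, 18-20 residue,
    22 chain, 23-26 residue number, 31-54 the x, y, z coordinates.
    """
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


def parse_xyz(line):
    """Read x, y, z from columns 31-38, 39-46 and 47-54 of an ATOM line."""
    return float(line[30:38]), float(line[38:46]), float(line[46:54])


# Made-up alpha carbon of an alanine residue (illustrative values only)
line = format_atom(1, " CA", "ALA", "A", 1, 11.104, 6.134, -6.504)
coords = parse_xyz(line)
print(line)
print(coords)
```

Slicing by column rather than splitting on whitespace matters here, because adjacent coordinate fields can touch each other when values are large or negative.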
What do metabolic db contain?
Information describing enzymes, biochemical reactions and metabolic pathways
Name metabolic db
MetaCyc
KEGG
UniPathway
Rhea
Name nomenclature db
ENZYME, BRENDA
They store info on enzyme names and reactions
What is the interactome?
Description of all protein/protein interactions in a sample
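One common way to picture an interactome is as an undirected graph, with proteins as nodes and interactions as edges. The following Python sketch assumes that representation. The protein names `P1` to `P4` and the function `build_interactome` are hypothetical and do not come from any database.

```python
from collections import defaultdict


def build_interactome(pairs):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        # An interaction is symmetric, so record it in both directions
        graph[a].add(b)
        graph[b].add(a)
    return graph


# Made-up pairwise interactions detected in a hypothetical sample
interactions = [("P1", "P2"), ("P2", "P3"), ("P1", "P4")]
interactome = build_interactome(interactions)

# All interaction partners of P1
partners_of_p1 = sorted(interactome["P1"])
print(partners_of_p1)
```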
What is Gene Ontology?
What are its terms divided into?
A controlled vocabulary: standard terms used for indexing and retrieving information. Its terms are divided into:
- Biological process
- Molecular function
- Cellular component
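The three categories above can be used to index and retrieve annotations, which is the purpose of a controlled vocabulary. The sketch below is a minimal illustration in Python. The root identifiers are the GO root terms for each category, while the gene names `geneA` and `geneB`, the annotation list and the function `genes_by_term` are hypothetical.

```python
from collections import defaultdict

# Root term of each of the three GO categories
GO_ROOTS = {
    "biological_process": "GO:0008150",
    "molecular_function": "GO:0003674",
    "cellular_component": "GO:0005575",
}

# Made-up annotations: (gene, category, term name)
annotations = [
    ("geneA", "biological_process", "glycolytic process"),
    ("geneA", "cellular_component", "cytosol"),
    ("geneB", "molecular_function", "ATP binding"),
    ("geneB", "cellular_component", "cytosol"),
]


def genes_by_term(annotations, namespace):
    """Group genes by term name within one GO category."""
    if namespace not in GO_ROOTS:
        raise ValueError(f"unknown GO category: {namespace}")
    index = defaultdict(set)
    for gene, category, term in annotations:
        if category == namespace:
            index[term].add(gene)
    return dict(index)


print(genes_by_term(annotations, "cellular_component"))
```

Because both genes use the same standard term, a single lookup on "cytosol" retrieves both, which would not work if each annotation used free text.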
What should we consider when using a db?
- Quality of the data
- Comprehensiveness
- Update frequency
- Redundancy
- Indexing
- Server quality (fast response)