Bioinformatics lecture Flashcards
The extent to which two sequences are the same
Identity
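A minimal sketch of how identity is commonly reported as a percentage, assuming the two sequences are already aligned to the same length. The function name `percent_identity`, the `-` gap character, and the choice to divide by the full alignment length are assumptions for illustration; tools differ in which denominator they use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0
    # A column counts as a match only if the residues agree and neither is a gap.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Five of the six columns match; the gapped column does not count.
print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```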
Lining up two or more sequences to search for the maximal regions of identity in order to assess the extent of biological relatedness or homology
Alignment
The relatedness of sequences
Similarity
A fixed set of commands in a computer program
Algorithm
A space introduced in alignment to compensate for insertions or deletions in one of the sequences being compared
Gap
Similarity attributed to descent from a common ancestor
Homology
The sequence presented for comparison with all other sequences in a selected database.
Query
The genetic sequence database sponsored by the National Institutes of Health.
GenBank
Describes the number of matches to the query expected by chance when searching a database of a particular size.
E-value (Expect value)
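The dependence on database size can be sketched with the commonly cited relation E = m × n × 2^(−S'), where S' is the bit score, m the query length, and n the total database length. The function name `expect_value` and the example numbers are hypothetical; real BLAST searches use effective lengths with edge corrections, so this is only an approximation.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical search: 100-residue query, bit score 40.
small_db = expect_value(40, 100, 10**9)
large_db = expect_value(40, 100, 2 * 10**9)

# The same hit is twice as likely to occur by chance in a database twice as large.
print(small_db, large_db)
```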
The study of evolutionary relatedness among species by comparing homologies and differences in gene sequences
Phylogenetics
- A field that uses computers to store and analyze molecular biological information.
- The application of tools of computation and analysis to the capture and interpretation of biological data.
BIOINFORMATICS
- Allow the storage and management of large biological data sets
- Data are being generated at a much faster pace than they can be analyzed (e.g., Human Genome Project)
CREATION OF DATABASES
Determine relationships among members of large data sets
DEVELOPMENT OF ALGORITHMS AND STATISTICS
- Transcriptomics
- Microbiomics
- Metabolomics
- Genomics
- Proteomics
BRANCHES OF BIOINFORMATICS
- Retrieving DNA sequences from databases
- Computing nucleotide compositions
- Identifying restriction sites
- Designing polymerase chain reaction (PCR) primers
- Identifying open reading frames (ORFs)
- Predicting elements of DNA/RNA secondary structure
- Finding repeats
- Computing the optimal alignment between two or more DNA sequences
- Finding polymorphic sites in genes (single nucleotide polymorphisms, SNPs)
- Assembling sequence fragments
- Creation and visualization of 3D structure models for biological molecules of significance.
BIOINFORMATICS APPLICATIONS
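Three of the applications listed above (nucleotide composition, restriction sites, and ORFs) can be sketched in plain Python. All function names and the example sequence are hypothetical; the ORF finder scans only the forward strand and is a simplification of what real tools do. GAATTC is the EcoRI recognition site.

```python
from collections import Counter


def nucleotide_composition(seq: str) -> dict:
    """Count each base in the sequence (case-insensitive)."""
    return dict(Counter(seq.upper()))


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_sites(seq: str, site: str) -> list:
    """0-based start positions of every (possibly overlapping) occurrence of site."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def find_orfs(seq: str, min_codons: int = 2) -> list:
    """(start, end) slices of ATG...stop ORFs in the three forward reading frames.

    min_codons counts codons from ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    stops = {"TAA", "TAG", "TGA"}
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in stops:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # end index includes the stop codon
                start = None
    return orfs


dna = "CCATGAAAGAATTCTAAGG"  # made-up example sequence
print(nucleotide_composition(dna))
print(round(gc_content(dna), 3))
print(find_sites(dna, "GAATTC"))
print(find_orfs(dna))
```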
- Microbial genome applications
- Molecular medicine
- Personalized medicine
- Gene therapy
- Drug development
- Antibiotic resistance
- Evolutionary studies
- Waste cleanup
- Biotechnology
- Climate change studies
- Alternative energy sources
- Crop improvement
- Forensic analysis
- Bio-weapon creation
- Insect resistance
- Improve nutritional quality
- Veterinary science
BIOINFORMATICS APPLICATIONS IN VARIOUS FIELDS
THREE EARLIEST DNA SEQUENCE AND PROTEIN DATABASES
- DDBJ (DNA DataBank of Japan)
- EMBL (European Molecular Biology Laboratory)
- GenBank (USA)
- Contain original data in the form of primary sequence data or structural data as submitted by the scientific community.
- Examples: GenBank, EMBL, DDBJ, SWISS-PROT and PIR
PRIMARY DATABASES
Contain information that has been processed and derived from the raw data available in primary databases
SECONDARY DATABASES
- A way of arranging sequences of DNA, RNA, or protein to identify regions of similarity.
SEQUENCE ALIGNMENT
- To understand functional, structural, or evolutionary relationships between the sequences
- To identify regions of similarity
TYPES OF SEQUENCE ALIGNMENT
- Pairwise - compare two sequences
- Multiple - compare more than two sequences
compare more than two sequences
o MUSCLE
o MAFFT
o CLUSTAL Omega
Multiple
compare two sequences
o EMBOSS WATER
o BLAST
Pairwise
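EMBOSS Water performs local pairwise alignment with the Smith-Waterman algorithm. The sketch below computes only the best local alignment score, assuming a simple match/mismatch scheme and a linear gap penalty; the function name and scoring values are hypothetical, and production tools typically use substitution matrices and affine gap penalties instead.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor of 0 lets a local alignment restart anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared "ACG" core is found even though the flanking residues differ.
print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```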