Databases Flashcards
UniProt
protein
EMBL
gene
ENSEMBL
genome
NCBI
bacteria
methods for sequence comparison
- Diagonal plots
- FASTA
- BLAST
FASTA
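a heuristic algorithm that speeds up alignments with hash tables: the sequences are broken into K-tuples (short words of length K), exact K-tuple matches between query and database sequence are looked up, and the search then concentrates on regions rich in K-tuple hits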
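The K-tuple lookup described above can be sketched in a few lines. This is only an illustration of the hashing idea, not the actual FASTA program; the function names and the default `k=2` are assumptions made for the example.

```python
from collections import defaultdict

def build_ktuple_index(seq, k):
    """Hash table: each K-tuple in seq -> list of start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def ktuple_hits(query, target, k=2):
    """Return (query_pos, target_pos) pairs where a K-tuple matches exactly."""
    index = build_ktuple_index(target, k)
    hits = []
    for q in range(len(query) - k + 1):
        for t in index.get(query[q:q + k], []):
            hits.append((q, t))
    return hits

def diagonal_counts(hits):
    """Count hits per diagonal (target_pos - query_pos).

    Many hits on one diagonal suggest a region worth aligning in detail.
    """
    counts = defaultdict(int)
    for q, t in hits:
        counts[t - q] += 1
    return dict(counts)

hits = ktuple_hits("ACGT", "TACGTA", k=2)
print(hits)
print(diagonal_counts(hits))
```

Grouping hits by diagonal is the same idea as reading a diagonal plot: a run of matches on one diagonal marks a shared stretch of sequence.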
BLAST
an algorithm for comparing primary biological sequence information (nucleotide or protein sequences), optimized for speed
blastn
compares your query nucleotide sequence with a database of nucleotide sequences
blastp
compares your query protein sequence with a database of protein sequences
blastx
first translates your query nucleotide sequence into amino acids in all 6 reading frames, then compares the resulting protein sequences with a protein database
tblastn
compares your query protein sequence with a nucleotide database after translating each database sequence into protein in all 6 reading frames
tblastx
translates both the query nucleotide sequence and the database nucleotide sequences in all 6 reading frames, then compares the resulting protein sequences. Looks for protein-coding regions; a good choice for that, since comparing at the protein level gives less noise
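The "all 6 reading frames" step used by blastx, tblastn and tblastx can be sketched as follows. This is a minimal illustration assuming the standard genetic code and an uppercase or lowercase DNA string; the function names are hypothetical and this is not how BLAST itself is implemented.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def reverse_complement(dna):
    """Complement each base, then reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons; unknown codons become 'X', '*' marks a stop."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame_translations(dna):
    """Return translations for frames +1..+3 (forward) and -1..-3 (reverse strand)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

for name, protein in six_frame_translations("ATGGCCTAA").items():
    print(name, protein)
```

Three frames come from starting at offsets 0, 1 and 2 on the forward strand, and three more from the same offsets on the reverse complement.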
PROSITE
a protein database of protein families, domains, and functional sites. Its uses include identifying possible functions of newly discovered proteins and analyzing known proteins for previously undetermined activity
what is PSI-BLAST
Position-Specific Iterated BLAST: an iterative search using the protein BLAST algorithm
how is PSI-BLAST used
- a list of all closely related proteins is created
- these proteins are combined into a general "profile" sequence, which summarizes the significant features present in their sequences
- a query against the protein database is then run using this profile, and a larger group of proteins is found
- this larger group is used to construct another profile, and the process is repeated
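The loop in the steps above can be sketched with a toy model. This is a deliberately simplified illustration, not the real PSI-BLAST: it assumes equal-length, already aligned sequences, uses plain per-column residue frequencies instead of a log-odds scoring matrix, and the function names, threshold and example sequences are all hypothetical.

```python
from collections import Counter

def build_profile(seqs):
    """Per-column residue frequencies for equal-length, pre-aligned sequences."""
    profile = []
    for col in range(len(seqs[0])):
        counts = Counter(s[col] for s in seqs)
        profile.append({res: n / len(seqs) for res, n in counts.items()})
    return profile

def score(profile, seq):
    """Average profile frequency of the residues in seq (between 0 and 1)."""
    return sum(col.get(res, 0.0) for col, res in zip(profile, seq)) / len(profile)

def iterative_search(query, database, threshold=0.6, max_rounds=5):
    """Repeat: build profile from current members, search, rebuild, until stable."""
    members = [query]
    for _ in range(max_rounds):
        profile = build_profile(members)
        found = [s for s in database if score(profile, s) >= threshold]
        new_members = [query] + [s for s in found if s != query]
        if set(new_members) == set(members):
            break  # no change in the group: the search has converged
        members = new_members
    return members

database = ["ACDE", "ACDF", "ACGF", "WWWW"]
print(iterative_search("ACDE", database))
```

In this toy data, `ACGF` scores below the threshold against the query alone but is picked up once `ACDF` has been folded into the profile, which is the point of iterating.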