BINF Flashcards
Protein Data Bank
Curated by humans
Contains multiple protein structures
UniProtKB
Curated by humans
Contains gene-specific information
No nucleotide sequences
RefSeq
Large but non-redundant collection of genes and genomes
Great for BLAST searches (more targeted hits)
NCBI
Nucleotide and protein
BLAST
Basic Local Alignment Search Tool
Most commonly used tool for aligning a query sequence against a sequence database
What is sequence alignment used for?
To determine sequence similarity
To find common motifs
To find point mutations
To find insertions/deletions
What are the two steps involved in sequence alignment?
Construction of the best alignment between the sequences
Assessment of similarity
Global seq alignment
Determines the best alignment over the entire length of two sequences
Best applied when the sequences are similar
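A minimal sketch of global alignment scoring, assuming the Needleman-Wunsch dynamic programming approach with a linear gap penalty. The function name and the scoring values (match=1, mismatch=-1, gap=-2) are illustrative assumptions, not values from these cards.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best score for aligning ALL of a against ALL of b (Needleman-Wunsch)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)

    # The global score is read from the bottom-right cell,
    # because both sequences must be consumed end to end.
    return score[-1][-1]


print(global_alignment_score("ACGT", "ACT"))  # ACGT vs AC-T
```

Because every position must be aligned, unrelated flanking regions are penalized, which is why the global approach suits sequences that are similar along their full length.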
Local seq alignment
Determines the best alignment over shorter stretches rather than the entire length of the two sequences
Best applied when the sequences are substantially different but have regions of similarity
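A minimal sketch of local alignment scoring, assuming the Smith-Waterman approach with a linear gap penalty. The function name and scoring values (match=2, mismatch=-1, gap=-2) are illustrative assumptions.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best score over any pair of substrings of a and b (Smith-Waterman)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor at 0 lets an alignment restart anywhere,
            # which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two otherwise different sequences sharing the region GATTACA.
print(local_alignment_score("TTTTGATTACATTTT", "CCCCGATTACACCCC"))
```

Unlike the global case, the score is the maximum anywhere in the table, so dissimilar flanks do not drag down a well-matched region.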
Multiple seq alignment
Simultaneous alignment of more than 2 sequences
Best applied when looking for conserved sequences
Sequences or patterns in a protein family
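A small sketch of how conserved positions can be read off a multiple alignment. It assumes the sequences are already aligned (same length, "-" for gaps) and does not build the alignment itself. The function name and the example sequences are made up for illustration.

```python
def conserved_columns(aligned):
    """Return 0-based column indexes where every sequence has the same residue."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i
        for i, column in enumerate(zip(*aligned))
        if len(set(column)) == 1 and column[0] != "-"  # identical and not a gap
    ]


# Hypothetical aligned members of a protein family.
family = ["MKT-LLV", "MKTALLI", "MRT-LLV"]
print(conserved_columns(family))
```

Columns that stay identical across the whole family are candidates for conserved motifs.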
When aligning two sequences, how do you determine which alignment is best?
Use the concept of an alignment score
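A minimal sketch of an alignment score, assuming a simple scheme where each column of the alignment contributes a match reward, a mismatch penalty, or a gap penalty. The function name and the values (match=1, mismatch=-1, gap=-2) are illustrative assumptions; real tools typically use substitution matrices and separate gap-open and gap-extend penalties.

```python
def alignment_score(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Score two already-aligned rows of equal length ('-' marks a gap)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


# Two candidate alignments of ACGT and ACT; the higher score is preferred.
candidates = [("ACGT", "AC-T"), ("ACGT", "-ACT")]
best = max(candidates, key=lambda rows: alignment_score(*rows))
print(best, alignment_score(*best))
```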
General approach to BLAST
It is the most commonly used tool for aligning a query against a sequence database
Uses a matching word to start the alignment
High-scoring words are extended in either direction until the alignment score starts to drop
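The seed-and-extend idea can be sketched as below. This is a simplified illustration, not the actual BLAST implementation: it assumes exact word matches as seeds, ungapped extension only, and a hypothetical `x_drop` cutoff for "the score starts to drop". All names and parameter values are assumptions.

```python
def seed_and_extend(query, subject, w=4, match=1, mismatch=-2, x_drop=3):
    """Return (score, query_start, subject_start, length) of the best ungapped hit, or None."""
    # Index every word of length w in the subject.
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)

    def extend(i, j, step):
        """Walk away from (i, j) in one direction; return (best gain, positions kept)."""
        score = best = best_len = n = 0
        i += step
        j += step
        while 0 <= i < len(query) and 0 <= j < len(subject):
            score += match if query[i] == subject[j] else mismatch
            n += 1
            if score > best:
                best, best_len = score, n
            elif best - score > x_drop:
                break  # score has dropped too far below its peak
            i += step
            j += step
        return best, best_len

    best_hit = None
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            right_gain, right_len = extend(i + w - 1, j + w - 1, +1)
            left_gain, left_len = extend(i, j, -1)
            score = w * match + left_gain + right_gain
            hit = (score, i - left_len, j - left_len, w + left_len + right_len)
            if best_hit is None or hit[0] > best_hit[0]:
                best_hit = hit
    return best_hit


print(seed_and_extend("GATTACA", "CCCGATTACACCC"))
```

Seeding on short words avoids filling a full dynamic programming table for every database sequence, which is the main reason this heuristic is fast.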
S=__
alignment score
W=
word length. 8-9 nt
P value
Probability that an alignment with a score greater than or equal to S occurred by chance
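A rough sketch of how a P value relates to the score S, assuming Karlin-Altschul statistics: the expected number of chance hits is E = K * m * n * exp(-lambda * S), and P = 1 - exp(-E). The values of `lam` and `k` below are placeholders chosen for illustration only; in practice they depend on the scoring system. The function names are hypothetical.

```python
import math


def expected_hits(score, query_len, db_len, lam, k):
    """E value: expected number of chance alignments scoring at least `score`."""
    return k * query_len * db_len * math.exp(-lam * score)


def p_value(e_value):
    """Probability of at least one chance alignment, given the E value."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


# Placeholder parameters, not taken from any real scoring system.
for s in (20, 40, 60):
    e = expected_hits(s, query_len=100, db_len=1_000_000, lam=0.3, k=0.1)
    print(s, e, p_value(e))
```

Higher scores give exponentially smaller E values, and for small E the P value is approximately equal to E.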