Quiz 1 - BINF Flashcards
What is the Worldwide Protein Data Bank (wwPDB)?
Has 3D structures of proteins and nucleic acids, ligand interactions, mutations, and links to other protein databases
UniProtKB is
curated by humans, has gene-specific info, and is validated; it has no nucleotide sequences, so it is protein focused
The NCBI has
a large but redundant amount of data: genes and genomes of any organism, mRNA, microRNA, and anything that’s ever been sequenced, so it covers both proteins and nucleotides
RefSeq is
large but not redundant; has genes and genomes, mRNA, and microRNA; good for BLAST searches; nucleotide focused
FASTA format is
a simple text format for a sequence with metadata: a header line starting with ">" followed by the sequence itself
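A minimal sketch of reading FASTA text, assuming the usual layout of a ">" header line followed by one or more sequence lines. The function name parse_fasta and the two records are made up for illustration and are not real database entries.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are joined into one string per record.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up example records (hypothetical, not from any database).
example = (
    ">seq1 hypothetical example protein\n"
    "MKTAYIAK\n"
    "QRQISFVK\n"
    ">seq2 another made-up record\n"
    "ACGTACGT\n"
)
records = parse_fasta(example)
```

The header text is the metadata; everything up to the next ">" is the sequence.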
BLAST stands for
Basic Local Alignment Search Tool
What is sequence alignment used for?
To find sequence similarity, common motifs, point mutations, and insertions and deletions
The three types of sequence alignment
Global, local, multiple
What does global sequence alignment do?
Determines the best alignment over the entire length of two sequences
Best when sequences are similar
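A rough sketch of scoring a global alignment with dynamic programming. This is the standard Needleman-Wunsch approach, which these cards do not name, so treat the choice of algorithm as an assumption. The scoring (match +1, mismatch and gap -1) follows the simple scheme used in this deck, and the function name is hypothetical.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best score for aligning ALL of a against ALL of b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    # The bottom-right cell covers the entire length of both sequences.
    return score[-1][-1]


identical = global_alignment_score("GATTACA", "GATTACA")
one_deletion = global_alignment_score("ACGT", "AGT")
```

Because the score is read from the last cell, every residue of both sequences has to be accounted for, which is why this works best when the sequences are similar.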
What does local sequence alignment do?
Looks at stretches that are shorter than the entire sequence
Good for comparing very different sequences that share regions of similarity
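A rough sketch of scoring a local alignment. This uses the standard Smith-Waterman idea (scores are never allowed to go below zero), which these cards do not name, so the choice of algorithm is an assumption. Scoring follows the simple +1/-1 scheme from this deck, and the function name and example sequences are made up.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best score over any pair of sub-stretches of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets an alignment restart anywhere, so dissimilar
            # flanking regions do not drag the score down.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two otherwise different sequences that share the stretch GATTACA.
shared_region = local_alignment_score("TTTTGATTACATTTT", "CCCGATTACACCC")
```

The best score can come from any cell of the table, not just the last one, which is what makes the alignment local.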
What does multiple sequence alignment do?
Aligns more than 2 sequences
Good when looking for conserved sequences or patterns in a protein family
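Building a multiple alignment is more than a short sketch can cover, so this example assumes the sequences are already aligned (same length, "-" for gaps) and only shows the last step: spotting conserved columns. The function name and the three aligned sequences are hypothetical.

```python
def conserved_columns(aligned):
    """Indices of columns where every sequence has the same residue and no gap."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for i in range(length):
        column = {s[i] for s in aligned}
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Hand-made toy alignment of a made-up protein family.
family = [
    "MKT-AYIAK",
    "MRTQAYLAK",
    "MKTQAY-AK",
]
positions = conserved_columns(family)
```

Runs of neighbouring conserved columns are the kind of pattern one would look at as a candidate motif.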
The mathematical framework of sequence alignment aims to
establish residue-to-residue correspondences between sequences while preserving the order of the residues
The math allows for
the introduction of gaps, i.e. residue-to-nothing correspondences in a sequence
Alignment scores, explain
How to determine the best alignment. Matches are +1; mismatches and gaps are -1
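A small sketch of scoring one already-written alignment with the rule above (match +1, mismatch -1, gap -1). The function name and the example pair are made up, and "-" is assumed to mark a gap.

```python
def alignment_score(top, bottom, match=1, mismatch=-1, gap=-1):
    """Sum the column scores of two aligned strings of equal length."""
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have the same length")
    total = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            total += gap        # residue-to-nothing
        elif x == y:
            total += match      # identical residues
        else:
            total += mismatch   # different residues
    return total


# 5 matches, 1 mismatch (A/C), 1 gap: 5 - 1 - 1 = 3
example_score = alignment_score("GATTACA", "GA-TCCA")
```

The best alignment is the one whose total is highest among all possible ways of lining the two sequences up.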
BLAST uses a matching word to start (seed) the alignment, and
high-scoring words are extended in either direction until the alignment score drops
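A toy sketch of the seed-and-extend idea, not the real BLAST implementation. Assumptions made here for illustration: seeds are exact word matches, extension is ungapped, scoring is +1/-1, and extension stops once the running score falls more than `drop` below the best seen. The names seed_and_extend, extend, word_size and drop are all hypothetical.

```python
MATCH, MISMATCH = 1, -1


def extend(query, subject, qi, si, step, drop):
    """Walk from (qi, si) in direction step; return (best score gain, its length)."""
    score = best = best_len = n = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += MATCH if query[qi] == subject[si] else MISMATCH
        n += 1
        if score > best:
            best, best_len = score, n
        elif best - score > drop:
            break  # score has dropped too far below the best: stop extending
        qi += step
        si += step
    return best, best_len


def seed_and_extend(query, subject, word_size=3, drop=2):
    """Return (score, query_start, subject_start, length) of the best hit, or None."""
    # Index every word of the subject by its start position.
    index = {}
    for s in range(len(subject) - word_size + 1):
        index.setdefault(subject[s:s + word_size], []).append(s)
    best_hit = None
    for q in range(len(query) - word_size + 1):
        for s in index.get(query[q:q + word_size], []):
            right, right_len = extend(query, subject, q + word_size, s + word_size, 1, drop)
            left, left_len = extend(query, subject, q - 1, s - 1, -1, drop)
            hit = (
                word_size * MATCH + left + right,    # seed score plus both extensions
                q - left_len,
                s - left_len,
                word_size + left_len + right_len,
            )
            if best_hit is None or hit[0] > best_hit[0]:
                best_hit = hit
    return best_hit


hit = seed_and_extend("TTTTGATTACATTTT", "CCCGATTACACCC")
```

In this made-up example the seed word GAT grows into the full shared stretch GATTACA and then stops, because the flanking residues only lower the score.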