Databases 2 Flashcards
what is pairwise alignment
comparing two sequences by sliding one along the other until the greatest number of nucleotides (or amino acids) line up
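A minimal sketch of the sliding idea, without gaps. The function name `best_ungapped_offset` and the example sequences are made up for illustration; real alignment tools use dynamic programming rather than this brute-force scan.

```python
def best_ungapped_offset(seq_a, seq_b):
    """Slide seq_b along seq_a and return (offset, matches) for the best overlap."""
    best_offset, best_matches = 0, -1
    # Negative offsets hang seq_b off the left end of seq_a, positive ones shift it right.
    for offset in range(-(len(seq_b) - 1), len(seq_a)):
        matches = 0
        for j, base in enumerate(seq_b):
            i = offset + j
            # Only count positions where the two sequences actually overlap.
            if 0 <= i < len(seq_a) and seq_a[i] == base:
                matches += 1
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches


print(best_ungapped_offset("AAACGTAAA", "CGT"))
```

Here the short sequence lines up best when it is shifted 3 positions along the longer one, where all 3 of its nucleotides match.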
what happens if you add gaps in pairwise alignment
more positions can be lined up as matches, but each gap is penalised in the score
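what is the scoring system for pairwise alignment
identities (matches) add to the score; a penalty is subtracted for opening a gap and a (usually smaller) penalty for each position the gap is extended - so the score takes into account both gaps and identities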
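A rough sketch of how such a score could be added up for an alignment that has already been made. The function name `score_alignment` and the numbers (match +1, mismatch -1, gap opening -2, gap extension -1) are assumptions chosen for illustration, not values from any particular program. It also assumes no column has a gap in both sequences.

```python
def score_alignment(aligned_a, aligned_b, match=1, mismatch=-1,
                    gap_open=-2, gap_extend=-1):
    """Score two aligned sequences of equal length, with '-' marking gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    prev_gap = None  # which sequence held the gap in the previous column
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            current = "a" if x == "-" else "b"
            # First position of a gap pays gap_open, later positions pay gap_extend.
            score += gap_extend if prev_gap == current else gap_open
            prev_gap = current
        else:
            score += match if x == y else mismatch
            prev_gap = None
    return score


print(score_alignment("ACGTTTAC", "ACG---AC"))  # one gap of length 3
print(score_alignment("ACGTTTAC", "AC-T-T-C"))  # three separate gaps of length 1
```

Both alignments have 5 identities and 3 gap positions, but the single long gap scores 1 while the three separate gaps score -1, because each new gap pays the opening penalty again.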
what are the negatives of multiple sequence alignment
aligning 3 or more sequences is slow - the computing time rises steeply as more sequences are added
what two amino acids both have a -ive charge
aspartate D, glutamate E
what amino acids have a +ive charge
lysine K, arginine R (histidine H can also be +ive) - phenylalanine F is uncharged and hydrophobic
what is MAFFT
a program for making multiple sequence alignments
if you find an amino acid in all organisms what does that tell you
it is an important part of the protein, so it has been conserved
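As a small illustration, fully conserved positions can be picked out of a multiple sequence alignment by checking each column. The function name `conserved_columns` and the three toy sequences are invented for this sketch; it assumes all rows have the same length and uses '-' for gaps.

```python
def conserved_columns(alignment):
    """Return 0-based column indices where every sequence has the same residue."""
    conserved = []
    # zip(*alignment) walks the alignment column by column.
    for index, column in enumerate(zip(*alignment)):
        residues = set(column)
        # A column counts as conserved only if it holds one residue and no gap.
        if len(residues) == 1 and "-" not in residues:
            conserved.append(index)
    return conserved


toy_alignment = ["MKTAY", "MRTAY", "MKSAY"]
print(conserved_columns(toy_alignment))
```

In the toy alignment the M, A and Y columns (indices 0, 3 and 4) are identical in every sequence, so those are the positions flagged as conserved.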
what is a DNA motif
short recurring patterns in DNA presumed to have a biological function - they have been conserved but don’t code for aa’s
what provides the first insight into a protein’s function
domains
where is the ATPase domain found
in some transcription factors (and in many other proteins that hydrolyse ATP)
what does a sequence logo show
the degree of conservation at each position - taller letters mark more conserved aa’s, which are likely to be more important to the protein
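The letter-stack height in a logo is usually measured in bits of information. A minimal sketch of that calculation for one alignment column is below; the function name `column_information` is made up, the column is assumed to contain no gaps, and the small-sample correction that some logo tools apply is left out.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column.

    alphabet_size is 20 for proteins and 4 for DNA.
    """
    counts = Counter(column)
    total = len(column)
    # Shannon entropy of the observed residue frequencies.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum possible entropy minus observed entropy.
    return math.log2(alphabet_size) - entropy


print(column_information("AAAA", alphabet_size=4))  # fully conserved DNA column
print(column_information("AACC", alphabet_size=4))  # two bases, equally common
print(column_information("ACGT", alphabet_size=4))  # no conservation
```

A fully conserved DNA column gives the maximum of 2 bits, an even split between two bases gives 1 bit, and a column where all four bases are equally common gives 0 bits.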
what does Pfam stand for and what is it
protein families database, collection of known protein domains
what information does the results page on pfam show
where in the sequence the domains start/stop, links to wiki page, E value