Databases 1 Flashcards
what do you use BLAST for
sequence similarity searching
what does BLAST stand for
Basic Local Alignment Search Tool
Name 2 large DNA databases
EMBL and GenBank
Name a large protein database
TrEMBL
what are the properties of automated vs non-automated sequence production
Automated - low sequence quality, high no. of sequences
Manual - high sequence quality, low no. of sequences
what does NCBI stand for
National Center for Biotechnology Information
what are boolean operators in syntax
AND - must match both term 1 and term 2
OR - matches either term 1 or term 2
NOT - must not match the term that follows
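The three operators behave like set operations on the records each term matches. A minimal sketch, assuming a made-up index of record IDs (the terms and IDs are purely illustrative, not taken from any real database):

```python
# Toy index: search term -> set of record IDs containing it (made-up data).
index = {
    "transposon": {1, 2, 3},
    "human": {2, 3, 4},
}

both = index["transposon"] & index["human"]      # AND: records matching both terms
either = index["transposon"] | index["human"]    # OR: records matching either term
excluded = index["transposon"] - index["human"]  # NOT: first term but not the second

print(sorted(both), sorted(either), sorted(excluded))
```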
what do quotation marks do when searching
force the text to be searched as one phrase, e.g. "public health" rather than public AND health
what does an asterisk do when searching *
acts as a wildcard: matches any word starting with the stem - transpos* will find transposon, transpososome etc…
what do fields do [ ]
restrict a search term to one field of the record, e.g. homo sapiens[organism]
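A rough sketch of what phrase and wildcard searching mean, using plain Python string tests. This only imitates the behaviour on a toy word list; it is not how a database search engine is implemented, and the example text and words are assumptions for illustration:

```python
# Wildcard: transpos* matches any word that starts with the stem "transpos".
words = ["transposon", "transposase", "transpososome", "transcript"]
stem = "transpos"
matches = [w for w in words if w.startswith(stem)]

# Phrase vs separate words: the quoted phrase needs the words side by side.
text = "health of the public"
phrase_hit = "public health" in text
words_hit = all(w in text.split() for w in ["public", "health"])

# The pieces combined into one query string, written the way the cards write them.
query = '"public health" AND ' + stem + "* AND homo sapiens[organism]"

print(matches, phrase_hit, words_hit)
print(query)
```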
what would you use if you have a sequence of a gene, but don’t know the name
Use BLAST - similarity searches for DNA or RNA
Can be the full length of the gene or part of a gene
what is the ‘query’ and what is it broken down into
query - unknown piece of sequence
broken down into words - small pieces
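Breaking the query into words can be sketched as a sliding window over the sequence. The function name, the example sequence and the word size of 3 are assumptions chosen for illustration; real BLAST programs use their own default word sizes:

```python
def split_into_words(query, word_size):
    """Return every overlapping word of length word_size in the query."""
    return [query[i:i + word_size] for i in range(len(query) - word_size + 1)]

words = split_into_words("ATGCGTAC", 3)
print(words)  # ['ATG', 'TGC', 'GCG', 'CGT', 'GTA', 'TAC']
```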
what is ‘seeding’
searching all database sequences for identical matches to the query words - only these matches (seeds) are then extended into longer alignments
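A simplified sketch of this step: find identical word matches in one database sequence, then extend a match in both directions while the letters stay identical. The function names and sequences are hypothetical. Real BLAST scores its extensions and tolerates some mismatches, so this toy version only mirrors the "identical matches only" idea on the card:

```python
def find_seeds(query, subject, word_size):
    """Return (query_pos, subject_pos) for every identical word match."""
    seeds = []
    for q in range(len(query) - word_size + 1):
        word = query[q:q + word_size]
        s = subject.find(word)
        while s != -1:
            seeds.append((q, s))
            s = subject.find(word, s + 1)
    return seeds

def extend_seed(query, subject, q, s, word_size):
    """Grow a seed left and right until the first mismatch or sequence end."""
    left = 0
    while (q - left - 1 >= 0 and s - left - 1 >= 0
           and query[q - left - 1] == subject[s - left - 1]):
        left += 1
    right = word_size
    while (q + right < len(query) and s + right < len(subject)
           and query[q + right] == subject[s + right]):
        right += 1
    return query[q - left:q + right]

query = "GGATGCGTACTT"
subject = "CCATGCGTACAA"
seeds = find_seeds(query, subject, 4)
q, s = seeds[0]
print(seeds[0], extend_seed(query, subject, q, s, 4))  # (2, 2) ATGCGTAC
```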
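what are blastn, blastp and blastx
blastn - blast nucleotide (nucleotide query against a nucleotide database)
blastp - blast protein (protein query against a protein database)
blastx - translated nucleotide query against a protein database (tblastn is the reverse: protein query against a translated nucleotide database)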
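The "translated nucleotide" idea can be sketched by translating a DNA sequence in all six reading frames (three forward, three on the reverse complement), which is what a translated search has to consider. This is a minimal illustration using the standard genetic code; the function names and the example sequence are assumptions, and it is not the BLAST implementation:

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna))

def translate(dna, frame):
    """Translate from offset frame (0, 1 or 2), ignoring trailing bases."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)

def six_frames(dna):
    """Return the three forward and three reverse-strand translations."""
    reverse = reverse_complement(dna)
    return [translate(dna, f) for f in range(3)] + [translate(reverse, f) for f in range(3)]

frames = six_frames("ATGGCCTAA")
print(frames)  # ['MA*', 'WP', 'GL', 'LGH', '*A', 'RP']
```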
What does the FASTA data give you - and what do the colours mean
a graphical representation of the results - red marks the closest matches, then purple, then green