Databases 1 Flashcards

1
Q

what do you use BLAST for

A

sequence similarity searching

2
Q

what does BLAST stand for

A

Basic Local Alignment Search Tool

3
Q

Name 2 large DNA databases

A

EMBL and GenBank

4
Q

Name a large protein database

A

TrEMBL

5
Q

what are the properties of automated vs non-automated sequence production

A

Automated - low sequence quality, high no. of sequences

Manual - high sequence quality, low no. of sequences

6
Q

what does NCBI stand for

A

National Center for Biotechnology Information

7
Q

what are boolean operators in syntax

A

AND - must match both term 1 and term 2
OR - matches either term 1 or term 2
NOT - must not match the term that follows
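
The three operators behave like set operations on the records each term returns. A minimal sketch, assuming two made-up sets of record IDs (the names and IDs are hypothetical, not real database output):

```python
# Hypothetical record IDs returned by two search terms (made-up data).
term1_hits = {"rec1", "rec2", "rec3"}
term2_hits = {"rec3", "rec4"}

both = term1_hits & term2_hits        # AND: records matching term 1 and term 2
either = term1_hits | term2_hits      # OR: records matching term 1 or term 2
only_first = term1_hits - term2_hits  # NOT: term 1 records that do not match term 2

print(sorted(both))
print(sorted(either))
print(sorted(only_first))
```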

8
Q

what do quotation marks do when searching

A

force the text to be searched as one phrase, e.g. "public health" rather than public AND health

9
Q

what does an asterisk (*) do when searching

A

acts as a wildcard - matches any word that starts with the letters before it, e.g. transpos* will find transposon, transpososome etc.
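
Wildcard matching can be tried out locally with Python's standard fnmatch module. This is only an illustration of the idea on a made-up word list, not how a database search engine implements it:

```python
from fnmatch import fnmatchcase

# Made-up word list for illustration.
words = ["transposon", "transpososome", "transposase", "transport"]

# The asterisk stands for any run of characters after the stem "transpos".
pattern = "transpos*"
matches = [w for w in words if fnmatchcase(w, pattern)]

print(matches)
```

"transport" is not matched because it does not begin with the full stem "transpos".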

10
Q

what do fields do [ ]

A

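restrict a search term to one field of the record, e.g. Homo sapiens [organism] only matches records whose organism field is Homo sapiens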
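
A field-tagged query is just a string, so it can be assembled in code before being pasted into a search box. A minimal sketch, where field_term is a hypothetical helper and the "title" field tag is an assumption chosen for illustration:

```python
def field_term(text, field):
    """Wrap a search term in quotes and tag it with a field name in [ ]."""
    return f'"{text}"[{field}]'

# Combine two field-restricted phrases with the boolean operator AND.
query = field_term("Homo sapiens", "organism") + " AND " + field_term("public health", "title")

print(query)
```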

11
Q

what would you use if you have a sequence of a gene, but don’t know the name

A

Use BLAST - similarity searches for DNA or RNA

Can be the full length of the gene or part of a gene

12
Q

what is the ‘query’ and what is it broken down into

A

query - unknown piece of sequence

broken down into words - small pieces
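
Breaking a query into overlapping words can be sketched as below. The function name and the word size of 4 are assumptions chosen to keep the example short, not BLAST defaults:

```python
def split_into_words(query, word_size):
    """Return every overlapping word of length word_size in the query."""
    return [query[i:i + word_size] for i in range(len(query) - word_size + 1)]

# Made-up query sequence for illustration.
words = split_into_words("ATGCGTAC", 4)

print(words)
```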

13
Q

what is ‘seeding’

A

search all database sequences for identical matches to the query words - only these identical matches (seeds) are extended into longer alignments
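
The idea of finding identical word matches and then extending them can be sketched as follows. This is a simplified illustration that extends only while characters stay identical, as the card describes. The real BLAST extension step is score-based, and the function names, sequences and word size here are all made up:

```python
def find_seeds(query, subject, word_size):
    """Return (query_pos, subject_pos) pairs where a query word matches exactly."""
    seeds = []
    for q in range(len(query) - word_size + 1):
        word = query[q:q + word_size]
        s = subject.find(word)
        while s != -1:
            seeds.append((q, s))
            s = subject.find(word, s + 1)
    return seeds


def extend_seed(query, subject, q, s, word_size):
    """Extend a seed left and right while the aligned characters are identical."""
    left = 0
    while (q - left - 1 >= 0 and s - left - 1 >= 0
           and query[q - left - 1] == subject[s - left - 1]):
        left += 1
    right = word_size
    while (q + right < len(query) and s + right < len(subject)
           and query[q + right] == subject[s + right]):
        right += 1
    return query[q - left:q + right]


# Made-up sequences for illustration.
query = "GGATGCGTACTT"
subject = "CCATGCGTACAA"

seeds = find_seeds(query, subject, 4)
first_q, first_s = seeds[0]
print(seeds)
print(extend_seed(query, subject, first_q, first_s, 4))
```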

14
Q

what is blastn
blastp
blastx

A

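blastn - nucleotide query against a nucleotide database
blastp - protein query against a protein database
blastx - translated nucleotide query against a protein database (tblastn is the reverse: protein query against a translated nucleotide database)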
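
Choosing a program comes down to the type of the query and the type of the database. A small lookup sketch, where choose_program and the type labels are hypothetical names used only for this example:

```python
# (query type, database type) -> BLAST program
PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # query is translated before searching
    ("protein", "nucleotide"): "tblastn",  # database is translated before searching
}


def choose_program(query_type, database_type):
    """Look up the BLAST program for a query/database type combination."""
    return PROGRAMS[(query_type, database_type)]


print(choose_program("nucleotide", "protein"))
```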

15
Q

What does the FASTA data give you - and what do the colours mean

A

a graphical representation of the results - red marks the closest matches, then purple, then green

16
Q

what does a + symbol in the subject sequence mean

A

that the aligned amino acids are similar but not identical, e.g. same charge (a conservative substitution)

17
Q

How do you know if the sequence is a genuine homologue, or just one by chance

A

look at the expect (E) value - the number of hits you would expect to see by chance with the observed score or higher. A small E value is good, a large one is bad
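
How the E value shrinks as the score rises can be seen from the standard relation between bit score and E value, E = m x n x 2^(-S), where m is the query length, n is the database length and S is the bit score. A minimal sketch, with a hypothetical function name and made-up lengths and scores:

```python
def expect_value(bit_score, query_length, database_length):
    """Expected number of chance hits scoring at least bit_score."""
    return query_length * database_length * 2 ** (-bit_score)


# Made-up values: a 100-residue query against a database of 1,000,000 residues.
for score in (20, 30, 40, 50):
    print(score, expect_value(score, 100, 1_000_000))
```

Each extra bit of score halves the E value, and a larger database raises it for the same score.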

18
Q

What are the numbers at the ends of the sequences

A

amino acid positions