Class 1 Flashcards

1
Q

What is Entrez and what are some example databases under it?

A

Entrez: an integrated database retrieval system that provides access to a diverse set of 39 databases.

ex: PubMed, Nucleotide, Protein, Structure, Complete Genomes, Taxonomy, SRA
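A minimal sketch of how one search term can be directed at several Entrez databases through NCBI's E-utilities ESearch endpoint. It only builds the request URLs and does not send them; the function name `esearch_url`, the search term, and the `retmax` default are illustrative choices, not part of the flashcard.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 5) -> str:
    """Build (but do not send) an ESearch request URL for one Entrez database."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


# The same term can be pointed at different Entrez databases.
for db in ("pubmed", "nucleotide", "protein", "taxonomy", "sra"):
    print(esearch_url(db, "BRCA1"))
```

Only the `db` parameter changes between requests, which is what "integrated retrieval system" means in practice.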

2
Q

What are accessions and what do they tell us?

A

Unique identifiers: each accession is a unique string associated with some data.

Accessions also convey relationships between objects.
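One relationship an accession can encode is versioning. The sketch below assumes the common `accession.version` layout used by GenBank and RefSeq records; the helper names `split_accession` and `same_record` are hypothetical, and the identifiers are used only as examples.

```python
from typing import Optional, Tuple


def split_accession(identifier: str) -> Tuple[str, Optional[int]]:
    """Split 'ACCESSION.VERSION' into its stable part and its version number."""
    base, sep, version = identifier.partition(".")
    return base, (int(version) if sep else None)


def same_record(a: str, b: str) -> bool:
    """Two identifiers refer to the same record if their stable parts match."""
    return split_accession(a)[0] == split_accession(b)[0]


print(split_accession("NM_000546.6"))
print(same_record("NM_000546.5", "NM_000546.6"))
```

The part before the dot stays fixed while the version number increases when the underlying sequence is updated.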

3
Q

Why are bioinformatic analyses build specific?

A

Every time a new reference build is released, coordinates across the genome change. Because sequences are aligned against the reference of a particular build, the resulting analysis is specific to that build.

Gene annotations are also tied to specific genome builds.
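A small sketch of why coordinates from different builds must not be mixed. The build names and all coordinates are made up for illustration; the `Interval` class and the `overlaps` function are hypothetical helpers, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    build: str   # reference build the coordinates belong to
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """Compare two intervals, refusing to mix coordinates from different builds."""
    if a.build != b.build:
        raise ValueError(f"cannot compare {a.build} with {b.build} coordinates")
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


# Hypothetical coordinates: the same gene sits at different positions per build.
gene_old = Interval("buildA", "chr1", 1000, 2000)
gene_new = Interval("buildB", "chr1", 1500, 2500)
read = Interval("buildA", "chr1", 1100, 1150)  # aligned against buildA

print(overlaps(read, gene_old))  # same build, so the comparison is meaningful
```

Calling `overlaps(read, gene_new)` raises an error instead of silently returning a wrong answer, which is the failure mode that build-specific analysis is meant to avoid.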

4
Q

What are hardlinks? Give an example.

A

Direct connections between entries in different databases

ex: a link to a paper describing a nucleotide sequence

5
Q

What are neighbours in Entrez and what is something that one must be careful of?

A

Neighbours are indirect links/connections between entries in different databases that are based on similarity, not identity.

Note that the definition of similarity for each database is subjective.

6
Q

Give examples of types of neighbours.

A
  1. 3D structures/similar structures (proteins with the same secondary structures)
  2. Similar papers in PubMed (based on the number of overlapping words)
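To make the word-overlap idea concrete, here is a simplified sketch that scores two texts by the fraction of words they share (Jaccard similarity). This is an assumption chosen for illustration, not the algorithm PubMed uses to compute similar articles, and the titles are invented.

```python
def word_set(text: str) -> set:
    """Lower-case a text and split it into a set of distinct words."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Share of distinct words that two texts have in common (0.0 to 1.0)."""
    wa, wb = word_set(a), word_set(b)
    if not wa and not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


# Invented titles, used only to exercise the function.
title_1 = "protein structure prediction methods"
title_2 = "methods for protein structure alignment"
print(jaccard(title_1, title_2))
```

Where the threshold for "similar" is placed is a judgment call, which is why neighbour definitions differ between databases.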
7
Q

What is something that you should be careful of when finding similar sequences through BLAST?

A

The establishment of neighbour connections is qualitative and subject to debate. You have to decide on appropriate e-value cut-offs to determine what is included as a neighbour and what is not. Connections are curated by individuals or based on principles.

It is assumed that sequences with high sequence similarity often have related biological functions.
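The effect of the cut-off choice can be sketched with a few made-up hits. The sequence names and e-values below are hypothetical and do not come from a real BLAST run; `neighbours` is an illustrative helper.

```python
# Hypothetical (name, e-value) pairs; smaller e-values mean stronger matches.
hits = [("seqA", 1e-50), ("seqB", 1e-8), ("seqC", 0.003), ("seqD", 2.5)]


def neighbours(hits, max_evalue):
    """Keep only the hits whose e-value is at or below the chosen cut-off."""
    return [name for name, evalue in hits if evalue <= max_evalue]


print(neighbours(hits, 1e-5))   # strict cut-off
print(neighbours(hits, 0.01))   # looser cut-off admits one more hit
```

The same search yields a different neighbour set depending on the threshold, so the cut-off should be reported alongside the results.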
