BIOINFORMATICS Flashcards

Question 1

Q

Concerned with knowledge and the flow of knowledge in biological systems using computational methods in genetics and genomics

Answer

A

BIOINFORMATICS

Question 2

Q

Study of genes

Question 3

Q

Study of proteins

Answer

A

Proteomics

Question 4

Q

A collection of related information that is:
○ Structured
○ Searchable → index
○ Updated periodically
○ Cross-referenced → hyperlinks

Answer

A

DATABASES

Question 5

Q

○ These are programs that keep the database working behind the scenes
○ Computerized data-keeping system

Answer

A

Tier 1: Database management system

Question 6

Q

○ Facilitates communications between applications or databases
○ Extracts information from either local or remote databases

Answer

A

Tier 2: Middleware layer

Question 7

Q

○ Enables users to access the database from anywhere without the need for downloading or installing any code
○ The tier that users see: the graphical user interface.

Answer

A

Tier 3: Web interface
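
The three tiers from Questions 5 to 7 can be pictured with a small sketch. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the database management system, and the table name `entries`, the accession numbers `DEMO0001` and `DEMO0002`, and the functions `make_dbms`, `fetch_entry`, and `render` are all hypothetical names invented for this example, not part of any real bioinformatics resource.

```python
import sqlite3


def make_dbms():
    """Tier 1: the database management system (here, in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (accession TEXT PRIMARY KEY, description TEXT)"
    )
    # Hypothetical records, invented purely for illustration.
    conn.executemany(
        "INSERT INTO entries VALUES (?, ?)",
        [
            ("DEMO0001", "hypothetical example protein"),
            ("DEMO0002", "hypothetical example gene"),
        ],
    )
    return conn


def fetch_entry(conn, accession):
    """Tier 2: middleware that extracts a record on behalf of an application."""
    row = conn.execute(
        "SELECT accession, description FROM entries WHERE accession = ?",
        (accession,),
    ).fetchone()
    if row is None:
        return None
    return {"accession": row[0], "description": row[1]}


def render(entry):
    """Tier 3: the interface layer, reduced here to a line of text."""
    if entry is None:
        return "No entry found."
    return f"{entry['accession']}: {entry['description']}"


if __name__ == "__main__":
    db = make_dbms()
    print(render(fetch_entry(db, "DEMO0001")))
    print(render(fetch_entry(db, "MISSING")))
```

In a real resource the third tier would be a web page and not a text line, but the division of work is the same: the user never talks to the database management system directly.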

Question 8

Q

CLASSIFICATION OF DATABASES
1. Scope of data coverage
Name the 2 categories.

Answer

A

● Comprehensive
● Specialized

Question 9

Q

CLASSIFICATION OF DATABASES
2. Methods of biocuration
Name the 2 categories.

Answer

A

● Expert-curated (RefSeq)
● Community-curated (Gene Wiki)

Question 10

Q

CLASSIFICATION OF DATABASES
3. Level of biocuration
Name the 3 categories.

Answer

A

● Primary
● Secondary
● Composite

Question 11

Q

CLASSIFICATION OF DATABASES
4. Type of data managed
Name the 3 categories.

Answer

A

● DNA/RNA/Protein
● Disease
● Nomenclature/Literature

Question 12

Q

● Information on sequence or structure alone
● Experimentally derived data submitted directly
● Archival in nature

Answer

A

PRIMARY DATABASE

Question 13

Q

● Combines a variety of primary databases, allowing an ‘all-in-one’ search across multiple resources

Answer

A

COMPOSITE DATABASE

Question 14

Q

● Derived from primary databases
● Based on analysis of the data from the primary database

Answer

A

SECONDARY DATABASE

Question 15

Q

“Google” of bioinformatics

Answer

A

COMPOSITE DATABASE

Question 16

Q

● The most commonly used one is PubMed
● Contains entries for >11 million abstracts of scientific publications

Answer

A

LITERATURE DATABASE

Question 17

Q

● GenBank, EMBL-Bank, and DDBJ exchange data to ensure comprehensive worldwide coverage; accession numbers are managed consistently between the three centers

Answer

A

NUCLEIC ACID DATABASE

Question 18

Q

● Contains publicly available DNA sequences from >100,000 organisms
● Also contains derived protein sequences, and annotations describing biological, structural, and other relevant features

Question 19

Q

● Contains nucleotide sequences from all public sources.
● Accessible through Sequence Retrieval System (SRS), which allows keyword searching.
● Sequence similarity search tools: BLAST, Blitz, FASTA

Question 20

Q

● Contains curated data on everything that has to do with proteins, motifs, and interactions with other substances.

Answer

A

PROTEIN DATABASE

Question 21

Q

● >18,000 macromolecular structures of proteins, peptides, viruses, protein/NA complexes, nucleic acids, and carbohydrates.
● Determined by X-ray diffraction and NMR.

Answer

A

PROTEIN DATA BANK

Question 22

Q

○ Curated database focusing on a high level of annotation (sequence, function, structure, post-translational modifications, variants) of proteins.
○ Non-redundant and reviewed.

Answer

A

● SWISS-PROT

Question 23

Q

○ Computer-annotated supplement to SWISS-PROT.
○ Redundant and unreviewed.

Question 24

Q

● Secondary database on protein families, domains, and functional sites that contains manually curated information.
● Provides tools for analysis of protein sequences and motifs.

Question 25

Q

● Protein family fingerprints (groups/motifs).
● Detects distant relatives of large and highly divergent protein superfamilies by looking at conserved regions in alignments.

Question 26

Q

● Protein families and domains represented as multiple sequence alignments.

Question 27

Q

PFAM
___: Automatically generated, low-quality (LQ) entries

Question 28

Q

PFAM
___: Manually curated, high-quality (HQ) entries

Question 29

Q

● Collection of ungapped multiple alignments of segments of related protein sequences (blocks)
● For: protein family classification, protein structure prediction

Question 30

Q

● Contain data regarding structures of nucleic acids and proteins.

Answer

A

STRUCTURAL DATABASES

Question 31

Q

Easy-to-use website to align FASTA files.

Question 32

Q

Translates DNA or RNA sequences into their protein sequences.
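
As a rough illustration of what such a translation tool does, the sketch below reads a nucleotide sequence three bases at a time and looks up each codon in the standard genetic code. It is a minimal example under stated assumptions, not the method of any particular tool: the function name `translate` is hypothetical, only the first reading frame is translated, translation stops at the first stop codon, and unrecognized codons are reported as `X`.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with amino acids listed in TCAG order for the
# first, second, and third codon positions ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence):
    """Translate a DNA or RNA sequence in reading frame 1 (assumption)."""
    # Treat RNA as DNA by replacing uracil with thymine.
    dna = sequence.upper().replace("U", "T")
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[start:start + 3], "X")
        if amino_acid == "*":
            break  # stop codon ends the protein
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))  # DNA input
    print(translate("AUGUUUUGA"))  # RNA input
```

Real tools typically report all six reading frames (three per strand); this sketch leaves that out to keep the core lookup visible.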

Question 33

Q

Provides a prediction of the protein structure.