BIOINFORMATICS Flashcards

1
Q

Field concerned with knowledge and the flow of knowledge in biological systems, using computational methods in genetics and genomics

A

BIOINFORMATICS

2
Q

Study of genomes (an organism's complete set of genes and other DNA)

A

Genomics

3
Q

Large-scale study of proteins (the proteome)

A

Proteomics

4
Q

A collection of related information that is:
○ Structured
○ Searchable → index
○ Updated periodically
○ Cross-referenced → hyperlinks

A

DATABASES

5
Q

○ These are programs that keep the database working behind the scenes
○ Computerized data-keeping system

A

Tier 1: Database management system

6
Q

○ Facilitates communications between applications or databases
○ Extracts information from either local or remote databases

A

Tier 2: Middleware layer

7
Q

○ Enables users to access the database from anywhere without the need for downloading or installing any code
○ The part that users see: the graphical user interface.

A

Tier 3: Web interface
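
The three tiers described in cards 5 to 7 can be illustrated with a minimal sketch. It assumes an in-memory SQLite database as a stand-in for Tier 1; the table layout, the accession numbers (`DEMO0001`, `DEMO0002`), and the function names are hypothetical and chosen only for illustration.

```python
import sqlite3

# Tier 1: database management system (here, an in-memory SQLite database).
# The schema and the DEMO accessions are made up for this example.
def make_dbms():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences (accession TEXT PRIMARY KEY, organism TEXT, seq TEXT)"
    )
    conn.executemany(
        "INSERT INTO sequences VALUES (?, ?, ?)",
        [
            ("DEMO0001", "Homo sapiens", "ATGGCC"),
            ("DEMO0002", "Mus musculus", "ATGTTT"),
        ],
    )
    return conn

# Tier 2: middleware that extracts information from the database
# on behalf of an application.
def fetch_by_organism(conn, organism):
    rows = conn.execute(
        "SELECT accession, seq FROM sequences WHERE organism = ? ORDER BY accession",
        (organism,),
    )
    return [{"accession": acc, "seq": seq} for acc, seq in rows]

# Tier 3: interface layer. A real system would render a web page;
# this sketch just formats plain text.
def render(records):
    return "\n".join(f"{r['accession']}\t{r['seq']}" for r in records)

conn = make_dbms()
page = render(fetch_by_organism(conn, "Homo sapiens"))
print(page)
```

Keeping the query logic in the middle function means the interface never touches SQL directly, which is the separation the three-tier model describes.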

8
Q

CLASSIFICATION OF DATABASES
1. Scope of data coverage
Name the 2 categories.

A

● Comprehensive
● Specialized

9
Q

CLASSIFICATION OF DATABASES
2. Methods of biocuration
Name the 2 categories.

A

● Expert-curated (RefSeq)
● Community-curated (Gene Wiki)

10
Q

CLASSIFICATION OF DATABASES
3. Level of biocuration
Name the 3 categories.

A

● Primary
● Secondary
● Composite

11
Q

CLASSIFICATION OF DATABASES
4. Type of data managed
Name the 3 categories.

A

● DNA/RNA/Protein
● Disease
● Nomenclature/Literature

12
Q

● Information on sequence or structure alone
● Experimentally derived data submitted directly
● Archival in nature

A

PRIMARY DATABASE

13
Q

● Combines a variety of primary databases, allowing an ‘all-in-one’ search across multiple resources

A

COMPOSITE DATABASE

14
Q

● Derived from primary databases
● Based on analysis of the data from the primary database

A

SECONDARY DATABASE

15
Q

“Google” of bioinformatics

A

COMPOSITE DATABASE

16
Q

● The most widely used is PubMed
● Contains entries for >11 million abstracts of scientific publications

A

LITERATURE DATABASE

17
Q

● GenBank, EMBL-Bank, and DDBJ exchange data to ensure comprehensive worldwide coverage; accession numbers are managed consistently among the three centers

A

NUCLEIC ACID DATABASE

18
Q

● Contains publicly available DNA sequences from >100,000 organisms
● Also contains derived protein sequences, and annotations describing biological, structural, and other relevant features

A

GENBANK

19
Q

● Contains nucleotide sequences from all public sources.
● Accessible through Sequence Retrieval System (SRS), which allows keyword searching.
● Sequence similarity search tools: BLAST, Blitz, FASTA

A

EMBL

20
Q

● Contains curated data on everything that has to do with proteins, motifs, and interactions with other substances.

A

PROTEIN DATABASE

21
Q

● >18,000 macromolecular structures of proteins, peptides, viruses, protein/nucleic acid complexes, nucleic acids, and carbohydrates.
● Determined by X-ray diffraction and NMR.

A

PROTEIN DATA BANK

22
Q

○ Curated database focusing on a high level of annotation (sequence, function, structure, post-translational modifications, variants) of proteins.
○ Non-redundant and reviewed.

A

SWISS-PROT

23
Q

○ Computer-annotated supplement to SWISS-PROT.
○ Redundant and unreviewed.

A

TrEMBL

24
Q

● Secondary database of protein families, domains, and functional sites that contains manually curated information.
● Provides tools for analysis of protein sequences and motifs.

A

PROSITE
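
To make the idea of a sequence motif concrete, here is a minimal sketch that converts a PROSITE-style pattern into a Python regular expression and scans a sequence with it. The function names and the test sequence are hypothetical; the pattern `N-{P}-[ST]-{P}` is the well-known N-glycosylation site motif. The converter handles only the basic syntax (`x`, `[...]`, `{...}`, repeat counts, and `<`/`>` anchors) and is not PROSITE's own scanning software.

```python
import re

def prosite_to_regex(pattern):
    """Convert a basic PROSITE-style pattern into a regular expression string."""
    pattern = pattern.rstrip(".")
    parts = []
    for element in pattern.split("-"):
        # {P} means "any residue except P"  ->  [^P]
        element = element.replace("{", "[^").replace("}", "]")
        # x(2,4) means "repeat 2 to 4 times" ->  x{2,4}
        element = element.replace("(", "{").replace(")", "}")
        # lowercase x means "any residue"    ->  .
        element = element.replace("x", ".")
        # < and > anchor the pattern to the N- or C-terminus
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)

def find_sites(pattern, sequence):
    """Return (1-based position, matched text) pairs, including overlapping hits."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

# Made-up protein fragment used only for illustration.
hits = find_sites("N-{P}-[ST]-{P}", "MKNGSAANPTQNVTW")
print(hits)
```

The lookahead wrapper `(?=(...))` is a design choice so that overlapping occurrences of a motif are all reported rather than only the first of each overlapping run.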

25
Q

● Protein family fingerprints (groups/motifs).
● Detects distant relatives of large and highly divergent protein superfamilies by looking at conserved regions in alignments.

A

PRINTS

26
Q

● Protein families and domains represented as multiple sequence alignments.

A

PFAM

27
Q

PFAM
___ : Automatically generated, lower-quality (LQ) entries

A

Pfam-B

28
Q

PFAM
___ : Manually curated, high-quality (HQ) entries

A

Pfam-A

29
Q

● Collection of ungapped multiple alignments of segments of related protein sequences (blocks)
● For: protein family classification, protein structure prediction

A

BLOCKS

30
Q

● Contain data regarding structures of nucleic acids and proteins.

A

STRUCTURAL DATABASES

31
Q

Easy-to-use website for aligning sequences supplied in FASTA format.

A

MultAlin
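
For readers unfamiliar with the FASTA format that alignment tools accept, the sketch below parses FASTA text into a dictionary. Each record starts with a `>` header line followed by one or more sequence lines. The function name `parse_fasta` and the two example records are hypothetical, and this is only an illustration of the file format, not of how any alignment website works internally.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()  # header text without the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(chunks) for name, chunks in records.items()}

# Made-up records used only for illustration.
example = """>seq1 demo
ATGGCC
TAA
>seq2 demo
ATGTTT
"""
parsed = parse_fasta(example)
print(parsed)
```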

32
Q

Translates DNA sequences or RNA sequences into their protein sequences.

A

EXPASY
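
The translation step itself can be sketched in a few lines. This example assumes the standard genetic code, reads one frame at a time, marks stop codons with `*`, and marks any codon it cannot resolve with `X`. The function name `translate` and its `frame` parameter are hypothetical; this illustrates the underlying idea and is not the Expasy tool's own code.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(seq, frame=0):
    """Translate a DNA or RNA sequence into protein, starting at offset `frame`."""
    seq = seq.upper().replace("U", "T")  # treat RNA as DNA
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)

print(translate("ATGGCCTAA"))           # DNA input
print(translate("AUGGCCUAA"))           # RNA input gives the same protein
print(translate("ATGGCCTAA", frame=1))  # shifted reading frame
```

Changing `frame` to 1 or 2 shows how the same nucleotides yield a different protein in each reading frame, which is why translation tools typically report more than one frame.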

33
Q

Provides a prediction of the protein structure.

A

I-TASSER