Bioinformatics Flashcards

1
Q

What is systems biology?

A

The study of the interactions between components of biological systems and the function and behaviour they provide.

2
Q

What is the cycle of biological science discovery?

A
New hypothesis
Experiment 
New data
Model construction
Model analysis
Biological insight
3
Q

What is sequence identity?

A

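The extent to which two aligned sequences have exactly the same residues at the same positions, usually given as a percentage. A perfectly matched sequence has 100% identity.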
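
Identity is usually reported as a percentage of aligned positions. The sketch below shows one way to compute it for two sequences that are already aligned. The function name `percent_identity` and the choice of denominator (every column that is not a gap in both sequences) are assumptions made for illustration; tools differ in how they count gap columns.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences share a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences contain a gap character.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    # A column counts as identical only if the residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

For example, `percent_identity("ACGT", "ACGA")` gives 75.0, because three of the four columns match.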

4
Q

What is sequence homology?

A

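Similarity between sequences that results from shared evolutionary ancestry. It is usually inferred from a partially matched sequence.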

5
Q

What are gene predictions built around?

A

Pattern recognition

6
Q

What search does DNA fingerprinting use?

A

A homology search using microsatellites (small arrays of tandem repeats)

7
Q

What is the evolutionary theory based on?

A

The similarities in biological sequences

8
Q

What is sequence annotation?

A

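The process of attaching biological information (such as gene locations and functions) to a sequence, typically by identifying similarities between different biological sequences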

9
Q

How does a sequence homology search work?

A

It compares an unknown sequence against a database of known sequences

10
Q

Which databases contain DNA sequences?

A

ENA/EMBL
GenBank
DDBJ

11
Q

Which databases contain protein sequences?

A

UniProtKB

RefSeqP

12
Q

What is the algorithm used to compare an unknown sequence to known sequences?

A

A pairwise sequence alignment

13
Q

What are the two types of alignment and what do they mean?

A

Global: aligns the sequences along their whole length

Local: aligns only the most similar regions, such as domains and subsequences, so the unrelated parts are left out
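
The difference between the two types can be seen in how the alignment score is computed. The sketch below is a minimal dynamic-programming scorer in the style of Needleman-Wunsch (global) and Smith-Waterman (local). The function name `alignment_score` and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration, not the parameters of any particular tool.

```python
def alignment_score(a: str, b: str, local: bool = False,
                    match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Best alignment score for a and b: global by default, local if requested."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for gaps at the ends of the sequences.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                # Local alignment may restart anywhere, so scores never drop below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    # Global: score of aligning everything. Local: best-scoring region found.
    return best if local else score[-1][-1]
```

With `"TTTTACGTTTTT"` and `"GGGACGGGG"`, the local score is 3 (the shared `ACG`), while the global score is negative because the unrelated flanking regions must also be aligned.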

14
Q

How are alignments produced?

A

Each aligned position is scored as a match or a mismatch, and the scores are added up. If the total score reaches a threshold, the alignment is reported.

15
Q

What are expressed sequence tags (ESTs)?

A

Short cDNA sequences produced from mRNA, so they contain only exons, not introns.

Only a local alignment can be used with these sequences.

16
Q

What is the standard format that programs require?

A

FASTA format

17
Q

What is FASTA format?

A

> description of the sequence

The sequence on the subsequent lines
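
As an illustration of this layout, the sketch below reads FASTA text into a dictionary that maps each description line to its sequence. The function name `parse_fasta` is hypothetical, and the sketch assumes every description line is unique.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each '>' description line to the sequence on the lines that follow it."""
    records: dict[str, list[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; drop the '>' and keep the description.
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # Sequence lines are joined, since a sequence may span several lines.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}
```

For example, the text `>seq1 example`, `ACGT`, `TTGA` on three lines yields `{"seq1 example": "ACGTTTGA"}`.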

18
Q

What is BLAST?

A

Basic Local Alignment Search Tool

19
Q

What is blastn?

A

Searches a nucleotide database using a nucleotide query

20
Q

What is blastp?

A

Searches a protein database using a protein query

21
Q

What is blastx?

A

Searches a protein database using a translated nucleotide query

22
Q

What is tblastn?

A

Searches a translated nucleotide database using a protein query

23
Q

What is tblastx?

A

Searches a translated nucleotide database using a translated nucleotide query

24
Q

What are the 5 BLAST outputs?

A
Score
Identities
Positives
Gaps
E-Value
25
Q

What does the score show in a BLAST output?

A

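The overall quality of the alignment: the scores for matches minus the penalties for mismatches and gaps. A higher score means a better alignment.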

26
Q

What does the identity show in a BLAST output?

A

The number of identical residues

27
Q

What does the positives show in a BLAST output?

A

The number of similar residues

28
Q

What does the gaps show in a BLAST output?

A

The number of gaps introduced to give the best alignment

29
Q

What does the E-Value show in a BLAST output?

A

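The reliability of the alignment: the number of alignments with a score at least this good that would be expected by chance in a database of that size. The lower the value, the more significant the match.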
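
A commonly quoted form of the calculation is E = m × n × 2^(-S'), where S' is the bit score, m is the query length and n is the database length. The sketch below applies that formula directly. The function name `expect_value` is hypothetical, and real BLAST reports use adjusted (effective) lengths, so the numbers here are only indicative.

```python
def expect_value(bit_score: float, query_length: int, database_length: int) -> float:
    """Expected number of chance alignments scoring at least bit_score."""
    # The search space is the product of the query and database lengths.
    search_space = query_length * database_length
    # Each additional bit of score halves the number of chance hits expected.
    return search_space * 2.0 ** (-bit_score)
```

The value falls as the score rises and grows with the size of the database, which is why the same alignment can be significant in a small database and not in a large one.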

30
Q

What is a good E-Value in a BLAST output?

A

A value less than 1e-3 (0.001)

31
Q

Define similar residues

A

Residues that have the same chemical and physical properties

32
Q

Name 6 different properties of amino acids

A
Hydrophobic
Aliphatic 
Aromatic
Small
Charged
Polar
33
Q

What is a protein domain?

A

A part of the protein structure that can evolve, function and exist independently of the rest of the protein. Domains are between 25 and 500 residues long and appear in evolutionarily related proteins.

34
Q

What are some example protein domain functions?

A
Ligand binding
Spanning the plasma membrane
Containing the catalytic site
DNA-binding
Surface to bind to other proteins
35
Q

Give an example of a domain database

A

CDD

InterPro

36
Q

What is a multiple sequence alignment?

A

An alignment of three or more sequences, so that equivalent residues line up in the same columns

37
Q

What is a multiple sequence alignment tool?

A

Clustal

38
Q

What 2 algorithms are used in a multiple sequence alignment?

A

Position specific scoring matrices (PSSM)

Hidden Markov Model (HMM)

39
Q

What is found in a Clustal output?

A
* Entirely conserved column
: Roughly the same size and hydrophobicity
. Conserved size or hydrophobicity
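
To illustrate the first of these symbols, the sketch below marks the entirely conserved columns of an alignment with a star. The function name `conserved_columns` is hypothetical, and the sketch deliberately leaves out the colon and full-stop symbols, which depend on Clustal's own residue groupings.

```python
def conserved_columns(aligned_sequences: list[str]) -> str:
    """Return a line with '*' under each entirely conserved column, ' ' elsewhere."""
    if not aligned_sequences:
        return ""
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")
    marks = []
    # zip(*...) walks the alignment column by column.
    for column in zip(*aligned_sequences):
        first = column[0]
        # A column is conserved if every residue is identical and none is a gap.
        conserved = first != "-" and all(residue == first for residue in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)
```

For the aligned sequences `MKV-L`, `MRV-L` and `MKVAL`, the result is `* * *`: the first, third and fifth columns are fully conserved.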
40
Q

What does an output from Clustal of a good multiple sequence alignment contain?

A

10-30 residues
1-3 stars (*)
5-7 colons (:)
A few full stops (.)

41
Q

Why are multiple sequence alignments useful?

A

To show sequence conservation, particularly domains
Identify a particular conserved residue
Determine secondary and tertiary structures
To build phylogenetic trees to show evolutionary origins

42
Q

What are phylogenetic trees used for?

A

To construct an evolutionary relationship between species or sequences

43
Q

What is a rooted phylogenetic tree?

A

Each node represents the most recent common ancestor of the branches that descend from it. The length of the line corresponds to time.

44
Q

What is an unrooted phylogenetic tree?

A

It shows how closely the sequences are related without making assumptions about ancestry. If an ancestor is identified, the tree can be converted to a rooted tree.

45
Q

How are phylogenetic trees rooted?

A

Using an outgroup that is closely related to the groups being studied but less closely related to them than they are to each other.
The trees require related sequences or multiple sequence alignments.

46
Q

What 6 things can you find using bioinformatics?

A
Gene prediction
Sequence analysis
Protein structure prediction
Epidemiology
Microarray data analysis
Metabolic pathway modelling