CMB2000/L18 Bioinformatics I Flashcards

1
Q

Explain sequence alignment.

A

Same set of sequences with zero or more gaps (-) inserted into them such that:
All sequences have the same length
No alignment position where every sequence contains a gap
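The two conditions above can be checked mechanically. This is a minimal sketch, assuming the alignment is given as a list of equal-role strings that use "-" for gaps; the function name `is_valid_alignment` is hypothetical.

```python
def is_valid_alignment(rows, gap="-"):
    """Return True if the rows satisfy both conditions on the card.

    Assumption: each row is one gapped sequence, given as a plain string.
    """
    if not rows:
        return False
    # Condition 1: all gapped sequences have the same length.
    if len({len(row) for row in rows}) != 1:
        return False
    # Condition 2: no column is made up entirely of gaps.
    return all(any(ch != gap for ch in column) for column in zip(*rows))


if __name__ == "__main__":
    print(is_valid_alignment(["AC-GT", "A-TGT"]))  # both conditions hold
    print(is_valid_alignment(["A-C", "A-G"]))      # middle column is all gaps
```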

2
Q

Explain pairwise alignment.

A

The optimal alignment of a pair of biological sequences (two sequences only)

3
Q

Define global alignment and give an example of a software using this.

A

Aligns whole sequences end-to-end
E.g., Needle (EMBOSS; implements the Needleman-Wunsch algorithm)
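End-to-end alignment is classically computed by dynamic programming (Needleman-Wunsch). The sketch below returns only the optimal global score, not the alignment itself. The scoring values (match +1, mismatch -1, linear gap -1) and the function name are assumptions for illustration and are not Needle's defaults.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal end-to-end alignment score of a and b (Needleman-Wunsch).

    Assumption: simple linear gap penalty and illustrative scores.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs gaps,
    # which is what forces the alignment to cover both sequences fully.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal,                 # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,    # gap in b
                              score[i][j - 1] + gap)    # gap in a
    # The global score is read from the bottom-right cell.
    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```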

4
Q

Define local alignment and give an example of a software using this.

A

Focuses on the best-matching part of the sequences
E.g., Water (EMBOSS; implements the Smith-Waterman algorithm)
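Local alignment is classically computed with the Smith-Waterman recurrence, which differs from the global case in that cell scores never drop below zero and the answer is the best cell anywhere in the table. This is a minimal sketch of the score only; the scoring values (match +2, mismatch -1, gap -1) and the function name are assumptions, not Water's defaults.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best score of any pair of matching sub-regions (Smith-Waterman).

    Assumption: simple linear gap penalty and illustrative scores.
    """
    best = 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment restart anywhere,
            # so poorly matching flanks are simply ignored.
            cell = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


if __name__ == "__main__":
    # Only the shared "ACG" core contributes to the score.
    print(local_alignment_score("TTTACGTTT", "GGGACGGGG"))
```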

5
Q

What is the formula to calculate the number of comparisons in multiple alignment?

A

n(n-1)/2 pairwise comparisons, where n is the number of sequences
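As a quick check of the formula, the sketch below compares it against an explicit enumeration of unordered sequence pairs. The function name `pairwise_comparisons` is hypothetical.

```python
from itertools import combinations


def pairwise_comparisons(n):
    """Number of unordered pairs among n sequences: n(n-1)/2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    for n in (2, 4, 10, 100):
        # Enumerate every unordered pair and confirm the count matches.
        enumerated = sum(1 for _ in combinations(range(n), 2))
        print(n, pairwise_comparisons(n), enumerated)
```

The count grows quadratically, so doubling the number of sequences roughly quadruples the number of pairwise comparisons.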

6
Q

Give an example of software that uses multiple sequence alignment.

A

Jalview

7
Q

What are the 2 main classes of multiple sequence alignment?

A

Progressive
Iterative

8
Q

Why is multiple sequence alignment described as a heuristic approach?

A

It relies on experimentation, evaluation and trial and error to find a good alignment, rather than guaranteeing the optimal one

9
Q

Give 4 multiple sequence alignment algorithms.

A

Clustal (1992-2011)
T-Coffee (2000)
MAFFT (2002)
MUSCLE (2004)

10
Q

What are the 3 steps of the Clustal algorithm?

A

Compare sequences to obtain similarity matrix
Make a guide tree relating all sequences
Perform progressive alignments, adding new sequences according to guide tree

11
Q

Explain how a similarity matrix is obtained in Clustal MSA. (3)

A

Long vector alignment
Clustering using standard algorithms
Genetic distances between sequence pairs are computed: number of mismatched positions / total number of matched positions
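The distance calculation in the last point can be sketched as follows. The code assumes that the two rows come from an existing pairwise alignment and that "matched positions" means columns where both sequences have a residue (no gap); that reading, and the function name, are assumptions rather than Clustal's exact implementation.

```python
def pairwise_distance(row_a, row_b, gap="-"):
    """Mismatched positions divided by compared (non-gap) positions.

    Assumption: row_a and row_b are the two rows of a pairwise alignment.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    # Keep only columns where both sequences have a residue.
    compared = [(x, y) for x, y in zip(row_a, row_b) if x != gap and y != gap]
    if not compared:
        raise ValueError("no positions to compare")
    mismatches = sum(1 for x, y in compared if x != y)
    return mismatches / len(compared)


if __name__ == "__main__":
    print(pairwise_distance("ACGT", "ACGA"))  # 1 mismatch over 4 positions
```

Computing this value for every sequence pair fills the matrix that the guide tree is built from.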

12
Q

Explain how a guide tree is made in Clustal MSA. (3)

A

Genetic distances used to form phylogenetic tree
Used to control order of adding sequences to multiple alignment
Relative contributions to the alignment are weighted according to evolutionary position in the tree
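To illustrate how distances become a tree, here is a minimal sketch that uses simple average-linkage (UPGMA-style) clustering. This is a swapped-in textbook method chosen for brevity and is not claimed to be the exact tree-building method of any particular Clustal version. The function name and the nested-tuple tree format are assumptions, and sequence weighting is not shown.

```python
def guide_tree(names, dist):
    """Build a nested-tuple tree by repeatedly joining the closest clusters.

    Assumption: dist is a symmetric matrix, dist[i][j] for names[i], names[j].
    """
    # Each cluster is (subtree, indices of the sequences it contains).
    clusters = [(name, [i]) for i, name in enumerate(names)]

    def average_distance(c1, c2):
        # Mean of all original distances between members of the two clusters.
        total = sum(dist[i][j] for i in c1[1] for j in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: average_distance(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0]


if __name__ == "__main__":
    names = ["A", "B", "C", "D"]
    dist = [[0.0, 0.1, 0.8, 0.8],
            [0.1, 0.0, 0.8, 0.8],
            [0.8, 0.8, 0.0, 0.2],
            [0.8, 0.8, 0.2, 0.0]]
    print(guide_tree(names, dist))
```

The most similar sequences end up deepest in the tree, which is what lets the tree control the order in which sequences join the multiple alignment.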

13
Q

Explain how progressive alignments are formed in Clustal MSA. (3)

A

Sequences are aligned progressively
Most closely related pairs are aligned first
Next most closely related sequences are added by aligning them with the existing alignment
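The order of these steps follows from the guide tree. The sketch below does not perform any alignment; it only lists which groups would be aligned with each other, and in what order, by walking a tree bottom-up. The nested-tuple tree format (leaves are sequence names, internal nodes are 2-tuples) and the function name are assumptions for illustration.

```python
def alignment_order(tree):
    """List the (left group, right group) merges in the order they happen.

    Assumption: tree is a nested 2-tuple whose leaves are sequence names.
    """
    steps = []

    def visit(node):
        if isinstance(node, str):
            return [node]            # a single sequence needs no alignment yet
        left, right = node
        left_group = visit(left)     # finish everything below this node first
        right_group = visit(right)
        steps.append((left_group, right_group))
        return left_group + right_group

    visit(tree)
    return steps


if __name__ == "__main__":
    for left, right in alignment_order((("A", "B"), ("C", "D"))):
        print("align", left, "with", right)
```

Each group is fully aligned before it is merged with another, so closely related pairs at the tips of the tree are handled before the more distant joins near the root.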

14
Q

What does MAFFT stand for?

A

Multiple Alignment using Fast Fourier Transform

15
Q

Explain MAFFT. (3)

A

Flexible alignment method that constructs a progressive alignment, then improves it iteratively
(Hybrid between the two main approaches)
Uses a substitution matrix when adding new (protein) sequences to the alignment

16
Q

Give 3 factors to consider when choosing which algorithm to use for MSA.

A

Sequence type (DNA or protein)
Number of sequences to align
Accuracy vs. speed