Lecture 8 Multiple Alignment Flashcards

1
Q

Homology

A

similarity that is the result of inheritance from a common ancestor

2
Q

Orthologs

A

genes in different species that evolved from a common ancestral gene by speciation. Normally, orthologs retain the same function in the course of evolution.

3
Q

Paralogs

A

genes related by duplication within a genome. Paralogs often evolve new functions, even if these are related to the original one.

4
Q

An Alignment

A

a hypothesis of positional homology between bases or amino acids.

5
Q

Homology versus Similarity

A
  • When two sequences are descended from a common evolutionary origin, they are homologous. Homology is like pregnancy: it is all-or-nothing, so two sequences are either homologous or they are not.
  • Sequence similarity is the percentage of aligned residues that are similar in physicochemical properties (size, charge, and hydrophobicity).
  • Sequence similarity can be quantified using percentages (40% similarity); homology is a qualitative statement (homologous or nonhomologous).
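The difference between counting identical residues and counting similar ones can be sketched in a few lines. This is only an illustration: the residue classes in `GROUPS` are a hypothetical grouping chosen for the example, not one given in the lecture, and real tools usually judge similarity with a substitution matrix instead.

```python
# Hypothetical physicochemical classes (an assumption made for this example).
GROUPS = [set("AVLIMFWC"), set("STNQYGP"), set("DE"), set("KRH")]

def same_group(a, b):
    """True if both residues fall in the same hypothetical class."""
    return any(a in g and b in g for g in GROUPS)

def identity_and_similarity(aln_a, aln_b):
    """Return (percent identity, percent similarity) over non-gap columns."""
    pairs = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no aligned residue pairs to compare")
    identical = sum(a == b for a, b in pairs)
    similar = sum(a == b or same_group(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)

# Two aligned toy peptides: A/A and K/K are identical, C/V and D/E are only similar.
ident, simil = identity_and_similarity("ACDK", "AVEK")
```

Both numbers are quantitative; neither one, on its own, says whether the two sequences are homologous.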
6
Q

What is another set of related terms for sequence comparison?

A

sequence similarity and sequence identity.

7
Q

Sequence Identity

A

the percentage of matches of the same amino acid residues between two aligned sequences.

8
Q

For what kind of sequences are sequence similarity and sequence identity synonymous?

A

nucleotide sequences.

9
Q

Sequence identity can be calculated in two different ways

A

1) I = [(Li × 2) / (La + Lb)] × 100
La and Lb are the total lengths of each individual sequence.
Li is the number of aligned identical residues.

2) I(S)% = (Li / La) × 100
where La here is the length of the shorter of the two sequences.
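As a worked illustration of the two formulas, here is a minimal sketch that assumes the two sequences are already aligned, with `-` marking gaps, and that both results are expressed as percentages. The function names and the toy sequences are made up for this example.

```python
def count_identical(aln_a, aln_b):
    """Li: number of aligned columns holding the same residue (gaps never count)."""
    return sum(a == b and a != "-" for a, b in zip(aln_a, aln_b))

def ungapped_length(aln):
    """Length of the original sequence, i.e. the aligned string without gaps."""
    return len(aln.replace("-", ""))

def identity_mean_length(aln_a, aln_b):
    """Method 1: I = [(Li x 2) / (La + Lb)] x 100."""
    li = count_identical(aln_a, aln_b)
    return (li * 2) / (ungapped_length(aln_a) + ungapped_length(aln_b)) * 100

def identity_shorter(aln_a, aln_b):
    """Method 2: Li divided by the length of the shorter sequence, x 100."""
    li = count_identical(aln_a, aln_b)
    shorter = min(ungapped_length(aln_a), ungapped_length(aln_b))
    return li / shorter * 100

# Toy alignment: 4 identical columns, sequence lengths 5 and 6.
a = "ACGT-A"
b = "ACCTTA"
print(round(identity_mean_length(a, b), 1))  # 8 / 11 as a percentage
print(round(identity_shorter(a, b), 1))      # 4 / 5 as a percentage
```

The second method never gives a lower value than the first when the sequences differ in length, because it divides by the smaller length only.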

10
Q

What issues are associated with multiple sequence alignments

A

All sequences show some similarity. Even random sequences match by chance: about 25% identity for nucleotides (4 possible bases) and about 5% for proteins (20 possible amino acids).

Similarity levels might be high in some parts of the sequence and low in other parts.

Sequences might show substantial length variation and presence/absence of various domains.
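The 25% and 5% baselines quoted above can be reproduced with a short calculation. This sketch assumes every residue is equally frequent and that positions are compared without gaps; real sequences have skewed composition, so their chance level differs somewhat.

```python
def expected_random_identity(frequencies):
    """Expected percent identity at one position of two independent random sequences.

    Two residues match when both are the same letter, so the chance of a
    match is the sum of the squared letter frequencies.
    """
    return 100.0 * sum(p * p for p in frequencies)

dna_baseline = expected_random_identity([0.25] * 4)       # 4 equally likely bases
protein_baseline = expected_random_identity([0.05] * 20)  # 20 equally likely amino acids
```

With uniform frequencies the result is simply 100 divided by the alphabet size.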

11
Q

What are the 3 main methods of alignment?

A
  • Manual
  • Automatic
  • Combined
12
Q

Why might manual alignment be carried out?

A

– Alignment is easy.
– There is some additional outside information (e.g. structural) to draw on.
– Automated alignment methods have encountered the local minimum problem.
– An automated alignment can be “improved” by hand.

13
Q

Progressive Alignment

A

a heuristic method and as such is not guaranteed to find the ‘optimal’ alignment.

Devised by Feng and Doolittle in 1987.

14
Q
A
15
Q

What steps are involved in the ClustalW procedure?

A
  1. Quick pairwise alignment: calculate distance matrix
  2. Neighbor-joining tree (guide tree)
  3. Progressive alignment following guide tree
16
Q

How is the ClustalW pairwise alignment done?

A
  • First perform all possible pairwise alignments between each pair of sequences.
  • Calculate the ‘distance’ between each pair of sequences based on these isolated pairwise alignments.
  • Generate a distance matrix.
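The three steps above can be sketched as follows. This is not ClustalW's own code: it assumes a plain Needleman-Wunsch global alignment with made-up scores (match +1, mismatch -1, gap -1) and takes the distance to be 1 minus the fraction of identical residues among the non-gap columns.

```python
from itertools import combinations

def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def distance(a, b):
    """1 - fraction of identical residues among the non-gap columns."""
    aln_a, aln_b = global_align(a, b)
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return 1.0 - sum(x == y for x, y in pairs) / len(pairs)

def distance_matrix(seqs):
    """Symmetric matrix of distances from every isolated pairwise alignment."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = distance(seqs[i], seqs[j])
    return d

matrix = distance_matrix(["ACGT", "ACGT", "AGGT"])
```

Each pair is aligned in isolation, so N sequences need N(N-1)/2 pairwise alignments before the guide tree can be built.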
17
Q

What is the neighbor-joining method?

A

  • The neighbor-joining (NJ) method is a greedy heuristic that, at each step, joins the two closest subtrees that are not already joined.
  • A key concept in the NJ method is that of neighbors: two taxa that are connected through a single internal node in an unrooted tree.
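One greedy step of neighbor joining can be sketched as below. Note that "closest" in NJ is usually judged on a corrected criterion and not on raw distance: the standard Q value, Q(i, j) = (n - 2)·d(i, j) - Σ d(i, k) - Σ d(j, k), also accounts for how far each taxon is from all the others. The distance matrix here is a toy example chosen for illustration, and the function name is hypothetical.

```python
from itertools import combinations

def nj_pick_pair(d):
    """Return the index pair (i, j) that neighbor joining would join next."""
    n = len(d)
    row_sums = [sum(row) for row in d]

    def q(i, j):
        # Corrected criterion: raw distance, penalised by total divergence.
        return (n - 2) * d[i][j] - row_sums[i] - row_sums[j]

    return min(combinations(range(n), 2), key=lambda pair: q(*pair))

# Toy distance matrix for five taxa (indices 0..4).
toy = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
first_join = nj_pick_pair(toy)
```

In this toy matrix the smallest raw distance is between taxa 3 and 4, yet the Q criterion selects taxa 0 and 1, which shows why the correction matters. A full implementation would then replace the joined pair with a new node, recompute distances, and repeat.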

18
Q

Multiple Alignment: First Pair

A
  • Align the two most closely-related sequences first.
  • This alignment is then ‘fixed’ and will never change. If a gap is to be introduced subsequently, then it will be introduced in the same place in both sequences, but their relative alignment remains unchanged.
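The "fixed" behaviour described above can be illustrated with a tiny helper. It is a sketch only: `insert_gap_column` is a hypothetical name, and the already-aligned pair is represented as a list of equal-length strings.

```python
def insert_gap_column(aligned_group, column):
    """Insert a gap at the same column in every sequence of a fixed alignment."""
    return [seq[:column] + "-" + seq[column:] for seq in aligned_group]

# The first pair, already aligned and now treated as fixed.
pair = ["ACGT", "A-GT"]

# A later sequence needs an extra column at position 2, so both get a gap there.
widened = insert_gap_column(pair, 2)

# Dropping that column again recovers the original pairwise alignment.
restored = [seq[:2] + seq[3:] for seq in widened]
```

Because the gap goes into the same column of both sequences, every residue stays paired with the same partner as before.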
19
Q

What are the advantages and disadvantages of ClustalW?

A
  • Advantages:
    – Speed.
  • Disadvantages:
    – No objective function.
    – No way of quantifying whether or not the alignment is good.
    – No way of knowing if the alignment is ‘correct’.