Lecture 5&6: Protein Sequence Alignment Flashcards

1
Q

What are Orthologs?

A

Genes in different species that descend from a single gene in their common ancestor; the copies diverged through speciation rather than through duplication.

2
Q

What are Paralogs?

A

Copies resulting from a gene duplication event within the same genome.

3
Q

What is the formula for percentage sequence identity?

A

(Number of identical residues / number of residues in the smaller protein) * 100
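As a minimal sketch of the formula above, the Python function below counts identical residues in a pair of already-aligned sequences and divides by the length of the shorter protein. The function name `percent_identity` and the use of `-` as the gap character are assumptions made for illustration, not part of the lecture material.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity for two already-aligned sequences (gaps written as '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Count columns where both sequences carry the same residue (gaps never count).
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )

    # Normalise by the number of residues in the smaller protein.
    shorter = min(
        len(aligned_a.replace("-", "")),
        len(aligned_b.replace("-", "")),
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")

    return 100.0 * identical / shorter


print(percent_identity("ACDE", "ACFF"))  # 2 identical residues out of 4
```

Other denominators (alignment length, mean length) are also in use, so identity values are only comparable when the same convention is applied.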

4
Q

What are the general steps to determine the function of a protein?

A
  1. Do fast scans using approximate methods (BLAST).
  2. Align proteins using a dynamic programming method (Needleman & Wunsch, Smith & Waterman).
  3. Scan against sequence profiles or HMMs in secondary databases (Pfam, InterPro).
  4. Align the sequence against family relatives (e.g. with ClustalW, inspecting the result in Jalview).
5
Q

What is the difference between the Needleman & Wunsch and Smith & Waterman algorithms?

A

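Needleman & Wunsch performs global alignment: the two sequences are aligned over their entire length.

Smith & Waterman performs local alignment: only the best-matching subregions are aligned.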
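The difference can be seen in a small dynamic programming sketch. The Python function below fills the same score matrix for both methods; the local variant resets negative cells to zero and reports the best cell anywhere in the matrix. The function name `alignment_score` and the scoring values (match +1, mismatch -1, linear gap -2) are illustrative assumptions; real protein alignments would use a substitution matrix and affine gap penalties.

```python
def alignment_score(a: str, b: str, local: bool = False,
                    match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Score of the best global (Needleman & Wunsch) or local (Smith & Waterman) alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # Global alignment: leading gaps are penalised in the first row and column.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            cell = max(diag, up, left)
            if local:
                # Local alignment: a negative score restarts the alignment at zero.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    # Global score sits in the bottom-right cell; local score is the matrix maximum.
    return best if local else score[-1][-1]


print(alignment_score("XXXHEAGXXX", "HEAG", local=False))
print(alignment_score("XXXHEAGXXX", "HEAG", local=True))
```

With a short motif embedded in a longer sequence, the global score is dragged down by the gaps needed to span the full length, while the local score reflects only the matching region.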

6
Q

What are some rules of sequence homology?

A

Protein pairs longer than 150 residues are likely homologs if their sequence identity is > 25%.

For shorter protein fragments, about 30% sequence identity is required.

Structure within families tends to be much more conserved compared to sequence.

Inheriting functional properties from a homolog requires around 60% sequence identity.

7
Q

What are the different matrices used when comparing two proteins?

A

Identity Matrix (Binary)

Physicochemical properties matrix (range)

Evolutionary matrices (Dayhoff, BLOSUM matrices)

8
Q

What is the Dayhoff matrix?

A

It is an evolutionary matrix.

It measures evolutionary distance in point accepted mutations (PAM), where 1 PAM = 1 accepted point mutation per 100 residues.

A distance of more than 100 PAM means that multiple substitutions have occurred at the same site.

9
Q

What is the BLOSUM matrix?

A

It is an evolutionary matrix.

It is derived from substitution patterns observed in conserved, ungapped blocks of aligned sequences from more distant relatives.

10
Q

What is the difference between a p-value and an E-value?

A

The p-value is the probability that a match this good was obtained by chance. It is converted to an E-value, which also takes the size of the database into account: the number of chance matches with at least this score expected in a search of that database.
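A rough Python sketch of how the two values relate is given below. It assumes the simplest conversion, E = p × N for a database of N sequences, which may differ from what a particular search tool computes; the function names are hypothetical. The second function uses the relation P = 1 - exp(-E), which BLAST uses to convert an E-value back into the probability of at least one chance hit.

```python
import math


def expected_chance_hits(p_value: float, database_size: int) -> float:
    """E-value under the assumed simple conversion E = p * N."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    return p_value * database_size


def p_from_e(e_value: float) -> float:
    """Probability of at least one chance hit, P = 1 - exp(-E)."""
    # expm1 keeps precision when the E-value is very small.
    return -math.expm1(-e_value)


# The same per-comparison p-value is far less impressive in a larger database.
print(expected_chance_hits(1e-6, 10_000))
print(expected_chance_hits(1e-6, 100_000_000))
```

For small E-values the two numbers are almost equal, which is why they are often used interchangeably when a hit is highly significant.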

11
Q

What types of residues are most conserved?

A

Catalytic residues are the most highly conserved residues. Others can include residues in the binding pocket or on the surface of the protein.

Highly conserved residues are usually associated with the function.

12
Q

What is progressive alignment?

A

It is a heuristic approach that uses the idea that sequences are evolutionarily related and can be aligned using an underlying phylogenetic tree.

13
Q

What are the features of the Clustal W algorithm?

A

It has position-specific gap opening and extension penalties (higher within strands and helices, lower between them).

It uses two different amino acid substitution matrices: one for close relatives, one for distant.

14
Q

What are some alternatives to Clustal W?

A

MAFFT
T-Coffee
MUSCLE
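Jalview (an alignment viewer and editor rather than an aligner itself; it can run alignment programs as web services)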

15
Q

How can conservation be measured?

A

While there are various methods to measure the magnitude of conservation, common ones use the frequency of a residue at a particular site.

Entropy scores are generated for each alignment column. A lower entropy score indicates a more conserved position; a higher score indicates a more variable one.
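One common frequency-based measure is Shannon entropy. The Python sketch below computes it for a single alignment column; the function name `column_entropy` and the choice to ignore gap characters (`-`) are assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of the residues in one alignment column."""
    residues = [r for r in column if r != "-"]  # gaps are ignored in this sketch
    if not residues:
        raise ValueError("column contains no residues")

    total = len(residues)
    counts = Counter(residues)

    # Sum p * log2(1/p) over the observed residue frequencies.
    return sum(
        (n / total) * math.log2(total / n) for n in counts.values()
    )


print(column_entropy("AAAA"))  # fully conserved column
print(column_entropy("AALL"))  # two residues, equally frequent
print(column_entropy("ACDE"))  # every sequence differs
```

A fully conserved column scores 0, and the score rises as more residue types share the column more evenly.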
