Sequence alignment (long ver p2) Flashcards

1
Q

Examples of Pairwise alignment software

A

EMBL - EBI Pairwise Sequence Alignment
BLAST

2
Q

What are the different applications of Pairwise Alignment?

A

measuring sequence similarity
studying the evolution of sequences
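
As a rough illustration of measuring sequence similarity, the sketch below computes percent identity between two sequences that have already been aligned. The function name `percent_identity`, the `-` gap symbol, and the choice to ignore gapped columns are assumptions made for this example; other tools may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy sequences, invented for illustration only.
print(percent_identity("ACGT-A", "ACCTTA"))
```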

3
Q

share a common evolutionary ancestor

A

Homologous sequences

4
Q

True or False: Homologous sequences do not share a significantly related 3D structure but share the same evolutionary ancestor

A

False

Homologous sequences share a significantly related 3D structure

5
Q
A
6
Q

usually share significant amino acid/nucleotide identity

A

homologous sequences

7
Q

sequence regions that are homologous are also called

A

conserved regions

8
Q

sequences that share a common evolutionary ancestry

A

homologs

9
Q

derived from a single ancestral gene in the last common ancestor

A

orthologs

10
Q

homologous genes with identical function in different organisms that are separated only by speciation

A

orthologs

11
Q

two or more homologous genes found within a single species

A

paralogs

12
Q

separated by a gene duplication event

A

paralogs

13
Q

if a gene in an organism is duplicated and transposed so that two copies occupy two different positions in the same genome, then the two copies are _

A

paralogous

14
Q

create gene families

A

paralogs

15
Q

consists of two or more copies of paralogous genes within the genome of a single organism

A

gene families

16
Q

True or False: Biological sequences do not occur in families

A

False

they often occur in families

17
Q

related genes within an organism

A

paralogs

18
Q

sequences within a population

A

polymorphic variants

19
Q

genes in other species

A

orthologs

20
Q

True or False: Homologous sequences often retain similar structures and functions

A

True

21
Q

collection of three or more protein (or nucleic acid) sequences that are partially or completely aligned

A

multiple sequence alignment
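
A multiple sequence alignment can be pictured as a grid with one row per sequence, where each vertical slice holds the residues presumed to be homologous. The sketch below is a minimal illustration with a made-up three-sequence alignment; the helper names `columns` and `consensus` are hypothetical and not taken from any alignment package.

```python
from collections import Counter

# Toy alignment invented for illustration; "-" marks a gap.
msa = [
    "MKV-LA",
    "MKI-LA",
    "MRVQLA",
]


def columns(alignment):
    """Return the alignment as a list of column strings."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows of an alignment must have equal length")
    return ["".join(col) for col in zip(*alignment)]


def consensus(alignment, gap="-"):
    """Most common non-gap residue in each column (gap if none)."""
    out = []
    for col in columns(alignment):
        residues = [c for c in col if c != gap]
        out.append(Counter(residues).most_common(1)[0][0] if residues else gap)
    return "".join(out)


print(columns(msa))
print(consensus(msa))
```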

22
Q

Homologous residues are aligned in _ across the length of the sequences

A

columns

23
Q

In multiple sequence alignment, the residues are presumed to be homologous in an:

A

evolutionary and structural sense

24
Q

residues are homologous as they are presumably derived from a common ancestor

A

evolutionary sense

25
Q

aligned residues tend to occupy corresponding positions in the three-dimensional structure of each aligned protein

A

structural sense

26
Q

What are the 5 main approaches to multiple sequence alignment?

A

exact methods
progressive alignment
iterative approaches
consistency-based methods
structure-based methods

27
Q

employs dynamic programming (similar to Needleman-Wunsch, but the matrix is multidimensional)

A

exact methods
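
To make the dynamic programming idea concrete, here is a minimal sketch of the two-sequence Needleman-Wunsch score. It is only the pairwise case: an exact multiple alignment would need one matrix dimension per sequence, which is why the table grows so quickly. The scoring values (match +1, mismatch -1, linear gap -2) are assumptions chosen for the example.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Optimal global alignment score of a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap inserted in b
            left = score[i][j - 1] + gap    # gap inserted in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


print(needleman_wunsch_score("ACGT", "AGT"))
```

With two sequences the table has about len(a) × len(b) cells; with N sequences of length L the analogous table has on the order of L^N cells.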

28
Q

goal is to maximize the summed alignment score of each pair of sequences

A

exact methods
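
The summed pairwise score can be sketched as a sum-of-pairs function over the columns of an alignment. The scoring values and the convention that a gap paired with a gap scores zero are assumptions for this example, not values taken from the cards.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Sum the scores of every pair of rows, column by column."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows of an alignment must have equal length")
    total = 0
    for col in zip(*alignment):
        for x, y in combinations(col, 2):
            if x == "-" and y == "-":
                continue  # assumption: gap against gap scores 0
            if x == "-" or y == "-":
                total += gap
            elif x == y:
                total += match
            else:
                total += mismatch
    return total


# Toy alignment invented for illustration.
print(sum_of_pairs(["AC-T", "ACGT", "A-GT"]))
```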

29
Q

generate optimal alignments but are not feasible in time or space for more than a few sequences

A

exact methods

30
Q

strategy entails calculating pairwise sequence alignment scores between all the proteins (or nucleic acid sequences) being aligned

A

Progressive Sequence Alignment

31
Q

beginning the alignment with the two closest sequences and progressively adding more sequences to the alignment

A

progressive sequence alignments

32
Q

What is the pro of Progressive Sequence Alignment?

A

permits rapid alignment of hundreds or thousands of sequences

33
Q

What is the con of Progressive Sequence Alignment?

A

final alignment depends on the order in which sequences are joined; not guaranteed to provide most accurate alignments

34
Q

What are the examples of Progressive Sequence Alignment?

A

ClustalW

35
Q

What are the 3 stages of the ClustalW algorithm?

A

STAGE 1: create a pairwise alignment of every pair of proteins included in the MSA
STAGE 2: guide tree is calculated from the distance (similarity) matrix
STAGE 3: MSA is created based on guide tree
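
The first two stages can be sketched as building a table of pairwise distances and finding the closest pair, which is where a progressive alignment would start. This is a simplified stand-in, not ClustalW itself: instead of real pairwise alignments it assumes equal-length toy sequences and uses the fraction of differing positions as the distance. The names `p_distance`, `distance_matrix`, and `closest_pair` are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of positions that differ (assumes equal-length sequences)."""
    if len(a) != len(b):
        raise ValueError("this simplified distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs: dict) -> dict:
    """Map each pair of sequence names to its distance."""
    return {(u, v): p_distance(seqs[u], seqs[v])
            for u, v in combinations(list(seqs), 2)}


def closest_pair(dist: dict) -> tuple:
    """Pair with the smallest distance; a progressive method aligns it first."""
    return min(dist, key=dist.get)


# Toy sequences invented for illustration.
seqs = {
    "seqA": "ACGTACGT",
    "seqB": "ACGTACGA",
    "seqC": "ACTTAGGA",
    "seqD": "TCTTAGCA",
}
dist = distance_matrix(seqs)
print(closest_pair(dist))
```

The distances feed the guide tree in stage 2, and the tree then fixes the order in which sequences are added in stage 3.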

36
Q

two ways to construct guide tree

A

Unweighted Pair Group Method of Arithmetic Averages (UPGMA)
Neighbor-Joining Method
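
As an illustration of the first of these methods, the sketch below builds a UPGMA tree topology from a table of pairwise distances. It repeatedly merges the two closest clusters and sets the distance from the merged cluster to every other cluster to the size-weighted average of the two old distances. Branch lengths are omitted, the tree is returned as nested tuples, and the input distances are made up for the example.

```python
def upgma(names, dist):
    """Return a nested-tuple guide tree.

    names: list of leaf labels
    dist:  dict mapping (label, label) pairs to distances
    """
    clusters = [(name, 1) for name in names]  # (subtree, number of leaves)
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dij = d[frozenset((clusters[i][0], clusters[j][0]))]
                if best is None or dij < best[0]:
                    best = (dij, i, j)
        _, i, j = best
        (ti, ni), (tj, nj) = clusters[i], clusters[j]
        merged = (ti, tj)
        rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
        # Average distances, weighted by cluster size.
        for tk, _ in rest:
            d[frozenset((merged, tk))] = (
                ni * d[frozenset((ti, tk))] + nj * d[frozenset((tj, tk))]
            ) / (ni + nj)
        clusters = rest + [(merged, ni + nj)]
    return clusters[0][0]


# Toy distances invented for illustration.
toy = {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
print(upgma(["A", "B", "C", "D"], toy))
```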

38
Q

compute a suboptimal solution using a progressive alignment strategy, and then modify the alignment using dynamic programming or other methods until a solution converges

A

Iterative Approaches

39
Q

What is the advantage of Iterative Approach over Progressive Sequence Alignment?

A

overcome alignment errors by iterative refinement

40
Q

What is an example of Iterative Approach?

A

MAFFT

41
Q

What does MAFFT mean?

A

Multiple Alignment using Fast Fourier Transform

42
Q

example of multiple alignment package that is considered to be highly accurate based on recent benchmarking studies

43
Q

use information about the multiple sequence alignment as it is being generated to guide the pairwise alignments

A

consistency-based methods

44
Q

example of Consistency-based approach

A

T-Coffee

45
Q

What does T-Coffee mean?

A

tree-based consistency objective function for alignment evaluation

46
Q

include all possible pairwise global alignments of the input sequences and the 10-highest scoring local alignments

47
Q

True or False: every pair of aligned residues is assigned a weight

48
Q

based on the idea that the tertiary structures evolve more slowly than primary sequences

A

structure-based approaches

49
Q

accuracy of MSA is improved by including information about the 3-dimensional structure of one or more members of the group of proteins being aligned

A

structure-based approaches

50
Q

a compilation of both multiple sequence alignments and profile HMMs of protein families

A

Pfam

51
Q

What does Pfam mean?

A

Protein Family Database of Profile HMMs