Lecture 5 - Introduction to Bioinformatics / Computational Genomics Flashcards

1
Q

Why does sequence information matter?

A

Amino-acid sequence determines protein structure and function

Amino acid sequence -> structure -> function

Similar sequences tend to have similar structures and, in turn, similar structures tend to have similar functions

Proteins are the building blocks of the cell and play a central role in the biological processes that sustain life

Analysis of biological sequences can help us identify the sequence elements and programmes that control the processes of life, including unravelling DNA function

2
Q

How is sequence analysis useful?

A

Sequence analysis is useful for:

-organism identification

-evolution and phylogenetics

-finding transcription factor (TF) binding sites

-gene prediction

-identifying protein families

-recognising protein domains and functions

3
Q

What is sequence comparison?

A

Sequence comparison and the ascertainment of sequence similarity is central to all computational genomics.

Sequence similarity often implies functional similarity.
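
One simple way to put a number on sequence similarity is percent identity. The sketch below is illustrative only: it assumes the two sequences are already aligned (equal length, with "-" marking gaps), and the function name percent_identity is made up for this card.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of gap-free aligned columns in which both residues match.

    Assumes the inputs are already aligned: equal length, '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # columns containing a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy example: 5 of the 7 aligned positions are identical.
print(percent_identity("GATTACA", "GACTATA"))
```

Percent identity ignores how chemically similar two mismatched residues are, so real tools usually score alignments with substitution matrices instead.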

4
Q

Why do genes in different species have similar DNA?

A

Shared evolutionary history: Species that descend from a common ancestor will share stretches of conserved DNA

Evolutionary convergence: Two species may evolve similar phenotypes in response to the same environment. DNA sequences of both species may converge

5
Q

Why do genes with similar sequence tend to have similar function?

A

Sequence similarity -> Evolutionary relationship -> Related function

Gene sequences after speciation events will accumulate random mutations. Over many generations sequence divergence between the species will increase.

The same gene in both species will nevertheless retain sequence similarity, because gene function must be preserved

Genes with similar function but low sequence similarity may encode proteins with similar 3D structures. Ultimately, what is evolutionarily conserved is protein structure, because structure determines function.

6
Q

What are orthologs?

A

Orthologs: Genes in the genomes of different species that are related because they descend from the same gene in a common ancestor (they diverged through speciation)

7
Q

What are paralogs?

A

Paralogs: Genes in the same genome that are related through a gene duplication event in an ancestral genome

8
Q

What are homologs?

A

Homologs: Genes that share a common evolutionary origin. The term covers both orthologs and paralogs.

9
Q

How are the alpha- and beta-globin amino acid sequences different and similar?

A

Alpha- and beta-globin amino acid sequences are considerably different. However, their secondary structure is relatively conserved.

Sickle cell anaemia is caused by a mutation in β-globin.

10
Q

Why do genes with similar sequence tend to have similar function?

A

Sequence similarity -> Structure similarity -> Related function

Convergent evolution occurs when two distinct species evolve similar traits, for example due to a shared environment. Similar traits may result from proteins with similar function. Proteins with similar function tend to have similar structure and similar sequence.

11
Q

Convergent evolution of flavone synthases

A

Flowering plants typically contain two structurally and catalytically convergent types of flavone synthase (FNS) with dissimilar DNA sequences

Flavone production in most plants is catalysed by membrane-bound cytochrome P450 enzymes (FNS II).

Plants in the Apiaceae employ soluble type I FNSs (FNS I), which belong to the non-homologous 2-oxoglutarate-dependent dioxygenase family.

12
Q

Summary

A

Similar sequences tend to have similar structures and, in turn, similar structures tend to have similar functions

Sequence similarity may have an evolutionary origin when the compared sequences have a common ancestor or may arise independently in two species due to evolutionary convergence.

Sequence similarity may occur due to speciation in orthologs or due to gene duplication in paralogs. Both orthologs and paralogs have an evolutionary origin and are termed homologs.

Two sequences that are similar due to evolutionary convergence are termed analogs.

13
Q

Quiz: Two genes may have similar function if they share high

A

Homology

Homology may arise due to orthology and paralogy. Both refer to loci shared by descent. The best answer is homology.

14
Q

What is sequence alignment?

A

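Sequence alignment arranges two or more sequences so that equivalent positions line up. The goals of an aligner are to distinguish regions of equivalence from regions of difference, avoid meaningless alignments, and find regions that are homologous because of a common evolutionary ancestor.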

15
Q

Local and Global Alignment

A

Global alignment finds the best possible alignment between two sequences across their full length

-Global alignment is applicable to very similar sequences of approximately the same length, e.g. the same gene in closely related species.

Local alignment finds regions of similarity between two sequences

-Local alignment is useful when only parts of a sequence are related, e.g. proteins that share common domains but are otherwise unrelated. Domains are conserved amino acid sequences that carry out similar functions.
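
The difference between the two modes can be sketched with dynamic programming. This is a minimal illustration, not a production aligner: the global branch follows the Needleman-Wunsch idea and the local branch the Smith-Waterman idea, and the scoring values (match +1, mismatch -1, gap -2) and the function name alignment_score are assumptions chosen for the example.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-2):
    """Best alignment score of a and b by dynamic programming.

    local=False: global alignment, both sequences aligned end to end.
    local=True:  local alignment, best-scoring pair of subregions.
    Scoring values are illustrative assumptions (linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global mode: leading gaps are penalised.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            cell = max(diag, up, left)
            if local:
                cell = max(cell, 0)  # local mode may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


# Two toy sequences that share only the region GATTACA.
seq_a = "CCCCGATTACA"
seq_b = "GATTACATTTT"
print("global:", alignment_score(seq_a, seq_b))
print("local: ", alignment_score(seq_a, seq_b, local=True))
```

In this toy case the local score picks out the shared GATTACA region, while the global score is dragged down by the unrelated flanks.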

16
Q

Pairwise vs Multiple Sequence Alignment

A

Pairwise alignment finds similarities between two sequences.

Multiple sequence alignment can be used to identify patterns common to protein families, to build phylogenetic trees, and to help predict the secondary and tertiary structures of proteins.

Multiple sequence alignment is more reliable because ambiguities in pairwise comparisons can be resolved by comparing additional sequences.
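
As a rough illustration of how a multiple alignment exposes family-wide patterns, the sketch below takes sequences that are assumed to be already aligned (equal length) and reports the most common residue in each column together with its frequency. The function name consensus and the toy sequences are made up for this card.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Most common residue and its frequency for each alignment column.

    Assumes the sequences are pre-aligned and of equal length.
    Ties are resolved in favour of the residue seen first in the column.
    """
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    columns = []
    for column in zip(*aligned_seqs):
        residue, count = Counter(column).most_common(1)[0]
        columns.append((residue, count / len(column)))
    return columns


# Toy alignment: columns 1 and 2 are fully conserved, 3 and 4 are not.
family = ["ACGT", "ACGA", "ACCT"]
for position, (residue, freq) in enumerate(consensus(family), start=1):
    print(position, residue, round(freq, 2))
```

Columns with a frequency close to 1 are candidates for conserved, functionally important positions.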

17
Q

Scoring alignments

A

Scoring is a numerical representation of the quality of the alignment
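
A minimal sketch of one common scoring scheme: add a reward for each matching column and subtract a penalty for each mismatch or gap. The values used here (match +1, mismatch -1, gap -2) and the function name score_alignment are assumptions for illustration. Real aligners typically use substitution matrices and separate gap-opening and gap-extension penalties.

```python
def score_alignment(aligned_a, aligned_b, match=1, mismatch=-1, gap=-2):
    """Sum of per-column scores for two already-aligned sequences.

    '-' marks a gap. The default scoring values are illustrative only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


# Six matching columns and one gap.
print(score_alignment("GATTACA", "GA-TACA"))
```

A higher score means a better alignment under the chosen scheme, so scores are only comparable when the same scheme is used.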

18
Q

Quiz: When aligning two highly similar sequences, which aligner is most appropriate?

A

Pairwise global

19
Q

BLAST search of sequence databases

A

BLAST is a fast, gapped local aligner used to search sequence databases for homologous protein and gene sequences. Because the query is compared against many database sequences, an alignment with a given score can arise by chance.

The E-value measures the significance of an alignment: it is the expected number of alignments with an equal or better score that could arise by chance. The lower the E-value, the more significant the hit.
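
The dependence of the E-value on score and database size can be sketched with the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where S is the raw alignment score, m the query length and n the total database length. The lambda and K values below are placeholders chosen for illustration, since the real values depend on the scoring matrix and gap penalties, and the function name expected_hits is made up for this card.

```python
import math


def expected_hits(raw_score, query_len, db_len, lam=0.3, k=0.1):
    """Karlin-Altschul estimate: E = K * m * n * exp(-lambda * S).

    lam and k are placeholder values, not parameters of any real
    scoring system.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# The same score is less significant in a larger database,
# and a higher score is more significant in the same database.
print(expected_hits(50, query_len=300, db_len=1_000_000))
print(expected_hits(50, query_len=300, db_len=100_000_000))
print(expected_hits(80, query_len=300, db_len=1_000_000))
```

The E-value grows linearly with the size of the search space and falls exponentially with the score, which is why the same alignment can look convincing in a small database and unremarkable in a large one.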

20
Q

BLASTp and BLASTn differences

A

Nucleotide BLAST (BLASTn) returns ‘hits’ (sequences) that are closely related to your query sequence at the nucleotide level, i.e. they have the same or a very similar nucleotide sequence.
-Very good if you want to compare evolutionarily close organisms, or to check primer specificity.

Protein BLAST (BLASTp) returns ‘hits’ that are similar at the protein level. Because the genetic code is redundant (several codons can encode the same amino acid), this is a good way to ‘cast the net wider’, i.e. two genes can have substantially different nucleotide sequences but encode the same protein.
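
The effect of codon redundancy can be shown with a toy example. The two DNA sequences below are invented for illustration, and the codon table is deliberately partial: it lists only the six codons used here, taken from the standard genetic code.

```python
# Partial codon table: only the codons used in this example
# (standard genetic code; L = leucine, S = serine, R = arginine).
CODONS = {
    "CTG": "L", "TTA": "L",
    "AGC": "S", "TCA": "S",
    "CGT": "R", "AGA": "R",
}


def translate(dna):
    """Translate DNA codon by codon, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))


def identity(a, b):
    """Fraction of identical positions in two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


gene_a = "CTGAGCCGT"  # hypothetical sequence
gene_b = "TTATCAAGA"  # hypothetical sequence, synonymous codons

print(translate(gene_a), translate(gene_b))
print("nucleotide identity:", round(identity(gene_a, gene_b), 2))
print("protein identity:   ", identity(translate(gene_a), translate(gene_b)))
```

The two sequences agree at only a few nucleotide positions yet translate to the same peptide, which is why a protein-level search can recover relationships that a nucleotide-level search would miss.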

21
Q

Quiz: BLAST is what type of aligner?

A

Local pairwise

BLAST is a fast, gapped, local pairwise aligner, and is particularly suited to quickly comparing thousands of sequences with low homology.