Protein Structures Flashcards

Question 1

Q

Levinthal’s paradox

Answer

A

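Levinthal’s paradox is a thought experiment in the theory of protein folding. In 1969, Cyrus Levinthal noted that, because of the very large number of degrees of freedom in an unfolded polypeptide chain, the molecule has an astronomical number of possible conformations. If a protein reached its native state by sampling these conformations one after another, folding would take longer than the age of the universe, yet real proteins fold on a timescale of seconds or less. Folding therefore cannot be a random search.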
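
To get a feel for the numbers, here is a back-of-envelope estimate. All parameters are illustrative assumptions and do not come from the card: 100 residues, 3 conformations per residue, and 10^13 conformations sampled per second.

```python
# Back-of-envelope estimate for Levinthal's paradox.
# Every parameter below is an illustrative assumption.
N_RESIDUES = 100
STATES_PER_RESIDUE = 3
SAMPLES_PER_SECOND = 1e13
SECONDS_PER_YEAR = 3.156e7
AGE_OF_UNIVERSE_YEARS = 1.38e10


def exhaustive_search_years(n_residues, states, rate):
    """Years needed to visit every conformation once at the given rate."""
    conformations = states ** n_residues
    return conformations / rate / SECONDS_PER_YEAR


years = exhaustive_search_years(N_RESIDUES, STATES_PER_RESIDUE, SAMPLES_PER_SECOND)
print(f"exhaustive search: {years:.2e} years")
print(f"ratio to age of universe: {years / AGE_OF_UNIVERSE_YEARS:.2e}")
```

Under these assumptions the search time is on the order of 10^27 years, far beyond the age of the universe.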

Question 2

Q

Chameleon Sequences

Answer

A

One sequence with more than one fold: some amino-acid sequences can assume different secondary structures in different structural contexts.

The concept that the secondary structure of a protein is essentially determined locally by the amino-acid sequence is at the heart of most methods of secondary structure prediction; it also underlies some of the computational approaches to predicting tertiary structure directly from sequence. Although this concept appears to be valid for many sequences, as the database of protein structures has grown, a number of exceptions have been found. Some stretches of sequence up to seven residues in length have been identified that adopt an alpha-helical conformation in the context of one protein fold but form a beta strand when embedded in the sequence of a protein with a different overall fold. These sequences have been dubbed chameleon sequences for their tendency to change their appearance with their surroundings.
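
A minimal sketch of how such stretches could be searched for, assuming each protein is given as a sequence plus a per-residue secondary-structure string (H = helix, E = strand, C = other). The function name and the toy sequences are made up for illustration.

```python
def find_chameleons(seq_a, ss_a, seq_b, ss_b, k=5):
    """Return (peptide, pos_a, pos_b) for every shared k-mer that is
    all-helix in one protein and all-strand in the other."""
    # Index every k-mer of protein A by its start position.
    index = {}
    for i in range(len(seq_a) - k + 1):
        index.setdefault(seq_a[i:i + k], []).append(i)

    hits = []
    for j in range(len(seq_b) - k + 1):
        peptide = seq_b[j:j + k]
        for i in index.get(peptide, []):
            states = {ss_a[i:i + k], ss_b[j:j + k]}
            if states == {"H" * k, "E" * k}:
                hits.append((peptide, i, j))
    return hits


# Toy data, invented for this example.
seq_a = "GSLKVAIAELTK"
ss_a = "CCHHHHHHHHCC"
seq_b = "MTKVAIAEVRG"
ss_b = "CCEEEEEECCC"

hits = find_chameleons(seq_a, ss_a, seq_b, ss_b, k=5)
print(hits)
```

On the toy data the two overlapping 5-mers KVAIA and VAIAE are reported, since they are helical in the first protein and strand-forming in the second.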

Question 3

Q

conformation switches

Answer

A

Protein conformational switches alter their shape upon receiving an input signal, such as ligand binding, chemical modification, or change in environment.

Question 4

Q

Secondary Structure Elements

Answer

A

alpha helix, amphipathic alpha helix, beta sheet

Question 5

Q

PDB - Protein Data Bank

Answer

A

http://www.rcsb.org or http://www.pdb.org
• central repository for biomolecular structures:
- experimental: NMR, X-ray, neutron, EM
- theoretical (separate site)
- structural “version tracking”
• fixed format(s) for representation of structural data:
- PDB format
- mmCIF format
• search engine
• some analyses

• contents of an entry:
- sequence
- coordinates
- links to relevant DBs
- citing paper
- taxonomy
- chemical information
- experimental conditions and artifacts
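
The PDB format stores coordinates in fixed columns, so ATOM records are read by slicing rather than by splitting on whitespace. The sketch below assumes the standard ATOM column layout (atom name in columns 13-16, residue name 18-20, chain 22, residue number 23-26, x/y/z in 31-38, 39-46, 47-54). Both function names are hypothetical helpers, not part of any PDB library.

```python
def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Hypothetical helper: build a minimal fixed-column ATOM record."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}"
    )


def parse_atom_line(line):
    """Read the main fields of an ATOM record by column position."""
    return {
        "name": line[12:16].strip(),       # columns 13-16
        "res_name": line[17:20].strip(),   # columns 18-20
        "chain": line[21],                 # column 22
        "res_seq": int(line[22:26]),       # columns 23-26
        "xyz": (
            float(line[30:38]),            # columns 31-38
            float(line[38:46]),            # columns 39-46
            float(line[46:54]),            # columns 47-54
        ),
    }


line = format_atom_line(2, " CA ", "MET", "A", 1, 11.104, 13.207, 2.1)
atom = parse_atom_line(line)
print(atom)
```

Slicing by column matters because neighbouring fields may touch without a separating space, for example when coordinates are large or negative.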

Question 6

Q

UniProt

Answer

A

3 Layers of UniProt:
• the UniProt Archive (UniParc):
- UniProtKB + all other protein sequences publicly available
- completeness

• the UniProt Reference Clusters (UniRef):
- non-redundant views of UniProtKB + selected
UniParc sets
- speed

• the UniProt Knowledgebase (UniProtKB)
- central database of annotated protein sequences
and functional information
- UniProtKB/Swiss-Prot + UniProtKB/TrEMBL

Question 7

Q

Swiss-Prot and TrEMBL

Answer

A

Swiss-Prot / TrEMBL - 2017-06-14
• Swiss-Prot (554,860)
- Manually annotated and reviewed.
- Records with information extracted from literature and curator-evaluated computational analysis.
• TrEMBL (87,291,332)
- Automatically annotated and not reviewed.
- Records that await full manual annotation.
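
A quick calculation with the two entry counts on this card shows how small the manually reviewed share is. Only the counts come from the card; the variable names are arbitrary.

```python
# Entry counts as given on the card (2017-06-14).
swiss_prot = 554_860
trembl = 87_291_332

reviewed_fraction = swiss_prot / (swiss_prot + trembl)
print(f"reviewed share of UniProtKB: {reviewed_fraction:.2%}")
```

Less than one percent of UniProtKB entries were manually reviewed at that date.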

Question 8

Q

Predicting Secondary Structures

Answer

A

Chou-Fasman method: The Chou-Fasman method is an empirical technique for the prediction of secondary structures in proteins, originally developed in the 1970s by Peter Y. Chou and Gerald D. Fasman. The method is based on analyses of the relative frequencies of each amino acid in alpha helices, beta sheets, and turns, taken from known protein structures solved with X-ray crystallography. From these frequencies a set of probability parameters was derived for the appearance of each amino acid in each secondary structure type, and these parameters are used to predict the probability that a given sequence of amino acids would form a helix, a beta strand, or a turn in a protein. The method is at most about 50-60% accurate in identifying correct secondary structures, which is significantly less accurate than modern machine learning-based techniques. (Wikipedia)
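
The following is a simplified sketch of the idea, not the published Chou-Fasman procedure. It derives propensities from labelled training data as (fraction of residue type a found in state s) / (fraction of all residues in state s), then labels each residue by the best window-averaged propensity. The training data, the window size, the threshold of 1.0, and the function names are all assumptions made for illustration; the real method uses published parameter tables and nucleation and extension rules.

```python
from collections import Counter


def propensities(sequences, structures):
    """Propensity table {(amino_acid, state): value} from labelled data."""
    aa_count, state_count, pair_count = Counter(), Counter(), Counter()
    for seq, ss in zip(sequences, structures):
        for aa, state in zip(seq, ss):
            aa_count[aa] += 1
            state_count[state] += 1
            pair_count[(aa, state)] += 1
    total = sum(aa_count.values())
    return {
        (aa, s): (pair_count[(aa, s)] / aa_count[aa]) / (state_count[s] / total)
        for aa in aa_count
        for s in state_count
    }


def predict(seq, table, states="HE", window=3, threshold=1.0):
    """Label each residue with the state whose window-averaged propensity
    is highest and above the threshold, otherwise 'C'."""
    half = window // 2
    labels = []
    for i in range(len(seq)):
        chunk = seq[max(0, i - half): i + half + 1]
        best, best_score = "C", threshold
        for s in states:
            score = sum(table.get((aa, s), 0.0) for aa in chunk) / len(chunk)
            if score > best_score:
                best, best_score = s, score
        labels.append(best)
    return "".join(labels)


# Toy training set, invented for this example.
table = propensities(["AAAAVVVV"], ["HHHHEEEE"])
prediction = predict("AAAAVVVV", table)
print(table[("A", "H")], prediction)
```

A propensity above 1 means the residue occurs in that state more often than expected by chance; in the toy data alanine is found only in helices, so its helix propensity is 2.0.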

Question 9

Q

Structure Similarity

Answer

A

• when are two structures similar?
• given two protein structures, what is their largest common substructure? The structures of bacteriochlorophyll-A (4bcl) and the transmembrane part of porin (2omf) would be appropriate for this question.
• which atoms in a protein structure A correspond to which atoms in protein structure B? The myoglobin and leghemoglobin structures would be appropriate structures for this question.

Question 10

Q

Structural Alignment

Answer

A

Structural Alignment: Structural alignment attempts to establish homology between two or more polymer structures based on their shape and three-dimensional conformation. This process is usually applied to protein tertiary structures but can also be used for large RNA molecules.

Question 11

Q

3D Matching

Answer

A

A structure is:
• a collection of (possibly typed) atoms or groups of atoms (“points”) in some given relative 3D placement.
• the placement of a group of atoms is defined by the position of a reference point (e.g., the center of an atom) and the orientation of a reference direction.
• the type can be the atom ID, the amino-acid ID, etc.
Two structures A and B match if:
1. Correspondence:
There is a one-to-one map between their points.
2. Alignment:
There exists a rigid-body transform T such that the RMSD between the points in A and those in T(B) is less than some threshold ε.
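
The two conditions can be sketched as a check for one given candidate transform. This assumes the correspondence is simply the list order (point i of A maps to point i of B) and that the rotation matrix and translation vector are supplied by the caller. Finding the best transform is a separate step (the Kabsch algorithm is a common choice) and is not shown. Function names are hypothetical.

```python
import math


def apply_transform(points, rotation, translation):
    """Apply x -> R x + t to every 3D point."""
    return [
        tuple(
            sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
            for r in range(3)
        )
        for p in points
    ]


def rmsd(a, b):
    """Root-mean-square deviation between corresponding points."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def matches(a, b, rotation, translation, eps):
    """True if the point lists correspond one-to-one and T(b) lies within eps of a."""
    if len(a) != len(b):
        return False
    return rmsd(a, apply_transform(b, rotation, translation)) < eps


# Toy example: B is A rotated and shifted, so the right T maps it back onto A.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
B = [(-1.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
ROT_Z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(matches(A, B, ROT_Z_90, (1.0, 1.0, 0.0), eps=0.1))
print(matches(A, B, IDENTITY, (0.0, 0.0, 0.0), eps=0.1))
```

The first call uses the transform that maps B onto A, the second leaves B where it is.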

Question 12

Q

Root Mean Square Deviation (RMSD)

Answer

A

RMSD: In bioinformatics, the root-mean-square deviation of atomic positions (or simply root-mean-square deviation, RMSD) is the measure of the average distance between the atoms (usually the backbone atoms) of superimposed proteins. Note that the RMSD calculation can be applied to other, non-protein molecules, such as small organic molecules. In the study of globular protein conformations, one customarily measures the similarity in 3D structure by the RMSD of the Cα atomic coordinates after optimal rigid-body superposition.
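
Written out as a formula, with notation that is assumed here rather than taken from the card: a_i and b_i are the coordinates of the i-th pair of corresponding atoms, N is the number of pairs, and T is the rigid-body transform that superimposes B on A.

```latex
\mathrm{RMSD}(A, B) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left\lVert a_i - T(b_i) \right\rVert^{2}}
```

For the optimal superposition, T is chosen to minimise this value.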

Question 13

Q

Alignment operations

Answer

A

Translation and rotation
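
A small sketch of the two operations on a list of 3D points: a translation that moves the centroid to the origin, followed by a rotation about the z axis. The choice of axis, the angle, and the function names are arbitrary assumptions for illustration. Both operations are rigid, so distances between points are unchanged.

```python
import math


def centroid(points):
    """Mean position of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def translate(points, t):
    """Shift every point by the vector t."""
    return [tuple(p[k] + t[k] for k in range(3)) for p in points]


def rotate_z(points, angle):
    """Rotate every point about the z axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
center = centroid(points)
centered = translate(points, tuple(-v for v in center))
moved = rotate_z(centered, math.pi / 2)

print(math.dist(points[0], points[1]), math.dist(moved[0], moved[1]))
```

Centering both structures on their centroids is the usual first step before searching for the rotation that minimises the RMSD.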

Question 14

Q

Double Dynamic Programming

Answer

A

SSAP - Sequential Structure Alignment Program used for CATH database
• lower level
- fix coordinate frame on the backbone of one residue
- align residue environments
• upper level
- cumulates scores of similarities in residue environments

Question 15

Q

CE algorithm

Answer

A

Protein structure alignment by incremental Combinatorial Extension (CE) of the optimal path.
Define an Alignment Fragment Pair (AFP) as a continuous segment of protein A (submatrix) aligned, without gaps, against a continuous segment of protein B (submatrix). An alignment is a path of AFPs such that between any two consecutive AFPs gaps may be inserted into either A or B, but not into both. That is, for every two consecutive AFPs i and i+1 of length m, one of the following holds:

$p_{i+1}^A = p_i^A + m$ and $p_{i+1}^B = p_i^B + m$

or

$p_{i+1}^A = p_i^A + m$ and $p_{i+1}^B > p_i^B + m$

or

$p_{i+1}^A > p_i^A + m$ and $p_{i+1}^B = p_i^B + m$

where $p_i^A$ is the starting position of AFP i in protein A, and $p_i^B$ is its starting position in protein B.
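
The three conditions translate directly into a small predicate on the starting positions of two consecutive AFPs. The function and argument names are hypothetical.

```python
def valid_successor(p_a, p_b, q_a, q_b, m):
    """Check whether an AFP starting at (q_a, q_b) may follow an AFP
    starting at (p_a, p_b), both of length m."""
    no_gap = q_a == p_a + m and q_b == p_b + m
    skip_in_b = q_a == p_a + m and q_b > p_b + m   # residues of B left unaligned
    skip_in_a = q_a > p_a + m and q_b == p_b + m   # residues of A left unaligned
    return no_gap or skip_in_b or skip_in_a


print(valid_successor(0, 0, 8, 8, m=8))    # contiguous in both proteins
print(valid_successor(0, 0, 8, 11, m=8))   # three residues of B skipped
print(valid_successor(0, 0, 10, 11, m=8))  # residues skipped in both: not allowed
```

Overlapping fragments (a successor starting before the previous AFP ends) are rejected as well, since none of the three conditions holds.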

Question 16

Q

CE algorithm step by step

Answer

A

goal: find a “good” local alignment for the structures of proteins A and B
basic idea:

1. select some initial AFP
2. build an alignment path by incrementally adding AFPs in a way that satisfies the path conditions for consecutive AFPs (see Question 15)
3. repeat step (2) until the length of each protein is traversed, or until no “good” AFPs remain
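
The loop can be sketched as a greedy procedure. This is a simplified illustration, not the published CE implementation. It assumes structures are lists of Cα coordinates, scores an AFP by the mean absolute difference of its intra-fragment distances, calls an AFP “good” when that score is below a cutoff, starts from the best-scoring AFP, and only extends the path forward. The fragment length m, the cutoff, and all names are assumptions.

```python
import math


def afp_distance(a, b, i, j, m):
    """Mean absolute difference between the internal distances of
    fragment a[i:i+m] and fragment b[j:j+m] (requires m >= 2)."""
    total, count = 0.0, 0
    for k in range(m):
        for l in range(k + 1, m):
            total += abs(math.dist(a[i + k], a[i + l]) - math.dist(b[j + k], b[j + l]))
            count += 1
    return total / count


def build_path(a, b, m=3, cutoff=0.5):
    """Greedily chain good AFPs; returns a list of (start_in_a, start_in_b)."""
    good = {}
    for i in range(len(a) - m + 1):
        for j in range(len(b) - m + 1):
            d = afp_distance(a, b, i, j, m)
            if d < cutoff:
                good[(i, j)] = d
    if not good:
        return []

    # Step 1: pick an initial AFP (here simply the best-scoring one).
    path = [min(good, key=lambda p: (good[p], p))]

    # Steps 2 and 3: keep adding the best AFP that satisfies the path
    # conditions, until no good AFP can follow the last one.
    while True:
        pi, pj = path[-1]
        candidates = [
            (i, j) for (i, j) in good
            if (i == pi + m and j >= pj + m) or (i >= pi + m and j == pj + m)
        ]
        if not candidates:
            break
        path.append(min(candidates, key=lambda p: (good[p], p)))
    return path


# Toy coordinates, invented for this example; a structure aligned with itself.
coords = [(0, 0, 0), (1.5, 0, 0), (2, 1, 0), (3.8, 1.2, 0.5), (4, 3, 1), (6, 3.5, 0)]
path = build_path(coords, coords, m=3)
print(path)
```

Aligning the toy structure with itself yields two consecutive AFPs on the diagonal, covering all six residues.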

Question 17

Q

CE problems

Answer

A

• how do we choose the starting AFP?
• what are the criteria for adding AFPs to our alignment path?
• how do we know when to stop? That is, at what point do we know that there are no “good” AFPs left?
There are various heuristics that could be used to supply answers to the above questions.
To assess how good the alignment produced by CE is, i.e. its significance, we can compare it to the alignments of random pairs of structures and compute the Z-score of the corresponding RMSD values.
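
A minimal sketch of the Z-score step, assuming the RMSD values for random structure pairs have already been collected. The background values are made up, and the sample standard deviation is an arbitrary choice.

```python
import statistics


def z_score(value, background):
    """Number of standard deviations by which value differs from the
    mean of the background distribution."""
    mean = statistics.mean(background)
    sd = statistics.stdev(background)  # sample standard deviation
    return (value - mean) / sd


# Hypothetical RMSD values (in Å) from alignments of random structure pairs.
random_rmsds = [4.0, 5.0, 6.0]
z = z_score(2.0, random_rmsds)
print(z)
```

Because a lower RMSD means a better alignment, a strongly negative Z-score under this sign convention indicates an alignment that is much better than random.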

Question 18

Q

Contact maps

Answer

A

A protein contact map represents the distance between all possible amino acid residue pairs of a three-dimensional protein structure using a binary two-dimensional matrix. For two residues i and j, the ij element of the matrix is 1 if the two residues are closer than a predetermined threshold, and 0 otherwise. Various contact definitions have been proposed: the distance between Cα atoms with a threshold of 6-12 Å; the distance between Cβ atoms with a threshold of 6-12 Å (Cα is used for glycine); and the distance between the side-chain centers of mass.
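
A minimal sketch of the Cα-based definition, assuming one coordinate per residue and a threshold of 8 Å (one value inside the 6-12 Å range given above). The coordinates are made up.

```python
import math


def contact_map(coords, threshold=8.0):
    """Binary matrix: entry [i][j] is 1 if residues i and j are closer
    than the threshold (in Å), else 0."""
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) < threshold else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical Cα coordinates for three residues.
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
cmap = contact_map(ca_coords)
for row in cmap:
    print(row)
```

The matrix is symmetric, and the diagonal is always 1 because every residue is at distance zero from itself.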

Question 19

Q

DALI algorithm

Answer

A

DALI: distance alignment method
• splits proteins into hexapeptides
• pairwise comparison of the intramolecular distance patterns of all fragments of both structures
• accounts for the possibility of order changes of structures
• database: FSSP (Families of Structurally Similar Proteins)
• link: http://ekhidna.biocenter.helsinki.fi/dali/
• available as database, standalone program, and web service

(19 cards)