Lecture 6 - protein structure and function 3 Flashcards
what is a motif?
A MOTIF is usually a sequence of amino acids which is predictive of belonging to a particular group. For proteins, this means we can use motifs to predict which proteins belong to a particular protein family; the kinase motifs are examples of this.
Example motif: the ATP binding site (Walker motif). You will meet others as you go through the course and in honours years.
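A motif of this kind can be written as a pattern and searched for in a sequence. The sketch below assumes the commonly cited Walker A (P-loop) consensus G-x(4)-G-K-[ST]; the function name and the example sequence are hypothetical, made up for illustration.

```python
import re

# Walker A (P-loop) consensus, commonly written G-x(4)-G-K-[ST]:
# Gly, any four residues, Gly, Lys, then Ser or Thr.
WALKER_A = re.compile(r"G.{4}GK[ST]")


def find_walker_a(sequence):
    """Return (1-based start position, matched text) for each motif hit."""
    return [(m.start() + 1, m.group()) for m in WALKER_A.finditer(sequence.upper())]


# Hypothetical sequence, made up for illustration only.
example = "MKTLLAGPPGSGKSTLAKELQ"
print(find_walker_a(example))
```

A hit only suggests that the protein may bind ATP; it is a prediction to be tested, not proof of function.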
what is a domain?
A DOMAIN is a structural entity, and usually refers to part of the protein structure that can fold and function independently. Proteins are often made of many domains linked together.
e.g. SH2 domain, Kinase domain, Bromodomain
what are domains and motifs?
A level up from domains are amino acid sequence motifs that can often define functional characteristics
These are distinct from protein domains, which tend to relate more to structural units
what is bioinformatics?
Can tell you about sequence similarities and conservation
Structural similarities
Provide strong hints at function (e.g. kinase signature sequence)
Computational analysis can provide important insight into potential protein function.
In this case, the sequence contains many key ‘signatures’, characteristic of a protein kinase.
what is the key to understanding protein function?
To determine its structure. This allows us to understand the arrangement of atoms in 3D space and gain insight into mechanism (think of an enzyme active site).
what is X-ray crystallography?
X-ray crystallography enables us to visualise protein structures at the atomic level and enhances our understanding of protein function. X-ray crystallography can be considered a form of microscopy. The amount of detail (the resolution) of any microscope is limited by the wavelength of the electromagnetic radiation used.
With light microscopy, where the shortest wavelength is about 300 nm, one can see individual cells and sub-cellular organelles. With electron microscopy, where the wavelength may be below 10 nm, one can see detailed cellular architecture and the shapes of large protein molecules.
In order to see proteins in atomic detail, we need to work with electromagnetic radiation with a wavelength of around 0.1 nm (1 Å), i.e. X-rays.
what is a BLAST search?
BLAST (Basic Local Alignment Search Tool) is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of proteins.
A BLAST search enables a researcher to compare a query sequence with a database of sequences, and identify sequences that resemble the query sequence.
Different types of BLAST are available, depending on the type of query sequence.
For example, following the discovery of a previously unknown gene in mouse, a scientist will typically perform a BLAST search of the human genome to see if humans carry a similar gene; BLAST will identify sequences in the human genome that resemble the mouse gene based on similarity of sequence. Can you find a yeast version? If you can, there may be a deletion strain etc.
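The idea of scoring a query against each database sequence by local alignment can be sketched as follows. This is not BLAST's own method: BLAST uses a faster heuristic, whereas the sketch uses plain Smith-Waterman dynamic programming. The scoring values (match +2, mismatch -1, gap -2), the function names and the sequences are all assumptions for illustration; real searches use substitution matrices.

```python
def local_alignment_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between two sequences."""
    rows, cols = len(query) + 1, len(subject) + 1
    # score[i][j] = best score of a local alignment ending at cell (i, j)
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            same = query[i - 1] == subject[j - 1]
            diag = score[i - 1][j - 1] + (match if same else mismatch)
            up = score[i - 1][j] + gap      # gap in the subject
            left = score[i][j - 1] + gap    # gap in the query
            # Never go below zero: a local alignment can restart anywhere.
            score[i][j] = max(0, diag, up, left)
            best = max(best, score[i][j])
    return best


def rank_database(query, database):
    """Rank database entries (name -> sequence) by score, best first."""
    scored = [(local_alignment_score(query, seq), name) for name, seq in database.items()]
    return sorted(scored, reverse=True)


# Hypothetical sequences, made up for illustration only.
database = {"candidate_1": "AAGKSTAA", "candidate_2": "CCCC"}
print(rank_database("GKST", database))
```

The entry containing a stretch resembling the query ranks first, which is the behaviour a BLAST search provides at database scale.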
what is an MSA?
A multiple sequence alignment (MSA) is a sequence alignment of three or more protein sequences.
From the resulting MSA, sequence similarity can be inferred and phylogenetic analysis can be conducted to assess the sequences’ shared evolutionary origins.
Multiple sequence alignment is often used to assess sequence conservation of protein domains, tertiary and secondary structures, and even individual amino acids.
Because three or more sequences of biologically relevant length are almost impossible to align by hand, computational algorithms are used to produce and analyse the alignments.
MSAs require more sophisticated methodologies than pairwise alignment because they are more computationally complex.
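Once an alignment exists, conservation can be read off column by column. The sketch below assumes the alignment has already been produced by an alignment program and is given as equal-length strings; the function name and the three sequences are hypothetical.

```python
from collections import Counter


def column_conservation(alignment):
    """For each column, return (most common symbol, fraction of sequences with it).

    Gap characters ('-') are counted like any other symbol.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*alignment):
        symbol, count = Counter(column).most_common(1)[0]
        result.append((symbol, count / len(alignment)))
    return result


# Hypothetical aligned sequences, made up for illustration only.
alignment = ["GKSTL", "GKSAL", "GKTTL"]
for position, (symbol, fraction) in enumerate(column_conservation(alignment), start=1):
    print(position, symbol, f"{fraction:.2f}")
```

Columns where every sequence carries the same residue are candidates for functionally important positions.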
What are other methods of structural determination?
A drawback of crystallography is that the protein in question must crystallise in an ordered fashion (not all proteins do).
Many only crystallise at non-physiological pH or [Salt] - hence relevance may be debatable.
This also means that you see a single static image of a protein, and get no indication of its dynamics.
Several other methods may be used to determine the structure of a protein, including NMR spectroscopy, and electron microscopy.
NMR - information on structure under more physiological conditions (in aqueous buffer).
EM - overall shape of the molecule
what does structure give us?
An understanding of the structure/function relationship - how does the 3D arrangement of the atoms relate to the function of the protein.
Insight into mechanism.
Can use this to design inhibitors or activators.
Help design targeted drugs
how do angiotensin-converting enzyme inhibitors (ACE inhibitors) work?
Inhibit production of angiotensin-II, act to lower blood pressure.
Widely prescribed for high blood pressure.
Design of new, more effective drugs (fewer side-effects, increased efficacy, reduced need for combination therapy) driven by a deeper understanding of the structure.
2009 saw 163 million prescriptions in the US, costing around $120 million.
Development of cheap alternative drug by rational design halved this bill in one year.
what is homology modelling?
Experimental elucidation of a protein structure may often be delayed by difficulties in obtaining sufficient amount of material (cloning, expression and purification of milligram quantities of the protein) and difficulties associated with crystallisation.
It is not surprising that methods for predicting protein structure (e.g. AlphaFold2) have gained much interest.
Among these methods, the method of homology modelling usually provides the most reliable result.
The use of this method is based on the observation that two proteins belonging to the same family (and sharing similar amino acid sequences), will have similar three-dimensional structures.
In reality, the degree of conservation of protein three-dimensional structure within a family is much higher than conservation of the sequence.
The term “homology modelling” refers to modelling a protein 3D structure using a known experimental structure of a homologous protein (the template).
The “low-resolution” structure provided by homology modelling contains sufficient information about the spatial arrangement of important residues in the protein.
In the pharmaceutical industry homology modelling is valuable in structure-based drug discovery and drug design
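Because the method relies on the target and template sharing similar sequences, a common first check is the percent identity of their pairwise alignment. The sketch below assumes the two sequences are already aligned with '-' for gaps; the function name and sequences are hypothetical, and definitions of percent identity vary (here gapped positions are ignored).

```python
def percent_identity(aligned_target, aligned_template):
    """Percent of aligned (non-gap) positions where the two residues are identical."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical aligned target and template, made up for illustration only.
print(round(percent_identity("GKST-LA", "GKSAQLA"), 1))
```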
what is Size exclusion chromatography?
Size exclusion chromatography
A useful simple approach to begin to fractionate complex mixtures.
Works best on water-soluble proteins. Easy and cheap.
Separation on the basis of size
what is proteomics?
Proteomics is the large-scale analysis of proteins, in particular in complex mixtures such as cells or organelles or viruses.
Proteomics has been enabled by the accumulation of DNA and protein databases, improvements in computer algorithms for searching, and particularly advances in mass spectrometry.
Able to generate huge amount of data:
Analyse composition of complex mixtures from a single sample
Protein identification
Post-translational modification
Detailed analysis of a complex mixture
what do proteins carry?
Many amino acid R-groups, plus the amino and carboxy termini of proteins have the potential to carry a charge, depending on the pH. Depending on the primary sequence, a protein may be either positively or negatively charged.
At the isoelectric point, a protein has no net charge.
At a pH above the isoelectric point, a protein carries a net negative charge; below it, a net positive charge.
This is shown here for a single amino acid, Glycine.
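The charge behaviour of glycine can be estimated with the Henderson-Hasselbalch relationship. The sketch below assumes textbook pKa values for free glycine (about 2.34 for the carboxyl group and 9.60 for the amino group); the function name is hypothetical.

```python
PKA_CARBOXYL = 2.34  # glycine alpha-carboxyl group (textbook value, assumed)
PKA_AMINO = 9.60     # glycine alpha-amino group (textbook value, assumed)


def glycine_net_charge(ph):
    """Average net charge of glycine at a given pH (Henderson-Hasselbalch)."""
    # Fraction of amino groups that are protonated (+1 each)
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_AMINO))
    # Fraction of carboxyl groups that are deprotonated (-1 each)
    negative = 1.0 / (1.0 + 10 ** (PKA_CARBOXYL - ph))
    return positive - negative


# With only two ionisable groups, the isoelectric point is the mean of the two pKa values.
ISOELECTRIC_POINT = (PKA_CARBOXYL + PKA_AMINO) / 2  # about 5.97

for ph in (1.0, ISOELECTRIC_POINT, 12.0):
    print(f"pH {ph:5.2f}: net charge {glycine_net_charge(ph):+.2f}")
```

The net charge is positive well below the isoelectric point, essentially zero at it, and negative well above it. A protein follows the same logic, summed over all of its ionisable groups.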
what are some important generalities?
Proteins are susceptible to degradation by proteases. Cells/proteins must be kept ice-cold. Protease inhibitors must be used.
Protein conformation is dependent on the pH and ionic conditions within a cell. Breaking cells open means you need to keep control of this - [Salt], pH are crucially important for protein stability.
You need a reliable assay for your protein of interest.
Mammalian cells contain up to 10^10 protein molecules, reflecting >10,000 species of protein.
how do you separate on the basis of charge?
Anion Exchange chromatography
Positively charged resin/beads, binds negatively charged proteins
Cation Exchange chromatography
Negatively charged resin/beads, binds positively charged proteins
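The choice of resin follows from comparing the buffer pH with the protein's isoelectric point. The sketch below is a simplification that considers net charge only (real binding also depends on salt concentration and on how charge is distributed over the protein surface); the function name and example values are hypothetical.

```python
def choose_ion_exchange(protein_pi, buffer_ph):
    """Suggest which ion-exchange resin should bind a protein at the buffer pH."""
    if buffer_ph > protein_pi:
        # Above its pI the protein is net negative: a positively charged resin binds it.
        return "anion exchange"
    if buffer_ph < protein_pi:
        # Below its pI the protein is net positive: a negatively charged resin binds it.
        return "cation exchange"
    return "neither (no net charge at the isoelectric point)"


# Hypothetical proteins with pI 5.0 and pI 9.0, both in a pH 7.5 buffer.
print(choose_ion_exchange(5.0, 7.5))
print(choose_ion_exchange(9.0, 7.5))
```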
is there a correlation between exons and domains?
Significant correlation between the borders of exons and domains for both invertebrates and vertebrates.
Extensive exon shuffling events during evolution significantly contributed to the shaping of eukaryotic proteomes.
how is that important for protein structures?
For ‘small’ proteins this is valid.
A larger protein, e.g. 1000 amino acids, is unlikely ever to fold by itself into the final structure.
Sub-dividing the structure into domains which fold autonomously means that each domain reaches its energy minimum (stable final conformation), sometimes while the rest of the protein is still being synthesised on the ribosome.
‘Linker’ regions between domains are often unstructured and act like a flexible hinge.
how do you define the key residues within (for example) the active site?
While identifying structures like α-helix or ß-sheet or domains (SH2 etc.) is straightforward, defining key residues within (for example) the active site of an enzyme can be difficult.
This is because key residues at the active site may be widely spaced in the protein primary sequence but end up close together in the folded protein.
how can structure and function be predicted from protein sequence using in silico approaches?
Because of the sequencing of genomes of many organisms, we can predict the amino acid sequence of many proteins.
We can use the predicted amino acid sequence to predict structure (e.g. what regions are α-helix, ß-sheet, etc).
We can search for domains within our protein based on both structure and sequence alignment (SH2 domains, etc.)
We can compare our protein of interest with databases of sequences from other organisms to see if we can identify related proteins (family members).
Note - there are many proteins for which this approach doesn’t work…but this is evolving.