9. Proteomics Flashcards
What is the main method of proteomics?
Mass spectrometry
What is the first step of proteomics?
Extracting and isolating the proteins
How are the proteins separated before they enter mass spectrometry?
- Extract the proteins from the cells and run on a gel.
- This separates the proteins by size.
- Then the gel is sliced up into size chunks.
- Each slice is put into its own test tube and digested with trypsin.
- This is spun down and the solution containing the peptides is taken off the top.
- The peptides are then separated using liquid chromatography.
- Then the peptides are ionised and enter the mass spec.
What is trypsin?
- A protease
- It cuts on the C-terminal side of lysine and arginine residues (in practice, usually not when the next residue is proline)
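Worked example (a minimal sketch, not from the lecture): the cutting rule can be tried out on a sequence in Python. The function name `tryptic_digest` and the example sequence are made up for illustration, and the "no cut before proline" exception is a commonly used rule assumed here rather than something stated in these notes.

```python
def tryptic_digest(protein: str) -> list[str]:
    """Split a protein after every K or R, unless the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        # Assumed rule: trypsin cuts C-terminal to K/R but not before proline.
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    # Whatever is left after the last cut site is the final peptide.
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


# Made-up sequence: one normal K site, one R followed by P, one more K site.
print(tryptic_digest("AAKGGRPLLKMM"))
```

Every peptide except the last one ends in K or R, which is the property SILAC relies on later in these notes.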
How does chromatography separate the peptides?
Either by:
1. Net charge
2. Hydrophobicity
3. Or another chemical property of the peptide
Why does chromatography separate peptides by chemical property?
As these properties differ between peptides
How does mass spectrometry identify peptides?
Using their mass-to-charge ratio (m/z)
How does every peptide fragment have its own unique weight?
- This is due to the unique amino acid composition.
- Each amino acid has a slightly different mass (apart from leucine and isoleucine, which have the same mass).
- By measuring the mass of the peptide, you limit the possibilities of what a peptide can be.
How accurately does MS measure the mass of peptides?
To 4dp or more
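Worked example (a sketch, assuming standard monoisotopic residue masses rounded to 5dp): the mass of a peptide is the sum of its residue masses plus one water, and the m/z is that mass plus one proton per charge, divided by the charge. The function names are illustrative only.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5dp).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-terminus)
PROTON = 1.00728   # added once per positive charge


def peptide_mass(seq: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def peptide_mz(seq: str, charge: int) -> float:
    """Mass-to-charge ratio of the peptide carrying `charge` protons."""
    return (peptide_mass(seq) + charge * PROTON) / charge


print(round(peptide_mass("GG"), 4))
print(round(peptide_mz("GG", 2), 4))
# Q and K differ by only about 0.036 Da, which is why 4dp accuracy matters.
print(round(MONO["K"] - MONO["Q"], 4))
```

Peptides with the same composition in a different order have the same mass, so MS1 only limits the possibilities. The order comes from MS2.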
What are the main steps of mass spectrometry?
MS1 and MS2
What happens during MS1 of mass spec?
- The peptide enters the orbitrap.
- The orbitrap is a spindle-shaped metal electrode in a vacuum with an electric field.
- The peptide ions oscillate around the electrode and induce a signal whose frequency depends on the mass-to-charge ratio of the peptide (ions with a lower mass-to-charge ratio oscillate faster).
- Each peptide has a slightly different signature due to its amino acid composition.
- The detected frequencies are used to calculate the mass-to-charge ratios and create a spectrum.
What happens during MS2 of mass spec?
- The peptide enters the 2nd chamber.
- The peptide collides with argon gas, which smashes it into short fragments or even single amino acids.
- Then, the mass-to-charge ratio of the fragments is measured.
What information do you get for every peptide that is run in the mass spec?
- The MS1 spectrum, which tells you the mass-to-charge ratio of the whole peptide.
- The MS2 spectrum, which tells you the mass-to-charge ratios of its fragments.
What are the MS1 and MS2 spectra compared against to identify the protein?
- Theoretical spectra of the thousands of proteins that you think are in the sample.
- A computer program calculates the theoretical MS1 and MS2 spectra for every protein you think might be in the sample.
- These are then compared to the real spectra and matched together.
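Worked example (a rough sketch, not the algorithm of any real search engine): a theoretical MS2 spectrum can be approximated by breaking the peptide at every backbone position and calculating the m/z of each piece. This assumes the conventional singly charged "b" (N-terminal) and "y" (C-terminal) fragment series, which these notes do not name, and the matching step is a simple count of peaks within a tolerance.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5dp).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(seq: str) -> tuple[list[float], list[float]]:
    """Singly charged b and y ion m/z values for every backbone break."""
    b_ions = [sum(MONO[aa] for aa in seq[:i]) + PROTON
              for i in range(1, len(seq))]
    y_ions = [sum(MONO[aa] for aa in seq[i:]) + WATER + PROTON
              for i in range(1, len(seq))]
    return b_ions, y_ions


def count_matches(observed: list[float], theoretical: list[float],
                  tol: float = 0.01) -> int:
    """Number of observed peaks that fall within `tol` of a theoretical peak."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


b, y = fragment_ions("GAK")
observed = [58.03, 147.11, 300.0]  # made-up peak list for illustration
print(count_matches(observed, b + y))
```

Real search engines score matches in more sophisticated ways, but the idea is the same: the more observed peaks line up with the theoretical ones, the better the match.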
What is the key limitation of mass spec?
- You provide the list of proteins to the MS.
- This means if you don’t feed it the right proteins or proteins from the right organisms, you won’t get any matches back even if some are there.
- This is similar to a closed system.
- But you can go back and match the spectra to the correct list of proteins.
- This does create a bias
What small changes can throw off protein identification in MS?
- An SNP.
- This happens when the SNP changes an amino acid, which changes the mass of the peptide so it no longer matches the theoretical spectrum.
What else about protein expression needs to be considered during mass spec?
- We are diploid organisms with two copies of every chromosome, so there can be two forms of every protein.
- Which form you see can depend on which copy is being expressed.
How does the mass spec match the protein spectra to the correct protein?
- Every spectrum is run through a search engine to figure out which spectrum is what.
- The MS1 and MS2 spectra are compared to the theoretical spectra from your list of proteins.
- Each comparison is given a PEP score to measure how good the match is.
What is the PEP score?
- PEP = posterior error probability.
- It is the estimated probability that a match between a real and a theoretical spectrum is wrong.
- This is a statistical process.
- The fit between the 2 spectra is rarely perfect and there are always some small differences.
What part of the mass spec measures MS1?
The orbitrap
What part of the mass spec measures MS2?
The linear ion trap.
How are peptide samples fed into mass spec?
- The MS works really fast, and lots of peptides enter MS1 together and are measured at the same time.
- There is a mix of different peptides.
- The 15 peptides with the strongest signal get selected from MS1 to enter MS2 and be fragmented.
Why are less complex samples better for mass spec?
- Less complex samples mean fewer peptides enter MS1 together.
- This means a higher proportion of the peptides enter MS2.
- This increases the coverage of the peptides.
- This continues until all the peptides have entered MS2.
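Worked example (a minimal sketch): picking the top 15 peptides from MS1, and why fewer peptides arriving together means better coverage. The peak list is invented, the function names are illustrative, and peaks are assumed to be simple (m/z, intensity) pairs.

```python
def select_for_ms2(ms1_peaks: list[tuple[float, float]],
                   top_n: int = 15) -> list[tuple[float, float]]:
    """Return the `top_n` (m/z, intensity) peaks with the strongest signal."""
    ranked = sorted(ms1_peaks, key=lambda peak: peak[1], reverse=True)
    return ranked[:top_n]


def ms2_coverage(n_peptides: int, top_n: int = 15) -> float:
    """Fraction of co-arriving peptides that can be sent on to MS2."""
    if n_peptides == 0:
        return 0.0
    return min(n_peptides, top_n) / n_peptides


# Invented MS1 scan with 30 peptides arriving together.
scan = [(400.0 + i, float(i)) for i in range(30)]
print(len(select_for_ms2(scan)))
print(ms2_coverage(30))   # complex sample: only half get to MS2
print(ms2_coverage(10))   # simpler sample: all of them get to MS2
```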
Why are small samples better for mass spec?
- This reduces complexity at MS1.
- This is why you separate proteins by size and charge first.
- This is more expensive though as more MS runs are needed.
How is the MS2 spectrum interpreted to identify amino acids?
- You measure the mass differences between the peaks (don’t need to know how this works).
- But this is where the uncertainty in interpreting the sample comes in.
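Optional worked example (a sketch for the curious, beyond what the notes require): if neighbouring peaks in the MS2 spectrum differ by exactly one amino acid, the gap between them equals that amino acid's residue mass. The peak list is invented and assumed to be a clean ladder of fragments, which real spectra rarely are.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5dp).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_residues(peaks: list[float], tol: float = 0.01) -> list[str]:
    """Name the amino acid(s) whose mass matches each gap between peaks."""
    residues = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        candidates = [aa for aa, mass in MONO.items() if abs(mass - gap) <= tol]
        # "?" marks a gap no single residue explains; "L/I" style output
        # marks residues that cannot be told apart by mass.
        residues.append("/".join(candidates) if candidates else "?")
    return residues


print(read_residues([58.02874, 129.06585, 257.16081]))
print(read_residues([100.0, 213.08406]))
```

Missing peaks, noise and ambiguous gaps are where the uncertainty comes from.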
How do you know the real and theoretical spectra are a good match?
By using decoy databases
What are decoy databases?
- A list of proteins the same size as your real list but it is made up of things that cannot be in your sample.
- This decoy list is normally generated by reversing the sequence of all the proteins in the real list.
- Both the real and decoy databases are searched for hits and if a decoy protein creates a match it is flagged.
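Worked example (a minimal sketch): making a decoy list by reversing every sequence in the real list. The protein names, sequences and the "DECOY_" prefix are invented for illustration.

```python
def make_decoys(targets: dict[str, str]) -> dict[str, str]:
    """Build a decoy list of the same size by reversing each sequence."""
    return {"DECOY_" + name: seq[::-1] for name, seq in targets.items()}


# Invented two-protein "database".
targets = {"PROT_A": "MKTAYIAK", "PROT_B": "MGSSHHR"}
decoys = make_decoys(targets)
print(decoys)
```

The decoys keep the same length and amino acid composition as the real proteins, so they behave like a fair control in the search.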
How are the most likely matches for your real spectra determined?
- The mass spec program generates a list with the best matching hits at the top of the list.
- These good hits will match your list of real human proteins.
- As the matches worsen, you start seeing hits for the decoy proteins.
- This is normally the point at which you disregard the hits.
What is the false discovery rate?
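- The proportion of accepted hits that are expected to be wrong, estimated from how many decoy hits appear above the cutoff.
- The cutoff is normally set at a false discovery rate of 1%, so 99% of the accepted hits are expected to be real.
- This means about 1% of the accepted hits are still expected to be wrong.
- These matches are still correct most of the time.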
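Worked example (a sketch under stated assumptions): walking down the ranked list of hits and keeping everything up to the last point where the false discovery rate is still 1% or less. It assumes a higher score means a better match and estimates the false discovery rate as decoy hits divided by real hits so far. The scores are invented.

```python
def hits_passing_fdr(hits: list[tuple[float, bool]],
                     max_fdr: float = 0.01) -> list[tuple[float, bool]]:
    """Return the real (non-decoy) hits above the score cutoff.

    Each hit is (score, is_decoy).
    """
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    n_real = n_decoy = 0
    cutoff = 0
    for position, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            n_decoy += 1
        else:
            n_real += 1
        # Remember the deepest position where the estimate is still acceptable.
        if n_real and n_decoy / n_real <= max_fdr:
            cutoff = position
    return [hit for hit in ranked[:cutoff] if not hit[1]]


# Invented results: 250 good real hits, then decoys start to appear.
hits = [(1000.0 - i, False) for i in range(250)]
hits += [(10.0, True), (9.0, True), (8.0, True), (7.0, False)]
print(len(hits_passing_fdr(hits)))
```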
What does the decoy list mean?
You can’t just use a massive list of all proteins ever to match your spectra to.
Why can’t you use a massive list of proteins to match your spectra to?
- A list containing lots or all known proteins is impossible to use.
- You would need to make a decoy list for this list of proteins.
- A decoy list this large would contain sequences that match real proteins.
- This means you could not tell whether a hit is for a real protein or for a decoy.
How big is the ideal list of proteins to feed a mass spec?
- We don’t know what the ideal size is.
- We know that a list about the size of the list of all human proteins works well.
- You can’t search against a huge list of proteins, but you also need to search more than just the species you are testing.
Why is SILAC not used much anymore?
- It is expensive
- Newer methods are better
What is SILAC?
- A way to differentiate and compare between 2 samples using 1 mass spec run.
- The cells are grown in specific media where the carbon and nitrogen in arginine and lysine are the heavy isotopes.
- These are carbon 13 and nitrogen 15.
- The proteins from the heavy-labelled sample are chemically identical to those from the unlabelled sample, just slightly heavier.
- Only a mass spec can tell the difference between the 2 samples.
- You can tell which proteins come from the heavy cells and which ones come from the light cells.
Why are the heavy isotopes in SILAC put in the arginine and lysine?
- Trypsin is used to cut up the proteins.
- Trypsin cuts at lysine and arginine.
- Putting the heavy isotopes in these amino acids ensures there is at least 1 heavy amino acid in each peptide (apart from the peptide at the very end of the protein, which may not end in lysine or arginine).
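Worked example (a sketch, assuming every carbon and nitrogen in lysine and arginine is replaced by the heavy isotope): how much heavier a peptide becomes. Lysine has 6 carbons and 2 nitrogens, arginine has 6 carbons and 4 nitrogens. The function name is illustrative.

```python
C13_EXTRA = 1.003355   # mass difference between carbon 13 and carbon 12 (Da)
N15_EXTRA = 0.997035   # mass difference between nitrogen 15 and nitrogen 14 (Da)

# (carbon atoms, nitrogen atoms) in each labelled amino acid.
LABELLED_ATOMS = {"K": (6, 2), "R": (6, 4)}


def heavy_shift(peptide: str) -> float:
    """Extra mass of the heavy version of a peptide compared to the light one."""
    shift = 0.0
    for aa in peptide:
        if aa in LABELLED_ATOMS:
            carbons, nitrogens = LABELLED_ATOMS[aa]
            shift += carbons * C13_EXTRA + nitrogens * N15_EXTRA
    return shift


print(round(heavy_shift("AAK"), 4))   # peptide ending in lysine
print(round(heavy_shift("GGR"), 4))   # peptide ending in arginine
print(heavy_shift("AAA"))             # no K or R, so no shift
```

Because trypsin leaves a K or R at the end of nearly every peptide, nearly every peptide shows up as a light and heavy pair a fixed distance apart.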
How is SILAC used to examine samples of peptides?
- You compare peptides from different samples, such as cells grown under different conditions.
- You can then work out a ratio of the peptides from heavy and light cells.
- You can then tell if there is more protein expression in one of the populations of cells.
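Worked example (a minimal sketch with invented signal intensities): the heavy to light ratio for one peptide, expressed as a log2 value so that increases and decreases are symmetrical around 0.

```python
import math


def silac_log2_ratio(heavy_intensity: float, light_intensity: float) -> float:
    """log2 of heavy/light signal: above 0 means more in the heavy cells."""
    return math.log2(heavy_intensity / light_intensity)


print(silac_log2_ratio(4e6, 1e6))   # 4 times more in the heavy cells
print(silac_log2_ratio(1e6, 1e6))   # no difference
print(silac_log2_ratio(1e6, 4e6))   # 4 times more in the light cells
```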
What can SILAC be used to compare between?
- It can be used to compare the same protein in different samples under different conditions.
- You can’t compare between different proteins.
What is a tandem mass tag?
- It is a chemical tag that can be added to peptides following extraction and digestion to identify which sample they came from.
- It contains a reporter, a balance and a reactive element.
How do tandem mass tags work?
- The reporter and balance together weigh the same in every tag, so the same peptide from different samples has the same weight at MS1.
- When the tags break apart inside the MS, the reporter is released.
- The weight of the reporter varies between tags. This is used to tell the samples apart.
- The reactive part binds to the peptide and is always the same weight.
How many samples can be measured together using tandem mass tags?
16 samples
What are tandem mass tags used to do?
See the relative abundance of each peptide between samples.
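Worked example (a minimal sketch): turning the reporter signals for one peptide into relative abundances across samples. The sample names and intensities are invented.

```python
def relative_abundance(reporter_intensities: dict[str, float]) -> dict[str, float]:
    """Share of the total reporter signal that comes from each sample."""
    total = sum(reporter_intensities.values())
    return {sample: intensity / total
            for sample, intensity in reporter_intensities.items()}


# Invented reporter signals for one peptide measured in 4 tagged samples.
signals = {"sample_1": 100.0, "sample_2": 300.0,
           "sample_3": 400.0, "sample_4": 200.0}
print(relative_abundance(signals))
```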
What needs to be considered when using MS comparisons?
- More than 1 peptide can come from the same protein.
- The same peptide can come from more than 1 protein.
- Not all of the peptides from a protein agree on the ratio of change for that protein.
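Worked example (a sketch of one possible approach, not necessarily what any given software does): combining the ratios of several peptides into one ratio per protein by taking the median, so that one disagreeing peptide does not dominate. Protein names and ratios are invented, and peptides shared between proteins are assumed to have been left out already.

```python
from statistics import median


def protein_ratios(peptide_ratios: list[tuple[str, float]]) -> dict[str, float]:
    """Median peptide ratio for each protein.

    Each entry is (protein name, ratio measured for one of its peptides).
    """
    by_protein: dict[str, list[float]] = {}
    for protein, ratio in peptide_ratios:
        by_protein.setdefault(protein, []).append(ratio)
    return {protein: median(ratios) for protein, ratios in by_protein.items()}


# Invented data: three peptides for PROT_A, one of which disagrees.
measurements = [("PROT_A", 2.0), ("PROT_A", 2.2), ("PROT_A", 8.0),
                ("PROT_B", 0.5)]
print(protein_ratios(measurements))
```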
What is the simplest comparison to make using an MS?
- To compare proteins from infected and uninfected cells.
- This kind of experiment will generate data on 5000-6000 different proteins.
- This data can be hard to interpret.
- You need to consider confidence of the hits, doing biological repeats or combining different data sets.
What is human cytomegalovirus?
- A herpesvirus that infects 60-90% of people.
- In most healthy humans, the virus persists in a latent state in a variety of myeloid cells.
- Occasional reactivation is controlled by a healthy immune response.
- Immune-compromised patients are at risk.
- Using proteomics to find latently infected cells can be very useful, but how do we do this?
What can high throughput quantitative proteomics be used to do?
Look at problems in virology
How was high throughput proteomics used to identify HCMV latently infected cells?
- Surface proteins from infected and uninfected cells were labelled and compared.
- Mass spec found a difference in the protein expression on the cell surface.
- This gave a protein target.
- This was an unbiased survey.
What is a pull down experiment?
- An immunoprecipitation experiment that determines interactions between 2 or more proteins.
- Specific proteins are isolated using an affinity ligand.
How is proteomics used in pull down experiments?
- Use quantitative proteomics.
- This can tell you if the pulled down protein is upregulated in certain conditions/cells.
What can MS proteomics data be combined with?
RNAseq transcriptomics data
What are proteomics and transcriptomics used to investigate?
- Situations where changes in the level of protein do not follow changes in the level of transcription.
- This can indicate what proteins have critical roles in disrupting viral infection.
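Worked example (a minimal sketch with invented numbers): flagging proteins whose level changes a lot while their transcript level barely moves. The fold changes are log2 values, the gene names other than MRE11 are placeholders, and the threshold of 1 (a 2-fold change) is an arbitrary choice for illustration.

```python
def protein_only_changes(changes: dict[str, tuple[float, float]],
                         threshold: float = 1.0) -> list[str]:
    """Names where the protein changes but the transcript does not.

    Each value is (log2 RNA fold change, log2 protein fold change).
    """
    return [name for name, (rna_change, protein_change) in changes.items()
            if abs(rna_change) < threshold and abs(protein_change) >= threshold]


# Invented values for illustration only.
changes = {
    "MRE11": (0.1, -3.0),    # transcript steady, protein lost
    "GENE_X": (2.0, 2.1),    # protein follows transcript
    "GENE_Y": (0.0, 0.2),    # nothing changes
}
print(protein_only_changes(changes))
```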
Why do some viruses activate the ubiquitin ligase system?
- Because it is faster to degrade proteins than to switch off gene transcription.
- This is because many proteins have long half-lives.
What did this combined approach show happened to MRE11 in adenovirus infection?
- MRE11 seemed to disappear in viral infection without a change in transcription.
- MRE11 plays a critical role in dsDNA break repair and needs to be knocked out for adenovirus to replicate its linear genome.
- The adenovirus linear genome looks like dsDNA breaks.
- Transcriptomics and proteomics showed that adenovirus redirects the ubiquitin ligase system to degrade MRE11.
How can de novo assembly from transcriptomic data be used in proteomics?
- De novo assembly can be used to infer which proteins should be present in the sample.
- This is a good way to generate the reference list for mass spec.
- You don’t need to do whole genome sequencing.
- It means the reference lists are specific to the sample so all SNPs are accounted for.
What did using de novo assembly and proteomics discover about adenovirus?
It was making an extra protein that no one had found before.
How were transcriptomics and proteomics used in SARS-CoV-2?
- The sequence data of SARS-CoV-2 was analysed at the start of 2020.
- 2 variants were found.
- 1 variant had a deletion in the spike mRNA, so some amino acids were missing from the spike protein.
- It was predicted this deletion would produce different spike peptides and this was confirmed by MS/MS.
- This deletion was shown to be important in SARS-CoV-2 pathogenesis.