Machine-Learning Guided Protein Engineering Flashcards
1
Q
Explain the “protein landscape” concept, using a protein of 12 amino acids (aas) as an example.
A
- Protein sequence space represents all possible sequences for a protein or gene.
- Sequence space has one dimension per amino acid position in the sequence -> high-dimensional space
- Directed evolution (DE) is visualized as a series of steps across a 3D fitness landscape
- 20^n unique variants of the protein, where n = number of amino acids in the chain; for n = 12, 20^12 ≈ 4x10^15 variants
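The 20^n count can be checked with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and the neighbor count (sequences one substitution away) is added only to show how small a single DE step is compared with the whole space.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical amino acids


def sequence_space_size(length: int, alphabet_size: int = len(AMINO_ACIDS)) -> int:
    """Number of distinct sequences of the given length (alphabet_size ** length)."""
    return alphabet_size ** length


def single_mutant_neighbors(length: int, alphabet_size: int = len(AMINO_ACIDS)) -> int:
    """Number of sequences that differ from a given one at exactly one position."""
    return length * (alphabet_size - 1)


size = sequence_space_size(12)
print(f"{size:,}")    # 4,096,000,000,000,000
print(f"{size:.2e}")  # 4.10e+15
print(single_mutant_neighbors(12))  # 228
```

So a 12-aa protein has about 4x10^15 possible sequences, while any one sequence has only 228 single-substitution neighbors.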
2
Q
Explain deep mutational learning (DML).
A
- Predicts how multiple mutations in the receptor-binding domain (RBD) of SARS-CoV-2 will impact RBD function (ACE2 binding or antibody (Ab) escape)
- Combinatorial mutation libraries are generated via yeast display
- ACE2 binding population is identified via flow cytometry
- Categorize variants into a population that escapes Ab and one that doesn’t
- Deep sequence the selected variants
- Feed the sequence and function information for each variant to the model
- This generates data on mutated positions and their function landscapes
- The model predicts the probability of binding to ACE2 (and predicts escape) for synthetic and natural lineages and possible future variants
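The "sequence + function in, binding probability out" step can be sketched in plain Python. Everything here is an assumption for illustration: the one-hot encoding, the logistic regression (a simple stand-in, not the model used in DML), the function names, and the 3-aa toy library, whose labels are made up (binder = 1 only when position 1 is K) and are not real RBD data.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(seq):
    """Flatten a sequence into a binary vector of length len(seq) * 20."""
    vec = [0.0] * (len(seq) * len(AMINO_ACIDS))
    for pos, aa in enumerate(seq):
        vec[pos * len(AMINO_ACIDS) + AA_INDEX[aa]] = 1.0
    return vec


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, epochs=500, lr=0.5):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    n, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_binding(seq, weights, bias):
    """Predicted probability that a variant falls in the binding population."""
    x = one_hot(seq)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Made-up toy library standing in for sorted + deep-sequenced variants.
toy_library = [("KAG", 1), ("KCG", 1), ("KAD", 1), ("KCD", 1),
               ("EAG", 0), ("ECG", 0), ("EAD", 0), ("ECD", 0)]
X = [one_hot(seq) for seq, _ in toy_library]
y = [label for _, label in toy_library]
weights, bias = train_logistic(X, y)

# Score a variant that was never in the library.
print(predict_binding("KCA", weights, bias) > 0.5)  # True
```

The point of the sketch is the data flow: sorted populations give labels, deep sequencing gives sequences, and the trained model then scores variants that were never measured.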