Exam III Flashcards

Question

Electron Beams

Answer 1

- Electron beams cast molecular shadows - Electron microscope shoots beams of e- through the sample - Different shadows, depending on the orientation of the molecule

Answer 2

- Images of the shadows are noisy - Reduce image noise by AVERAGING -- Align similar images -- Average multiple aligned images
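
A minimal sketch of why averaging aligned images reduces noise. The 1D "shadow", the Gaussian noise level, and the function names are all hypothetical, chosen only for illustration; real cryo-EM processing also has to align the images first.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical 1D "shadow": the true signal we are trying to recover.
true_signal = [0.0, 1.0, 2.0, 1.0, 0.0]

def noisy_copy(signal, sigma=1.0):
    """Return the signal with independent Gaussian noise added to each pixel."""
    return [value + random.gauss(0.0, sigma) for value in signal]

def average_images(images):
    """Average already-aligned images pixel by pixel."""
    return [statistics.fmean(pixels) for pixels in zip(*images)]

def rms_error(image, reference):
    """Root-mean-square difference between an image and the reference."""
    squared = sum((a - b) ** 2 for a, b in zip(image, reference))
    return (squared / len(reference)) ** 0.5

single = noisy_copy(true_signal)
averaged = average_images([noisy_copy(true_signal) for _ in range(400)])

print(rms_error(single, true_signal))    # error of one noisy image
print(rms_error(averaged, true_signal))  # error after averaging 400 images
```

With independent noise, the error of the average is expected to shrink roughly with the square root of the number of images averaged.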

Answer 3

Then construct 3D images from all the averaged (smooth) 2D shadows. Make the final model by positioning atoms within the 3D density.

Answer 4

1. line 2. licorice style 3. van der Waals spheres 4. ribbon/cartoon 5. surface models

Answer 5

- Bonded atoms are connected by simple lines - Not easy to understand protein structure; pile of lines

Answer 6

- Highlights molecular bonds - Sticks (cylinders) connect bonded atoms instead of lines

Answer 7

Model atomic sizes; atoms are represented by solid spheres

Answer 8

- Reveal protein backbone (alpha helices/beta sheets) - Interpolates the positions of the alpha carbons

Answer 9

Reveal protein accessibility SAS: solvent accessible surface SES: solvent excluded surface

Answer 10

1. Use UniProt to get protein sequence 2. Search PDB for similar proteins 3. Limit search to ones with non-polymer ligands 4. Now you can see if small-molecules (drugs) bind to related proteins

Answer 11

- Sequence alignment matches similar sequence regions -- Adds gaps to amino-acid (or nucleotide) sequences so that similar regions line up - Aligning sequences can show structural, functional, and evolutionary relationships between proteins -- Not all AA mismatches are equally bad (same charge/type)

Answer 12

Improved alignment accuracy - Penalize different substitutions based on how unlikely they are - Penalty changes based on how bad the substitution would be for the protein

Answer 13

Performs reliable MSA → gets percent identity between 2 protein sequences * = identical : = similar . = kinda similar
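
A small sketch of how percent identity can be computed from two already-aligned sequences. The sequences are made up, and the denominator (columns where neither sequence has a gap) is an assumption: alignment tools use several different conventions, so this may not match a given server's reported value. Only the `*` (identical) marker is reproduced here, because the `:` and `.` markers depend on residue similarity groups.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0

def identity_line(aligned_a, aligned_b, gap="-"):
    """Mark identical columns with '*' and leave every other column blank."""
    return "".join(
        "*" if a == b and a != gap else " "
        for a, b in zip(aligned_a, aligned_b)
    )

# Hypothetical aligned sequences (the '-' is a gap added by the aligner).
seq_a = "MKT-AYIAK"
seq_b = "MKSQAYLAK"

print(seq_a)
print(seq_b)
print(identity_line(seq_a, seq_b))
print(percent_identity(seq_a, seq_b))  # 6 identical out of 8 compared columns
```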

Answer 14

Reveal 3D similarities and align based on shape 1. Start with seq alignment to identify equivalent amino acids 2. Rotate/translate one protein until it's superimposed on the other 3. Measure the distance between the 2 proteins to judge how similar they are

Answer 15

- Calculates 3D distances between atoms (not if a mismatch) - Looks at flexibility between proteins and deviations in atom positions from REFERENCE structure - To overlap proteins – minimize the RMSD as much as possible
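
A minimal RMSD sketch, assuming the two structures are already superimposed and that atom i in one list is equivalent to atom i in the other. The coordinates are invented; the rotation/translation step that minimizes RMSD is not shown.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between equivalent atoms (coordinates in angstroms)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical reference structure and a copy shifted by 1 angstrom along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

print(rmsd(reference, shifted))  # every atom moved 1 angstrom, so the RMSD is 1.0
```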

Answer 16

- Proteins with similar sequences often have similar structures and functions → but this is not true 100% of the time - Helpful to search by structural similarity alone (low RMSD) - Can rapidly search for experimental and predicted structures similar to query protein

Answer 17

- Represent local protein shapes as 3Di letters. -- Encode for how a part of the protein interacts with its neighbors in 3D space (geometry barcode) - Quickly discard proteins with no small matching 3Di chunks in common (no chance of them being similar) - For remaining, perform more expensive alignment of 3Di letters (similar to traditional sequence alignment) - Rank remaining “hits” using more traditional/expensive structural metrics (not RMSD, but conceptually similar; overall fold similarity and local structural similarity)

Answer 18

Predict protein structures Goal: make a reliable protein model from any amino-acid sequence -- Human genome only encodes ~75,000+ proteins -- There are at most only several thousand unique protein folds Modeling builds on solved protein structures -- Solve enough structures so we can model the rest -- Number that needs to be solved depends on our ability to model Highly similar sequences have similar structures -- Proteins with homologous sequences (>30% sequence identity) tend to have similar structures -- 25% of known protein sequences are homologous to other sequences

Answer 19

1. Get protein seq on UniProt 2. Identify homologous sequences in the PDB 3. Align query sequence with homologues 4. Find structurally conserved regions 5. Identify structurally variable regions 6. Generate coordinates for conserved regions -- Identical AA: transfer all atom coordinates (XYZ) to query protein -- Similar AA: transfer backbone coordinates and replace with side chain atoms -- Different AA: transfer only the backbone coordinates (XYZ) to query sequence 7. Generate coordinates for variable regions 8. Add side chains 9. Refine structure 10. Validate structure

Answer 20

Conservation suggests structural roles 1. High sequence conservation -- Tend to be stable, at protein’s core -- Secondary structures (helices, sheets, etc.) 2. Low sequence conservation -- Tend to be least stable, most flexible, on protein’s surface -- Often loops and turns

Answer 21

Revolutionized protein structure prediction; very accurate even without a good template protein

Answer 22

1. MSA of related proteins to identify amino acids that tend to evolve together 2. Co-evolving residues probably interacting…try to guess at those interactions 3. Also tries to predict residue-residue distance, trained on known structures 4. Final model subjected to minimization (not ML, but force field)

Answer 23

- Automates homology modeling - Automated, online server for homology modeling (traditional method) ***Not necessarily the best, but probably the easiest

Answer 24

Must be disease-related and specific 1. Related to a disease 2. Essential 3. Specific pocket – not a pocket that binds a common metabolite (many side effects)

Answer 25

- Disease-associated mutations reveal druggable proteins

Answer 26

- Helps identify drug targets for neglected disease research - Focuses on neglected diseases (bacterial and eukaryotic pathogens)

Answer 27

- strategy for assessing druggability - The protein is a member of a protein family that contains other druggable proteins -- protein family: proteins with similar sequences, structures, and functions FLAWED → -- Biased towards proteins that have been previously drugged -- Not always true that all members of the same family are equally druggable

Answer 28

- Organizes proteins by structural relationships - Structural classification of proteins database to see evolutionary relationships 1. Class: fold type 2. fold 3. superfamily 4. family 5. protein domain 6. species

Answer 29

- groups proteins by structural features 1. Class: Fold type (e.g., beta sheets; same as SCOP) 2. Architecture: Structurally similar, but no evidence of homology (same as SCOP fold) 3. Topology/fold: Group by structural features. 4. Homologous superfamily: Distant common ancestor (same as SCOP superfamily)

Answer 30

- Reveal protein sites amenable to high affinity binding - Binding fragments tend to cluster there - Good for identifying protein where ligands/drugs/chemical probes might bind

Answer 31

- Usually cavities ⇒ best for small-molecule binding Protein-protein interactions: two large, flat surfaces -- Have a reliable way to design small molecules that disrupt those interactions Binding pockets need specific features for druggability -- Cavities with features… -- Hydrogen-bonding opportunities -- Electrostatic interactions Greasy pocket → hydrophobic interactions are often important but non-specific

Answer 32

Experimental method that identifies druggable hot spots 1. Soak a protein in an aqueous solution of ~6 organic probes 2. Aligning the structures resolved via X-ray crystallography 3. Identifying regions where probes tend to congregate

Answer 33

- Detects ligand (fragment) binding through spectral shifts - When a ligand binds a protein, the protein atoms it touches find themselves in a different environment - Causes a detectable shift in the NMR spectra of those atoms - Chemical shifts change with increasing ligand concentration

Answer 34

- Identifies potential binding hotspots computationally - A computational method that is faster and easier - FTMap virtually floods protein models with chemically diverse, small organic probes (a kind of docking) - Protein regions where organic probes consistently congregate are often druggable - The FTMap web server provides easy access to druggability tools

Answer 35

- Provides a tool for detecting druggable protein pockets - Based on the fpocket program for druggable-pocket detection -- Fpocket is a command-line program

Answer 36

- occur on distinct time scales

Answer 37

- Reveals information about protein flexibility - NMR is powerful, but it is also: -- Time and resource intensive -- Sometimes difficult to perform -- Limited in applicability (e.g. small proteins)

Answer 38

- Ligand binding can alter dynamics - Proteins resolved with different bound ligands (or no ligand) can have different shapes - Crystallography provides only limited info about dynamics

Answer 39

- Reveal dynamic molecular details - Simulations: Very detailed -- Down to femtosecond time resolution (one quadrillionth, or 1e-15, of a second) -- Down to angstrom spatial resolution (1e-10 of a meter, or a tenth of a nanometer) -- “Single molecule experiment”

Answer 40

- determine simulated motions - forces that act on the atoms - force field: energy functions and parameters

Answer 41

forces that act on the atoms - bond stretching - angle bending - van der Waals - bond rotations (torsions) - electrostatics
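
A sketch of the energy terms behind those forces, using common textbook functional forms. These are assumptions for illustration: individual force fields differ in prefactors, exact forms, and parameter values, and every number below is invented. The Coulomb conversion factor is approximate.

```python
import math

# Approximate conversion factor giving kcal/mol for charges in units of e
# and distances in angstroms (force fields use slightly different values).
COULOMB_CONSTANT = 332.06

def bond_energy(r, r0, k):
    """Harmonic bond stretching around the equilibrium length r0."""
    return k * (r - r0) ** 2

def angle_energy(theta, theta0, k):
    """Harmonic angle bending around the equilibrium angle theta0 (radians)."""
    return k * (theta - theta0) ** 2

def torsion_energy(phi, barrier, periodicity, phase):
    """Periodic torsion (bond rotation) term; angles in radians."""
    return 0.5 * barrier * (1.0 + math.cos(periodicity * phi - phase))

def lennard_jones(r, epsilon, sigma):
    """van der Waals term: short-range repulsion plus longer-range attraction."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)

def coulomb(r, q1, q2):
    """Electrostatic energy between two point charges."""
    return COULOMB_CONSTANT * q1 * q2 / r

# Invented parameters, only to show how the terms behave.
print(bond_energy(1.60, 1.53, 300.0))               # stretched bond costs energy
print(lennard_jones(2 ** (1 / 6) * 3.4, 0.2, 3.4))  # minimum of the well, -epsilon
print(coulomb(3.0, 1.0, -1.0))                      # opposite charges attract
```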

Answer 42

- There is no bond breaking and formation. - So can’t model catalysis, for example - Would require quantum-mechanics calculations

Answer 43

- parameters based on small molecules - Spectroscopy data -- Bond stretching parameters -- Angle bending parameters - High-level QM calculations -- Atomic charges -- Dihedral parameters

Answer 44

- Crystal structures: usually no hydrogen atoms - Where to add hydrogen atoms depends on the pH - need to optimize h-bond network

Answer 45

- at any pH in the body - Arginine and lysine are protonated - Histidines are wild card (at 7.4, both protonated and neutral forms present)

Answer 46

- Add counter ions (NaCl) -- To neutralize system electrically -- To simulate physiological concentrations (150 mM)
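
A back-of-the-envelope sketch of how many ions that implies for a cubic water box. It assumes a monovalent salt and uses the whole box volume (ignoring the space the protein occupies); the function names and the 80 angstrom box are hypothetical, and real setup tools may count differently.

```python
AVOGADRO = 6.02214076e23
LITERS_PER_CUBIC_ANGSTROM = 1e-27  # 1 cubic angstrom = 1e-30 m^3 = 1e-27 L

def salt_pairs(box_edge_angstrom, concentration_molar=0.15):
    """Number of Na+/Cl- pairs needed to reach the target concentration."""
    volume_liters = box_edge_angstrom ** 3 * LITERS_PER_CUBIC_ANGSTROM
    return round(concentration_molar * volume_liters * AVOGADRO)

def ion_counts(protein_charge, box_edge_angstrom, concentration_molar=0.15):
    """Return (n_sodium, n_chloride): salt pairs plus neutralizing counter ions."""
    pairs = salt_pairs(box_edge_angstrom, concentration_molar)
    n_sodium = pairs + max(0, -protein_charge)   # extra Na+ offsets a negative protein
    n_chloride = pairs + max(0, protein_charge)  # extra Cl- offsets a positive protein
    return n_sodium, n_chloride

# Hypothetical system: protein with net charge -5 in an 80 angstrom cubic box.
print(salt_pairs(80.0))      # about 46 pairs for 150 mM
print(ion_counts(-5, 80.0))  # (51, 46): the net charge of the system is zero
```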

Answer 47

Immerse the protein in a box of (explicit) water molecules -- use periodic boundary conditions

Answer 48

1. Simulations with explicit waters are more computationally intensive, but generally more accurate 2. Simulations with implicit waters are faster, but less accurate

Answer 49

- PDB files just include coordinates, atom names, etc - Nothing about the stiffness of the bonds, the partial atomic charges, etc. - You must parameterize the structure according to the selected force field.

Answer 50

Molecular dynamics simulates atomic motion over time 1. initial atomic model 2. calculate molecular forces acting on each atom 3. move each atom according to those forces 4. advance simulation time by 1 or 2 fs
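
The loop above can be sketched with a toy system: one particle on a spring in one dimension, in arbitrary units. The velocity Verlet integrator used here is one common choice and is an assumption, not necessarily the scheme a given MD package uses; the spring force stands in for a real force field.

```python
def force(x, k=1.0):
    """Toy 'force field': a harmonic spring pulling the particle toward x = 0."""
    return -k * x

def run_md(x, v, mass=1.0, dt=0.01, n_steps=1000):
    """Propagate position and velocity with the velocity Verlet scheme."""
    trajectory = [x]                      # step 1: initial model
    f = force(x)                          # step 2: forces on the particle
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # step 3: move according to the forces
        f = force(x)                      # recompute forces at the new position
        v = v_half + 0.5 * dt * f / mass  # finish the velocity update
        trajectory.append(x)              # step 4: time has advanced by dt
    return trajectory, v

def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

trajectory, final_velocity = run_md(x=1.0, v=0.0)
print(total_energy(1.0, 0.0))                         # starting energy
print(total_energy(trajectory[-1], final_velocity))   # should stay nearly the same
```

A near-constant total energy is a quick sanity check that the time step is small enough for the forces involved.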

Answer 51

- E minimization refines molecular structures - Simulations explore diverse energy landscapes - MD simulations not only produce energy minima, but various higher energy conformations

Answer 52

Simplifies simulations Assumption: because of high friction, average acceleration of molecule is very small -- Constantly crashing into water molecules, so reasonable in many cases -- Note: does not mean velocity is 0 Total acceleration (0) is assumed to be a function of: -- Forces acting on atoms (F_i) -- Drag of water (γ_i v_i m_i; no explicit water molecules) -- A random force (R_i) caused by Brownian motion * can solve for the velocity of molecules to predict molecular motions -- rigid body physics -- excluded volumes -- electrostatics

Answer 53

- Molecules (including proteins) bounce around a lot in solution. -- MD captures these movements. -- It’s useful to align each frame (conformation) of the simulation to a single standard (usually the first frame) - Translate and rotate each frame so as to minimize the RMSD

Answer 54

- useful for monitoring protein dynamics - Distance measurements to monitor pocket opening and closing - Using distance to monitor electrostatic interactions.

Answer 55

“Floppiness” of the atoms you’re analyzing. -- flexibility of specific atom positions -- deviation from a reference, averaged over time -- highlights dynamic regions The higher the RMSF ⇒ the more flexibility or floppiness of the particle

Answer 56

RMSD is the deviation from a reference, “averaged” over the atoms. (distance differences of structure) RMSF is the deviation from a reference, averaged over time. (flexibility of atoms)

Answer 57

average each of the coordinates, so that the average location for the points (1,3) and (3,5) ⇒ (2,4)
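
A small sketch of that averaging, and of the RMSF that is measured against the resulting mean position. It follows one atom across frames, assumes the frames are already aligned, and reuses the two 2D points from the card; the function names are hypothetical.

```python
import math

def mean_position(positions):
    """Average each coordinate of one atom over all frames."""
    n_frames = len(positions)
    n_dims = len(positions[0])
    return tuple(sum(p[d] for p in positions) / n_frames for d in range(n_dims))

def rmsf(positions):
    """Root-mean-square fluctuation of one atom around its mean position."""
    mean = mean_position(positions)
    squared = [sum((c - m) ** 2 for c, m in zip(p, mean)) for p in positions]
    return math.sqrt(sum(squared) / len(squared))

# One atom observed in two frames.
frames_for_atom = [(1.0, 3.0), (3.0, 5.0)]

print(mean_position(frames_for_atom))  # (2.0, 4.0)
print(rmsf(frames_for_atom))           # sqrt(2): each frame sits sqrt(2) from the mean
```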

Answer 58

“A statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.” - Simplifies molecular-motion data - describes major variances - visualizes structural variations - captures atomic positional variance

Answer 59

Measure structural changes and fluctuations RMSD: how different is a protein’s shape (conformation) relative to some reference shape? RMSF: On average, how much does a given atom move relative to its mean position over the course of a simulation?

Answer 60

Imagine calculating PCA in 3N dimensional space (N = number of atoms in the protein). For each conformation (frame) in the simulation, calculate the first two principal components. Plot all those, and bin them into a 2D histogram… -- During simulations, regions of conformational space near energy minima are more heavily sampled. So this histogram says something about the energetic landscape of our protein.

Answer 61

- Simplify multidimensional data - reduces complexity while retaining key information (dimensionality reduction) --- lower order components explain most of the data 1. First principal component: the line that passes through your data points in the direction they are most spread out, so you can see the overall pattern clearly. 2. Second component: a line that best captures the remaining pattern, rotated 90 degrees from the first component.
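
A toy 2D sketch of the two components described above. For two dimensions the principal axes follow from the covariance matrix in closed form, which keeps the example dependency-free; a real trajectory lives in 3N dimensions and would normally be handled with a linear-algebra library. The data points are invented.

```python
import math

def principal_components_2d(points):
    """Return (pc1, pc2, var1, var2) for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Angle of the direction along which the data are most spread out.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    pc1 = (c, s)    # first component: direction of largest variance
    pc2 = (-s, c)   # second component: rotated 90 degrees from the first
    var1 = sxx * c * c + 2.0 * sxy * s * c + syy * s * s
    var2 = sxx * s * s - 2.0 * sxy * s * c + syy * c * c
    return pc1, pc2, var1, var2

# Invented data lying close to the line y = x.
data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
pc1, pc2, var1, var2 = principal_components_2d(data)

print(pc1)                    # points roughly along the diagonal
print(var1 / (var1 + var2))   # fraction of variance captured by the first component
```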

Answer 62

- Molecular dynamics captures multiple protein conformations - Experimental methods for resolving protein structures reveal only limited (discrete) conformations. - Molecular dynamics simulations can sample conformational space more continuously. *** Clustering identifies representative conformations - From all the many conformations sampled over the course of an MD trajectory, which ones are most representative? (The “ensemble.”)

Answer 63

1. For each protein conformation in your simulation trajectory, calculate the set of associated “nearest neighbors.” Near here means within a user-specified RMSD of each other. 2. Are there any conformations that have no near neighbors? These are “singletons.” Remove them from the pool of conformations. 3. Which conformation has the largest set of nearest neighbors? Remove that conformation and its neighbors from the pool, but remember that particularly popular conformation (a “centroid”). 4. Repeat step 3 until there are no remaining conformations in the pool. ***The set of centroids is your representative conformational ensemble.
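
The steps above can be sketched as follows. To stay short, each "conformation" is a single made-up number and the absolute difference stands in for the pairwise RMSD; the function name, the tie-breaking rule (lowest index wins), and recounting neighbors among conformations still in the pool are assumptions of this sketch.

```python
def cluster_conformations(n_items, distance, cutoff):
    """Greedy neighbor-count clustering; returns the indices of the centroids."""
    # Step 1: neighbor set of every conformation (not counting itself).
    neighbors = {
        i: {j for j in range(n_items) if j != i and distance(i, j) <= cutoff}
        for i in range(n_items)
    }
    # Step 2: drop singletons, i.e. conformations with no near neighbors.
    pool = {i for i in range(n_items) if neighbors[i]}
    centroids = []
    # Steps 3 and 4: repeatedly pick the most popular remaining conformation.
    while pool:
        best = max(sorted(pool), key=lambda i: len(neighbors[i] & pool))
        centroids.append(best)
        pool -= neighbors[best] | {best}  # remove the centroid and its neighbors
    return centroids

# Hypothetical 1D stand-ins for conformations: two tight groups and one outlier.
values = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]

def toy_distance(i, j):
    """Stand-in for the RMSD between conformations i and j."""
    return abs(values[i] - values[j])

centroids = cluster_conformations(len(values), toy_distance, cutoff=0.25)
print(centroids)  # one centroid per group; the outlier was removed as a singleton
```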

Answer 64

Some structural analyses are too computationally intense to apply to every conformation. Computer docking (future lecture) is a good example.

Answer 65

- Require dynamic identification methods **Only apparent when a drug binds - Cannot see them in crystal structures otherwise - Find cryptic binding pockets using molecular dynamics simulations - Can identify cryptic pockets without having to resolve crystal structures with pocket-opening ligands. - Druggability simulations can identify cryptic allosteric pockets

Answer 66

Perform simulations of proteins in the presence of many, many small organic probes. Identify “interaction spots”

Answer 67

- Molecular docking predicts ligand binding poses and affinities (preview) - Predicting small-molecule/target binding in silico. 1. Ligand pose prediction (“docking”) 2. Affinity prediction (“scoring”): maps binding geometry to a score that is correlated with affinity

Answer 68

Clustering: Extract diverse receptor conformations from the simulation. Accounting for receptor flexibility can improve docking accuracy

Answer 69

1. Dock each of the library molecules into each of the receptor conformations. 2. Each small molecule maps to a whole spectrum (ensemble) of docking scores. 3. Library compounds are ranked by some ensemble based metric
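
A sketch of step 3, ranking compounds by an ensemble-based metric. The compound names and scores are invented, and the sketch assumes that more negative docking scores are better. Two possible metrics are shown (best score and mean score across receptor conformations); which one to use is a judgment call.

```python
import statistics

# Hypothetical docking scores: one score per receptor conformation.
docking_scores = {
    "compound_A": [-7.1, -8.4, -6.9],
    "compound_B": [-9.2, -5.0, -5.5],
    "compound_C": [-6.0, -6.2, -6.1],
}

def rank_by_ensemble_metric(scores, metric=min):
    """Rank compounds from best to worst by a metric over their score ensemble."""
    return sorted(scores, key=lambda name: metric(scores[name]))

by_best = rank_by_ensemble_metric(docking_scores, metric=min)
by_mean = rank_by_ensemble_metric(docking_scores, metric=statistics.fmean)

print(by_best)  # compound_B leads: it has the single best score
print(by_mean)  # compound_A leads: it scores well across all conformations
```

The two rankings disagree here, which shows why the choice of ensemble metric matters.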

Answer 70

Integrated MD → clustering → docking workflow identifies potent inhibitors *** Key interactions are predicted to stabilize ligand binding

Answer 71

Binding energy: how strong (or how tightly) the ligand binds the protein - Drugs that bind tightly can better compete with natural molecules in the cell that might bind at the same location

Answer 72

State 1: A protein and ligand are floating in solution far apart from each other. Don’t even “feel” each other’s presence. State 2: The ligand is bound to the protein and so forms many molecular interactions with the binding pocket. Binding energy = the difference between these states.

Answer 73

- evaluate molecular binding computationally 1. Ligand pose prediction (“docking”) 2. Binding-energy prediction (“scoring”): binding geometry → score that correlates with energy -- Main advantage here is speed, but at the expense of accuracy

Answer 74

-- differ in accuracy and speed - Force-field scoring functions. - Non-bonded interactions between the protein and ligand - Pose-strain energies between the bonded atoms of ligand (some methods). - Implicit solvent (some methods).

Answer 75

- predicts binding with weights 1. Count the number of predicted interactions between the protein and ligand. 2. Combine those counts into a score, weighting each of the counts to give the predictions that best match experiment (regression, training)

Answer 76

- databases reveal patterns in molecular interactions 1. Look at large databases of protein-ligand complexes. 2. How often do certain atoms on the ligand come within certain atoms on the receptor? 3. If atoms are close to each other more than you’d expect given random chance, they probably participate in energetically favorable interactions.

Answer 77

- identify hidden patterns - In trying to find patterns in data, the creator imposes no preconceived assumptions - The program itself finds these patterns. *** improves binding predictions

Answer 78

- mimic biological processing - relies on connections and weights of neurons (determining the strength of connections) - Artificial neural networks process data through layered structures

Answer 79

- starts with encoding data in the input layer 1. Encode info about the protein-ligand pose 2. Systematically adjust the strength of the connections in the hidden layer (learns from systematic adjustments) 3. The output layer encodes the correct binding energy *** most neural networks are much more complicated than this

Answer 80

The goal is to predict experimental binding affinities from 3D ligand poses Need a vast database of protein-ligand structures, with thousands of associated experimentally determined binding affinities

Answer 81

Simulations can increase the accuracy of binding energy estimates Much more computationally intensive than scoring functions, BUT can be more accurate *** binding free energies can be directly simulated

Answer 82

- Energy differences depend on molecular probabilities - Molecules spend more time in energetically favorable states. ***Boltzmann equation

Answer 83

Given the probabilities that your system is in one state or another (e.g., ligand-bound and unbound states), you can calculate the energy difference between the two states: *** energy difference between 2 equally probable states (probability, temp, boltzmann's constant)
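
A sketch of that calculation, assuming the standard Boltzmann relation: the energy difference equals minus kT times the natural log of the probability ratio. The molar gas constant in kcal/(mol K) is used in place of Boltzmann's constant so the result comes out per mole; the probabilities and function name are invented.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K); molar form of Boltzmann's constant

def energy_difference(p_a, p_b, temperature=298.15):
    """Energy of state b minus energy of state a, from their probabilities."""
    return -GAS_CONSTANT * temperature * math.log(p_b / p_a)

# Hypothetical populations: unbound 10% of the time, bound 90% of the time.
print(energy_difference(0.1, 0.9))  # negative: the bound state is lower in energy
print(energy_difference(0.5, 0.5))  # equally probable states: no energy difference
print(energy_difference(1.0, 10.0)) # a 10x population ratio is about -1.36 kcal/mol
```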

Answer 84

State function: the path taken from point A to point B does not impact the “state” at those 2 points - simplify energy challenges - dependent only on end points *** (Binding) free energy is a state function

Answer 85

- depend on route taken

Answer 86

- inspired early advancements in chemistry Middle ages: Transmuting “base metals” such as lead into gold. ***In some ways, the precursor to early chemistry and medicine. Since binding energy is a state function, it doesn’t matter how the ligand gets into the pocket

Answer 87

- accelerates energy calculations - Instead of one (very long) binding simulation… -- Two disappearing (alchemical) simulations. -- “Disappearing” means slowly turning down the electrostatic and van der Waals forces. ** relative free energy calculations: even small changes impact/improve affinity *** ghost part of the ligand to guide chemical optimization

Answer 88

- Computationally intensive - Molecular dynamics force fields are not perfect - Have you simulated the different states for long enough to sample all the major conformations? - Have you disappeared your molecules slowly enough?

Answer 89

- predicts molecular interactions - predicts molecular recognition in silico -- ligand pose prediction -- affinity prediction by mapping binding geometry to a score that is correlated with affinity *** known binding sites make docking easier

Answer 90

1. local: known pocket -- find position of ligand in binding site 2. global: no known pocket -- more difficult bc need to search for the binding site as well as the position of ligand in the binding site

Answer 91

1. Receiver operating characteristic (ROC) curves 2. Enrichment factors

Answer 92

ROC AUC (area under curve) meaning ⇒ The area under this curve is the probability that a randomly picked active will rank better than a randomly picked inactive (or decoy) molecule.
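
That probabilistic reading can be computed directly by comparing every active with every decoy. The scores are invented, and the sketch assumes a higher score means a better predicted binder (flip the comparison for docking scores where lower is better). Ties count as half a win, a common convention.

```python
def roc_auc(active_scores, decoy_scores):
    """Probability that a random active outranks a random decoy."""
    wins = 0.0
    for active in active_scores:
        for decoy in decoy_scores:
            if active > decoy:
                wins += 1.0
            elif active == decoy:
                wins += 0.5  # ties count as half a win
    return wins / (len(active_scores) * len(decoy_scores))

# Hypothetical screen: 3 known actives and 4 decoys.
actives = [0.9, 0.8, 0.4]
decoys = [0.7, 0.3, 0.2, 0.1]

print(roc_auc(actives, decoys))  # 11 of the 12 active/decoy pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of actives from decoys.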

Answer 93

For each cut off there are: False positive (FP) True positive (TP) False negative (FN) True negative (TN) *** ROC curves graph FPR vs TPR for every possible cutoff

Answer 94

Together the number of “ground truth negatives”

Answer 95

number of “ground truth positives”

Answer 96

- Assesses the entire screen from best-predicted ligand to worst -- useful for benchmarking and comparing new VS methods - BUT you only care about the top-scoring compounds (the ones you’ll recommend for experimental testing)

Answer 97

Calculate the percentage of all compounds in your screen that are true ligands. For every possible cutoff: 1. Calculate the percentage of compounds above cutoff that are true ligands. 2. Calculate how many times higher (or lower) that percentage is than the all-compound percentage
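
A sketch of that calculation for a single cutoff. The ranked list is invented: `True` marks a true ligand, and the list is assumed to be sorted from best-predicted to worst-predicted compound.

```python
def enrichment_factor(ranked_is_ligand, top_n):
    """Ligand fraction among the top_n compounds divided by the overall ligand fraction."""
    overall_fraction = sum(ranked_is_ligand) / len(ranked_is_ligand)
    top_fraction = sum(ranked_is_ligand[:top_n]) / top_n
    return top_fraction / overall_fraction

# Hypothetical screen of 16 compounds containing 2 true ligands,
# both of which landed in the top 4.
ranked = [True, False, True, False] + [False] * 12

print(enrichment_factor(ranked, top_n=4))   # 50% vs 12.5% overall: enrichment of 4
print(enrichment_factor(ranked, top_n=16))  # whole list: enrichment of 1 by definition
```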

Answer 98

- Doesn’t evaluate best predicted and worst predicted binders equally. - Shows how well your virtual screen performed among those compounds you’ll likely recommend for testing.

Answer 99

Docking has contributed to the development of transformative drugs ex/ treating COX-2, Alzheimer's, and HIV

Answer 100

Large Language Model Ex/ ChatGPT, Claude, Gemini, Llama, Grok - Understands/generates human-like text. -- Answer questions -- Provide explanations -- Generate creative content - Learns from a vast (internet-scale) dataset of text to predict the next word in a sentence. One model → many different applications.

Answer 101

Pros: - It’s super fun - Most professionals will use it in the future, and I want to prepare you for your future careers Cons: - Inappropriate use can 100% wreck any chances of success in a future career before you even get started - Inappropriate use can 100% wreck your otherwise successful career after it’s started - There’s a tiny chance robots will one day end humanity goal: find a middle ground

Answer 102

Lead optimization improves efficacy and safety A crucial stage in the drug discovery process Improves initial ligand found in high-throughput or virtual screen

Answer 103

1. Target ID and validation 2. Hit ID and optimization 3. Lead optimization 4. Candidate selection

Answer 104

1. Potency (affinity): Increase the drug's ability to bind and modulate its target 2. Selectivity: Minimize off-target interactions and reduce side effects 3. Pharmacokinetics: Optimize absorption, distribution, metabolism, and excretion (ADME) properties 4. Solubility: Enhance drug solubility to improve bioavailability

Answer 105

Dissociation constant (Kd): measure of affinity IC50: concentration of drug required to get 50% of the max possible activity (dependent on experimental setup, e.g., temperature, substrate concentration, etc.) EC50: often used to measure impact on phenotype; conc needed to have half the impact on phenotype *** lower values == stronger binding -- critical for effective therapeutic action

Answer 106

- Chemical moieties that can replace a part of a molecule while retaining target binding and activity - Similar physical/chemical/electronic/size properties ** Improve potency, selectivity, etc., and reduce toxicity Benefits: Exploration of chemical space, intellectual property protection, improved chemical properties, overcoming drug resistance, etc.

Answer 107

Adding fragments can enhance potency of the drug -- lead optimization

Answer 108

1. Fragment addition/swapping 2. Merging 3. Linking (preserves poses when not linked)

Answer 109

- Input structures give out fingerprints to create label set - Recommends fragment additions - The receptor and parent are voxelated - Uses ML for this type of drug discovery 1. Takes a bunch of protein and ligand structures 2. Parent ligand and protein become voxel grids (grids of 24x24x24 points of 3D space) - For each atom in the receptor, we see how they contribute to protein-ligand interactions - Basically mapping atom positions onto a grid - The fragment is vectorized - Converted fragments to fingerprint vectors using RDKFingerprint algorithm. (0s and 1s)

Answer 110

- Final model: five days to converge (GPU) - To prospectively evaluate a single receptor/parent complex (at inference time) can easily run on a CPU (~30 seconds)

Answer 111

- A separate look-up table (label set) of known fragments with associated RDKFingerprints - Cosine similarity to find fragment most like prediction - Label set independent of TRAIN/VAL/TEST sets -- One can use the trained DeepFrag model with different label sets -- It can be general or customized (fragments of interest)
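
A sketch of the look-up step: compare a predicted fingerprint vector against every entry in a label set by cosine similarity and return the closest fragments. This is not DeepFrag's own code. The 6-element fingerprints, fragment names, and predicted vector are invented, and real fingerprints are far longer.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)

def closest_fragments(predicted, label_set, top_k=1):
    """Names of the top_k label-set fragments most similar to the prediction."""
    ranked = sorted(
        label_set,
        key=lambda name: cosine_similarity(predicted, label_set[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical label set: fragment name -> binary fingerprint.
label_set = {
    "hydroxyl": [1, 0, 0, 1, 0, 0],
    "methyl": [0, 1, 0, 0, 0, 1],
    "phenyl": [0, 0, 1, 1, 1, 0],
}
# Hypothetical model output: a continuous vector in fingerprint space.
predicted = [0.9, 0.1, 0.0, 0.8, 0.1, 0.0]

print(closest_fragments(predicted, label_set, top_k=2))
```

Because the label set is only a look-up table, swapping in a different dictionary changes the candidate fragments without retraining anything.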

Answer 112

QSAR assumes that molecules with similar structures often have similar biological activities. This is a key principle in QSAR modeling, as it relies on identifying structural similarities among molecules to predict their biological activities.

Answer 113

Using too many descriptors can reduce model interpretability and lead to overfitting, making predictions less generalizable.

Answer 114

Separating data ensures that the model is evaluated on unseen data, preventing overfitting and improving generalizability. If a model is trained and tested on the same data, it may simply memorize patterns rather than learning general relationships, leading to poor performance on new compounds.

Answer 115

Small structural changes can sometimes lead to large, unpredictable changes in biological activity, making modeling difficult. While QSAR assumes that structurally similar molecules have similar activities, real-world data often show "activity cliffs," where minor modifications drastically alter binding affinity.

Answer 116

Advanced machine learning methods can capture complex, non-linear relationships between molecular descriptors and biological activity. Many QSAR relationships are not purely linear, and methods like neural networks and random forests can model intricate dependencies that linear regression cannot.

Answer 117

X-ray crystallography provides high-resolution structures but typically captures proteins in a static crystalline state, which may not reflect their natural flexibility.

Answer 118

NMR spectroscopy requires high concentrations of soluble, non-aggregated protein, which can be difficult to achieve for many proteins.

Answer 119

Cryo-ET does not require crystallization or staining, allowing researchers to study molecules in their natural environment. Unlike X-ray crystallography, which requires crystallization, or certain electron microscopy methods that require staining, Cryo-ET allows molecules to be observed in a near-native frozen state, preserving their biological structure.

Answer 120

Proteins are large and flexible, and their flexibility can make crystallization difficult due to entropic penalties.

Answer 121

Averaging multiple similar molecular images reduces noise and improves the clarity of the final 3D structure. The raw images obtained from Cryo-ET are often noisy due to the low electron dose used to prevent sample damage. By aligning and averaging multiple similar molecular projections, researchers can enhance signal quality and improve resolution.

Answer 122

Flash freezing prevents the formation of ice crystals, preserving biological structures in a near-native state.

Answer 123

Computational flipping optimizes the hydrogen-bonding network, ensuring correct side-chain orientation in the electron density map. In X-ray crystallography, electron density maps do not always clearly distinguish between nitrogen, oxygen, and carbon atoms in certain amino acid side chains. Computationally flipping these residues helps optimize hydrogen bonding and improves model accuracy.

Answer 124

Different molecular visualization models emphasize various aspects of proteins, such as backbone organization, atomic interactions, or solvent accessibility.

Answer 125

UniProt provides protein sequence and functional information so researchers can explore protein properties, structures, and interactions.

Answer 126

The PDB is a repository of protein structures that allows researchers to analyze molecular conformations, interactions, and binding sites.

Answer 127

Advanced PDB searches allow users to refine queries based on chemical, sequence, and structural attributes. Unlike simple keyword searches, advanced PDB searches support filtering by chemical descriptors, sequence motifs, and ligand interactions, making it easier to find proteins with specific structural or functional properties.

Answer 128

Sequence alignment reveals structural, functional, and evolutionary relationships between proteins.

Answer 129

Substitution matrices assign scores to amino acid substitutions based on their likelihood, improving alignment accuracy.

Answer 130

RMSD quantifies a kind of average distance between equivalent atoms in two aligned protein structures, providing a measure of structural similarity.

Answer 131

Homology modeling allows researchers to predict protein structures by leveraging known structures of homologous proteins.

Answer 132

These regions are difficult to model accurately because they are flexible and often lack similar regions in known protein structures.

Answer 133

It assumes all members of a protein family are equally druggable, which is not necessarily true

Answer 134

A cavity with hydrogen-bonding and electrostatic interaction potential

Answer 135

Because such mutations suggest that the protein plays a functional role in the disease and may have sites important for regulation or binding

Answer 136

Its search feature can identify essential proteins in bacterial and eukaryotic pathogens with available crystal structures

Answer 137

By revealing different conformations of the same protein in complexes with various ligands, suggesting the protein samples multiple dynamic states.

Answer 138

Bonded and non-bonded interactions are represented using spring and potential energy functions that model chemical bonding and physical forces, respectively.

Answer 139

QM is applied to chemically active regions where classical force fields fail to model effects like bond formation and proton transfer.

Answer 140

The lock-and-key model assumes a rigid receptor with a preformed binding site, whereas modern models recognize that proteins exist in multiple conformations and that ligands can selectively bind or stabilize these dynamic states, expanding the set of druggable targets.

Answer 141

Cryptic sites are typically absent from static crystallographic structures because they exist in low-population, transient conformations that may only be exposed during protein motion—motions that can be captured by molecular dynamics simulations.

Answer 142

RCS better accounts for receptor flexibility by docking into an ensemble of protein conformations, but it still suffers from limited conformational sampling and scoring inaccuracies.

Answer 143

Alchemical methods rely on the fact that free energy is a state function, so even a non-physical transformation (like “disappearing” a ligand) can yield an accurate ΔG, but the approach is highly sensitive to insufficient conformational sampling during simulations.

Answer 144

Docking-based scoring functions are computationally inexpensive and allow screening of large compound libraries, making them practical for early-stage filtering, despite their lower accuracy compared to alchemical methods.

Answer 145

aMD improves sampling by artificially lowering energy barriers between conformational states, allowing transitions that would otherwise be rare, but this introduces artifacts that can affect the physical accuracy of structural interpretations.

Answer 146

Force field parameters, such as bond stiffness values and partial atomic charges, must be assigned to the coordinates because this information is not typically included in standard PDB files.

Answer 147

The force field calculates the forces between bonded atoms (e.g., bond stretching, angle bending, dihedral torsions) and non-bonded atoms (van der Waals forces and electrostatics).

Answer 148

Assigning correct protonation states to residues like aspartic acid, glutamic acid, lysine, and arginine is crucial; histidine, especially, often needs careful evaluation near neutral pH.

Answer 149

Explicit solvent models involve simulating many individual water molecules, whereas implicit models represent the solvent as a continuous medium, trading atomic detail for speed.

Answer 150

To minimize the root-mean-square deviation (RMSD) by aligning each frame to a reference frame

Answer 151

The average deviation of atomic positions from their mean position over time

Answer 152

To reduce the dimensionality of molecular motion data by identifying the principal axes of variance

Answer 153

The average acceleration of the molecule is very small due to high friction

Answer 154

To monitor conformational changes and interactions between residues over time

Answer 155

To identify a small set of representative protein conformations from a simulation that samples many structures.

Answer 156

They are pockets that appear in some protein conformations, often only upon ligand binding.

Answer 157

DruGUI uses molecular dynamics with small probe molecules to reveal dynamic interaction hotspots.

Answer 158

It accounts for protein flexibility by docking each compound into a set of diverse receptor conformations.

Answer 159

To predict a binding pose and generate a score that hopefully correlates with binding affinity.

Answer 160

Local docking is used when the binding site is known, while global docking is generally necessary when the binding site is unknown.

Answer 161

The proportion of known ligands in the top-ranked compounds is four times higher than in the overall screening library.

Answer 162

The probability that a randomly chosen active compound will rank higher than a randomly chosen other (probably inactive) compound.

Answer 163

ROC curves provide a full performance overview across all thresholds but may be less informative when the top-scoring compounds are the primary concern.