QSAR Flashcards

1
Q

What does QSAR stand for?

A

Quantitative structure-activity relationship.

2
Q

Why is QSAR used?

A

To decide which new compounds are worth making, using a model built from a set of compounds with known biological activity.

3
Q

Does QSAR require any knowledge of the receptor, active site, or the mechanism of action of the compounds?

A

No.

4
Q

What is required of the compounds used to build a QSAR equation?

A

They must all act in the same way against the same receptor or active site.

5
Q

Outline the general procedure of creating the QSAR equation.

A
  • Select a set of molecules with known activities that all act in the same way on the same receptor.
  • Calculate features (molecular descriptors) for each molecule.
  • Divide the set into two subgroups (training and testing).
  • Build a model using the training set.
  • Test the model on the testing set.
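
The procedure above can be sketched in a few lines. This is only an illustration under stated assumptions: the descriptors and activities are synthetic random numbers, the 30/10 split is arbitrary, and ordinary least squares (NumPy's `lstsq`) stands in for whatever regression software is actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 40 compounds, 3 descriptors each (all values synthetic).
n_compounds, n_descriptors = 40, 3
X = rng.normal(size=(n_compounds, n_descriptors))
true_coefs = np.array([1.5, -0.8, 0.3])  # assumed, for illustration only
y = X @ true_coefs + 2.0 + rng.normal(scale=0.1, size=n_compounds)

# Divide the set into training and testing subgroups (here 30 and 10 compounds).
order = rng.permutation(n_compounds)
train_idx, test_idx = order[:30], order[30:]


def add_intercept(features):
    """Prepend a column of ones so the fitted equation has a constant term."""
    return np.column_stack([np.ones(len(features)), features])


# Build the model using the training set only.
coefs, *_ = np.linalg.lstsq(add_intercept(X[train_idx]), y[train_idx], rcond=None)

# Test the model on compounds that were not used to build it.
y_pred = add_intercept(X[test_idx]) @ coefs
ss_res = np.sum((y[test_idx] - y_pred) ** 2)
ss_tot = np.sum((y[test_idx] - y[test_idx].mean()) ** 2)
r2_test = 1.0 - ss_res / ss_tot

print("coefficients (intercept first):", coefs.round(2))
print("R^2 on the testing set:", round(float(r2_test), 3))
```
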
6
Q

What two sets are the molecules divided into?

A

Testing and training.

7
Q

What is the training group used for?

A

Building the QSAR equation.

8
Q

What is the testing group used for?

A

To test the QSAR equation on compounds that were not used to build it, which reveals whether the equation is overfitted.

9
Q

Outline the steps in the preparation of the structures for QSAR.

A
  • Draw the compounds.
  • Clean up the structure by performing a molecular mechanics geometry optimisation.
  • Identify key rotatable bonds and perform a conformation search.
  • Perform a semi-empirical quantum mechanical geometry optimisation on the lowest energy conformation identified by the conformation search.
10
Q

Describe molecular mechanics geometry optimisation.

A
  • Treats atoms as balls with mass and bonds as springs with a force constant.
  • Does not consider electrons.
  • Fast.
  • Low quality but okay for a quick clean up of a drawn structure.
11
Q

Describe semi-empirical quantum mechanical geometry optimisation.

A
  • The valence electrons are used to construct molecular orbitals.
  • The inner electrons are approximated via a parameter set (calculating them fully would take too long).
  • Slower than molecular mechanics but better quality.
12
Q

Why do energy minimisation techniques tend to find the nearest local minimum in the energy surface?

A

All energy minimisation techniques concentrate on searching ‘downhill’, so they settle in the first minimum they reach from the starting structure.

13
Q

Why is conformational searching preferred over energy minimisation?

A

Because energy minimisation used alone cannot be relied on to find the global energy minimum. If a much deeper minimum is separated from the starting point by a region of higher energy, it will not be reached.

14
Q

Outline conformation searching.

A

Each rotatable bond is stepped round in small increments and the energies of the resulting conformations are calculated.
This is used to find the approximate position of the global minimum energy well.
After this, a high-quality energy minimisation technique can be used to refine the structure down to the global minimum energy conformation.
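
A toy example for a single rotatable bond may make this concrete. Everything here is an assumption made for illustration: the energy function is an invented three-well torsional potential in arbitrary units, the 25 degree increment is arbitrary, and a plain steepest-descent loop stands in for a high-quality minimiser.

```python
import math


def energy(theta):
    """Invented torsional energy (arbitrary units); theta is in radians."""
    return (1 + math.cos(3 * theta)) + 0.5 * (1 + math.cos(theta))


def gradient(theta):
    """Analytical derivative of energy() with respect to theta."""
    return -3 * math.sin(3 * theta) - 0.5 * math.sin(theta)


def minimise(theta, step=0.05, n_steps=500):
    """Steepest descent: only ever moves downhill from the starting angle."""
    for _ in range(n_steps):
        theta -= step * gradient(theta)
    return theta


# Energy minimisation alone, started at 50 degrees, settles in the nearest well.
trapped = minimise(math.radians(50))

# Conformation search: step the bond round in 25 degree increments and
# keep the lowest-energy angle found on the grid.
grid = [math.radians(angle) for angle in range(0, 360, 25)]
best_on_grid = min(grid, key=energy)

# Refine the best grid point down to the bottom of its energy well.
refined = minimise(best_on_grid)

print(f"minimisation only  : {math.degrees(trapped):6.1f} deg, E = {energy(trapped):.3f}")
print(f"search + refinement: {math.degrees(refined):6.1f} deg, E = {energy(refined):.3f}")
```

In this toy potential the deepest well is at 180 degrees, so the grid search followed by refinement ends there, while minimisation started at 50 degrees stays in the shallower well nearby.
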

15
Q

What is a conformational explosion?

A

Each additional rotatable bond multiplies the number of possible conformations, so the total grows massively as rotatable bonds are added.
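
The arithmetic is easy to check. Assuming a systematic search that steps every rotatable bond round in 30 degree increments (an arbitrary choice for this sketch), each bond has 12 positions and the counts multiply:

```python
def n_conformations(n_rotatable_bonds, increment_deg=30):
    """Number of grid conformations in a systematic search."""
    positions_per_bond = 360 // increment_deg
    return positions_per_bond ** n_rotatable_bonds


for n_bonds in (1, 2, 4, 8):
    print(n_bonds, "rotatable bonds:", n_conformations(n_bonds), "conformations")
```
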

16
Q

Give some categories of molecular descriptors.

A

Solubility, electronic, steric, lipophilic, constitutional, geometrical, topological.

17
Q

Give some quickly calculated molecular descriptors.

A

Molecular weight, dimensions.

18
Q

Give some molecular descriptors that take a while to calculate.

A

HOMO-LUMO energy gap, polarisability, partial atomic charge, dipole moment.

19
Q

How is the QSAR equation constructed?

A

By multiple regression. This calculates an equation describing the relationship between a single dependent y variable and several explanatory x variables.
It is very important to choose x variables that are not correlated with one another.
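
A minimal sketch of both points, assuming synthetic data: the descriptor names (`logp`, `mol_weight`, `logp_like`) and all values are invented, the correlation matrix is used to spot a redundant descriptor, and the equation is fitted by ordinary least squares with NumPy.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30

# Hypothetical descriptors (synthetic values, names chosen for illustration).
logp = rng.normal(size=n)
mol_weight = rng.normal(size=n)
# Deliberately almost a copy of logp, to show what a correlated pair looks like.
logp_like = logp + rng.normal(scale=0.05, size=n)

# Pairwise correlation between descriptors (each row is one descriptor).
corr = np.corrcoef(np.vstack([logp, mol_weight, logp_like]))
print(corr.round(2))

# Synthetic activities generated from the two independent descriptors.
y = 0.5 + 1.2 * logp - 0.7 * mol_weight + rng.normal(scale=0.1, size=n)

# Fit y = b0 + b1*logp + b2*mol_weight, leaving out the redundant logp_like.
X = np.column_stack([np.ones(n), logp, mol_weight])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print("b0, b1, b2 =", coefs.round(2))
```
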

20
Q

Describe simple multiple regression.

A

All the input x variables are used in the equation to predict y (biological activity).

21
Q

Describe stepwise multiple regression.

A

A selection algorithm is used to choose a subset of input x variables.
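
One possible selection algorithm is forward selection, sketched below. This is an assumption, not necessarily the algorithm a given QSAR package uses: variables are added one at a time, always taking the one that most reduces the residual sum of squares, and the loop stops when the best candidate improves the fit by less than an arbitrary 20%. The data are synthetic, with only columns 1 and 4 actually related to y.

```python
import numpy as np


def rss(X, y, columns):
    """Residual sum of squares of a least-squares fit on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in columns])
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ coefs) ** 2))


def forward_selection(X, y, min_improvement=0.2):
    """Greedily add the x variable that lowers the RSS most; stop when the gain is small."""
    selected = []
    remaining = list(range(X.shape[1]))
    current = rss(X, y, selected)
    while remaining:
        best = min(remaining, key=lambda c: rss(X, y, selected + [c]))
        new = rss(X, y, selected + [best])
        if current - new < min_improvement * current:
            break  # best candidate no longer improves the fit enough
        selected.append(best)
        remaining.remove(best)
        current = new
    return selected


rng = np.random.default_rng(3)
X = rng.normal(size=(50, 6))  # six hypothetical descriptors
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(scale=0.1, size=50)

selected = forward_selection(X, y)
print("selected descriptor columns:", selected)
```
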

22
Q

Explain the concept of overfitting.

A

The risk that an apparently good regression equation will be found, based on a chance numerical relationship between the y variable and one or more x variables, rather than a genuinely predictive relationship.
The equation will fit the training set well but will be useless in predicting the activity of the testing set.
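
The effect can be reproduced with pure noise. In this sketch (all numbers are random, so there is no genuine relationship to find) nine descriptors plus an intercept are fitted to ten training compounds, which gives the regression as many adjustable coefficients as data points:

```python
import numpy as np

rng = np.random.default_rng(2)
n_train, n_test, n_descriptors = 10, 10, 9

# Descriptors and activities are independent random numbers.
X_train = rng.normal(size=(n_train, n_descriptors))
y_train = rng.normal(size=n_train)
X_test = rng.normal(size=(n_test, n_descriptors))
y_test = rng.normal(size=n_test)


def design(features):
    """Add an intercept column to the descriptor matrix."""
    return np.column_stack([np.ones(len(features)), features])


def r_squared(y_true, y_pred):
    """Coefficient of determination; negative when worse than predicting the mean."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


coefs, *_ = np.linalg.lstsq(design(X_train), y_train, rcond=None)

r2_train = r_squared(y_train, design(X_train) @ coefs)
r2_test = r_squared(y_test, design(X_test) @ coefs)
print(f"training R^2 = {r2_train:.3f}, testing R^2 = {r2_test:.3f}")
```

With ten coefficients and ten training points the equation passes through every training activity, yet that apparent quality of fit says nothing about how it performs on the testing set.
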

23
Q

What is the best way to avoid an overfitted regression equation?

A

Use only a few carefully selected x variables, with as many data points as possible for each one.

24
Q

What is cross-validation?

A

It is a technique used to estimate the true predictive power of the model.

25
Q

How is cross-validation carried out?

A

Subsets of the data are systematically left out, and each is then used to validate the equation derived from the remaining data points.
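
A minimal sketch of this, assuming five left-out groups, synthetic data and ordinary least squares. The helper names (`fit`, `predict`, `cross_validated_q2`) are made up for this example; the returned value is the cross-validated R², often written q².

```python
import numpy as np


def fit(X, y):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coefs


def predict(coefs, X):
    return coefs[0] + X @ coefs[1:]


def cross_validated_q2(X, y, n_folds=5, seed=0):
    """Leave each group out in turn, refit on the rest, predict the left-out group."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    y_pred = np.empty(len(y))
    for left_out in folds:
        kept = np.setdiff1d(np.arange(len(y)), left_out)
        coefs = fit(X[kept], y[kept])
        y_pred[left_out] = predict(coefs, X[left_out])
    press = np.sum((y - y_pred) ** 2)  # predictive residual sum of squares
    return float(1.0 - press / np.sum((y - y.mean()) ** 2))


rng = np.random.default_rng(4)
X = rng.normal(size=(30, 2))  # two hypothetical descriptors
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=30)

q2 = cross_validated_q2(X, y)
print("cross-validated q^2:", round(q2, 3))
```
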