AI in Drug Discovery Flashcards

Week 6 - Friday (4th October 2024)

1
Q

Where can AI be used in drug discovery?

A
  • Molecular library screening
  • Target identification
  • Preclinical studies
  • Drug repurposing
  • De novo drug design
2
Q

What are the benefits of AI-driven approaches?

A
  • Reduced animal use
  • Toxicity predictions earlier in the clinical pipeline
  • Faster and easier synthesis
  • Significantly reduced screening requirements
  • Significantly less expenditure
3
Q

AI vs. Machine learning vs. Deep learning

A
  • AI is any technique where machines attempt to mimic human behaviour
  • Machine learning is a subset of AI in which statistical methods enable machines/algorithms to improve with experience
  • Deep learning is a subset of ML that uses multi-layer neural networks to learn from data
4
Q

Subtypes of machine learning

A
  • Unsupervised learning: Unlabelled data is given to an algorithm and it tries to find patterns
  • Supervised learning: Labelled data is given to an algorithm and it tries to fit or understand how the labels relate to the data
  • Reinforcement learning: An algorithm makes predictions or takes actions and adjusts itself using ongoing positive and/or negative feedback
5
Q

Properties of unsupervised machine learning

A
  • Data is analysed in an unbiased manner
  • Requires very little manual intervention, so it is less arduous
  • Can be used to discover anomalies in data
  • Identifies sets of items that often occur together
  • Is heavily used for data visualisation and interpretation (the algorithm doesn’t know what the categories are but it sorts them out for you)
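As a rough illustration of an algorithm finding groups in unlabelled data, here is a minimal k-means sketch in plain Python. The function name `kmeans_1d`, the starting centroids, and the sample values are all hypothetical and chosen only for illustration; k-means is one of many unsupervised methods.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids (plain k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical, unlabelled measurements (no categories are supplied).
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 10.0])
```

The algorithm is never told which values belong together. It still separates the low values from the high ones, but a human has to decide what those two groups mean.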
6
Q

Disadvantages of unsupervised machine learning

A
  • Human interpretation is needed to see if the predicted clustering visualisation makes sense
  • You cannot easily get precise reasons for why the clusters were assigned in a particular way
  • Accuracy of the clustering is hard to measure
7
Q

Supervised machine learning subtypes

A
  • Classification: The algorithm tries to learn how to predict a label for a sample given its features
  • Regression: The algorithm tries to learn how to predict a value for a sample given its features
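The difference between the two subtypes can be sketched with a single nearest-neighbours helper: classification returns a category, regression returns a number. The function names, the choice of k-nearest neighbours, and the data are assumptions made for illustration only.

```python
from collections import Counter


def nearest_neighbours(train, x, k=3):
    """Return the targets of the k training samples whose feature is closest to x."""
    ranked = sorted(train, key=lambda pair: abs(pair[0] - x))
    return [target for _, target in ranked[:k]]


def classify(train, x, k=3):
    # Classification: predict the most common label among the neighbours.
    return Counter(nearest_neighbours(train, x, k)).most_common(1)[0][0]


def regress(train, x, k=3):
    # Regression: predict the mean value among the neighbours.
    neighbours = nearest_neighbours(train, x, k)
    return sum(neighbours) / len(neighbours)


# Hypothetical data: one feature per sample, paired with a label or a value.
labelled = [(1.0, "inactive"), (1.5, "inactive"), (2.0, "inactive"),
            (6.0, "active"), (6.5, "active"), (7.0, "active")]
valued = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0)]

predicted_label = classify(labelled, 6.2)  # a category
predicted_value = regress(valued, 2.0)     # a number
```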
8
Q

Properties of supervised machine learning

A
  • The ultimate goal is to be able to take a set of features from unseen data and predict their labels or values
  • This is done by learning from a previously generated set of data and generating a model that is able to predict labels or values based on the features of the new data
  • The model is fitted to the data over many iterations, each one adjusting it to fit better (for example, optimising a line of best fit)
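A minimal sketch of this iterative fitting, assuming a straight-line model trained by gradient descent on mean squared error. The function name `fit_line`, the learning rate, and the toy data are hypothetical; real models and optimisers vary.

```python
def fit_line(xs, ys, learning_rate=0.01, iterations=5000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Error of the current line on every training sample.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        # Each iteration nudges the line towards a better fit.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Previously generated (training) data, here following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Predict the value for an unseen feature.
prediction = slope * 10.0 + intercept
```

Once fitted, the model is applied to a feature value it never saw during training (here `10.0`), which is the goal described above.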
9
Q

Disadvantages of supervised machine learning

A
  • Requires labelling of data or assignment to groups (can be costly)
  • Data requirements can be high (typically at least hundreds of data points)
  • Interpretation of algorithms can be hard
  • Can be computationally costly
10
Q

Reinforcement learning

A
  • The algorithm learns through trial and error: it makes predictions, receives positive or negative feedback, and adjusts itself to improve
  • Much slower than other ML types as it involves feedback loops whereby new data is collected or labelled
  • Reinforcement learning has the potential to learn accurate models with significantly fewer data points than supervised learning
  • Requires lots of manual interaction
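The trial-and-error loop can be sketched with an epsilon-greedy agent choosing between two options. This is a generic textbook illustration, not a drug-discovery pipeline: the feedback is simulated with hypothetical reward probabilities, whereas in practice each round of feedback may require new data to be collected or labelled.

```python
import random


def epsilon_greedy(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Learn which action pays off best purely from positive/negative feedback."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    estimates = [0.0] * len(reward_probs)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action currently believed to be best.
            action = max(range(len(estimates)), key=lambda a: estimates[a])
        # Simulated feedback: 1.0 is positive, 0.0 is negative.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Adjust the running average estimate for the chosen action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Hypothetical environment: action 1 is rewarded far more often than action 0.
estimates, counts = epsilon_greedy([0.2, 0.8])
```

The agent is never told which action is better. It finds out by acting, observing the feedback, and updating its estimates.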
11
Q

Case study: DDR1 inhibitor

A
  • A novel DDR1 inhibitor with high patentability was discovered in only 46 days using ML
  • Existing compounds and their IC50 values against DDR1 were used in a supervised learning model to predict IC50 for 30,000 potential compounds
  • These compounds were reduced to 40 with the best IC50
  • These 40 were reduced to 6 with the best patentability
  • Of these 6, 4 were effective and one compound has now completed Phase 2a clinical trials
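The narrowing from 30,000 candidates to 40 and then to 6 can be sketched as a two-stage ranking. Everything here is a stand-in: the scores are random synthetic numbers rather than model outputs or study data, and the field names `predicted_ic50_nm` and `patentability` are hypothetical. The sketch assumes "best IC50" means lowest, since a lower IC50 indicates a more potent inhibitor.

```python
import random


def run_funnel(candidates, keep_potent=40, keep_final=6):
    """Keep the most potent candidates, then the most patentable among those."""
    # Stage 1: lowest predicted IC50 first (lower means more potent).
    potent = sorted(candidates, key=lambda c: c["predicted_ic50_nm"])[:keep_potent]
    # Stage 2: highest patentability score first.
    final = sorted(potent, key=lambda c: c["patentability"], reverse=True)[:keep_final]
    return potent, final


# Synthetic placeholder scores; a real pipeline would take these from trained models.
rng = random.Random(0)
candidates = [
    {"id": i, "predicted_ic50_nm": rng.uniform(1, 10000), "patentability": rng.random()}
    for i in range(30000)
]
potent, final = run_funnel(candidates)
```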