8. Drug Discovery Flashcards

Question 1

Q

give examples of chemical & biological data

Answer

A

drug info
drug target
drug side effects
drug chemical interaction

Question 2

Q

where can this info be obtained

Answer

A

PubChem = chemical structures & activities, PDB (Protein Data Bank) = protein structures

Question 3

Q

how long do drug discoveries usually take

Answer

A

a decade or more, which is why data analysis & algos can expedite the process

Question 4

Q

how can ml be used in drug discovery & development

Answer

A

predicting drug-target interactions between chemical compounds and biological targets (proteins)

Question 5

Q

how can ml be used to predict effects

Answer

A

predict adverse side effects (unacceptable toxicities) or unintended therapeutic effects caused by:

   - drug/drug interaction
   - multi-target interaction

Question 6

Q

how can ml be used post-marketing

Answer

A

finding patterns in drug-related adverse events, because clinical trials run for a limited duration and only study limited patient characteristics. models can represent the multidimensional space and determine the relationship between drug variables and adverse events

Question 7

Q

what is the pharmacological space

Answer

A

integration of chemical space & genomic space to infer unknown drug-target interactions

Question 8

Q

how are unknown drug-target interactions found by ml

Answer

A

integration of chemical & genomic space to create pharmacological space

embed known interactions between compounds & proteins
regression models are learned to map compounds (chemical space) & proteins (genomic space) into the pharmacological space
interacting compound-protein pairs are predicted by connecting compounds & proteins that are closer than a threshold (similarity scores are computed)
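
a minimal sketch of the last step (thresholding), assuming compounds & proteins have already been mapped into one shared space. the 2-D coordinates, the names (c1, p1, ...) and the threshold are hypothetical values for illustration, and Euclidean distance is an assumed choice of closeness measure:

```python
import math

# Hypothetical 2-D coordinates in a shared pharmacological space.
# In practice these would come from the learned regression models.
compounds = {"c1": (0.0, 0.0), "c2": (3.0, 4.0)}
proteins = {"p1": (0.5, 0.0), "p2": (3.0, 3.0)}


def predict_pairs(compounds, proteins, threshold):
    """Return (compound, protein) pairs closer than the threshold."""
    pairs = []
    for c_name, c_xy in compounds.items():
        for p_name, p_xy in proteins.items():
            # Euclidean distance between the two embedded points
            if math.dist(c_xy, p_xy) < threshold:
                pairs.append((c_name, p_name))
    return pairs


predicted = predict_pairs(compounds, proteins, threshold=1.5)
print(predicted)
```

here c1 is paired with p1 and c2 with p2, because only those pairs fall within the threshold.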

Question 9

Q

how are feature based similarity scores computed

Answer

A

inner product of the chemical and genomic vectors
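
a small sketch of the score, assuming both feature vectors live in the same space and have the same length. the vector values are made up for illustration:

```python
def inner_product(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


# Hypothetical feature vectors for one compound and one protein.
compound_vec = [1.0, 0.0, 2.0]
protein_vec = [0.5, 3.0, 1.0]

score = inner_product(compound_vec, protein_vec)
print(score)
```

a higher score means the two vectors point in more similar directions, which is read as a more likely interaction.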

Question 10

Q

what is GNN

Answer

A

graph neural network
a graph can be represented as an adjacency matrix, where entry (i, j) records the connection between two nodes i and j

a GNN uses this by passing node features as messages along its edges. each node aggregates the messages from its neighbours via these edge connections
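
a minimal sketch of one aggregation step, assuming plain sum aggregation on a toy 3-node graph. the learned weights and non-linearity of a real GNN layer are left out, and the graph and feature values are hypothetical:

```python
# Toy undirected graph: 0 - 1 - 2 (no self-loops)
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
# One feature per node
features = [[1.0], [2.0], [3.0]]


def aggregate(adjacency, features):
    """For each node, sum the feature vectors of its neighbours."""
    n = len(adjacency)
    dim = len(features[0])
    out = []
    for i in range(n):
        summed = [0.0] * dim
        for j in range(n):
            if adjacency[i][j]:  # an edge carries a message from j to i
                for k in range(dim):
                    summed[k] += features[j][k]
        out.append(summed)
    return out


messages = aggregate(adjacency, features)
print(messages)
```

note that node 1 receives 1.0 + 3.0 from its neighbours but its own feature (2.0) is not included, since there is no self-loop.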

Question 11

Q

what are the limitations of a basic GNN

Answer

A

every node sums features of the neighbouring nodes, but not itself unless there’s a self-loop

the adjacency matrix is not normalised, so multiplying by it changes the scale of the features (nodes with many neighbours end up with much larger values)

Question 12

Q

what is GCN

Answer

A

graph convolutional network: a GNN that addresses these limitations by normalising the adjacency matrix & adding a self-loop, so each node’s own features are included in the sum of the neighbouring node features
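
a sketch of the two fixes, assuming the symmetric normalisation commonly used for GCNs (add the identity for self-loops, then divide each entry by the square root of the product of the two node degrees). the toy graph is hypothetical and is assumed to have no self-loops to begin with:

```python
import math


def normalised_adjacency(adjacency):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adjacency)
    # A + I: every node now also connects to itself
    a_hat = [
        [adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
        for i in range(n)
    ]
    # Degrees are counted after the self-loops are added
    degree = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy undirected graph: 0 - 1 - 2
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
norm = normalised_adjacency(adjacency)
for row in norm:
    print([round(x, 3) for x in row])
```

the diagonal is now non-zero, so a node’s own features are kept, and node 1 (the best connected) gets smaller weights than the others, which keeps the feature scale under control.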

Question 13

Q

what is vgae

Answer

A

variational graph auto-encoder

doesn’t use a fixed latent representation for inputs. rather, it learns the mean & sd of the latent distribution, so previously unseen outputs can be generated by sampling from it
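
a sketch of the sampling step only, assuming a Gaussian latent distribution. the mean & sd values are hypothetical stand-ins for what the encoder would output, and the encoder and decoder themselves are not shown:

```python
import random


def sample_latent(mean, sd, rng):
    """Draw z = mean + sd * eps, with eps ~ N(0, 1), per latent dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, sd)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical encoder outputs for one node (2 latent dimensions)
mean = [0.5, -1.0]
sd = [0.1, 0.2]

z = sample_latent(mean, sd, rng)  # a different point on each call
z_deterministic = sample_latent(mean, [0.0, 0.0], rng)  # sd = 0 collapses to the mean
print(z, z_deterministic)
```

with sd set to zero the sample is always the mean, which is the fixed latent representation a plain auto-encoder would use.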

Question 14

Q