Predicting the Frequencies of Drug Side Effects

A central issue in drug risk-benefit assessment is the identification of the frequencies of side effects in humans. Currently, these frequencies are experimentally determined in randomised controlled clinical trials.

We developed a novel machine learning framework for computationally predicting the frequencies of drug side effects. Our matrix decomposition algorithm learns latent signatures of drugs and side effects that are both reproducible and biologically interpretable. We show the usefulness of our approach on 759 structurally and therapeutically diverse drugs and 994 side effects from all human physiological systems.
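To illustrate the idea of learning latent drug and side-effect signatures by matrix decomposition, the following is a minimal sketch and not the published algorithm. It assumes a non-negative factorisation fitted only on observed entries with standard multiplicative updates; the function name `masked_nmf`, the toy matrix, the 1 to 5 frequency encoding, and the choice of `k = 2` signatures are all hypothetical.

```python
import numpy as np

def masked_nmf(R, mask, k=2, n_iter=500, seed=0, eps=1e-9):
    """Factorise R ~ W @ H with W, H >= 0, fitting only entries where mask == 1.

    Rows of W act as drug signatures, columns of H as side-effect signatures.
    This is a generic weighted NMF, used here purely for illustration.
    """
    rng = np.random.default_rng(seed)
    n_drugs, n_effects = R.shape
    W = rng.random((n_drugs, k)) + 0.1   # strictly positive initialisation
    H = rng.random((k, n_effects)) + 0.1
    MR = mask * R                        # observed entries only
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        W *= (MR @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ MR) / (W.T @ (mask * (W @ H)) + eps)
    return W, H

# Toy drug x side-effect matrix (invented data); 0 marks an unobserved frequency.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)
mask = (R > 0).astype(float)

W, H = masked_nmf(R, mask)
predicted = W @ H  # scores for every drug/side-effect pair, including unobserved ones
print(np.round(predicted, 2))
```

In this sketch the entries of `predicted` at positions where `mask` is 0 play the role of predicted frequencies for side effects not yet recorded for a drug.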

Our approach can be applied to any drug or compound for which a few side effect frequencies have already been identified, in order to predict the frequencies of further, as yet unidentified, side effects. We also show that our model is informative of the biology underlying drug activity: individual components of the drug signatures are related to the distinct anatomical categories of the drugs and to their specific routes of administration.
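One way to picture applying such a model to a drug with only a few known frequencies is sketched below. It assumes the side-effect signatures are already learned and held fixed, and that the new drug's signature is fitted by non-negative least squares on its known entries. That fitting procedure is an assumption made for illustration, not a description of the published method; the function `predict_for_new_drug` and the matrix `H` are hypothetical.

```python
import numpy as np

def predict_for_new_drug(H, observed, n_iter=1000, eps=1e-9):
    """Fit a non-negative signature w for one drug and return w @ H.

    H        : (k, n_effects) fixed, non-negative side-effect signatures
    observed : dict mapping side-effect index -> known frequency value
    """
    k = H.shape[0]
    idx = np.array(sorted(observed))
    r = np.array([observed[j] for j in idx], dtype=float)
    H_obs = H[:, idx]                    # signatures of the observed side effects
    w = np.ones(k)                       # positive start keeps w non-negative
    for _ in range(n_iter):
        # Multiplicative update for min ||r - w @ H_obs||^2 subject to w >= 0.
        w *= (H_obs @ r) / (H_obs @ (H_obs.T @ w) + eps)
    return w @ H                         # scores for all side effects

# Hypothetical signatures for 4 side effects and k = 2 latent components.
H = np.array([
    [5.0, 4.0, 1.0, 0.0],
    [0.0, 1.0, 4.0, 5.0],
])

# Only side effects 0 and 2 have known frequencies for the new drug.
scores = predict_for_new_drug(H, {0: 5.0, 2: 3.0})
print(np.round(scores, 2))
```

The scores at indices 1 and 3 are the sketch's estimates for the side effects whose frequencies were not supplied.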


Predicting the Frequency of Drug Side Effects
Diego Galeano, Shantao Li, Mark Gerstein & Alberto Paccanaro
Nature Communications; doi: (2020).
Predicting the Frequency of Drug Side Effects
Diego Galeano and Alberto Paccanaro. bioRxiv, 594465; doi: 10.1101/594465 (2019).

Supplementary Data

Supplementary Code:

GitHub repository