Can automated AI drug discovery platforms increase the confidence of pharma in AI techniques?

Is it possible to overcome the frustration of early adopters of AI drug discovery techniques?

Published in Receptor.AI · 7 min read · Jul 25, 2022

There is no doubt that Artificial Intelligence is a disruptive technology for drug discovery. AlphaFold alone, which de facto solved the long-standing problem of protein structure prediction, is enough to support this claim. However, there is a certain disappointment in AI drug discovery techniques. Early adopters of AI technologies quickly realised that AI is not a silver bullet that will magically eliminate all the complexity and cost of the drug discovery process.

Receptor.AI has studied and spoken to many parties who have used AI drug discovery tools over the last few years, and has used their feedback to shape solutions that help overcome these challenges.

The current feeling about AI drug discovery differs surprisingly between big pharma executives on the one hand and SMEs and academia on the other. SME biotech still lacks full access to AI drug discovery technologies because current solutions are mostly tailored for big players and are thus excessively expensive and complex for smaller ones. AI technologies can give academia and SME biotech a huge boost at individual stages of drug discovery, such as virtual screening or ADMET assessment, but access to these technologies is limited by excessive prices. This is the gap that Receptor.AI will address to better serve the demands of this essential market in the drug development landscape.

Early adopters found that the AI drug discovery workflow was weak

The biggest obstacles in AI-based drug discovery are, quite expectedly, the quality of data and poorly determined endpoints. Even seemingly obvious tasks such as virtual screening become surprisingly complex when formulated in machine learning terms. We want to classify compounds as “effective” or “ineffective” for a particular disease, which looks like a trivial ML problem, similar to finding images of cats among other pictures in computer vision. However, unlike images, which either do or do not contain a cat’s face, we can’t even formalise what “effective” means when we speak about molecules. The ultimate goal is to find a compound that cures the disease, but it is physically hard to get enough training data for this endpoint. That is why R&D teams are forced to use proxy endpoints, such as the strength of binding to a particular target protein.

In a way, all modern AI-based drug discovery searches for ligands (something that binds) but not for drugs (something that cures). The binding free energy (or the binding score, which is an approximation of the binding free energy) is a perfect metric for finding ligands, but not necessarily for finding successful drugs. Indeed, a perfectly binding ligand could fail miserably as a drug due to toxicity, poor ADME properties, harmful off-target interactions and a dozen other non-trivial reasons. That is why Receptor.AI combines AI-based virtual screening with more than 40 predictive models, which estimate a multitude of biological factors, ranging from basic ADME-Tox/PK properties to proteome-wide assessment of off-target interactions.
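The difference between ranking ligands and ranking drug-like candidates can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Candidate` fields, the thresholds and the toy values are invented for illustration and do not describe Receptor.AI's actual models or data.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    binding_score: float    # more negative = stronger predicted binding (assumption)
    tox_probability: float  # hypothetical predicted probability of toxicity
    off_target_hits: int    # hypothetical count of predicted off-target interactions


def rank_ligands(candidates):
    """Rank purely by binding score: this finds ligands, not drugs."""
    return sorted(candidates, key=lambda c: c.binding_score)


def rank_drug_like(candidates, max_tox=0.3, max_off_target=2):
    """Apply the extra property filters first, then rank survivors by binding."""
    survivors = [
        c for c in candidates
        if c.tox_probability <= max_tox and c.off_target_hits <= max_off_target
    ]
    return sorted(survivors, key=lambda c: c.binding_score)


# Invented toy values: the strongest binder is also the most problematic one.
candidates = [
    Candidate("cmpd_A", -11.2, 0.85, 7),
    Candidate("cmpd_B", -9.4, 0.10, 1),
    Candidate("cmpd_C", -8.1, 0.05, 0),
]

best_ligand = rank_ligands(candidates)[0].name
best_drug_like = rank_drug_like(candidates)[0].name
print(best_ligand, best_drug_like)
```

In this toy set the best binder is rejected by the filters, so the two rankings disagree about which compound to pursue. A real platform would replace the two hard thresholds with many predictive models, but the structure of the decision is the same.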

All these filters are AI models themselves, which require reliable data. A lot of high-quality data! Unfortunately, not all of the data accumulated by big pharma is directly usable for training AI models. Most of this data is generated by simple high-throughput assays, which can perform gargantuan numbers of parallel measurements. However, these measurements usually have low predictive power for the final in vivo endpoint. AI models trained on these data demonstrate mediocre results not because the model architecture is bad (it is usually state-of-the-art) or the training dataset is too small (it is actually huge), but simply because the wrong proxy endpoints are used. Going back to the analogy with computer vision, a model trained on images of isolated patches of fur is unlikely to tell a cat from a dog, regardless of the quality and quantity of these training images.

Automation and customisation are the key

It is also becoming more and more obvious that there is no such thing as a “universal drug discovery workflow”, which is usually referred to as a “drug discovery pipeline” and visualised as a linear sequence of steps.

The general progression from hits to leads, to optimised leads and to drug candidates is usually respected, but the technical details of each stage are far from linear and well-determined. Each project requires the correct choice and deep customisation (often iterative) of the AI techniques. If an AI-based platform claims to suit all drug discovery projects, it is unlikely to produce meaningful results for any of them.

Based on discussions with multiple biotech and pharma companies, Receptor.AI concluded that automation and fine control are needed at each stage of the AI drug discovery workflow:

Data preparation and quality assessment of the training and test datasets.
Correlation of proxy values with biological endpoints and selection of the most significant proxy representation.
Smart AI assistant for data and model quality control.
Mapping from the raw data to the most meaningful proxy values.
Choosing the most suitable AI model architecture and training protocol.
Performing model training, assessment and tuning.
Managing different model versions and parameter sets.
Deploying the model and monitoring its performance in real-world projects.
Customisable AI pipeline to flexibly approach specific drug discovery projects
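One of the steps listed above, correlating proxy values with a biological endpoint and selecting the most informative proxy, can be sketched in plain Python using Spearman rank correlation. This is only an illustration under stated assumptions: the proxy names, the toy measurements and the choice of Spearman correlation as the selection criterion are all hypothetical, not a description of how Receptor.AI performs this step.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def best_proxy(proxies, endpoint):
    """Return the proxy with the highest absolute rank correlation, plus all scores."""
    scores = {name: spearman(vals, endpoint) for name, vals in proxies.items()}
    return max(scores, key=lambda name: abs(scores[name])), scores


# Invented toy measurements for five compounds.
endpoint = [1.0, 2.0, 3.0, 4.0, 5.0]
proxies = {
    "binding_score": [0.2, 0.1, 0.4, 0.3, 0.5],
    "cell_assay": [10, 20, 30, 45, 50],
}
name, scores = best_proxy(proxies, endpoint)
print(name, scores)
```

Rank correlation is used here because a proxy only needs to order compounds the same way as the endpoint, not match it numerically. In practice, a library routine such as `scipy.stats.spearmanr` would normally replace the hand-written functions.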

Ironically, many of these tasks related to the training and assessment of AI models could be automated by means of other domain-oriented AI models. AI could assist in evaluating raw data quality, monitoring model performance, suggesting the best feature extraction and data transformation techniques, etc.

At the same time, we are still far from a general AI that will think for us, so the platform must allow effective cooperation between multiple human minds: AI engineers, chemists, biologists, and managers. The structure of these interactions can be very different in different pharma companies and can even differ from one project to another within the same company. The AI platform should be flexible enough to accommodate all these variations.

Currently, most of these tasks are performed manually by the AI departments of pharma companies, which often operate with metrics and techniques that are opaque to drug R&D departments. As a result, there is often a lack of coherence and a poor understanding of the common goal. Trivial issues lead to significant delays and wasted resources, and AI models that show outstanding results in synthetic benchmarks suddenly fail in real-world situations.

Better tools for overcoming frustration

Receptor.AI provides tools for building effective and customisable AI workflows dedicated to drug discovery. There are several general-purpose systems of this kind, but adapting them to drug discovery is a non-trivial task in itself, which requires advanced knowledge of both subject areas. The price and complexity of such adaptation could be prohibitive for most pharma and biotech companies because it requires additional investment in the automation of an already questionable technology stack.

To address most of these issues, Receptor.AI is currently developing a next-generation AI platform for drug discovery, which is not only AI-powered but also AI-automated and AI-assisted.

The platform is fully configurable and includes an easy-to-use pipeline constructor, which allows users to design their own drug discovery workflows. The platform automates and controls all the routine drug discovery ML tasks, with a focus on successful drug design rather than ligand design: data quality assessment, data filtering, property and feature extraction, model architecture selection, continuous model re-training and tuning triggered by changes in the data or model architecture, deployment of the models, testing of their performance, version control of model architectures and parameters, etc.
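Re-training triggered by changes in the data or model architecture can be sketched as a simple fingerprint check. The example below is a minimal illustration, assuming that the dataset and the model configuration can be serialised to JSON. The `RetrainTrigger` class, its method names and the toy rows are hypothetical and are not part of the Receptor.AI platform.

```python
import hashlib
import json


def fingerprint(dataset_rows, model_config):
    """Stable SHA-256 hash of the training data together with the model config."""
    payload = json.dumps(
        {"data": dataset_rows, "config": model_config}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class RetrainTrigger:
    """Remembers the last fingerprint and reports when it changes."""

    def __init__(self):
        self.last_fingerprint = None

    def needs_retraining(self, dataset_rows, model_config):
        current = fingerprint(dataset_rows, model_config)
        changed = current != self.last_fingerprint
        self.last_fingerprint = current
        return changed


# Invented toy dataset: (SMILES string, measured value) pairs.
rows = [["CCO", 0.7], ["c1ccccc1", 0.2]]
config = {"architecture": "gnn", "layers": 3}

trigger = RetrainTrigger()
first = trigger.needs_retraining(rows, config)            # nothing trained yet
unchanged = trigger.needs_retraining(rows, config)        # same data, same config
new_data = trigger.needs_retraining(rows + [["CCN", 0.5]], config)
new_arch = trigger.needs_retraining(rows + [["CCN", 0.5]], {**config, "layers": 4})
print(first, unchanged, new_data, new_arch)
```

Hashing the full dataset is only practical for small examples. A production system would more likely track dataset versions or per-file checksums, but the triggering logic would follow the same idea.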

Such a system solves the majority of the technical issues that the AI departments of pharma companies repeatedly face in each new project, leading to a clean, effective and organised working environment. Most of the routine tasks are not only automated but will also be monitored by an “advisory AI”, which adapts to the internal workflow of the company and balances the tasks and resources accordingly.

Importantly, the platform can be deployed in two operation modes:

Completely on-premise, to ensure data protection, access rights and security for big pharma;
Cloud-based, which provides almost infinite scalability and accessibility for SMEs.

For now, all major elements of the platform are fully functional in-house and are used in the pilot projects, which Receptor.AI performs in collaboration with several research institutions and CROs. The full system could be deployed on-premise or in the secure cloud and adapted to the needs of the pharma company.

There is also a SaaS solution for super-fast multi-billion scale virtual screening (within an hour), which exposes the most popular modules of the platform through a user-friendly interface designed to be “a Google search for novel hit compounds”. The SaaS is most useful for small and medium biotech companies and academic institutions, which need cheap and hassle-free solutions for individual stages of the drug discovery workflow.

Artificial Intelligence is changing the world in real time. However, drug discovery companies have become somewhat sceptical about the immediate utility of AI in their everyday activities. We believe that this scepticism is a typical frustration of early adopters, which will soon be cured by next-generation AI-based tools. Receptor.AI is currently developing such tools and is ready to provide them to customers.

Contact ai@receptor.ai to speak to our team about how we can optimise your drug development pipeline.

Written by Receptor.AI Company