AI meets DEL: Is this the most powerful combo in modern drug discovery?

The power of AI may bring DEL screening to the next level

Receptor.AI Company
Published in Receptor.AI
7 min read · Sep 23, 2022


DELs in a nutshell

DNA-encoded libraries (DELs) are revolutionizing modern drug discovery by allowing an unprecedented number of small molecules to be screened in a single automated experiment. The technology is based on conjugating each small molecule to a unique DNA tag, which can later be used to identify the molecules that bind to the protein target of interest.

In contrast to traditional high-throughput screening (HTS), DEL technology makes it possible to screen an enormous number of molecular species (hundreds of millions) in a single experiment.

Traditional screening techniques rely mostly on biochemical assays that allow some kind of automatic readout. Each compound must be placed in a dedicated well on the plate, which severely limits the number of molecules that can be evaluated simultaneously. Even the most advanced screening robots are limited by the physical size of the plates and the latency of the corresponding biochemical reactions. Even if physical detection techniques, such as surface plasmon resonance or capacitance sensors, are employed instead of biochemical assays, the “one well, one compound” restriction still hampers the scaling of the screening.

DELs benefit from DNA tags that can be read out by modern sequencing techniques with very high sensitivity and selectivity. In theory, even a single tagged molecule can be detected among millions of others. At the same time, the number of tags with a particular sequence in the sample can be determined quantitatively. Together, this enables radically different screening paradigms based on physical binding rather than on any kind of functional activity.

The idea behind all DEL screening techniques is to immobilize the target protein and incubate it with the DNA-tagged small molecules. Molecules that bind to the target are retained, while the rest are washed away. The tags are then sequenced, and the molecular species that bind to the target are identified and quantified.
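The quantification step can be illustrated with a minimal sketch. It assumes that each sequencing read has already been decoded into a tag identifier and that a no-target control selection is available for comparison. The control, the pseudocount and the toy reads are assumptions made for illustration, not part of any specific DEL protocol.

```python
from collections import Counter

def enrichment(selected_reads, control_reads, pseudocount=1.0):
    """Return {tag: enrichment ratio} for a target selection vs. a no-target control.

    The ratio compares the relative frequency of each tag in the two samples.
    The pseudocount (an assumption) avoids division by zero for unseen tags.
    """
    sel = Counter(selected_reads)
    ctl = Counter(control_reads)
    tags = set(sel) | set(ctl)
    sel_total = sum(sel.values()) + pseudocount * len(tags)
    ctl_total = sum(ctl.values()) + pseudocount * len(tags)
    scores = {}
    for tag in tags:
        sel_freq = (sel[tag] + pseudocount) / sel_total
        ctl_freq = (ctl[tag] + pseudocount) / ctl_total
        scores[tag] = sel_freq / ctl_freq
    return scores

# Toy reads: each string stands for one sequenced DNA tag (hypothetical data).
selected = ["AAC"] * 40 + ["GTT"] * 5 + ["CCA"] * 5
control = ["AAC"] * 10 + ["GTT"] * 20 + ["CCA"] * 20

scores = enrichment(selected, control)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], round(scores[ranked[0]], 2))
```

Tags with a ratio well above 1 are over-represented after the wash step and are candidates for binders. Real analyses typically also account for sequencing depth, replicates and noise.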

The target proteins can be immobilized on magnetic beads, porous resin or various solid surfaces. It is even possible to use proteins expressed on the cell surface.

Advantages

DELs have a number of characteristics that make them extremely attractive for drug discovery:

  • They are extremely cheap: the cost per compound can be as low as $0.0001, which no other screening technique can currently match.
  • The throughput exceeds that of all other methods, reaching millions of compounds in a single experiment.
  • Modern DEL synthesis technologies favour combinatorial chemistry and incremental synthesis, which map naturally onto large virtual combinatorial chemical libraries.
  • It is usually easy to screen large series of compounds based on a particular parent molecule or scaffold, which is beneficial for lead expansion and lead optimization.
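A back-of-the-envelope calculation shows why combinatorial synthesis scales so well. The sketch below assumes a hypothetical three-cycle library with 500 building blocks per cycle, built by split-and-pool synthesis (one reaction per building block per cycle). It uses the per-compound cost quoted above as a lower bound. None of these numbers describe a real library.

```python
import math

def library_size(blocks_per_cycle):
    """Number of distinct compounds: the product of building blocks per cycle."""
    return math.prod(blocks_per_cycle)

# Hypothetical library design: three synthesis cycles, 500 building blocks each.
blocks_per_cycle = [500, 500, 500]

size = library_size(blocks_per_cycle)   # 125,000,000 compounds
reactions = sum(blocks_per_cycle)       # 1,500 reactions with split-and-pool
cost_per_compound = 0.0001              # USD, lower bound quoted in the text
total_cost = size * cost_per_compound

print(f"{size:,} compounds from {reactions:,} reactions")
print(f"Approximate screening cost: ${total_cost:,.0f}")
```

The library grows as the product of the building-block counts, while the synthetic effort grows only as their sum. The same arithmetic explains why the number of distinct scaffolds stays small even when the library itself is huge.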

It’s not all roses

Despite the tremendous hype around DELs and numerous successful applications, the technology is not flawless. Its drawbacks, as usual, are the flip side of its advantages.

  • Not all target proteins can be easily purified and immobilized without losing their physiological conformation and/or activity.
  • Physical binding doesn’t imply activity. The binding may occur at a functionally irrelevant site or may not affect the protein function. It is also possible that the binding only happens with the isolated and immobilized protein but not under native conditions in the cell.
  • The DNA tags are bulky and impose significant steric and kinetic constraints on the ligands. In particular, they may block binding to deep and narrow pockets and prevent the molecules from rotating freely (in effect, the ligand may only bind in the “tail out” orientation).
  • Combinatorial synthesis strategies lead to low overall chemical novelty and diversity. The library itself can be massive, but the number of unique scaffolds remains limited.
  • It is hard to convert diverse enumerated chemical spaces into DELs. They have to be decomposed into a combinatorial space with synthesis maps, which is not always easy or even feasible.
  • The amount of data generated in DEL screening far exceeds what humans can analyse manually.

Combining AI with DELs: a powerful combo

DELs are all about big data, which makes them perfectly compatible with machine learning (ML). ML cannot mitigate the inherent physical limitations of DELs, but it can help with data management at every important stage.

DELs can be immense in variety, but they still cover only a tiny spot in the vast chemical space of drug-like molecules. AI-based virtual screening can explore much larger chemical spaces, which would never be accessible to physical screening, at a fraction of the cost. At the same time, the accuracy of AI-based in silico screening depends on the amount and quality of the data used for model training, and DELs provide more data on molecular binding than was ever accessible before. These factors suggest two strategies for combining AI virtual screening and DEL screening in an integrated pipeline:

AI first, DEL second

AI virtual screening can serve as a preparatory step for DEL creation. Drug-target interaction AI models can quickly assess trillions of molecules from very large and diverse chemical spaces. Generative AI models can also generate chemical spaces of practically any size and diversity and screen them against the target protein.

The results of such massive virtual screening make it possible to identify the most promising novel chemical scaffolds and patterns, which are likely to bind best to the target protein.

The number of such scaffolds can be rather small (tens to hundreds). Each scaffold can be expanded into a vast DEL by trying out different substituents, using the power of combinatorial chemistry. The expansion itself can also be guided by AI trained to generate DEL-compatible variations of the parent molecule. After that, the DELs can be screened experimentally to find the most promising hit compounds.
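The combinatorial expansion of a single scaffold can be sketched as follows. This is a toy illustration that fills substituent slots in a SMILES-like template by plain string formatting. The template, the substituent lists and the function name `expand_scaffold` are hypothetical, and a real workflow would use a cheminformatics toolkit and reaction-aware building blocks instead.

```python
from itertools import product

def expand_scaffold(template, substituents):
    """Yield (building-block indices, SMILES string) for every combination.

    The index tuple plays the role of the DNA tag: it records which
    building block was used at each position.
    """
    names = sorted(substituents)
    for combo in product(*(enumerate(substituents[n]) for n in names)):
        indices = tuple(i for i, _ in combo)
        fills = {n: frag for n, (_, frag) in zip(names, combo)}
        yield indices, template.format(**fills)

# Hypothetical para-disubstituted benzene scaffold with two variable positions.
template = "{R1}c1ccc({R2})cc1"
substituents = {
    "R1": ["C", "OC", "N"],      # toy fragments, not a real building-block set
    "R2": ["F", "Cl", "C#N"],
}

library = list(expand_scaffold(template, substituents))
print(len(library))
for tag, smiles in library[:3]:
    print(tag, smiles)
```

With tens to hundreds of scaffolds and realistic building-block sets, the same enumeration quickly reaches library sizes that are only practical to screen as a DEL.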

An advantage of this scheme is that it achieves much higher chemical diversity and coverage of chemical space than DELs can offer on their own. The libraries can be smaller thanks to AI-based rational design, which makes them cheaper and easier to work with. The saved resources can then be used to perform several iterations of rational design and tuning instead of a single super-large screening.

DEL first, AI second

DEL screenings produce enormous amounts of data on the binding propensity of millions of compounds. Even more importantly, the data contain both positive and negative samples.

The dataset originating from DEL screenings can be used to train AI models of drug-target interaction. The size of such a dataset will exceed all the information on drug-target interactions available from other sources (such as experimentally resolved complexes, biochemical affinity assays or clinical trial data). Although the dataset will be biased towards compounds accessible through combinatorial chemistry, it is expected to yield highly accurate AI models.
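To make the idea of learning from positive and negative DEL samples concrete, here is a deliberately simple sketch. It is not a drug-target interaction model. It fits a smoothed log-odds score per building block from binary labels (enriched or not enriched) and sums those scores to rank compounds. The record format, the labels and the smoothing constant `alpha` are assumptions.

```python
import math
from collections import defaultdict

def fit_block_log_odds(records, alpha=1.0):
    """Fit smoothed log-odds of binding for each (position, building block).

    records: iterable of (blocks, label), where blocks is a tuple of
    building-block IDs and label is 1 (enriched) or 0 (not enriched).
    """
    pos = defaultdict(float)
    neg = defaultdict(float)
    for blocks, label in records:
        for key in enumerate(blocks):
            if label:
                pos[key] += 1
            else:
                neg[key] += 1
    keys = set(pos) | set(neg)
    return {k: math.log((pos[k] + alpha) / (neg[k] + alpha)) for k in keys}

def score(blocks, log_odds):
    """Sum per-block log-odds; blocks unseen in training contribute 0."""
    return sum(log_odds.get(key, 0.0) for key in enumerate(blocks))

# Toy DEL readout (hypothetical): building-block pairs with binary labels.
records = [
    (("A1", "B1"), 1), (("A1", "B2"), 1), (("A1", "B3"), 1),
    (("A2", "B1"), 0), (("A2", "B2"), 0), (("A3", "B3"), 0),
]

model = fit_block_log_odds(records)
print(score(("A1", "B1"), model) > score(("A2", "B1"), model))
```

A model of this kind can only rank new combinations of building blocks it has already seen. Scoring compounds outside the DEL requires models that work on structure-derived features.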

The model could then be applied to unbiased commercial chemical spaces with much higher diversity than the parent DEL itself. In other words, the AI-based virtual screening will propagate the knowledge obtained in the DEL experiment to the huge chemical universe. The usual pipeline of AI-based drug discovery will follow, resulting in a set of hit candidates.

Iterative AI-DEL drug development

The two scenarios above can be combined so that DEL and AI screening complement each other even better.

The results of previous rounds of screening (either AI-based or DEL-based) can be expanded into compound series using AI scaffold-based generators and passed to the next round of AI and DEL screening.

Given the critical dependence of AI models on the quality and quantity of data, using DEL screening instead of traditional biochemical assays should lead to much faster convergence of this iterative approach towards compounds with superior affinity and selectivity.
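The control flow of such an iterative scheme can be sketched in a few lines. Here `expand` stands in for an AI scaffold-based generator and `screen` for a round of DEL or AI screening. Both are hypothetical placeholders, and compounds are represented by plain integers purely to keep the example runnable.

```python
def iterate_rounds(seeds, expand, screen, rounds=3, keep=5):
    """Alternate series expansion and screening.

    Each round expands the current best compounds, screens all candidates
    and keeps the `keep` top-scoring ones as seeds for the next round.
    Returns the final best compounds and the top score of every round.
    """
    best = list(seeds)
    history = []
    for _ in range(rounds):
        candidates = set(best)
        for compound in best:
            candidates.update(expand(compound))
        ranked = sorted(candidates, key=screen, reverse=True)
        best = ranked[:keep]
        history.append(screen(best[0]))
    return best, history

# Toy stand-ins (assumptions): integers as "compounds", neighbours as
# "analogues", and a hidden optimum at 42 as the "binding affinity".
def toy_expand(compound):
    return [compound - 1, compound + 1]

def toy_screen(compound):
    return -(compound - 42) ** 2

best, history = iterate_rounds([30], toy_expand, toy_screen)
print(best[0], history)
```

Because the previous best compounds are always kept among the candidates, the top score cannot decrease from one round to the next. How fast it improves depends on the quality of the generator and of the screening data.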

The Receptor.AI SaaS platform is DEL-ready

The AI-based drug discovery SaaS platform from Receptor.AI was designed with DEL interoperability in mind.

Our platform can be used in all three modes described above:

  1. Screening of the huge chemical space to identify the scaffolds for subsequent DEL generation and screening.
  2. Automated data preparation and training of AI models on DEL screening data, followed by their application to the huge chemical space to obtain diverse hit compounds.
  3. Iterative usage of AI and DEL screening.

The platform provides not only seamless automated virtual screening of huge chemical spaces but also DEL workflows for easy uploading and integration of custom chemical databases, such as the combinatorial databases used for DEL creation.

The platform also supports fast and easy integration of new experimental endpoints as filters for virtual screening. DEL screening data can be uploaded into the system, which triggers automatic training of a predictive AI model and its integration into the user interface as a custom virtual screening filter.

The results of the virtual screening are presented in an informative and easy-to-use web interface, which allows chemists to select the molecules for further expansion into DELs.

Finally, Receptor.AI provides services for AI-driven scaffold-based molecular generation, which can simplify DEL creation and pre-screening evaluation.


Official account of RECEPTOR.AI company. We make the cell membranes druggable to provide new treatments for cancer and cardiovascular diseases.