Automated Assignment of Active Compounds to Non-Primary Sites Helps Deep-learning Uncover Allosteric Modulators

August 09, 2021

At the American Chemical Society (ACS) Fall 2021 National Meeting & Expo, Atomwise team members were selected to present their research. Learn what our Atoms have been working on below, and visit Atomwise at the ACS Fall 2021 National Meeting for other presentation sessions.


Saulo de Oliveira, PhD

Atomwise Co-Authors: Paweł Gniewek, Venkatesh Mysore, Kate A. Stafford, Henry van den Bedem

Title: Automated Assignment of Active Compounds to Non-Primary Sites Helps Deep-Learning Uncover Allosteric Modulators

Division: Computers in Chemistry


ACS Fall 2021 Presentation - Saulo de Oliveira



As the field of machine learning matures and novel deep learning (DL) architectures emerge, there is increasing interest and success in applying these techniques to structure-based drug design. High-performance DL methods for tasks such as virtual high-throughput screening (vHTS) and binding pose prediction require large amounts of accurately annotated data. Many proteins contain more than one binding site, for example allosteric or cryptic sites. These sites are highly valuable for structure-based drug discovery because they provide opportunities to overcome challenges in selectivity or drug-resistant mutations. However, compound activity measurements are rarely mapped unambiguously to a particular site in available databases or the literature, which poses a significant challenge in training DL approaches to recognize binders to non-primary sites. To address this challenge, we developed a fully automated binding site annotation pipeline. We used our method to identify the most probable binding sites for compounds with measured activity against known multi-site targets. Our results show that molecular docking alone consistently fails to identify the correct binding site. For these multi-site targets, we also find that one in five compounds that lacked a clear annotation were mapped to a non-primary site, suggesting that incorrect assignments are frequent enough to affect model performance. To test this hypothesis, we incorporated our annotations into a larger data set and used it to train and validate vHTS and pose prediction deep learning models. We find that better site-level annotations translate into improved prediction performance for non-primary sites, in both pose prediction and vHTS. These findings suggest that better annotations provide a clear path toward more generalizable models and show promise for the computational detection of novel allosteric inhibitors.


Join our team

Our team comprises over 80 PhD scientists who contribute to a high-performance, academic-like culture that fosters robust scientific and technical excellence. We strongly believe that data wins over opinions, and we aim for as little dogma as possible in our decision-making. Learn more about our team and opportunities at Atomwise.