Combating small-molecule aggregation with machine learning

Published version


Lee, K 
Yang, A 
Lin, YC 
Reker, D 
Bernardes, GJL 


Drug discovery is fuelled by small molecules, either as tools to interrogate biology or as leads for investigational therapeutics.1-3 Successfully developed, such molecules can transform how disease is modulated, but a large proportion of them are lost to attrition before reaching the clinic.4 It is increasingly recognized that a significant fraction of screening molecules have low aqueous solubility and form nano- or microscale agglomerates. These colloidal aggregates can bind nonspecifically to proteins, inducing local denaturation and apparent inhibition.5 While undirected interactions between small molecules and proteins are undesirable, most biological assays lack the resolution to distinguish between ‘pathological’ and ‘directed’ target engagement, and thus incorrectly report the aggregating molecule as a positive result. Indeed, colloidal aggregates are the largest source of false-positive hits in high-throughput screens, surpassing other well-documented ‘con artists’, such as the pan-assay interference compounds



3404 Medicinal and Biomolecular Chemistry, 34 Chemical Sciences, Networking and Information Technology R&D (NITRD), Machine Learning and Artificial Intelligence, Biotechnology

Journal Title

Cell Reports Physical Science


Elsevier BV
Royal Society (URF\R\180019)