17:00
Quantum Machine Learning for Structure-Based Virtual Screening of the Entire Medicinal Chemical Space
-
Graph-theoretic estimates suggest that at least 10^60 organic molecules are relevant to small-molecule drug discovery. Using machine learning to estimate binding free energies when screening large chemical libraries for tightly binding inhibitors requires considerable computational resources, and even then it is not possible to explore the entire biologically relevant chemical space. Quantum computing provides a unique opportunity to accomplish such a computational task in the near future. Here, we demonstrate how to use 512 occupancies to describe the structures of protein-ligand complexes, how to convert the classical occupancies to quantum states using nine qubits, and how to estimate the binding free energies (Gbind) of the complexes using quantum machine learning. We show that only 450 parameters are needed to prepare the quantum states describing the structure of one protein-ligand complex. In this work, the entire 2020 PDBbind dataset was adopted as the training set, and as a first attempt we used 45 parameters to construct the model for predicting Gbind. The Pearson correlation coefficient (PCC) between the estimated binding free energies and the corresponding experimental values is 0.49. With a modest increase to 1,440 parameters in the neural network model for predicting Gbind, the PCC improves to 0.78, which is slightly better than the results achieved by recent classical convolutional neural network models that use millions of parameters. This work demonstrates, for the first time, the feasibility of using quantum computers to explore the entire medicinal chemical space with a concrete, implementable approach. LIN
(Research Center for Applied Science, Academia Sinica)
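
The step from 512 classical occupancies to nine qubits follows from 2^9 = 512: a nine-qubit state has exactly 512 amplitudes. The abstract does not say which encoding is used, so the sketch below assumes plain amplitude encoding (normalize the occupancy vector to unit length and read each entry as one amplitude). The function name `amplitude_encode` and the random occupancy data are hypothetical and serve only as illustration; this is not the authors' state-preparation circuit.

```python
import math
import random


def amplitude_encode(occupancies):
    """Normalize a classical vector so it can serve as quantum state amplitudes.

    Assumption: amplitude encoding, where entry i becomes the amplitude of
    basis state |i>. The vector length must be a power of two.
    """
    n = len(occupancies)
    if n == 0 or n & (n - 1):
        raise ValueError("vector length must be a power of two")
    norm = math.sqrt(sum(x * x for x in occupancies))
    if norm == 0.0:
        raise ValueError("cannot encode an all-zero vector")
    n_qubits = n.bit_length() - 1  # log2(n) for a power of two
    return [x / norm for x in occupancies], n_qubits


# Hypothetical occupancy data: 512 values that are 0.0 or 1.0.
random.seed(0)
occupancies = [float(random.random() < 0.1) for _ in range(512)]
occupancies[0] = 1.0  # guarantee at least one occupied entry

amplitudes, n_qubits = amplitude_encode(occupancies)

# Measurement probabilities are the squared amplitudes and must sum to 1.
probabilities = [a * a for a in amplitudes]
print(n_qubits, len(amplitudes))
```

Under this assumption the classical vector only fixes the target amplitudes; preparing that state on hardware still requires a parameterized circuit, which is the part the abstract reports as needing 450 parameters.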