Link to paper The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract Molecular conformation generation (MCG) is an important problem in drug discovery Traditional methods have been developed to solve MCG, such as systematic searching, model-building, random searching, etc. Recently, deep learning based MCG methods have been developed A simple and cheap algorithm (parameter-free) based on traditional methods is comparable to or even outperforms deep learning based MCG methods Code of the proposed algorithm is available online Paper Content Introduction Molecular conformation generation is important for drug discovery It is related to many drug design tasks Traditional MCG uses conformational search and energy minimization RDKit is a popular cheminformatics software Distance geometry and direct coordinate methods are used Diffusion models are also used Deep learning models are evaluated with Coverage and Matching A simple algorithm based on traditional approaches outperforms deep learning models Related work Classical methods in computational chemistry Development of deep learning Data-driven solutions proposed by researchers Classical methods Traditional MCG paradigm involves conformational search, energy minimization, and energy evaluation Conformational search problem is a combinatorial explosion problem Popular conformational search methods include system search, random search, model-building, distance geometry, and molecular dynamics Energy evaluation methods include force field and electronic structure methods Force field methods are less accurate than electronic structure methods, but are faster Deep learning methods Deep learning methods outperform traditional methods on the GEOM benchmark Earlier work used VAE to generate atomic coordinates directly, but it could not maintain translation and rotation equivariance Later works use intermediate structures such as interatomic distances or torsion angles to generate conformations Diffusion models have been applied to the conformation generation task Method Proposed a method based on RDKit with clustering post-processing Used three samplers to generate diverse and low-energy conformations Applied unsupervised cluster algorithm to select conformations with consideration of diversity and energy Sampled with uniform, geometric, and energy samplers in the ratio of 1:1:4 Experiment Datasets and setup used for benchmarking 10 competitive baselines compared Results show method outperforms most baselines Ablation study conducted to demonstrate more diverse conformations can easily achieve better results Benchmarking should be done according to requirements of downstream applications Conclusion Algorithm outperforms deep learning models Suggest community rethink benchmark in MCG Deep learning can help build effective MCG models RDKit + Clustering algorithm proposed Performance on GEOM-QM9 and GEOM-Drugs Ablation studies for number of samples and sampler type on GEOM-QM9