Awesome Small Molecule Machine Learning / Papers / Survey papers and books |
| Critical assessment of AI in drug discovery | | | Walters and Barzilay, 2021. |
| Deep Learning for Molecules and Materials | | | White, 2021. |
| Defining and Exploring Chemical Spaces | | | Coley, 2020. |
| Learning Molecular Representations for Medicinal Chemistry | | | Chuang et al, 2020. |
| Applications of Deep Learning in Molecule Generation and Molecular Property Prediction | | | Walters and Barzilay, 2020. |
| Transfer Learning for Drug Discovery | | | Cai et al, 2020. |
Awesome Small Molecule Machine Learning / Papers / Representation, transfer learning, and few-shot learning |
| SELFIES and the future of molecular string representations | | | Krenn et al, 2022. |
| Molecular Contrastive Learning of Representations via Graph Neural Networks | | | Wang et al, 2022. |
| ChemBERTa-2: Towards Chemical Foundation Models | | | Ahmad et al, 2021. |
| E(n) Equivariant Graph Neural Networks | | | Satorras et al, 2021. |
| FS-Mol: A Few-Shot Learning Dataset of Molecules | | | Stanley et al, 2021. |
| ATOM3D: Tasks On Molecules in Three Dimensions | | | Townshend et al, 2021. |
| X-MOL: large-scale pre-training for molecular understanding and diverse molecular analysis | | | Xue et al, 2021. |
| Do Transformers Really Perform Bad for Graph Representation? (Graphormer paper) | | | Ying et al, 2021. |
| Attention-Based Learning on Molecular Ensembles | | | Chuang and Keiser, 2020. |
| Inductive transfer learning for molecular activity prediction: Next-Gen QSAR Models with MolPMoFiT | | | Li and Fourches, 2020. |
| Molecule Attention Transformer | | | Maziarka et al, 2020. |
| Meta-Learning GNN Initializations for Low-Resource Molecular Property Prediction | | | Nguyen et al, 2020. |
| Self-Supervised Graph Transformer on Large-Scale Molecular Data (GROVER paper) | | | Rong et al, 2020. |
| Strategies for Pre-training Graph Neural Networks | | | Hu et al, 2019. |
| Analyzing Learned Molecular Representations for Property Prediction (Chemprop) | | | Yang et al, 2019. |
| PotentialNet for Molecular Property Prediction | | | Feinberg et al, 2018. |
| Low Data Drug Discovery with One-Shot Learning | | | Altae-Tran et al, 2017. |
Awesome Small Molecule Machine Learning / Papers / Generative algorithms |
| Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation | | | Bengio et al, 2021. |
| Molecular generation by Fast Assembly of (Deep)SMILES fragments | | | Berenger and Tsuda, 2021. |
| Amortized Tree Generation for Bottom-up Synthesis Planning and Synthesizable Molecular Design | | | Gao et al, 2021. |
| R-group replacement database for medicinal chemistry | | | Takeuchi et al, 2021. |
| Deep Generative Models for 3D Linker Design | | | Imrie et al, 2020. |
| Hierarchical Generation of Molecular Graphs using Structural Motifs | | | Jin et al, 2020. |
| CReM: chemically reasonable mutations framework for structure generation | | | Polishchuk, 2020. |
| GuacaMol: Benchmarking Models for de Novo Molecular Design | | | Brown et al, 2019. |
| MolecularRNN: Generating realistic molecular graphs with optimized properties | | | Popova et al, 2019. |
| Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation | | | You et al, 2019. |
| Optimization of Molecules via Deep Reinforcement Learning | | | Zhou et al, 2019. |
| Junction Tree Variational Autoencoder for Molecular Graph Generation | | | Jin et al, 2018. |
| De Novo Design of Bioactive Small Molecules by Artificial Intelligence | | | Merk et al, 2018. |
Awesome Small Molecule Machine Learning / Papers / Hit finding and potency prediction |
| EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction | | | Stärk et al, 2022. |
| A practical guide to large-scale docking | | | Bender et al, 2021. |
| DOCKSTRING: easy molecular docking yields better benchmarks for ligand design | | | García-Ortegón et al, 2021. |
| Accelerating high-throughput virtual screening through molecular pool-based active learning | | | Graff et al, 2021. |
| Deep Docking: A Deep Learning Platform for Augmentation of Structure Based Drug Discovery | | | Gentile et al, 2020. |
| Adding Stochastic Negative Examples into Machine Learning Improves Molecular Bioactivity Prediction | | | Cáceres et al, 2020. |
| Ultra-large library docking for discovering new chemotypes | | | Lyu et al, 2019. |
Awesome Small Molecule Machine Learning / Papers / ADME and toxicity prediction |
| A Graph Neural Network Approach to Molecule Carcinogenicity Prediction | | | Fradkin et al, 2022. |
| CardioTox net: a robust predictor for hERG channel blockade based on deep learning meta-feature ensembles | | | Karim et al, 2021. |
| Validating ADME QSAR Models Using Marketed Drugs | | | Siramshetty et al, 2021. |
| Bayer’s in silico ADMET platform: a journey of machine learning over the past two decades | | | Göller et al, 2020. |
| DeepHIT: a deep learning framework for prediction of hERG-induced cardiotoxicity | | | Ryu et al, 2020. |
| Deep Learning-Based Prediction of Drug-Induced Cardiotoxicity | | | Cai et al, 2019. |
| Support Vector Machine model for hERG inhibitory activities based on the integrated hERG database using descriptor selection by NSGA-II | | | Ogura et al, 2019. |
| In Silico Absorption, Distribution, Metabolism, Excretion, and Pharmacokinetics (ADME-PK): Utility and Best Practices | | | Lombardo et al, 2018. |
Awesome Small Molecule Machine Learning / Papers / Synthetic accessibility and retrosynthetic planning |
| Data augmentation and pretraining for template-based retrosynthetic prediction in computer-aided synthesis planning | | | Fortunato et al, 2020. |
| Reinforcement Learning for Bioretrosynthesis | | | Koch et al, 2020. |
| Learning Graph Models for Retrosynthesis Prediction | | | Somnath et al, 2020. |
| Retrosynthesis Prediction with Conditional Graph Logic Network | | | Dai et al, 2019. |
| SCScore: Synthetic Complexity Learned from a Reaction Corpus | | | Coley et al, 2018. |
Awesome Small Molecule Machine Learning / Papers / DNA-encoded libraries (DELs) |
| Machine Learning on DNA-Encoded Library Count Data Using an Uncertainty-Aware Probabilistic Loss Function | | | Lim et al, 2022. |
| Machine Learning on DNA-Encoded Libraries: A New Paradigm for Hit Finding | | | McCloskey et al, 2020. |
Awesome Small Molecule Machine Learning / Papers / Visualization and interpretability |
| ChemInformatics Model Explorer (CIME): Exploratory analysis of chemical model explanations | | | Humer et al, 2021. |
| Benchmarks for interpretation of QSAR models | | | Matveieva and Polishchuk, 2021. |
| Integrating the Structure–Activity Relationship Matrix Method with Molecular Grid Maps and Activity Landscape Models for Medicinal Chemistry Applications | | | Yoshimori et al, 2019. |
| Finding Constellations in Chemical Space Through Core Analysis | | | Naveja and Medina-Franco, 2019. |
Awesome Small Molecule Machine Learning / Papers / MS/MS prediction |
| MassFormer: Tandem Mass Spectrum Prediction for Small Molecules using Graph Transformers | | | Young et al, 2023. |
| Prefix-Tree Decoding for Predicting Mass Spectra from Molecules | | | Goldman et al, 2023. |
| 3DMolMS: prediction of tandem mass spectra from 3D molecular conformations | | | Hong et al, 2023. |
| CFM-ID 4.0: More Accurate ESI-MS/MS Spectral Prediction and Compound Identification | | | Wang et al, 2021. |
| Rapid Prediction of Electron–Ionization Mass Spectrometry Using Neural Networks | | | Wei et al, 2019. |
Awesome Small Molecule Machine Learning / Data sets |
| ADME@NCATS | | | |
| AMED Cardiotoxicity Database | | | |
| BindingDB | | | |
| ChEMBL | | | |
| DrugBank | | | |
| DrugMatrix | | | |
| Enamine REAL database | | | |
| hERG Central | | | |
| MoleculeNet | | | |
| MONA: DB of Mass spec + other readouts | | | |
| NPASS database of natural products | | | |
| PubChem | | | |
| The Open Reaction Database | | | |
| Therapeutic Data Commons | | | |
| ZINC | | | |
| |
| AutoDock Vina | | | |
| BioPandas | | | |
| Chemprop | 1,827 | 11 months ago | |
| DeepChem | | | |
| Open Babel | | | |
| pdb-tools | | | |
| PyTorch Geometric | | | |
| rd_filters | 132 | about 2 years ago | |
| Small-World Search | | | |
| TorchDrug | | | |
Awesome Small Molecule Machine Learning / Blogs |
| Hyperparameter Space | | | |
| Is Life Worth Living | | | |
| Practical Cheminformatics | | | |
| RDKit Blog | | | |
| |
| Regina Barzilay | | | |
| Bob the Grumpy Med Chemist | | | |
| John Chodera | | | |
| Connor W. Coley | | | |
| Greg Landrum | | | |
| pen(Taka) | | | |
| Bharath Ramsundar | | | |
| Marwin Segler | | | |
| Patrick Walters | | | |
| |
| Awesome Cheminformatics | 707 | over 1 year ago | |
| Awesome Drug Discovery | 42 | almost 4 years ago | |
| Awesome Explainable Graph Reasoning | 1,949 | over 3 years ago | |
| Awesome Python Chemistry | 1,152 | about 1 year ago | |
| deeplearning-biology | 2,026 | about 1 year ago | |