Mol-Instructions
MolDataset
A dataset and tools package designed to support the training and evaluation of large language models for molecular biology tasks
[ICLR 2024] Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models
255 stars
7 watching
16 forks
Language: Python
last commit: about 1 year ago
Linked from 1 awesome list
Tags: ai-for-science, biomedical, dataset, datasets, generation, iclr2024, instruction, instruction-following, instructions, large-language-models, llama, mol-instruction, molecule, natural-language-processing, protein, resource, science
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An implementation of a multimodal language model with capabilities for comprehension and generation | 585 |
| | A collection of multilingual language models trained on a dataset of instructions and responses in various languages. | 94 |
| | Improves performance of vision language tasks by integrating computer vision capabilities into large language models | 314 |
| | A benchmarking suite for multimodal in-context learning models | 31 |
| | Automated machine learning with tree search optimization | 16 |
| | A Julia-based toolkit for graph-based molecule modeling and chemoinformatics analysis | 202 |
| | Extending pretraining models to handle multiple modalities by aligning language and video representations | 751 |
| | A comprehensive solution to machine learning assignments on Coursera with MATLAB code | 55 |
| | A framework for editing knowledge in large language models | 1,981 |
| | Tool for validating and standardizing chemical structures to improve data quality and facilitate comparisons. | 163 |
| | A research-friendly codebase for experimenting with multi-agent reinforcement learning in JAX | 749 |
| | A large language model designed to understand and generate instructions with accompanying visual content | 360 |
| | An implementation of a general-purpose robot learning model using multimodal prompts | 781 |
| | A Kotlin implementation of monadic types for functional programming. | 10 |
| | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models. | 87 |