Mol-Instructions
MolDataset
A dataset and tools package designed to support the training and evaluation of large language models for molecular biology tasks
[ICLR 2024] Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models
255 stars
7 watching
16 forks
Language: Python
last commit: 4 months ago
Linked from 1 awesome list
Topics: ai-for-science, biomedical, dataset, datasets, generation, iclr2024, instruction, instruction-following, instructions, large-language-models, llama, mol-instruction, molecule, natural-language-processing, protein, resource, science
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An implementation of a multimodal language model with capabilities for comprehension and generation | 585 |
| | A collection of multilingual language models trained on a dataset of instructions and responses in various languages | 94 |
| | Improves performance of vision language tasks by integrating computer vision capabilities into large language models | 314 |
| | A benchmarking suite for multimodal in-context learning models | 31 |
| | Automated machine learning with tree search optimization | 16 |
| | A Julia-based toolkit for graph-based molecule modeling and chemoinformatics analysis | 202 |
| | Extending pretraining models to handle multiple modalities by aligning language and video representations | 751 |
| | A comprehensive solution to machine learning assignments on Coursera with MATLAB code | 55 |
| | A framework for editing knowledge in large language models | 1,981 |
| | Tool for validating and standardizing chemical structures to improve data quality and facilitate comparisons | 163 |
| | A research-friendly codebase for experimenting with multi-agent reinforcement learning in JAX | 749 |
| | A large language model designed to understand and generate instructions with accompanying visual content | 360 |
| | An implementation of a general-purpose robot learning model using multimodal prompts | 781 |
| | A Kotlin implementation of monadic types for functional programming | 10 |
| | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models | 87 |