Mol-Instructions

MolDataset

A dataset and tools package designed to support the training and evaluation of large language models for molecular biology tasks

[ICLR 2024] Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models

252 stars
7 watching
16 forks
Language: Python
Last commit: 25 days ago
Linked from 1 awesome list

Topics: ai-for-science, biomedical, dataset, datasets, generation, iclr2024, instruction, instruction-following, instructions, large-language-models, llama, mol-instruction, molecule, natural-language-processing, protein, resource, science

Related projects:

| Repository | Description | Stars |
|---|---|---|
| ailab-cvc/seed | An implementation of a multimodal language model with capabilities for comprehension and generation | 576 |
| mbzuai-nlp/bactrian-x | A collection of multilingual language models trained on a dataset of instructions and responses in various languages | 94 |
| byungkwanlee/moai | Improves performance of vision language tasks by integrating computer vision capabilities into large language models | 311 |
| ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 28 |
| herilalaina/mosaic_ml | Automated machine learning with tree search optimization | 16 |
| mojaie/moleculargraph.jl | A Julia-based toolkit for graph-based molecule modeling and chemoinformatics analysis | 202 |
| pku-yuangroup/languagebind | Extending pretraining models to handle multiple modalities by aligning language and video representations | 723 |
| zlpure/machine-learning--coursera | A comprehensive solution to machine learning assignments on Coursera with MATLAB code | 55 |
| zjunlp/easyedit | A framework that provides an easy-to-use interface for editing knowledge in large language models | 1,931 |
| mcs07/molvs | Tool for validating and standardizing chemical structures to improve data quality and facilitate comparisons | 159 |
| instadeepai/mava | A research-friendly codebase for experimenting with multi-agent reinforcement learning in JAX | 734 |
| dcdmllm/cheetah | A large language model designed to understand and generate instructions with accompanying visual content | 356 |
| vimalabs/vima | An implementation of a general-purpose robot learning model using multimodal prompts | 774 |
| mplatvoet/funktional | A Kotlin implementation of monadic types for functional programming | 10 |
| fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models | 84 |