LLMDataHub
Datasets
A curated collection of high-quality datasets for training large language models.
A quick guide to trending datasets, especially those for instruction fine-tuning.
3k stars
50 watching
169 forks
last commit: 12 months ago
Linked from 1 awesome list
chatbot, chatgpt, dataset, llm
Related projects:
Repository | Description | Stars |
---|---|---|
mooler0410/llmspracticalguide | A curated list of resources to help developers navigate the landscape of large language models and their applications in NLP | 9,489 |
mlabonne/llm-course | A comprehensive course and resource package on building and deploying Large Language Models (LLMs) | 39,120 |
lm-sys/fastchat | An open platform for training, serving, and evaluating large language models used in chatbots. | 36,975 |
young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,409 |
rasbt/llms-from-scratch | Developing and pretraining a GPT-like Large Language Model from scratch | 32,908 |
bobazooba/xllm | A tool for training and fine-tuning large language models using advanced techniques | 380 |
instruction-tuning-with-gpt-4/gpt-4-llm | This project generates instruction-following data using GPT-4 to fine-tune large language models for real-world tasks. | 4,210 |
alpha-vllm/llama2-accessory | An open-source toolkit for pretraining and fine-tuning large language models | 2,720 |
thunlp/plmpapers | Compiles and organizes key papers on pre-trained language models, providing a resource for developers and researchers. | 3,328 |
phoebussi/alpaca-cot | Provides a unified interface for fine-tuning large language models with parameter-efficient methods and a collection of instruction data | 2,619 |
huggingface/alignment-handbook | Provides training recipes and resources to align language models with human preferences | 4,677 |
shm007g/llama-cult-and-more | Provides insights and practical guides for building and using large language models. | 427 |
peremartra/large-language-model-notebooks-course | A practical course teaching large language models and their applications through hands-on projects using the OpenAI API and Hugging Face libraries. | 1,281 |
optimalscale/lmflow | A toolkit for finetuning large language models and providing efficient inference capabilities | 8,273 |
hiyouga/llama-factory | A unified platform for fine-tuning multiple large language models with various training approaches and methods | 34,436 |