LLMDataHub
Datasets
A curated collection of high-quality datasets for training large language models.
A quick guide to trending datasets, especially those for instruction fine-tuning
3k stars
50 watching
174 forks
last commit: over 1 year ago
Linked from 1 awesome list
chatbot, chatgpt, dataset, llm
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A curated list of resources to help developers navigate the landscape of large language models and their applications in NLP | 9,551 |
| | A comprehensive course and resource package on building and deploying Large Language Models (LLMs) | 40,053 |
| | An open platform for training, serving, and evaluating large language models used in chatbots | 37,269 |
| | A framework for training and serving large language models using JAX/Flax | 2,428 |
| | Developing and pretraining a GPT-like Large Language Model from scratch | 35,405 |
| | A tool for training and fine-tuning large language models using advanced techniques | 387 |
| | Generates instruction-following data using GPT-4 to fine-tune large language models for real-world tasks | 4,244 |
| | An open-source toolkit for pretraining and fine-tuning large language models | 2,732 |
| | Compiles and organizes key papers on pre-trained language models, providing a resource for developers and researchers | 3,331 |
| | Provides a unified interface for fine-tuning large language models with parameter-efficient methods and instruction collection data | 2,640 |
| | Provides recipes and guidelines for training language models to align with human preferences and AI goals | 4,800 |
| | Provides insights and practical guides for building and using large language models | 427 |
| | A practical course teaching large language models and their applications through hands-on projects using the OpenAI API and the Hugging Face library | 1,338 |
| | A toolkit for fine-tuning and running inference on large machine learning models | 8,312 |
| | A tool for efficiently fine-tuning large language models across multiple architectures and methods | 36,219 |