alpaca-chinese-dataset
Chinese prompt dataset
A dataset for training and fine-tuning large language models on Chinese text prompts.
Alpaca Chinese instruction fine-tuning dataset
392 stars
7 watching
25 forks
last commit: almost 2 years ago
Tags: alpaca, chatglm, llm
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| | Develops and maintains a Chinese language model fine-tuned on LLaMA, used for text generation and summarization tasks. | 711 |
| | Develops a multimodal Chinese language model with visual capabilities. | 429 |
| | Provides a resource library for training Chinese conversation models, with pre-processed datasets and a framework for fine-tuning the models. | 1,162 |
| | A dataset of multi-turn conversations between users and AI models. | 164 |
| | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks. | 230 |
| | A cleaned and curated version of an Alpaca dataset used to train a large language model. | 1,525 |
| | Recreated weights from the Stanford Alpaca model, fine-tuned for a specific task. | 406 |
| | A collection of datasets and tools for NLP tasks on Chinese texts, including part-of-speech tagging, named entity recognition, and question answering. | 529 |
| | A parallel corpus of Asian languages with linguistic annotations and data formats for natural language processing research. | 49 |
| | A research project that develops a Traditional Chinese instruction-following language model using Alpaca as a basis. | 134 |
| | A large dataset of human matting images and corresponding results for training person segmentation models. | 615 |
| | Explores various LLMs and their applications in natural language processing and related areas. | 1,854 |
| | Trains and evaluates a Chinese language model using adversarial training on a large corpus. | 140 |
| | An implementation of a large language model for Chinese text processing, focusing on a Mixture-of-Experts (MoE) architecture and incorporating a vast vocabulary. | 645 |
| | A large-scale Chinese corpus for pre-training language models. | 927 |