CDial-GPT

Conversation Dataset

A large-scale Chinese conversation dataset and pre-trained dialog models for text generation

A large-scale Chinese short-text conversation dataset and Chinese pre-trained dialog models

GitHub

2k stars
28 watching
255 forks
Language: Python
last commit: over 1 year ago
Topics: dialogue, gpt, gpt-2, lccc, pytorch, text-generation

Related projects:

Repository | Description | Stars
thu-coai/opd | A large-scale pre-trained dialogue model for Chinese language | 74
thu-coai/eva | Pre-trained chatbot models for Chinese open-domain dialogue systems | 305
neukg/techgpt-2.0 | An advanced language model designed to generate human-like responses in various domains and applications | 101
clue-ai/chatyuan | Large language model for dialogue support in multiple languages | 1,902
skyworkaigc/skytext-chinese-gpt3 | An AI-powered text generation model trained on Chinese data to perform various tasks such as conversation, translation, and content creation | 419
radi-cho/datasetgpt | A command-line interface to generate textual datasets with Large Language Models | 293
google-research-datasets/dstc8-schema-guided-dialogue | A collection of datasets and tools for developing virtual assistants that can understand and respond to human conversations | 548
zcli-charlie/batgpt | A large language model designed to support long context conversations with improved efficiency and effectiveness | 38
zake7749/gossiping-chinese-corpus | A collection of question-answer pairs extracted from online Chinese forums | 238
imcaspar/gpt2-ml | A collection of pre-trained GPT2 models and training scripts for multiple languages, including Chinese | 1,716
thu-coai/safety-prompts | Provides a dataset of safety prompts to evaluate and improve the safety of large language models | 870
aceimnorstuvwxz/dgk_lost_conv | A collection of preprocessed Chinese conversation corpora for use in natural language processing tasks | 1,088
ailab-cvc/gpt4tools | An intelligent system that enables automatic control and utilization of visual foundation models to interact with images in conversational settings | 760
2noise/chattts | A generative speech model designed to synthesize natural and expressive dialogue in interactive conversations | 32,347
hikariming/chat-dataset-baseline | Provides a resource library for training Chinese conversation models with pre-processed datasets and a framework for fine-tuning the models | 1,157