Gossiping-Chinese-Corpus

Forum dataset

A collection of question-answer pairs extracted from online Chinese forums.

Chinese question-answer corpus from the PTT Gossiping board

GitHub

238 stars
13 watching
35 forks
Language: Jupyter Notebook
Last commit: about 1 month ago
Linked from 1 awesome list

chatbot, chatbot-corpus, chinese-chatbot, chinese-corpus, chinese-dataset, chinese-nlp, corpus, dataset, dialog, ptt, question-answering

Related projects:

Repository | Description | Stars
candlewill/dialog_corpus | A collection of datasets used to train and improve chatbot systems in both English and Chinese. | 2,033
chatopera/insuranceqa-corpus-zh | An insurance industry conversation corpus with pre-processed data for natural language processing and question answering tasks. | 1,020
hikariming/chat-dataset-baseline | Provides a resource library for training Chinese conversation models with pre-processed datasets and a framework for fine-tuning the models | 1,157
abbey4799/cutegpt | A conversational language model developed to improve understanding of complex instructions and Chinese vocabulary. | 62
crownpku/small-chinese-corpus | A collection of datasets and tools for NLP tasks on Chinese texts, including part-of-speech tagging, named entity recognition, and question answering. | 531
thu-coai/cdial-gpt | A large-scale Chinese conversation dataset and pre-trained dialog models for text generation | 1,782
suprityoung/zhongjing | Develops a large language model capable of handling complex medical conversations with high accuracy and professionalism. | 316
clue-ai/chatyuan | Large language model for dialogue support in multiple languages | 1,902
cluebenchmark/cluecorpus2020 | A large-scale pre-training corpus for Chinese language models | 925
songys/chatbot_data | Data collection and model development for a conversational AI chatbot focused on emotional wellness support in Korean. | 355
aceimnorstuvwxz/dgk_lost_conv | A collection of preprocessed Chinese conversation corpora for use in natural language processing tasks. | 1,088
thu-coai/eva | Pre-trained chatbot models for Chinese open-domain dialogue systems | 305
clue-ai/chatyuan-7b | An updated version of a large language model designed to improve performance on multiple tasks and datasets | 13
wangrongsheng/ivygpt | Develops large language models to support medical diagnoses and provide helpful suggestions | 59
cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models. | 804