dgk_lost_conv

Conversation corpus

A collection of preprocessed Chinese conversation corpora for use in natural language processing tasks.

dgk_lost_conv: Chinese dialogue corpus (Chinese conversation corpus)

GitHub

1k stars
68 watching
443 forks
Language: Python
last commit: over 3 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| thu-coai/cdial-gpt | A large-scale Chinese conversation dataset and pre-trained dialog models for text generation | 1,782 |
| zake7749/gossiping-chinese-corpus | A collection of question-answer pairs extracted from online Chinese forums | 238 |
| chatopera/insuranceqa-corpus-zh | An insurance industry conversation corpus with pre-processed data for natural language processing and question answering tasks | 1,020 |
| hikariming/chat-dataset-baseline | Provides a resource library for training Chinese conversation models with pre-processed datasets and a framework for fine-tuning the models | 1,157 |
| scutcyr/bianque | Develops and deploys conversational AI models for health-related applications by leveraging large-scale datasets and collaborative research | 732 |
| clue-ai/chatyuan | Large language model for dialogue support in multiple languages | 1,902 |
| zcli-charlie/batgpt | A large language model designed to support long context conversations with improved efficiency and effectiveness | 38 |
| cluebenchmark/cluecorpus2020 | A large-scale pre-training corpus for Chinese language models | 925 |
| abbey4799/cutegpt | A conversational language model developed to improve understanding of complex instructions and Chinese vocabulary | 62 |
| andrewnguonly/chatabstractions | Provides a framework for creating custom chat models with dynamic failover and load balancing features | 79 |
| crownpku/small-chinese-corpus | A collection of datasets and tools for NLP tasks on Chinese texts, including part-of-speech tagging, named entity recognition, and question answering | 531 |
| mmirman/caledon | A programming language for logical and conversational interactions with computers using dependently typed higher order logic | 170 |
| thu-coai/eva | Pre-trained chatbot models for Chinese open-domain dialogue systems | 305 |
| candlewill/dialog_corpus | A collection of datasets used to train and improve chatbot systems in both English and Chinese | 2,033 |
| thu-coai/opd | A large-scale pre-trained dialogue model for Chinese language | 74 |