cmrc2018
Reading dataset
A collection of data for evaluating Chinese machine reading comprehension systems
A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018)
419 stars
12 watching
87 forks
Language: Python
last commit: over 2 years ago
Tags: bert, natural-language-processing, question-answering, reading-comprehension
Related projects:
Repository | Description | Stars |
---|---|---|
ymcui/macbert | Improves pre-trained Chinese language models by replacing masked-token prediction with a correction task, reducing the discrepancy between pre-training and downstream fine-tuning | 646 |
crownpku/small-chinese-corpus | A collection of datasets and tools for NLP tasks on Chinese texts, including part-of-speech tagging, named entity recognition, and question answering. | 529 |
ymcui/chinese-mixtral | Develops and releases Mixtral-based models for natural language processing tasks with a focus on Chinese text generation and understanding | 589 |
michael-wzhu/promptcblue | A large-scale instruction-tuning dataset for multi-task and few-shot learning in the medical domain | 328 |
mengtingwan/goodreads | Provides code samples and notebooks to download, read, and analyze Goodreads datasets for research purposes. | 252 |
ymcui/chinese-electra | Provides pre-trained Chinese language models based on the ELECTRA framework for natural language processing tasks | 1,405 |
ymcui/chinese-mobilebert | An implementation of MobileBERT, a pre-trained language model, in Python for NLP tasks. | 81 |
ydli-ai/csl | A large-scale dataset for natural language processing tasks focused on Chinese scientific literature, providing tools and benchmarks for NLP research. | 582 |
hit-scir/semeval-2016 | A benchmarking dataset and evaluation framework for semantic dependency parsing in Chinese language texts. | 135 |
ymcui/chinese-xlnet | Provides pre-trained models for Chinese natural language processing tasks using the XLNet architecture | 1,652 |
techascent/tech.ml.dataset | A Clojure library for efficient tabular data processing and analysis | 687 |
felixgithub2017/mmcu | A benchmark for measuring the massive multitask Chinese understanding of large language models | 87 |
ymcui/pert | Develops a pre-trained language model to learn semantic knowledge from permuted text without mask labels | 356 |
ymcui/lert | A pre-trained language model designed to leverage linguistic features and outperform comparable baselines on Chinese natural language understanding tasks. | 202 |
pratyushmaini/llm_dataset_inference | Detects whether a given text sequence is part of the training data used to train a large language model. | 23 |