cmrc2018
A collection of data for evaluating Chinese machine reading comprehension systems
A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018)
415 stars
12 watching
87 forks
Language: Python
Last commit: over 2 years ago
Topics: bert, natural-language-processing, question-answering, reading-comprehension
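CMRC 2018 frames reading comprehension as span extraction: a system must locate the answer span in the passage, and predictions are scored by exact match and character-level F1 against the gold span. A minimal sketch of character-level F1 (an illustration of the metric, not the repository's official evaluation script, which additionally normalizes punctuation and takes the maximum over multiple gold answers):

```python
from collections import Counter

def char_f1(prediction: str, reference: str) -> float:
    """Character-level F1 between a predicted and a gold answer span,
    the usual fuzzy-match metric for Chinese span-extraction MRC."""
    # Multiset intersection counts characters shared by both spans.
    overlap = sum((Counter(prediction) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(prediction)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)
```

For example, predicting 北京市 against the gold answer 北京 shares two characters, giving precision 2/3, recall 1, and `char_f1("北京市", "北京")` → 0.8.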
Related projects:
Repository | Description | Stars |
---|---|---|
ymcui/macbert | Improves pre-trained Chinese language models by incorporating a correction task to alleviate inconsistency issues with downstream tasks | 645 |
crownpku/small-chinese-corpus | A collection of datasets and tools for Chinese NLP tasks, including part-of-speech tagging, named entity recognition, and question answering | 531 |
ymcui/chinese-mixtral | Develops and releases Mixtral-based models for natural language processing tasks with a focus on Chinese text generation and understanding | 584 |
michael-wzhu/promptcblue | A large-scale instruction-tuning dataset for multi-task and few-shot learning in the medical domain | 323 |
mengtingwan/goodreads | Provides code samples and notebooks to download, read, and analyze Goodreads datasets for research purposes | 251 |
ymcui/chinese-electra | Provides pre-trained Chinese language models based on the ELECTRA framework for natural language processing tasks | 1,403 |
ymcui/chinese-mobilebert | An implementation of MobileBERT, a pre-trained language model, in Python for NLP tasks | 80 |
ydli-ai/csl | A large-scale dataset for natural language processing tasks focused on Chinese scientific literature, providing tools and benchmarks for NLP research | 568 |
hit-scir/semeval-2016 | A benchmarking dataset and evaluation framework for semantic dependency parsing of Chinese texts | 135 |
ymcui/chinese-xlnet | Provides pre-trained models for Chinese natural language processing tasks using the XLNet architecture | 1,653 |
techascent/tech.ml.dataset | A Clojure library for efficient tabular data processing and analysis | 681 |
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multitask dataset | 87 |
ymcui/pert | Develops a pre-trained language model that learns semantic knowledge from permuted text without mask labels | 354 |
ymcui/lert | A pre-trained language model designed to leverage linguistic features, outperforming comparable baselines on Chinese natural language understanding tasks | 202 |
pratyushmaini/llm_dataset_inference | Detects whether a given text sequence was part of the training data of a large language model | 23 |