CSL
Chinese Scientific Literature Dataset
A large-scale dataset for natural language processing tasks focused on Chinese scientific literature, providing tools and benchmarks for NLP research.
[COLING 2022] CSL: A Large-scale Chinese Scientific Literature Dataset 中文科学文献数据集
582 stars
15 watching
57 forks
Language: Python
last commit: over 1 year ago
Topics: chinese-nlp, dataset, machine-learning, scientific-publications
Related projects:
| Repository | Description | Stars |
|---|---|---|
|  | A benchmarking suite for multimodal in-context learning models | 31 |
|  | A series of large language models trained from scratch to excel in multiple NLP tasks | 7,743 |
|  | A comprehensive V library for high-performance scientific computations and artificial intelligence. | 358 |
|  | This project provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese. | 439 |
|  | A collection of data for evaluating Chinese machine reading comprehension systems | 419 |
|  | A collection of datasets and tools for NLP tasks on Chinese texts, including part-of-speech tagging, named entity recognition, and question answering. | 529 |
|  | A dataset manipulation library built on top of tech.ml.dataset, providing a simplified API for data processing and analysis. | 308 |
|  | A toolkit for natural language processing research enabling multitask learning and transfer learning. | 1,650 |
|  | A Clojure library for efficient tabular data processing and analysis | 687 |
|  | Provides pre-trained models for Chinese natural language processing tasks using the XLNet architecture | 1,652 |
|  | A language identification model that supports over 2000 languages and can be used for various NLP tasks. | 106 |
|  | Provides pre-trained machine learning models for natural language processing tasks using Clojure and the clj-djl framework. | 0 |
|  | A collection of Urdu language datasets for various NLP tasks and applications | 71 |
|  | A Clojure machine learning library providing idiomatic and harmonized support for various classification, regression, clustering, and unsupervised models. | 220 |
|  | A Julia implementation of popular machine learning algorithms and interfaces. | 547 |