CSL

Chinese Scientific Dataset

A large-scale dataset for natural language processing tasks focused on Chinese scientific literature, providing tools and benchmarks for NLP research.

[COLING 2022] CSL: A Large-scale Chinese Scientific Literature Dataset

568 stars
15 watching
58 forks
Language: Python
last commit: over 1 year ago
Topics: chinese-nlp, dataset, machine-learning, scientific-publications

Related projects:

Repository | Description | Stars
ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 28
01-ai/yi | A series of large language models trained from scratch to excel in multiple NLP tasks | 7,699
vlang/vsl | A comprehensive V library for high-performance scientific computations and artificial intelligence | 355
yunwentechnology/unilm | Pre-trained models for natural language understanding and generation tasks using the UniLM architecture | 438
ymcui/cmrc2018 | A collection of data for evaluating Chinese machine reading comprehension systems | 415
crownpku/small-chinese-corpus | A collection of datasets and tools for NLP tasks on Chinese texts, including part-of-speech tagging, named entity recognition, and question answering | 531
scicloj/tablecloth | A dataset manipulation library built on top of tech.ml.dataset, providing a simplified API for data processing and analysis | 303
nyu-mll/jiant | A toolkit for natural language processing research enabling multitask learning and transfer learning | 1,644
techascent/tech.ml.dataset | A Clojure library for efficient tabular data processing and analysis | 681
ymcui/chinese-xlnet | Pre-trained models for Chinese natural language processing tasks using the XLNet architecture | 1,653
cisnlp/glotlid | A language identification model that supports over 2000 languages and can be used for various NLP tasks | 90
scicloj/scicloj.ml.clj-djl | Pre-trained machine learning models for natural language processing tasks using Clojure and the clj-djl framework | 0
mirfan899/urdu | A collection of Urdu language datasets for various NLP tasks and applications | 71
scicloj/scicloj.ml | A machine learning library built on top of Clojure with a focus on data preprocessing and model creation | 216
cstjean/scikitlearn.jl | A Julia implementation of popular machine learning algorithms and interfaces | 544