nlp-datasets
NLP datasets
A curated list of datasets and corpora used to train and evaluate Natural Language Processing (NLP) models, in reverse chronological order.
919 stars
81 watching
253 forks
last commit: about 5 years ago
Linked from 3 awesome lists
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A collection of Urdu language datasets for various NLP tasks and applications | 71 |
| | A curated collection of NLP datasets and resources for Bahasa Indonesia | 496 |
| | A collection of pre-trained natural language processing models | 170 |
| | A tool that simplifies the process of preparing and manipulating natural language processing datasets | 243 |
| | A collection of libraries and tools for natural language processing and reinforcement learning. | 39 |
| | A collection of annotated NLP resources for the Indonesian language | 279 |
| | Provides a collection of datasets for natural language processing in Ukrainian. | 57 |
| | A linguistic framework for natural language processing tasks. | 216 |
| | Comprehensive resource for learning natural language processing (NLP) with a structured course outline and recommended readings. | 834 |
| | A comprehensive toolkit for Natural Language Processing tasks in Indic languages, providing pre-trained models and datasets. | 825 |
| | A collection of Ruby Natural Language Processing libraries and tools | 1,272 |
| | A Python-based library providing common text processing and Natural Language Processing tools for Indian languages | 561 |
| | A repository of pre-trained NLP models and corpora for text processing. | 990 |
| | A curated collection of Norwegian NLP resources, including models, libraries, and datasets. | 178 |
| | A Python library offering natural language processing capabilities for pre-modern languages | 843 |