nlp-datasets
Text datasets
A collection of text datasets for use in Natural Language Processing
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
6k stars
234 watching
965 forks
last commit: about 2 years ago
Linked from 5 awesome lists
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A curated list of Natural Language Processing datasets used to train and evaluate NLP models. | 919 |
| | A comprehensive repository tracking progress in NLP tasks and their corresponding datasets. | 22,742 |
| | A collection of Urdu language datasets for various NLP tasks and applications | 71 |
| | A curated collection of NLP datasets and resources for Bahasa Indonesia | 496 |
| | A comprehensive NLP library for building conversational AI systems with entity extraction, sentiment analysis, language identification, and more. | 6,301 |
| | An NLP project offering various text classification models and techniques for deep learning exploration | 7,881 |
| | A Python library for natural language processing tasks in many human languages. | 7,315 |
| | A collection of pre-trained natural language processing models | 170 |
| | A comprehensive toolkit for natural language processing tasks in Python. | 13,694 |
| | A toolkit for creating and using natural language prompts to enable large language models to generalize to new tasks. | 2,718 |
| | A curated collection of German language resources and tools for natural language processing | 453 |
| | Provides a collection of datasets for natural language processing in Ukrainian. | 57 |
| | A Java-based suite of tools for natural language processing and analysis | 9,727 |
| | A curated collection of 100 essential NLP papers for researchers and developers to understand the foundations of natural language processing | 3,762 |
| | A curated collection of natural language processing resources and libraries for developers to access and build upon | 73 |