nlp-datasets
Text datasets
A collection of text datasets for use in Natural Language Processing
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
6k stars
234 watching
963 forks
Last commit: almost 2 years ago
Linked from 5 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
karthikncode/nlp-datasets | A curated list of Natural Language Processing datasets used to train and evaluate NLP models. | 919 |
sebastianruder/nlp-progress | A comprehensive repository tracking progress in NLP tasks and their corresponding datasets. | 22,715 |
mirfan899/urdu | A collection of Urdu language datasets for various NLP tasks and applications. | 71 |
louisowen6/nlp_bahasa_resources | A curated collection of NLP datasets and resources for Bahasa Indonesia. | 489 |
axa-group/nlp.js | A comprehensive NLP library for building conversational AI systems with entity extraction, sentiment analysis, language identification, and more. | 6,283 |
brightmart/text_classification | An NLP project offering various text classification models and techniques for deep learning exploration. | 7,861 |
stanfordnlp/stanza | A Python library for natural language processing tasks in many human languages. | 7,294 |
balavenkatesh3322/nlp-pretrained-model | A collection of pre-trained natural language processing models. | 170 |
nltk/nltk | A comprehensive toolkit for natural language processing tasks in Python. | 13,646 |
bigscience-workshop/promptsource | A toolkit for creating and using natural language prompts to enable large language models to generalize to new tasks. | 2,700 |
adbar/german-nlp | A curated collection of German language resources and tools for natural language processing. | 451 |
fido-ai/ua-datasets | A collection of datasets for natural language processing in Ukrainian. | 56 |
stanfordnlp/corenlp | A Java-based suite of tools for natural language processing and analysis. | 9,704 |
mhagiwara/100-nlp-papers | A curated collection of 100 essential NLP papers that help researchers and developers understand the foundations of natural language processing. | 3,753 |
pawangeek/deep-nlp-resources | A curated collection of natural language processing resources and libraries for developers to use and build upon. | 72 |