ua-datasets

Ukrainian NLP datasets

Provides a collection of datasets for natural language processing in Ukrainian.

56 stars
3 watching
2 forks
Language: Python
Last commit: 4 months ago
Linked from 1 awesome list

dataset, natural-language-processing, nlp, nlp-datasets, question-answering, text-classification, token-classification, ukrainian-language

Related projects:

Repository | Description | Stars
robinhad/kruk | A collection of Ukrainian language models and datasets for natural language processing tasks. | 84
helsinki-nlp/ukrainianlt | A collection of Ukrainian language tools and resources for machine translation, natural language processing, and text translation. | 30
mirfan899/urdu | A collection of Urdu language datasets for various NLP tasks and applications. | 71
karthikncode/nlp-datasets | A curated list of natural language processing datasets used to train and evaluate NLP models. | 919
lang-uk/ner-uk | A Ukrainian NER corpus and annotation dataset for training and evaluating named entity recognition models. | 90
grammarly/ua-gec | A collection of annotated data and tools for improving the grammar and fluency of Ukrainian texts. | 255
alexa/massive | A collection of tools and modeling code for a large multilingual natural language understanding dataset. | 538
poio-nlp/poio-corpus | A collection of language resources extracted from publicly available sources. | 7
amakukha/stemmers_ukrainian | A novel stemmer for the Ukrainian language trained with AI. | 28
universaldependencies/ud_ukrainian-iu | A dataset of annotated text in Ukrainian with standardized formatting and annotation guidelines. | 27
piskvorky/gensim-data | A repository of pre-trained NLP models and corpora for text processing. | 988
pkuchmiichuk/ua-coref | A dataset and tools for coreference resolution in the Ukrainian language using OntoNotes 5.0 data and machine translation models. | 7
sandeep42/anuvada | An open-source PyTorch library providing tools and models to explain the predictions of deep neural networks for natural language processing tasks. | 19
nytud/hulu | A collection of linguistic datasets and benchmarks for natural language understanding tasks. | 9
sdadas/polish-nlp-resources | Pre-trained models and resources for natural language processing in Polish. | 323