ua-coref
Ukrainian Coref Dataset
A dataset and tools for coreference resolution in the Ukrainian language, built from OntoNotes 5.0 data and machine translation models.
Silver Data for Coreference Resolution in Ukrainian
Language: Python
Last commit: 12 months ago
Related projects:
Repository | Description | Stars |
---|---|---|
universaldependencies/ud_ukrainian-iu | A dataset of annotated text in Ukrainian with standardized formatting and annotation guidelines. | 27 |
fido-ai/ua-datasets | Provides a collection of datasets for natural language processing in Ukrainian. | 57 |
amakukha/stemmers_ukrainian | A novel stemmer for the Ukrainian language trained with AI | 28 |
robinhad/kruk | A collection of Ukrainian language models and datasets for natural language processing tasks. | 86 |
grammarly/ua-gec | A collection of annotated data and tools for improving the grammar and fluency of Ukrainian texts. | 255 |
kzl/universal-computation | An official codebase providing a framework for using Pretrained Transformers as universal computation engines in various tasks and domains. | 245 |
felixgwu/img_classification_pk_pytorch | A PyTorch project for comparing image classification models and facilitating quick experiment setup | 366 |
helsinki-nlp/ukrainianlt | A collection of Ukrainian language tools and resources for machine translation and natural language processing. | 30 |
pymorphy2/pymorphy2 | A morphological analyzer and generator for the Russian and Ukrainian languages | 1,127 |
brown-uk/corpus | A project to build a balanced 1-million-word corpus of modern Ukrainian, modeled on the Brown Corpus. | 110 |
kefirski/bytenet | A PyTorch implementation of a neural network model for machine translation | 47 |
khrystyna-skopyk/ukr_spell_check | Spelling correction system for the Ukrainian language using a noisy channel model | 3 |
pku-yuangroup/chat-univi | A framework for unified visual representation in image and video understanding models, enabling efficient training of large language models on multimodal data. | 895 |
pythainlp/prachathai-67k | An article classification dataset created from news articles scraped from Prachathai.com with multiple benchmark models for multi-label classification | 16 |