ua-gec
Ukrainian text correction dataset
A collection of annotated data and tools for improving the grammar and fluency of Ukrainian texts.
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
255 stars
13 watching
22 forks
Language: Macaulay2
last commit: 10 months ago
Linked from 1 awesome list
corpus, corpus-data, corpus-tools, dataset, gec, grammatical-error-correction, natural-language-processing, nlp-datasets, ukrainian-language
Related projects:
Repository | Description | Stars |
---|---|---|
fido-ai/ua-datasets | Provides a collection of datasets for natural language processing in Ukrainian. | 57 |
proger/uk4b | Develops pretraining and finetuning techniques for language models using metadata-conditioned text generation | 18 |
universaldependencies/ud_ukrainian-iu | A dataset of annotated text in Ukrainian with standardized formatting and annotation guidelines. | 27 |
brown-uk/corpus | Creates a balanced 1-million-word corpus of the modern Ukrainian language, based on the Brown Corpus model. | 110 |
khrystyna-skopyk/ukr_spell_check | Spelling correction system for the Ukrainian language using a noisy channel model | 3 |
pkuchmiichuk/ua-coref | A dataset and tools for coreference resolution in the Ukrainian language using OntoNotes 5.0 data and machine translation models. | 7 |
thu-coai/cdial-gpt | A large-scale Chinese conversation dataset and pre-trained dialog models for text generation | 1,799 |
brown-uk/nlp_uk | Demonstrates the LanguageTool NLP API for the Ukrainian language using Groovy | 72 |
universaldependencies/ud_galician-ctg | A collection of annotated text data for the Galician language. | 1 |
lang-uk/ner-uk | A Ukrainian NER corpus and annotation dataset for training and evaluating named entity recognition models. | 90 |
amakukha/stemmers_ukrainian | A novel stemmer for the Ukrainian language trained with AI | 28 |
kateryna-bobrovnyk/ukr-twi-corpus | A collection of Ukrainian Twitter texts for linguistic analysis and research | 15 |
nytud/hucola | A collection of 9,076 annotated sentences in Hungarian to evaluate linguistic acceptability and grammaticality | 1 |
irakli97/frequency_dictionary_ge_363_202 | A merged dataset of Georgian words with frequency information | 2 |