corpus
Language corpus
A project to build a balanced, 1-million-word corpus of the modern Ukrainian language, modeled on the Brown Corpus.
Brown Corpus of the Ukrainian Language (Браунський корпус української мови)
110 stars
23 watching
14 forks
Language: Groovy
last commit: 5 months ago
Linked from 1 awesome list
Related projects:
| Description | Stars |
|---|---|
| Demonstrates NLP API from LanguageTool for Ukrainian language using Groovy | 72 |
| A tool to generate dictionaries for Ukrainian language by combining morphological rules and POS tags | 560 |
| A collection of Ukrainian Twitter texts for linguistic analysis and research | 15 |
| A collection of Ukrainian folktales translated into English for linguistic and literary research | 0 |
| A dataset of annotated Ukrainian text with standardized formatting and annotation guidelines | 27 |
| A dictionary of Ukrainian abbreviations with definitions and comments | 3 |
| A novel stemmer for the Ukrainian language trained with AI | 28 |
| A collection of 9,076 annotated sentences in Hungarian to evaluate linguistic acceptability and grammaticality | 1 |
| A dictionary of Ukrainian words with tone annotations generated from expert ratings and machine learning models | 47 |
| A dictionary of word stresses in the Ukrainian language | 19 |
| A Ukrainian NER corpus and annotation dataset for training and evaluating named entity recognition models | 90 |
| A collection of annotated data and tools for improving the grammar and fluency of Ukrainian texts | 255 |
| A large corpus of Hungarian text with various linguistic annotations, split into development and test datasets for natural language processing tasks | 2 |
| A dictionary of words with different pronunciation and/or meanings in Ukrainian | 3 |
| A multilingual parallel corpus created from translations of the Bible | 177 |