korkor_pilot
Hungarian Corpus
A large annotated corpus of Hungarian text with various linguistic annotations, split into development and test datasets for natural language processing tasks.
2 stars
2 watching
1 fork
Language: Python
Last commit: about 2 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
|  | A Hungarian named-entity-annotated corpus of 1 million tokens with morphological annotation layers and various source files. | 15 |
|  | A collection of 9,076 annotated Hungarian sentences for evaluating linguistic acceptability and grammaticality. | 1 |
|  | A manually annotated corpus for training and testing machine learning models for Aspect Based Sentiment Analysis (ABSA) in Hungarian. | 0 |
|  | A large annotated dataset of Hungarian text with over 30 entity types derived from various sources and formats. | 1 |
|  | An open-source wrapper around LLMs to extract structured data from text. | 1,638 |
|  | A collection of Ukrainian Twitter texts for linguistic analysis and research. | 15 |
|  | A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University. | 4 |
|  | An attempt to transcribe Cyrillic text into Hungarian script for a large dataset of WWII prisoner-of-war records. | 23 |
|  | A text processing system for Hungarian language processing tasks, using Python and TSV-based data exchange. | 28 |
|  | A programming language based on Hungarian notation with the aim of improving source code readability and avoiding ambiguities. | 8 |
|  | Harmonized tagset and annotation scheme for Hungarian morphological analysers. | 4 |
|  | A PyTorch implementation of a neural network model for machine translation. | 47 |
|  | A project to create a balanced 1-million-word corpus of modern Ukrainian, based on the Brown Corpus model. | 110 |
|  | An industrial-strength natural language processing library for Hungarian text analysis. | 158 |
|  | A lightweight Python library for natural language processing in Hungarian. | 29 |