regenykorpusz
Novel Corpus
A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University.
4 stars
4 watching
1 fork
last commit: 2 months ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A comprehensive annotated corpus of Hungarian drama texts, including structural annotations and grammatical features. | 1 |
| | A large corpus of annotated Hungarian poems in XML format, with various annotations including grammatical features and sound patterns. | 7 |
| | A collection of 9,076 annotated Hungarian sentences for evaluating linguistic acceptability and grammaticality. | 1 |
| | A manually annotated corpus for training and testing machine learning models for Aspect Based Sentiment Analysis (ABSA) in Hungarian. | 0 |
| | A large annotated corpus of Hungarian text with various linguistic annotations, split into development and test datasets for natural language processing tasks. | 2 |
| | A collection of Ukrainian folktales translated into English for linguistic and literary research purposes. | 0 |
| | A collection of Galician language data in JSON format. | 2 |
| | Large language models designed to perform well in multiple languages and address performance issues with current multilingual models. | 476 |
| | A manually tagged Indonesian language corpus in tab-separated file format. | 88 |
| | A Hungarian named-entity-annotated corpus containing 1 million tokens, with morphological annotation layers and various source files. | 15 |
| | A manually tagged Indonesian corpus consisting of sentence parse trees. | 36 |
| | A large-scale bilingual corpus collection for language technology and NLP tasks, containing English-Vietnamese translations and bitexts. | 42 |
| | A dataset and annotation scheme for Hungarian causal reasoning tasks. | 1 |
| | An industrial-strength natural language processing library for Hungarian text analysis. | 158 |
| | A repository of transcribed 19th-century German texts from various sources. | 8 |