HuCOLA

A collection of 9,076 annotated Hungarian sentences for evaluating linguistic acceptability and grammaticality

Hungarian Corpus of Linguistic Acceptability

GitHub

1 star
2 watching
0 forks
last commit: 6 months ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
|---|---|---|
| nytud/hucopa | A dataset and annotation scheme for Hungarian causal reasoning tasks. | 1 |
| nytud/hulu | A collection of linguistic datasets and benchmarks for natural language understanding tasks. | 8 |
| nytud/nytk-nerkor | A Hungarian named-entity-annotated corpus of 1 million tokens, with morphological annotation layers and various source files. | 15 |
| nytud/huws | A dataset of manually curated Hungarian sentences with ambiguous wording that requires world knowledge and reasoning to resolve. | 1 |
| nytud/panmorph | A harmonized tagset and annotation scheme for Hungarian morphological analysers. | 4 |
| nytud/husst | A dataset of annotated sentences for training and evaluating Hungarian sentiment analysis models. | 1 |
| vadno/korkor_pilot | A large corpus of Hungarian text with various linguistic annotations, split into development and test sets for natural language processing tasks. | 2 |
| poltextlab/hunempoli_corpus | A manually annotated corpus for training and testing machine learning models for Aspect Based Sentiment Analysis (ABSA) in Hungarian. | 0 |
| nytud/hunlp-gate | A collection of Hungarian NLP tools integrated as GATE processing resources. | 8 |
| nytud/emtsv | A text processing system for Hungarian language processing tasks, built on Python and TSV-based data exchange. | 28 |
| nytud/hadifogoly-adatbazis | An attempt to transcribe Cyrillic text into Hungarian script for a large dataset of WWII prisoner-of-war records. | 23 |
| nytud/quntoken | A C++ tokenizer for Hungarian text. | 14 |
| nytud/machine-translation | Machine translation models and a demo site for Hungarian translation. | 5 |
| nytud/huwnli | A dataset and toolset for Hungarian anaphora resolution in natural language inference tasks. | 0 |
| elte-dh/regenykorpusz | A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University. | 4 |