HuLU

Language datasets

A collection of linguistic datasets and benchmarks for natural language understanding tasks

Hungarian Language Understanding Benchmark Kit

9 stars
3 watching
0 forks
last commit: 4 months ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
nytud/huws | A dataset of manually curated Hungarian sentences with ambiguous wordings that require world knowledge and reasoning for resolution | 1
nytud/husst | A dataset and benchmarking kit for evaluating language understanding in Hungarian | 1
nytud/hucola | A dataset of Hungarian sentences annotated for their grammatical acceptability | 1
nytud/happ | A dataset of Hungarian translations of human-language examples to test anaphora resolution algorithms | 1
nytud/huwnli | A dataset and toolset for Hungarian anaphora resolution in natural language inference tasks | 0
nytud/pws | A collection of parallel corpora of Winograd schemata in multiple languages | 0
nytud/hunlp-gate | A collection of Hungarian NLP tools integrated as GATE processing resources | 8
nytud/panmorph | Harmonized tagset and annotation scheme for Hungarian morphological analysers | 4
nytud/machine-translation | Provides machine translation models and a demo site for Hungarian language translations | 5
xuefuzhao/instructionwild | Creating a large-scale user-based instruction dataset for natural language processing research and development | 453
alexa/massive | A collection of tools and modeling code for a large multilingual Natural Language Understanding dataset | 538
turkunlp/wikibert | Provides pre-trained language models derived from Wikipedia texts for natural language processing tasks | 34
karthikncode/nlp-datasets | A curated list of Natural Language Processing datasets used to train and evaluate NLP models | 919
nytud/hadifogoly-adatbazis | An attempt to transcribe Cyrillic text into Hungarian script for a large dataset of WWII prisoner-of-war records | 23
novakat/nytk-nerkor-cars-ontonotespp | A large annotated dataset of Hungarian text with over 30 entity types derived from various sources and formats | 1