HuLU

Language datasets

A collection of linguistic datasets and benchmarks for natural language understanding tasks

Hungarian Language Understanding Benchmark Kit

8 stars

3 watching

0 forks

last commit: about 2 years ago

Linked from 1 awesome list:

oroszgy/awesome-hungarian-nlp

Related projects:

Repository	Description	Stars
nytud/huws	A dataset of manually curated Hungarian sentences with ambiguous wordings that require world knowledge and reasoning for resolution	1
nytud/husst	A dataset of annotated sentences for training and evaluating sentiment analysis models in the Hungarian language	1
nytud/hucola	A collection of 9,076 annotated sentences in Hungarian to evaluate linguistic acceptability and grammaticality	1
nytud/happ	A dataset of Hungarian translations of human-language examples to test anaphora resolution algorithms	1
nytud/huwnli	A dataset and toolset for Hungarian anaphora resolution in natural language inference tasks	0
nytud/pws	A collection of parallel corpora of Winograd schemata in multiple languages	0
nytud/hunlp-gate	A collection of Hungarian NLP tools integrated as GATE processing resources	8
nytud/panmorph	Harmonized tagset and annotation scheme for Hungarian morphological analysers	4
nytud/machine-translation	Provides machine translation models and a demo site for Hungarian language translations	5
xuefuzhao/instructionwild	Creating a large-scale user-based instruction dataset for natural language processing research and development	455
alexa/massive	A collection of tools and modeling code for a large multilingual Natural Language Understanding dataset	541
turkunlp/wikibert	Provides pre-trained language models derived from Wikipedia texts for natural language processing tasks	34
karthikncode/nlp-datasets	A curated list of Natural Language Processing datasets used to train and evaluate NLP models	919
nytud/hadifogoly-adatbazis	An attempt to transcribe Cyrillic text into Hungarian script for a large dataset of WWII prisoner-of-war records	23
novakat/nytk-nerkor-cars-ontonotespp	A large annotated dataset of Hungarian text with over 30 entity types derived from various sources and formats	1