UD_Hungarian-Szeged

Hungarian text dataset

A corpus of annotated Hungarian text for machine learning and natural language processing tasks.
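
Universal Dependencies treebanks are distributed as CoNLL-U files: one token per line, ten tab-separated columns, comment lines starting with `#`, and a blank line between sentences. The sketch below shows one way such annotated text could be read into Python. The function name `parse_conllu`, the `FIELDS` tuple, and the sample sentence are assumptions made for illustration; the sample is hand-written and is not taken from this corpus.

```python
from typing import Dict, Iterator, List

# The ten CoNLL-U columns, in order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def parse_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dictionaries."""
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments (e.g. "# text = ...") are skipped here.
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}")
        sentence.append(dict(zip(FIELDS, cols)))
    if sentence:
        # Handle input that does not end with a blank line.
        yield sentence


# Hypothetical, hand-written sample; annotations are illustrative only.
SAMPLE = (
    "# text = A kutya ugat.\n"
    "1\tA\ta\tDET\t_\tDefinite=Def|PronType=Art\t2\tdet\t_\t_\n"
    "2\tkutya\tkutya\tNOUN\t_\tCase=Nom|Number=Sing\t3\tnsubj\t_\t_\n"
    "3\tugat\tugat\tVERB\t_\tMood=Ind|Number=Sing|Person=3\t0\troot\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
)

sentences = list(parse_conllu(SAMPLE))
for token in sentences[0]:
    print(token["form"], token["upos"], token["deprel"])
```

This sketch keeps every line with ten columns, including multiword-token ranges (IDs such as `1-2`) and empty nodes (IDs such as `1.1`); a real pipeline would likely filter or handle those separately.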



5 stars
133 watching
0 forks
Language: Shell
last commit: 9 days ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| universaldependencies/ud_galician-treegal | A treebank for the Galician language with annotated syntactic and morphological features. | 6 |
| universaldependencies/ud_ukrainian-iu | A dataset of annotated text in Ukrainian with standardized formatting and annotation guidelines. | 28 |
| universaldependencies/ud_galician-ctg | This is a collection of annotated text data for the Galician language. | 1 |
| universaldependencies/ud_vietnamese-vtb | An annotated corpus of Vietnamese language structure | 36 |
| nytud/huws | A dataset of manually curated Hungarian sentences with ambiguous wordings that require world knowledge and reasoning for resolution. | 1 |
| nytud/hucola | A dataset of Hungarian sentences annotated for their grammatical acceptability. | 1 |
| nytud/husst | A dataset and benchmarking kit for evaluating language understanding in Hungarian | 1 |
| nytud/hulu | A collection of linguistic datasets and benchmarks for natural language understanding tasks | 9 |
| huspacy/huspacy | An industrial-strength natural language processing library for Hungarian language text analysis | 155 |
| nytud/pws | A collection of parallel corpora of Winograd schemata in multiple languages | 0 |
| nytud/panmorph | Harmonized tagset and annotation scheme for Hungarian morphological analysers | 4 |
| universaldependencies/docs | An online documentation repository providing detailed resources and guides for the Universal Dependencies project | 273 |
| nytud/hadifogoly-adatbazis | An attempt to transcribe Cyrillic text into Hungarian script for a large dataset of WWII prisoner-of-war records | 23 |
| mmihaltz/huwn.rdf | Hungarian WordNet data in RDF format for use in semantic web applications | 2 |
| novakat/nytk-nerkor-cars-ontonotespp | A large annotated dataset of Hungarian text with over 30 entity types derived from various sources and formats. | 1 |