UD_Hungarian-Szeged

Hungarian Treebank

This repository provides a Hungarian treebank in the Universal Dependencies (UD) format.
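Universal Dependencies treebanks are distributed as CoNLL-U files: one token per line with ten tab-separated fields, `#` comment lines, and a blank line between sentences. The following is a minimal sketch of reading that format with the Python standard library only. The names `read_conllu`, `FIELDS`, and `SAMPLE` are hypothetical, and the sample sentence is a hand-written illustration, not an excerpt from this treebank.

```python
from typing import Dict, Iterator, List

# The ten CoNLL-U columns, in order, as defined by the UD format.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dicts keyed by FIELDS.

    Comment lines are skipped. Multiword-token ranges (IDs like "1-2")
    and empty nodes (IDs like "1.1") are kept as ordinary rows here;
    a real pipeline would likely treat them separately.
    """
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level metadata such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}")
        sentence.append(dict(zip(FIELDS, cols)))
    if sentence:
        yield sentence


# Hand-written illustrative sentence (assumption, not treebank data).
SAMPLE = "\n".join([
    "# text = A kutya ugat.",
    "\t".join(["1", "A", "a", "DET", "_", "_", "2", "det", "_", "_"]),
    "\t".join(["2", "kutya", "kutya", "NOUN", "_", "_", "3", "nsubj", "_", "_"]),
    "\t".join(["3", "ugat", "ugat", "VERB", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["4", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"]),
    "",
])

if __name__ == "__main__":
    for sent in read_conllu(SAMPLE):
        for tok in sent:
            print(tok["id"], tok["form"], tok["upos"], tok["head"], tok["deprel"])
```

To use the sketch on real data, pass the contents of one of the repository's `.conllu` files to `read_conllu` in place of `SAMPLE`.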

Hungarian data

5 stars
133 watching
0 forks
Language: Shell
Last commit: about 1 month ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
universaldependencies/ud_galician-treegal | A treebank for the Galician language with annotated syntactic and morphological features. | 6
universaldependencies/ud_ukrainian-iu | A dataset of annotated text in Ukrainian with standardized formatting and annotation guidelines. | 27
universaldependencies/ud_galician-ctg | This is a collection of annotated text data for the Galician language. | 1
universaldependencies/ud_vietnamese-vtb | An annotated corpus of Vietnamese language structure | 36
nytud/huws | A dataset of manually curated Hungarian sentences with ambiguous wordings that require world knowledge and reasoning for resolution. | 1
nytud/hucola | A collection of 9,076 annotated sentences in Hungarian to evaluate linguistic acceptability and grammaticality | 1
nytud/husst | A dataset of annotated sentences for training and evaluating sentiment analysis models in the Hungarian language. | 1
nytud/hulu | A collection of linguistic datasets and benchmarks for natural language understanding tasks | 8
huspacy/huspacy | An industrial-strength natural language processing library for Hungarian language text analysis | 158
nytud/pws | A collection of parallel corpora of Winograd schemata in multiple languages | 0
nytud/panmorph | Harmonized tagset and annotation scheme for Hungarian morphological analysers | 4
universaldependencies/docs | An online documentation repository providing detailed resources and guides for the Universal Dependencies project | 275
nytud/hadifogoly-adatbazis | An attempt to transcribe Cyrillic text into Hungarian script for a large dataset of WWII prisoner-of-war records | 23
mmihaltz/huwn.rdf | An RDF-based representation of Hungarian WordNet | 2
novakat/nytk-nerkor-cars-ontonotespp | A large annotated dataset of Hungarian text with over 30 entity types derived from various sources and formats. | 1