NYTK-NerKor

Hungarian NE corpus

The home repository of the NerKor corpus, a Hungarian gold-standard named-entity-annotated corpus containing 1 million tokens, with morphological annotation layers and various source files.

GitHub

14 stars
7 watching
6 forks
Language: Shell
Last commit: about 1 year ago
Linked from 1 awesome list



Related projects:

Repository | Description | Stars
novakat/nytk-nerkor-cars-ontonotespp | A large annotated dataset of Hungarian text with over 30 entity types derived from various sources and formats. | 1
nytud/hucola | A dataset of Hungarian sentences annotated for their grammatical acceptability. | 1
vadno/korkor_pilot | A large annotated corpus of Hungarian text with various linguistic annotations, split into development and test datasets for natural language processing tasks. | 2
nytud/emtsv | A text processing system designed to handle various tasks in Hungarian language processing using Python and TSV-based data exchange. | 27
nytud/panmorph | Harmonized tagset and annotation scheme for Hungarian morphological analysers. | 4
nytud/hunlp-gate | A collection of Hungarian NLP tools integrated as GATE processing resources. | 8
lang-uk/ner-uk | A Ukrainian NER corpus and annotation dataset for training and evaluating named entity recognition models. | 90
nytud/quntoken | A C++ tokenizer that tokenizes Hungarian text. | 14
nytud/emlam | Preprocessing and modeling scripts for Hungarian language modeling using Python and TensorFlow. | 3
nytud/hucopa | A dataset of Hungarian translations of English 'cause-and-effect' questions with plausible alternative answers. | 1
nytud/huwnli | A dataset and toolset for Hungarian anaphora resolution in natural language inference tasks. | 0
nytud/hulu | A collection of linguistic datasets and benchmarks for natural language understanding tasks. | 9
nytud/hadifogoly-adatbazis | An attempt to transcribe Cyrillic text into Hungarian script for a large dataset of WWII prisoner-of-war records. | 23
elte-dh/regenykorpusz | A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University. | 4
nytud/machine-translation | Provides machine translation models and a demo site for Hungarian language translations. | 5