drama-corpus

Drama Corpus

A comprehensive annotated corpus of Hungarian drama texts, including structural annotations and grammatical features.


1 star
1 watching
1 fork
last commit: 6 months ago
Linked from 1 awesome list



Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| elte-dh/poetry-corpus | A large corpus of annotated Hungarian poems in XML format, with annotations including grammatical features and sound patterns. | 7 |
| elte-dh/regenykorpusz | A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University. | 4 |
| poltextlab/hunempoli_corpus | A manually annotated corpus for training and testing machine learning models for Aspect Based Sentiment Analysis (ABSA) in Hungarian. | 0 |
| nytud/hucola | A collection of 9,076 annotated Hungarian sentences for evaluating linguistic acceptability and grammaticality. | 1 |
| ukrainian-to-english-corpora/folktale_corpus | A collection of Ukrainian folktales translated into English for linguistic and literary research. | 0 |
| nytud/hucopa | A dataset and annotation scheme for Hungarian causal reasoning tasks. | 1 |
| famrashel/idn-tagged-corpus | A manually tagged Indonesian corpus in tab-separated file format. | 88 |
| eleutherai/polyglot | Large language models designed to perform well in multiple languages and address performance issues with current multilingual models. | 476 |
| vadno/korkor_pilot | A large annotated corpus of Hungarian text with various linguistic annotations, split into development and test datasets for natural language processing tasks. | 2 |
| bertez/corpora | A collection of Galician language data in JSON format. | 2 |
| christos-c/bible-corpus | A multilingual parallel corpus created from translations of the Bible. | 177 |
| jbaiter/archiscribe-corpus | A repository of transcribed 19th-century German texts from various sources. | 8 |
| neodrama/github-drama | A curated collection of heated discussions on GitHub. | 175 |
| qhungngo/evbcorpus | A large-scale bilingual corpus collection for language technology and NLP tasks, containing English-Vietnamese translations and bitexts. | 42 |
| jbest/typeface-corpus | A collection of typeface samples to improve OCR accuracy for natural history collections and digital humanities. | 7 |