drama-corpus

Drama Corpus

A comprehensive annotated corpus of Hungarian drama texts, including structural annotations and grammatical features.


1 star
1 watching
1 fork
last commit: 4 months ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| elte-dh/poetry-corpus | A comprehensive poetry corpus with annotated text data in TEI XML format | 7 |
| elte-dh/regenykorpusz | A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University. | 4 |
| poltextlab/hunempoli_corpus | A manually annotated corpus for training and testing machine learning models of Aspect Based Sentiment Analysis (ABSA) in Hungarian language. | 0 |
| nytud/hucola | A dataset of Hungarian sentences annotated for their grammatical acceptability. | 1 |
| ukrainian-to-english-corpora/folktale_corpus | A collection of Ukrainian folktales translated into English for linguistic and literary research purposes. | 0 |
| nytud/hucopa | A dataset of Hungarian translations of English 'cause-and-effect' questions with plausible alternative answers | 1 |
| famrashel/idn-tagged-corpus | A manually tagged Indonesian language corpus in tab-separated file format | 88 |
| eleutherai/polyglot | Large language models designed to perform well in multiple languages and address performance issues with current multilingual models. | 475 |
| vadno/korkor_pilot | A large annotated corpus of Hungarian text with various linguistic annotations, split into development and test datasets for natural language processing tasks. | 2 |
| bertez/corpora | A collection of Galician language data in JSON format. | 2 |
| christos-c/bible-corpus | A multilingual parallel corpus created from translations of the Bible. | 176 |
| jbaiter/archiscribe-corpus | A repository of transcribed 19th century German texts from various sources. | 8 |
| neodrama/github-drama | A curated collection of heated GitHub discussions | 149 |
| qhungngo/evbcorpus | A large-scale bilingual corpus collection for language technology and NLP tasks, containing English-Vietnamese translations and bitexts. | 42 |
| jbest/typeface-corpus | A collection of typeface samples to improve OCR accuracy for natural history collections and digital humanities. | 7 |