bible-corpus

Bible corpus

A multilingual parallel corpus created from translations of the Bible.

GitHub

176 stars
12 watching
47 forks
last commit: 2 months ago
Linked from 1 awesome list

bible, bible-corpus, corpus, multilingual, translation

Related projects:

Repository | Description | Stars
christos-c/bible-corpus-tools | A collection of tools for reading and processing multilingual Bible texts | 15
bertez/corpora | A collection of Galician language data in JSON format | 2
biblejs/bibleapp | An online application that enables users to interact with the Bible via the command line | 306
several27/fakenewscorpus | A large dataset of news articles with labeled categories to train fake news recognition algorithms | 387
qhungngo/evbcorpus | A large-scale bilingual corpus collection for language technology and NLP tasks, containing English-Vietnamese translations and bitexts | 42
brown-uk/corpus | Creating a balanced corpus of modern Ukrainian language with 1 million words, based on the Brown Corpus model | 110
ukrainian-to-english-corpora/folktale_corpus | A collection of Ukrainian folktales translated into English for linguistic and literary research purposes | 0
josecannete/spanish-corpora | A collection of unannotated Spanish text data, compiled from various sources and processed for natural language processing tasks | 92
openbibleinfo/bible-passage-reference-parser | An implementation of a Bible passage reference parser | 223
nytud/hucopa | A dataset of Hungarian translations of English 'cause-and-effect' questions with plausible alternative answers | 1
j-min/korean-parallel-corpora | A collection of parallel Korean texts used for language processing and machine learning research | 12
poltextlab/hunempoli_corpus | A manually annotated corpus for training and testing machine learning models of Aspect Based Sentiment Analysis (ABSA) in Hungarian language | 0
bibletime/bibletime | A Bible study tool utilizing the Sword library and Qt toolkit | 333
crscardellino/sbwce | A collection of linguistic resources and trained word embeddings for the Spanish language | 45
cluebenchmark/cluecorpus2020 | A large-scale pre-training corpus for Chinese language models | 925