bible-corpus

Bible corpus

A multilingual parallel corpus created from translations of the Bible.

177 stars

12 watching

47 forks

last commit: almost 2 years ago

Linked from 1 awesome list

bible, bible-corpus, corpus, multilingual, translation

Backlinks from these awesome lists:

richardlitt/low-resource-languages

Related projects:

Repository	Description	Stars
christos-c/bible-corpus-tools	A collection of tools for reading and processing multilingual Bible texts	15
bertez/corpora	A collection of Galician language data in JSON format.	2
biblejs/bibleapp	An online application that enables users to interact with the Bible via the command line.	307
several27/fakenewscorpus	A large dataset of news articles with labeled categories to train fake news recognition algorithms	385
qhungngo/evbcorpus	A large-scale bilingual corpus collection for language technology and NLP tasks, containing English-Vietnamese translations and bitexts.	42
brown-uk/corpus	Creating a balanced corpus of modern Ukrainian language with 1 million words, based on the Brown Corpus model.	110
ukrainian-to-english-corpora/folktale_corpus	A collection of Ukrainian folktales translated into English for linguistic and literary research purposes.	0
josecannete/spanish-corpora	A collection of unannotated Spanish text data, compiled from various sources and processed for natural language processing tasks.	92
openbibleinfo/bible-passage-reference-parser	An implementation of a Bible passage reference parser	223
nytud/hucopa	A dataset and annotation scheme for Hungarian causal reasoning tasks.	1
j-min/korean-parallel-corpora	A collection of parallel Korean texts used for language processing and machine learning research	12
poltextlab/hunempoli_corpus	A manually annotated corpus for training and testing machine learning models for Aspect Based Sentiment Analysis (ABSA) in the Hungarian language.	0
bibletime/bibletime	A Bible study tool utilizing the Sword library and Qt toolkit.	337
crscardellino/sbwce	A collection of linguistic resources and trained word embeddings for the Spanish language.	45
cluebenchmark/cluecorpus2020	A large-scale Chinese corpus for pre-training language models.	927