corpora
Language corpus
A collection of Galician language data in JSON format.
This is a collection of corpora of Galician (or Galicia-related) words.
2 stars
1 watching
0 forks
last commit: almost 9 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
josecannete/spanish-corpora | A collection of unannotated Spanish text data, compiled from various sources and processed for natural language processing tasks. | 92 |
bertez/gl-syllabler | Tool to split Galician words into syllables | 1 |
bertez/rima | A JavaScript library that finds rhyming words in the Galician language | 1 |
openmandrivaassociation/texlive-babel-galician | Provides Galician language support for Babel in the TeX Live typesetting system | 1 |
christos-c/bible-corpus | A multilingual parallel corpus created from translations of the Bible. | 176 |
qhungngo/evbcorpus | A large-scale bilingual corpus collection for language technology and NLP tasks, containing English-Vietnamese translations and bitexts. | 42 |
universaldependencies/ud_galician-ctg | This is a collection of annotated text data for the Galician language. | 1 |
crscardellino/sbwce | A collection of linguistic resources and trained word embeddings for the Spanish language. | 45 |
botcenter/spanishwordembeddings | This project generates Spanish word embeddings using fastText on large corpora. | 9 |
conllul/ul_galician-treegal | A collection of linguistic data and tools for processing the Galician language | 0 |
poltextlab/hunempoli_corpus | A manually annotated corpus for training and testing machine learning models for Aspect-Based Sentiment Analysis (ABSA) in the Hungarian language. | 0 |
elte-dh/regenykorpusz | A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University. | 4 |
famrashel/idn-tagged-corpus | A manually tagged Indonesian language corpus in tab-separated file format | 88 |
ukrainian-to-english-corpora/folktale_corpus | A collection of Ukrainian folktales translated into English for linguistic and literary research purposes. | 0 |
bsvino/jaiprimer | A documentation project focused on explaining Jonathan Blow's programming language Jai. | 1,811 |