corpora
Language corpus
A collection of Galician language data in JSON format.
A collection of corpora of Galician words, or of words related to Galicia.
2 stars
1 watching
0 forks
last commit: almost 10 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A collection of unannotated Spanish text data, compiled from various sources and processed for natural language processing tasks. | 92 |
| | Tool to split Galician words into syllables | 1 |
| | A JavaScript library that finds rhyming words in the Galician language | 1 |
| | Provides language-specific support for the TeXLive typesetting system | 1 |
| | A multilingual parallel corpus created from translations of the Bible. | 177 |
| | A large-scale bilingual corpus collection for language technology and NLP tasks, containing English-Vietnamese translations and bitexts. | 42 |
| | This is a collection of annotated text data for the Galician language. | 1 |
| | A collection of linguistic resources and trained word embeddings for the Spanish language. | 45 |
| | This project generates Spanish word embeddings using fastText on large corpora. | 9 |
| | A collection of linguistic data and tools for processing the Galician language | 0 |
| | A manually annotated Hungarian-language corpus for training and testing machine learning models for Aspect-Based Sentiment Analysis (ABSA). | 0 |
| | A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University. | 4 |
| | A manually tagged Indonesian language corpus in tab-separated file format | 88 |
| | A collection of Ukrainian folktales translated into English for linguistic and literary research purposes. | 0 |
| | A documentation project focused on explaining Jonathan Blow's programming language Jai. | 1,816 |