corpora
Language corpus
A collection of Galician language data in JSON format.
This is a collection of corpora of Galician (or Galicia-related) words.
2 stars
1 watching
0 forks
last commit: about 9 years ago
Linked from 1 awesome list
Related projects:
| Description | Stars |
|---|---|
| A collection of unannotated Spanish text data, compiled from various sources and processed for natural language processing tasks. | 92 |
| Tool to split Galician words into syllables | 1 |
| A JavaScript library that finds rhyming words in the Galician language | 1 |
| Provides language-specific support for the TeXLive typesetting system | 1 |
| A multilingual parallel corpus created from translations of the Bible. | 177 |
| A large-scale bilingual corpus collection for language technology and NLP tasks, containing English-Vietnamese translations and bitexts. | 42 |
| This is a collection of annotated text data for the Galician language. | 1 |
| A collection of linguistic resources and trained word embeddings for the Spanish language. | 45 |
| This project generates Spanish word embeddings using fastText on large corpora. | 9 |
| A collection of linguistic data and tools for processing the Galician language | 0 |
| A manually annotated corpus for training and testing machine learning models for Aspect-Based Sentiment Analysis (ABSA) in Hungarian. | 0 |
| A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University. | 4 |
| A manually tagged Indonesian language corpus in tab-separated file format | 88 |
| A collection of Ukrainian folktales translated into English for linguistic and literary research purposes. | 0 |
| A documentation project focused on explaining Jonathan Blow's programming language Jai. | 1,816 |