idn-tagged-corpus
Indonesian Corpus
A manually tagged Indonesian language corpus in tab-separated file format
Indonesian Manually Tagged Corpus
88 stars
7 watching
26 forks
last commit: over 2 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A manually tagged Indonesian corpus consisting of parse-trees from sentences. | 36 |
| | A curated collection of NLP datasets and resources for Bahasa Indonesia | 496 |
| | A manually annotated corpus for training and testing machine learning models of Aspect Based Sentiment Analysis (ABSA) in Hungarian language. | 0 |
| | A repository of linguistic data for Indonesian words categorized as either standard or non-standard | 29 |
| | A collection of parallel Korean texts used for language processing and machine learning research | 12 |
| | A collection of annotated NLP resources for the Indonesian language | 279 |
| | A collection of Ukrainian folktales translated into English for linguistic and literary research purposes. | 0 |
| | A collection of Galician language data in JSON format. | 2 |
| | A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University. | 4 |
| | Demonstrates word embedding in Indonesian language using pre-trained Word2vec models | 20 |
| | Creates an Indonesia map with province codes | 73 |
| | Provides a glossary of terms and explanations for functional programming concepts in a simple and accessible way. | 70 |
| | Training data for a handwritten recognition system | 21 |
| | Harmonized tagset and annotation scheme for Hungarian morphological analysers | 4 |
| | A collection of pre-processed datasets in Bangla language for natural language processing tasks | 0 |