idn-tagged-corpus

Indonesian Corpus

A manually tagged Indonesian language corpus in tab-separated file format
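The description says only that the corpus is tab-separated; the column layout is not given here. As a minimal sketch, assuming one token per line followed by a tab and its tag, with a blank line between sentences, such a file could be read as shown below. The function name, the sample text, and the tag labels are illustrative placeholders, not taken from the corpus.

```python
import csv
import io

# Hypothetical sample in the ASSUMED layout: token<TAB>tag, blank line between sentences.
SAMPLE = "Saya\tPRP\nmakan\tVB\nnasi\tNN\n\nDia\tPRP\ntidur\tVB\n"


def read_tagged_sentences(stream):
    """Yield each sentence as a list of (token, tag) pairs from a TSV stream."""
    sentence = []
    # QUOTE_NONE keeps quotation marks in tokens as literal characters.
    for row in csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row or not row[0].strip():
            # A blank line closes the current sentence (assumption).
            if sentence:
                yield sentence
                sentence = []
            continue
        if len(row) < 2:
            # Skip lines without a tag column, e.g. markup or comments.
            continue
        sentence.append((row[0], row[1]))
    if sentence:
        yield sentence


sentences = list(read_tagged_sentences(io.StringIO(SAMPLE)))
```

For a real file, pass an open file handle instead of `io.StringIO`, for example `open(path, encoding="utf-8", newline="")`, and adjust the column handling once the actual layout is confirmed.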

Indonesian Manually Tagged Corpus

88 stars

7 watching

26 forks

last commit: about 4 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

keon/awesome-nlp

Related projects:

Repository	Description	Stars
famrashel/idn-treebank	A manually tagged Indonesian corpus consisting of parse-trees from sentences.	36
louisowen6/nlp_bahasa_resources	A curated collection of NLP datasets and resources for Bahasa Indonesia	496
poltextlab/hunempoli_corpus	A manually annotated corpus for training and testing machine learning models for Aspect-Based Sentiment Analysis (ABSA) in Hungarian.	0
lantip/baku-tidak-baku	A repository of linguistic data for Indonesian words categorized as either standard or non-standard	29
j-min/korean-parallel-corpora	A collection of parallel Korean texts used for language processing and machine learning research	12
kmkurn/id-nlp-resource	A collection of annotated NLP resources for the Indonesian language	279
ukrainian-to-english-corpora/folktale_corpus	A collection of Ukrainian folktales translated into English for linguistic and literary research purposes.	0
bertez/corpora	A collection of Galician language data in JSON format.	2
elte-dh/regenykorpusz	A large corpus of Hungarian novels with annotated texts and metadata, developed by the Department of Digital Humanities at Eötvös Loránd University.	4
galuhsahid/indonesian-word-embedding	Demonstrates word embeddings for Indonesian using pre-trained Word2vec models	20
ans-4175/peta-indonesia-geojson	Creates an Indonesia map with province codes	73
wisn/jargon-pemrograman-fungsional	Provides a glossary of terms and explanations for functional programming concepts in a simple and accessible way.	70
igobronidze/hrs_training_data	Training data for a handwritten recognition system	21
nytud/panmorph	Harmonized tagset and annotation scheme for Hungarian morphological analysers	4
atik-05/bangla_datasets_absa	A collection of pre-processed Bangla-language datasets for natural language processing tasks	0