korean-parallel-corpora

Korean text corpus

A collection of parallel Korean texts for natural language processing and machine learning research.

Korean Parallel Corpus

12 stars
3 watching
3 forks
last commit: almost 10 years ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
famrashel/idn-tagged-corpus | A manually tagged Indonesian language corpus in tab-separated file format | 88
poltextlab/hunempoli_corpus | A manually annotated corpus for training and testing machine learning models of Aspect Based Sentiment Analysis (ABSA) in Hungarian language. | 0
famrashel/idn-treebank | A manually tagged Indonesian corpus consisting of parse-trees from sentences. | 36
christos-c/bible-corpus | A multilingual parallel corpus created from translations of the Bible. | 176
matbahasa/talpco | A parallel corpus of Asian languages with linguistic annotations and data formats for natural language processing research. | 49
kmkurn/id-nlp-resource | A collection of annotated NLP resources for the Indonesian language | 279
dahlia/seonbi | A tool for transforming Korean text into standardized forms | 131
sublee/hangulize | Automates conversion of words from non-Korean languages to Korean using standardized rules. | 213
bertez/corpora | A collection of Galician language data in JSON format. | 2
ukrainian-to-english-corpora/folktale_corpus | A collection of Ukrainian folktales translated into English for linguistic and literary research purposes. | 0
open-korean-text/elasticsearch-analysis-openkoreantext | An Elasticsearch analyzer plugin for analyzing Korean text using the Open-Korean Text module. | 127
konlpy/konlpy | A Python package providing tools and libraries for processing Korean text data | 1,422
crownpku/small-chinese-corpus | A collection of datasets and tools for NLP tasks on Chinese texts, including part-of-speech tagging, named entity recognition, and question answering. | 531
nlpai-lab/kullm | Korea University Large Language Model developed by researchers at Korea University and HIAI Research Institute. | 569
4np/npokitresources | A curated list of additional resources for programs broadcasted by the Dutch Public Broadcaster. | 2