indosum
Indonesian Summarization Benchmark
Provides a benchmark dataset and tools for training text summarization models in the Indonesian language.
76 stars
7 watching
15 forks
Language: Python
Last commit: over 5 years ago
Linked from 1 awesome list
Topics: indonesian, indonesian-language, natural-language-processing, text-summarization
Related projects:
Repository | Description | Stars |
---|---|---|
indonlp/indonlu | A comprehensive collection of natural language understanding resources and pre-trained models for the Indonesian language. | 556 |
andriawan/andkamus | A CLI-based dictionary application written in C++ that maps Indonesian words to English words. | 6 |
ivoputzer/cli-args-parser-kata | Implementing a CLI arguments parser to process input in various formats | 5 |
ariya/tebakmasa | A tool to parse Indonesian date and time descriptions into Unix epoch timestamps. | 10 |
kangfend/bahasa | A natural language processing toolkit for the Indonesian language. | 19 |
matbahasa/talpco | A parallel corpus of Asian languages with linguistic annotations and data formats for natural language processing research. | 49 |
azishapidin/indoregion | A Laravel package providing geographical data of Indonesia's administrative regions | 249 |
lantip/baku-tidak-baku | A repository of linguistic data for Indonesian words categorized as either standard or non-standard | 29 |
galuhsahid/indonesian-word-embedding | Demonstrates word embeddings for the Indonesian language using pre-trained Word2vec models | 20 |
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117 |
damoebius/haxebench | A benchmarking project comparing the performance of different programming languages and their compiled outputs in various formats. | 52 |
har07/pysastrawi | A Python port of an Indonesian stemmer library, reducing inflected words to their base form. | 336 |
pythainlp/prachathai-67k | An article classification dataset created from news articles scraped from Prachathai.com with multiple benchmark models for multi-label classification | 16 |
sastrawi/nlp-bahasa-indonesia | A collection of NLP papers and resources for Bahasa Indonesia, including tools and software for text processing tasks such as summarization, parsing, part-of-speech tagging, stemming, and word sense disambiguation. | 186 |
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multimodal dataset. | 87 |