sailor-llm

SE Asian model

Develops language models tailored for South-East Asia's linguistic diversity and cultural nuances

[EMNLP-2024] ⚓️ Sailor: Open Language Models for South-East Asia

120 stars
8 watching
9 forks
Language: Python
Last commit: about 2 months ago
Tags: indonesia, language-model, lao, malay, sea, thai, vietnam

Related projects:

Repository | Description | Stars
seallms/seallms | Large language models designed to process languages commonly used in Southeast Asia | 8
langboat/mengzi3 | An 8B and 13B language model based on the Llama architecture with multilingual capabilities. | 2,031
ibm-granite/granite-3.0-language-models | A collection of lightweight state-of-the-art language models designed to support multilinguality, coding, and reasoning tasks on constrained resources. | 232
bilibili/index-1.9b | A lightweight, multilingual language model with a long context length | 920
elanmart/psmm | An implementation of a neural network model for character-level language modeling. | 50
vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation | 1,789
xverse-ai/xverse-7b | A multilingual large language model developed by XVERSE Technology Inc. | 50
orionstarai/orion | A family of large language models designed to handle multilingual text and provide strong performance in various tasks such as chat, long context, and retrieval augmented generation. | 789
yunwentechnology/unilm | This project provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese. | 439
csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 601
nanbeige/nanbeige | Develops large language models for text understanding and generation tasks. | 85
academic-hammer/hammerllm | A large language model pre-trained on Chinese and English data, suitable for natural language processing tasks. | 43
ieit-yuan/yuan2.0-m32 | A high-performance language model designed to excel in tasks like natural language understanding, mathematical computation, and code generation | 182
eleutherai/polyglot | Large language models designed to perform well in multiple languages and address performance issues with current multilingual models. | 476
apache/opennlp-models | Provides pre-trained binary models for natural language text processing across multiple languages | 4