XVERSE-65B
Language Model
XVERSE-65B: a multilingual large language model developed by XVERSE Technology Inc., built on a transformer architecture and fine-tuned on diverse datasets for a range of applications.
132 stars
5 watching
15 forks
Language: Python
Last commit: 11 months ago
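As a minimal usage sketch (assuming the weights are published on Hugging Face under the `xverse/XVERSE-65B` identifier and load through the standard `transformers` AutoModel API; check the official model card for exact usage):

```python
# Minimal sketch: loading XVERSE-65B via Hugging Face transformers.
# The repository id "xverse/XVERSE-65B" and the trust_remote_code flag
# are assumptions; consult the official model card before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "xverse/XVERSE-65B"  # assumed Hugging Face repository id
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # 65B weights are large; half precision reduces memory
    device_map="auto",           # spread layers across available GPUs
    trust_remote_code=True,
)

inputs = tokenizer(
    "Give a short introduction to large language models.",
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```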
Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| | A large language model developed to support multiple languages and applications | 648 |
| | Develops and publishes large multilingual language models with a mixture-of-experts architecture | 37 |
| | A multilingual large language model developed by XVERSE Technology Inc. | 50 |
| | A multilingual large language model from XVERSE Technology Inc. with a mixture-of-experts architecture, fine-tuned for tasks such as conversation, question answering, and natural language understanding | 36 |
| | A large multimodal model for visual question answering, trained on 2.1B image-text pairs and 8.2M instruction sequences | 78 |
| | A high-performance language model designed to excel at natural language understanding, mathematical computation, and code generation | 182 |
| | 8B and 13B language models based on the Llama architecture with multilingual capabilities | 2,031 |
| | An open-source chat model built on top of the 52B large language model, with improvements to position encoding, activation function, and layer normalization | 40 |
| | A large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing | 591 |
| | Pre-trained models and code for understanding and generation tasks in multiple languages | 89 |
| | A collection of lightweight state-of-the-art language models designed to support multilinguality, coding, and reasoning on constrained resources | 232 |
| | An upgraded version of SimBERT with integrated retrieval and generation capabilities | 441 |
| | Code and a model for improving language understanding through generative pre-training with a transformer-based architecture | 2,167 |
| | Pre-trained language models derived from Wikipedia texts for natural language processing tasks | 34 |
| | A large Chinese language model trained on massive data, released as a pre-trained model for downstream tasks | 230 |