lightllm
LLM server
A Python-based framework for serving large language models with low latency and high scalability.
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
3k stars
22 watching
216 forks
Language: Python
Last commit: 3 months ago
Linked from 1 awesome list
Tags: deep-learning, gpt, llama, llm, model-serving, nlp, openai-triton
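As a rough illustration of what serving with LightLLM looks like from the client side, here is a minimal sketch. It assumes a LightLLM API server is already running locally (LightLLM's documentation describes launching one along the lines of `python -m lightllm.server.api_server --model_dir <path> --host 0.0.0.0 --port 8080`) and that it exposes a `/generate` endpoint accepting an `inputs`/`parameters` JSON payload; the host, port, and sampling parameters below are placeholder assumptions, not values taken from this page.

```python
# Minimal client sketch for a locally running LightLLM server.
# Assumptions: the server listens on localhost:8080 and serves a /generate
# endpoint that accepts {"inputs": ..., "parameters": {...}} JSON, as
# described in LightLLM's own documentation. Adjust host/port/parameters
# to match your deployment.
import json
import urllib.request


def generate(prompt: str, host: str = "http://localhost:8080") -> str:
    """Send a single prompt to the LightLLM server and return the raw response body."""
    payload = {
        "inputs": prompt,
        # Example sampling parameters; names follow LightLLM's generate API.
        "parameters": {"max_new_tokens": 64, "temperature": 0.7},
    }
    req = urllib.request.Request(
        f"{host}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    print(generate("What is a large language model?"))
```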
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
|  | Optimizes large language model inference on limited GPU resources | 5,446 |
|  | A comprehensive course and resource package on building and deploying Large Language Models (LLMs) | 40,053 |
|  | An open-source toolkit for pretraining and fine-tuning large language models | 2,732 |
|  | A framework for training and serving large language models using JAX/Flax | 2,428 |
|  | Provides inference code and tools for fine-tuning large language models, specifically designed for code generation tasks | 16,097 |
|  | Large pre-trained language models trained to follow complex instructions using an evolutionary instruction framework | 9,295 |
|  | A toolkit for fine-tuning and running inference with large machine learning models | 8,312 |
|  | A fast serving framework for large language models and vision language models | 6,551 |
|  | A high-performance LLM inference framework written in Rust | 4,677 |
|  | Enables LLM inference with minimal setup and high performance on various hardware platforms | 69,185 |
|  | An inference and serving engine for large language models | 31,982 |
|  | A semantic cache designed to reduce the cost and improve the speed of LLM API calls by storing responses | 7,293 |
|  | A machine learning compiler and deployment engine for large language models | 19,396 |
|  | Generates large language model outputs at high throughput on a single GPU | 9,236 |
|  | A curated list of resources to help developers navigate the landscape of large language models and their applications in NLP | 9,551 |