sglang

Server framework

SGLang is a fast serving framework for large language models and vision language models.

GitHub

7k stars
60 watching
579 forks
Language: Python
last commit: about 23 hours ago
Linked from 1 awesome list

Topics: cuda, inference, llama, llama2, llama3, llama3-1, llava, llm, llm-serving, moe, pytorch, transformer, vlm

Related projects:

Repository | Description | Stars
fminference/flexllmgen | Generates large language model outputs in high-throughput mode on single GPUs | 9,236
modeltc/lightllm | A Python-based framework for serving large language models with low latency and high scalability | 2,691
haotian-liu/llava | A system that uses large language and vision models to generate and process visual instructions | 20,683
young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,428
alpha-vllm/llama2-accessory | An open-source toolkit for pretraining and fine-tuning large language models | 2,732
qwenlm/qwen2-vl | A multimodal large language model series developed by the Qwen team to understand and process images, videos, and text | 3,613
opengvlab/llama-adapter | An implementation of a method for fine-tuning language models to follow instructions with high efficiency and accuracy | 5,775
qwenlm/qwen-vl | A large vision language model with improved image reasoning and text recognition capabilities, suitable for various multimodal tasks | 5,179
scisharp/llamasharp | An efficient C#/.NET library for running Large Language Models (LLMs) on local devices | 2,750
qwenlm/qwen2.5 | A large language model series with various sizes and variants for text generation and understanding | 10,959
sjtu-ipads/powerinfer | An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs | 8,011
shishirpatil/gorilla | Enables large language models to interact with external APIs using natural language queries | 11,564
optimalscale/lmflow | A toolkit for fine-tuning and inferring large machine learning models | 8,312
vllm-project/vllm | An inference and serving engine for large language models | 31,982
pku-yuangroup/video-llava | A deep learning framework for generating videos from text inputs and visual features | 3,071