lmdeploy
LLM toolkit
A toolkit for optimizing and serving large language models
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
5k stars
39 watching
439 forks
Language: Python
last commit: about 1 month ago
Linked from 1 awesome list
codellama, cuda-kernels, deepspeed, fastertransformer, internlm, llama, llama2, llama3, llm, llm-inference, turbomind
Related projects:
Repository | Description | Stars |
---|---|---|
vllm-project/vllm | An inference and serving engine for large language models | 31,982 |
internlm/internlm | A collection of large language models designed to improve reasoning and tool use capabilities in chatbots. | 6,572 |
sjtu-ipads/powerinfer | An efficient large language model inference engine that leverages consumer-grade GPUs on PCs | 8,011 |
lyogavin/airllm | Optimizes large language model inference on limited GPU resources | 5,446 |
alpha-vllm/llama2-accessory | An open-source toolkit for pretraining and fine-tuning large language models | 2,732 |
optimalscale/lmflow | A toolkit for fine-tuning large machine learning models and running inference with them | 8,312 |
mit-han-lab/llm-awq | An open-source software project that enables efficient and accurate low-bit weight quantization for large language models. | 2,593 |
opengvlab/llama-adapter | An implementation of a method for fine-tuning language models to follow instructions with high efficiency and accuracy | 5,775 |
nomic-ai/gpt4all | An open-source Python client for running Large Language Models (LLMs) locally on any device. | 71,176 |
modeltc/lightllm | A Python-based framework for serving large language models with low latency and high scalability. | 2,691 |
opengvlab/internvl | Develops large language models capable of processing multiple data types and modalities | 6,394 |
young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,428 |
eleutherai/lm-evaluation-harness | Provides a unified framework to test generative language models on various evaluation tasks. | 7,200 |
microsoft/deepspeed | A deep learning optimization library that simplifies distributed training and inference on modern computing hardware. | 35,863 |
hiyouga/llama-factory | A tool for efficiently fine-tuning large language models across multiple architectures and methods. | 36,219 |