lmdeploy

LLM toolkit

A toolkit for optimizing and serving large language models

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

GitHub

5k stars
38 watching
427 forks
Language: Python
Last commit: 6 days ago
Linked from 1 awesome list

codellama, cuda-kernels, deepspeed, fastertransformer, internlm, llama, llama2, llama3, llm, llm-inference, turbomind

Related projects:

Repository | Description | Stars
vllm-project/vllm | A high-performance inference and serving engine for large language models. | 30,303
internlm/internlm | Large language models for chatbot and natural language understanding applications | 6,473
sjtu-ipads/powerinfer | An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs | 7,964
lyogavin/airllm | A Python library that optimizes inference memory usage for large language models on limited GPU resources. | 5,259
alpha-vllm/llama2-accessory | An open-source toolkit for pretraining and fine-tuning large language models | 2,720
optimalscale/lmflow | A toolkit for finetuning large language models and providing efficient inference capabilities | 8,273
mit-han-lab/llm-awq | A tool for efficient and accurate weight quantization in large language models | 2,517
opengvlab/llama-adapter | An implementation of a method for fine-tuning language models to follow instructions with high efficiency and accuracy | 5,754
nomic-ai/gpt4all | An open-source Python client for running Large Language Models (LLMs) locally on any device. | 70,694
modeltc/lightllm | An LLM inference and serving framework providing a lightweight design, scalability, and high-speed performance for large language models. | 2,609
opengvlab/internvl | A pioneering open-source alternative to commercial multimodal models with a family of large-scale language and vision models. | 6,014
young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,409
eleutherai/lm-evaluation-harness | Provides a unified framework to test generative language models on various evaluation tasks. | 6,970
microsoft/deepspeed | A deep learning optimization library that makes distributed training and inference easy, efficient, and effective. | 35,463
hiyouga/llama-factory | A unified platform for fine-tuning multiple large language models with various training approaches and methods | 34,436