llm-awq

LLM Quantizer

An open-source software project that enables efficient and accurate low-bit weight quantization for large language models.

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
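
To make "low-bit weight quantization" concrete, here is a minimal sketch of plain group-wise round-to-nearest quantization to 4 bits. It is not the AWQ algorithm and does not use the llm-awq API: it omits the activation-aware step named in the paper title, and every name in it (`quantize_group`, `dequantize_group`, `quantize`, `group_size`, `n_bits`) is hypothetical and chosen for illustration only.

```python
def quantize_group(weights, n_bits=4):
    """Round-to-nearest quantization of one group of floats to n_bits codes."""
    q_max = 2 ** n_bits - 1                      # largest code, 15 for 4 bits
    w_min, w_max = min(weights), max(weights)
    # A constant group has zero range; fall back to 1.0 to avoid dividing by zero.
    scale = (w_max - w_min) / q_max or 1.0
    # Map each weight to an integer code in [0, q_max].
    codes = [min(q_max, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes, scale, w_min):
    """Recover approximate float weights from the integer codes."""
    return [c * scale + w_min for c in codes]


def quantize(weights, group_size=4, n_bits=4):
    """Quantize a flat weight list in independent groups of group_size."""
    return [
        quantize_group(weights[start:start + group_size], n_bits)
        for start in range(0, len(weights), group_size)
    ]


weights = [0.1, -0.4, 0.25, 0.9, -1.2, 0.05, 0.6, -0.3]
groups = quantize(weights, group_size=4, n_bits=4)
restored = [w for g in groups for w in dequantize_group(*g)]
```

Each group stores its own `scale` and `w_min`, so an outlier only degrades the precision of its own group. The reconstruction error per weight is at most half a quantization step of that group.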

GitHub

3k stars
25 watching
212 forks
Language: Python
Last commit: about 1 month ago
Linked from 1 awesome list



Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| lyogavin/airllm | Optimizes large language model inference on limited GPU resources | 5,446 |
| vllm-project/vllm | An inference and serving engine for large language models | 31,982 |
| optimalscale/lmflow | A toolkit for fine-tuning and inferring large machine learning models | 8,312 |
| autogptq/autogptq | A package for optimizing large language models for efficient inference on GPUs and other hardware platforms | 4,560 |
| opengvlab/llama-adapter | An implementation of a method for fine-tuning language models to follow instructions with high efficiency and accuracy | 5,775 |
| alpha-vllm/llama2-accessory | An open-source toolkit for pretraining and fine-tuning large language models | 2,732 |
| intel/neural-compressor | Tools and techniques for optimizing large language models on various frameworks and hardware platforms | 2,257 |
| internlm/lmdeploy | A toolkit for optimizing and serving large language models | 4,854 |
| linkedin/liger-kernel | A collection of optimized kernels and post-training loss functions for large language models | 3,840 |
| modeltc/lightllm | A Python-based framework for serving large language models with low latency and high scalability | 2,691 |
| microsoft/lmops | A research initiative focused on developing fundamental technology to improve the performance and efficiency of large language models | 3,747 |
| haotian-liu/llava | A system that uses large language and vision models to generate and process visual instructions | 20,683 |
| ggerganov/llama.cpp | Enables LLM inference with minimal setup and high performance on various hardware platforms | 69,185 |
| mlc-ai/mlc-llm | A machine learning compiler and deployment engine for large language models | 19,396 |
| nomic-ai/gpt4all | An open-source Python client for running Large Language Models (LLMs) locally on any device | 71,176 |