llm-awq
LLM Quantizer
An open-source software project that enables efficient and accurate low-bit weight quantization for large language models.
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
3k stars
25 watching
212 forks
Language: Python
Last commit: about 1 month ago
Linked from 1 awesome list
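As context for the project description above, here is a minimal, hypothetical sketch of the activation-aware idea: salient weight channels (identified by activation magnitude) are scaled up before low-bit group quantization, which shrinks their relative quantization error. The function names, tensor shapes, and the fake-quantization round trip below are illustrative assumptions, not the llm-awq API.

```python
import torch

def pseudo_quantize(w: torch.Tensor, n_bits: int = 4, group_size: int = 128) -> torch.Tensor:
    """Round-trip (quantize, then dequantize) per-group asymmetric uniform quantization."""
    out_features, in_features = w.shape
    assert in_features % group_size == 0, "sketch assumes in_features divisible by group_size"
    w = w.reshape(out_features, in_features // group_size, group_size)
    w_max = w.amax(dim=-1, keepdim=True)
    w_min = w.amin(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-5) / (2 ** n_bits - 1)
    zero = (-w_min / scale).round()
    w_q = (torch.clamp((w / scale).round() + zero, 0, 2 ** n_bits - 1) - zero) * scale
    return w_q.reshape(out_features, in_features)

def awq_style_fake_quant(w: torch.Tensor, act: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Activation-aware quantization, simulated: scale salient input channels
    (judged by mean |activation|) up before quantizing, then divide the scale
    back out, so the layer is unchanged up to (reduced) quantization error."""
    s = act.abs().mean(dim=0).pow(alpha).clamp(min=1e-5)  # per-input-channel saliency
    return pseudo_quantize(w * s) / s
```

In AWQ itself, the scaling exponent is searched over a small grid using calibration data and the scales are fused into the preceding operator for deployment; the sketch above only simulates the numerical effect of the scaling.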
Related projects:
| Repository | Description | Stars |
|---|---|---|
| lyogavin/airllm | Optimizes large language model inference on limited GPU resources | 5,446 |
| vllm-project/vllm | An inference and serving engine for large language models | 31,982 |
| optimalscale/lmflow | A toolkit for fine-tuning and inference of large machine learning models | 8,312 |
| autogptq/autogptq | A package for optimizing large language models for efficient inference on GPUs and other hardware platforms | 4,560 |
| opengvlab/llama-adapter | An implementation of a method for fine-tuning language models to follow instructions with high efficiency and accuracy | 5,775 |
| alpha-vllm/llama2-accessory | An open-source toolkit for pretraining and fine-tuning large language models | 2,732 |
| intel/neural-compressor | Tools and techniques for optimizing large language models across various frameworks and hardware platforms | 2,257 |
| internlm/lmdeploy | A toolkit for optimizing and serving large language models | 4,854 |
| linkedin/liger-kernel | A collection of optimized kernels and post-training loss functions for large language models | 3,840 |
| modeltc/lightllm | A Python-based framework for serving large language models with low latency and high scalability | 2,691 |
| microsoft/lmops | A research initiative focused on developing fundamental technology to improve the performance and efficiency of large language models | 3,747 |
| haotian-liu/llava | A system that uses large language and vision models to generate and process visual instructions | 20,683 |
| ggerganov/llama.cpp | Enables LLM inference with minimal setup and high performance on various hardware platforms | 69,185 |
| mlc-ai/mlc-llm | A machine learning compiler and deployment engine for large language models | 19,396 |
| nomic-ai/gpt4all | An open-source Python client for running large language models (LLMs) locally on any device | 71,176 |