gptq
An implementation of a post-training quantization algorithm for transformer models that reduces memory usage and improves inference speed.
Code for the ICLR 2023 paper "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers".
2k stars
29 watching
156 forks
Language: Python
last commit: over 1 year ago
Linked from 1 awesome list
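For context, GPTQ quantizes the weights of each linear layer after training, using second-order information to minimize the resulting output error. The sketch below illustrates only the simpler baseline it improves on, plain round-to-nearest weight quantization with per-channel scales; the helper names `quantize_rtn` and `dequantize` are hypothetical, and this is not the GPTQ algorithm from the repository.

```python
import numpy as np

def quantize_rtn(weights: np.ndarray, bits: int = 4) -> tuple[np.ndarray, np.ndarray]:
    """Round-to-nearest quantization with a per-output-channel scale.

    Hypothetical helper for illustration only; GPTQ itself additionally
    uses second-order (Hessian) information to correct rounding errors.
    """
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for signed 4-bit
    scale = np.abs(weights).max(axis=1, keepdims=True) / qmax
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Map integer codes back to approximate float weights."""
    return q.astype(np.float32) * scale

# Quantize a random "layer" and check the reconstruction error.
w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_rtn(w, bits=4)
err = np.abs(w - dequantize(q, s)).mean()
print(f"mean absolute quantization error: {err:.4f}")
```

Storing the `int8` codes plus one scale per row is what yields the memory savings; GPTQ's contribution is keeping accuracy high at 3 to 4 bits, where round-to-nearest alone degrades noticeably.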
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Research tool for training large transformer language models at scale | 1,926 |
| | A collection of tools and scripts for training large transformer language models at scale | 1,342 |
| | A Rust implementation of a minimal Generative Pretrained Transformer architecture | 845 |
| | An implementation of a method for compressing large language models using additive quantization and fine-tuning | 1,184 |
| | A software framework for accurately quantizing large language models using a novel technique | 739 |
| | A generative transformer model for processing and generating text in vertical domains such as computer science and finance | 217 |
| | A PyTorch-based simulator for quantum machine learning | 45 |
| | Improving Object Detection from Scratch via Gated Feature Reuse | 65 |
| | Tools and techniques for optimizing large language models across various frameworks and hardware platforms | 2,257 |
| | An implementation of deep learning transformer models in MATLAB | 209 |
| | Code and a model for improving language understanding through generative pre-training with a transformer architecture | 2,167 |
| | Training and deploying large language models for computer vision tasks using region-of-interest inputs | 517 |
| | Trains PyTorch models with a genetic algorithm | 96 |
| | An attention-based sequence-to-sequence learning framework | 389 |
| | A deep learning library with an easy-to-use interface for quantizing neural networks and accelerating their inference on various hardware platforms | 541 |