gptq
Quantizer
An implementation of a post-training quantization algorithm for transformer models that reduces memory usage and improves inference speed
Code for the ICLR 2023 paper "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers".
2k stars
29 watching
156 forks
Language: Python
last commit: 11 months ago
Linked from 1 awesome list
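For context on what the paper's method does, the sketch below illustrates the core idea behind GPTQ-style post-training quantization: weights are quantized one column at a time, and each column's quantization error is spread over the not-yet-quantized columns using the inverse Hessian of the layer inputs. This is a simplified illustration, not the repository's implementation; the function name `gptq_like_quantize` and the `bits`/`damp` parameters are invented for this example.

```python
import torch

def gptq_like_quantize(W, X, bits=4, damp=0.01):
    """Quantize W (out_features x in_features) column by column, compensating
    each column's quantization error on the remaining columns via the inverse
    Hessian H^-1, where H = X X^T is built from calibration inputs X
    (in_features x n_samples). Simplified sketch, not the repository's code."""
    W = W.clone().float()
    cols = W.shape[1]

    # Layer-wise Hessian of the squared output error, damped so it stays invertible.
    H = X @ X.T
    H += damp * torch.diag(H).mean() * torch.eye(cols)
    Hinv = torch.linalg.inv(H)

    # Symmetric per-row quantization grid.
    maxq = 2 ** (bits - 1) - 1
    scale = (W.abs().max(dim=1, keepdim=True).values / maxq).clamp_min(1e-8).squeeze(1)

    for j in range(cols):
        w = W[:, j]
        q = scale * torch.clamp(torch.round(w / scale), -maxq, maxq)
        # Spread this column's quantization error onto the not-yet-quantized columns.
        err = (w - q) / Hinv[j, j]
        W[:, j:] -= err.unsqueeze(1) * Hinv[j, j:].unsqueeze(0)
        W[:, j] = q
    return W

# Toy usage: quantize a random layer to 4 bits with random calibration inputs.
W = torch.randn(64, 128)          # weight matrix (out_features x in_features)
X = torch.randn(128, 256)         # calibration activations (in_features x n_samples)
Wq = gptq_like_quantize(W, X, bits=4)
print((W - Wq).pow(2).mean())     # residual error after compensation
```

The actual implementation described in the paper additionally uses lazy batched column updates and a Cholesky formulation of the inverse Hessian for speed and numerical stability.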
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| Research tool for training large transformer language models at scale | 1,926 |
| A collection of tools and scripts for training large transformer language models at scale | 1,342 |
| A Rust implementation of a minimal Generative Pretrained Transformer architecture. | 845 |
| An implementation of a method to compress large language models using additive quantization and fine-tuning. | 1,184 |
| A software framework for accurately quantizing large language models using a novel technique | 739 |
| A generative transformer model designed to process and generate text in various vertical domains, including computer science, finance, and more. | 217 |
| A PyTorch-based simulator for quantum machine learning | 45 |
| Improving Object Detection from Scratch via Gated Feature Reuse | 65 |
| Tools and techniques for optimizing large language models on various frameworks and hardware platforms. | 2,257 |
| An implementation of deep learning transformer models in MATLAB | 209 |
| Code and a model for improving language understanding through generative pre-training with a transformer-based architecture. | 2,167 |
| Training and deploying large language models on computer vision tasks using region-of-interest inputs | 517 |
| Trains PyTorch models using a genetic algorithm | 96 |
| An attention-based sequence-to-sequence learning framework | 389 |
| A deep learning library that provides an easy-to-use interface for quantizing neural networks and accelerating their inference on various hardware platforms. | 541 |