ktransformers
LLM optimizer
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations, with support for multiple models and architectures
736 stars
15 watching
38 forks
Language: Python
last commit: 7 days ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| lge-arc-advancedai/auptimizer | Automates the model building and deployment process by optimizing hyperparameters and compressing models for edge computing. | 200 |
| tensorzero/tensorzero | A tool that creates a feedback loop to optimize large language models by integrating model gateways and providing data analytics and machine learning capabilities. | 569 |
| alibaba/conv-llava | An optimization technique for large-scale image models that reduces computational requirements while maintaining performance. | 104 |
| brml/climin | A framework for optimizing machine learning functions using gradient-based optimization methods. | 180 |
| google-deepmind/kfac-jax | A library providing an implementation of the K-FAC optimizer and curvature estimator for second-order optimization in neural networks. | 248 |
| vaibkumr/prompt-optimizer | A tool that reduces the complexity of text prompts to minimize API costs and model computation. | 241 |
| qcri/llmebench | A benchmarking framework for large language models. | 80 |
| lyogavin/anima | An optimization technique that allows large language models to run on limited hardware without significant performance loss. | 6 |
| ai-hypercomputer/maxtext | A high-performance LLM implementation written in Python/JAX for training and inference on Google Cloud TPUs and GPUs. | 1,529 |
| davisyoshida/lorax | A JAX transform that simplifies training of large language models by reducing memory usage through low-rank adaptation. | 132 |
| q-optimize/c3 | A toolset for optimizing and calibrating physical systems using machine learning and quantum computing. | 67 |
| pcg-mlp/ksanallm | An LLM inference and serving engine with high performance, flexibility, and support for various hardware platforms. | 288 |
| deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models. | 1,006 |
| google/jaxopt | An open-source project providing hardware-accelerated, batchable, and differentiable optimizers in JAX for deep learning. | 933 |
| kendryte/toucan-llm | A large language model with 70 billion parameters designed for chatbot and conversational AI tasks. | 29 |