ktransformers

LLM optimizer

A flexible framework for LLM inference optimizations with support for multiple models and architectures

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

GitHub

736 stars
15 watching
38 forks
Language: Python
Last commit: 7 days ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
lge-arc-advancedai/auptimizer | Automates model building and deployment process by optimizing hyperparameters and compressing models for edge computing. | 200
tensorzero/tensorzero | A tool that creates a feedback loop to optimize large language models by integrating model gateways and providing data analytics and machine learning capabilities. | 569
alibaba/conv-llava | This project presents an optimization technique for large-scale image models to reduce computational requirements while maintaining performance. | 104
brml/climin | A framework for optimizing machine learning functions using gradient-based optimization methods. | 180
google-deepmind/kfac-jax | Library providing an implementation of the K-FAC optimizer and curvature estimator for second-order optimization in neural networks. | 248
vaibkumr/prompt-optimizer | A tool to reduce the complexity of text prompts to minimize API costs and model computations. | 241
qcri/llmebench | A benchmarking framework for large language models. | 80
lyogavin/anima | An optimization technique for large language models allowing them to run on limited hardware resources without significant performance loss. | 6
ai-hypercomputer/maxtext | A high-performance LLM written in Python/JAX for training and inference on Google Cloud TPUs and GPUs. | 1,529
davisyoshida/lorax | A JAX transform that simplifies the training of large language models by reducing memory usage through low-rank adaptation. | 132
q-optimize/c3 | Toolset for optimizing and calibrating physical systems using machine learning and quantum computing. | 67
pcg-mlp/ksanallm | An LLM inference and serving engine with high performance, flexibility, and support for various hardware platforms. | 288
deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models. | 1,006
google/jaxopt | An open-source project providing hardware accelerated, batchable and differentiable optimizers in JAX for deep learning. | 933
kendryte/toucan-llm | A large language model with 70 billion parameters designed for chatbot and conversational AI tasks. | 29