ktransformers
LLM optimizer
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations, with support for multiple models and architectures
736 stars
15 watching
38 forks
Language: Python
last commit: 7 days ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| lge-arc-advancedai/auptimizer | Automates the model building and deployment process by optimizing hyperparameters and compressing models for edge computing. | 200 |
| tensorzero/tensorzero | A tool that creates a feedback loop to optimize large language models by integrating model gateways and providing data analytics and machine learning capabilities. | 569 |
| alibaba/conv-llava | An optimization technique for large-scale image models that reduces computational requirements while maintaining performance. | 104 |
| brml/climin | A framework for optimizing machine learning functions using gradient-based optimization methods. | 180 |
| google-deepmind/kfac-jax | A library providing an implementation of the K-FAC optimizer and curvature estimator for second-order optimization in neural networks. | 248 |
| vaibkumr/prompt-optimizer | A tool that reduces the complexity of text prompts to minimize API costs and model computation. | 241 |
| qcri/llmebench | A benchmarking framework for large language models. | 80 |
| lyogavin/anima | An optimization technique that allows large language models to run on limited hardware without significant performance loss. | 6 |
| ai-hypercomputer/maxtext | A high-performance LLM implementation written in Python/JAX for training and inference on Google Cloud TPUs and GPUs. | 1,529 |
| davisyoshida/lorax | A JAX transform that simplifies training of large language models by reducing memory usage through low-rank adaptation. | 132 |
| q-optimize/c3 | A toolset for optimizing and calibrating physical systems using machine learning and quantum computing. | 67 |
| pcg-mlp/ksanallm | An LLM inference and serving engine with high performance, flexibility, and support for various hardware platforms. | 288 |
| deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models. | 1,006 |
| google/jaxopt | An open-source project providing hardware-accelerated, batchable, and differentiable optimizers in JAX for deep learning. | 933 |
| kendryte/toucan-llm | A large language model with 70 billion parameters designed for chatbot and conversational AI tasks. | 29 |