BentoML

AI API service

An open-source Python framework for building model inference APIs and serving AI models in production environments.

The easiest way to serve AI apps and models: build model inference APIs, job queues, LLM apps, multi-model pipelines, and more.

GitHub

7k stars

77 watching

795 forks

Language: Python

Last commit: 8 months ago

Linked from 3 awesome lists

Topics: ai-inference, deep-learning, generative-ai, inference-platform, llm, llm-inference, llm-serving, llmops, machine-learning, ml-engineering, mlops, model-inference-service, model-serving, multimodal, python

bentoml.com

Related projects:

Repository	Description	Stars
meta-llama/llama-stack	Provides pre-packaged building blocks for generative AI applications with standardized APIs and service-oriented design.	5,164
jina-ai/serve	A framework for building and deploying AI services that can be scaled from local development to production.	21,180
mindsdb/mindsdb	A platform for building AI agents that can learn from and answer questions across multiple data sources using machine learning and natural language processing.	26,915
leptonai/leptonai	An AI service building framework using Python to simplify the development of machine learning models and services.	2,669
fedml-ai/fedml	A unified and scalable machine learning library for large-scale distributed training, model serving, and federated learning.	4,205
ludwig-ai/ludwig	A low-code framework for building custom deep learning models and neural networks.	11,236
optimalscale/lmflow	A toolkit for fine-tuning large machine learning models and running inference on them.	8,312
exo-explore/exo	An experimental software framework to run AI models on diverse devices without requiring expensive GPUs.	17,369
tensorflow/serving	A high-performance serving system for machine learning models in production environments.	6,195
replicate/cog	A tool for packaging and deploying machine learning models in a standard, production-ready container environment.	8,169
ericlbuehler/mistral.rs	A high-performance LLM inference framework written in Rust.	4,677
ml-tooling/opyrator	Automates conversion of machine learning code into production-ready microservices with web API and GUI.	3,116
mockoon/mockoon	A tool to design and run mock APIs locally, allowing developers to speed up development, test applications in a controlled environment, and simulate edge cases.	6,665
ahkarami/deep-learning-in-production	A collection of notes and references on deploying deep learning models in production environments.	4,313
deepset-ai/haystack	An AI orchestration framework to build customizable LLM applications with advanced retrieval methods.	18,094