BentoML

AI API service

An open-source Python framework for building model inference APIs and serving AI models in production environments.

The easiest way to serve AI apps and models: build model inference APIs, job queues, LLM apps, multi-model pipelines, and more.

GitHub

7k stars
77 watching
792 forks
Language: Python
last commit: 6 days ago
Linked from 3 awesome lists

Topics: ai-inference, deep-learning, generative-ai, inference-platform, llm, llm-inference, llm-serving, llmops, machine-learning, ml-engineering, mlops, model-inference-service, model-serving, multimodal, python

Related projects:

Repository | Description | Stars
meta-llama/llama-stack | Provides a set of standardized APIs and tools to build generative AI applications | 4,591
jina-ai/serve | A framework for building and deploying AI services that can be scaled from local development to production | 21,129
mindsdb/mindsdb | An AI platform for building agents that can learn and answer questions over federated data from various sources | 26,793
leptonai/leptonai | An AI service building framework using Python to simplify the development of machine learning models and services | 2,653
fedml-ai/fedml | A unified and scalable machine learning library for large-scale distributed training, model serving, and federated learning | 4,187
ludwig-ai/ludwig | A low-code framework for building custom deep learning models and neural networks | 11,189
optimalscale/lmflow | A toolkit for finetuning large language models and providing efficient inference capabilities | 8,273
exo-explore/exo | Allows developers to run AI models on personal devices with diverse hardware configurations | 14,829
tensorflow/serving | A high-performance serving system for machine learning models in production environments | 6,185
replicate/cog | A tool for packaging and deploying machine learning models in a standard, production-ready container environment | 8,081
ericlbuehler/mistral.rs | A fast and flexible LLM inference platform supporting various models and devices | 4,466
ml-tooling/opyrator | Automates conversion of machine learning code into production-ready microservices with web API and GUI | 3,102
mockoon/mockoon | A tool to design and run mock APIs locally, allowing developers to speed up development, test applications in a controlled environment, and simulate edge cases | 6,558
ahkarami/deep-learning-in-production | A collection of notes and references on deploying deep learning models in production environments | 4,306
deepset-ai/haystack | An AI orchestration framework to build customizable LLM applications with advanced retrieval methods | 17,691