BentoML
AI API service
An open-source Python framework for building model inference APIs and serving AI models in production environments.
The easiest way to serve AI apps and models: build model inference APIs, job queues, LLM apps, multi-model pipelines, and more.
7k stars
77 watching
792 forks
Language: Python
Last commit: 6 days ago
Linked from 3 awesome lists
ai-inference, deep-learning, generative-ai, inference-platform, llm, llm-inference, llm-serving, llmops, machine-learning, ml-engineering, mlops, model-inference-service, model-serving, multimodal, python
Related projects:
| Repository | Description | Stars |
|---|---|---|
| meta-llama/llama-stack | Provides a set of standardized APIs and tools to build generative AI applications. | 4,591 |
| jina-ai/serve | A framework for building and deploying AI services that can be scaled from local development to production. | 21,129 |
| mindsdb/mindsdb | An AI platform for building agents that can learn and answer questions over federated data from various sources. | 26,793 |
| leptonai/leptonai | An AI service building framework using Python to simplify the development of machine learning models and services. | 2,653 |
| fedml-ai/fedml | A unified and scalable machine learning library for large-scale distributed training, model serving, and federated learning. | 4,187 |
| ludwig-ai/ludwig | A low-code framework for building custom deep learning models and neural networks. | 11,189 |
| optimalscale/lmflow | A toolkit for finetuning large language models and providing efficient inference capabilities. | 8,273 |
| exo-explore/exo | Allows developers to run AI models on personal devices with diverse hardware configurations. | 14,829 |
| tensorflow/serving | A high-performance serving system for machine learning models in production environments. | 6,185 |
| replicate/cog | A tool for packaging and deploying machine learning models in a standard, production-ready container environment. | 8,081 |
| ericlbuehler/mistral.rs | A fast and flexible LLM inference platform supporting various models and devices. | 4,466 |
| ml-tooling/opyrator | Automates conversion of machine learning code into production-ready microservices with web API and GUI. | 3,102 |
| mockoon/mockoon | A tool to design and run mock APIs locally, allowing developers to speed up development, test applications in a controlled environment, and simulate edge cases. | 6,558 |
| ahkarami/deep-learning-in-production | A collection of notes and references on deploying deep learning models in production environments. | 4,306 |
| deepset-ai/haystack | An AI orchestration framework to build customizable LLM applications with advanced retrieval methods. | 17,691 |