BentoML
AI API service
An open-source Python framework for building model inference APIs and serving AI models in production environments.
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
7k stars
77 watching
795 forks
Language: Python
Last commit: 2 months ago
Linked from 3 awesome lists
ai-inference, deep-learning, generative-ai, inference-platform, llm, llm-inference, llm-serving, llmops, machine-learning, ml-engineering, mlops, model-inference-service, model-serving, multimodal, python
Related projects:
Repository | Description | Stars |
---|---|---|
| Provides pre-packaged building blocks for generative AI applications with standardized APIs and service-oriented design. | 5,164 |
| A framework for building and deploying AI services that can be scaled from local development to production. | 21,180 |
| A platform for building AI agents that can learn from and answer questions across multiple data sources using machine learning and natural language processing. | 26,915 |
| A Python framework for building AI services that simplifies the development of machine learning models and services. | 2,669 |
| A unified and scalable machine learning library for large-scale distributed training, model serving, and federated learning. | 4,205 |
| A low-code framework for building custom deep learning models and neural networks. | 11,236 |
| A toolkit for fine-tuning and running inference on large machine learning models. | 8,312 |
| An experimental software framework to run AI models on diverse devices without requiring expensive GPUs. | 17,369 |
| A high-performance serving system for machine learning models in production environments. | 6,195 |
| A tool for packaging and deploying machine learning models in a standard, production-ready container environment. | 8,169 |
| A high-performance LLM inference framework written in Rust. | 4,677 |
| Automates conversion of machine learning code into production-ready microservices with web API and GUI. | 3,116 |
| A tool to design and run mock APIs locally, allowing developers to speed up development, test applications in a controlled environment, and simulate edge cases. | 6,665 |
| A collection of notes and references on deploying deep learning models in production environments. | 4,313 |
| An AI orchestration framework to build customizable LLM applications with advanced retrieval methods. | 18,094 |