bentoml/BentoML — Easily build and deploy AI model inference APIs and pip

Version	Commit	Size	Downloads	Date
latest	HEAD	33.2 MB	8	1mo ago

🍱 Build model inference APIs and multi-model serving systems with any open-source or custom AI models.

BentoML is a Python library for building online serving systems optimized for AI apps and model inference.

🍱 Easily build APIs for Any AI/ML Model. Turn any model inference script into a REST API server with just a few lines of code and standard Python type hints.
🐳 Docker Containers made simple. No more dependency hell! Manage your environments, dependencies and model versions with a simple config file. BentoML automatically generates Docker images, ensures reproducibility, and simplifies how you deploy to different environments.
🧭 Maximize CPU/GPU utilization. Build high-performance inference APIs using built-in serving optimizations such as dynamic batching, model parallelism, multi-stage pipelines, and multi-model inference-graph orchestration.
👩‍💻 Fully customizable. Easily implement your own APIs or task queues, with custom business logic, model inference and multi-model composition. Supports any ML framework, modality, and inference runtime.
🚀 Ready for Production. Develop, run and debug locally. Seamlessly deploy to production with Docker containers or BentoCloud.
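
To make the "few lines of code and standard Python type hints" point concrete, here is a minimal sketch of a service. It assumes the class-based `@bentoml.service` / `@bentoml.api` decorators available in recent BentoML releases. The `truncate_words` function and the `Summarizer` class are hypothetical stand-ins for a real model, not part of BentoML. The import is guarded only so the plain function can be tried without BentoML installed.

```python
try:
    import bentoml
except ImportError:  # allows trying the plain function without BentoML
    bentoml = None


def truncate_words(text: str, max_words: int = 5) -> str:
    """Hypothetical stand-in for model inference: keep the first `max_words` words."""
    words = text.split()
    return " ".join(words[:max_words])


if bentoml is not None:

    @bentoml.service
    class Summarizer:
        """Illustrative service; the type hints describe the request and response."""

        @bentoml.api
        def summarize(self, text: str, max_words: int = 5) -> str:
            # Delegate to the module-level function defined above.
            return truncate_words(text, max_words)


if __name__ == "__main__":
    print(truncate_words("BentoML turns inference scripts into API servers", 3))
```

With BentoML installed and the file saved as `service.py` (an assumed file name), running `bentoml serve service:Summarizer` should start a local HTTP server that exposes `summarize` as an endpoint.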

Install BentoML:

pip install -U bentoml