inference

7 repos

vllm-project

A high-throughput and memory-efficient inference and serving engine for LLMs

deepspeedai

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

google-ai-edge

Cross-platform, customizable ML solutions for live and streaming media.

openvinotoolkit

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

vllm-project

A framework for efficient model inference with omni-modality models

theopenco

Route, manage, and analyze your LLM requests across multiple providers with a unified API interface.