vllm-project
A high-throughput and memory-efficient inference and serving engine for LLMs
skypilot-org
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access and manage all AI compute (Kubernetes, Slurm, 20+ clouds, on-prem).