predibase/lorax — Efficiently serve numerous LoRA-fine-tuned LLMs on a si

Version	Commit	Size	Downloads	Date
latest	HEAD	1.2 MB	9	1mo ago

LoRAX: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

LoRAX (LoRA eXchange) is a framework that allows users to serve thousands of fine-tuned models on a single GPU, dramatically reducing the cost of serving without compromising on throughput or latency.

📖 Table of contents

🌳 Features
🏠 Models
🏃‍♂️ Getting Started

README

📖 Table of contents