rLLM is an open-source framework for training AI agents with reinforcement learning. Swap in a tracked client, define a reward function, and let RL handle the rest — no matter what agent framework you use.
## Core Features
- Works with any agent framework — LangGraph, SmolAgent, Strands, OpenAI Agents SDK, Google ADK, or plain `openai.OpenAI`. Just swap the client. 🔌
- Near-zero code changes — Add `@rllm.rollout` to wrap your agent code, and rLLM traces every LLM call automatically. 🪄
- CLI-first workflow — Eval and train from the command line with 50+ built-in benchmarks. `rllm eval gsm8k` just works. ⚡
- Battle-tested results — rLLM-trained agents beat models 50x their size (a 4B model outperforms a 235B model on finance; a 1.5B model surpasses O1-Preview on math). 📈
- Multiple RL algorithms — GRPO, REINFORCE, RLOO, rejection sampling, and more. 🧠
- Two training backends — `verl` for distributed multi-GPU training, `tinker` for single-machine / CPU setups. Same API either way. 🔧
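The workflow above starts with defining a reward function: a callable that scores an agent's output against a reference so RL has a signal to optimize. As a rough illustration only — the function name and signature below are hypothetical, not rLLM's actual API — an exact-match reward for a benchmark like GSM8K might look like this:

```python
def exact_match_reward(completion: str, ground_truth: str) -> float:
    """Hypothetical reward function sketch (not rLLM's real interface).

    Returns 1.0 when the agent's final answer matches the reference
    answer after trimming whitespace, and 0.0 otherwise.
    """
    return 1.0 if completion.strip() == ground_truth.strip() else 0.0
```

Binary exact-match rewards like this are common for verifiable tasks (math, code); real setups often add answer extraction or partial credit on top.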
Read more on our documentation site.
## Installation