This project offers a suite of advanced algorithms for post-training large language models to enhance their reasoning and code generation.
Gen-Verse/ReasonFlux is an open-source, Python-based collection of methods for improving LLM capabilities. The suite is a NeurIPS 2025 Spotlight and features two components: ReasonFlux-PRM, which uses trajectory-aware process reward models for long chain-of-thought reasoning, and ReasonFlux-Coder, which co-evolves LLM coders and unit testers through reinforcement learning. Details on specific models are in directories such as `./ReasonFlux_PRM/README.md` and in the linked `Gen-Verse/CURE` repository.
AI researchers and advanced developers working on LLM reasoning, code generation, and reinforcement learning will find this useful.