VLARLKit
An elegant and researcher-friendly RL library for Vision-Language-Action (VLA) models.
- GitHub repo: VLARLKit
VLARLKit is a PyTorch-based reinforcement learning library tailored for Vision-Language-Action (VLA) models. It is designed to be friendly and convenient for researchers, with the following features:
- Simple, clear implementation: the policy, rollout, runner, and model layers are cleanly separated, so the code is easy to read, modify, and extend
- Environment-decoupled architecture: environments run as independent processes and communicate over ZMQ, which eliminates dependency conflicts between different benchmark simulators
- Async off-policy training: data collection runs asynchronously and does not block model updates
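As a rough illustration of the async off-policy idea above, the sketch below runs a stand-in collector in a background thread while the main loop samples from a shared replay buffer. It uses only the Python standard library. `ReplayBuffer`, `collector`, and `train` are hypothetical names chosen for this example and are not VLARLKit's API; the transitions are dummy data, and the "update" is a placeholder counter rather than a gradient step.

```python
import random
import threading
import time
from collections import deque


class ReplayBuffer:
    """Thread-safe bounded replay buffer (hypothetical, not VLARLKit's class)."""

    def __init__(self, capacity):
        self._data = deque(maxlen=capacity)
        self._lock = threading.Lock()

    def add(self, transition):
        with self._lock:
            self._data.append(transition)

    def sample(self, batch_size, rng):
        # Return None until enough transitions have been collected.
        with self._lock:
            if len(self._data) < batch_size:
                return None
            return rng.sample(list(self._data), batch_size)

    def __len__(self):
        with self._lock:
            return len(self._data)


def collector(buffer, stop_event, num_steps):
    """Stand-in rollout worker: pushes dummy transitions until done or stopped."""
    for step in range(num_steps):
        if stop_event.is_set():
            break
        buffer.add({"obs": step, "action": 0, "reward": 1.0})


def train(num_updates=20, batch_size=4, num_steps=200, seed=0):
    """Sample batches while the collector keeps filling the buffer."""
    rng = random.Random(seed)
    buffer = ReplayBuffer(capacity=1000)
    stop_event = threading.Event()
    worker = threading.Thread(
        target=collector, args=(buffer, stop_event, num_steps)
    )
    worker.start()

    updates = 0
    while updates < num_updates:
        batch = buffer.sample(batch_size, rng)
        if batch is None:
            time.sleep(0.001)  # buffer still warming up; collection continues
            continue
        updates += 1  # placeholder for a real off-policy gradient step

    stop_event.set()
    worker.join()
    return updates, len(buffer)


if __name__ == "__main__":
    n_updates, n_transitions = train()
    print(f"updates={n_updates}, transitions={n_transitions}")
```

The learner never waits for a full rollout to finish: it only pauses while the buffer holds fewer transitions than one batch. In a real setup the collector would step an environment (for example, one running in a separate process) and the update would train the policy on the sampled batch.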
Supported components
- RL Algorithms
  - Proximal Policy Optimization (PPO) (on-policy)
  - Diffusion Steering via Reinforcement Learning (DSRL) (off-policy)
  - RL Token (RLT) (off-policy)
- Base Models
- Benchmarks