by Y. As, C. Qu, B. Unger, D. Kang, M. van der Hart, L. Shi, S. Coros, A. Wierman, A. Krause
Abstract:
Deploying reinforcement learning (RL) safely in the real world is challenging, as policies trained in simulators must face the inevitable ‘sim-to-real gap’. Robust safe RL techniques are provably safe but difficult to scale, while domain randomization is more practical yet prone to unsafe behaviors. We address this gap by proposing SPiDR, short for Sim-to-real via Pessimistic Domain Randomization—a scalable algorithm with provable guarantees for safe sim-to-real transfer. SPiDR uses domain randomization to incorporate the uncertainty about the sim-to-real gap into the safety constraints, making it versatile and highly compatible with existing training pipelines. Through extensive experiments on sim-to-sim benchmarks and two distinct real-world robotic platforms, we demonstrate that SPiDR effectively ensures safety despite the sim-to-real gap while maintaining strong performance.
Reference:
SPiDR: A Simple Approach for Zero-Shot Safety in Sim-to-Real Transfer. Y. As, C. Qu, B. Unger, D. Kang, M. van der Hart, L. Shi, S. Coros, A. Wierman, A. Krause. In Advances in Neural Information Processing Systems (NeurIPS), 2025.
Bibtex Entry:
@inproceedings{as2025spidr,
title={SPiDR: A Simple Approach for Zero-Shot Safety in Sim-to-Real Transfer},
author={Yarden As and Chengrui Qu and Benjamin Unger and Dongho Kang and Max van der Hart and Laixi Shi and Stelian Coros and Adam Wierman and Andreas Krause},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2025},
month={December},
blog={https://yardenas.github.io/spidr/}
}