AC-Solver
A long-horizon, sparse-reward math environment for reinforcement learning. Official code repo for "What makes Math problems hard for reinforcement learning: A case study".