mean token utilization

NVIDIA ENPIRE: real-robot coding agents hit 99% pass@8
Tags: nvidia, cmu, uc-berkeley, enpire, robotics, +17 more

On 2026-06-16, NVIDIA GEAR, CMU LeCAR Lab, and UC Berkeley published ENPIRE, a four-module harness (Environment, Policy Improvement, Rollout, Evolution) that places coding agents (Codex with GPT-5.5, Claude Code with Opus 4.7, Kimi Code with Kimi K2.6) in a fully automatic closed loop on real robots, with auto-reset and auto-verify. The team reports 99% pass@8 across five hard manipulation tasks (Push-T, Pin Insertion, Tie Zip-tie, GPU Insertion, Cut Zip-tie), team-size scaling at 1, 4, and 8, and two new efficiency metrics for multi-agent physical autoresearch: Mean Robot Utilization (MRU) and Mean Token Utilization (MTU). The 99% figure reflects the team's emergent retry-and-recovery capability, not best-of-8 sampling; by comparison, a heuristic-policy baseline reports 0% coverage in 43–73 steps. As of 2026-06-19, the harness code has not yet been open-sourced.
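For contrast with the summary's remark about best-of-8 sampling, here is a minimal sketch of the conventional pass@k estimate used in code-generation benchmarks: given n independent attempts of which c succeed, it estimates the chance that at least one of k sampled attempts passes. This is the standard sampling-based definition, not ENPIRE's own computation, which the summary does not spell out; the function name and the example counts are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Conventional pass@k: probability that at least one of k attempts,
    drawn without replacement from n attempts with c successes, passes."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k-subset contains a success.
        return 1.0
    # 1 minus the fraction of k-subsets made up entirely of failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not ENPIRE data.
    print(pass_at_k(n=10, c=5, k=1))
    print(pass_at_k(n=4, c=1, k=2))
```

Under this definition every attempt is an independent sample, so a high pass@8 can come from sampling alone. The summary's point is that ENPIRE's 99% is attributed to retry and recovery within the loop instead.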