Inference-Time Policy Steering — Maze2D

Real-time motion-policy predictions running entirely on your device. Pick an engine, then move the mouse over the maze.
Based on Wang et al., Inference-Time Policy Steering through Human Interactions.

Batch size: Number of trajectories sampled per frame. Larger batches look richer but run slower. Trajectories with collisions are drawn in a lighter, grayer color than collision-free ones.
DDIM steps: Number of diffusion policy (DP) denoising iterations per frame. The paper uses 10; fewer steps run faster but give rougher trajectories. Changes take effect immediately.
Unconditional: With “unconditional rollouts” selected, move or click-and-drag the mouse to reposition the red agent and preview rollouts.
Sketch: With a sketch method selected, click-and-drag to draw different guide sketches for the same fixed agent location. The agent stays fixed until you move the mouse near the red agent to reactivate it.
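
To make the controls above more concrete, here is a minimal Python sketch of two things a demo like this has to do with each sampled batch: flag trajectories that collide (so they can be drawn grayed out) and score how closely each trajectory follows a user sketch. It is an illustration under stated assumptions, not the demo's actual code. The function names, the grid-cell wall representation, and the mean point-to-point distance are all hypothetical choices, and ranking by distance is only one simple way to compare a batch against a sketch.

```python
import math

# Assumed (hypothetical) data model: a trajectory or sketch is a list of
# (x, y) points, and the maze is a set of blocked integer grid cells.


def collides(trajectory, walls, cell_size=1.0):
    """Return True if any waypoint lies inside a blocked grid cell."""
    for x, y in trajectory:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        if cell in walls:
            return True
    return False


def resample(points, n):
    """Resample a polyline to n points by linear interpolation over the
    point index (a simplification; not arc-length parameterized)."""
    if n == 1 or len(points) == 1:
        return [points[0]] * n
    out = []
    for i in range(n):
        t = i * (len(points) - 1) / (n - 1)
        lo = int(math.floor(t))
        hi = min(lo + 1, len(points) - 1)
        frac = t - lo
        x = points[lo][0] + frac * (points[hi][0] - points[lo][0])
        y = points[lo][1] + frac * (points[hi][1] - points[lo][1])
        out.append((x, y))
    return out


def sketch_distance(trajectory, sketch):
    """Mean Euclidean distance between trajectory waypoints and the sketch
    resampled to the same number of points."""
    ref = resample(sketch, len(trajectory))
    total = sum(math.dist(p, q) for p, q in zip(trajectory, ref))
    return total / len(trajectory)


def rank_batch(batch, sketch, walls):
    """Return (distance, collided, index) tuples, closest to the sketch first."""
    scored = [
        (sketch_distance(traj, sketch), collides(traj, walls), i)
        for i, traj in enumerate(batch)
    ]
    return sorted(scored)


if __name__ == "__main__":
    batch = [
        [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5)],  # heads right, stays clear
        [(0.5, 0.5), (0.5, 1.5), (0.5, 2.5)],  # heads up, ends in a wall cell
    ]
    walls = {(0, 2)}
    sketch = [(0.5, 0.5), (2.5, 0.5)]  # user drags to the right
    for distance, collided, index in rank_batch(batch, sketch, walls):
        style = "gray" if collided else "solid"
        print(f"trajectory {index}: distance={distance:.3f}, draw={style}")
```

The cost of this scoring grows linearly with the batch size, which is one reason larger batches slow each frame down, in addition to the extra denoising work.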