ITPS Maze2D — browser-side ACT & DP

What are you seeing?

A pretrained robot policy planning through a maze. The red dot is the agent, and the colored trails are possible futures.

Why this is cool: Your sketch steers the policy live at inference time, without retraining it.

Try it

Explore: Leave “unconditional rollouts” selected and move the red dot to preview different futures.

Sketch a guide: Choose a sketch-conditioned method, then click and drag from the red dot to draw a desired path.

Red dot: agent
Rainbow trails: predicted futures
Pale trails: wall collisions

Batch size: 8
DDIM steps: 4

What do the policies and controls mean?

ACT vs. DP: ACT predicts action chunks directly. DP samples trajectories through diffusion; in this demo it tends to produce more varied, collision-free possibilities.

Sketch methods: The dropdown compares post-hoc ranking, biased initialization, guided diffusion, and stochastic sampling. Available methods depend on the selected policy.
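As an illustration of the simplest of these methods, post-hoc ranking can be sketched as: sample a batch of trajectories, score each against the drawn sketch, and put the closest first. This is a minimal sketch under assumptions, not the demo's actual code: the function names (`resample`, `mean_l2`, `rank_by_sketch`) are hypothetical, trajectories and sketches are taken to be lists of (x, y) points, and mean L2 distance after index-based resampling is an assumed similarity measure.

```python
import math

def resample(path, n):
    """Linearly resample a polyline to n points, evenly spaced by index."""
    if n == 1 or len(path) == 1:
        return [path[0]] * n
    out = []
    for i in range(n):
        t = i * (len(path) - 1) / (n - 1)   # fractional index into path
        lo = int(math.floor(t))
        hi = min(lo + 1, len(path) - 1)
        frac = t - lo
        x = path[lo][0] + frac * (path[hi][0] - path[lo][0])
        y = path[lo][1] + frac * (path[hi][1] - path[lo][1])
        out.append((x, y))
    return out

def mean_l2(traj, sketch):
    """Mean point-wise distance between a trajectory and the resampled sketch."""
    pts = resample(sketch, len(traj))
    return sum(math.dist(p, q) for p, q in zip(traj, pts)) / len(traj)

def rank_by_sketch(trajs, sketch):
    """Post-hoc ranking: order sampled trajectories from closest to farthest."""
    return sorted(trajs, key=lambda tr: mean_l2(tr, sketch))

# Two sampled futures and a sketch drawn along the x-axis (toy values).
diagonal = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
sketch = [(0.0, 0.0), (2.0, 0.0)]
ranked = rank_by_sketch([diagonal, straight], sketch)
```

Ranking only reorders what the policy already sampled, so it cannot produce a trajectory the batch does not contain. The other methods in the dropdown instead influence the sampling itself.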

Batch size: How many futures are sampled per frame. Larger batches look richer but run more slowly.

DDIM steps: DP denoising iterations per frame. More steps refine the samples but take longer; the paper uses 10.
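To make the speed trade-off concrete, here is a rough sketch of how a DDIM-style sampler might choose which timesteps to visit when given fewer steps than the model was trained with. The function name `ddim_timesteps`, the evenly strided spacing, and the 100-step training schedule are illustrative assumptions, not values taken from this demo or the paper.

```python
def ddim_timesteps(num_train_steps, num_inference_steps):
    """Pick evenly strided timesteps, ordered from noisiest to cleanest."""
    if not 0 < num_inference_steps <= num_train_steps:
        raise ValueError("need 0 < num_inference_steps <= num_train_steps")
    stride = num_train_steps // num_inference_steps
    steps = [i * stride for i in range(num_inference_steps)]
    return steps[::-1]

# Assumed 100-step training schedule (hypothetical value).
few = ddim_timesteps(100, 4)    # 4 steps, as on the slider
more = ddim_timesteps(100, 10)  # 10 steps, the count the paper uses
```

Each listed timestep costs one forward pass of the denoising network per frame, so going from 4 to 10 steps means roughly 2.5 times as many network evaluations.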