We collect additional data for the dynamics model from two sources: (1) we roll out the pre-trained policy checkpoint from out-of-distribution initial poses; (2) to further diversify the training data, we collect random human-play trajectories using the handheld UMI gripper (examples shown in the video above).
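As a rough illustration of how trajectories from the two sources could be pooled into one training set for a dynamics model, the sketch below flattens each trajectory into (state, action, next state) transitions and shuffles them together. The `Trajectory` container, the source labels, and the function names are hypothetical and not taken from the original pipeline; the actual data format and any mixing ratio between sources are not specified here.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical container; the real data format is not specified in the text.
@dataclass
class Trajectory:
    source: str                      # e.g. "policy_rollout" or "human_play"
    states: List[Sequence[float]]    # T + 1 observed states
    actions: List[Sequence[float]]   # T actions

Transition = Tuple[Sequence[float], Sequence[float], Sequence[float]]

def to_transitions(traj: Trajectory) -> List[Transition]:
    """Split one trajectory into (state, action, next_state) tuples."""
    if len(traj.states) != len(traj.actions) + 1:
        raise ValueError("expected one more state than actions")
    return [
        (traj.states[t], traj.actions[t], traj.states[t + 1])
        for t in range(len(traj.actions))
    ]

def build_dynamics_dataset(
    rollouts: Sequence[Trajectory],
    human_play: Sequence[Trajectory],
    seed: int = 0,
) -> List[Transition]:
    """Pool transitions from both sources and shuffle them (assumed uniform mix)."""
    transitions: List[Transition] = []
    for traj in list(rollouts) + list(human_play):
        transitions.extend(to_transitions(traj))
    random.Random(seed).shuffle(transitions)
    return transitions
```

A call such as `build_dynamics_dataset(policy_rollouts, play_trajectories)` would then return a single shuffled list of transitions. Weighting one source more heavily than the other is a design choice this sketch leaves out.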