Latent Policy Barrier (LPB) decouples precise expert imitation from OOD recovery by leveraging two complementary components: (a) a base diffusion policy trained exclusively on consistent, high-quality expert demonstrations, ensuring precise imitation and high task performance; and (b) an action-conditioned visual latent dynamics model trained on a broader, mixed-quality dataset that combines expert demonstrations with automatically generated rollout data. At inference time, if the Euclidean distance between the current latent state and the nearest expert latent state is below a threshold, LPB defaults to standard action denoising. Otherwise, LPB refines the action denoising process by steering the policy in latent space, keeping the agent within the expert distribution: the dynamics model predicts future latent states conditioned on candidate actions output by the base policy, and LPB minimizes the distance between these predicted latent states and their nearest neighbors among the expert demonstrations in the same latent space. In this way, LPB achieves high task performance and robustness simultaneously, resolving deviations without compromising imitation precision.
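To make the inference-time logic concrete, below is a minimal PyTorch sketch. The interfaces `base_policy.denoise(...)` and `dynamics_model.predict(...)`, the expert latent bank `expert_latents`, and the hyperparameters are all assumptions for illustration, not the paper's actual API. For simplicity, the sketch steers by ranking sampled candidate action chunks against the expert latent bank rather than by guiding the denoising iterations directly; this is one possible instantiation of the distance-minimization objective described above.

```python
import torch

def lpb_inference_step(
    obs_latent,           # current visual latent state, shape (D,)
    base_policy,          # assumed interface: .denoise(latent) -> action chunk
    dynamics_model,       # assumed interface: .predict(latent, action) -> next latent
    expert_latents,       # (N, D) bank of expert latent states from demonstrations
    dist_threshold=1.0,   # assumed OOD-trigger threshold (hyperparameter)
    num_candidates=16,    # assumed number of candidate action chunks to sample
):
    """Sketch of LPB inference: standard denoising when in-distribution,
    latent-space steering toward the expert manifold when OOD."""
    # Euclidean distance from the current latent to its nearest expert neighbor.
    nn_dist = torch.cdist(obs_latent[None], expert_latents).min()

    if nn_dist < dist_threshold:
        # In-distribution: default to standard action denoising.
        return base_policy.denoise(obs_latent)

    # OOD: sample candidate action chunks from the base policy, roll each through
    # the latent dynamics model, and keep the one whose predicted future latent
    # lies closest to the expert demonstrations.
    candidates = [base_policy.denoise(obs_latent) for _ in range(num_candidates)]
    scores = []
    for action in candidates:
        pred_latent = dynamics_model.predict(obs_latent, action)
        scores.append(torch.cdist(pred_latent[None], expert_latents).min())
    best = int(torch.stack(scores).argmin())
    return candidates[best]
```

A gradient-based variant would instead backpropagate the nearest-neighbor distance through the dynamics model into the intermediate denoising steps, trading the extra compute of candidate sampling for tighter coupling with the diffusion process.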