Block-wise Adaptive Caching for Accelerating Diffusion Policy
Published in International Conference on Learning Representations 2026 (ICLR 2026), 2026
Recommended citation: Kangye Ji, Yuan Meng, Hanyun Cui, Ye Li, Jianbo Zhou, Shengjia Hua, Lei Chen, Zhi Wang. "Block-wise Adaptive Caching for Accelerating Diffusion Policy." ICLR 2026. https://arxiv.org/abs/2506.13456

Diffusion Policy has demonstrated strong visuomotor modeling capabilities, but its high computational cost during denoising renders it impractical for real-time robotic control. Despite huge redundancy across repetitive denoising steps, existing diffusion-acceleration techniques fail to generalize to Diffusion Policy due to fundamental architectural and data divergences.
BAC (Block-wise Adaptive Caching) accelerates Diffusion Policy by caching intermediate action features, then adaptively updating and reusing them at the block level to achieve lossless acceleration. The design rests on a key observation: feature similarity is non-uniform over time and differs from block to block. BAC has two components:
- Adaptive Caching Scheduler. Identifies optimal per-block update timesteps by maximizing global feature similarity between cached and skipped features.
- Bubbling Union Algorithm. Per-block scheduling alone triggers error surges from inter-block error propagation (especially in FFN blocks); BAC truncates these errors by updating high-error upstream blocks before downstream FFNs.
As a training-free plugin compatible with transformer-based Diffusion Policy and vision-language-action models, BAC achieves up to 3× inference speedup for free.
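
The two components can be illustrated with a toy sketch. This is not the paper's implementation. It assumes the scheduler can be posed as a dynamic program that picks a fixed number of update steps per block to maximize the summed cosine similarity between each cached feature and the features at the steps where it is reused. It also assumes the union step simply adds each FFN block's update steps to the schedules of upstream blocks that have already been flagged as high-error. The names `schedule_block`, `bubbling_union`, `num_updates`, and `high_error_blocks` are hypothetical.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def schedule_block(feats, num_updates):
    """Pick update steps for one block (illustrative assumption, not BAC itself).

    feats: array of shape (T, D), the block's features at T denoising steps.
    Returns a sorted list of `num_updates` steps (always starting at step 0)
    that maximizes total similarity between cached and reused features.
    Requires 1 <= num_updates <= T.
    """
    T = feats.shape[0]
    # seg[i, j]: similarity gained by caching at step i and reusing it for steps i..j-1
    seg = np.zeros((T, T + 1))
    for i in range(T):
        for j in range(i + 1, T + 1):
            seg[i, j] = seg[i, j - 1] + cosine(feats[i], feats[j - 1])
    # best[k, j]: best score for covering steps [0, j) with exactly k updates
    best = np.full((num_updates + 1, T + 1), -np.inf)
    prev = np.zeros((num_updates + 1, T + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, num_updates + 1):
        for j in range(1, T + 1):
            for i in range(k - 1, j):
                score = best[k - 1, i] + seg[i, j]
                if score > best[k, j]:
                    best[k, j], prev[k, j] = score, i
    # Backtrack from the full range to recover the chosen update steps
    steps, j = [], T
    for k in range(num_updates, 0, -1):
        j = int(prev[k, j])
        steps.append(j)
    return sorted(steps)


def bubbling_union(schedules, ffn_blocks, high_error_blocks):
    """Add each FFN block's update steps to flagged upstream blocks (assumed rule).

    schedules: per-block lists of update steps, in forward order.
    ffn_blocks / high_error_blocks: block indices; how high-error blocks are
    identified is not specified here and is left to the caller.
    """
    merged = [set(s) for s in schedules]
    for f in ffn_blocks:
        for u in high_error_blocks:
            if u < f:  # only blocks that feed into the FFN
                merged[u] |= set(schedules[f])
    return [sorted(s) for s in merged]


# Toy usage: one block whose features change abruptly at step 3
feats = np.array([[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3)
toy_steps = schedule_block(feats, num_updates=2)
toy_union = bubbling_union([[0, 3], [0, 2], [0, 4]], ffn_blocks=[2], high_error_blocks=[0])
```

In the toy usage, the scheduler places the second update where the feature changes, and the union step makes the flagged upstream block refresh whenever the downstream FFN does, so the FFN never consumes a stale input from it.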