Paper · Hugging Face · Published: 2026-06-02 · HF ↑6

Value-Aware Stochastic KV Cache Eviction for Reasoning Models

Authors: Ting-Yun Chang, Harvey Yiyun Fu, Deqing Fu, Chenghao Yang, Jesse Thomason, and 1 other

Summary

Reasoning models improve accuracy through extended chains of thought, but their long outputs create a memory and compute bottleneck. KV cache eviction methods reduce this cost by evicting unimportant key-value pairs from the cache, yet they often yield worse accuracy than selection-based sparse atte…