ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both
Abstract
Visual reasoning, often interleaved with intermediate visual states, has emerged as a promising direction in the field. A straightforward approach is to directly generate images via unified models during reasoning, but this is computationally expensive and architecturally non-trivial. Recent alterna…