OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics
Abstract
Vision-language model (VLM) agents are increasingly deployed in interactive game environments. Yet game benchmarks for VLM agents typically report a single first-attempt score per (agent, game) pair, focus on single-agent Solo play, and lack unified protocols for evaluating heterogeneous agent class…