Paper | Deep Dive | Hugging Face | Published: 2026-05-13 | HF ↑52

MemLens: Benchmarking Multimodal Long-Term Memory in Large Vision-Language Models

Authors: Xiyu Ren, Zhaowei Wang, Yiming Du, Zhongwei Xie, Chi Liu, and 9 others

Summary

Memory is essential for large vision-language models (LVLMs) to handle long, multimodal interactions, with two method directions providing this capability: long-context LVLMs and memory-augmented agents. However, no existing benchmark conducts a systematic comparison of the two on questions that gen…

#multimodal #agent #benchmark
