Paper | Hugging Face | Published: 2026-05-12 | HF ↑18

FrameSkip: Learning from Fewer but More Informative Frames in VLA Training

Authors: Bin Yu, Shijie Lian, Xiaopeng Lin, Zhaolong Shen, Yuliang Wei, and 6 others

Abstract

Vision-Language-Action (VLA) policies are commonly trained from dense robot demonstration trajectories, often collected through teleoperation, by sampling every recorded frame as if it provided equally useful supervision. We argue that this convention creates a temporal supervision imbalance: long l…

#alignment #robotics #benchmark

Articles in the same category

Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

World-R1: Aligning 3D Constraints in Text-to-Video Generation via Reinforcement Learning