Paper | Hugging Face | Published: 2026-05-20 | HF ↑16

Maestro: Reinforcement Learning to Orchestrate Hierarchical Model-Skill Ensembles

Authors: Jinyang Wu, Guocheng Zhai, Ruihan Jin, Yuhao Shen, Zhengxi Lu, and 5 others

Abstract

The proliferation of large language models (LLMs) and modular skills has endowed autonomous agents with increasingly powerful capabilities. Existing frameworks typically rely on monolithic LLMs and fixed logic to interface with these skills. This gives rise to a critical bottleneck: different LLMs o…

#llm #multimodal #rl #agent #benchmark

Articles in the same category

Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

World-R1: Aligning 3D Constraints in Text-to-Video Generation via Reinforcement Learning