Paper Deep Dive (arXiv), published: 2026-05-04

Reinforcement Learning for LLM-based Multi-Agent Systems through Orchestration Traces

Author: Chenchen Zhang

Summary

As large language model (LLM) agents evolve from isolated tool users into coordinated teams, reinforcement learning (RL) must optimize not only individual actions but also how work is spawned, delegated, communicated, aggregated, and stopped. This paper studies RL for LLM-based multi-agent systems t…

#agent #llm #rl #benchmark
