2026-06-09

18件

← アーカイブ一覧

論文 深掘り Hugging Face 2026-06-07 HF ↑33

FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention

Conventional LLMs keep the full KV cache loaded during decoding, causing a severe GPU memory bottleneck for ultra-long context serving. In this report, we propose Lookahead Sparse Attention (LSA), a novel inference paradigm powered by a Neural Memory Indexer built upon the DeepSeek-V4 architecture. ...

#llm#coding#benchmark
論文 深掘り Hugging Face 2026-06-07 HF ↑11

SwiftVR: Real-Time One-Step Generative Video Restoration

Real-time video restoration (VR) for live streams requires high-resolution outputs under strict per-frame latency constraints. Existing one-step diffusion-based VR models remain difficult to deploy on consumer-grade GPUs due to two main bottlenecks: quadratic spatial attention at high resolutions an...

#diffusion#coding
論文 深掘り Hugging Face 2026-06-07 HF ↑34

SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks

Spatial reasoning is a foundational capability for multimodal large language models (MLLMs) to perceive and operate within the physical world. However, existing benchmarks predominantly rely on passive evaluation (e.g., static VQA) or simulator-specific pipelines, failing to assess general interacti...

#agent#multimodal#benchmark#llm
論文 Hugging Face 2026-06-07 HF ↑6

End-to-End Context Compression at Scale

Long-context language model inference is bottlenecked by memory, as the KV cache grows with context length. Recent techniques to compress the KV cache fall short: they either degrade model quality substantially or require considerable time and compute to compress a single long prompt. Furthermore, m...

#agent
論文 Hugging Face 2026-06-07 HF ↑27

Echo-Memory: A Controlled Study of Memory in Action World Models

We present Echo-Memory, a controlled study of memory mechanisms in action-conditioned world models. These models generate multi-segment videos from a first frame, text prompt, and camera-action sequence, but their central failure is often memory rather than local image synthesis: after the camera le...

#benchmark#diffusion
論文 Hugging Face 2026-06-07 HF ↑37

Latent Spatial Memory for Video World Models

Video world models that maintain 3D spatial consistency across generated frames typically rely on explicit point cloud memory constructed in RGB space. This design is both computationally expensive, requiring repeated rendering and VAE encoding, and inherently lossy, as the round trip through pixel ...

#diffusion#coding
論文 深掘り Hugging Face 2026-06-08 HF ↑7

Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short

Reinforcement learning with verifiable rewards (RLVR) has become a leading paradigm for improving the reasoning ability of large language models through outcome-based supervision. However, verifiable rewards frequently become uninformative at the group level: when all sampled traces of a given promp...

#llm#rl#coding#benchmark
論文 Hugging Face 2026-06-07 HF ↑15

OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics

Vision-language model (VLM) agents are increasingly deployed in interactive game environments. Yet game benchmarks for VLM agents typically report a single first-attempt score per (agent, game) pair, focus on single-agent Solo play, and lack unified protocols for evaluating heterogeneous agent class...

#agent#multimodal#benchmark#llm
論文 Hugging Face 2026-06-08 HF ↑2

Optical Reasoning: Rethinking Images as an Expressive Reasoning Medium Beyond Text

Chain-of-Thought (CoT) improves the performance of Large Language Models (LLMs) and has been extended to Multimodal Large Language Models (MLLMs). More recent work further moves from text-based multimodal reasoning toward interleaved-modal reasoning, where intermediate steps can incorporate both tex...

#multimodal#llm#benchmark
企業動向 OpenAI 2026-06-09

What Codex unlocks for Notion

How Notion uses Codex to one-shot specs, build AI Voice Input for the web, and multiply engineering power across small teams....

#speech
企業動向 OpenAI 2026-06-08

Built to benefit everyone: our plan

A vision for the future of AI, focusing on access, safety, and shared prosperity as OpenAI works to ensure AGI benefits everyone....

#alignment
モデル OpenAI 2026-06-08

Introducing the OpenAI Economic Research Exchange

OpenAI launches the Economic Research Exchange to study AI’s impact on jobs, productivity, and the economy. Applications are now open for selected research projects....