#reinforcement-learning (3 articles)

ai May 16, 2026

Sakana AI's "Conductor": How a 7B Model Directs GPT-5, Claude Sonnet 4, and Gemini 2.5 Pro to Surpass SOTA

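Sakana AI's 7B Conductor model, presented at ICLR 2026, uses reinforcement learning to dynamically orchestrate GPT-5, Claude Sonnet 4, and Gemini 2.5 Pro, reaching 87.5% on GPQA-Diamond and 83.93% on LiveCodeBench. This article explains the technical workings behind the commercial product Fugu (beta).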

#ai #llm #multi-agent #orchestration #reinforcement-learning #sakana-ai #iclr #gpt-5 #benchmark


ai May 4, 2026

Ineffable Intelligence: AlphaGo Designer Raises $1.1B, the Largest Round in European History, and the Next Frontier for a "No Human Data" Reinforcement Learning Superlearner

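On April 27, 2026, former DeepMind RL lead David Silver's Ineffable Intelligence raised a $1.1B seed round (at a $5.1B valuation), the largest in European history. This article explains the vision for a "superlearner" that uses reinforcement learning to discover new knowledge without human data, and what it means for developers and researchers.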

#ai #reinforcement-learning #startup #deepmind #machine-learning #research #sequoia #funding


ai April 29, 2026

Sony's AI Robot "Ace" Beats Professional Table Tennis Players: A New Physical AI Milestone Published in Nature and Its Implications for Developers

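"Ace," the autonomous robot developed by Sony AI, was featured in Nature (April 23, 2026). It went 3 wins and 2 losses against elite players, and in March it took at least one win from each of three new professional players. Built on an 8-joint arm and a network of high-speed cameras, it is the first physical AI system to finally reach human professional level in the sport of table tennis.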

#sony-ai #robotics #physical-ai #reinforcement-learning #nature #computer-vision #autonomous-robot #ai
