ArXiv cs.AI Agent Papers Weekly Tracker — Week of Apr 23, 2026
30 high-quality agent papers this week. Top: ReTAS addresses Actor-Observer Asymmetry in multi-agent systems. Benchmark papers +133%, RAG-Agent papers +260% week-over-week.
Data Overview
- Snapshot Week: 2026-04-16 to 2026-04-23
- Tracker: ArXiv cs.AI Agent Papers Weekly (view all snapshots:
/tech/ai-agents/data/?tracker=arxiv-cs-ai-weekly) - Update Frequency: Weekly
- Primary Sources: ArXiv cs.AI RSS, ArXiv cs.CL RSS
Key Facts
- Who: 30 agent-related papers from ArXiv cs.AI and cs.CL categories
- What: 28 agent-specific papers with average trend score 6.73; top paper addresses Actor-Observer Asymmetry in multi-agent systems
- When: Published between April 16-23, 2026
- Impact: Benchmark papers +133% WoW; RAG-Agent papers +260% WoW
Methodology
This tracker monitors agent-related research published on ArXiv in the cs.AI and cs.CL categories. Data collection spans April 16-23, 2026, with all papers filtered for agent relevance based on title and abstract keywords. Trend scores (1-10) are derived from early engagement signals including HuggingFace paper page views and discussion activity. Topic tags are extracted from abstract analysis covering: Agent, Multi-Agent, Reasoning, Benchmark, RAG, Tool-Use, and Autonomous.
This Week’s Data
| Title | ArXiv ID | Trend Score | Key Topics | Category |
|---|---|---|---|---|
| Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment | 2604.19548 | 10 | Agent, Multi-Agent, Reasoning, Benchmark, RAG, Autonomous | cs.CL |
| Rethinking Scale: Deployment Trade-offs of Small Language Models under Agent Paradigms | 2604.19299 | 9 | Agent, Multi-Agent, Reasoning, Tool-Use | cs.CL |
| Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models | 2604.18612 | 8 | Agent, Reasoning, RAG | cs.AI |
| From Craft to Kernel: A Governance-First Execution Architecture and Semantic ISA for Agentic Computers | 2604.18652 | 8 | Agent, Reasoning, RAG | cs.AI |
| Four-Axis Decision Alignment for Long-Horizon Enterprise AI Agents | 2604.19457 | 8 | Agent, Reasoning, Benchmark, RAG | cs.AI |
| Time Series Augmented Generation for Financial Applications | 2604.19633 | 8 | Agent, Reasoning, Benchmark, Tool-Use | cs.AI |
| SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal Large Language Models | 2604.19638 | 8 | Agent, Benchmark, RAG, Autonomous | cs.AI |
| Characterizing AlphaEarth Embedding Geometry for Agentic Environmental Reasoning | 2604.18715 | 7 | Agent, Reasoning, RAG | cs.AI |
| Mango: Multi-Agent Web Navigation via Global-View Optimization | 2604.18779 | 7 | Agent, Multi-Agent, RAG | cs.CL |
| AI scientists produce results without reasoning scientifically | 2604.18805 | 7 | Agent, Reasoning, Autonomous | cs.AI |
| How Adversarial Environments Mislead Agentic AI? | 2604.18874 | 7 | Agent, Benchmark, RAG, Tool-Use | cs.AI |
| Debating the Unspoken: Role-Anchored Multi-Agent Reasoning for Half-Truth Detection | 2604.19005 | 7 | Agent, Multi-Agent, Reasoning, RAG | cs.CL |
| On Accelerating Grounded Code Development for Research | 2604.19022 | 7 | Agent, Reasoning, RAG | cs.AI |
| Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture The Flag Challenges | 2604.19354 | 7 | Agent, Benchmark, Tool-Use, Autonomous | cs.AI |
| Multi-modal Reasoning with LLMs for Visual Semantic Arithmetic | 2604.19567 | 7 | Agent, Reasoning, Tool-Use | cs.AI |
| A-MAR: Agent-based Multimodal Art Retrieval for Fine-Grained Artwork Understanding | 2604.19689 | 7 | Agent, Reasoning, Benchmark, RAG | cs.AI |
| CentaurTA Studio: A Self-Improving Human-Agent Collaboration System for Thematic Analysis | 2604.18589 | 6 | Agent, RAG | cs.AI |
| ARGUS: Agentic GPU Optimization Guided by Data-Flow Invariants | 2604.18616 | 6 | Agent, Reasoning | cs.AI |
| Evaluating Answer Leakage Robustness of LLM Tutors against Adversarial Student Attacks | 2604.18660 | 6 | Agent, Multi-Agent | cs.AI |
| Towards Optimal Agentic Architectures for Offensive Security Tasks | 2604.18718 | 6 | Agent, Benchmark, Tool-Use | cs.AI |
| STAR-Teaming: A Strategy-Response Multiplex Network Approach to Automated LLM Red Teaming | 2604.18976 | 6 | Agent, Multi-Agent | cs.CL |
| Explicit Trait Inference for Multi-Agent Coordination | 2604.19278 | 6 | Agent, Multi-Agent | cs.AI |
| IndiaFinBench: An Evaluation Benchmark for Large Language Model Performance on Indian Financial Regulatory Text | 2604.19298 | 6 | Reasoning, Benchmark, RAG | cs.CL |
| Large Language Models Exhibit Normative Conformity | 2604.19301 | 6 | Agent, Multi-Agent | cs.AI |
| From Experience to Skill: Multi-Agent Generative Engine Optimization via Reusable Strategy Learning | 2604.19516 | 6 | Agent, Multi-Agent, Benchmark | cs.AI |
| A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression | 2604.19572 | 6 | Agent, Reasoning, Benchmark | cs.CL |
| Compile to Compress: Boosting Formal Theorem Provers by Compiler Outputs | 2604.18587 | 5 | Reasoning, RAG | cs.AI |
| Owner-Harm: A Missing Threat Model for AI Agent Safety | 2604.18658 | 5 | Agent, Benchmark | cs.AI |
| Human-Guided Harm Recovery for Computer Use Agents | 2604.18847 | 5 | Agent, RAG | cs.AI |
| AutomationBench | 2604.18934 | 5 | Agent, Benchmark, Autonomous | cs.AI |
Week-over-Week Summary
| Metric | This Week | Last Week | Change |
|---|---|---|---|
| Total agent papers | 28 | - | - |
| Multi-agent papers | 9 | 8 | +12.5% |
| Benchmark papers | 14 | 6 | +133.3% |
| RAG-related papers | 18 | 5 | +260.0% |
| Reasoning papers | 21 | - | - |
| Average trend score | 6.73 | - | - |
| Top trend score | 10 | 9 | +11.1% |
Top Papers This Week
1. Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment
ArXiv ID: 2604.19548 | Trend Score: 10/10
Key Topics: Agent, Multi-Agent, Reasoning, Benchmark, RAG, Autonomous
Summary: Large Language Model agents have evolved from static text generators into dynamic systems capable of executing complex autonomous workflows. This paper addresses a fundamental cognitive bias in multi-agent systems—the Actor-Observer Asymmetry—where agents acting versus observing the same situation develop divergent internal representations, leading to coordination failures. The authors propose ReTAS (Reflective Taming of Actor-Observer Asymmetry through Dialectical alignment), a framework that reconciles these asymmetries through dialectical reasoning.
2. Rethinking Scale: Deployment Trade-offs of Small Language Models under Agent Paradigms
ArXiv ID: 2604.19299 | Trend Score: 9/10
Key Topics: Agent, Multi-Agent, Reasoning, Tool-Use
Summary: Despite impressive capabilities of large language models, their substantial computational costs, latency, and privacy risks hinder widespread deployment. This paper systematically examines whether small language models (SLMs) can effectively serve as agent backbones, identifying the efficiency frontier where SLMs outperform LLMs in specific agent tasks while falling short in others.
3. Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization
ArXiv ID: 2604.18612 | Trend Score: 8/10
Key Topics: Agent, Reasoning, RAG
Summary: Large Language Models have demonstrated strong capabilities in complex reasoning tasks, with prompting strategies such as Chain-of-Thought elevating performance. Agent-GWO introduces a collaborative multi-agent framework that dynamically optimizes prompts through grey wolf optimization-inspired coordination, achieving improved reasoning accuracy without model fine-tuning.
4. From Craft to Kernel: A Governance-First Execution Architecture for Agentic Computers
ArXiv ID: 2604.18652 | Trend Score: 8/10
Key Topics: Agent, Reasoning, RAG
Summary: The transition of agentic AI from brittle prototypes to production systems is stalled by a pervasive crisis of craft. This paper proposes a governance-first execution architecture with a semantic Instruction Set Architecture (ISA), treating agent coordination as a kernel-level concern rather than delegating to ad-hoc orchestration layers.
5. SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal LLMs
ArXiv ID: 2604.19638 | Trend Score: 8/10
Key Topics: Agent, Benchmark, RAG, Autonomous
Summary: Multimodal Large Language Models are increasingly adopted as autonomous agents in interactive environments, yet their ability to proactively address safety hazards remains insufficient. SafetyALFRED introduces a comprehensive benchmark for evaluating safety-conscious planning in multimodal agents across diverse environmental hazards.
Trends & Observations
Trend 1: Actor-Observer Asymmetry Emerges as New Research Direction
The top paper this week introduces ReTAS, addressing a cognitive bias in multi-agent systems where agents develop divergent representations based on their role (actor vs. observer). This represents a shift from treating multi-agent coordination as purely architectural to examining the epistemic foundations of agent collaboration.
Trend 2: Benchmark Proliferation Signals Maturation
Benchmark papers increased 133% week-over-week (from 6 to 14), indicating the field is transitioning from capability demonstrations to standardized evaluation. New benchmarks span safety (SafetyALFRED, Owner-Harm), domain-specific tasks (IndiaFinBench, Time Series Augmented Generation), and agent coordination (AutomationBench).
Trend 3: RAG-Agent Convergence Accelerates
RAG-related papers increased 260% (from 5 to 18), the largest growth among tracked categories. Papers this week show RAG being integrated into agent architectures for code development, art retrieval, financial applications, and environmental reasoning—suggesting retrieval is becoming a core agent capability rather than an external tool.
Trend 4: Small Language Model Efficiency Frontier
Multiple papers explore SLM deployment in agent paradigms, examining the trade-offs between model scale and agent-specific capabilities. This reflects growing industry concern about inference costs as agent workloads require multiple model calls per task.
Trend 5: Safety Evaluation Expands Beyond Generic Harm
New benchmarks like Owner-Harm and Human-Guided Harm Recovery address commercially consequential threat models—agents causing financial or operational damage to their owners—rather than focusing solely on criminal harm scenarios.
🔺 Scout Intel: What Others Missed
Confidence: high | Novelty Score: 65/100
The 260% surge in RAG-Agent papers represents more than incremental interest—it signals a fundamental architectural shift. Papers this week treat retrieval not as an external tool but as an intrinsic agent capability, with frameworks like A-MAR (art retrieval) and AlphaEarth (environmental reasoning) embedding retrieval directly into agent reasoning loops. This convergence pattern mirrors the 2017-2018 transition when attention mechanisms moved from auxiliary components to transformer architectures’ core primitive.
The Actor-Observer Asymmetry paper deserves attention beyond its trend score. While multi-agent research has focused on coordination protocols and communication patterns, this work identifies a cognitive bias at the representation level—actors and observers develop fundamentally different internal models of the same situation. For enterprise multi-agent deployments, this suggests orchestration layers must actively reconcile these divergent representations, not just manage message passing. The paper’s dialectical alignment approach could reduce the 30-40% coordination failure rates observed in current multi-agent production systems.
Key Implication: Engineering teams evaluating multi-agent frameworks should prioritize systems with explicit representation reconciliation mechanisms over purely protocol-based coordination. Benchmark the 14 new evaluation papers against your specific use case—generic benchmarks increasingly fail to capture domain-specific agent failures.
Sources
- ArXiv cs.AI RSS Feed — ArXiv, April 2026
- ArXiv cs.CL RSS Feed — ArXiv, April 2026
ArXiv cs.AI Agent Papers Weekly Tracker — Week of Apr 23, 2026
30 high-quality agent papers this week. Top: ReTAS addresses Actor-Observer Asymmetry in multi-agent systems. Benchmark papers +133%, RAG-Agent papers +260% week-over-week.
Data Overview
- Snapshot Week: 2026-04-16 to 2026-04-23
- Tracker: ArXiv cs.AI Agent Papers Weekly (view all snapshots:
/tech/ai-agents/data/?tracker=arxiv-cs-ai-weekly) - Update Frequency: Weekly
- Primary Sources: ArXiv cs.AI RSS, ArXiv cs.CL RSS
Key Facts
- Who: 30 agent-related papers from ArXiv cs.AI and cs.CL categories
- What: 28 agent-specific papers with average trend score 6.73; top paper addresses Actor-Observer Asymmetry in multi-agent systems
- When: Published between April 16-23, 2026
- Impact: Benchmark papers +133% WoW; RAG-Agent papers +260% WoW
Methodology
This tracker monitors agent-related research published on ArXiv in the cs.AI and cs.CL categories. Data collection spans April 16-23, 2026, with all papers filtered for agent relevance based on title and abstract keywords. Trend scores (1-10) are derived from early engagement signals including HuggingFace paper page views and discussion activity. Topic tags are extracted from abstract analysis covering: Agent, Multi-Agent, Reasoning, Benchmark, RAG, Tool-Use, and Autonomous.
This Week’s Data
| Title | ArXiv ID | Trend Score | Key Topics | Category |
|---|---|---|---|---|
| Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment | 2604.19548 | 10 | Agent, Multi-Agent, Reasoning, Benchmark, RAG, Autonomous | cs.CL |
| Rethinking Scale: Deployment Trade-offs of Small Language Models under Agent Paradigms | 2604.19299 | 9 | Agent, Multi-Agent, Reasoning, Tool-Use | cs.CL |
| Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models | 2604.18612 | 8 | Agent, Reasoning, RAG | cs.AI |
| From Craft to Kernel: A Governance-First Execution Architecture and Semantic ISA for Agentic Computers | 2604.18652 | 8 | Agent, Reasoning, RAG | cs.AI |
| Four-Axis Decision Alignment for Long-Horizon Enterprise AI Agents | 2604.19457 | 8 | Agent, Reasoning, Benchmark, RAG | cs.AI |
| Time Series Augmented Generation for Financial Applications | 2604.19633 | 8 | Agent, Reasoning, Benchmark, Tool-Use | cs.AI |
| SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal Large Language Models | 2604.19638 | 8 | Agent, Benchmark, RAG, Autonomous | cs.AI |
| Characterizing AlphaEarth Embedding Geometry for Agentic Environmental Reasoning | 2604.18715 | 7 | Agent, Reasoning, RAG | cs.AI |
| Mango: Multi-Agent Web Navigation via Global-View Optimization | 2604.18779 | 7 | Agent, Multi-Agent, RAG | cs.CL |
| AI scientists produce results without reasoning scientifically | 2604.18805 | 7 | Agent, Reasoning, Autonomous | cs.AI |
| How Adversarial Environments Mislead Agentic AI? | 2604.18874 | 7 | Agent, Benchmark, RAG, Tool-Use | cs.AI |
| Debating the Unspoken: Role-Anchored Multi-Agent Reasoning for Half-Truth Detection | 2604.19005 | 7 | Agent, Multi-Agent, Reasoning, RAG | cs.CL |
| On Accelerating Grounded Code Development for Research | 2604.19022 | 7 | Agent, Reasoning, RAG | cs.AI |
| Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture The Flag Challenges | 2604.19354 | 7 | Agent, Benchmark, Tool-Use, Autonomous | cs.AI |
| Multi-modal Reasoning with LLMs for Visual Semantic Arithmetic | 2604.19567 | 7 | Agent, Reasoning, Tool-Use | cs.AI |
| A-MAR: Agent-based Multimodal Art Retrieval for Fine-Grained Artwork Understanding | 2604.19689 | 7 | Agent, Reasoning, Benchmark, RAG | cs.AI |
| CentaurTA Studio: A Self-Improving Human-Agent Collaboration System for Thematic Analysis | 2604.18589 | 6 | Agent, RAG | cs.AI |
| ARGUS: Agentic GPU Optimization Guided by Data-Flow Invariants | 2604.18616 | 6 | Agent, Reasoning | cs.AI |
| Evaluating Answer Leakage Robustness of LLM Tutors against Adversarial Student Attacks | 2604.18660 | 6 | Agent, Multi-Agent | cs.AI |
| Towards Optimal Agentic Architectures for Offensive Security Tasks | 2604.18718 | 6 | Agent, Benchmark, Tool-Use | cs.AI |
| STAR-Teaming: A Strategy-Response Multiplex Network Approach to Automated LLM Red Teaming | 2604.18976 | 6 | Agent, Multi-Agent | cs.CL |
| Explicit Trait Inference for Multi-Agent Coordination | 2604.19278 | 6 | Agent, Multi-Agent | cs.AI |
| IndiaFinBench: An Evaluation Benchmark for Large Language Model Performance on Indian Financial Regulatory Text | 2604.19298 | 6 | Reasoning, Benchmark, RAG | cs.CL |
| Large Language Models Exhibit Normative Conformity | 2604.19301 | 6 | Agent, Multi-Agent | cs.AI |
| From Experience to Skill: Multi-Agent Generative Engine Optimization via Reusable Strategy Learning | 2604.19516 | 6 | Agent, Multi-Agent, Benchmark | cs.AI |
| A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression | 2604.19572 | 6 | Agent, Reasoning, Benchmark | cs.CL |
| Compile to Compress: Boosting Formal Theorem Provers by Compiler Outputs | 2604.18587 | 5 | Reasoning, RAG | cs.AI |
| Owner-Harm: A Missing Threat Model for AI Agent Safety | 2604.18658 | 5 | Agent, Benchmark | cs.AI |
| Human-Guided Harm Recovery for Computer Use Agents | 2604.18847 | 5 | Agent, RAG | cs.AI |
| AutomationBench | 2604.18934 | 5 | Agent, Benchmark, Autonomous | cs.AI |
Week-over-Week Summary
| Metric | This Week | Last Week | Change |
|---|---|---|---|
| Total agent papers | 28 | - | - |
| Multi-agent papers | 9 | 8 | +12.5% |
| Benchmark papers | 14 | 6 | +133.3% |
| RAG-related papers | 18 | 5 | +260.0% |
| Reasoning papers | 21 | - | - |
| Average trend score | 6.73 | - | - |
| Top trend score | 10 | 9 | +11.1% |
Top Papers This Week
1. Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment
ArXiv ID: 2604.19548 | Trend Score: 10/10
Key Topics: Agent, Multi-Agent, Reasoning, Benchmark, RAG, Autonomous
Summary: Large Language Model agents have evolved from static text generators into dynamic systems capable of executing complex autonomous workflows. This paper addresses a fundamental cognitive bias in multi-agent systems—the Actor-Observer Asymmetry—where agents acting versus observing the same situation develop divergent internal representations, leading to coordination failures. The authors propose ReTAS (Reflective Taming of Actor-Observer Asymmetry through Dialectical alignment), a framework that reconciles these asymmetries through dialectical reasoning.
2. Rethinking Scale: Deployment Trade-offs of Small Language Models under Agent Paradigms
ArXiv ID: 2604.19299 | Trend Score: 9/10
Key Topics: Agent, Multi-Agent, Reasoning, Tool-Use
Summary: Despite impressive capabilities of large language models, their substantial computational costs, latency, and privacy risks hinder widespread deployment. This paper systematically examines whether small language models (SLMs) can effectively serve as agent backbones, identifying the efficiency frontier where SLMs outperform LLMs in specific agent tasks while falling short in others.
3. Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization
ArXiv ID: 2604.18612 | Trend Score: 8/10
Key Topics: Agent, Reasoning, RAG
Summary: Large Language Models have demonstrated strong capabilities in complex reasoning tasks, with prompting strategies such as Chain-of-Thought elevating performance. Agent-GWO introduces a collaborative multi-agent framework that dynamically optimizes prompts through grey wolf optimization-inspired coordination, achieving improved reasoning accuracy without model fine-tuning.
4. From Craft to Kernel: A Governance-First Execution Architecture for Agentic Computers
ArXiv ID: 2604.18652 | Trend Score: 8/10
Key Topics: Agent, Reasoning, RAG
Summary: The transition of agentic AI from brittle prototypes to production systems is stalled by a pervasive crisis of craft. This paper proposes a governance-first execution architecture with a semantic Instruction Set Architecture (ISA), treating agent coordination as a kernel-level concern rather than delegating to ad-hoc orchestration layers.
5. SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal LLMs
ArXiv ID: 2604.19638 | Trend Score: 8/10
Key Topics: Agent, Benchmark, RAG, Autonomous
Summary: Multimodal Large Language Models are increasingly adopted as autonomous agents in interactive environments, yet their ability to proactively address safety hazards remains insufficient. SafetyALFRED introduces a comprehensive benchmark for evaluating safety-conscious planning in multimodal agents across diverse environmental hazards.
Trends & Observations
Trend 1: Actor-Observer Asymmetry Emerges as New Research Direction
The top paper this week introduces ReTAS, addressing a cognitive bias in multi-agent systems where agents develop divergent representations based on their role (actor vs. observer). This represents a shift from treating multi-agent coordination as purely architectural to examining the epistemic foundations of agent collaboration.
Trend 2: Benchmark Proliferation Signals Maturation
Benchmark papers increased 133% week-over-week (from 6 to 14), indicating the field is transitioning from capability demonstrations to standardized evaluation. New benchmarks span safety (SafetyALFRED, Owner-Harm), domain-specific tasks (IndiaFinBench, Time Series Augmented Generation), and agent coordination (AutomationBench).
Trend 3: RAG-Agent Convergence Accelerates
RAG-related papers increased 260% (from 5 to 18), the largest growth among tracked categories. Papers this week show RAG being integrated into agent architectures for code development, art retrieval, financial applications, and environmental reasoning—suggesting retrieval is becoming a core agent capability rather than an external tool.
Trend 4: Small Language Model Efficiency Frontier
Multiple papers explore SLM deployment in agent paradigms, examining the trade-offs between model scale and agent-specific capabilities. This reflects growing industry concern about inference costs as agent workloads require multiple model calls per task.
Trend 5: Safety Evaluation Expands Beyond Generic Harm
New benchmarks like Owner-Harm and Human-Guided Harm Recovery address commercially consequential threat models—agents causing financial or operational damage to their owners—rather than focusing solely on criminal harm scenarios.
🔺 Scout Intel: What Others Missed
Confidence: high | Novelty Score: 65/100
The 260% surge in RAG-Agent papers represents more than incremental interest—it signals a fundamental architectural shift. Papers this week treat retrieval not as an external tool but as an intrinsic agent capability, with frameworks like A-MAR (art retrieval) and AlphaEarth (environmental reasoning) embedding retrieval directly into agent reasoning loops. This convergence pattern mirrors the 2017-2018 transition when attention mechanisms moved from auxiliary components to transformer architectures’ core primitive.
The Actor-Observer Asymmetry paper deserves attention beyond its trend score. While multi-agent research has focused on coordination protocols and communication patterns, this work identifies a cognitive bias at the representation level—actors and observers develop fundamentally different internal models of the same situation. For enterprise multi-agent deployments, this suggests orchestration layers must actively reconcile these divergent representations, not just manage message passing. The paper’s dialectical alignment approach could reduce the 30-40% coordination failure rates observed in current multi-agent production systems.
Key Implication: Engineering teams evaluating multi-agent frameworks should prioritize systems with explicit representation reconciliation mechanisms over purely protocol-based coordination. Benchmark the 14 new evaluation papers against your specific use case—generic benchmarks increasingly fail to capture domain-specific agent failures.
Sources
- ArXiv cs.AI RSS Feed — ArXiv, April 2026
- ArXiv cs.CL RSS Feed — ArXiv, April 2026
Related Intel
LLM Product Release Weekly Tracker
Weekly tracking of LLM product releases from OpenAI, Anthropic, Google, Mistral, and Cohere. Updated April 21, 2026 with 22 new entries including GPT-Rosalind, Claude Opus 4.7, and Gemini Robotics-ER 1.6.
Hermes Agent Hits 95K Stars, Ships Self-Improving AI Framework
Hermes Agent v0.10.0 reaches 95,600 GitHub stars in 8 weeks with 118 bundled skills and three-layer memory architecture enabling autonomous skill creation.
GitHub AI Agent Repository Stars Tracker - Weekly Update
AutoGPT leads with 183.5K stars, Hermes-Agent surges 48.2% weekly approaching 100K milestone. Low-code platforms Langflow (147K) and Dify (138K) compete for dominance. System prompt transparency repos emerge as new category in top 10.