ArXiv cs.AI Weekly Tracker - Week of May 28, 2026
Self-improving agent frameworks emerge with MUSE-Autoskill and SIA. FinHarness and QUACK advance domain-specific safety. RLHF vulnerability identified in ICML 2026 paper.
Data Overview
- Snapshot Week: 2026-05-22 to 2026-05-28
- Tracker: ArXiv cs.AI Weekly Papers Tracker (view all historical snapshots: /tech/ai-agents/data/?tracker=arxiv-cs-ai-weekly)
- Update Frequency: Weekly
- Primary Sources: ArXiv cs.CL API, Brave Search
Key Facts
- Who: 18 agent-related papers, collected mainly through the ArXiv cs.CL query (the cs.AI and cs.MA queries were rate-limited)
- What: Self-improving agent frameworks (MUSE-Autoskill, SIA) dominate; domain-specific safety harnesses emerge (FinHarness, ENPMR-Bench); RLHF vulnerability identified
- When: Week of May 22-28, 2026
- Impact: 36% agent-relevance rate; 3 multi-agent papers; avg trend score 5.2 for agent papers vs 2.4 overall
Methodology
Papers are collected weekly from ArXiv API queries targeting cs.CL, cs.AI, and related categories. Agent-related papers are identified through keyword matching on titles and abstracts (agent, multi-agent, autonomous, tool use, planning, reasoning). Trend Scores (1-10) are assigned based on relevance to core agent research themes, novelty of approach, and potential impact.
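The keyword filter described above can be sketched as follows. This is a minimal illustration that assumes case-insensitive, word-prefix matching; the tracker's actual matching rules are not documented here, and `AGENT_KEYWORDS` and `is_agent_related` are hypothetical names.

```python
import re

# Keywords taken from the methodology paragraph; the matching rules are an assumption.
AGENT_KEYWORDS = ("agent", "multi-agent", "autonomous", "tool use", "planning", "reasoning")

# A leading word boundary lets "agent" match "agents" but not "reagent".
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in AGENT_KEYWORDS) + r")",
    re.IGNORECASE,
)


def is_agent_related(title: str, abstract: str = "") -> bool:
    """Return True if any keyword appears in the title or abstract."""
    return bool(_PATTERN.search(f"{title} {abstract}"))


if __name__ == "__main__":
    print(is_agent_related("FinHarness: An Inline Lifecycle Safety Harness for Finance LLM Agents"))
    print(is_agent_related("A Survey of Image Compression"))  # hypothetical title
```

A broad keyword such as "reasoning" will pull in papers that are only loosely agent-related, which is one reason the Trend Score is assigned as a separate step.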
This snapshot reflects papers submitted or updated during the week of May 22-28, 2026. Collection was limited by API rate limits on cs.AI and cs.MA categories; Brave Search provided supplementary coverage.
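For readers who want to reproduce a comparable collection run, the sketch below builds an arXiv API query for one category and date window, and wraps a fetch call in exponential backoff for rate-limited responses. The query parameters follow the public arXiv API; the helper names, the retry count, and the delays are assumptions, not the tracker's actual configuration.

```python
import time
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(category: str, start_date: str, end_date: str, max_results: int = 50) -> str:
    """Build a query URL for one category and a YYYYMMDD date window."""
    search = f"cat:{category} AND submittedDate:[{start_date}0000 TO {end_date}2359]"
    params = {
        "search_query": search,
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def fetch_with_backoff(fetch, url, retries=3, base_delay=3.0, sleep=time.sleep):
    """Call fetch(url), retrying on OSError with a doubling delay (assumed policy)."""
    for attempt in range(retries):
        try:
            return fetch(url)
        except OSError:
            if attempt == retries - 1:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    print(build_query_url("cs.CL", "20260522", "20260528"))
```

`fetch` is passed in as a parameter so the retry logic can be exercised without network access; in practice it could be a thin wrapper around `urllib.request.urlopen`, whose `URLError` is a subclass of `OSError`.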
This Week’s Data
Top Papers by Trend Score
| Rank | Title | ArXiv ID | Trend | Key Innovation |
|---|---|---|---|---|
| 1 | MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation | 2605.27366 | 8 | Unified skill lifecycle management (creation, memory, evaluation, refinement) |
| 2 | SIA: Self Improving AI with Harness & Weight Updates | 2605.27276 | 8 | Combined harness and weight updates for autonomous improvement (56.6% LawBench gain) |
| 3 | FinHarness: An Inline Lifecycle Safety Harness for Finance LLM Agents | 2605.27333 | 7 | Finance-specific safety harness (ASR reduced from 38.3% to 15.0%) |
| 4 | QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents | 2605.27068 | 7 | Multimodal agent auditing (15.1% spatial hallucination, 50%+ baseless accusations) |
| 5 | Alignment Tampering: How RLHF Is Exploited to Optimize Misaligned Biases | 2605.27355 | 6 | RLHF vulnerability when LLM influences preference datasets (ICML 2026) |
| 6 | ENPMR-Bench: Benchmarking Proactive Memory Retrieval for Emotional Support Agents | 2605.27240 | 5 | Maslow-grounded proactive memory retrieval for emotional support |
Full Agent-Related Papers (18 papers; 12 listed)
| ArXiv ID | Title | Category | Trend | Focus |
|---|---|---|---|---|
| 2605.27366 | MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation | cs.AI | 8 | Self-improving, skill lifecycle |
| 2605.27276 | SIA: Self Improving AI with Harness & Weight Updates | cs.AI | 8 | Self-improving, meta-agent |
| 2605.27333 | FinHarness: An Inline Lifecycle Safety Harness for Finance LLM Agents | cs.CL | 7 | Safety harness, finance |
| 2605.27068 | QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents | cs.CL | 7 | Multimodal, auditing, hallucination |
| 2605.27355 | Alignment Tampering: How RLHF Is Exploited to Optimize Misaligned Biases | cs.AI | 6 | RLHF, alignment, safety |
| 2605.27240 | ENPMR-Bench: Benchmarking Proactive Memory Retrieval for Emotional Support Agents | cs.CL | 5 | Emotional support, memory |
| 2605.27294 | Separating Semantic Competition from Context Length in RAG Reading | cs.CL | 3 | RAG, retrieval |
| 2605.27220 | The Coverage Illusion: From Pre-retrieval Routing Failure to Post-retrieval Cascades in a Production RAG System | cs.CL | 3 | RAG, production |
| 2605.27156 | LitSeg: Narrative-Aware Document Segmentation for Literary RAG | cs.CL | 4 | RAG, segmentation |
| 2605.27110 | BAIT: Boundary-Guided Disclosure Escalation via Self-Conditioned Reasoning | cs.CR | 4 | Jailbreak, agent safety |
| 2605.27030 | Share More, Search Less: Collaborative Parallel Thinking for Efficient Test-Time Scaling | cs.CL | 4 | Reasoning, test-time scaling |
| 2605.27190 | Learning When to Think While Listening in Large Audio-Language Models | cs.CL | 4 | Audio-language, reasoning |
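The table above shows 12 rows. As a quick consistency check, the sketch below averages the Trend column of those listed rows; the variable names are illustrative only.

```python
# Trend scores of the 12 rows listed in the table above, in table order.
listed_trend_scores = [8, 8, 7, 7, 6, 5, 3, 3, 4, 4, 4, 4]

mean_trend = sum(listed_trend_scores) / len(listed_trend_scores)

if __name__ == "__main__":
    print(len(listed_trend_scores), mean_trend)
```

The listed rows average 5.25, which is in line with the reported average trend score of 5.2 for agent papers.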
Week-over-Week Summary
| Metric | This Week | Last Week | Change |
|---|---|---|---|
| Total papers collected | 50 | 498 | -448 (-90.0%) |
| Agent-related papers | 18 | 167 | -149 (-89.2%) |
| Multi-agent systems | 3 | 28 | -25 (-89.3%) |
| Avg trend score (agent) | 5.2 | - | N/A |
| Top trend score | 8 | 10 | -2 |
Note: This week’s collection was impacted by ArXiv API rate limits (cs.AI and cs.MA categories blocked; cs.CL succeeded). The 90.0% reduction in total papers reflects partial coverage, not an actual decline in submissions. Full coverage is expected to resume next week.
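The Change column is plain week-over-week arithmetic. A minimal sketch, assuming percentages are rounded to one decimal place (`week_over_week` is a hypothetical helper name):

```python
def week_over_week(this_week: int, last_week: int) -> tuple[int, float]:
    """Return (absolute change, percent change rounded to one decimal)."""
    delta = this_week - last_week
    pct = delta / last_week * 100
    return delta, round(pct, 1)


if __name__ == "__main__":
    for label, now, prev in [
        ("Total papers collected", 50, 498),
        ("Agent-related papers", 18, 167),
        ("Multi-agent systems", 3, 28),
    ]:
        print(label, week_over_week(now, prev))
```

Rounded to one decimal, 448/498 comes out at 90.0%. All three metrics fall by roughly the same proportion, which is what a uniform coverage gap would produce.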
Ecosystem Metrics
| Category | Count | Percentage |
|---|---|---|
| Total papers scanned | 50 | 100% |
| Agent-related papers | 18 | 36.0% |
| Multi-agent systems | 3 | 6.0% |
| Safety-related | 4 | 8.0% |
| RAG-related | 4 | 8.0% |
| Reasoning | 5 | 10.0% |
| Multimodal | 2 | 4.0% |
Category Distribution
| Primary Category | Count | Percentage |
|---|---|---|
| cs.CL | 32 | 64.0% |
| cs.AI | 8 | 16.0% |
| cs.LG | 6 | 12.0% |
| cs.CV | 2 | 4.0% |
| cs.CR | 1 | 2.0% |
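The percentages in this table are shares of the 50 scanned papers. The sketch below recomputes them and checks how many papers the listed categories account for; the names are illustrative.

```python
TOTAL_SCANNED = 50

# Counts copied from the Category Distribution table.
category_counts = {"cs.CL": 32, "cs.AI": 8, "cs.LG": 6, "cs.CV": 2, "cs.CR": 1}

# Share of the 50 scanned papers, in percent, rounded to one decimal.
shares = {cat: round(n / TOTAL_SCANNED * 100, 1) for cat, n in category_counts.items()}

listed_total = sum(category_counts.values())
unlisted = TOTAL_SCANNED - listed_total

if __name__ == "__main__":
    print(shares)
    print(listed_total, unlisted)
```

The listed categories cover 49 of the 50 papers (98.0%), so one paper's primary category is not shown in the table.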
Topic Clusters
| Cluster | Papers | Keywords |
|---|---|---|
| Self-improving agents | 3 | skill lifecycle, weight updates, meta-agent |
| Safety harnesses | 4 | finance, emotional support, jailbreak, RLHF |
| RAG optimization | 4 | retrieval, segmentation, coverage, competition |
| Multimodal auditing | 2 | hallucination, social deduction |
| Reasoning control | 2 | test-time scaling, audio-language |
Trends & Observations
- Self-Improving Agent Architectures Converge: MUSE-Autoskill and SIA independently arrive at similar architectures that pair skill lifecycle management with weight and harness updates. This convergence suggests that a canonical approach to agent autonomy may be emerging. MUSE-Autoskill supplies the framework (skill creation, memory, evaluation, refinement), while SIA reports a 56.6% gain on LawBench from combined harness and weight updates.
- Domain-Specific Safety Harnesses Emerge: Generic agent safety frameworks are giving way to specialized solutions. FinHarness targets finance LLM agents with a three-module architecture (Query Monitor, Tool Monitor, Cascade), reducing attack success rate (ASR) from 38.3% to 15.0% while preserving benign approval rates. ENPMR-Bench addresses emotional support agents with Maslow-grounded proactive memory retrieval. This specialization trend suggests that one-size-fits-all safety is insufficient for production deployment.
- RLHF Structural Vulnerability Identified: Alignment Tampering (accepted to ICML 2026) identifies a flaw in RLHF’s preference feedback loop: when LLM outputs influence preference datasets, training can amplify misaligned biases rather than correct them. This is a structural vulnerability in the RLHF paradigm itself, not an implementation bug.
- Multimodal Hallucination Persists: QUACK finds that top vision-language models hallucinate 15.1% of spatial claims and make more than 50% of their accusations without grounded evidence in social deduction scenarios. The framework introduces a systematic auditing methodology, but the results underscore that multimodal grounding remains unsolved.
- RAG Understanding Deepens: Three RAG papers advance retrieval understanding from different angles. The Coverage Illusion exposes the gap between synthetic and real query distributions; LitSeg brings narrative-aware segmentation to literary works; Semantic Competition isolates retrieval interference from context length effects. Collectively, these suggest production RAG systems have systematic blind spots.
🔺 Scout Intel: What Others Missed
Confidence: high | Novelty Score: 65/100
This week’s ArXiv snapshot reveals three emerging patterns that mainstream coverage overlooks:
1. Self-improving agent convergence: MUSE-Autoskill and SIA independently arrive at similar architectures (skill lifecycle combined with weight/harness updates), suggesting this may become the canonical approach for agent autonomy. The convergence across research teams (Huawei, independent researchers) points to a theoretical attractor rather than coincidence.
2. Domain-specific safety harnesses: FinHarness (finance) and ENPMR-Bench (emotional support) suggest that general agent safety frameworks need domain-specific tuning to achieve practical protection rates. FinHarness’s ASR reduction from 38.3% to 15.0% comes from finance-specific modules (Query Monitor, Tool Monitor, Cascade) that understand transaction semantics. By implication, generic safety benchmarks may overestimate protection for vertical applications.
3. RLHF structural vulnerability: Alignment Tampering (ICML 2026) shows that RLHF’s preference feedback loop can be exploited, a fundamental flaw that may require rethinking post-training alignment. The paper demonstrates that when LLM outputs influence preference datasets, the optimization process amplifies undesired behaviors rather than correcting them. This has implications for any frontier model provider that relies on RLHF as its primary alignment mechanism.
Key Implication: Teams deploying agents in production should evaluate domain-specific safety harnesses rather than relying on generic safety benchmarks. FinHarness’s 23.3 percentage point ASR improvement suggests that safety measurement is currently misaligned with deployment reality.
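The 23.3 percentage point figure is an absolute difference, which is not the same as the relative reduction. A short sketch of both calculations from the reported ASR values (`asr_reduction` is a hypothetical helper name):

```python
def asr_reduction(before: float, after: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute_pp = before - after
    relative_pct = absolute_pp / before * 100
    return round(absolute_pp, 1), round(relative_pct, 1)


if __name__ == "__main__":
    # Attack success rate with and without the harness, as reported for FinHarness.
    print(asr_reduction(38.3, 15.0))
```

An absolute drop of 23.3 points from a 38.3% baseline is a relative reduction of about 60.8%. The residual 15.0% ASR is the figure to weigh against deployment risk.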
Sources
- ArXiv cs.CL API - Primary source for NLP and computational linguistics papers (successful)
- ArXiv API Agent Query - Supplementary agent-focused queries
- Brave Search - Fallback for rate-limited categories
Collection Note: This snapshot achieved partial coverage due to ArXiv API rate limits affecting cs.AI and cs.MA categories. Full coverage is expected to resume in next week’s snapshot.