ArXiv cs.AI Weekly Tracker - Week of May 28, 2026

Name: ArXiv cs.AI Weekly Tracker - Week of May 28, 2026
Creator: AgentScout
Published: 2026-05-28T00:00:00.000Z
Keywords: arxiv, ai-papers, agents, weekly-tracker, self-improving-agents, safety-harness, rlhf, multimodal

Self-improving agent frameworks emerge with MUSE-Autoskill and SIA. FinHarness and QUACK advance domain-specific safety. RLHF vulnerability identified in ICML 2026 paper.

AgentScout · Published May 28, 2026 · Updated May 28, 2026 · 8 min read

#arxiv #ai-papers #agents #weekly-tracker #self-improving-agents #safety-harness #rlhf #multimodal

Data Overview

Snapshot Week: 2026-05-22 to 2026-05-28
Tracker: ArXiv cs.AI Weekly Papers Tracker (view all historical snapshots: /tech/ai-agents/data/?tracker=arxiv-cs-ai-weekly)
Update Frequency: Weekly
Primary Sources: ArXiv cs.CL API, Brave Search

Key Facts

Who: 18 agent-related papers, collected mainly from ArXiv cs.CL (the primary category this week because API rate limits blocked cs.AI and cs.MA)
What: Self-improving agent frameworks (MUSE-Autoskill, SIA) dominate; domain-specific safety harnesses emerge (FinHarness, ENPMR-Bench); RLHF vulnerability identified
When: Week of May 22-28, 2026
Impact: 36% agent-relevance rate; 3 multi-agent papers; avg trend score 5.2 for agent papers vs 2.4 overall

Methodology

Papers are collected weekly from ArXiv API queries targeting cs.CL, cs.AI, and related categories. Agent-related papers are identified through keyword matching on titles and abstracts (agent, multi-agent, autonomous, tool use, planning, reasoning). Trend Scores (1-10) are assigned based on relevance to core agent research themes, novelty of approach, and potential impact.
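The keyword-matching step described above can be sketched as follows. This is a minimal illustration, not the tracker's actual code: the keyword list is taken from the paragraph above, while the function name `is_agent_related`, the word-boundary matching, and the handling of simple plurals are assumptions made for the example.

```python
import re

# Keywords listed in the Methodology paragraph.
AGENT_KEYWORDS = ["agent", "multi-agent", "autonomous", "tool use", "planning", "reasoning"]

# Assumption: match whole words only, case-insensitively, and allow a
# trailing "s" so that simple plurals such as "agents" also count.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in AGENT_KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)


def is_agent_related(title: str, abstract: str) -> bool:
    """Return True if any keyword occurs in the title or the abstract."""
    return bool(_PATTERN.search(f"{title} {abstract}"))
```

Broad keywords such as "reasoning" will match many papers that are not about agents, which is one reason a separate Trend Score is still useful after this filter.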

This snapshot reflects papers submitted or updated during the week of May 22-28, 2026. Collection was limited by API rate limits on cs.AI and cs.MA categories; Brave Search provided supplementary coverage.
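For readers who want to reproduce a per-category collection, the sketch below builds a query URL for the public arXiv API and produces a simple retry schedule for rate-limited requests. The endpoint and the `search_query`, `sortBy`, `sortOrder`, `start`, and `max_results` parameters belong to the arXiv API; the function names, the page size of 50, and the 3-second base delay are assumptions for illustration, not details of this tracker's pipeline.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(category: str, start: int = 0, max_results: int = 50) -> str:
    """Build an arXiv API URL listing the newest submissions in one category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": start,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def backoff_delays(retries: int, base_seconds: float = 3.0) -> list[float]:
    """Exponential backoff schedule (hypothetical policy): base, 2*base, 4*base, ..."""
    return [base_seconds * 2 ** attempt for attempt in range(retries)]
```

A collector would request `build_query_url("cs.AI")`, and on a rate-limit response sleep for each delay in `backoff_delays(3)` in turn before falling back to another source.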

This Week’s Data

Top Papers by Trend Score

Rank	Title	ArXiv ID	Trend	Key Innovation
1	MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation	2605.27366	8	Unified skill lifecycle management (creation, memory, evaluation, refinement)
2	SIA: Self Improving AI with Harness & Weight Updates	2605.27276	8	Combined harness and weight updates for autonomous improvement (56.6% LawBench gain)
3	FinHarness: An Inline Lifecycle Safety Harness for Finance LLM Agents	2605.27333	7	Finance-specific safety harness (ASR reduced from 38.3% to 15.0%)
4	QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents	2605.27068	7	Multimodal agent auditing (15.1% spatial hallucination, 50%+ baseless accusations)
5	Alignment Tampering: How RLHF Is Exploited to Optimize Misaligned Biases	2605.27355	6	RLHF vulnerability when LLM influences preference datasets (ICML 2026)
6	ENPMR-Bench: Benchmarking Proactive Memory Retrieval for Emotional Support Agents	2605.27240	5	Maslow-grounded proactive memory retrieval for emotional support

Full Agent-Related Papers (18 papers)

ArXiv ID	Title	Category	Trend	Focus
2605.27366	MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation	cs.AI	8	Self-improving, skill lifecycle
2605.27276	SIA: Self Improving AI with Harness & Weight Updates	cs.AI	8	Self-improving, meta-agent
2605.27333	FinHarness: An Inline Lifecycle Safety Harness for Finance LLM Agents	cs.CL	7	Safety harness, finance
2605.27068	QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents	cs.CL	7	Multimodal, auditing, hallucination
2605.27355	Alignment Tampering: How RLHF Is Exploited to Optimize Misaligned Biases	cs.AI	6	RLHF, alignment, safety
2605.27240	ENPMR-Bench: Benchmarking Proactive Memory Retrieval for Emotional Support Agents	cs.CL	5	Emotional support, memory
2605.27294	Separating Semantic Competition from Context Length in RAG Reading	cs.CL	3	RAG, retrieval
2605.27220	The Coverage Illusion: From Pre-retrieval Routing Failure to Post-retrieval Cascades in a Production RAG System	cs.CL	3	RAG, production
2605.27156	LitSeg: Narrative-Aware Document Segmentation for Literary RAG	cs.CL	4	RAG, segmentation
2605.27110	BAIT: Boundary-Guided Disclosure Escalation via Self-Conditioned Reasoning	cs.CR	4	Jailbreak, agent safety
2605.27030	Share More, Search Less: Collaborative Parallel Thinking for Efficient Test-Time Scaling	cs.CL	4	Reasoning, test-time scaling
2605.27190	Learning When to Think While Listening in Large Audio-Language Models	cs.CL	4	Audio-language, reasoning

Week-over-Week Summary

Metric	This Week	Last Week	Change
Total papers collected	50	498	-448 (-90.0%)
Agent-related papers	18	167	-149 (-89.2%)
Multi-agent systems	3	28	-25 (-89.3%)
Avg trend score (agent)	5.2	-	N/A
Top trend score	8	10	-2

Note: This week’s collection was impacted by ArXiv API rate limits (the cs.AI and cs.MA categories were blocked; cs.CL succeeded). The 90.0% reduction in total papers reflects partial coverage, not an actual decline in submissions. Full coverage is expected to resume next week.
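The Change column follows the usual week-over-week formula: the absolute difference, and that difference as a percentage of last week's value, rounded to one decimal place. A minimal sketch, assuming a hypothetical helper named `week_over_week` and using the counts from the table above:

```python
def week_over_week(current: float, previous: float) -> tuple[float, float]:
    """Return (absolute change, percent change relative to the previous week)."""
    if previous == 0:
        raise ValueError("previous week's value must be non-zero")
    delta = current - previous
    return delta, round(100 * delta / previous, 1)


# (this week, last week) pairs taken from the Week-over-Week Summary table.
rows = {
    "Total papers collected": (50, 498),
    "Agent-related papers": (18, 167),
    "Multi-agent systems": (3, 28),
}

for metric, (this_week, last_week) in rows.items():
    delta, pct = week_over_week(this_week, last_week)
    print(f"{metric}: {delta:+d} ({pct:+.1f}%)")
```

Because this week's counts reflect partial coverage, the percentages describe the collection shortfall rather than any change in submission volume.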

Ecosystem Metrics

Category	Count	Percentage
Total papers scanned	50	100%
Agent-related papers	18	36.0%
Multi-agent systems	3	6.0%
Safety-related	4	8.0%
RAG-related	4	8.0%
Reasoning	5	10.0%
Multimodal	2	4.0%

Category Distribution

Primary Category	Count	Percentage
cs.CL	32	64.0%
cs.AI	8	16.0%
cs.LG	6	12.0%
cs.CV	2	4.0%
cs.CR	1	2.0%

Topic Clusters

Cluster	Papers	Keywords
Self-improving agents	3	skill lifecycle, weight updates, meta-agent
Safety harnesses	4	finance, emotional support, jailbreak, RLHF
RAG optimization	4	retrieval, segmentation, coverage, competition
Multimodal auditing	2	hallucination, social deduction
Reasoning control	2	test-time scaling, audio-language

Trends & Observations

Self-Improving Agent Architectures Converge: MUSE-Autoskill and SIA independently arrive at similar architectures that combine skill lifecycle management with weight and harness updates. This convergence suggests that a canonical approach to agent autonomy may be emerging. MUSE-Autoskill provides the theoretical framework (creation, memory, evaluation, refinement), while SIA supplies empirical support, reporting a 56.6% improvement on LawBench.
Domain-Specific Safety Harnesses Emerge: Generic agent safety frameworks are giving way to specialized solutions. FinHarness targets finance LLM agents with a three-module architecture (Query Monitor, Tool Monitor, Cascade), reducing attack success rate (ASR) from 38.3% to 15.0% while preserving benign approval rates. ENPMR-Bench addresses emotional support agents with Maslow-grounded proactive memory retrieval. This specialization trend indicates that one-size-fits-all safety is insufficient for production deployment.
RLHF Structural Vulnerability Identified: Alignment Tampering (accepted to ICML 2026) demonstrates a fundamental flaw in RLHF’s preference feedback loop: when LLM outputs influence preference datasets, the training process can amplify misaligned biases rather than correct them. This is not an implementation bug but a structural vulnerability in the RLHF paradigm itself.
Multimodal Hallucination Persists: QUACK reveals that top vision-language models hallucinate 15.1% of spatial claims and make more than 50% of accusations without grounded evidence in social deduction scenarios. The framework introduces a systematic auditing methodology, but the results underscore that multimodal grounding remains unsolved.
RAG Understanding Deepens: Three RAG papers advance retrieval understanding from different angles. The Coverage Illusion exposes the gap between synthetic and real query distributions; LitSeg brings narrative-aware segmentation to literary works; and Separating Semantic Competition from Context Length in RAG Reading isolates retrieval interference from context length effects. Collectively, these suggest that production RAG systems have systematic blind spots.

🔺 Scout Intel: What Others Missed

Confidence: high | Novelty Score: 65/100

This week’s ArXiv snapshot reveals three emerging patterns that mainstream coverage overlooks:

1. Self-improving agent convergence: MUSE-Autoskill and SIA independently arrive at similar architectures (skill lifecycle management combined with weight and harness updates), suggesting this may become the canonical approach for agent autonomy. That the convergence spans separate research teams (Huawei, independent researchers) points to a theoretical attractor rather than coincidence.

2. Domain-specific safety harnesses: FinHarness (finance) and ENPMR-Bench (emotional support) demonstrate that general agent safety frameworks need domain-specific tuning to achieve practical protection rates. FinHarness’s ASR reduction from 38.3% to 15.0% comes from finance-specific modules (Query Monitor, Tool Monitor, Cascade) that understand transaction semantics. This suggests that generic safety benchmarks overestimate protection for vertical applications.

3. RLHF structural vulnerability: Alignment Tampering (ICML 2026) shows that RLHF’s preference feedback loop can be exploited, a fundamental flaw that may require rethinking post-training alignment. The paper demonstrates that when LLM outputs influence preference datasets, the optimization process amplifies undesired behaviors rather than correcting them. This has implications for all frontier model providers that currently rely on RLHF as their primary alignment mechanism.

Key Implication: Teams deploying agents in production should evaluate domain-specific safety harnesses rather than relying on generic safety benchmarks. FinHarness’s 23.3 percentage point ASR improvement suggests that safety measurement is currently misaligned with deployment reality.
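The 23.3 percentage point figure is the absolute drop in attack success rate. The short sketch below reproduces it from the two reported ASR values and also derives the relative reduction, which is not stated in the source and is plain arithmetic on those two numbers. The function name `asr_reduction` is hypothetical.

```python
def asr_reduction(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    if before_pct <= 0:
        raise ValueError("baseline ASR must be positive")
    absolute = before_pct - after_pct
    relative = 100 * absolute / before_pct
    return round(absolute, 1), round(relative, 1)


# Reported FinHarness values: ASR falls from 38.3% to 15.0%.
absolute_pp, relative_pct = asr_reduction(38.3, 15.0)
print(f"Absolute reduction: {absolute_pp} percentage points")
print(f"Relative reduction: {relative_pct}% of the baseline ASR")
```

Keeping the two measures separate matters when comparing harnesses: the same percentage-point drop is a much larger relative improvement on a low baseline than on a high one.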

Sources

ArXiv cs.CL API - Primary source for NLP and computational linguistics papers (successful)
ArXiv API Agent Query - Supplementary agent-focused queries
Brave Search - Fallback for rate-limited categories

Collection Note: This snapshot achieved partial coverage due to ArXiv API rate limits affecting cs.AI and cs.MA categories. Full coverage is expected to resume in next week’s snapshot.
