All AI Papers - 2026-05-06

1. OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories


2. Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours


3. SymptomAI: Towards a Conversational AI Agent for Everyday Symptom Assessment


4. An Agent-Oriented Pluggable Experience-RAG Skill for Experience-Driven Retrieval Strategy Orchestration


5. From Intent to Execution: Composing Agentic Workflows with Agent Recommendation


6. Contextual Multi-Objective Optimization: Rethinking Objectives in Frontier AI Systems


7. QKVShare: Quantized KV-Cache Handoff for Multi-Agent On-Device LLMs


8. EvoLM: Self-Evolving Language Models through Co-Evolved Discriminative Rubrics


9. Quantifying the human visual exposome with vision language models


10. Correct Is Not Enough: Training Reasoning Planners with Executor-Grounded Rewards


11. Mechanical Conscience: A Mathematical Framework for Dependability of Machine Intelligence


12. SOAR: Real-Time Joint Optimization of Order Allocation and Robot Scheduling in Robotic Mobile Fulfillment Systems


13. Agentic-imodels: Evolving agentic interpretability tools via autoresearch


14. ScrapMem: A Bio-inspired Framework for On-device Personalized Agent Memory via Optical Forgetting


15. Say the Mission, Execute the Swarm: Agent-Enhanced LLM Reasoning in the Web-of-Drones


16. What You Think is What You See: Driving Exploration in VLM Agents via Visual-Linguistic Curiosity


17. OracleProto: A Reproducible Framework for Benchmarking LLM Native Forecasting via Knowledge Cutoff and Temporal Masking


18. MEMTIER: Tiered Memory Architecture and Retrieval Bottleneck Analysis for Long-Running Autonomous AI Agents


19. Agent-Based Modeling of Low-Emission Fertilizer Adoption for Dairy Farm Decarbonisation using Empirical Farm Data


20. AdapShot: Adaptive Many-Shot In-Context Learning with Semantic-Aware KV Cache Reuse


21. Self-Improvement for Fast, High-Quality Plan Generation


22. Where Paths Split: Localized, Calibrated Control of Moral Reasoning in Large Language Models


23. Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies


24. Real-Time Evaluation of Autonomous Systems under Adversarial Attacks


25. FinSTaR: Towards Financial Reasoning with Time Series Reasoning Models


26. Replacing Parameters with Preferences: Federated Alignment of Heterogeneous Vision-Language Models


27. Adaptive Dual-Path Framework for Covert Semantic Communication


28. Geometry over Density: Few-Shot Cross-Domain OOD Detection


29. Robust Agent Compensation (RAC): Teaching AI Agents to Compensate


30. GeoDecider: A Coarse-to-Fine Agentic Workflow for Explainable Lithology Classification


31. ReasonAudio: A Benchmark for Evaluating Reasoning Beyond Matching in Text-Audio Retrieval


32. What Happens Inside Agent Memory? Circuit Analysis from Emergence to Diagnosis


33. Automated Large-scale CVRP Solver Design via LLM-assisted Flexible MCTS


34. Revisiting the Travel Planning Capabilities of Large Language Models


35. Enhancing Agent Safety Judgment: Controlled Benchmark Rewriting and Analogical Reasoning for Deceptive Out-of-Distribution Scenarios


36. cotomi Act: Learning to Automate Work by Watching You


37. Evaluating Prompting and Execution-Based Methods for Deterministic Computation in LLMs


38. ADAPTS: Agentic Decomposition for Automated Protocol-agnostic Tracking of Symptoms


39. Stop Automating Peer Review Without Rigorous Evaluation


40. Terminus-4B: Can a Smaller Model Replace Frontier LLMs at Agentic Execution Tasks?


41. Learning Correct Behavior from Examples: Validating Sequential Execution in Autonomous Agents


42. Are you with me? A Framework for Detecting Mental Model Discrepancies in Task-Based Team Dialogues


43. Programmatic Context Augmentation for LLM-based Symbolic Regression


44. Making the Invisible Visible: Understanding the Mismatch Between Organizational Goals and Worker Experiences in AI Adoption


45. Computing Thiele Rules on Interval Elections and their Generalizations


46. Stable Agentic Control: Tool-Mediated LLM Architecture for Autonomous Cyber Defense


47. CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing


48. Safety and accuracy follow different scaling laws in clinical large language models


49. Physics-Grounded Multi-Agent Architecture for Traceable, Risk-Aware Human-AI Decision Support in Manufacturing


50. Flow Sampling: Learning to Sample from Unnormalized Densities via Denoising Conditional Processes


51. Feature-Augmented Transformers for Robust AI-Text Detection Across Domains and Generators


52. Label-Efficient School Detection from Aerial Imagery via Weakly Supervised Pretraining and Fine-Tuning


53. Inconsistent Databases and Argumentation Frameworks with Collective Attacks


54. MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents


55. TabSurv: Adapting Modern Tabular Neural Networks to Survival Analysis


56. A Benchmark for Interactive World Models with a Unified Action Generation Framework


57. The Counterexample Game: Iterated Conceptual Analysis and Repair in Language Models


58. Towards Open World Sound Event Detection



60. PHALAR: Phasors for Learned Musical Audio Representations


61. Atomic Fact-Checking Increases Clinician Trust in Large Language Model Recommendations for Oncology Decision Support: A Randomized Controlled Trial


62. Steer Like the LLM: Activation Steering that Mimics Prompting


63. Deco: Extending Personal Physical Objects into Pervasive AI Companion through a Dual-Embodiment Framework


64. DMGD: Train-Free Dataset Distillation with Semantic-Distribution Matching in Diffusion Models


65. Spatiotemporal Convolutions on EEG signal – A Representation Learning Perspective on Efficient and Explainable EEG Classification with Convolutional Neural Nets


66. MCJudgeBench: A Benchmark for Constraint-Level Judge Evaluation in Multi-Constraint Instruction Following


67. TRACE: A Metrologically-Grounded Engineering Framework for Trustworthy Agentic AI Systems in Operationally Critical Domains


68. RoboAlign-R1: Distilled Multimodal Reward Alignment for Robot Video World Models


69. AI Advocate: Educational Path to Transform Squads to the Future


70. Before Forgetting, Learn to Remember: Revisiting Foundational Learning Failures in LVLM Unlearning Benchmarks


71. A Workflow-Oriented Framework for Asynchronous Human-AI Collaboration in Hybrid and Compute-Intensive HPC Environments


72. Rethinking the Rank Threshold for LoRA Fine-Tuning


73. Segmenting Human-LLM Co-authored Text via Change Point Detection


74. Amortized Variational Inference for Joint Posterior and Predictive Distributions in Bayesian Uncertainty Quantification


75. SAM-NER: Semantic Archetype Mediation for Zero-Shot Named Entity Recognition


76. SERE: Structural Example Retrieval for Enhancing LLMs in Event Causality Identification


77. Tailored Prompts, Targeted Protection: Vulnerability-Specific LLM Analysis for Smart Contracts


78. Graph Neural Network based Hierarchy-Aware Embeddings of Knowledge Graphs: Applications to Yeast Phenotype Prediction


79. FUS3DMaps: Scalable and Accurate Open-Vocabulary Semantic Mapping by 3D Fusion of Voxel- and Instance-Level Layers


80. ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity


81. Stage Light is Sequence$^2$: Multi-Light Control via Imitation Learning


82. AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics


83. Multi-Agent Strategic Games with LLMs


84. Unifying Dynamical Systems and Graph Theory to Mechanistically Understand Computation in Neural Networks


85. Flow Matching on Symmetric Spaces


86. PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination


87. Disentangling Shared and Task-Specific Representations from Multi-Modal Clinical Data


88. HeadQ: Model-Visible Distortion and Score-Space Correction for KV-Cache Quantization


89. PerFlow: Physics-Embedded Rectified Flow for Efficient Reconstruction and Uncertainty Quantification of Spatiotemporal Dynamics



91. ProgramBench: Can Language Models Rebuild Programs From Scratch?


92. DALPHIN: Benchmarking Digital Pathology AI Copilots Against Pathologists on an Open Multicentric Dataset


93. A Skill-Based AI Agentic Pipeline for Library of Congress Subject Indexing


94. Parametrizing Convex Sets Using Sublinear Neural Networks


95. Revisiting Graph-Tokenizing Large Language Models: A Systematic Evaluation of Graph Token Understanding


96. Brainrot: Deskilling and Addiction are Overlooked AI Risks


97. Meta-Inverse Physics-Informed Neural Networks for High-Dimensional Ordinary Differential Equations


98. BFORE: Butterfly-Firefly Optimized Retinex Enhancement for Low-Light Image Quality Improvement


99. MHPR: Multidimensional Human Perception and Reasoning Benchmark for Large Vision-Language Models


100. MEMSAD: Gradient-Coupled Anomaly Detection for Memory Poisoning in Retrieval-Augmented Agents


101. CuraView: A Multi-Agent Framework for Medical Hallucination Detection with GraphRAG-Enhanced Knowledge Verification


102. Detecting Stealth Sycophancy in Mental-Health Dialogue with Dynamic Emotional Signature Graphs


103. FINER-SQL: Boosting Small Language Models for Text-to-SQL


104. Learning Generalizable Action Representations via Pre-training AEMG


105. Exposing LLM Safety Gaps Through Mathematical Encoding: New Attacks and Systematic Analysis


106. DynaTab: Dynamic Feature Ordering as Neural Rewiring for High-Dimensional Tabular Data


107. Deepfake Audio Detection Using Self-supervised Fusion Representations


108. Learning to Theorize the World from Observation


109. Smart Passive Acoustic Monitoring: Embedding a Classifier on AudioMoth Microcontroller


110. Discovering Reinforcement Learning Interfaces with Large Language Models


111. APEX: Large-scale Multi-task Aesthetic-Informed Popularity Prediction for AI-Generated Music


112. A Fast Model Counting Algorithm for Two-Variable Logic with Counting and Modulo Counting Quantifiers


113. Enhancing Self-Supervised Talking Head Forgery Detection via a Training-Free Dual-System Framework


114. Local Truncation Error-Guided Neural ODEs for Large Scale Traffic Forecasting


115. SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents


116. Can Multimodal Large Language Models Understand Pathologic Movements? A Pilot Study on Seizure Semiology


117. VLMaxxing through FrameMogging: Training-Free Anti-Recomputation for Video Vision-Language Models


118. Toward Structural Multimodal Representations: Specialization, Selection, and Sparsification via Mixture-of-Experts


119. RAG over Thinking Traces Can Improve Reasoning Tasks


120. FreeTimeGS++: Secrets of Dynamic Gaussian Splatting and Their Principles


121. LLM-ADAM: A Generalizable LLM Agent Framework for Pre-Print Anomaly Detection in Additive Manufacturing


122. DGPO: Distribution Guided Policy Optimization for Fine Grained Credit Assignment


123. AHPA: Adaptive Hierarchical Prior Alignment for Diffusion Transformers


124. Cryptographic Registry Provenance: Structural Defense Against Dependency Confusion in AI Package Ecosystems


125. SHIELD: A Diverse Clinical Note Dataset and Distilled Small Language Models for Enterprise-Scale De-identification


126. On the Spectral Structure and Objective Equivalence of Orthogonal Multilabel Fisher Discriminants


127. Copula-Based Endogeneity Correction for Doubly Robust Estimation of Treatment Effect


128. RLDX-1 Technical Report


129. Partially Observed Structural Causal Models


130. Can AI Help You Get Over Your Breakup? One Session with a Belief-Reframing Chatbot Shows Sustained Distress Reduction


131. Ortho-Hydra: Orthogonalized Experts for DiT LoRA


132. Posterior-First Neural PDE Simulation: Inferring Hidden Problem State from a Single Field


133. S^2tory: Story Spine Distillation for Movie Script Summarization


134. OptiLookUp: An Optical ROM-Based Look up Table Engine for Photonic Accelerators


135. MAGE: Safeguarding LLM Agents against Long-Horizon Threats via Shadow Memory


136. Self-Mined Hardness for Safety Fine-Tuning


137. MenuNet: A Strategy-Proof Mechanism for Matching Markets


138. When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI


139. Human-Provenance Verification should be Treated as Labor Infrastructure in AI-Saturated Markets


140. From Knowledge to Action: Outcomes of the 2025 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry


141. Global and Local Topology-Aware Attention with Persistent Homology and Euler Biases for Time-Series Forecasting


142. Pact: A Choreographic Language for Agentic Ecosystems


143. PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization


144. ARISE: A Repository-level Graph Representation and Toolset for Agentic Fault Localization and Program Repair


145. Cascade Token Selection for Transformer Attention Acceleration


146. Gated Subspace Inference for Transformer Acceleration


147. MedStruct-S: A Benchmark for Key Discovery, Key-Conditioned QA and Semi-Structured Extraction from OCR Clinical Reports


148. From Barrier to Bridge: The Case for AI Data Center/Power Grid Co-Design


149. Refining Compositional Diffusion for Reliable Long-Horizon Planning


150. Neuron-Anchored Rule Extraction for Large Language Models via Contrastive Hierarchical Ablation


151. ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration


152. Mixed-Precision Information Bottlenecks for On-Device Trait-State Disentanglement in Bipolar Agitation Detection


153. Structured Diffusion Bridges: Inductive Bias for Denoising Diffusion Bridges


154. Multilingual Safety Alignment via Self-Distillation


155. Decompose to Understand, Fuse to Detect: Frequency-Decoupled Anomaly Detection for Encrypted Network Traffic


156. Finite-Size Gradient Transport in Large Language Model Pretraining: From Cascade Size to Intensive Transport Efficiency


157. AutoRAGTuner: A Declarative Framework for Automatic Optimization of RAG Pipelines


158. Reward Hacking Benchmark: Measuring Exploits in LLM Agents with Tool Use


159. Analytic Bridge Diffusions for Controlled Path Generation


160. Tracing the Dynamics of Refusal: Exploiting Latent Refusal Trajectories for Robust Jailbreak Detection


161. Kernel Affine Hull Machines for Compute-Efficient Query-Side Semantic Encoding


162. AsymK-Talker: Real-Time and Long-Horizon Talking Head Generation via Asymmetric Kernel Distillation


163. Predicting Euler Characteristics and Constructing Topological Structure Using Machine Learning Techniques


164. RouteHijack: Routing-Aware Attack on Mixture-of-Experts LLMs


165. Exploring Pass-Rate Reward in Reinforcement Learning for Code Generation


166. Healthcare AI GYM for Medical Agents


167. PrismAgent: Illuminating Harm in Memes via a Zero-Shot Interpretable Multi-Agent Framework


168. From Static Analysis to Audience Dissemination: A Training-Free Multimodal Controversy Detection Multi-Agent Framework


169. PAMNet: Cycle-aware Phase-Amplitude Modulation Network for Multivariate Time Series Forecasting


170. Proteo-R1: Reasoning Foundation Models for De Novo Protein Design


171. A Universal Space of Brain Dynamics for Unveiling Cognitive Transitions and Individual Differences


172. DeRelayL: Sustainable Decentralized Relay Learning


173. Keyword spotting using convolutional neural network for speech recognition in Hindi


174. Generalization Bounds of Spiking Neural Networks via Rademacher Complexity


175. EvoJail: Evolutionary Diverse Jailbreak Prompt Generation for Large Language Models


176. Mitigating the reconstruction-detection trade-off in VAE-based unsupervised anomaly detection


177. PRISM-CTG: A Foundation Model for Cardiotocography Analysis with Multi-View SSL


178. When Safety Geometry Collapses: Fine-Tuning Vulnerabilities in Agentic Guard Models


179. Reasoning-Guided Grounding: Elevating Video Anomaly Detection through Multimodal Large Language Models


180. Delay, Plateau, or Collapse: Evaluating the Impact of Systematic Verification Error on RLVR


181. Memorization In Stable Diffusion Is Unexpectedly Driven by CLIP Embeddings


182. On the Invariants of Softmax Attention


183. A User-Centric Analysis of Explainability in AI-Based Medical Image Diagnosis


184. From Passive Feeds to Guided Discovery: AI-Initiated Interaction for Vague Intent in Content Exploration


185. Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses


186. Same Voice, Different Lab: On the Homogenization of Frontier LLM Personalities