All AI Papers - 2026-03-26

1. The Stochastic Gap: A Markovian Framework for Pre-Deployment Reliability and Oversight-Cost Auditing in Agentic Artificial Intelligence


2. Completeness of Unbounded Best-First Minimax and Descent Minimax


3. From Liar Paradox to Incongruent Sets: A Normal Form for Self-Reference


4. Multi-Agent Reasoning with Consistency Verification Improves Uncertainty Calibration in Medical MCQA


5. AI-Supervisor: Autonomous AI Research Supervision via a Persistent Research World Model



7. Enhanced Mycelium of Thought (EMoT): A Bio-Inspired Hierarchical Reasoning Architecture with Strategic Dormancy and Mnemonic Encoding


8. ELITE: Experiential Learning and Intent-Aware Transfer for Self-improving Embodied Agents


9. Language-Grounded Multi-Agent Planning for Personalized and Fair Participatory Urban Sensing



11. AnalogAgent: Self-Improving Analog Circuit Design Automation with LLM Agents


12. DUPLEX: Agentic Dual-System Planning via LLM-Driven Information Extraction




15. SCoOP: Semantic Consistent Opinion Pooling for Uncertainty Quantification in Multiple Vision-Language Model Systems


16. VehicleMemBench: An Executable Benchmark for Multi-User Long-Term Memory in In-Vehicle Agents


17. Learning-guided Prioritized Planning for Lifelong Multi-Agent Path Finding in Warehouse Automation


18. Efficient Benchmarking of AI Agents


19. LLMs Do Not Grade Essays Like Humans


20. Grounding Vision and Language to 3D Masks for Long-Horizon Box Rearrangement


21. GTO Wizard Benchmark


22. Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments


23. Evaluating a Multi-Agent Voice-Enabled Smart Speaker for Care Homes: A Safety-Focused Framework


24. Environment Maps: Structured Environmental Representations for Long-Horizon Agents


25. PLDR-LLMs Reason At Self-Organized Criticality


26. Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA


27. EndoVGGT: GNN-Enhanced Depth Estimation for Surgical 3D Reconstruction


28. Chameleon: Episodic Memory for Long-Horizon Robotic Manipulation


29. VFIG: Vectorizing Complex Figures in SVG with Vision-Language Models


30. Anti-I2V: Safeguarding your photos from malicious image-to-video generation


31. The Free-Market Algorithm: Self-Organizing Optimization for Open-Ended Complex Systems


32. LensWalk: Agentic Video Understanding by Planning How You See in Videos


33. Evaluating Chunking Strategies For Retrieval-Augmented Generation in Oil and Gas Enterprise Documents


34. A Sociolinguistic Analysis of Automatic Speech Recognition Bias in Newcastle English


35. SEGAR: Selective Enhancement for Generative Augmented Reality


36. CliPPER: Contextual Video-Language Pretraining on Long-form Intraoperative Surgical Procedures for Event Recognition


37. UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience


38. No Single Metric Tells the Whole Story: A Multi-Dimensional Evaluation Framework for Uncertainty Attributions


39. Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs


40. Counting Without Numbers & Finding Without Words


41. Integrating Causal Machine Learning into Clinical Decision Support Systems: Insights from Literature and Practice


42. CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents


43. Enes Causal Discovery


44. OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework


45. ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers


46. Real Talk, Virtual Faces: A Formal Concept Analysis of Personality and Sentiment in Influencer Audiences


47. Exploring How Fair Model Representations Relate to Fair Recommendations


48. When AI Meets Early Childhood Education: Large Language Models as Assessment Teammates in Chinese Preschools


49. MolEvolve: LLM-Guided Evolutionary Search for Interpretable Molecular Optimization


50. Language-Guided Structure-Aware Network for Camouflaged Object Detection


51. Evidence of an Emergent “Self” in Continual Robot Learning


52. Enhancing Efficiency and Performance in Deepfake Audio Detection through Neuron-level dropin & Neuroplasticity Mechanisms


53. GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents


54. Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual Processing


55. Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning


56. Toward Generalist Neural Motion Planners for Robotic Manipulators: Challenges and Opportunities


57. Cost-Sensitive Neighborhood Aggregation for Heterophilous Graphs: When Does Per-Edge Routing Help?


58. The Specification Gap: Coordination Failure Under Partial Knowledge in Code Agents


59. Bridging Biological Hearing and Neuromorphic Computing: End-to-End Time-Domain Audio Signal Processing with Reservoir Computing


60. Accelerating Diffusion-based Video Editing via Heterogeneous Caching: Beyond Full Computing at Sampled Denoising Timestep


61. Embracing Heteroscedasticity for Probabilistic Time Series Forecasting


62. DVM: Real-Time Kernel Generation for Dynamic AI Models


63. Environment-Grounded Multi-Agent Workflow for Autonomous Penetration Testing


64. Who Benefits from RAG? The Role of Exposure, Utility and Attribution Bias


65. Where Do Your Citations Come From? Citation-Constellation: A Free, Open-Source, No-Code, and Auditable Tool for Citation Network Decomposition with Complementary BARON and HEROCON Scores



67. Powerful Teachers Matter: Text-Guided Multi-view Knowledge Distillation with Visual Prior Enhancement



69. A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula


70. MedAidDialog: A Multilingual Multi-Turn Medical Dialogue Dataset for Accessible Healthcare


71. The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation


72. Comparative analysis of dual-form networks for live land monitoring using multi-modal satellite image time series


73. KCLNet: Electrically Equivalence-Oriented Graph Representation Learning for Analog Circuits


74. Towards Effective Experiential Learning: Dual Guidance for Utilization and Internalization


75. Knowledge-Guided Manipulation Using Multi-Task Reinforcement Learning


76. When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm


77. Mitigating Object Hallucinations in LVLMs via Attention Imbalance Rectification


78. From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs


79. Schema on the Inside: A Two-Phase Fine-Tuning Method for High-Efficiency Text-to-SQL at Scale


80. Understanding the Challenges in Iterative Generative Optimization with LLMs


81. From Untamed Black Box to Interpretable Pedagogical Orchestration: The Ensemble of Specialized LLMs Architecture for Adaptive Tutoring


82. SafeFlow: Real-Time Text-Driven Humanoid Whole-Body Control via Physics-Guided Rectified Flow and Selective Safety Gating


83. Kirchhoff-Inspired Neural Networks for Evolving High-Order Perception


84. The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More


85. Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage


86. Variable-Length Audio Fingerprinting


87. High-Fidelity Face Content Recovery via Tamper-Resilient Versatile Watermarking


88. Revealing Multi-View Hallucination in Large Vision-Language Models


89. DecepGPT: Schema-Driven Deception Detection with Multicultural Datasets and Robust Multimodal Learning


90. Self-Distillation for Multi-Token Prediction


91. Latent Bias Alignment for High-Fidelity Diffusion Inversion in Real-World Image Reconstruction and Manipulation


92. Knowledge-Refined Dual Context-Aware Network for Partially Relevant Video Retrieval


93. SM-Net: Learning a Continuous Spectral Manifold from Multiple Stellar Libraries


94. AgentChemist: A Multi-Agent Experimental Robotic Platform Integrating Chemical Perception and Precise Control


95. The Luna Bound Propagator for Formal Analysis of Neural Networks


96. HDPO: Hybrid Distillation Policy Optimization via Privileged Self-Distillation


97. Can VLMs Reason Robustly? A Neuro-Symbolic Investigation


98. Generative AI User Experience: Developing Human–AI Epistemic Partnership


99. Deep Convolutional Neural Networks for predicting highest priority functional group in organic molecules


100. Why the Maximum Second Derivative of Activations Matters for Adversarial Robustness


101. PoliticsBench: Benchmarking Political Values in Large Language Models with Multi-Turn Roleplay


102. Circuit Complexity of Hierarchical Knowledge Tracing and Implications for Log-Precision Transformers


103. Perturbation: A simple and efficient adversarial tracer for representation learning in language models


104. Willful Disobedience: Automatically Detecting Failures in Agentic Traces


105. Deep Neural Regression Collapse


106. Object Search in Partially-Known Environments via LLM-informed Model-based Planning and Prompt Selection


107. The Cognitive Firewall: Securing Browser Based AI Agents Against Indirect Prompt Injection Via Hybrid Edge Cloud Defense


108. Probabilistic Geometric Alignment via Bayesian Latent Transport for Domain-Adaptive Foundation Models


109. Human-in-the-Loop Pareto Optimization: Trade-off Characterization for Assist-as-Needed Training and Performance Evaluation


110. AI-driven Intent-Based Networking Approach for Self-configuration of Next Generation Networks


111. Self Paced Gaussian Contextual Reinforcement Learning


112. CDMT-EHR: A Continuous-Time Diffusion Framework for Generating Mixed-Type Time-Series Electronic Health Records


113. An In-Depth Study of Filter-Agnostic Vector Search on a PostgreSQL Database System: [Experiments and Analysis]


114. The Diminishing Returns of Early-Exit Decoding in Modern LLMs


115. Assessment Design in the AI Era: A Method for Identifying Items Functioning Differentially for Humans and Chatbots


116. Learning What Can Be Picked: Active Reachability Estimation for Efficient Robotic Fruit Harvesting


117. PLACID: Privacy-preserving Large language models for Acronym Clinical Inference and Disambiguation


118. Prototype Fusion: A Training-Free Multi-Layer Approach to OOD Detection


119. Estimating Individual Tree Height and Species from UAV Imagery


120. Echoes: A semantically-aligned music deepfake detection dataset


121. Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges


122. λSplit: Self-Supervised Content-Aware Spectral Unmixing for Fluorescence Microscopy



124. Ukrainian Visual Word Sense Disambiguation Benchmark


125. A Theory of LLM Information Susceptibility


126. LLMLOOP: Improving LLM-Generated Code and Tests through Automated Iterative Feedback Loops


127. LLMORPH: Automated Metamorphic Testing of Large Language Models


128. LineMVGNN: Anti-Money Laundering with Line-Graph-Assisted Multi-View Graph Neural Networks


129. AI Generalisation Gap In Comorbid Sleep Disorder Staging


130. Wafer-Level Etch Spatial Profiling for Process Monitoring from Time-Series with Time-LLM


131. APreQEL: Adaptive Mixed Precision Quantization For Edge LLMs


132. PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning


133. Dual-Criterion Curriculum Learning: Application to Temporal Data


134. StateLinFormer: Stateful Training Enhancing Long-term Memory in Navigation


135. AscendOptimizer: Episodic Agent for Ascend NPU Operator Optimization


136. Safe Reinforcement Learning with Preference-based Constraint Inference


137. Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG


138. CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Corrective Training


139. Upper Entropy for 2-Monotone Lower Probabilities


140. Mixture of Demonstrations for Textual Graph Understanding and Question Answering


141. Large Language Models and Scientific Discourse: Where’s the Intelligence?


142. MDKeyChunker: Single-Call LLM Enrichment with Rolling Keys and Key-Based Restructuring for High-Accuracy RAG


143. Generating Hierarchical JSON Representations of Scientific Sentences Using LLMs


144. Did You Forget What I Asked? Prospective Memory Failures in Large Language Models


145. Konkani LLM: Multi-Script Instruction Tuning and Evaluation for a Low-Resource Indian Language


146. Navigating the Concept Space of Language Models


147. Qworld: Question-Specific Evaluation Criteria for LLMs


148. Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages


149. From Physician Expertise to Clinical Agents: Preserving, Standardizing, and Scaling Physicians’ Medical Expertise with Lightweight LLM


150. MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?


151. Cluster-R1: Large Reasoning Models Are Instruction-following Clustering Agents


152. Beyond Accuracy: Introducing a Symbolic-Mechanistic Approach to Interpretable Evaluation


153. MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens


154. Training a Large Language Model for Medical Coding Using Privacy-Preserving Synthetic Clinical Data


155. DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models


156. Berta: an open-source, modular tool for AI-enabled clinical documentation


157. S-Path-RAG: Semantic-Aware Shortest-Path Retrieval Augmented Generation for Multi-Hop Knowledge Graph Question Answering


158. DISCO: Document Intelligence Suite for COmparative Evaluation


159. Visuospatial Perspective Taking in Multimodal Language Models


160. Internal Safety Collapse in Frontier Large Language Models


161. Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes


162. Leveraging Computerized Adaptive Testing for Cost-effective Evaluation of Large Language Models in Medical Benchmarking


163. Evidence for Limited Metacognition in LLMs


164. Mitigating Many-Shot Jailbreaking


165. Inspection and Control of Self-Generated-Text Recognition Ability in Llama3-8b-Instruct