All AI Papers - 2026-03-13

1. Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training


2. Portfolio of Solving Strategies in CEGAR-based Object Packing and Scheduling for Sequential 3D Printing


3. Compiling Temporal Numeric Planning into Discrete PDDL+: Extended Version


4. TopoBench: Benchmarking LLMs on Hard Topological Reasoning


5. Increasing intelligence in AI agents can worsen collective outcomes


6. On Information Self-Locking in Reinforcement Learning for Active Reasoning of LLM agents


7. A Robust and Efficient Multi-Agent Reinforcement Learning Framework for Traffic Signal Control


8. XSkill: Continual Learning from Experience and Skills in Multimodal Agents


9. Can RL Improve Generalization of LLM Agents? An Empirical Study


10. Few-for-Many Personalized Federated Learning


11. LABSHIELD: A Multimodal Benchmark for Safety-Critical Reasoning and Planning in Scientific Laboratories


12. Normative Common Ground Replication (NormCoRe): Replication-by-Translation for Studying Norms in Multi-agent AI


13. Learning Transferable Sensor Models via Language-Informed Pretraining


14. Prototype-Based Knowledge Guidance for Fine-Grained Structured Radiology Reporting


15. Fair Learning for Bias Mitigation and Quality Optimization in Paper Recommendation


16. AdaFuse: Accelerating Dynamic Adapter Inference via Token-Level Pre-Gating and Fused Kernel Optimization



18. CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges


19. Automated Detection of Malignant Lesions in the Ovary Using Deep Learning Models and XAI


20. VisiFold: Long-Term Traffic Forecasting via Temporal Folding Graph and Node Visibility


21. Automating Skill Acquisition through Large-Scale Mining of Open-Source Agentic Repositories: A Framework for Multi-Agent Procedural Knowledge Extraction


22. A Semi-Decentralized Approach to Multiagent Control


23. DocSage: An Information Structuring Agent for Multi-Doc Multi-Entity Question Answering


24. From Debate to Deliberation: Structured Collective Reasoning with Typed Epistemic Acts


25. An Automatic Text Classification Method Based on Hierarchical Taxonomies, Neural Networks and Document Embedding: The NETHIC Tool


26. Governing Evolving Memory in LLM Agents: Risks, Mechanisms, and the Stability and Safety Governed Memory (SSGM) Framework


27. Understanding Wikidata Qualifiers: An Analysis and Taxonomy


28. Anomaly detection in time-series via inductive biases in the latent space of conditional normalizing flows


29. CINDI: Conditional Imputation and Noisy Data Integrity with Flows in Power Grid Data


30. Gender Bias in Generative AI-assisted Recruitment Processes


31. When OpenClaw Meets Hospital: Toward an Agentic Operating System for Dynamic Clinical Workflows


32. Scaling Laws for Educational AI Agents


33. STAIRS-Former: Spatio-Temporal Attention with Interleaved Recursive Structure Transformer for Offline Multi-task Multi-agent Reinforcement Learning


34. Explicit Logic Channel for Validation and Enhancement of MLLMs on Zero-Shot Tasks


35. LLMs can construct powerful representations and streamline sample-efficient supervised learning


36. VisDoT: Enhancing Visual Reasoning through Human-Like Interpretation Grounding and Decomposition of Thought


37. The Density of Cross-Persistence Diagrams and Its Applications


38. See, Symbolize, Act: Grounding VLMs with Spatial Representations for Better Gameplay


39. Leveraging Large Language Models and Survival Analysis for Early Prediction of Chemotherapy Outcomes


40. AI Knows What’s Wrong But Cannot Fix It: Helicoid Dynamics in Frontier LLMs Under High-Stakes Decisions


41. Expert Threshold Routing for Autoregressive Language Modeling with Dynamic Computation Allocation and Load Balancing


42. Multi-Agent Collaboration for Automated Design Exploration on High Performance Computing Systems


43. Examining Users’ Behavioural Intention to Use OpenClaw Through the Cognition–Affect–Conation Framework


44. Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution


45. GPT4o-Receipt: A Dataset and Human Study for AI-Generated Document Forensics


46. Adversarial Reinforcement Learning for Detecting False Data Injection Attacks in Vehicular Routing


47. Speak or Stay Silent: Context-Aware Turn-Taking in Multi-Party Dialogue


48. Entropy Guided Diversification and Preference Elicitation in Agentic Recommendation Systems


49. Deactivating Refusal Triggers: Understanding and Mitigating Overrefusal in Safety Alignment


50. Detecting Intrinsic and Instrumental Self-Preservation in Autonomous Agents: The Unified Continuation-Interest Protocol


51. The Artificial Self: Characterising the landscape of AI identity


52. TimeSqueeze: Dynamic Patching for Efficient Time Series Forecasting


53. Improving LLM Performance Through Black-Box Online Tuning: A Case for Adding System Specs to Factsheets for Trusted AI


54. FinRule-Bench: A Benchmark for Joint Reasoning over Financial Tables and Principles


55. RewardHackingAgents: Benchmarking Evaluation Integrity for LLM ML-Engineering Agents


56. LLM-Augmented Digital Twin for Policy Evaluation in Short-Video Platforms


57. Counterweights and Complementarities: The Convergence of AI and Blockchain Powering a Decentralized Future


58. AI Psychometrics: Evaluating the Psychological Reasoning of Large Language Models with Psychometric Validities


59. COMPASS: The explainable agentic framework for Sovereignty, Sustainability, Compliance, and Ethics


60. The Unlearning Mirage: A Dynamic Framework for Evaluating LLM Unlearning


61. Mind the Sim2Real Gap in User Simulation for Agentic Tasks


62. Reversible Lifelong Model Editing via Semantic Routing-Based LoRA


63. Measuring AI Agents’ Progress on Multi-Step Cyber Attack Scenarios


64. PACED: Distillation at the Frontier of Student Competence


65. A Survey of Reasoning in Autonomous Driving Systems: Open Challenges and Emerging Paradigms


66. DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use


67. The Latent Color Subspace: Emergent Order in High-Dimensional Chaos


68. SciMDR: Benchmarking and Advancing Scientific Multimodal Document Reasoning


69. Separable neural architectures as a primitive for unified predictive and generative intelligence


70. Incremental Neural Network Verification via Learned Conflicts


71. Security Considerations for Artificial Intelligence Agents


72. Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights


73. Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration


74. RDNet: Region Proportion-Aware Dynamic Adaptive Salient Object Detection Network in Optical Remote Sensing Images


75. WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows


76. Proof-Carrying Materials: Falsifiable Safety Certificates for Machine-Learned Interatomic Potentials


77. Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections


78. BehaviorVLM: Unified Finetuning-Free Behavioral Understanding with Vision-Language Reasoning


79. A Quantitative Characterization of Forgetting in Post-Training


80. GlyphBanana: Advancing Precise Text Rendering Through Agentic Workflows


81. IsoCompute Playbook: Optimally Scaling Sampling Compute for LLM RL


82. FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance


83. Automatic Generation of High-Performance RL Environments


84. CRAFT: A Tendon-Driven Hand with Hybrid Hard-Soft Compliance


85. SommBench: Assessing Sommelier Expertise of Language Models


86. Taming the Adversary: Stable Minimax Deep Deterministic Policy Gradient via Fractional Objectives


87. Human-Centred LLM Privacy Audits: Findings and Frictions


88. Resource-Efficient Iterative LLM-Based NAS with Feedback Memory


89. A Multi-Label Temporal Convolutional Framework for Transcription Factor Binding Characterization


90. LoV3D: Grounding Cognitive Prognosis Reasoning in Longitudinal 3D Brain MRI via Regional Volume Assessments


91. Beyond Convolution: A Taxonomy of Structured Operators for Learning-Based Image Processing


92. Chemical Reaction Networks Learn Better than Spiking Neural Networks


93. Coarse-Guided Visual Generation via Weighted h-Transform Sampling


94. Slow-Fast Inference: Training-Free Inference Acceleration via Within-Sentence Support Stability


95. Cascade: Composing Software-Hardware Attack Gadgets for Adversarial Threat Amplification in Compound AI Systems


96. Just Use XML: Revisiting Joint Translation and Label Projection


97. Sim-to-reality adaptation for Deep Reinforcement Learning applied to an underwater docking application


98. An Intent of Collaboration: On Agencies between Designers and Emerging (Intelligent) Technologies


99. Flowcean - Model Learning for Cyber-Physical Systems


100. BTZSC: A Benchmark for Zero-Shot Text Classification Across Cross-Encoders, Embedding Models, Rerankers and LLMs


101. HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios


102. Multimodal Emotion Recognition via Bi-directional Cross-Attention and Temporal Modeling


103. Delayed Backdoor Attacks: Exploring the Temporal Dimension as a New Attack Surface in Pre-Trained Models


104. Geometry-Aware Probabilistic Circuits via Voronoi Tessellations


105. Effective Resistance Rewiring: A Simple Topological Correction for Over-Squashing


106. MobileKernelBench: Can LLMs Write Efficient Kernels for Mobile Devices?


107. Understanding LLM Behavior When Encountering User-Supplied Harmful Content in Harmless Tasks


108. EnTransformer: A Deep Generative Transformer for Multivariate Probabilistic Forecasting


109. Think While Watching: Online Streaming Segment-Level Memory for Multi-Turn Video Reasoning in Multimodal Large Language Models


110. Bielik-Minitron-7B: Compressing Large Language Models via Structured Pruning and Knowledge Distillation for the Polish Language


111. The Mirror Design Pattern: Strict Data Geometry over Model Scale for Prompt Injection Detection


112. ELISA: An Interpretable Hybrid Generative AI Agent for Expression-Grounded Discovery in Single-Cell Genomics


113. You Told Me to Do It: Measuring Instructional Text-induced Private Data Leakage in LLM Agents


114. The Landscape of Generative AI in Information Systems: A Synthesis of Secondary Reviews and Research Agendas


115. Hybrid Human-Agent Social Dilemmas in Energy Markets


116. RADAR: Closed-Loop Robotic Data Generation via Semantic Planning and Autonomous Causal Environment Reset


117. Locating Demographic Bias at the Attention-Head Level in CLIP’s Vision Encoder


118. HELM: Hierarchical and Explicit Label Modeling with Graph Learning for Multi-Label Image Classification


119. Exploiting Expertise of Non-Expert and Diverse Agents in Social Bandit Learning: A Free Energy Approach


120. Compression Favors Consistency, Not Truth: When and Why Language Models Prefer Correct Information


121. Adapting Dijkstra for Buffers and Unlimited Transfers


122. Affect Decoding in Phonated and Silent Speech Production from Surface EMG


123. OSCBench: Benchmarking Object State Change in Text-to-Video Generation


124. SemBench: A Universal Semantic Framework for LLM Evaluation


125. Causal Prosody Mediation for Text-to-Speech: Counterfactual Training of Duration, Pitch, and Energy in FastSpeech2


126. Entropy-Preserving Reinforcement Learning


127. From Control to Foresight: Simulation as a New Paradigm for Human-Agent Collaboration


128. Stable Spike: Dual Consistency Optimization via Bitwise AND Operations for Spiking Neural Networks



130. Tokenization Allows Multimodal Large Language Models to Understand, Generate and Edit Architectural Floor Plans


131. MedPruner: Training-Free Hierarchical Token Pruning for Efficient 3D Medical Image Understanding in Vision-Language Models


132. Taming OpenClaw: Security Analysis and Mitigation of Autonomous LLM Agent Threats


133. Survival Meets Classification: A Novel Framework for Early Risk Prediction Models of Chronic Diseases


134. Performance Evaluation of Open-Source Large Language Models for Assisting Pathology Report Writing in Japanese


135. Toward Complex-Valued Neural Networks for Waveform Generation


136. UtilityMax Prompting: A Formal Framework for Multi-Objective Large Language Model Optimization


137. How Intelligence Emerges: A Minimal Theory of Dynamic Adaptive Coordination


138. RoboClaw: An Agentic Framework for Scalable Long-Horizon Robotic Tasks


139. MANSION: Multi-floor lANguage-to-3D Scene generatIOn for loNg-horizon tasks


140. One Supervisor, Many Modalities: Adaptive Tool Orchestration for Autonomous Queries


141. ReHARK: Refined Hybrid Adaptive RBF Kernels for Robust One-Shot Vision-Language Adaptation


142. EReCu: Pseudo-label Evolution Fusion and Refinement with Multi-Cue Learning for Unsupervised Camouflage Detection


143. FBCIR: Balancing Cross-Modal Focuses in Composed Image Retrieval


144. Gen-Fab: A Variation-Aware Generative Model for Predicting Fabrication Variations in Nanophotonic Devices


145. KEPo: Knowledge Evolution Poison on Graph-based Retrieval-Augmented Generation


146. OrthoEraser: Coupled-Neuron Orthogonal Projection for Concept Erasure


147. SPEGC: Continual Test-Time Adaptation via Semantic-Prompt-Enhanced Graph Clustering for Medical Image Segmentation


148. INFACT: A Diagnostic Benchmark for Induced Faithfulness and Factuality Hallucinations in Video-LLMs


149. Grammar of the Wave: Towards Explainable Multivariate Time Series Event Detection via Neuro-Symbolic VLM Agents


150. Stage-Adaptive Reliability Modeling for Continuous Valence-Arousal Estimation


151. Bridging Discrete Marks and Continuous Dynamics: Dual-Path Cross-Interaction for Marked Temporal Point Processes


152. A Stable Neural Statistical Dependence Estimator for Autoencoder Feature Analysis


153. Evaluation format, not model capability, drives triage failure in the assessment of consumer health AI


154. Deployment-Time Reliability of Learned Robot Policies


155. Efficient Cross-View Localization in 6G Space-Air-Ground Integrated Network


156. ARROW: Augmented Replay for RObust World models


157. Stop Listening to Me! How Multi-turn Conversations Can Degrade Diagnostic Reasoning


158. Agentic AI for Embodied-enhanced Beam Prediction in Low-Altitude Economy Networks


159. Ghost Framing Theory: Exploring the role of generative AI in new venture rhetorical legitimation


160. Vision-Based Hand Shadowing for Robotic Manipulation via Inverse Kinematics


161. How do AI agents talk about science and research? An exploration of scientific discussions on Moltbook using BERTopic


162. Resolving Java Code Repository Issues with iSWE Agent


163. Novelty Adaptation Through Hybrid Large Language Model (LLM)-Symbolic Planning and LLM-guided Reinforcement Learning


164. Evaluating Explainable AI Attribution Methods in Neural Machine Translation via Attention-Guided Knowledge Distillation


165. Jailbreak Scaling Laws for Large Language Models: Polynomial-Exponential Crossover


166. Hindsight-Anchored Policy Optimization: Turning Failure into Feedback in Sparse Reward Settings


167. Worst-case low-rank approximations


168. “I followed what felt right, not what I was told”: Autonomy, Coaching, and Recognizing Bias Through AI-Mediated Dialogue


169. Artificial Intelligence for Sentiment Analysis of Persian Poetry


170. Markovian Generation Chains in Large Language Models


171. MDER-DR: Multi-Hop Question Answering with Entity-Centric Summaries


172. A Simple Efficiency Incremental Learning Framework via Vision-Language Model with Nonlinear Multi-Adapters


173. Representation Finetuning for Continual Learning


174. Attention Gathers, MLPs Compose: A Causal Analysis of an Action-Outcome Circuit in VideoViT


175. Procedural Fairness via Group Counterfactual Explanation


176. WebWeaver: Breaking Topology Confidentiality in LLM Multi-Agent Systems with Stealthy Context-Based Inference


177. Task-Conditioned Routing Signatures in Sparse Mixture-of-Experts Transformers


178. ResWM: Residual-Action World Model for Visual RL


179. Thousand-GPU Large-Scale Training and Optimization Recipe for AI-Native Cloud Embodied Intelligence Infrastructure


180. Graph Tokenization for Bridging Graphs and Transformers


181. The Attack and Defense Landscape of Agentic AI: A Comprehensive Survey


182. Quality-Driven Agentic Reasoning for LLM-Assisted Software Design: Questions-of-Thoughts (QoT) as a Time-Series Self-QA Chain


183. CR-Bench: Evaluating the Real-World Utility of AI Code Review Agents


184. Unifying Logical and Physical Layout Representations via Heterogeneous Graphs for Circuit Congestion Prediction


185. OA-NBV: Occlusion-Aware Next-Best-View Planning for Human-Centered Active Perception on Mobile Robots


186. From Phase Prediction to Phase Design: A ReAct Agent Framework for High-Entropy Alloy Discovery


187. Summarize Before You Speak with ARACH: A Training-Free Inference-Time Plug-In for Enhancing LLMs via Global Attention Reallocation


188. Exploring Collatz Dynamics with Human-LLM Collaboration


189. Hybrid Quantum-Classical Encoding for Accurate Residue-Level pKa Prediction


190. A Survey on Quantitative Modeling of Trust in Online Social Networks


191. Structure-Aware Epistemic Uncertainty Quantification for Neural Operator PDE Surrogates


192. OpenSanctions Pairs: Large-Scale Entity Matching with LLMs


193. Implicit Statistical Inference in Transformers: Approximating Likelihood-Ratio Tests In-Context