All AI Papers - 2026-03-05

1. A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development


2. $τ$-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge


3. Agentics 2.0: Logical Transduction Algebra for Agentic Data Workflows


4. Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions


5. BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning


6. Phi-4-reasoning-vision-15B Technical Report


7. Generative AI in Managerial Decision-Making: Redefining Boundaries through Ambiguity Resolution and Sycophancy Analysis


8. From Threat Intelligence to Firewall Rules: Semantic Relations in Hybrid AI Agent and Expert System Architectures


9. In-Context Environments Induce Evaluation-Awareness in Language Models


10. A Rubric-Supervised Critic from Sparse Real-World Outcomes


11. Specification-Driven Generation and Evaluation of Discrete-Event World Models via the DEVS Formalism


12. LifeBench: A Benchmark for Long-Horizon Multi-Source Memory


13. AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation


14. RAGNav: A Retrieval-Augmented Topological Reasoning Framework for Multi-Goal Visual-Language Navigation


15. AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Alignment


16. MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation


17. Mozi: Governed Autonomy for Drug Discovery LLM Agents


18. Build, Judge, Optimize: A Blueprint for Continuous Improvement of Multi-Agent Consumer Assistants


19. Asymmetric Goal Drift in Coding Agents Under Value Conflict


20. ZipMap: Linear-Time Stateful 3D Reconstruction with Test-Time Training


21. Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization


22. Low-Resource Guidance for Controllable Latent Audio Diffusion


23. Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal Attacks


24. Dissecting Quantization Error: A Concentration-Alignment Perspective


25. RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots


26. Efficient Refusal Ablation in LLM through Optimal Transport


27. RANGER: Sparsely-Gated Mixture-of-Experts with Adaptive Retrieval Re-ranking for Pathology Report Generation


28. SpotIt+: Verification-based Text-to-SQL Evaluation with Database Constraints


29. What Does Flow Matching Bring To TD Learning?


30. SPRINT: Semi-supervised Prototypical Representation for Few-Shot Class-Incremental Tabular Learning


31. World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings


32. MOO: A Multi-view Oriented Observations Dataset for Viewpoint Analysis in Cattle Re-Identification


33. CRESTomics: Analyzing Carotid Plaques in the CREST-2 Trial with a New Additive Classification Model


34. Activation Outliers in Transformer Quantization: Reproduction, Statistical Analysis, and Deployment Tradeoffs


35. LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance


36. CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video


37. IPD: Boosting Sequential Policy with Imaginary Planning Distillation in Offline Reinforcement Learning


38. VANGUARD: Vehicle-Anchored Ground Sample Distance Estimation for UAVs in GPS-Denied Environments


39. Causality Elicitation from Large Language Models


40. When AI Fails, What Works? A Data-Driven Taxonomy of Real-World AI Risk Mitigation Strategies


41. Online Learning for Multi-Layer Hierarchical Inference under Partial and Policy-Dependent Feedback


42. LikeThis! Empowering App Users to Submit UI Improvement Suggestions Instead of Complaints


43. FeedAIde: Guiding App Users to Submit Rich Feedback Reports by Asking Context-Aware Follow-Up Questions


44. PRAM-R: A Perception-Reasoning-Action-Memory Framework with LLM-Guided Modality Routing for Adaptive Autonomous Driving


45. ZeSTA: Zero-Shot TTS Augmentation with Domain-Conditioned Training for Data-Efficient Personalized Speech Synthesis


46. Noise-aware Client Selection for carbon-efficient Federated Learning via Gradient Norm Thresholding


47. CAM-LDS: Cyber Attack Manifestations for Automatic Interpretation of System Logs and Security Alerts


48. Architectural Proprioception in State Space Models: Thermodynamic Training Induces Anticipatory Halt Detection


49. CodeTaste: Can LLMs Generate Human-Level Code Refactorings?


50. PlaneCycle: Training-Free 2D-to-3D Lifting of Foundation Models Without Adapters


51. Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model


52. GarmentPile++: Affordance-Driven Cluttered Garments Retrieval with Vision-Language Reasoning


53. Unbiased Dynamic Pruning for Efficient Group-Based Policy Optimization


54. Crab$^{+}$: A Scalable and Unified Audio-Visual Scene Understanding Model with Explicit Cooperation


55. Data-Aware Random Feature Kernel for Transformers


56. Understanding Sources of Demographic Predictability in Brain MRI via Disentangling Anatomy and Contrast


57. Efficient Point Cloud Processing with High-Dimensional Positional Encoding and Non-Local MLPs


58. End-to-end event reconstruction for precision physics at future colliders


59. SaFeR: Safety-Critical Scenario Generation for Autonomous Driving Test via Feasibility-Constrained Token Resampling


60. Monitoring Emergent Reward Hacking During Generation via Internal Activations


61. Sim2Sea: Sim-to-Real Policy Transfer for Maritime Vessel Navigation in Congested Waters


62. Inference-Time Toxicity Mitigation in Protein Language Models


63. DQE-CIR: Distinctive Query Embeddings through Learnable Attribute Weights and Target Relative Negative Sampling in Composed Image Retrieval


64. The Empty Quadrant: AI Teammates for Embodied Field Learning


65. Self-adapting Robotic Agents through Online Continual Reinforcement Learning with World Model Feedback


66. A Multi-Dimensional Quality Scoring Framework for Decentralized LLM Inference with Proof of Quality


67. Volumetric Directional Diffusion: Anchoring Uncertainty Quantification in Anatomical Consensus for Ambiguous Medical Image Segmentation


68. Discriminative Perception via Anchored Description for Reasoning Segmentation


69. STEM Faculty Perspectives on Generative AI in Higher Education


70. Spectral Surgery: Training-Free Refinement of LoRA via Gradient-Guided Singular Value Reweighting


71. Measuring AI R&D Automation


72. When Visual Evidence is Ambiguous: Pareidolia as a Diagnostic Probe for Vision Models


73. GeoSeg: Training-Free Reasoning-Driven Segmentation in Remote Sensing Imagery


74. Right in Time: Reactive Reasoning in Regulated Traffic Spaces


75. Upholding Epistemic Agency: A Brouwerian Assertibility Constraint for Responsible AI


76. BLOCK: An Open-Source Bi-Stage MLLM Character-to-Skin Pipeline for Minecraft



78. Towards Generalized Multimodal Homography Estimation


79. GIPO: Gaussian Importance Sampling Policy Optimization


80. RVN-Bench: A Benchmark for Reactive Visual Navigation


81. Cross-Modal Mapping and Dual-Branch Reconstruction for 2D-3D Multimodal Industrial Anomaly Detection


82. Selecting Offline Reinforcement Learning Algorithms for Stochastic Network Control


83. BD-Merging: Bias-Aware Dynamic Model Merging with Evidence-Guided Contrastive Learning


84. Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personality Effects


85. PatchDecomp: Interpretable Patch-Based Time Series Forecasting


86. IROSA: Interactive Robot Skill Adaptation using Natural Language


87. A novel network for classification of cuneiform tablet metadata


88. CzechTopic: A Benchmark for Zero-Shot Topic Localization in Historical Czech Documents


89. On the Suitability of LLM-Driven Agents for Dark Pattern Audits


90. Joint Hardware-Workload Co-Optimization for In-Memory Computing Accelerators


91. Structure-Aware Distributed Backdoor Attacks in Federated Learning


92. From Narrow to Panoramic Vision: Attention-Guided Cold-Start Reshapes Multimodal Reasoning


93. SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration


94. Fairness Begins with State: Purifying Latent Preferences for Hierarchical Reinforcement Learning in Interactive Recommendation


95. Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning


96. Separators in Enhancing Autoregressive Pretraining for Vision Mamba


97. Relational In-Context Learning via Synthetic Pre-training with Structural Prior


98. Zero-Knowledge Proof (ZKP) Authentication for Offline CBDC Payment System Using IoT Devices


99. When and Where to Reset Matters for Long-Term Test-Time Adaptation


100. T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning


101. DisenReason: Behavior Disentanglement and Latent Reasoning for Shared-Account Sequential Recommendation


102. MACC: Multi-Agent Collaborative Competition for Scientific Exploration


103. Towards Effective Orchestration of AI x DB Workloads


104. Not All Candidates are Created Equal: A Heterogeneity-Aware Approach to Pre-ranking in Recommender Systems


105. DMD-augmented Unpaired Neural Schrödinger Bridge for Ultra-Low Field MRI Enhancement


106. Cognition to Control - Multi-Agent Learning for Human-Humanoid Collaborative Transport


107. Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling


108. Agentic Peer-to-Peer Networks: From Content Distribution to Capability and Action Sharing


109. Confidence-Calibrated Small-Large Language Model Collaboration for Cost-Efficient Reasoning


110. Interaction-Aware Whole-Body Control for Compliant Object Transport


111. JANUS: Structured Bidirectional Generation for Guaranteed Constraints and Analytical Uncertainty


112. HALyPO: Heterogeneous-Agent Lyapunov Policy Optimization for Human-Robot Collaboration


113. PROSPECT: Unified Streaming Vision-Language Navigation via Semantic–Spatial Fusion and Latent Predictive Representation


114. Understanding Parents’ Desires in Moderating Children’s Interactions with GenAI Chatbots through LLM-Generated Probes


115. Why Do Unlearnable Examples Work: A Novel Perspective of Mutual Information


116. Order Is Not Layout: Order-to-Space Bias in Image Generation


117. MPFlow: Multi-modal Posterior-Guided Flow Matching for Zero-Shot MRI Reconstruction


118. Large-Language-Model-Guided State Estimation for Partially Observable Task and Motion Planning


119. UrbanHuRo: A Two-Layer Human-Robot Collaboration Framework for the Joint Optimization of Heterogeneous Urban Services


120. Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data


121. Error as Signal: Stiffness-Aware Diffusion Sampling via Embedded Runge-Kutta Guidance


122. Mathematicians in the age of AI


123. EvoPrune: Early-Stage Visual Token Pruning for Efficient MLLMs


124. MIND: Unified Inquiry and Diagnosis RL with Criteria Grounded Clinical Supports for Psychiatric Consultation


125. Local Shapley: Model-Induced Locality and Optimal Reuse in Data Valuation


126. Graph Negative Feedback Bias Correction Framework for Adaptive Heterophily Modeling


127. InEdit-Bench: Benchmarking Intermediate Logical Pathways for Intelligent Image Editing Models


128. Field imaging framework for morphological characterization of aggregates with computer vision: Algorithms and applications


129. Bridging Pedagogy and Play: Introducing a Language Mapping Interface for Human-AI Co-Creation in Educational Game Design


130. Image-based Prompt Injection: Hijacking Multimodal LLMs through Visually Embedded Adversarial Instructions


131. Goal-Driven Risk Assessment for LLM-Powered Systems: A Healthcare Case Study


132. Social Norm Reasoning in Multimodal Language Models: An Evaluation


133. Belief-Sim: Towards Belief-Driven Simulation of Demographic Misinformation Susceptibility


134. Molt Dynamics: Emergent Social Phenomena in Autonomous AI Agent Populations


135. Tucano 2 Cool: Better Open Source LLMs for Portuguese


136. RAG-X: Systematic Diagnosis of Retrieval-Augmented Generation for Medical Question Answering


137. SafeCRS: Personalized Safety Alignment for LLM-Based Conversational Recommender Systems


138. Role-Aware Conditional Inference for Spatiotemporal Ecosystem Carbon Flux Prediction


139. Directional Neural Collapse Explains Few-Shot Transfer in Self-Supervised Learning


140. mlx-snn: Spiking Neural Networks on Apple Silicon via MLX


141. Multi-Agent Influence Diagrams to Hybrid Threat Modeling


142. Test-Time Meta-Adaptation with Self-Synthesis


143. MMAI Gym for Science: Training Liquid Foundation Models for Drug Discovery


144. The Controllability Trap: A Governance Framework for Military AI Agents


145. Baseline Performance of AI Tools in Classifying Cognitive Demand of Mathematical Tasks


146. Raising Bars, Not Parameters: LilMoo Compact Language Model for Hindi


147. PhyPrompt: RL-based Prompt Refinement for Physically Plausible Text-to-Video Generation


148. Phys4D: Fine-Grained Physics-Consistent 4D Modeling from Video Diffusion


149. Optimal trajectory-guided stochastic co-optimization for e-fuel system design and real-time operation


150. Beyond Pixel Histories: World Models with Persistent 3D State


151. When Shallow Wins: Silent Failures and the Depth-Accuracy Paradox in Latent Reasoning


152. Graph Hopfield Networks: Energy-Based Node Classification with Associative Memory


153. Parallel Test-Time Scaling with Multi-Sequence Verifiers


154. Farther the Shift, Sparser the Representation: Analyzing OOD Mechanisms in LLMs


155. PRIVATEEDIT: A Privacy-Preserving Pipeline for Face-Centric Generative Image Editing


156. On Google’s SynthID-Text LLM Watermarking System: Theoretical Analysis and Empirical Validation


157. Heterogeneous Time Constants Improve Stability in Equilibrium Propagation


158. Zero-Knowledge Federated Learning with Lattice-Based Hybrid Encryption for Quantum-Resilient Medical AI


159. Multi-Agent-Based Simulation of Archaeological Mobility in Uneven Landscapes


160. RADAR: Learning to Route with Asymmetry-aware DistAnce Representations


161. Learning Order Forest for Qualitative-Attribute Data Clustering


162. LiteVLA-Edge: Quantized On-Device Multimodal Control for Embedded Robotics


163. MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning


164. AOI: Turning Failed Trajectories into Training Signals for Autonomous Cloud Diagnosis


165. Sleeper Cell: Injecting Latent Malice Temporal Backdoors into Tool-Using LLMs


166. Bridging the Reproducibility Divide: Open Source Software’s Role in Standardizing Healthcare AI


167. ACES: Accent Subspaces for Coupling, Explanations, and Stress-Testing in Automatic Speech Recognition


168. Inhibitory Cross-Talk Enables Functional Lateralization in Attention-Coupled Latent Memory


169. Non-Invasive Reconstruction of Intracranial EEG Across the Deep Temporal Lobe from Scalp EEG based on Conditional Normalizing Flow


170. Perfect score on IPhO 2025 theory by Gemini agent


171. Physics-constrained symbolic regression for discovering closed-form equations of multimodal water retention curves from experimental data


172. GreenPhase: A Green Learning Approach for Earthquake Phase Picking


173. Neuro-Symbolic Decoding of Neural Activity


174. Cryo-SWAN: the Multi-Scale Wavelet-decomposition-inspired Autoencoder Network for molecular density representation of molecular volumes


175. Ethical and Explainable AI in Reusable MLOps Pipelines


176. Fragile Thoughts: How Large Language Models Handle Chain-of-Thought Perturbations


177. PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning


178. Certainty robustness: Evaluating LLM stability under self-challenging prompts


179. AutoHarness: improving LLM agents by automatically synthesizing a code harness


180. StructLens: A Structural Lens for Language Models via Maximum Spanning Trees


181. A benchmark for joint dialogue satisfaction, emotion recognition, and emotion state transition prediction


182. Controllable and explainable personality sliders for LLMs at inference time


183. IntPro: A Proxy Agent for Context-Aware Intent Understanding via Retrieval-conditioned Inference


184. Controlling Chat Style in Language Models via Single-Direction Editing


185. Discern Truth from Falsehood: Reducing Over-Refusal via Contrastive Refinement


186. Can Large Language Models Derive New Knowledge? A Dynamic Benchmark for Biological Knowledge Discovery


187. DIALEVAL: Automated Type-Theoretic Evaluation of LLM Instruction Following


188. From We to Me: Theory Informed Narrative Shift with Abductive Reasoning


189. Automated Concept Discovery for LLM-as-a-Judge Preference Analysis


190. Quantum-Inspired Self-Attention in a Large Language Model


191. The Influence of Iconicity in Transfer Learning for Sign Language Recognition


192. M-QUEST – Meme Question-Understanding Evaluation on Semantics and Toxicity


193. Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO


194. How does fine-tuning improve sensorimotor representations in large language models?


195. Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding


196. Old Habits Die Hard: How Conversational History Geometrically Traps LLMs


197. TopicENA: Enabling Epistemic Network Analysis at Scale through Automated Topic-Based Coding


198. Token-Oriented Object Notation vs JSON: A Benchmark of Plain and Constrained Decoding Generation


199. Draft-Conditioned Constrained Decoding for Structured Generation in LLMs


200. Knowledge Graph and Hypergraph Transformers with Repository-Attention and Journey-Based Role Transport


201. HumanLM: Simulating Users with State Alignment Beats Response Imitation


202. Developing an AI Assistant for Knowledge Management and Workforce Training in State DOTs


203. From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings


204. TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation


205. TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement


206. PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents


207. Language Model Goal Selection Differs from Humans’ in an Open-Ended Task


208. Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory


209. From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG


210. One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models


211. AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents