All AI Papers - 2026-03-04

1. Inherited Goal Drift: Contextual Pressure Can Undermine Agentic Goals


2. Valet: A Standardized Testbed of Traditional Imperfect-Information Card Games


3. Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals


4. AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework


5. NeuroSkill(tm): Proactive Real-Time Agentic System Capable of Modeling Human State of Mind


6. No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models


7. Expectation and Acoustic Neural Network Representations Enhance Music Identification from Brain Activity


8. Neuro-Symbolic Artificial Intelligence: A Task-Directed Survey in the Black-Box Models Era


9. FEAST: Retrieval-Augmented Multi-Hierarchical Food Classification for the FoodEx2 System


10. Saarthi for AGI: Towards Domain-Specific General Intelligence for Formal Verification


11. Agentic AI-based Coverage Closure for Formal Verification


12. AI Space Physics: Constitutive boundary semantics for open AI institutions


13. Beyond Task Completion: Revealing Corrupt Success in LLM Agents through Procedure-Aware Evaluation


14. Odin: Multi-Signal Graph Intelligence for Autonomous Discovery in Knowledge Graphs


15. Beyond Factual Correctness: Mitigating Preference-Inconsistent Explanations in Explainable Recommendation


16. RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization


17. TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning


18. REGAL: A Registry-Driven Architecture for Deterministic Grounding of Agentic AI in Enterprise Telemetry


19. OrchMAS: Orchestrated Reasoning with Multi Collaborative Heterogeneous Scientific Expert Structured Agents


20. SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models


21. Architecting Trust in Artificial Epistemic Agents


22. ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization


23. SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training


24. Retrievit: In-context Retrieval Capabilities of Transformers, State Space Models, and Hybrid Architectures


25. LLM-based Argument Mining meets Argumentation and Description Logics: a Unified Framework for Reasoning about Debates


26. Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification


27. Agentified Assessment of Logical Reasoning Agents


28. Rethinking Code Similarity for Automated Algorithm Design with LLMs


29. EvoSkill: Automated Skill Discovery for Multi-Agent Systems


30. A Natural Language Agentic Approach to Study Affective Polarization


31. FinTexTS: Financial Text-Paired Time-Series Dataset via Semantic-Based and Multi-Level Pairing


32. Retrieval-Augmented Robots via Retrieve-Reason-Act


33. LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization


34. SorryDB: Can AI Provers Complete Real-World Lean Theorems?


35. See and Remember: A Multimodal Agent for Web Traversal


36. AgentAssay: Token-Efficient Regression Testing for Non-Deterministic AI Agent Workflows


37. SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving


38. LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges


39. AnchorDrive: LLM Scenario Rollout with Anchor-Guided Diffusion Regeneration for Safety-Critical Scenario Generation


40. A Neuropsychologically Grounded Evaluation of LLM Cognitive Abilities


41. LLM-MLFFN: Multi-Level Autonomous Driving Behavior Feature Fusion via Large Language Model


42. NeuroProlog: Multi-Task Fine-Tuning for Neurosymbolic Mathematical Reasoning via the Cocktail Effect


43. Revealing Positive and Negative Role Models to Help People Make Good Decisions


44. PRISM: Pushing the Frontier of Deep Think via Process Reward Model-Guided Inference


45. Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory


46. VL-KGE: Vision-Language Models Meet Knowledge Graph Embeddings


47. COOL-MC: Verifying and Explaining RL Policies for Platelet Inventory Management


48. Can machines be uncertain?


49. Estimating Visual Attribute Effects in Advertising from Observational Data: A Deepfake-Informed Double Machine Learning Approach


50. SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Memory Poisoning


51. Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents


52. Federated Inference: Toward Privacy-Preserving Collaborative and Incentivized Model Serving


53. How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference


54. Tether: Autonomous Functional Play with Correspondence-Driven Trajectory Warping


55. UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?


56. SynthCharge: An Electric Vehicle Routing Instance Generator with Feasibility Screening to Enable Learning-Based Optimization and Benchmarking


57. Stabilized Adaptive Loss and Residual-Based Collocation for Physics-Informed Neural Networks


58. Understanding and Mitigating Dataset Corruption in LLM Steering


59. Chain of World: World Model Thinking in Latent Motion


60. Type-Aware Retrieval-Augmented Generation with Dependency Closure for Solver-Executable Industrial Optimization Modeling


61. Conditioned Activation Transport for T2I Safety Steering


62. An Investigation Into Various Approaches For Bengali Long-Form Speech Transcription and Bengali Speaker Diarization


63. Information Routing in Atomistic Foundation Models: How Equivariance Creates Linearly Disentangled Representations


64. Channel-Adaptive Edge AI: Maximizing Inference Throughput by Adapting Computational Complexity to Channel States


65. Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing


66. APRES: An Agentic Paper Revision and Evaluation System


67. How to Model AI Agents as Personas?: Applying the Persona Ecosystem Playground to 41,300 Posts on Moltbook for Behavioral Insights


68. Joint Training Across Multiple Activation Sparsity Regimes


69. From Complex Dynamics to DynFormer: Rethinking Transformers for PDEs


70. Multi-Scale Adaptive Neighborhood Awareness Transformer For Graph Fraud Detection


71. MoECLIP: Patch-Specialized Experts for Zero-shot Anomaly Detection


72. Why Adam Can Beat SGD: Second-Moment Normalization Yields Sharper Tails


73. Compact Prompting in Instruction-tuned LLMs for Joint Argumentative Component Detection


74. Proactive Guiding Strategy for Item-side Fairness in Interactive Recommendation


75. On the Expressive Power of Transformers for Maxout Networks and Continuous Piecewise Linear Functions


76. TinyIceNet: Low-Power SAR Sea Ice Segmentation for On-Board FPGA Inference


77. Design Generative AI for Practitioners: Exploring Interaction Approaches Aligned with Creative Practice


78. Reinforcement Learning with Symbolic Reward Machines


79. TrustMH-Bench: A Comprehensive Benchmark for Evaluating the Trustworthiness of Large Language Models in Mental Health


80. QFlowNet: Fast, Diverse, and Efficient Unitary Synthesis with Generative Flow Networks


81. IoUCert: Robustness Verification for Anchor-based Object Detectors


82. cPNN: Continuous Progressive Neural Networks for Evolving Streaming Time Series


83. MA-CoNav: A Master-Slave Multi-Agent Framework with Hierarchical Collaboration and Dual-Level Reflection for Long-Horizon Embodied VLN


84. Why Does RLAIF Work At All?


85. Contextualized Privacy Defense for LLM Agents


86. Delegation and Verification Under AI


87. Layer-wise QUBO-Based Training of CNN Classifiers for Quantum Annealing


88. The Geometry of Learning Under AI Delegation


89. SEALing the Gap: A Reference Framework for LLM Inference Carbon Estimation via Multi-Benchmark Driven Embodiment


90. Enhancing Physics-Informed Neural Networks with Domain-aware Fourier Features: Towards Improved Performance and Interpretable Results


91. Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models


92. On the Structural Limitations of Weight-Based Neural Adaptation and the Role of Reversible Behavioral Learning


93. Interpretable Motion-Attentive Maps: Spatio-Temporally Localizing Concepts in Video Diffusion Transformers


94. Eliciting Numerical Predictive Distributions of LLMs Without Autoregression


95. Learning to Generate and Extract: A Multi-Agent Collaboration Framework For Zero-shot Document-level Event Arguments Extraction


96. StegaFFD: Privacy-Preserving Face Forgery Detection via Fine-Grained Steganographic Domain Lifting


97. CoFL: Continuous Flow Fields for Language-Conditioned Navigation


98. Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling


99. SPARC: Spatial-Aware Path Planning via Attentive Robot Communication


100. Faster, Cheaper, More Accurate: Specialised Knowledge Tracing Models Outperform LLMs


101. BrandFusion: A Multi-Agent Framework for Seamless Brand Integration in Text-to-Video Generation


102. Differentiable Time-Varying IIR Filtering for Real-Time Speech Denoising


103. OCR or Not? Rethinking Document Information Extraction in the MLLMs Era with Real-World Large-Scale Datasets


104. Scores Know Bob's Voice: Speaker Impersonation Attack


105. ITO: Images and Texts as One via Synergizing Multiple Alignment and Training-Time Fusion


106. Next Embedding Prediction Makes World Models Stronger


107. Efficient Self-Evaluation for Diffusion Language Models via Sequence Regeneration


108. iGVLM: Dynamic Instruction-Guided Vision Encoding for Question-Aware Multimodal Understanding


109. Enhancing User Throughput in Multi-panel mmWave Radio Access Networks for Beam-based MU-MIMO Using a DRL Method


110. Practical FP4 Training for Large-Scale MoE Models on Hopper GPUs


111. Sensory-Aware Sequential Recommendation via Review-Distilled Representations


112. Intelligent Pathological Diagnosis of Gestational Trophoblastic Diseases via Visual-Language Deep Learning Model


113. ShareVerse: Multi-Agent Consistent Video Generation for Shared World Modeling


114. ITLC at SemEval-2026 Task 11: Normalization and Deterministic Parsing for Formal Reasoning in LLMs


115. Real-Time Generation of Game Video Commentary with Multimodal LLMs: Pause-Aware Decoding Approaches


116. AlphaFree: Recommendation Free from Users, IDs, and GNNs


117. Improving Diffusion Planners by Self-Supervised Action Gating with Energies


118. Credibility Governance: A Social Mechanism for Collective Self-Correction under Weak Truth Signals


119. The Vienna 4G/5G Drive-Test Dataset


120. Robust Heterogeneous Analog-Digital Computing for Mixture-of-Experts Models with Theoretical Generalization Guarantees


121. MASPOB: Bandit-Based Prompt Optimization for Multi-Agent Systems with Graph Neural Networks


122. Detecting Structural Heart Disease from Electrocardiograms via a Generalized Additive Model of Interpretable Foundation-Model Predictors


123. GPUTOK: GPU Accelerated Byte Level BPE Tokenization


124. How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities


125. CAPT: Confusion-Aware Prompt Tuning for Reducing Vision-Language Misalignment


126. Through the Lens of Contrast: Self-Improving Visual Reasoning in VLMs


127. CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think


128. Bridging Diffusion Guidance and Anderson Acceleration via Hopfield Dynamics


129. Human-Certified Module Repositories for the AI Age


130. Learning Object-Centric Spatial Reasoning for Sequential Manipulation in Cluttered Environments


131. What Capable Agents Must Know: Selection Theorems for Robust Decision-Making under Uncertainty


132. Deep Learning Based Wildfire Detection for Peatland Fires Using Transfer Learning


133. GLoRIA: Gated Low-Rank Interpretable Adaptation for Dialectal ASR


134. Can Computational Reducibility Lead to Transferable Models for Graph Combinatorial Optimization?


135. Manifold Aware Denoising Score Matching (MAD)


136. MIRAGE: Knowledge Graph-Guided Cross-Cohort MRI Synthesis for Alzheimer’s Disease Prediction


137. Learning to Pay Attention: Unsupervised Modeling of Attentive and Inattentive Respondents in Survey Data


138. A Directed Graph Model and Experimental Framework for Design and Study of Time-Dependent Text Visualisation


139. Slurry-as-a-Service: A Modest Proposal on Scalable Pluralistic Alignment for Nutrient Optimization


140. From Fewer Samples to Fewer Bits: Reframing Dataset Distillation as Joint Optimization of Precision and Compactness


141. Rigidity-Aware Geometric Pretraining for Protein Design and Conformational Ensembles


142. PlayWrite: A Multimodal System for AI Supported Narrative Co-Authoring Through Play in XR


143. Diffusion-MPC in Discrete Domains: Feasibility Constraints, Horizon Effects, and Critic Alignment: Case study with Tetris


144. Large Electron Model: A Universal Ground State Predictor


145. RIVA: Leveraging LLM Agents for Reliable Configuration Drift Detection


146. Preconditioned Score and Flow Matching


147. ZeroDayBench: Evaluating LLM Agents on Unseen Zero-Day Vulnerabilities for Cyberdefense


148. The Malignant Tail: Spectral Segregation of Label Noise in Over-Parameterized Networks


149. Beyond Prompt Degradation: Prototype-guided Dual-pool Prompting for Incremental Object Detection


150. Quantum-Inspired Fine-Tuning for Few-Shot AIGC Detection via Phase-Structured Reparameterization


151. Temporal Imbalance of Positive and Negative Supervision in Class-Incremental Learning


152. Quantifying Frontier LLM Capabilities for Container Sandbox Escape


153. Contextual Invertible World Models: A Neuro-Symbolic Agentic Framework for Colorectal Cancer Drug Response


154. Characterizing VLA Models: Identifying the Action Generation Bottleneck for Edge AI Architectures


155. PRISM: Exploring Heterogeneous Pretrained EEG Foundation Model Transfer to Clinical Differential Diagnosis


156. Boosting Meta-Learning for Few-Shot Text Classification via Label-guided Distance Scaling


157. When Scaling Fails: Mitigating Audio Perception Decay of LALMs via Multi-Step Perception-Aware Reasoning


158. High-order Knowledge Based Network Controllability Robustness Prediction: A Hypergraph Neural Network Approach


159. Social-JEPA: Emergent Geometric Isomorphism


160. Silent Sabotage During Fine-Tuning: Few-Shot Rationale Poisoning of Compact Medical LLMs


161. Universal Conceptual Structure in Neural Translation: Probing NLLB-200’s Multilingual Geometry


162. MEBM-Speech: Multi-scale Enhanced BrainMagic for Robust MEG Speech Detection


163. MEBM-Phoneme: Multi-scale Enhanced BrainMagic for End-to-End MEG Phoneme Classification


164. Whisper-RIR-Mega: A Paired Clean-Reverberant Speech Benchmark for ASR Robustness to Room Acoustics


165. A Benchmark Analysis of Graph and Non-Graph Methods for Caenorhabditis Elegans Neuron Classification


166. Concept Heterogeneity-aware Representation Steering


167. CUDABench: Benchmarking LLMs for Text-to-CUDA Generation


168. Talking with Verifiers: Automatic Specification Generation for Neural Network Verification


169. Structured vs. Unstructured Pruning: An Exponential Gap


170. Adaptive Personalized Federated Learning via Multi-task Averaging of Kernel Mean Embeddings


171. Beyond Binary Preferences: A Principled Framework for Reward Modeling with Ordinal Feedback


172. Physics-Informed Neural Networks with Architectural Physics Embedding for Large-Scale Wave Field Reconstruction


173. Generalized Discrete Diffusion with Self-Correction


174. Neural Paging: Learning Context Management Policies for Turing-Complete Agents


175. Characterizing and Predicting Wildfire Evacuation Behavior: A Dual-Stage ML Approach


176. MedCalc-Bench Doesn’t Measure What You Think: A Benchmark Audit and the Case for Open-Book Evaluation


177. MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabular Prediction


178. Forecasting as Rendering: A 2D Gaussian Splatting Framework for Time Series Forecasting


179. NExT-Guard: Training-Free Streaming Safeguard without Token-Level Labels


180. Self-Play Only Evolves When Self-Synthetic Pipeline Ensures Learnable Information Gain


181. Is Retraining-Free Enough? The Necessity of Router Calibration for Efficient MoE Compression


182. ATPO: Adaptive Tree Policy Optimization for Multi-Turn Medical Dialogue


183. RxnNano: Training Compact LLMs for Chemical Reaction and Retrosynthesis Prediction via Hierarchical Curriculum Learning


184. GLEAN: Grounded Lightweight Evaluation Anchors for Contamination-Aware Tabular Reasoning


185. ParamΔ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost


186. On the Parameter Estimation of Sinusoidal Models for Speech and Audio Signals


187. Predicting Tuberculosis from Real-World Cough Audio Recordings and Metadata