All AI Papers - 2026-03-20

1. OS-Themis: A Scalable Critic Framework for Generalist GUI Rewards


2. Box Maze: A Process-Control Architecture for Reliable LLM Reasoning


3. cuGenOpt: A GPU-Accelerated General-Purpose Metaheuristic Framework for Combinatorial Optimization


4. D5P4: Partition Determinantal Point Process for Diversity in Parallel Discrete Diffusion Decoding


5. Implicit Patterns in LLM-Based Binary Analysis


6. How Uncertainty Estimation Scales with Sampling in Reasoning Models


7. LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling


8. Serendipity by Design: Evaluating the Impact of Cross-domain Mappings on Human and LLM Creativity


9. Man and machine: artificial intelligence and judicial decision making


10. Behavioral Fingerprints for LLM Endpoint Stability and Identity


11. Regret Bounds for Competitive Resource Allocation with Endogenous Costs


12. Evaluating Game Difficulty in Tetris Block Puzzle


13. Unmasking Algorithmic Bias in Predictive Policing: A GAN-Based Simulation Framework with Multi-City Temporal Analysis


14. Evaluating 5W3H Structured Prompting for Intent Alignment in Human-AI Interaction


15. Teleological Inference in Structural Causal Models via Intentional Interventions


16. Agentic Business Process Management: A Research Manifesto


17. Secure Linear Alignment of Large Language Models


18. I Can’t Believe It’s Corrupt: Evaluating Corruption in Multi-Agent Governance Systems


19. Quantitative Introspection in Language Models: Tracking Internal States Across Conversation


20. Reasoning over mathematical objects: on-policy reward modeling and test time aggregation


21. Geography According to ChatGPT – How Generative AI Represents and Reasons about Geography


22. Bridging Network Fragmentation: A Semantic-Augmented DRL Framework for UAV-aided VANETs


23. Conflict-Based Search for Multi Agent Path Finding with Asynchronous Actions


24. RewardFlow: Topology-Aware Reward Propagation on State Graphs for Agentic RL with Large Language Models


25. ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents


26. Can LLM generate interesting mathematical research problems?


27. dTRPO: Trajectory Reduction in Policy Optimization of Diffusion Large Language Models


28. Proceedings of the 2nd Workshop on Advancing Artificial Intelligence through Theory of Mind


29. A Concept is More Than a Word: Diversified Unlearning in Text-to-Image Diffusion Models


30. NeuroGame Transformer: Gibbs-Inspired Attention Driven by Game Theory and Statistical Physics


31. Memento-Skills: Let Agents Design Agents


32. Analysis Of Linguistic Stereotypes in Single and Multi-Agent Generative AI Architectures


33. MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution


34. Accurate and Efficient Multi-Channel Time Series Forecasting via Sparse Attention Mechanism


35. MANAR: Memory-augmented Attention with Navigational Abstract Conceptual Representation


36. Thinking with Constructions: A Benchmark and Policy Optimization for Visual-Text Interleaved Geometric Reasoning


37. Balanced Thinking: Improving Chain of Thought Training in Vision Language Models


38. An Onto-Relational-Sophic Framework for Governing Synthetic Minds


39. D-Mem: A Dual-Process Memory System for LLM Agents


40. Agentic Flow Steering and Parallel Rollout Search for Spatially Grounded Text-to-Image Generation


41. ZEBRAARENA: A Diagnostic Simulation Environment for Studying Reasoning-Action Coupling in Tool-Augmented LLMs


42. MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning


43. Interplay: Training Independent Simulators for Reference-Free Conversational Recommendation


44. CAPSUL: A Comprehensive Human Protein Benchmark for Subcellular Localization


45. Reasonably reasoning AI agents can avoid game-theoretic failures in zero-shot, provably


46. Correlation-Weighted Multi-Reward Optimization for Compositional Generation


47. Expert Personas Improve LLM Alignment but Damage Accuracy: Bootstrapping Intent-Based Persona Routing with PRISM


48. Cross-Domain Demo-to-Code via Neurosymbolic Counterfactual Reasoning


49. Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding


50. AlignMamba-2: Enhancing Multimodal Fusion and Sentiment Analysis with Modality-Aware Mamba


51. AS2 – Attention-Based Soft Answer Sets: An End-to-End Differentiable Neuro-Soft-Symbolic Reasoning Architecture


52. Prune-then-Quantize or Quantize-then-Prune? Understanding the Impact of Compression Order in Joint Model Compression


53. From Topic to Transition Structure: Unsupervised Concept Discovery at Corpus Scale via Predictive Associative Memory


54. Reflection in the Dark: Exposing and Escaping the Black Box in Reflective Prompt Optimization


55. From Weak Cues to Real Identities: Evaluating Inference-Driven De-Anonymization in LLM Agents


56. LGESynthNet: Controlled Scar Synthesis for Improved Scar Segmentation in Cardiac LGE-MRI Imaging


57. Interpretability without actionability: mechanistic methods cannot correct language model errors despite near-perfect internal representations


58. Large-Scale Analysis of Political Propaganda on Moltbook


59. Understanding the Theoretical Foundations of Deep Neural Networks through Differential Equations


60. MemArchitect: A Policy Driven Memory Governance Layer


61. FaithSteer-BENCH: A Deployment-Aligned Stress-Testing Benchmark for Inference-Time Steering


62. Consumer-to-Clinical Language Shifts in Ambient AI Draft Notes and Clinician-Finalized Documentation: A Multi-level Analysis


63. The Validity Gap in Health AI Evaluation: A Cross-Sectional Analysis of Benchmark Composition


64. CORE: Robust Out-of-Distribution Detection via Confidence and Orthogonal Residual Scoring


65. EDM-ARS: A Domain-Specific Multi-Agent System for Automated Educational Data Mining Research


66. Retrieval-Augmented LLM Agents: Learning to Learn from Experience


67. A Computationally Efficient Learning of Artificial Intelligence System Reliability Considering Error Propagation


68. Access Controlled Website Interaction for Agentic AI with Delegated Critical Tasks


69. TeachingCoach: A Fine-Tuned Scaffolding Chatbot for Instructional Guidance to Instructors


70. Efficient Dense Crowd Trajectory Prediction Via Dynamic Clustering


71. Don’t Vibe Code, Do Skele-Code: Interactive No-Code Notebooks for Subject Matter Experts to Build Lower-Cost Agentic Workflows


72. Adaptive Domain Models: Bayesian Evolution, Warm Rotation, and Principled Training for Geometric and Neuromorphic AI


73. Multi-Trait Subspace Steering to Reveal the Dark Side of Human-AI Interaction


74. Continually self-improving AI


75. DEAF: A Benchmark for Diagnostic Evaluation of Acoustic Faithfulness in Audio Language Models


76. NavTrust: Benchmarking Trustworthiness for Embodied Navigation


77. FinTradeBench: A Financial Reasoning Benchmark for LLMs


78. F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World


79. Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation


80. DreamPartGen: Semantically Grounded Part-Level 3D Generation via Collaborative Latent Denoising


81. $R$-equivalence on Cubic Surfaces I: Existing Cases with Non-Trivial Universal Equivalence


82. SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits


83. ARIADNE: A Perception-Reasoning Synergy Framework for Trustworthy Coronary Angiography Analysis


84. Meanings and Measurements: Multi-Agent Probabilistic Grounding for Vision-Language Navigation


85. VEPO: Variable Entropy Policy Optimization for Low-Resource Language Foundation Models


86. UGID: Unified Graph Isomorphism for Debiasing Large Language Models


87. Adaptive Regime-Aware Stock Price Prediction Using Autoencoder-Gated Dual Node Transformers with Reinforcement Learning Control


88. CustomTex: High-fidelity Indoor Scene Texturing via Multi-Reference Customization


89. FedTrident: Resilient Road Condition Classification Against Poisoning Attacks in Federated Learning


90. DaPT: A Dual-Path Framework for Multilingual Multi-hop Question Answering


91. SAVeS: Steering Safety Judgments in Vision-Language Models via Semantic Cues


92. CAMO: A Conditional Neural Solver for the Multi-objective Multiple Traveling Salesman Problem


93. Parallelograms Strike Back: LLMs Generate Better Analogies than People


94. Em-Garde: A Propose-Match Framework for Proactive Streaming Video Understanding


95. SEM: Sparse Embedding Modulation for Post-Hoc Debiasing of Vision-Language Models


96. What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time?


97. Security awareness in LLM agents: the NDAI zone case


98. Hypothesis-Conditioned Query Rewriting for Decision-Useful Retrieval


99. AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science


100. Foundations of Schrödinger Bridges for Generative Modeling


101. PRIOR: Perceptive Learning for Humanoid Locomotion with Reference Gait Priors


102. Improving moment tensor solutions under Earth structure uncertainty with simulation-based inference


103. Security, privacy, and agentic AI in a regulatory view: From definitions and distinctions to provisions and reflections


104. Progressive Training for Explainable Citation-Grounded Dialogue: Reducing Hallucination to Zero in English-Hindi LLMs


105. Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution


106. Translating MRI to PET through Conditional Diffusion Models with Enhanced Pathology Awareness


107. From Accuracy to Readiness: Metrics and Benchmarks for Human-AI Decision-Making


108. MultihopSpatial: Multi-hop Compositional Spatial Reasoning Benchmark for Vision-Language Model


109. Evaluating LLM-Generated Lessons from the Language Learning Students’ Perspective: A Short Case Study on Duolingo


110. Through the Looking-Glass: AI-Mediated Video Communication Reduces Interpersonal Trust and Confidence in Judgments


111. Motion-o: Trajectory-Grounded Video Reasoning


112. Agent Control Protocol: Admission Control for Agent Actions


113. Student views in AI Ethics and Social Impact


114. Perceptio: Perception Enhanced Vision Language Models via Spatial Token Generation


115. Functional Subspace Watermarking for Large Language Models


116. Mi:dm K 2.5 Pro


117. Points-to-3D: Structure-Aware 3D Generation with Point Cloud Priors


118. Automatic Configuration of LLM Post-Training Pipelines


119. ClawTrap: A MITM-Based Red-Teaming Framework for Real-World OpenClaw Security Evaluation


120. Are complicated loss functions necessary for teaching LLMs to reason?


121. WeNLEX: Weakly Supervised Natural Language Explanations for Multilabel Chest X-ray Classification


122. Measuring and Exploiting Confirmation Bias in LLM-Assisted Security Code Review


123. CausalRM: Causal-Theoretic Reward Modeling for RLHF from Observational User Feedbacks


124. Ontology-Guided Diffusion for Zero-Shot Visual Sim2Real Transfer


125. HISR: Hindsight Information Modulated Segmental Process Rewards For Multi-turn Agentic Reinforcement Learning


126. Cognitive Amplification vs Cognitive Delegation in Human-AI Systems: A Metric Framework


127. Multiscale Switch for Semi-Supervised and Contrastive Learning in Medical Ultrasound Image Segmentation


128. Benchmarking PDF Parsers on Table Extraction with LLM-based Semantic Evaluation


129. Beyond TVLA: Anderson-Darling Leakage Assessment for Neural Network Side-Channel Leakage Detection


130. REST: Receding Horizon Explorative Steiner Tree for Zero-Shot Object-Goal Navigation


131. OpenT2M: No-frill Motion Generation with Open-source, Large-scale, High-quality Data


132. Learning to Self-Evolve


133. AutORAN: LLM-driven Natural Language Programming for Agile xApp Development


134. myMNIST: Benchmark of PETNN, KAN, and Classical Deep Learning Models for Burmese Handwritten Digit Recognition


135. Elastic Weight Consolidation Done Right for Continual Learning


136. ICE: Intervention-Consistent Explanation Evaluation with Statistical Grounding for LLMs


137. SpecForge: A Flexible and Efficient Open-Source Training Framework for Speculative Decoding


138. Transformers Learn Robust In-Context Regression under Distributional Uncertainty


139. HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering


140. CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language Models


141. SCISSR: Scribble-Conditioned Interactive Surgical Segmentation and Refinement


142. Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds


143. When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making


144. Counting Circuits: Mechanistic Interpretability of Visual Reasoning in Large Vision-Language Models


145. CAFlow: Adaptive-Depth Single-Step Flow Matching for Efficient Histopathology Super-Resolution


146. Foundations and Architectures of Artificial Intelligence for Motor Insurance


147. Efficient Video Diffusion with Sparse Information Transmission for Video Compression


148. FILT3R: Latent State Adaptive Kalman Filter for Streaming 3D Reconstruction


149. Do Vision Language Models Understand Human Engagement in Games?


150. WASD: Locating Critical Neurons as Sufficient Conditions for Explaining and Controlling LLM Behavior


151. Interpretable Prostate Cancer Detection using a Small Cohort of MRI Images


152. HypeMed: Enhancing Medication Recommendations with Hypergraph-Based Patient Relationships


153. SODIUM: From Open Web Data to Queryable Databases


154. Discounted Beta–Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards


155. Adaptive Decoding via Test-Time Policy Learning for Self-Improving Generation


156. R&D: Balancing Reliability and Diversity in Synthetic Data Augmentation for Semantic Segmentation


157. The Impact of Corporate AI Washing on Farmers’ Digital Financial Behavior Response – An Analysis from the Perspective of Digital Financial Exclusion


158. Mind the Rarities: Can Rare Skin Diseases Be Reliably Diagnosed via Diagnostic Reasoning?


159. Self-Tuning Sparse Attention: Multi-Fidelity Hyperparameter Optimization for Transformer Acceleration


160. The Spillover Effects of Peer AI Rinsing on Corporate Green Innovation


161. TARo: Token-level Adaptive Routing for LLM Test-time Alignment


162. An SO(3)-equivariant reciprocal-space neural potential for long-range interactions


163. Evolutionarily Stable Stackelberg Equilibrium


164. PlanTwin: Privacy-Preserving Planning Abstractions for Cloud-Assisted LLM Agents


165. To See or To Please: Uncovering Visual Sycophancy and Split Beliefs in VLMs


166. PowerFlow: Unlocking the Dual Nature of LLMs via Principled Distribution Matching


167. From Noise to Signal: When Outliers Seed New Topics


168. Shifting Uncertainty to Critical Moments: Towards Reliable Uncertainty Quantification for VLA Model


169. Can LLMs Reason Like Automated Theorem Provers for Rust Verification? VCoT-Bench: Evaluating via Verification Chain of Thought


170. DriveVLM-RL: Neuroscience-Inspired Reinforcement Learning with Vision-Language Models for Safe and Deployable Autonomous Driving


171. Approximate Subgraph Matching with Neural Graph Representations and Reinforcement Learning


172. Auditing Preferences for Brands and Cultures in LLMs


173. Sparse3DTrack: Monocular 3D Object Tracking Using Sparse Supervision


174. Offload or Overload: A Platform Measurement Study of Mobile Robotic Manipulation Workloads


175. Detection Is Cheap, Routing Is Learned: Why Refusal-Based Alignment Evaluation Fails


176. Enactor: From Traffic Simulators to Surrogate World Models


177. LRConv-NeRV: Low Rank Convolution for Efficient Neural Video Compression


178. Sharpness-Aware Minimization in Logit Space Efficiently Enhances Direct Preference Optimization


179. Discovering What You Can Control: Interventional Boundary Discovery for Reinforcement Learning


180. MolRGen: A Training and Evaluation Setting for De Novo Molecular Generation with Reasoning Models


181. Gradient-Informed Temporal Sampling Improves Rollout Accuracy in PDE Surrogate Training


182. R2-Dreamer: Redundancy-Reduced World Models without Decoders or Augmentation


183. Retrieval-Augmented LLMs for Security Incident Analysis


184. VLM-AutoDrive: Post-Training Vision-Language Models for Safety-Critical Autonomous Driving Events


185. How LLMs Distort Our Written Language


186. Final Report for the Workshop on Robotics & AI in Medicine


187. Understanding Task Aggregation for Generalizable Ultrasound Foundation Models


188. Insight-V++: Towards Advanced Long-Chain Visual Reasoning with Multimodal Large Language Models


189. Intellectual Stewardship: Re-adapting Human Minds for Creative Knowledge Work in the Age of AI


190. LLM-Augmented Computational Phenotyping of Long Covid


191. VC-Soup: Value-Consistency Guided Multi-Value Alignment for Large Language Models


192. Tula: Optimizing Time, Cost, and Generalization in Distributed Large-Batch Training


193. Discovery of Bimodal Drift Rate Structure in FRB 20240114A: Evidence for Dual Emission Regions


194. ARTEMIS: A Neuro Symbolic Framework for Economically Constrained Market Dynamics


195. Training-Only Heterogeneous Image-Patch-Text Graph Supervision for Advancing Few-Shot Learning Adapters


196. A Trace-Based Assurance Framework for Agentic AI Orchestration: Contracts, Testing, and Governance


197. MOSS-TTS Technical Report


198. CytoSyn: a Foundation Diffusion Model for Histopathology – Tech Report


199. Enhancing Reinforcement Learning Fine-Tuning with an Online Refiner


200. Uncovering Latent Phase Structures and Branching Logic in Locomotion Policies: A Case Study on HalfCheetah


201. Probabilistic Federated Learning on Uncertain and Heterogeneous Data with Model Personalization


202. SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training


203. Lightweight Adaptation for LLM-based Technical Service Agent: Latent Logic Augmentation and Robust Noise Reduction


204. A Synthesizable RTL Implementation of Predictive Coding Networks


205. MCP-38: A Comprehensive Threat Taxonomy for Model Context Protocol Systems (v1.0)


206. S3T-Former: A Purely Spike-Driven State-Space Topology Transformer for Skeleton Action Recognition


207. NANOZK: Layerwise Zero-Knowledge Proofs for Verifiable Large Language Model Inference


208. The Provenance Paradox in Multi-Agent LLM Routing: Delegation Contracts and Attested Identity in LDP


209. Semantic Chameleon: Corpus-Dependent Poisoning Attacks and Defenses in RAG Systems


210. Towards Differentiating Between Failures and Domain Shifts in Industrial Data Streams


211. InfoMamba: An Attention-Free Hybrid Mamba-Transformer Model


212. Quine: Realizing LLM Agents as Native POSIX Processes


213. Engineering Verifiable Modularity in Transformers via Per-Layer Supervision


214. Clinically Meaningful Explainability for NeuroAI: An ethical, technical, and clinical perspective


215. KD-EKF: Knowledge-Distilled Adaptive Covariance EKF for Robust UWB/PDR Indoor Localization


216. Understanding the Relationship Between Firms’ AI Technology Innovation and Consumer Complaints


217. ProKWS: Personalized Keyword Spotting via Collaborative Learning of Phonemes and Prosody


218. PCOV-KWS: Multi-task Learning for Personalized Customizable Open Vocabulary Keyword Spotting


219. Using Laplace Transform To Optimize the Hallucination of Generation Models


220. BenchBrowser – Collecting Evidence for Evaluating Benchmark Validity


221. MineDraft: A Framework for Batch Parallel Speculative Decoding


222. Beyond Accuracy: An Explainability-Driven Analysis of Harmful Content Detection


223. DynaRAG: Bridging Static and Dynamic Knowledge in Retrieval-Augmented Generation


224. Agentic Framework for Political Biography Extraction


225. How Confident Is the First Token? An Uncertainty-Calibrated Prompt Optimization Framework for Large Language Model Classification and Understanding


226. TherapyGym: Evaluating and Aligning Clinical Fidelity and Safety in Therapy Chatbots


227. Do Large Language Models Possess a Theory of Mind? A Comparative Evaluation Using the Strange Stories Paradigm


228. Using Optimal Transport as Alignment Objective for fine-tuning Multilingual Contextualized Embeddings