All AI Papers - 2026-04-17

1. Generalization in LLM Problem Solving: The Case of the Shortest Path


2. Diagnosing LLM Judge Reliability: Conformal Prediction Sets and Transitivity Violations


3. How Do LLMs and VLMs Understand Viewpoint Rotation Without Vision? An Interpretability Study


4. Blue Data Intelligence Layer: Streaming Data and Agents for Multi-source Multi-modal Data-Centric Applications


5. RadAgent: A tool-using AI agent for stepwise interpretation of chest computed tomography


6. Context Over Content: Exposing Evaluation Faking in Automated Judges


7. Learning to Think Like a Cartoon Captionist: Incongruity-Resolution Supervision for Multimodal Humor Understanding


8. Meituan Merchant Business Diagnosis via Policy-Guided Dual-Process User Simulation


9. Agent-Aided Design for Dynamic CAD Models


10. IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning


11. An Axiomatic Benchmark for Evaluation of Scientific Novelty Metrics


12. SRMU: Relevance-Gated Updates for Streaming Hyperdimensional Memories


13. HyperSpace: A Generalized Framework for Spatial Encoding in Hyperdimensional Representations


14. OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis


15. Where are the Humans? A Scoping Review of Fairness in Multi-agent AI Systems


16. From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench


17. Autogenesis: A Self-Evolving Agent Protocol


18. Towards Faster Language Model Inference Using Mixture-of-Experts Flow Matching


19. COEVO: Co-Evolutionary Framework for Joint Functional Correctness and PPA Optimization in LLM-Based RTL Generation


20. Predicting Power-System Dynamic Trajectories with Foundation Models


21. The Possibility of Artificial Intelligence Becoming a Subject and the Alignment Problem


22. Dr. RTL: Autonomous Agentic RTL Optimization through Tool-Grounded Self-Improvement


23. AI-Enabled Covert Channel Detection in RF Receiver Architectures


24. Hybrid Decision Making via Conformal VLM-generated Guidance


25. Discovering Novel LLM Experts via Task-Capability Coevolution


26. WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training


27. Dual-Axis Generative Reward Model Toward Semantic and Turn-taking Robustness in Interactive Spoken Dialogue Models


28. ADAPT: Benchmarking Commonsense Planning under Unspecified Affordance Constraints


29. Governing Reflective Human-AI Collaboration: A Framework for Epistemic Scaffolding and Traceable Reasoning


30. Toward Agentic RAG for Ukrainian


31. MemoSight: Unifying Context Compression and Multi Token Prediction for Reasoning Acceleration


32. Cooperate to Compete: Strategic Data Generation and Incentivization Framework for Coopetitive Cross-Silo Federated Learning


33. The Missing Knowledge Layer in AI: A Framework for Stable Human-AI Reasoning


34. Benchmarks for Trajectory Safety Evaluation and Diagnosis in OpenClaw and Codex: ATBench-Claw and ATBench-CodeX


35. TrigReason: Trigger-Based Collaboration between Small and Large Reasoning Models


36. Intermediate Layers Encode Optimal Biological Representations in Single-Cell Foundation Models


37. Beyond Literal Summarization: Redefining Hallucination for Medical SOAP Note Evaluation


38. The LLM Fallacy: Misattribution in AI-Assisted Cognitive Workflows


39. Diffusion Crossover: Defining Evolutionary Recombination in Diffusion Models via Noise Sequence Interpolation


40. A Comparative Study of CNN Optimization Methods for Edge AI: Exploring the Role of Early Exits



42. CogEvolution: A Human-like Generative Educational Agent to Simulate Student’s Cognitive Evolution


43. MirrorBench: Evaluating Self-centric Intelligence in MLLMs by Introducing a Mirror


44. CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning


45. Disentangle-then-Refine: LLM-Guided Decoupling and Structure-Aware Refinement for Graph Contrastive Learning


46. Personalized and Context-Aware Transformer Models for Predicting Post-Intervention Physiological Responses from Wearable Sensor Data


47. The Agentification of Scientific Research: A Physicist’s Perspective


48. Layered Mutability: Continuity and Governance in Persistent Self-Modifying Agents


49. SGA-MCTS: Decoupling Planning from Execution via Training-Free Atomic Experience Retrieval


50. HWE-Bench: Benchmarking LLM Agents on Real-World Hardware Bug Repair Tasks


51. SynHAT: A Two-stage Coarse-to-Fine Diffusion Framework for Synthesizing Human Activity Traces


52. CAMO: An Agentic Framework for Automated Causal Discovery from Micro Behaviors to Macro Emergence in LLM Agent Simulations


53. M2-PALE: A Framework for Explaining Multi-Agent MCTS–Minimax Hybrids via Process Mining and LLMs


54. DR$^{3}$-Eval: Towards Realistic and Reproducible Deep Research Evaluation


55. Acceptance Dynamics Across Cognitive Domains in Speculative Decoding


56. Rethinking Patient Education as Multi-turn Multi-modal Interaction


57. AgentGA: Evolving Code Solutions in Agent-Seed Space


58. Targeted Exploration via Unified Entropy Control for Reinforcement Learning


59. Learning to Draw ASCII Improves Spatial Reasoning in Language Models


60. A Parallel Approach to Counting Exact Covers Based on Decomposability Property


61. CoDaS: AI Co-Data-Scientist for Biomarker Discovery via Wearable Sensors


62. El Agente Forjador: Task-Driven Agent Generation for Quantum Simulation


63. GDPR Auto-Formalization with AI Agents and Human Verification


64. Prompt Optimization Is a Coin Flip: Diagnosing When It Helps in Compound AI Systems


65. Enhancing Mental Health Counseling Support in Bangladesh using Culturally-Grounded Knowledge


66. MARS$^2$: Scaling Multi-Agent Tree Search via Reinforcement Learning for Code Generation


67. TRACER: Trace-Based Adaptive Cost-Efficient Routing for LLM Classification


68. Dissecting Failure Dynamics in Large Language Model Reasoning


69. Quantifying Cross-Query Contradictions in Multi-Query LLM Reasoning


70. Mind DeepResearch Technical Report


71. Perspective on Bias in Biomedical AI: Preventing Downstream Healthcare Disparities


72. Geometric Metrics for MoE Specialization: From Fisher Information to Early Failure Detection


73. Improving Machine Learning Performance with Synthetic Augmentation


74. Pushing the Limits of On-Device Streaming ASR: A Compact, High-Accuracy English Model for Low-Latency Inference


75. Seeing Through Circuits: Faithful Mechanistic Interpretability for Vision Transformers


76. Evo-MedAgent: Beyond One-Shot Diagnosis with Agents That Remember, Reflect, and Improve


77. Response-Aware User Memory Selection for LLM Personalization


78. Improving Human Performance with Value-Aware Interventions: A Case Study in Chess


79. AIBuildAI: An AI Agent for Automatically Building AI Models


80. On Tackling Complex Tasks with Reward Machines and Signal Temporal Logics


81. Geometric Routing Enables Causal Expert Control in Mixture of Experts


82. Demonstration of Pneuma-Seeker: Agentic System for Reifying and Fulfilling Information Needs on Tabular Data


83. Equifinality in Mixture of Experts: Routing Topology Does Not Determine Language Modeling Quality


84. Credo: Declarative Control of LLM Pipelines via Beliefs and Policies


85. Mistake gating leads to energy and memory efficient continual learning


86. Seeing Through Experts' Eyes: A Foundational Vision Language Model Trained on Radiologists' Gaze and Reasoning


87. GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification


88. Formalizing Kantian Ethics: Formula of the Universal Law Logic (FULL)


89. Interpretable and Explainable Surrogate Modeling for Simulations: A State-of-the-Art Survey and Perspectives on Explainable AI for Decision-Making


90. Fun-TSG: A Function-Driven Multivariate Time Series Generator with Variable-Level Anomaly Labeling


91. Simulating Human Cognition: Heartbeat-Driven Autonomous Thinking Activity Scheduling for LLM-based AI systems


92. NuHF Claw: A Risk Constrained Cognitive Agent Framework for Human Centered Procedure Support in Digital Nuclear Control Rooms


93. MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation


94. AD4AD: Benchmarking Visual Anomaly Detection Models for Safer Autonomous Driving


95. Why Do Vision Language Models Struggle To Recognize Human Emotions?


96. Prism: Symbolic Superoptimization of Tensor Programs


97. SegWithU: Uncertainty as Perturbation Energy for Single-Forward-Pass Risk-Aware Medical Image Segmentation


98. CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and LLM Agents in Social Dilemmas


99. Stability and Generalization in Looped Transformers


100. Agentic Microphysics: A Manifesto for Generative AI Safety


101. AI-Assisted Requirements Engineering: An Empirical Evaluation Relative to Expert Judgment


102. Benchmarking Classical Coverage Path Planning Heuristics on Irregular Hexagonal Grids for Maritime Coverage Scenarios


103. VisPCO: Visual Token Pruning Configuration Optimization via Budget-Aware Pareto-Frontier Learning for Vision-Language Models


104. Scepsy: Serving Agentic Workflows Using Aggregate LLM Pipelines


105. MambaSL: Exploring Single-Layer Mamba for Time Series Classification


106. Class Unlearning via Depth-Aware Removal of Forget-Specific Directions


107. Compressing Sequences in the Latent Embedding Space: $K$-Token Merging for Large Language Models


108. LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking


109. Structure as Computation: Developmental Generation of Minimal Neural Circuits


110. Amortized Optimal Transport from Sliced Potentials


111. IUQ: Interrogative Uncertainty Quantification for Long-Form Large Language Model Generation


112. Autonomous Evolution of EDA Tools: Multi-Agent Self-Evolved ABC


113. NEAT-NC: NEAT guided Navigation Cells for Robot Path Planning


114. No More Guessing: a Verifiable Gradient Inversion Attack in Federated Learning


115. CoGrid & the Multi-User Gymnasium: A Framework for Multi-Agent Experimentation


116. When Fairness Metrics Disagree: Evaluating the Reliability of Demographic Fairness Assessment in Machine Learning


117. Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization


118. What Is the Minimum Architecture for Prolepsis? Early Irrevocable Commitment Across Tasks in Small Transformers


119. Agentic Explainability at Scale: Between Corporate Fears and XAI Needs


120. UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards


121. Calibration-Gated LLM Pseudo-Observations for Online Contextual Bandits


122. RaTA-Tool: Retrieval-based Tool Selection with Multimodal Large Language Models


123. STEP-Parts: Geometric Partitioning of Boundary Representations for Large-Scale CAD Processing


124. Improving Sparse Autoencoder with Dynamic Attention


125. Beyond Importance Sampling: Rejection-Gated Policy Optimization


126. Can LLMs Score Medical Diagnoses and Clinical Reasoning as well as Expert Panels?


127. Reasoning Dynamics and the Limits of Monitoring Modality Reliance in Vision-Language Models


128. RACER: Retrieval-Augmented Contextual Rapid Speculative Decoding


129. SOLIS: Physics-Informed Learning of Interpretable Neural Surrogates for Nonlinear Systems


130. GenRec: A Preference-Oriented Generative Framework for Large-Scale Recommendation


131. Vibe-Coding: Feedback-Based Automated Verification with no Human Code Inspection, a Feasibility Study


132. MetaDent: Labeling Clinical Images for Vision-Language Models in Dentistry


133. Schema Key Wording as an Instruction Channel in Structured Generation under Constrained Decoding


134. ClimateCause: Complex and Implicit Causal Structures in Climate Reports


135. Efficient Search of Implantable Adaptive Cells for Medical Image Segmentation


136. Zero-Shot Retail Theft Detection via Orchestrated Vision Models: A Model-Agnostic, Cost-Effective Alternative to Trained Single-Model Systems


137. Temporal Cross-Modal Knowledge-Distillation-Based Transfer-Learning for Gas Turbine Vibration Fault Detection


138. Which bird does not have wings: Negative-constrained KGQA with Schema-guided Semantic Matching and Self-directed Refinement


139. Catching Every Ripple: Enhanced Anomaly Awareness via Dynamic Concept Adaptation


140. Bounded Autonomy for Enterprise AI: Typed Action Contracts and Consumer-Side Execution


141. AIPC: Agent-Based Automation for AI Model Deployment with Qualcomm AI Runtime


142. Seen-to-Scene: Keep the Seen, Generate the Unseen for Video Outpainting


143. Chaotic CNN for Limited Data Image Classification


144. Fact4ac at the Financial Misinformation Detection Challenge Task: Reference-Free Financial Misinformation Detection via Fine-Tuning and Few-Shot Prompting of Large Language Models


145. StoryCoder: Narrative Reformulation for Structured Reasoning in LLM Code Generation


146. ELMoE-3D: Leveraging Intrinsic Elasticity of MoE for Hybrid-Bonding-Enabled Self-Speculative Decoding in On-Premises Serving


147. Asking What Matters: Reward-Driven Clarification for Software Engineering Tasks


148. Retrieve, Then Classify: Corpus-Grounded Automation of Clinical Value Set Authoring


149. Uncertainty-aware Generative Learning Path Recommendation with Cognition-Adaptive Diffusion


150. Hijacking Large Audio-Language Models via Context-Agnostic and Imperceptible Auditory Prompt Injection


151. CausalDetox: Causal Head Selection and Intervention for Language Model Detoxification


152. Mechanistic Decoding of Cognitive Constructs in LLMs


153. AgileLog: A Forkable Shared Log for Agents on Data Streams


154. CPGRec+: A Balance-oriented Framework for Personalized Video Game Recommendations


155. Generative Augmented Inference


156. Don’t Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG


157. Controllable Video Object Insertion via Multiview Priors


158. VeriGraphi: A Multi-Agent Framework of Hierarchical RTL Generation for Large Hardware Designs


159. CSRA: Controlled Spectral Residual Augmentation for Robust Sepsis Prediction


160. CBCL: Safe Self-Extending Agent Communication


161. NewsTorch: A PyTorch-based Toolkit for Learner-oriented News Recommendation


162. On the Expressive Power and Limitations of Multi-Layer SSMs


163. Decoupling Identity from Utility: Privacy-by-Design Frameworks for Financial Ecosystems


164. A Nonasymptotic Theory of Gain-Dependent Error Dynamics in Behavior Cloning


165. Auxiliary Finite-Difference Residual-Gradient Regularization for PINNs


166. FocalLens: Visualizing Narratives through Focalization


167. FAIR Universe Weak Lensing ML Uncertainty Challenge: Handling Uncertainties and Distribution Shifts for Precision Cosmology


168. Crowdsourcing of Real-world Image Annotation via Visual Properties


169. Robustness Analysis of Machine Learning Models for IoT Intrusion Detection Under Data Poisoning Attacks


170. Hierarchical vs. Flat Iteration in Shared-Weight Transformers


171. LLMs taking shortcuts in test generation: A study with SAP HANA and LevelDB


172. Three-Phase Transformer


173. SpaceMind: A Modular and Self-Evolving Embodied Vision-Language Agent Framework for Autonomous On-orbit Servicing


174. Generating Concept Lexicalizations via Dictionary-Based Cross-Lingual Sense Projection


175. BiCon-Gate: Consistency-Gated De-colloquialisation for Dialogue Fact-Checking


176. Coalition Formation in LLM Agent Networks: Stability Analysis and Convergence Guarantees


177. Step-level Denoising-time Diffusion Alignment with Multiple Objectives


178. Modular Continual Learning via Zero-Leakage Reconstruction Routing and Autonomous Task Discovery


179. SatBLIP: Context Understanding and Feature Identification from Satellite Imagery with Vision-Language Learning


180. The Cost of Language: Centroid Erasure Exposes and Exploits Modal Competition in Multimodal Language Models


181. APEX-MEM: Agentic Semi-Structured Memory with Temporal Reasoning for Long-Term Conversational AI


182. When PCOS Meets Eating Disorders: An Explainable AI Approach to Detecting the Hidden Triple Burden


183. Tight Sample Complexity Bounds for Best-Arm Identification Under Bounded Systematic Bias


184. Mamba-SSM with LLM Reasoning for Biomarker Discovery: Causal Feature Refinement via Chain-of-Thought Gene Evaluation


185. Thermodynamic Diffusion Inference with Minimal Digital Conditioning


186. Faithfulness Serum: Mitigating the Faithfulness Gap in Textual Explanations of LLM Decisions via Attribution Guidance


187. Challenges and Future Directions in Agentic Reverse Engineering Systems


188. DharmaOCR: Specialized Small Language Models for Structured OCR that outperform Open-Source and Commercial Baselines


189. Aerial Multi-Functional RIS in Fluid Antennas-Aided Full-Duplex Networks: A Self-Optimized Hybrid Deep Reinforcement Learning Approach


190. EuropeMedQA Study Protocol: A Multilingual, Multimodal Medical Examination Dataset for Language Model Evaluation


191. Quantum-inspired tensor networks in machine learning models


192. Enhancing LLM-based Search Agents via Contribution Weighted Group Relative Policy Optimization


193. Reinforcement Learning via Value Gradient Flow


194. GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models


195. ReviewGrounder: Improving Review Substantiveness with Rubric-Guided, Tool-Integrated Agents


196. Evaluation of Agents under Simulated AI Marketplace Dynamics


197. Awakening Dormant Experts: Counterfactual Routing to Mitigate MoE Hallucinations


198. Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees


199. Graph-Based Fraud Detection with Dual-Path Graph Filtering


200. Explainable Graph Neural Networks for Interbank Contagion Surveillance: A Regulatory-Aligned Framework for the U.S. Banking Sector


201. Shapley Value-Guided Adaptive Ensemble Learning for Explainable Financial Fraud Detection with U.S. Regulatory Compliance Validation


202. Magnitude Is All You Need? Rethinking Phase in Quantum Encoding of Complex SAR Data


203. Dive into Claude Code: The Design Space of Today’s and Future AI Agent Systems


204. FRESCO: Benchmarking and Optimizing Re-rankers for Evolving Semantic Conflict in Retrieval-Augmented Generation


205. TRACE: A Conversational Framework for Sustainable Tourism Recommendation with Agentic Counterfactual Explanations



207. Knowledge Graph RAG: Agentic Crawling and Graph Construction in Enterprise Documents


208. MEME-Fusion@CHiPSAL 2026: Multimodal Ablation Study of Hate Detection and Sentiment Analysis on Nepali Memes


209. Neuro-Oracle: A Trajectory-Aware Agentic RAG Framework for Interpretable Epilepsy Surgical Prognosis


210. PriHA: A RAG-Enhanced LLM Framework for Primary Healthcare Assistant in Hong Kong


211. CROP: Token-Efficient Reasoning in Large Language Models via Regularized Prompt Optimization


212. Ollivier-Ricci Curvature of Riemannian Manifolds and Directed Graphs with Applications to Graph Neural Networks


213. Towards Verified and Targeted Explanations through Formal Methods


214. Disentangled Dual-Branch Graph Learning for Conversational Emotion Recognition


215. Bridging scalp and intracranial EEG in BCI via pretrained neural representations and geometric constraint embedding


216. Retina gap junctions support the robust perception by warping neural representational geometries along the visual hierarchy


217. PolyBench: Benchmarking LLM Forecasting and Trading Capabilities on Live Prediction Market Data


218. MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining


219. The PICCO Framework for Large Language Model Prompting: A Taxonomy and Reference Architecture for Prompt Structure


220. Grading the Unspoken: Evaluating Tacit Reasoning in Quantum Field Theory and String Theory with LLMs


221. HARNESS: Lightweight Distilled Arabic Speech Foundation Models


222. End-to-End Learning-based Operation of Integrated Energy Systems for Buildings and Data Centers


223. Internal Knowledge Without External Expression: Probing the Generalization Boundary of a Classical Chinese Language Model


224. An Underexplored Frontier: Large Language Models for Rare Disease Patient Education and Communication – A scoping review


225. Listen, Correct, and Feed Back: Spoken Pedagogical Feedback Generation


226. The Devil Is in Gradient Entanglement: Energy-Aware Gradient Coordinator for Robust Generalized Category Discovery


227. QU-NLP at ArchEHR-QA 2026: Two-Stage QLoRA Fine-Tuning of Qwen3-4B for Patient-Oriented Clinical Question Answering and Evidence Sentence Alignment


228. Tug-of-War within A Decade: Conflict Resolution in Vulnerability Analysis via Teacher-Guided Retrieval-Augmented Generations


229. Benchmarking Linguistic Adaptation in Comparable-Sized LLMs: A Study of Llama-3.1-8B, Mistral-7B-v0.1, and Qwen3-8B on Romanized Nepali


230. Stateful Evidence-Driven Retrieval-Augmented Generation with Iterative Reasoning


231. SAGE Celer 2.6 Technical Card


232. Chinese Essay Rhetoric Recognition Using LoRA, In-context Learning and Model Ensemble


233. SeaAlert: Critical Information Extraction From Maritime Distress Communications with Large Language Models


234. Can Large Language Models Detect Methodological Flaws? Evidence from Gesture Recognition for UAV-Based Rescue Operation Based on Deep Learning


235. HUOZIIME: An On-Device LLM-enhanced Input Method for Deep Personalization


236. MemGround: Long-Term Memory Evaluation Kit for Large Language Models in Gamified Scenarios


237. An Edge-Cloud Collaborative Architecture for Proactive Elderly Care: Real-Time Risk Assessment and Three-Level Emergency Response


238. From Black Box to Glass Box: Cross-Model ASR Disagreement to Prioritize Review in Ambient AI Scribe Documentation


239. Gaussian Process Regression of Steering Vectors With Physics-Aware Deep Composite Kernels for Augmented Listening