All AI Papers - 2026-04-24

1. From Research Question to Scientific Workflow: Leveraging Agentic AI for Science Automation


2. Nemobot Games: Crafting Strategic AI Gaming Agents for Interactive Learning with Large Language Models


3. Bounding the Black Box: A Statistical Certification Framework for AI Risk Regulation


4. Alignment has a Fantasia Problem


5. Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows


6. Learning to Communicate: Toward End-to-End Optimization of Multi-Agent Language Systems


7. Inferring High-Level Events from Timestamped Data: Complexity and Medical Applications


8. Who Defines “Best”? Towards Interactive, User-Defined Evaluation of LLM Leaderboards


9. Thinking with Reasoning Skills: Fewer Tokens, More Accuracy


10. Bridging the Training-Deployment Gap: Gated Encoding and Multi-Scale Refinement for Efficient Quantization-Aware Image Enhancement


11. Enabling and Inhibitory Pathways of University Students’ Willingness to Disclose AI Use: A Cognition-Affect-Conation Perspective


12. GS-Quant: Granular Semantic and Generative Structural Quantization for Knowledge Graph Completion


13. To See the Unseen: on the Generalization Ability of Transformers in Symbolic Reasoning


14. CoFEE: Reasoning Control for LLM-Based Feature Discovery


15. Separable Expert Architecture: Toward Privacy-Preserving LLM Personalization via Composable Adapters and Deletable User Proxies


16. Probabilistic Verification of Neural Networks via Efficient Probabilistic Hull Generation


17. Engaged AI Governance: Addressing the Last Mile Challenge Through Internal Expert Collaboration


18. Unbiased Prevalence Estimation with Multicalibrated LLMs


19. The CriticalSet problem: Identifying Critical Contributors in Bipartite Dependency Networks


20. Satisfying Rationality Postulates of Structured Argumentation Through Deductive Support – Technical Report


21. BioMiner: A Multi-modal System for Automated Mining of Protein-Ligand Bioactivity Data from Literature


22. GeoMind: An Agentic Workflow for Lithology Classification with Reasoned Tool Invocation


23. How English Print Media Frames Human-Elephant Conflicts in India


24. Efficient Agent Evaluation via Diversity-Guided User Simulation


25. AI-Gram: When Visual Agents Interact in a Social Network


26. HiCrew: Hierarchical Reasoning for Long-Form Video Understanding via Question-Aware Multi-Agent Collaboration


27. Brief chatbot interactions produce lasting changes in human moral values


28. FairQE: Multi-Agent Framework for Mitigating Gender Bias in Translation Quality Estimation


29. SemanticAgent: A Semantics-Aware Framework for Text-to-SQL Data Synthesis


30. Time, Causality, and Observability Failures in Distributed AI Inference Systems


31. ReaGeo: Reasoning-Enhanced End-to-End Geocoding with LLMs


32. Symbolic Grounding Reveals Representational Bottlenecks in Abstract Visual Reasoning


33. Evaluating AI Meeting Summaries with a Reusable Cross-Domain Pipeline


34. Ideological Bias in LLMs’ Economic Causal Reasoning


35. Spatial Metaphors for LLM Memory: A Critical Analysis of the MemPalace Architecture


36. Can MLLMs “Read” What is Missing?


37. Enhancing Online Recruitment with Category-Aware MoE and LLM-based Data Augmentation


38. Trustworthy Clinical Decision Support Using Meta-Predicates and Domain-Specific Languages


39. Robustness Analysis of POMDP Policies to Observation Perturbations


40. ReCAPA: Hierarchical Predictive Correction to Mitigate Cascading Failures


41. Align Generative Artificial Intelligence with Human Preferences: A Novel Large Language Model Fine-Tuning Method for Online Review Management


42. Trust but Verify: Introducing DAVinCI – A Framework for Dual Attribution and Verification in Claim Inference for Language Models


43. Multi-Agent Empowerment and Emergence of Complex Behavior in Groups


44. Agentic AI for Personalized Physiotherapy: A Multi-Agent Framework for Generative Video Training and Real-Time Pose Correction


45. AI Governance under Political Turnover: The Alignment Surface of Compliance Design


46. Propensity Inference: Environmental Contributors to LLM Behaviour


47. Mind the Prompt: Self-adaptive Generation of Task Plan Explanations via LLMs


48. InVitroVision: a Multi-Modal AI Model for Automated Description of Embryo Development using Natural Language


49. Active Data


50. Who Defines Fairness? Target-Based Prompting for Demographic Representation in Generative Models


51. HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering


52. Adaptive Test-Time Compute Allocation with Evolving In-Context Demonstrations


53. Deep FinResearch Bench: Evaluating AI’s Ability to Conduct Professional Financial Investment Research


54. The Last Harness You’ll Ever Build


55. Value-Conflict Diagnostics Reveal Widespread Alignment Faking in Language Models


56. Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks


57. Escaping the Agreement Trap: Defensibility Signals for Evaluating Rule-Governed AI


58. Architecture of an AI-Based Automated Course of Action Generation System for Military Operations


59. Seeing Fast and Slow: Learning the Flow of Time in Videos


60. When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs


61. Equity Bias: An Ethical Framework for AI Design


62. A Scale-Adaptive Framework for Joint Spatiotemporal Super-Resolution with Diffusion Models


63. GiVA: Gradient-Informed Bases for Vector-Based Adaptation


64. A Multi-Stage Warm-Start Deep Learning Framework for Unit Commitment


65. TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale


66. A Multimodal Text- and Graph-Based Approach for Open-Domain Event Extraction from Documents


67. Addressing Image Authenticity When Cameras Use Generative AI


68. Replay-buffer engineering for noise-robust quantum circuit optimization


69. Transient Turn Injection: Exposing Stateless Multi-Turn Vulnerabilities in Large Language Models


70. TraceScope: Interactive URL Triage via Decoupled Checklist Adjudication


71. Modulating Cross-Modal Convergence with Single-Stimulus, Intra-Modal Dispersion


72. Divide-then-Diagnose: Weaving Clinician-Inspired Contexts for Ultra-Long Capsule Endoscopy Videos


73. Probably Approximately Consensus: On the Learning Theory of Finding Common Ground


74. Quotient-Space Diffusion Models


75. SyMTRS: Benchmark Multi-Task Synthetic Dataset for Depth, Domain Adaptation and Super-Resolution in Aerial Imagery


76. Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs


77. StructMem: Structured Memory for Long-Horizon Behavior in LLMs


78. Agentic AI-assisted coding offers a unique opportunity to instill epistemic grounding during software development


79. AEL: Agent Evolving Learning for Open-Ended Environments


80. Building a Precise Video Language with Human-AI Oversight


81. Fairness under uncertainty in sequential decisions


82. Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers


83. Efficient Logic Gate Networks for Video Copy Detection


84. Geometric Monomial (GEM): a family of rational 2N-differentiable activation functions


85. Fine-Grained Perspectives: Modeling Explanations with Annotator-Specific Rationales


86. Causal Disentanglement for Full-Reference Image Quality Assessment


87. Dilated CNNs for Periodic Signal Processing: A Low-Complexity Approach


88. Task-specific Subnetwork Discovery in Reinforcement Learning for Autonomous Underwater Navigation


89. Promoting Simple Agents: Ensemble Methods for Event-Log Prediction


90. Process Supervision via Verbal Critique Improves Reasoning in Large Language Models


91. Using ASP(Q) to Handle Inconsistent Prioritized Data


92. On the Role of Preprocessing and Memristor Dynamics in Reservoir Computing for Image Classification


93. DryRUN: On the Role of Public Tests in LLM-Driven Code Generation


94. A Metamorphic Testing Approach to Diagnosing Memorization in LLM-Based Program Repair


95. Hybrid Deep Learning Approach for Coupled Demand Forecasting and Supply Chain Optimization


96. Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation


97. Attention-based multiple instance learning for predominant growth pattern prediction in lung adenocarcinoma WSI using foundation models


98. Architectures for Robust Self-Organizing Energy Systems under Information and Control Constraints


99. Generalizing Numerical Reasoning in Table Data through Operation Sketches and Self-Supervised Learning


100. MISTY: High-Throughput Motion Planning via Mixer-based Single-step Drifting


101. Drug Synergy Prediction via Residual Graph Isomorphism Networks and Attention Mechanisms


102. Dynamical Priors as a Training Objective in Reinforcement Learning


103. Reasoning Primitives in Hybrid and Non-Hybrid LLMs


104. VARestorer: One-Step VAR Distillation for Real-World Image Super-Resolution


105. Differentially Private De-identification of Dutch Clinical Notes: A Comparative Evaluation


106. CSC: Turning the Adversary’s Poison against Itself


107. VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought


108. Supervised Learning Has a Necessary Geometric Blind Spot: Theory, Consequences, and Minimal Repair


109. From Noise to Intent: Anchoring Generative VLA Policies with Residual Bridges


110. Conjecture and Inquiry: Quantifying Software Performance Requirements via Interactive Retrieval-Augmented Preference Elicitation


111. VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation


112. mcdok at SemEval-2026 Task 13: Finetuning LLMs for Detection of Machine-Generated Code


113. Trust-SSL: Additive-Residual Selective Invariance for Robust Aerial Self-Supervised Learning


114. Beyond Single Plots: A Benchmark for Question Answering on Multi-Charts


115. Understanding and Mitigating Spurious Signal Amplification in Test-Time Reinforcement Learning for Math Reasoning


116. MiMIC: Mitigating Visual Modality Collapse in Universal Multimodal Retrieval While Avoiding Semantic Misalignment


117. The First Challenge on Remote Sensing Infrared Image Super-Resolution at NTIRE 2026: Benchmark Results and Method Overview


118. Adversarial Evasion in Non-Stationary Malware Detection: Minimizing Drift Signals through Similarity-Constrained Perturbations


119. Exploring the Role of Synthetic Data Augmentation in Controllable Human-Centric Video Generation


120. Cross-Entropy Is Load-Bearing: A Pre-Registered Scope Test of the K-Way Energy Probe on Bidirectional Predictive Coding


121. Do LLM Decoders Listen Fairly? Benchmarking How Language Model Priors Shape Bias in Speech Recognition


122. Measure Twice, Click Once: Co-evolving Proposer and Visual Critic via Reinforcement Learning for GUI Grounding


123. Calibeating Prediction-Powered Inference


124. Planning Beyond Text: Graph-based Reasoning for Complex Narrative Generation


125. CAP: Controllable Alignment Prompting for Unlearning in LLMs


126. CorridorVLA: Explicit Spatial Constraints for Generative Action Heads via Sparse Anchors


127. SparKV: Overhead-Aware KV Cache Loading for Efficient On-Device LLM Inference


128. EngramaBench: Evaluating Long-Term Conversational Memory with Structured Graph Retrieval


129. Zero-Shot Detection of LLM-Generated Text via Implicit Reward Model


130. Post-AGI Economies: Autonomy and the First Fundamental Theorem of Welfare Economics


131. SQLyzr: A Comprehensive Benchmark and Evaluation Platform for Text-to-SQL


132. On Reasoning Behind Next Occupation Recommendation


133. How VLAs (Really) Work In Open-World Environments


134. Doubly Saturated Ramsey Graphs: A Case Study in Computer-Assisted Mathematical Discovery


135. Scaling of Gaussian Kolmogorov–Arnold Networks


136. TAPO-Description Logic for Information Behavior: Refined OBoxes, Inference, and Categorical Semantics


137. Adaptive Instruction Composition for Automated LLM Red-Teaming


138. Dialect vs Demographics: Quantifying LLM Bias from Implicit Linguistic Signals vs. Explicit User Profiles


139. Using Machine Mental Imagery for Representing Common Ground in Situated Dialogue


140. Navigating the Clutter: Waypoint-Based Bi-Level Planning for Multi-Robot Systems


141. Enhancing Science Classroom Discourse Analysis through Joint Multi-Task Learning for Reasoning-Component Classification


142. Cross-Session Threats in AI Agents: Benchmark, Evaluation, and Algorithms


143. AGNT2: Autonomous Agent Economies on Interaction-Optimized Layer 2 Infrastructure


144. Materialistic RIR: Material Conditioned Realistic RIR Generation


145. Leveraging Multimodal LLMs for Built Environment and Housing Attribute Assessment from Street-View Imagery


146. TRAVELFRAUDBENCH: A Configurable Evaluation Framework for GNN Fraud Ring Detection in Travel Networks


147. Structural Quality Gaps in Practitioner AI Governance Prompts: An Empirical Study Using a Five-Principle Evaluation Framework


148. Behavioral Consistency and Transparency Analysis on Large Language Model API Gateways


149. Serialisation Strategy Matters: How FHIR Data Format Affects LLM Medication Reconciliation


150. Generative Discovery of Magnetic Insulators under Competing Physical Constraints


151. Expanding the extreme-k dielectric materials space through physics-validated generative reasoning


152. StyleVAR: Controllable Image Style Transfer via Visual Autoregressive Modeling


153. Strategic Polysemy in AI Discourse: A Philosophical Analysis of Language, Hype, and Power


154. Synthetic Data in Education: Empirical Insights from Traditional Resampling and Deep Generative Models


155. A Systematic Review and Taxonomy of Reinforcement Learning-Model Predictive Control Integration for Linear Systems


156. Integrated packing, placement, scheduling, and routing of personalized production: a pharmaceutical Industry 4.0 use-case with a planar transport system


157. A Deep U-Net Framework for Flood Hazard Mapping Using Hydraulic Simulations of the Wupper Catchment


158. Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics


159. SGD at the Edge of Stability: The Stochastic Sharpness Gap


160. Breaking MCP with Function Hijacking Attacks: Novel Threats for Function Calling and Agentic Models


161. Differentially Private Model Merging


162. Thinking Like a Botanist: Challenging Multimodal Language Models with Intent-Driven Chain-of-Inquiry


163. LAF-Based Evaluation and UTTL-Based Learning Strategies with MIATTs


164. HARBOR: Automated Harness Optimization


165. Data-Driven Open-Loop Simulation for Digital-Twin Operator Decision Support in Wastewater Treatment


166. IRIS: Interpolative Rényi Iterative Self-play for Large Language Model Fine-Tuning


167. Adaptive Defense Orchestration for RAG: A Sentinel-Strategist Architecture against Multi-Vector Attacks


168. SafeRedirect: Defeating Internal Safety Collapse via Task-Completion Redirection in Frontier LLMs


169. Domain-Aware Hierarchical Contrastive Learning for Semi-Supervised Generalization Fault Diagnosis


170. The Path Not Taken: Duality in Reasoning about Program Execution


171. Absorber LLM: Harnessing Causal Synchronization for Test-Time Training


172. Omission Constraints Decay While Commission Constraints Persist in Long-Context LLM Agents


173. Planetary Exploration 3.0: A Roadmap for Software-Defined, Radically Adaptive Space Systems


174. Biomedical systems biology workflow orchestration and execution with PoSyMed


175. Reinforcing privacy reasoning in LLMs via normative simulacra from fiction


176. Frequency-Forcing: From Scaling-as-Time to Soft Frequency Guidance


177. Predicting Scale-Up of Metal-Organic Framework Syntheses with Large Language Models


178. Watts-per-Intelligence Part II: Algorithmic Catalysis


179. Ternary Memristive Logic: Hardware for Reasoning Realized via Domain Algebra


180. HHL with a Coherent Fourier Oracle: A Proof-of-Concept Quantum Architecture for Joint Melody-Harmony Generation


181. M-CARE: Standardized Clinical Case Reporting for AI Model Behavioral Disorders, with a 20-Case Atlas and Experimental Validation


182. Clinical Reasoning AI for Oncology Treatment Planning: A Multi-Specialty Case-Based Evaluation


183. The AI Criminal Mastermind


184. Preserving Decision Sovereignty in Military AI: A Trade-Secret-Safe Architectural Framework for Model Replaceability, Human Authority, and State Control


185. Deep Interest Mining with Cross-Modal Alignment for SemanticID Generation in Generative Recommendation


186. RealRoute: Dynamic Query Routing System via Retrieve-then-Verify Paradigm


187. KGiRAG: An Iterative GraphRAG Approach for Responding Sensemaking Queries


188. Mixture of Sequence: Theme-Aware Mixture-of-Experts for Long-Sequence Recommendation


189. DiagramBank: A Large-scale Dataset of Diagram Design Exemplars with Paper Metadata for Retrieval-Augmented Generation


190. ERA: Evidence-based Reliability Alignment for Honest Retrieval-Augmented Generation


191. DenoiseRank: Learning to Rank by Diffusion Models


192. Robust Test-time Video-Text Retrieval: Benchmarking and Adapting for Query Shifts


193. Association Is Not Similarity: Learning Corpus-Specific Associations for Multi-Hop Retrieval


194. SPIRE: Structure-Preserving Interpretable Retrieval of Evidence


195. MATRAG: Multi-Agent Transparent Retrieval-Augmented Generation for Explainable Recommendations


196. Revisiting Content-Based Music Recommendation: Efficient Feature Aggregation from Large-Scale Music Models


197. ADS-POI: Agentic Spatiotemporal State Decomposition for Next Point-of-Interest Recommendation


198. CaST-POI: Candidate-Conditioned Spatiotemporal Modeling for Next POI Recommendation


199. AtomicRAG: Atom-Entity Graphs for Retrieval-Augmented Generation


200. The Effect of Idea Elaboration on the Automatic Assessment of Idea Originality


201. Mango: Multi-Agent Web Navigation via Global-View Optimization