All AI Papers - 2026-04-16

1. TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration


2. Hierarchical Reinforcement Learning with Runtime Safety Shielding for Power Grid Operation


3. Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents


4. Reward Design for Physical Reasoning in Vision-Language Models


5. [Emerging Ideas] Artificial Tripartite Intelligence: A Bio-Inspired, Sensor-First Architecture for Physical AI


6. AI-Assisted Peer Review at Scale: The AAAI-26 AI Review Pilot


7. GeoAgentBench: A Dynamic Execution Benchmark for Tool-Augmented Agents in Spatial Analysis


8. AlphaCNOT: Learning CNOT Minimization with Model-Based Planning


9. The cognitive companion: a lightweight parallel monitoring architecture for detecting and recovering from reasoning degradation in LLM agents


10. Rethinking AI Hardware: A Three-Layer Cognitive Architecture for Autonomous Agents


11. Weight Patching: Toward Source-Level Mechanistic Localization in LLMs


12. RiskWebWorld: A Realistic Interactive Benchmark for GUI Agents in E-commerce Risk Management


13. Towards Scalable Lightweight GUI Agents via Multi-role Orchestration


14. Quantifying and Understanding Uncertainty in Large Reasoning Models


15. ReSS: Learning Reasoning Models for Tabular Data Prediction via Symbolic Scaffold


16. Listening Alone, Understanding Together: Collaborative Context Recovery for Privacy-Aware AI


17. WebXSkill: Skill Learning for Autonomous Web Agents


18. Optimizing Earth Observation Satellite Schedules under Unknown Operational Constraints: An Active Constraint Acquisition Approach


19. Numerical Instability and Chaos: Quantifying the Unpredictability of Large Language Models


20. SciFi: A Safe, Lightweight, User-Friendly, and Fully Autonomous Agentic AI Workflow for Scientific Applications


21. Exploration and Exploitation Errors Are Measurable for Language Model Agents


22. From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space


23. LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning


24. From Feelings to Metrics: Understanding and Formalizing How Users Vibe-Test LLMs


25. Rhetorical Questions in LLM Representations: A Linear Probing Study


26. HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System


27. UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding


28. UMI-3D: Extending Universal Manipulation Interface from Vision-Limited to 3D Spatial Perception


29. TIP: Token Importance in On-Policy Distillation


30. First-See-Then-Design: A Multi-Stakeholder View for Optimal Performance-Fairness Trade-Offs



32. Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective


33. MAny: Merge Anything for Multimodal Continual Instruction Tuning


34. Towards Multi-Object-Tracking with Radar on a Fast Moving Vehicle: On the Potential of Processing Radar in the Frequency Domain


35. Diffusion Language Models for Speech Recognition


36. Adaptive Conformal Prediction for Improving Factuality of Generations by Large Language Models


37. Leveraging LLM-GNN Integration for Open-World Question Answering over Knowledge Graphs


38. How Can We Synthesize High-Quality Pretraining Data? A Systematic Study of Prompt Design, Generator Model, and Source Data


39. Creo: From One-Shot Image Generation to Progressive, Co-Creative Ideation


40. HINTBench: Horizon-agent Intrinsic Non-attack Trajectory Benchmark


41. ASTER: Latent Pseudo-Anomaly Generation for Unsupervised Time-Series Anomaly Detection


42. Do We Still Need Humans in the Loop? Comparing Human and LLM Annotation in Active Learning for Hostility Detection


43. Beyond Conservative Automated Driving in Multi-Agent Scenarios via Coupled Model Predictive Control and Deep Reinforcement Learning


44. Evaluating Supervised Machine Learning Models: Principles, Pitfalls, and Metric Selection


45. MCPThreatHive: Automated Threat Intelligence for Model Context Protocol Ecosystems


46. SparseBalance: Load-Balanced Long Context Training with Dynamic Sparse Attention


47. Sentiment analysis for software engineering: How far can zero-shot learning (ZSL) go?


48. Cognitive Offloading in Agile Teams: How Artificial Intelligence Reshapes Risk Assessment and Planning Quality


49. Gaslight, Gatekeep, V1-V3: Early Visual Cortex Alignment Shields Vision-Language Models from Sycophantic Manipulation


50. Soft $Q(λ)$: A multi-step off-policy method for entropy regularised reinforcement learning using eligibility traces


51. From Anchors to Supervision: Memory-Graph Guided Corpus-Free Unlearning for Large Language Models


52. A Dynamic-Growing Fuzzy-Neuro Controller, Application to a 3PSP Parallel Robot


53. TokenFormer: Unify the Multi-Field and Sequential Recommendation Worlds


54. Jump-Start Reinforcement Learning with Vision-Language-Action Regularization


55. FRAGATA: Semantic Retrieval of HPC Support Tickets via Hybrid RAG over 20 Years of Request Tracker History


56. Towards Fine-grained Temporal Perception: Post-Training Large Audio-Language Models with Audio-Side Time Prompt


57. Beyond Arrow’s Impossibility: Fairness as an Emergent Property of Multi-Agent Collaboration


58. MIND: AI Co-Scientist for Material Research


59. Med-CAM: Minimal Evidence for Explaining Medical Decision Making


60. Beyond Voxel 3D Editing: Learning from 3D Masks and Self-Constructed Data


61. IndicDB – Benchmarking Multilingual Text-to-SQL Capabilities in Indian Languages


62. Automatically Inferring Teachers’ Geometric Content Knowledge: A Skills Based Approach


63. Ordinary Least Squares is a Special Case of Transformer


64. A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies


65. SafeHarness: Lifecycle-Integrated Security Architecture for LLM-based Agent Deployment


66. Syn-TurnTurk: A Synthetic Dataset for Turn-Taking Prediction in Turkish Dialogues


67. Golden Handcuffs make safer AI agents


68. Design Space Exploration of Hybrid Quantum Neural Networks for Chronic Kidney Disease



70. Comparison of window shapes and lengths in short-time feature extraction for classification of heart sound signals


71. UHR-BAT: Budget-Aware Token Compression Vision-Language model for Ultra-High-Resolution Remote Sensing


72. CLIP Architecture for Abdominal CT Image-Text Alignment and Zero-Shot Learning: Investigating Batch Composition and Data Scaling


73. Training-Free Test-Time Contrastive Learning for Large Language Models


74. Free Lunch for Unified Multimodal Models: Enhancing Generation via Reflective Rectification with Inherent Understanding


75. C-voting: Confidence-Based Test-Time Voting without Explicit Energy Functions


76. From Alignment to Prediction: A Study of Self-Supervised Learning and Predictive Representation Learning


77. Representation over Routing: Overcoming Surrogate Hacking in Multi-Timescale PPO


78. SFT-GRPO Data Overlap as a Post-Training Hyperparameter for Autoformalization


79. Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning


80. Monthly Diffusion v0.9: A Latent Diffusion Model for the First AI-MIP


81. Secure and Privacy-Preserving Vertical Federated Learning


82. Bridging MARL to SARL: An Order-Independent Multi-Agent Transformer via Latent Consensus


83. Functional Emotions or Situational Contexts? A Discriminating Test from the Mythos Preview System Card


84. Learning from Change: Predictive Models for Incident Prevention in a Regulated IT Environment


85. From Order to Distribution: A Spectral Characterization of Forgetting in Continual Learning


86. Asymmetric-Loss-Guided Hybrid CNN-BiLSTM-Attention Model for Industrial RUL Prediction with Interpretable Failure Heatmaps


87. Outperforming Self-Attention Mechanisms in Solar Irradiance Forecasting via Physics-Guided Neural Networks


88. A Study of Failure Modes in Two-Stage Human-Object Interaction Detection


89. A KL Lens on Quantization: Fast, Forward-Only Sensitivity for Mixed-Precision SSM-Transformer Models


90. MaMe & MaRe: Matrix-Based Token Merging and Restoration for Efficient Visual Perception and Synthesis


91. A Unified Conditional Flow for Motion Generation, Editing, and Intra-Structural Retargeting


92. Event-Adaptive State Transition and Gated Fusion for RGB-Event Object Tracking


93. MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments


94. The Cognitive Circuit Breaker: A Systems Engineering Framework for Intrinsic AI Reliability


95. DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis


96. Minimax Optimality and Spectral Routing for Majority-Vote Ensembles under Markov Dependence


97. From Prediction to Justification: Aligning Sentiment Reasoning with Human Rationale via Reinforcement Learning


98. On the Use of Evolutionary Optimization for the Dynamic Chance Constrained Open-Pit Mine Scheduling Problem


99. Young people’s perceptions and recommendations for conversational generative artificial intelligence in youth mental health


100. A 3D SAM-Based Progressive Prompting Framework for Multi-Task Segmentation of Radiotherapy-induced Normal Tissue Injuries in Limited-Data Settings


101. Peer-Predictive Self-Training for Language Model Reasoning


102. Finetuning-Free Diffusion Model with Adaptive Constraint Guidance for Inorganic Crystal Structure Generation


103. Beyond Uniform Sampling: Synergistic Active Learning and Input Denoising for Robust Neural Operators


104. Can Cross-Layer Transcoders Replace Vision Transformer Activations? An Interpretable Perspective on Vision



106. English is Not All You Need: Systematically Exploring the Role of Multilinguality in LLM Post-Training


107. L2D-Clinical: Learning to Defer for Adaptive Model Selection in Clinical Text Classification


108. Explainable Fall Detection for Elderly Care via Temporally Stable SHAP in Skeleton-Based Human Activity Recognition


109. Rethinking Uncertainty in Segmentation: From Estimation to Decision


110. Hessian-Enhanced Token Attribution (HETA): Interpreting Autoregressive LLMs


111. Out of Context: Reliability in Multimodal Anomaly Detection Requires Contextual Inference


112. GeoVision-Enabled Digital Twin for Hybrid Autonomous-Teleoperated Medical Responses


113. 4th Workshop on Maritime Computer Vision (MaCVi): Challenge Overview


114. Lazy or Efficient? Towards Accessible Eye-Tracking Event Detection Using LLMs


115. On the Creativity of AI Agents


116. SemiFA: An Agentic Multi-Modal Framework for Autonomous Semiconductor Failure Analysis Report Generation


117. KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs


118. Identifiability of Potentially Degenerate Gaussian Mixture Models With Piecewise Affine Mixing


119. Multitasking Embedding for Embryo Blastocyst Grading Prediction (MEmEBG)


120. Inclusive Kitchen Design for Older Adults: Generative AI Visualizations to Support Mild Cognitive Impairment


121. InfiniteScienceGym: An Unbounded, Procedurally-Generated Benchmark for Scientific Analysis


122. Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization


123. Graph Propagated Projection Unlearning: A Unified Framework for Vision and Audio Discriminative Models


124. Spectral Entropy Collapse as an Empirical Signature of Delayed Generalisation in Grokking


125. AgentForge: Execution-Grounded Multi-Agent LLM Framework for Autonomous Software Engineering


126. The Code Whisperer: LLM and Graph-Based AI for Smell and Vulnerability Resolution


127. Applying an Agentic Coding Tool for Improving Published Algorithm Implementations


128. Formal Architecture Descriptors as Navigation Primitives for AI Coding Agents


129. Can Coding Agents Be General Agents?


130. CCCE: A Continuous Code Calibration Engine for Autonomous Enterprise Codebase Maintenance via Knowledge Graph Traversal and Adaptive Decision Gating


131. Building Trust in the Skies: A Knowledge-Grounded LLM-based Framework for Aviation Safety


132. Contract-Coding: Towards Repo-Level Generation via Structured Symbolic Paradigm


133. ECM Contracts: Contract-Aware, Versioned, and Governable Capability Interfaces for Embodied Agents


134. Design Conditions for Intra-Group Learning of Sequence-Level Rewards: Token Gradient Cancellation


135. Adaptive Memory Crystallization for Autonomous AI Agent Learning in Dynamic Environments


136. The Long Delay to Arithmetic Generalization: When Learned Representations Outrun Behavior


137. Sparse Goodness: How Selective Measurement Transforms Forward-Forward Learning


138. Alignment as Institutional Design: From Behavioral Correction to Transaction Structure in Intelligent Systems


139. Document-tuning for robust alignment to animals


140. DeEscalWild: A Real-World Benchmark for Automated De-Escalation Training with SLMs


141. OmniTrace: A Unified Framework for Generation-Time Attribution in Omni-Modal LLMs


142. LiveClawBench: Benchmarking LLM Agents on Complex, Real-World Assistant Tasks


143. EVE: A Domain-Specific LLM Framework for Earth Intelligence


144. Curation of a Palaeohispanic Dataset for Machine Learning


145. Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data


146. Correct Chains, Wrong Answers: Dissociating Reasoning from Output in LLM Logic


147. Bi-Predictability: A Real-Time Signal for Monitoring LLM Interaction Integrity


148. A Proactive EMR Assistant for Doctor-Patient Dialogue: Streaming ASR, Belief Stabilization, and Preliminary Controlled Evaluation


149. Text-as-Signal: Quantitative Semantic Scoring with Embeddings, Logprobs, and Noise Reduction


150. WorkRB: A Community-Driven Evaluation Framework for AI in the Work Domain


151. Caption First, VQA Second: Knowledge Density, Not Task Format, Drives Multimodal Scaling


152. Form Without Function: Agent Social Behavior in the Moltbook Network


153. Hijacking online reviews: sparse manipulation and behavioral buffering in popularity-biased rating systems


154. From Natural Language to PromQL: A Catalog-Driven Framework with Dynamic Temporal Resolution for Cloud-Native Observability


155. Integration of Deep Reinforcement Learning and Agent-based Simulation to Explore Strategies Counteracting Information Disorder


156. A Pythonic Functional Approach for Semantic Data Harmonisation in the ILIAD Project


157. TableNet A Large-Scale Table Dataset with LLM-Powered Autonomous


158. OVT-MLCS: An Online Visual Tool for MLCS Mining from Long or Big Sequences


159. When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation