All AI Papers - 2026-03-30

1. Stabilizing Rubric Integration Training via Decoupled Advantage Normalization


2. CADSmith: Multi-Agent CAD Generation with Programmatic Geometric Validation


3. AIRA_2: Overcoming Bottlenecks in AI Research Agents


4. GUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play Annotation


5. Semi-Automated Knowledge Engineering and Process Mapping for Total Airport Management


6. AutoB2G: A Large Language Model-Driven Agentic Framework For Automated Building-Grid Co-Simulation


7. BeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional Environments


8. Ruka-v2: Tendon Driven Open-Source Dexterous Hand with Wrist and Abduction for Robot Learning


9. PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning


10. Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification


11. Make Geometry Matter for Spatial Reasoning


12. Machine Learning Transferability for Malware Detection


13. Think over Trajectories: Leveraging Video Generation to Reconstruct GPS Trajectories from Cellular Signaling


14. Sustainability Is Not Linear: Quantifying Performance, Energy, and Privacy Trade-offs in On-Device Intelligence


15. Evaluating Interactive 2D Visualization as a Sample Selection Strategy for Biomedical Time-Series Data Annotation


16. Generation Is Compression: Zero-Shot Video Coding via Stochastic Rectified Flow


17. Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering


18. When Perplexity Lies: Generation-Focused Distillation of Hybrid Sequence Models


19. Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones


20. The Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem: Exact and Heuristic Approaches


21. How Open Must Language Models be to Enable Reliable Scientific Inference?


22. ALBA: A European Portuguese Benchmark for Evaluating Language and Linguistic Dimensions in Generative LLMs


23. JAL-Turn: Joint Acoustic-Linguistic Modeling for Real-Time and Robust Turn-Taking Detection in Full-Duplex Spoken Dialogue Systems


24. AMALIA Technical Report: A Fully Open Source Large Language Model for European Portuguese


25. Rocks, Pebbles and Sand: Modality-aware Scheduling for Multimodal Large Language Model Inference


26. Foundation Model for Cardiac Time Series via Masked Latent Attention


27. UNIFERENCE: A Discrete Event Simulation Framework for Developing Distributed AI Models


28. A Boltzmann-machine-enhanced Transformer For DNA Sequence Classification


29. Neuro-Symbolic Process Anomaly Detection


30. Can AI Models Direct Each Other? Organizational Structure as a Probe into Training Limitations


31. CPUBone: Efficient Vision Backbone Design for Devices with Low Parallelization Capabilities


32. KMM-CP: Practical Conformal Prediction under Covariate Shift via Selective Kernel Mean Matching


33. Why Models Know But Don’t Say: Chain-of-Thought Faithfulness Divergence Between Thinking Tokens and Answers in Open-Weight Reasoning Models


34. Generative Modeling in Protein Design: Neural Representations, Conditional Generation, and Evaluation Standards


35. Automated near-term quantum algorithm discovery for molecular ground states


36. Generative Score Inference for Multimodal Data


37. Reflect to Inform: Boosting Multimodal Reasoning via Information-Gain-Driven Verification



39. Mitigating the Reasoning Tax in Vision-Language Fine-Tuning with Input-Adaptive Depth Aggregation


40. PRISMA: Toward a Normative Information Infrastructure for Responsible Pharmaceutical Knowledge Management


41. From Human Cognition to Neural Activations: Probing the Computational Primitives of Spatial Reasoning in LLMs


42. Label-Free Cross-Task LoRA Merging with Null-Space Compression


43. Preference-Aligned LoRA Merging: Preserving Subspace Coverage and Addressing Directional Anisotropy


44. findsylls: A Language-Agnostic Toolkit for Syllable-Level Speech Tokenization and Embedding


45. PhysVid: Physics Aware Local Conditioning for Generative Video Models


46. Knowdit: Agentic Smart Contract Vulnerability Detection with Auditing Knowledge Summarization


47. GeoGuide: Hierarchical Geometric Guidance for Open-Vocabulary 3D Semantic Segmentation


48. Working Notes on Late Interaction Dynamics: Analyzing Targeted Behaviors of Late Interaction Models


49. ARTA: Adaptive Mixed-Resolution Token Allocation for Efficient Dense Feature Extraction


50. Channelling, Coordinating, Collaborating: A Three-Layer Framework for Disability-Centered Human-Agent Collaboration


51. Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan


52. Distilling Conversations: Abstract Compression of Conversational Audio Context for LLM-based ASR


53. Physics-Informed Neural Networks and Sequence Encoder: Application to heating and early cooling of thermo-stamping process


54. Automating Domain-Driven Design: Experience with a Prompting Framework


55. Clawed and Dangerous: Can We Trust Open Agentic Systems?


56. Towards GUI Agents: Vision-Language Diffusion Models for GUI Grounding


57. Sparse Auto-Encoders and Holism about Large Language Models


58. An Object Web Seminar: A Retrospective on a Technical Dialogue Still Reverberating


59. MemCam: Memory-Augmented Camera Control for Consistent Video Generation


60. Progressive Learning with Anatomical Priors for Reliable Left Atrial Scar Segmentation from Late Gadolinium Enhancement MRI


61. On the Complexity of Optimal Graph Rewiring for Oversmoothing and Oversquashing in Graph Neural Networks


62. A Time-Consistent Benchmark for Repository-Level Software Engineering Evaluation


63. SWE-PRBench: Benchmarking AI Code Review Quality Against Pull Request Feedback


64. Finding Distributed Object-Centric Properties in Self-Supervised Transformers


65. SkinGPT-X: A Self-Evolving Collaborative Multi-Agent System for Transparent and Trustworthy Dermatological Diagnosis


66. DPD-Cancer: Explainable Graph-based Deep Learning for Small Molecule Anti-Cancer Activity Prediction


67. “Oops! ChatGPT is Temporarily Unavailable!”: A Diary Study on Knowledge Workers’ Experiences of LLM Withdrawal


68. A Human-Inspired Decoupled Architecture for Efficient Audio Representation Learning


69. Dynamic Tokenization via Reinforcement Patching: End-to-end Training and Zero-shot Transfer


70. Selective Deficits in LLM Mental Self-Modeling in a Behavior-Based Test of Theory of Mind


71. When Identities Collapse: A Stress-Test Benchmark for Multi-Subject Personalization


72. R-PGA: Robust Physical Adversarial Camouflage Generation via Relightable 3D Gaussian Splatting


73. MuDD: A Multimodal Deception Detection Dataset and GSR-Guided Progressive Distillation for Non-Contact Deception Detection


74. Bridging Pixels and Words: Mask-Aware Local Semantic Fusion for Multimodal Media Verification


75. Seeing Like Radiologists: Context- and Gaze-Guided Vision-Language Pretraining for Chest X-rays


76. H-Node Attack and Defense in Large Language Models


77. Designing Fatigue-Aware VR Interfaces via Biomechanical Models


78. Unlabeled Cross-Center Automatic Analysis for TAAD: An Integrated Framework from Segmentation to Clinical Features


79. VLAgeBench: Benchmarking Large Vision-Language Models for Zero-Shot Human Age Estimation


80. FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants


81. Longitudinal Boundary Sharpness Coefficient Slopes Predict Time to Alzheimer’s Disease Conversion in Mild Cognitive Impairment: A Survival Analysis Using the ADNI Cohort


82. Policy-Guided World Model Planning for Language-Conditioned Visual Navigation


83. Do Neurons Dream of Primitive Operators? Wake-Sleep Compression Rediscovers Schank’s Event Semantics


84. When Chain-of-Thought Backfires: Evaluating Prompt Sensitivity in Medical Language Models


85. Collision-Aware Vision-Language Learning for End-to-End Driving with Multimodal Infraction Datasets



87. Reinforcing Structured Chain-of-Thought for Video Understanding


88. DenseSwinV2: Channel Attentive Dual Branch CNN Transformer Learning for Cassava Leaf Disease Classification


89. DiReCT: Disentangled Regularization of Contrastive Trajectories for Physics-Refined Video Generation


90. Good Scores, Bad Data: A Metric for Multimodal Coherence


91. Decoding Defensive Coverage Responsibilities in American Football Using Factorized Attention Based Transformer Models


92. On Integrating Resilience and Human Oversight into LLM-Assisted Modeling Workflows for Digital Twins


93. Spectral Coherence Index: A Model-Free Metric for Protein Structural Ensemble Quality Assessment


94. GUIDE: A Benchmark for Understanding and Assisting Users in Open-Ended GUI Tasks


95. Dynamic LIBRAS Gesture Recognition via CNN over Spatiotemporal Matrix Representation


96. Methods for Knowledge Graph Construction from Text Collections: Development and Applications


97. Why Safety Probes Catch Liars But Miss Fanatics


98. GazeQwen: Lightweight Gaze-Conditioned LLM Modulation for Streaming Video Understanding


99. A Compression Perspective on Simplicity Bias


100. ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners?


101. Doctorina MedBench: End-to-End Evaluation of Agent-Based Medical AI


102. MAGNET: Autonomous Expert Model Generation via Decentralized Autoresearch and BitNet Training


103. Beyond identifiability: Learning causal representations with few environments and finite samples


104. Pure and Physics-Guided Deep Learning Solutions for Spatio-Temporal Groundwater Level Prediction at Arbitrary Locations


105. Challenges and opportunities for AI to help deliver fusion energy


106. Empowering Epidemic Response: The Role of Reinforcement Learning in Infectious Disease Control


107. ReCUBE: Evaluating Repository-Level Context Utilization in Code Generation


108. IncreRTL: Traceability-Guided Incremental RTL Generation under Requirement Evolution


109. UCAgent: An End-to-End Agent for Block-Level Functional Verification


110. Unlocking Strong Supervision: A Data-Centric Study of General-Purpose Audio Pre-Training Methods


111. ETA-VLA: Efficient Token Adaptation via Temporal Fusion and Intra-LLM Sparsification for Vision-Language-Action Models


112. Consistency Amplifies: How Behavioral Variance Shapes Agent Accuracy


113. CANGuard: A Spatio-Temporal CNN-GRU-Attention Hybrid Architecture for Intrusion Detection in In-Vehicle CAN Networks


114. A-SelecT: Automatic Timestep Selection for Diffusion Transformer Representation Learning


115. Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models


116. A Lightweight, Transferable, and Self-Adaptive Framework for Intelligent DC Arc-Fault Detection in Photovoltaic Systems


117. DesignWeaver: Dimensional Scaffolding for Text-to-Image Product Design