All AI Papers - 2026-02-25

1. Aletheia tackles FirstProof autonomously


2. NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning


3. CG-DMER: Hybrid Contrastive-Generative Framework for Disentangled Multimodal ECG Representation Learning


4. A Benchmark for Deep Information Synthesis


5. The Initial Exploration Problem in Knowledge Graph Exploration


6. Motivation is Something You Need


7. Tool Building as a Path to “Superintelligence”


8. LogicGraph: Benchmarking Multi-Path Logical Reasoning via Neuro-Symbolic Generation and Verification


9. Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence


10. HELP: HyperNode Expansion and Logical Path-Guided Evidence Localization for Accurate and Efficient GraphRAG


11. Predicting Sentence Acceptability Judgments in Multimodal Contexts


12. Diagnosing Causal Reasoning in Vision-Language Models via Structured Relevance Graphs


13. Pressure Reveals Character: Behavioural Alignment Evaluation at Depth


14. Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset


15. POMDPPlanners: Open-Source Package for POMDP Planning


16. Pipeline for Verifying LLM-Generated Mathematical Solutions


17. PyVision-RL: Forging Open Agentic Vision Models via RL


18. CHESS: Context-aware Hierarchical Efficient Semantic Selection for Long-Context LLM Inference


19. Balancing Multiple Objectives in Urban Traffic Control with Reinforcement Learning from AI Feedback


20. Modality-Guided Mixture of Graph Experts with Entropy-Triggered Routing for Multimodal Recommendation


21. Buffer Matters: Unleashing the Power of Off-Policy Reinforcement Learning in Large Language Model Reasoning


22. Counterfactual Simulation Training for Chain-of-Thought Faithfulness


23. ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction


24. Online Algorithms with Unreliable Guidance


25. PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding


26. How Foundational Skills Influence VLM-based Embodied Agents: A Native Perspective


27. Recursive Belief Vision Language Model


28. Grounding LLMs in Scientific Discovery via Embodied Actions


29. Identifying two piecewise linear additive value functions from anonymous preference information


30. When can we trust untrusted monitoring? A safety case sketch across collusion strategies


31. Physics-based phenomenological characterization of cross-modal bias in multimodal models


32. CausalReasoningBenchmark: A Real-World Benchmark for Disentangled Evaluation of Causal Identification and Estimation


33. From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation in Production


34. Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination


35. ActionEngine: From Reactive to Programmatic GUI Agents via State Machine Memory


36. KairosVL: Orchestrating Time Series and Semantics for Unified Reasoning


37. PreScience: A Benchmark for Forecasting Scientific Contributions


38. Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use


39. Implicit Intelligence – Evaluating Agents on What Users Don’t Say


40. Diffusion Modulation via Environment Mechanism Modeling for Planning


41. DMCD: Semantic-Statistical Framework for Causal Discovery


42. An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes using large language models


43. Multilevel Determinants of Overweight and Obesity Among U.S. Children Aged 10-17: Comparative Evaluation of Statistical and Machine Learning Approaches Using the 2021 National Survey of Children’s Health


44. Test-Time Training with KV Binding Is Secretly Linear Attention


45. Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs


46. Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training


47. XMorph: Explainable Brain Tumor Analysis Via LLM-Assisted Hybrid Deep Intelligence


48. Efficient Hierarchical Any-Angle Path Planning on Multi-Resolution 3D Grids


49. PVminer: A Domain-Specific Tool to Detect the Patient Voice in Patient Generated Data


50. SparkMe: Adaptive Semi-Structured Interviewing for Qualitative Insight Discovery


51. “Are You Sure?”: An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems


52. Cooperative-Competitive Team Play of Real-World Craft Robots


53. Attention-Based SINR Estimation in User-Centric Non-Terrestrial Networks


54. Probing Graph Neural Network Activation Patterns Through Graph Topology


55. Localized Dynamics-Aware Domain Adaption for Off-Dynamics Offline Reinforcement Learning


56. VAUQ: Vision-Aware Uncertainty Quantification for LVLM Self-Evaluation


57. Position-Aware Sequential Attention for Accurate Next Item Recommendations


58. MIP Candy: A Modular PyTorch Framework for Medical Image Processing


59. Multimodal MRI Report Findings Supervised Brain Lesion Segmentation with Substructures


60. Echoes Over Time: Unlocking Length Generalization in Video-to-Audio Generation Models


61. CrystaL: Spontaneous Emergence of Visual Latents in MLLMs


62. Toward an Agentic Infused Software Ecosystem


63. Does Order Matter: Connecting The Law of Robustness to Robust Generalization


64. Training-Free Intelligibility-Guided Observation Addition for Noisy ASR


65. EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations


66. See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis


67. Some Simple Economics of AGI


68. The Art of Efficient Reasoning: Data, Reward, and Optimization


69. Airavat: An Agentic Framework for Internet Measurement


70. E-MMKGR: A Unified Multimodal Knowledge Graph Framework for E-commerce Applications


71. SoK: Agentic Skills – Beyond Tool Use in LLM Agents


72. Regret-Guided Search Control for Efficient Learning in AlphaZero


73. OrthoDiffusion: A Generalizable Multi-Task Diffusion Foundation Model for Musculoskeletal MRI Interpretation


74. SibylSense: Adaptive Rubric Learning via Memory Tuning and Adversarial Probing


75. Voices of the Mountains: Deep Learning-Based Vocal Error Detection System for Kurdish Maqams


76. RMIT-ADM+S at the MMU-RAG NeurIPS 2025 Competition


77. Communication-Inspired Tokenization for Structured Image Representations


78. AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs


79. Onboard-Targeted Segmentation of Straylight in Space Camera Sensors


80. Agile V: A Compliance-Ready Framework for AI-Augmented Engineering – From Concept to Audit-Ready Delivery


81. UrbanFM: Scaling Urban Spatio-Temporal Foundation Models


82. PRECTR-V2: Unified Relevance-CTR Framework with Cross-User Preference Mining, Exposure Bias Correction, and LLM-Distilled Encoder Optimization


83. CAMEL: Confidence-Gated Reflection for Reward Modeling


84. Vision-Language Models for Ergonomic Assessment of Manual Lifting Tasks: Estimating Horizontal and Vertical Hand Distances from RGB Video


85. Dataset Color Quantization: A Training-Oriented Framework for Dataset-Level Compression


86. TrajGPT-R: Generating Urban Mobility Trajectory with Reinforcement Learning-Enhanced Generative Pre-trained Transformer


87. SurgAtt-Tracker: Online Surgical Attention Tracking via Temporal Proposal Reranking and Motion-Aware Refinement


88. Enhancing Hate Speech Detection on Social Media: A Comparative Analysis of Machine Learning Models and Text Transformation Approaches


89. OptiLeak: Efficient Prompt Reconstruction via Reinforcement Learning in Multi-tenant LLM Services


90. Personal Information Parroting in Language Models


91. What Drives Students’ Use of AI Chatbots? Technology Acceptance in Conversational AI


92. Maximin Share Guarantees via Limited Cost-Sensitive Sharing


93. Actor-Curator: Co-adaptive Curriculum Learning via Policy-Improvement Bandits for RL Post-Training


94. A Generalized Apprenticeship Learning Framework for Capturing Evolving Student Pedagogical Strategies


95. How Do Inpainting Artifacts Propagate to Language?


96. LESA: Learnable Stage-Aware Predictors for Diffusion Model Acceleration


97. Wireless Federated Multi-Task LLM Fine-Tuning via Sparse-and-Orthogonal LoRA


98. Hybrid LLM-Embedded Dialogue Agents for Learner Reflection: Designing Responsive and Theory-Driven Interactions


99. VINA: Variational Invertible Neural Architectures


100. Elimination-compensation pruning for fully-connected neural networks


101. Protein Language Models Diverge from Natural Language: Comparative Analysis and Improved Inference


102. Imputation of Unknown Missingness in Sparse Electronic Health Records


103. Examining and Addressing Barriers to Diversity in LLM-Generated Ideas


104. Three Concrete Challenges and Two Hopes for the Safety of Unsupervised Elicitation


105. Case-Aware LLM-as-a-Judge Evaluation for Enterprise-Scale RAG Systems


106. Learning During Detection: Continual Learning for Neural OFDM Receivers via DMRS


107. Hierarchical Molecular Representation Learning via Fragment-Based Self-Supervised Embedding Prediction


108. No One Size Fits All: QueryBandits for Hallucination Mitigation


109. Circuit Tracing in Vision-Language Models: Understanding the Internal Mechanisms of Multimodal Thinking


110. Learning Physical Principles from Interaction: Self-Evolving Planning via Test-Time Memory


111. Fast Spectrogram Event Extraction via Offline Self-Supervised Learning: From Fusion Diagnostics to Bioacoustics


112. Shape-informed cardiac mechanics surrogates in data-scarce regimes via geometric encoding and generative augmentation


113. What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance


114. InterviewSim: A Scalable Framework for Interview-Grounded Personality Simulation


115. Quantifying the Expectation-Realisation Gap for Agentic AI Systems


116. Uncertainty-Aware Delivery Delay Duration Prediction via Multi-Task Deep Learning


117. Exploring Anti-Aging Literature via ConvexTopics and Large Language Models


118. MultiModalPFN: Extending Prior-Data Fitted Networks for Multimodal Tabular Learning


119. What Matters for Simulation to Online Reinforcement Learning on Real Robots


120. An Approach to Combining Video and Speech with Large Language Models in Human-Robot Interaction


121. KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem


122. Right to History: A Sovereignty Kernel for Verifiable AI Agent Execution


123. CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions


124. Multimodal Crystal Flow: Any-to-Any Modality Generation for Unified Crystal Modeling


125. Model Merging in the Essential Subspace


126. Golden Layers and Where to Find Them: Improved Knowledge Editing for Large Language Models Via Layer Gradient Analysis


127. Mitigating “Epistemic Debt” in Generative AI-Scaffolded Novice Programming using Metacognitive Scripts


128. Analyzing Latency Hiding and Parallelism in an MLIR-based AI Kernel Compiler


129. Evaluating the Reliability of Digital Forensic Evidence Discovered by Large Language Model: A Case Study


130. Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation


131. IMOVNO+: A Regional Partitioning and Meta-Heuristic Ensemble Framework for Imbalanced Multi-Class Learning


132. Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning


133. OpenPort Protocol: A Security Governance Specification for AI Agent Tool Access


134. When Backdoors Go Beyond Triggers: Semantic Drift in Diffusion Models Under Encoder Attacks


135. MoBiQuant: Mixture-of-Bits Quantization for Token-Adaptive Elastic LLMs


136. AINet: Anchor Instances Learning for Regional Heterogeneity in Whole Slide Image


137. Closing the Expertise Gap in Residential Building Energy Retrofits: A Domain-Specific LLM for Informed Decision-Making


138. Enhancing Heat Sink Efficiency in MOSFETs using Physics Informed Neural Networks: A Systematic Study on Coolant Velocity Estimation


139. CAGE: A Framework for Culturally Adaptive Red-Teaming Benchmark Generation


140. Autonomous AI and Ownership Rules


141. Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Under Constrained Sensing


142. ConceptRM: The Quest to Mitigate Alert Fatigue through Consensus-Based Purity-Driven Data Cleaning for Reflection Modelling


143. Talking to Yourself: Defying Forgetting in Large Language Models



145. Interpretable Medical Image Classification using Prototype Learning and Privileged Information


146. ShaRP: Shape-Regularized Multidimensional Projections