All AI Papers - 2025-09-29

1. Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective


2. Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time


3. UniMIC: Token-Based Multimodal Interactive Coding for Human-AI Collaboration


4. StepORLM: A Self-Evolving Framework With Generative Process Supervision For Operations Research Language Models


5. The Emergence of Altruism in Large-Language-Model Agents Society


6. REMA: A Unified Reasoning Manifold Framework for Interpreting Large Language Model


7. TrueGradeAI: Retrieval-Augmented and Bias-Resistant AI for Transparent and Explainable Digital Assessments


8. Estimating the Empowerment of Language Model Agents


9. InfiAgent: Self-Evolving Pyramid Agent Framework for Infinite Scenarios


10. GeoSketch: A Neural-Symbolic Approach to Geometric Multimodal Reasoning with Auxiliary Line Construction and Affine Transformation


11. Guiding Evolution of Artificial Life Using Vision-Language Models


12. EMMA: Generalizing Real-World Robot Manipulation via Generative Visual Transfer


13. Do LLM Agents Know How to Ground, Recover, and Assess? A Benchmark for Epistemic Competence in Information-Seeking Agents


14. PRIME: Planning and Retrieval-Integrated Memory for Enhanced Reasoning


15. Large Language Models as Nondeterministic Causal Models


16. Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models


17. InfiMed-Foundation: Pioneering Advanced Multimodal Medical Models with Compute-Efficient Pre-Training and Multi-Stage Fine-Tuning


18. Evaluating LLMs for Combinatorial Optimization: One-Phase and Two-Phase Heuristics for 2D Bin-Packing


19. Clinical Uncertainty Impacts Machine Learning Evaluations


20. Log2Plan: An Adaptive GUI Automation Framework Integrated with Task Mining Approach


21. Ground-Truthing AI Energy Consumption: Validating CodeCarbon Against External Measurements


22. Generalizing Multi-Objective Search via Objective-Aggregation Functions


23. A2R: An Asymmetric Two-Stage Reasoning Framework for Parallel Reasoning


24. The Thinking Spectrum: An Empirical Study of Tunable Reasoning in LLMs through Model Merging


25. GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments


26. Bilinear relational structure fixes reversal curse and enables consistent model editing


27. RISK: A Framework for GUI Agents in E-commerce Risk Management


28. CoBel-World: Harnessing LLM Reasoning to Build a Collaborative Belief World for Optimizing Embodied Multi-Agent Collaboration


29. Outlier Detection in Plantar Pressure: Human-Centered Comparison of Statistical Parametric Mapping and Explainable Machine Learning


30. DyRo-MCTS: A Robust Monte Carlo Tree Search Approach to Dynamic Job Shop Scheduling


31. GenesisGeo: Technical Report


32. TRACE: Learning to Compute on Graphs


33. Reimagining Agent-based Modeling with Large Language Model Agents via Shachi


34. DeepTravel: An End-to-End Agentic Reinforcement Learning Framework for Autonomous Travel Planning Agents


35. Axiomatic Choice and the Decision-Evaluation Paradox


36. DS-STAR: Data Science Agent via Iterative Planning and Verification


37. ProRe: A Proactive Reward System for GUI Agents via Reasoner-Actor Collaboration


38. D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents


39. Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety


40. UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios


41. Lifelong Learning with Behavior Consolidation for Vehicle Routing


42. Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts


43. Align2Speak: Improving TTS for Low Resource Languages via ASR-Guided Online Preference Optimization


44. Can AI Perceive Physical Danger and Intervene?


45. Semantic F1 Scores: Fair Evaluation Under Fuzzy Class Boundaries


46. Automated and Interpretable Survival Analysis from Multimodal Data


47. GeoEvolve: Automating Geospatial Model Discovery via Multi-Agent Large Language Models


48. EEG-Based Consumer Behaviour Prediction: An Exploration from Classical Machine Learning to Graph Neural Networks


49. AutoClimDS: Climate Data Science Agentic AI – A Knowledge Graph is All You Need


50. Correct Reasoning Paths Visit Shared Decision Pivots


51. Towards mitigating information leakage when evaluating safety monitors


52. See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation


53. VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing


54. Toward a Physics of Deep Learning and Brains


55. CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning


56. Learning Human-Perceived Fakeness in AI-Generated Videos via Multimodal LLMs


57. Hierarchical Representation Matching for CLIP-based Class-Incremental Learning


58. WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning


59. Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity


60. Language Models Can Learn from Verbal Feedback Without Scalar Rewards


61. Variational Reasoning for Language Models


62. Towards Efficient Online Exploration for Reinforcement Learning with Human Feedback


63. StateX: Enhancing RNN Recall via Post-training State Expansion


64. Learning Admissible Heuristics for A*: Theory and Practice


65. A Theoretical Analysis of Discrete Flow Matching Generative Models


66. IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning


67. Vision-Language Alignment from Compressed Image Representations using 2D Gaussian Splatting


68. Quantile Advantage Estimation for Entropy-Safe Reasoning


69. Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning


70. From Parameters to Behavior: Unsupervised Compression of the Policy Space


71. Retrieval-Augmented Guardrails for AI-Drafted Patient-Portal Messages: Error Taxonomy Construction and Large-Scale Evaluation


72. Activation Function Design Sustains Plasticity in Continual Learning


73. ConQuER: Modular Architectures for Control and Bias Mitigation in IQP Quantum Generative Models


74. Does AI Coaching Prepare us for Workplace Negotiations?


75. InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models


76. Mental Health Impacts of AI Companions: Triangulating Social Media Quasi-Experiments, User Perspectives, and Relational Theory


77. Ontological foundations for contrastive explanatory narration of robot plans


78. A Machine Learning Pipeline for Multiple Sclerosis Biomarker Discovery: Comparing explainable AI and Traditional Statistical Approaches


79. OFMU: Optimization-Driven Framework for Machine Unlearning


80. Exploring Solution Divergence and Its Effect on Large Language Model Problem Solving



82. Learning the Neighborhood: Contrast-Free Multimodal Self-Supervised Molecular Graph Pretraining


83. MDAR: A Multi-scene Dynamic Audio Reasoning Benchmark


84. Physics-informed GNN for medium-high voltage AC power flow with edge-aware attention and line search correction operator


85. Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers


86. Learning to Ball: Composing Policies for Long-Horizon Basketball Moves


87. Chimera: Diagnosing Shortcut Learning in Visual-Language Understanding


88. Global Convergence in Neural ODEs: Impact of Activation Functions


89. An Ontology for Unified Modeling of Tasks, Actions, Environments, and Capabilities in Personal Service Robotics


90. Partial Parameter Updates for Efficient Distributed Training


91. Explaining multimodal LLMs via intra-modal token interactions


92. RAU: Reference-based Anatomical Understanding with Vision Language Models


93. Deep Learning-Based Cross-Anatomy CT Synthesis Using Adapted nnResU-Net with Anatomical Feature Prioritized Loss


94. SpinGPT: A Large-Language-Model Approach to Playing Poker Correctly


95. Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach


96. What Is The Political Content in LLMs’ Pre- and Post-Training Data?


97. CHRONOBERG: Capturing Language Evolution and Temporal Awareness in Foundation Models


98. Forecasting the Future with Yesterday’s Climate: Temperature Bias in AI Weather and Climate Models


99. Stochastic activations


100. Context and Diversity Matter: The Emergence of In-Context Learning in World Models


101. SurvDiff: A Diffusion Model for Generating Synthetic Data in Survival Analysis


102. Transformers Can Learn Connectivity in Some Graphs but Not Others


103. Advancing Natural Language Formalization to First Order Logic with Fine-tuned LLMs


104. Spectral Collapse Drives Loss of Plasticity in Deep Continual Learning


105. Pedestrian Attribute Recognition via Hierarchical Cross-Modality HyperGraph Learning


106. Progressive Weight Loading: Accelerating Initial Inference and Gradually Boosting Performance on Resource-Constrained Environments


107. Adaptive Policy Backbone via Shared Network


108. HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models


109. HEAPr: Hessian-based Efficient Atomic Expert Pruning in Output Space


110. Jailbreaking on Text-to-Video Models via Scene Splitting Strategy


111. Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection?


112. Leveraging Large Language Models for Robot-Assisted Learning of Morphological Structures in Preschool Children with Language Vulnerabilities


113. A Global Analysis of Cyber Threats to the Energy Sector: “Currents of Conflict” from a Geopolitical Perspective


114. Wavelet-Induced Rotary Encodings: RoPE Meets Graphs


115. Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks


116. Secure and Efficient Access Control for Computer-Use Agents via Context Space


117. Beyond Textual Context: Structural Graph Encoding with Adaptive Space Alignment to alleviate the hallucination of LLMs


118. Safety Compliance: Rethinking LLM Safety Reasoning through the Lens of Compliance


119. ASSESS: A Semantic and Structural Evaluation Framework for Statement Similarity


120. FeatBench: Evaluating Coding Agents on Feature Implementation for Vibe Coding


121. Fairness-Aware Reinforcement Learning (FAReL): A Framework for Transparent and Balanced Sequential Decision-Making


122. Polysemous Language Gaussian Splatting via Matching-based Mask Lifting


123. Thinking in Many Modes: How Composite Reasoning Elevates Large Language Model Performance with Limited Data


124. Rigidity-Aware 3D Gaussian Deformation from a Single Image


125. Automatic Discovery of One Parameter Subgroups of $SO(n)$


126. VizGen: Data Exploration and Visualization from Natural Language via a Multi-Agent AI Architecture


127. Impact of Collective Behaviors of Autonomous Vehicles on Urban Traffic Dynamics: A Multi-Agent Reinforcement Learning Approach


128. Question-Driven Analysis and Synthesis: Building Interpretable Thematic Trees with LLMs for Text Clustering and Controllable Generation


129. Reversible GNS for Dissipative Fluids with Consistent Bidirectional Dynamics


130. The Outputs of Large Language Models are Meaningless


131. MimicDreamer: Aligning Human and Robot Demonstrations for Scalable VLA Training


132. Learning Equivariant Functions via Quadratic Forms


133. Efficiency Boost in Decentralized Optimization: Reimagining Neighborhood Aggregation with Minimal Overhead


134. Teaching AI to Feel: A Collaborative, Full-Body Exploration of Emotive Communication


135. Lightweight error mitigation strategies for post-training N:M activation sparsity in LLMs


136. Pushing Toward the Simplex Vertices: A Simple Remedy for Code Collapse in Smoothed Vector Quantization


137. From Long to Lean: Performance-aware and Adaptive Chain-of-Thought Compression via Multi-round Refinement


138. REFINE-CONTROL: A Semi-supervised Distillation Method For Conditional Image Generation


139. Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding


140. R-Capsule: Compressing High-Level Plans for Efficient Large Language Model Reasoning


141. Multi-Agent Path Finding via Offline RL and LLM Collaboration


142. Universal Legal Article Prediction via Tight Collaboration between Supervised Classification Model and LLM


143. The AI_INFN Platform: Artificial Intelligence Development in the Cloud


144. Learning More with Less: A Dynamic Dual-Level Down-Sampling Framework for Efficient Policy Optimization


145. Reinforcement Learning for Durable Algorithmic Recourse


146. SecureAgentBench: Benchmarking Secure Code Generation under Realistic Vulnerability Scenarios


147. Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation


148. The Rogue Scalpel: Activation Steering Compromises LLM Safety


149. The QCET Taxonomy of Standard Quality Criterion Names and Definitions for the Evaluation of NLP Systems


150. Decoding Deception: Understanding Automatic Speech Recognition Vulnerabilities in Evasion and Poisoning Attacks


151. An Adaptive ICP LiDAR Odometry Based on Reliable Initial Pose


152. Fuzzy Reasoning Chain (FRC): An Innovative Reasoning Framework from Fuzziness to Clarity


153. Latent Diffusion: Multi-Dimension Stable Diffusion Latent Space Explorer


154. Lightweight Structured Multimodal Reasoning for Clinical Scene Understanding in Robotics


155. Black-Box Hallucination Detection via Consistency Under the Uncertain Expression


156. ERGO: Efficient High-Resolution Visual Understanding for Vision-Language Models


157. Developing Vision-Language-Action Model from Egocentric Videos


158. Hybrid Diffusion for Simultaneous Symbolic and Continuous Planning


159. Benchmarking and Mitigate Psychological Sycophancy in Medical Vision-Language Models


160. Geo-R1: Improving Few-Shot Geospatial Referring Expression Understanding with Reinforcement Fine-Tuning


161. From Superficial Outputs to Superficial Learning: Risks of Large Language Models in Education


162. No-Reference Image Contrast Assessment with Customized EfficientNet-B0


163. FlowDrive: moderated flow matching with data balancing for trajectory planning


164. Active Attacks: Red-teaming LLMs via Adaptive Environments


165. Debiasing Large Language Models in Thai Political Stance Detection via Counterfactual Calibration


166. Unveiling Many Faces of Surrogate Models for Configuration Tuning: A Fitness Landscape Analysis Perspective


167. SemanticControl: A Training-Free Approach for Handling Loosely Aligned Visual Conditions in ControlNet


168. Why Chain of Thought Fails in Clinical Text Understanding


169. SAGE: Scene Graph-Aware Guidance and Execution for Long-Horizon Manipulation Tasks


170. Generation Properties of Stochastic Interpolation under Finite Training Set


171. EqDiff-CT: Equivariant Conditional Diffusion model for CT Image Synthesis from CBCT


172. AutoSCORE: Enhancing Automated Scoring with Multi-Agent Large Language Models via Structured Component Recognition


173. A Large-Scale Dataset and Citation Intent Classification in Turkish with LLMs


174. Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts


175. You Can’t Steal Nothing: Mitigating Prompt Leakages in LLMs via System Vectors


176. Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards


177. No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping


178. Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization


179. Enhancing Low-Rank Adaptation with Structured Nonlinear Transformations


180. Graph of Agents: Principled Long Context Modeling by Emergent Multi-Agent Collaboration


181. Beyond Johnson-Lindenstrauss: Uniform Bounds for Sketched Bilinear Forms


182. Can Large Language Models Autoformalize Kinematics?


183. DiTraj: training-free trajectory control for video diffusion transformer


184. ChaosNexus: A Foundation Model for Universal Chaotic System Forecasting with Multi-scale Representations


185. Evaluating and Improving Cultural Awareness of Reward Models for LLM Alignment


186. FastGRPO: Accelerating Policy Optimization via Concurrency-aware Speculative Decoding and Online Draft Learning


187. Unbiased Binning: Fairness-aware Attribute Representation


188. Beyond Structure: Invariant Crystal Property Prediction with Pseudo-Particle Ray Diffraction


189. Backdoor Attribution: Elucidating and Controlling Backdoor in Language Models


190. SubZeroCore: A Submodular Approach with Zero Training for Coreset Selection


191. HyperCore: Coreset Selection under Noise via Hypersphere Models


192. Brain PathoGraph Learning


193. Self-Speculative Biased Decoding for Faster Live Translation


194. LFA-Net: A Lightweight Network with LiteFusion Attention for Retinal Vessel Segmentation


195. POLO: Preference-Guided Multi-Turn Reinforcement Learning for Lead Optimization


196. Uncovering Alzheimer’s Disease Progression via SDE-based Spatio-Temporal Graph Deep Learning on Longitudinal Brain Networks


197. UISim: An Interactive Image-Based UI Simulator for Dynamic Mobile Environments


198. Developing Strategies to Increase Capacity in AI Education


199. Not My Agent, Not My Boundary? Elicitation of Personal Privacy Boundaries in AI-Delegated Information Sharing


200. Optimizing the non-Clifford-count in unitary synthesis using Reinforcement Learning


201. QueryGym: Step-by-Step Interaction with Relational Databases


202. SlotFM: A Motion Foundation Model with Slot Attention for Diverse Downstream Tasks


203. MORPH: Shape-agnostic PDE Foundation Models


204. DIM: Enforcing Domain-Informed Monotonicity in Deep Neural Networks


205. Logic of Hypotheses: from Zero to Full Knowledge in Neurosymbolic Integration


206. Limitations on Safe, Trusted, Artificial General Intelligence


207. MobiLLM: An Agentic AI Framework for Closed-Loop Threat Mitigation in 6G Open RANs


208. InvBench: Can LLMs Accelerate Program Verification with Invariant Synthesis?


209. A Data-driven Typology of Vision Models from Integrated Representational Metrics


210. Guiding Audio Editing with Audio Language Model


211. OjaKV: Context-Aware Online Low-Rank KV Cache Compression with Oja’s Rule


212. LANCE: Low Rank Activation Compression for Efficient On-Device Continual Learning


213. Multi-Objective Reinforcement Learning for Large Language Model Optimization: Visionary Perspective


214. Temporal vs. Spatial: Comparing DINOv3 and V-JEPA2 Feature Representations for Video Action Analysis


215. What Happens Next? Anticipating Future Motion by Generating Point Trajectories


216. Enhancing Contrastive Learning for Geolocalization by Discovering Hard Negatives on Semivariograms


217. No Alignment Needed for Generation: Learning Linearly Separable Representations in Diffusion Models


218. Domain-Aware Speaker Diarization On African-Accented English


219. Psychological and behavioural responses in human-agent vs. human-human interactions: a systematic review and meta-analysis


220. Agribot: agriculture-specific question answer system


221. Preemptive Detection and Steering of LLM Misalignment via Latent Reachability


222. Shortcut Flow Matching for Speech Enhancement: Step-Invariant flows via single stage training


223. $\mathbf{Li_2}$: A Framework on Dynamics of Feature Emergence and Delayed Generalization


224. DistillKac: Few-Step Image Generation via Damped Wave Equations


225. New Algorithmic Directions in Optimal Transport and Applications for Product Spaces


226. Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training


227. Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning


228. Neural Operators for Mathematical Modeling of Transient Fluid Flow in Subsurface Reservoir Systems


229. Learning to Reason with Mixture of Tokens


230. Are Hallucinations Bad Estimations?


231. Score-based Idempotent Distillation of Diffusion Models


232. Gender Stereotypes in Professional Roles Among Saudis: An Analytical Study of AI-Generated Images Using Language Models


233. Enhanced Generative Machine Listener


234. A State-of-the-Art SQL Reasoning Model using RLVR


235. ARTI-6: Towards Six-dimensional Articulatory Speech Encoding


236. One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning


237. Foundation models for high-energy physics


238. DyME: Dynamic Multi-Concept Erasure in Diffusion Models with Bi-Level Orthogonal LoRA Adaptation


239. PhenoMoler: Phenotype-Guided Molecular Optimization via Chemistry Large Language Model


240. Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models


241. How Large Language Models Need Symbolism


242. Large AI Model-Enabled Generative Semantic Communications for Image Transmission


243. MIXRAG: Mixture-of-Experts Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering


244. Towards Adapting Federated & Quantum Machine Learning for Network Intrusion Detection: A Survey


245. Do Sparse Subnetworks Exhibit Cognitively Aligned Attention? Effects of Pruning on Saliency Map Fidelity, Sparsity, and Concept Coherence


246. Toward a Realistic Encoding Model of Auditory Affective Understanding in the Brain


247. SAEmnesia: Erasing Concepts in Diffusion Models with Sparse Autoencoders


248. Dynamic Multi-Target Fusion for Efficient Audio-Visual Navigation


249. In silico Deep Learning Protocols for Label-Free Super-Resolution Microscopy: A Comparative Study of Network Architectures and SNR Dependence


250. Automated Prompt Generation for Creative and Counterfactual Text-to-image Synthesis


251. ReGeS: Reciprocal Retrieval-Generation Synergy for Conversational Recommender Systems


252. Safety Assessment of Scaffolding on Construction Site using AI


253. Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks – A Case Study of Hsinchu, Taiwan


254. MAJORScore: A Novel Metric for Evaluating Multimodal Relevance via Joint Representation


255. A Mutual Learning Method for Salient Object Detection with intertwined Multi-Supervision–Revised


256. Context Is What You Need: The Maximum Effective Context Window for Real World Limits of LLMs


257. Multimodal Prompt Decoupling Attack on the Safety Filters in Text-to-Image Models


258. Influence Guided Context Selection for Effective Retrieval-Augmented Generation


259. MDF-MLLM: Deep Fusion Through Cross-Modal Feature Alignment for Contextually Aware Fundoscopic Image Classification


260. A Novel Differential Feature Learning for Effective Hallucination Detection and Classification


261. Phrase-grounded Fact-checking for Automatically Generated Chest X-ray Reports


262. Domain-Informed Genetic Superposition Programming: A Case Study on SFRC Beams


263. KV-Efficient VLA: A Method of Speed up Vision Language Model with RNN-Gated Chunked KV Cache


264. Random Direct Preference Optimization for Radiography Report Generation


265. SGNNBench: A Holistic Evaluation of Spiking Graph Neural Network on Large-scale Graph


266. From Embeddings to Equations: Genetic-Programming Surrogates for Interpretable Transformer Classification


267. Cycle is All You Need: More Is Different


268. Cross-Modal Retrieval with Cauchy-Schwarz Divergence


269. Seismic Velocity Inversion from Multi-Source Shot Gathers Using Deep Segmentation Networks: Benchmarking U-Net Variants and SeismoLabV3+


270. Assessment of deep learning models integrated with weather and environmental variables for wildfire spread prediction and a case study of the 2023 Maui fires


271. PIR-RAG: A System for Private Information Retrieval in Retrieval-Augmented Generation


272. From Search to Reasoning: A Five-Level RAG Capability Framework for Enterprise Data