All AI Papers - 2025-09-23

1. Attention Schema-based Attention Control (ASAC): A Cognitive-Inspired Approach for Attention Management in Transformers


2. Structured Information for Improving Spatial Relationships in Text-to-Image Generation


3. EHR-MCP: Real-world Evaluation of Clinical Information Retrieval by Large Language Models via Model Context Protocol


4. A Comparative Study of Rule-Based and Data-Driven Approaches in Industrial Monitoring


5. Building Data-Driven Occupation Taxonomies: A Bottom-Up Multi-Stage Approach via Semantic Clustering and Multi-Agent Collaboration


6. Ontology Creation and Management Tools: the Case of Anatomical Connectivity


7. A Nascent Taxonomy of Machine Learning in Intelligent Robotic Process Automation


8. CCrepairBench: A High-Fidelity Benchmark and Reinforcement Learning Framework for C++ Compilation Repair


9. MicroRCA-Agent: Microservice Root Cause Analysis Method Based on Large Language Model Agents


10. Stress Testing Deliberative Alignment for Anti-Scheming Training


11. FragmentRetro: A Quadratic Retrosynthetic Method Based on Fragmentation Algorithms


12. Diagnostics of cognitive failures in multi-agent expert systems using dynamic evaluation protocols and subsequent mutation of the processing context


13. Knowledge-Driven Hallucination in Large Language Models: An Empirical Study on Process Modeling


14. An Artificial Intelligence Driven Semantic Similarity-Based Pipeline for Rapid Literature


15. The Distribution Shift Problem in Transportation Networks using Reinforcement Learning and AI


16. KNARsack: Teaching Neural Algorithmic Reasoners to Solve Pseudo-Polynomial Problems


17. MICA: Multi-Agent Industrial Coordination Assistant


18. RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation


19. FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation


20. CultureScope: A Dimensional Lens for Probing Cultural Understanding in LLMs


21. Accelerating Atomic Fine Structure Determination with Graph Reinforcement Learning


22. Fast OTSU Thresholding Using Bisection Method


23. Robust Vision-Language Models via Tensor Decomposition: A Defense Against Adversarial Attacks


24. Network-Based Detection of Autism Spectrum Disorder Using Sustainable and Non-invasive Salivary Biomarkers


25. DiffusionNFT: Online Diffusion Reinforcement with Forward Process


26. Beyond Pointwise Scores: Decomposed Criteria-Based Evaluation of LLM Responses


27. See&Trek: Training-Free Spatial Prompting for Multimodal Large Language Model


28. Communications to Circulations: 3D Wind Field Retrieval and Real-Time Prediction Using 5G GNSS Signals and Deep Learning


29. Compose by Focus: Scene Graph-based Atomic Skills


30. Think, Verbalize, then Speak: Bridging Complex Thoughts and Comprehensible Speech


31. Session-Level Spoken Language Assessment with a Multimodal Foundation Model via Multi-Target Learning


32. AI Methods for Permutation Circuit Synthesis Across Generic Topologies


33. Fed-PISA: Federated Voice Cloning via Personalized Identity-Style Adaptation


34. Towards Sharper Object Boundaries in Self-Supervised Depth Estimation


35. EmoHeal: An End-to-End System for Personalized Therapeutic Music Retrieval from Fine-grained Emotions


36. Uncertainty-Based Smooth Policy Regularisation for Reinforcement Learning with Few Demonstrations


37. Shedding Light on Depth: Explainability Assessment in Monocular Depth Estimation


38. BEFT: Bias-Efficient Fine-Tuning of Language Models


39. RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation


40. MoE-CE: Enhancing Generalization for Deep Learning based Channel Estimation via a Mixture-of-Experts Framework


41. Explainable AI for Maritime Autonomous Surface Ships (MASS): Adaptive Interfaces and Trustworthy Human-AI Collaboration


42. Compose Yourself: Average-Velocity Flow Matching for One-Step Speech Enhancement


43. ArchesClimate: Probabilistic Decadal Ensemble Generation With Flow Matching


44. A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning


45. The Alignment Bottleneck



47. Foundation Models as World Models: A Foundational Study in Text-Based GridWorlds


48. An Equivariant Graph Network for Interpretable Nanoporous Materials Design


49. Re-FRAME the Meeting Summarization SCOPE: Fact-Based Summarization and Personalization via Questions


50. From Data to Diagnosis: A Large, Comprehensive Bone Marrow Dataset and AI Methods for Childhood Leukemia Prediction


51. MoAngelo: Motion-Aware Neural Surface Reconstruction for Dynamic Scenes


52. Distribution-Aligned Decoding for Efficient LLM Task Adaptation


53. RACap: Relation-Aware Prompting for Lightweight Retrieval-Augmented Image Captioning


54. Self-Supervised Cross-Modal Learning for Image-to-Point Cloud Registration


55. DeepMech: A Machine Learning Framework for Chemical Reaction Mechanism Prediction


56. EvoBrain: Dynamic Multi-channel EEG Graph Modeling for Time-evolving Brain Network


57. Diversity of Structured Domains via k-Kemeny Scores


58. Best-of-L: Cross-Lingual Reward Modeling for Mathematical Reasoning


59. Instance Generation for Meta-Black-Box Optimization through Latent Space Reverse Engineering


60. CIDER: A Causal Cure for Brand-Obsessed Text-to-Image Models


61. ChronoForge-RL: Chronological Forging through Reinforcement Learning for Enhanced Video Understanding


62. Hierarchical Reinforcement Learning with Low-Level MPC for Multi-Agent Control


63. Monte Carlo Tree Diffusion with Multiple Experts for Protein Design


64. CBPNet: A Continual Backpropagation Prompt Network for Alleviating Plasticity Loss on Edge Devices


65. Ideal Registration? Segmentation is All You Need


66. On Optimal Steering to Achieve Exact Fairness


67. FloorSAM: SAM-Guided Floorplan Reconstruction with Semantic-Geometric Fusion


68. GP3: A 3D Geometry-Aware Policy with Multi-View Images for Robotic Manipulation


69. Once Upon a Time: Interactive Learning for Storytelling with Small Language Models


70. SGMAGNet: A Baseline Model for 3D Cloud Phase Structure Reconstruction on a New Passive Active Satellite Benchmark


71. Saccadic Vision for Fine-Grained Visual Classification


72. KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning


73. Inference Offloading for Cost-Sensitive Binary Classification at the Edge


74. TISDiSS: A Training-Time and Inference-Time Scalable Framework for Discriminative Source Separation


75. SightSound-R1: Cross-Modal Reasoning Distillation from Vision to Audio Language Models


76. Chunk Knowledge Generation Model for Enhanced Information Retrieval: A Multi-task Learning Approach


77. Toward Efficient Influence Function: Dropout as a Compression Tool


78. Information Geometry of Variational Bayes


79. Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification


80. CFDA & CLIP at TREC iKAT 2025: Enhancing Personalized Conversational Search via Query Reformulation and Rank Fusion


81. DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models


82. Momentum-constrained Hybrid Heuristic Trajectory Optimization Framework with Residual-enhanced DRL for Visually Impaired Scenarios


83. Multimodal Learning for Fake News Detection in Short Videos Using Linguistically Verified Data and Heterogeneous Modality Fusion


84. Relevance to Utility: Process-Supervised Rewrite for RAG


85. Towards Size-invariant Salient Object Detection: A Generic Evaluation and Optimization Approach


86. Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection


87. LiteLong: Resource-Efficient Long-Context Data Synthesis for LLMs



89. Reward Hacking Mitigation using Verifiable Composite Rewards


90. Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining


91. Diffusion-Based Cross-Modal Feature Extraction for Multi-Label Classification


92. GUI-ARP: Enhancing Grounding with Adaptive Region Perception for GUI Agents


93. How do Language Models Generate Slang: A Systematic Comparison between Human and Machine-Generated Slang Usages


94. The (Short-Term) Effects of Large Language Models on Unemployment and Earnings


95. Explainable AI-Enhanced Supervisory Control for Robust Multi-Agent Robotic Systems


96. SmolRGPT: Efficient Spatial Reasoning for Warehouse Environments with 600M Parameters


97. mucAI at BAREC Shared Task 2025: Towards Uncertainty Aware Arabic Readability Assessment


98. Comparing Computational Pathology Foundation Models using Representational Similarity Analysis


99. Self-supervised learning of imaging and clinical signatures using a multimodal joint-embedding predictive architecture


100. Incorporating Visual Cortical Lateral Connection Properties into CNN: Recurrent Activation and Excitatory-Inhibitory Separation


101. CAGE: Continuity-Aware edGE Network Unlocks Robust Floorplan Reconstruction


102. Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems


103. PILOT: Steering Synthetic Data Generation with Psychological & Linguistic Output Targeting


104. Implicit Kinodynamic Motion Retargeting for Human-to-humanoid Imitation Learning


105. Where Do I ‘Add the Egg’?: Exploring Agency and Ownership in AI Creative Co-Writing Systems


106. Dual-Mode Visual System for Brain-Computer Interfaces: Integrating SSVEP and P300 Responses


107. Impact of Phonetics on Speaker Identity in Adversarial Voice Attack


108. Region-Aware Deformable Convolutions


109. ORCA: Agentic Reasoning For Hallucination and Adversarial Robustness in Vision-Language Models


110. Deep learning and abstractive summarisation for radiological reports: an empirical study for adapting the PEGASUS models’ family with scarce data


111. Exploring multimodal implicit behavior learning for vehicle navigation in simulated cities


112. Generating Part-Based Global Explanations Via Correspondence


113. Efficient and Versatile Model for Multilingual Information Retrieval of Islamic Text: Development and Deployment in Real-World Scenarios


114. Recent Advancements in Microscopy Image Enhancement using Deep Learning: A Survey


115. Beyond Spurious Signals: Debiasing Multimodal Large Language Models via Counterfactual Inference and Adaptive Expert Routing


116. Emulating Human-like Adaptive Vision for Efficient and Flexible Machine Visual Perception


117. Collective Voice: Recovered-Peer Support Mediated by An LLM-Based Chatbot for Eating Disorder Recovery


118. Evaluating the Limitations of Local LLMs in Solving Complex Programming Challenges


119. Partial Column Generation with Graph Neural Networks for Team Formation and Routing


120. Large Vision Models Can Solve Mental Rotation Problems


121. PRISM: Phase-enhanced Radial-based Image Signature Mapping framework for fingerprinting AI-generated images


122. Modeling Transformers as complex networks to analyze learning dynamics


123. Autoguided Online Data Curation for Diffusion Model Training


124. IEFS-GMB: Gradient Memory Bank-Guided Feature Selection Based on Information Entropy for EEG Classification of Neurological Disorders


125. Generative AI Meets Wireless Sensing: Towards Wireless Foundation Model


126. A Multi-Scale Graph Neural Process with Cross-Drug Co-Attention for Drug-Drug Interactions Prediction


127. Emotion-Aware Speech Generation with Character-Specific Voices for Comics


128. Walk and Read Less: Improving the Efficiency of Vision-and-Language Navigation via Tuning-Free Multimodal Token Pruning


129. Causal Reasoning Elicits Controllable 3D Scene Generation


130. Synthetic bootstrapped pretraining


131. GenCAD-3D: CAD Program Generation using Multimodal Latent Space Alignment and Synthetic Dataset Balancing


132. Generating Plans for Belief-Desire-Intention (BDI) Agents Using Alternating-Time Temporal Logic (ATL)


133. ChannelFlow-Tools: A Standardized Dataset Creation Pipeline for 3D Obstructed Channel Flows


134. Pre-Forgettable Models: Prompt Learning as a Native Mechanism for Unlearning