All AI Papers - 2026-01-16

1. Automating Supply Chain Disruption Monitoring via an Agentic AI Approach


2. Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning


3. PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records


4. LLM for Large-Scale Optimization Model Auto-Formulation: A Lightweight Few-Shot Learning Approach


5. Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning


6. What Do LLM Agents Know About Their World? Task2Quiz: A Paradigm for Studying Environment Understanding


7. EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines


8. Long-term Task-oriented Agent: Proactive Long-term Intent Maintenance in Dynamic Environments


9. Monte-Carlo Tree Search with Neural Network Guidance for Lane-Free Autonomous Driving


10. Policy-Based Reinforcement Learning with Action Masking for Dynamic Job Shop Scheduling under Uncertainty: Handling Random Arrivals and Machine Failures


11. Cluster Workload Allocation: Semantic Soft Affinity Using Natural Language Processing


12. STaR: Sensitive Trajectory Regulation for Unlearning in Large Reasoning Models


13. M$^3$Searcher: Modular Multimodal Information Seeking Agency with Retrieval-Oriented Reasoning


14. $A^3$-Bench: Benchmarking Memory-Driven Scientific Reasoning via Anchor and Attractor Activation


15. RISER: Orchestrating Latent Reasoning Skills for Adaptive Activation Steering


16. Coordinated Pandemic Control with Large Language Model Agents as Policymaking Assistants


17. Efficient Paths and Dense Rewards: Probabilistic Flow Reasoning for Large Language Models


18. MAXS: Meta-Adaptive Exploration with LLM Agents


19. Position on LLM-Assisted Peer Review: Addressing Reviewer Gap through Mentoring and Feedback


20. PrivacyReasoner: Can LLM Emulate a Human-like Privacy Mind?


21. The AI Hippocampus: How Far are We From Human Memory?


22. AviationLMM: A Large Multimodal Foundation Model for Civil Aviation


23. DScheLLM: Enabling Dynamic Scheduling through a Fine-Tuned Dual-System Large Language Model


24. Programming over Thinking: Efficient and Robust Multi-Constraint Planning


25. Human-AI Co-design for Clinical Prediction Models


26. The Hierarchy of Agentic Capabilities: Evaluating Frontier Models on Realistic RL Environments


27. ART: Action-based Reasoning Task Benchmarking for Medical AI Agents


28. ConvoLearn: A Dataset of Constructivist Tutor-Student Dialogue


29. Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning


30. Value-Aware Numerical Representations for Transformer Language Models


31. ShortCoder: Knowledge-Augmented Syntax Optimization for Token-Efficient Code Generation


32. LLMs can Compress LLMs: Adaptive Pruning by Agents


33. Routing with Generated Data: Annotation-Free LLM Skill Estimation and Expert Selection


34. Disentangling Task Conflicts in Multi-Task LoRA via Orthogonal Gradient Projection


35. From Prompt to Protocol: Fast Charging Batteries with Large Language Models


36. The Promptware Kill Chain: How Prompt Injections Gradually Evolved Into a Multi-Step Malware


37. Toward Understanding Unlearning Difficulty: A Mechanistic Perspective and Circuit-Guided Difficulty Metric


38. Full Disclosure, Less Trust? How the Level of Detail about AI Use in News Writing Affects Readers’ Trust


39. CogRail: Benchmarking VLMs in Cognitive Intrusion Perception for Intelligent Railway Transportation Systems


40. DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing


41. Sim2real Image Translation Enables Viewpoint-Robust Policies from Fixed-Camera Datasets


42. Linear Complexity Self-Supervised Learning for Music Understanding with Random Quantizer


43. Information Access of the Oppressed: A Problem-Posing Framework for Envisioning Emancipatory Information Access Platforms


44. Hot-Start from Pixels: Low-Resolution Visual Tokens for Chinese Language Modeling


45. Benchmarking Post-Training Quantization of Large Language Models under Microscaling Floating Point Formats


46. Private LLM Inference on Consumer Blackwell GPUs: A Practical Guide for Cost-Effective Local Deployment in SMEs


47. Towards Realistic Synthetic Data for Automatic Drum Transcription


48. Learning Whole-Body Human-Humanoid Interaction from Human-Human Demonstrations


49. Bridging Semantic Understanding and Popularity Bias with LLMs


50. SimMerge: Learning to Select Merge Operators from Similarity Signals


51. Personalized Multimodal Feedback Using Multiple External Representations: Strategy Profiles and Learning in High School Physics


52. FairGU: Fairness-aware Graph Unlearning in Social Network


53. Searth Transformer: A Transformer Architecture Incorporating Earth’s Geospheric Physical Priors for Global Mid-Range Weather Forecasting


54. SoK: Enhancing Cryptographic Collaborative Learning with Differential Privacy


55. On the Hardness of Computing Counterfactual and Semifactual Explanations in XAI


56. Late Breaking Results: Quamba-SE: Soft-edge Quantizer for Activations in State Space Models


57. Population-Aligned Audio Reproduction With LLM-Based Equalizers


58. Improving Symbolic Translation of Language Models for Logical Reasoning


59. Where Knowledge Collides: A Mechanistic Study of Intra-Memory Knowledge Conflict in Language Models


60. Do Transformers Understand Ancient Roman Coin Motifs Better than CNNs?


61. Bias Dynamics in BabyLMs: Towards a Compute-Efficient Sandbox for Democratising Pre-Training Debiasing


62. Radiomics-Integrated Deep Learning with Hierarchical Loss for Osteosarcoma Histology Classification


63. Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception


64. Ability Transfer and Recovery via Modularized Parameters Localization


65. FairGE: Fairness-Aware Graph Encoding in Incomplete Social Networks


66. Query Languages for Machine-Learning Models


67. Frame of Reference: Addressing the Challenges of Common Ground Representation in Situational Dialogs


68. GeoRA: Geometry-Aware Low-Rank Adaptation for RLVR


69. Navigating Ethical AI Challenges in the Industrial Sector: Balancing Innovation and Responsibility


70. Improving Implicit Hate Speech Detection via a Community-Driven Multi-Agent Framework


71. Understanding or Memorizing? A Case Study of German Definite Articles in Language Models


72. On-Device Large Language Models for Sequential Recommendation


73. Blue Teaming Function-Calling Agents


74. Why not Collaborative Filtering in Dual View? Bridging Sparse and Dense Models


75. ReGraM: Region-First Knowledge Graph Reasoning for Medical Question Answering


76. Magnifying change: Rapid burn scar mapping with multi-resolution, multi-source satellite imagery


77. RIFT: Repurposing Negative Samples via Reward-Informed Fine-Tuning


78. HGATSolver: A Heterogeneous Graph Attention Solver for Fluid-Structure Interaction


79. Hybrid guided variational autoencoder for visual place recognition


80. Reward Learning through Ranking Mean Squared Error


81. GIFT: Unlocking Global Optimality in Post-Training via Finite-Temperature Gibbs Initialization


82. SpikeVAEDiff: Neural Spike-based Natural Visual Scene Reconstruction via VD-VAE and Versatile Diffusion


83. Annealed Relaxation of Speculative Decoding for Faster Autoregressive Image Generation


84. Mikasa: A Character-Driven Emotional AI Companion Inspired by Japanese Oshi Culture


85. A.X K1 Technical Report


86. ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection


87. KTCF: Actionable Recourse in Knowledge Tracing via Counterfactual Explanations for Education


88. SSVP: Synergistic Semantic-Visual Prompting for Industrial Zero-Shot Anomaly Detection


89. SkinFlow: Efficient Information Transmission for Open Dermatological Diagnosis via Dynamic Visual Encoding and Staged RL


90. Equi-ViT: Rotational Equivariant Vision Transformer for Robust Histopathology Analysis


91. Adaptive Multi-Stage Patent Claim Generation with Unified Quality Assessment


92. A Marketplace for AI-Generated Adult Content and Deepfakes


93. LP-LLM: End-to-End Real-World Degraded License Plate Text Recognition via Large Multimodal Models


94. SubTokenTest: A Practical Benchmark for Real-World Sub-token Understanding


95. MMR-GRPO: Accelerating GRPO-Style Training through Diversity-Aware Reward Reweighting


96. From Symbolic to Natural-Language Relations: Rethinking Knowledge Graph Construction in the Era of Large Language Models


97. Mi:dm 2.0 Korea-centric Bilingual Language Models


98. Is Grokking Worthwhile? Functional Analysis and Transferability of Generalization Circuits in Transformers


99. Can LLMs interpret figurative language as humans do?: surface-level vs representational similarity


100. A Decompilation-Driven Framework for Malware Detection with Large Language Models


101. Generalizable Geometric Prior and Recurrent Spiking Feature Learning for Humanoid Robot Manipulation


102. Proactively Detecting Threats: A Novel Approach Using LLMs


103. OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG


104. Meta-learning to Address Data Shift in Time Series Classification


105. TranslateGemma Technical Report


106. Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models


107. Fairness risk and its privacy-enabled solution in AI-driven robotic applications


108. PluriHarms: Benchmarking the Full Spectrum of Human Judgments on AI Harm


109. Towards a Self-Driving Trigger at the LHC: Adaptive Response in Real Time


110. Navigating Ideation Space: Decomposed Conceptual Representations for Positioning Scientific Ideas


111. XGBoost Forecasting of NEPSE Index Log Returns with Walk Forward Validation


112. Evaluating Role-Consistency in LLMs for Counselor Training


113. Attention Consistency Regularization for Interpretable Early-Exit Neural Networks


114. Bridging the Gap: Empowering Small Models in Reliable OpenACC-based Parallelization via GEPA-Optimized Prompting


115. Compressing Vision Transformers in Geospatial Transfer Learning with Manifold-Constrained Optimization


116. TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts


117. Learning Domain-Invariant Representations for Cross-Domain Image Registration via Scene-Appearance Disentanglement


118. The Illusion of Friendship: Why Generative AI Demands Unprecedented Ethical Vigilance


119. ForensicFormer: Hierarchical Multi-Scale Reasoning for Cross-Domain Image Forgery Detection


120. Semantic visually-guided acoustic highlighting with large vision-language models


121. First African Digital Humanism Summer School 2025


122. AI Deployment Authorisation: A Global Standard for Machine-Readable Governance of High-Risk Artificial Intelligence


123. Residual Cross-Modal Fusion Networks for Audio-Visual Navigation


124. R$^2$BD: A Reconstruction-Based Method for Generalizable and Efficient Detection of Fake Images



126. Bias Detection and Rotation-Robustness Mitigation in Vision-Language Models and Generative Image Models


127. Adaptive Trust Metrics for Multi-LLM Systems: Enhancing Reliability in Regulated Industries


128. Revisiting Software Engineering Education in the Era of Large Language Models: A Curriculum Adaptation and Academic Integrity Framework


129. LAUDE: LLM-Assisted Unit Test Generation and Debugging of Hardware DEsigns


130. Más contexto no es mejor. Paradoja de la dilución vectorial en RAG corporativos


131. The Inconsistency Critique: Epistemic Practices and AI Testimony About Inner States


132. PediaMind-R1: A Temperament-Aware Language Model for Personalized Early Childhood Care Reasoning via Cognitive Modeling and Preference Alignment


133. Scalable and Reliable Evaluation of AI Knowledge Retrieval Systems: RIKER and the Coherent Simulated Universe


134. Directional Attractors in LLM Reasoning: How Similarity Retrieval Steers Iterative Summarization Based Reasoning


135. No Universal Hyperbola: A Formal Disproof of the Epistemic Trade-Off Between Certainty and Scope in Symbolic and Generative AI


136. Emissions and Performance Trade-off Between Small and Large Language Models


137. Resisting Correction: How RLHF Makes Language Models Ignore External Safety Signals in Natural Conversation


138. Triples and Knowledge-Infused Embeddings for Clustering and Classification of Scientific Documents


139. Consistency-Aware Editing for Entity-level Unlearning in Language Models


140. Companion Agents: A Table-Information Mining Paradigm for Text-to-SQL


141. From Adversarial Poetry to Adversarial Tales: An Interpretability Research Agenda


142. DeliberationBench: When Do More Voices Hurt? A Controlled Study of Multi-LLM Deliberation Protocols


143. Reading or Reasoning? Format Decoupled Reinforcement Learning for Document OCR


144. Revisiting Disaggregated Large Language Model Serving for Performance and Energy Implications


145. CrowdLLM: Building LLM-Based Digital Populations Augmented with Generative Models