All AI Papers - 2026-02-02

1. Strongly Polynomial Time Complexity of Policy Iteration for $L_\infty$ Robust MDPs


2. Scaling Multiagent Systems with Process Rewards


3. High-quality generation of dynamic game content via small language models: A proof of concept


4. TSAQA: Time Series Analysis Question And Answering Benchmark


5. Make Anything Match Your Target: Universal Adversarial Perturbations against Closed-Source MLLMs via Multi-Crop Routed Meta Optimization


6. THINKSAFE: Self-Generated Safety Alignment for Reasoning Models


7. RAudit: A Blind Auditing Protocol for Large Language Model Reasoning


8. Chain-of-thought obfuscation learned from output supervision can generalise to unseen tasks


9. MedMCP-Calc: Benchmarking LLMs for Realistic Medical Calculator Scenarios via MCP Integration


10. From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics


11. The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?


12. Guided by Trajectories: Repairing and Rewarding Tool-Use Trajectories for Tool-Integrated Reasoning


13. TriCEGAR: A Trace-Driven Abstraction Mechanism for Agentic AI


14. Why Your Deep Research Agent Fails? On Hallucination Evaluation in Full Research Trajectory


15. Quantifying Model Uniqueness in Heterogeneous AI Ecosystems


16. Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text


17. EvoClinician: A Self-Evolving Agent for Multi-Turn Medical Diagnosis via Test-Time Evolutionary Learning


18. Alignment among Language, Vision and Action Representations


19. MulFeRL: Enhancing Reinforcement Learning with Verbal Feedback in a Multi-turn Loop


20. Game-Theoretic Co-Evolution for LLM-Based Heuristic Discovery


21. Aligning the Unseen in Attributed Graphs: Interplay between Graph Geometry and Node Attributes Manifold


22. CVeDRL: An Efficient Code Verifier via Difficulty-aware Reinforcement Learning


23. Conditional Performance Guarantee for Large Reasoning Models


24. Toward IIT-Inspired Consciousness in LLMs: A Reward-Based Learning Framework


25. Learning with Challenges: Adaptive Difficulty-Aware Data Generation for Mobile GUI Agent Training


26. TSPO: Breaking the Double Homogenization Dilemma in Multi-turn Search Policy Optimization


27. AutoRefine: From Trajectories to Reusable Expertise for Continual LLM Agent Refinement


28. A Step Back: Prefix Importance Ratio Stabilizes Policy Optimization


29. Best-of-Q: Improving VLM agents with Q-function Action Ranking at Inference


30. Real-Time Aligned Reward Model beyond Semantics


31. Task-Aware LLM Council with Adaptive Decision Pathways for Decision Support


32. UCPO: Uncertainty-Aware Policy Optimization


33. Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments


34. Beyond Medical Chatbots: Meddollina and the Rise of Continuous Clinical Intelligence


35. Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling


36. SYMPHONY: Synergistic Multi-agent Planning with Heterogeneous Language Model Assembly


37. EntroCut: Entropy-Guided Adaptive Truncation for Efficient Chain-of-Thought Reasoning in Small-scale Large Reasoning Models


38. From Self-Evolving Synthetic Data to Verifiable-Reward RL: Post-Training Multi-turn Interactive Tool-Using Agents


39. Learn More with Less: Uncertainty Consistency Guided Query Selection for RLVR


40. WED-Net: A Weather-Effect Disentanglement Network with Causal Augmentation for Urban Flow Prediction


41. PerfGuard: A Performance-Aware Agent for Visual Content Generation


42. Decoding in Geometry: Alleviating Embedding-Space Crowding for Complex Reasoning


43. Enhancing TableQA through Verifiable Reasoning Trace Reward


44. Darwinian Memory: A Training-Free Self-Regulating Memory System for GUI Agent Evolution


45. Why Self-Rewarding Works: Theoretical Guarantees for Iterative Alignment of Language Models


46. Controllable Information Production


47. Anytime Safe PAC Efficient Reasoning


48. When LLM meets Fuzzy-TOPSIS for Personnel Selection through Automated Profile Analysis


49. AI-Enabled Waste Classification as a Data-Driven Decision Support Tool for Circular Economy and Urban Sustainability


50. Semi-Autonomous Mathematics Discovery with Gemini: A Case Study on the Erdős Problems


51. Learning Provably Correct Distributed Protocols Without Human Knowledge


52. Sparks of Rationality: Do Reasoning LLMs Align with Human Judgment and Choice?


53. Why Reasoning Fails to Plan: A Planning-Centric Analysis of Long-Horizon Decision Making in LLM Agents


54. The Six Sigma Agent: Achieving Enterprise-Grade Reliability in LLM Systems Through Consensus-Driven Decomposed Execution


55. JAF: Judge Agent Forest


56. VideoGPA: Distilling Geometry Priors for 3D-Consistent Video Generation


57. End-to-end Optimization of Belief and Policy Learning in Shared Autonomy Paradigms


58. IRL-DAL: Safe and Adaptive Trajectory Planning for Autonomous Driving via Energy-Guided Diffusion Models


59. TEON: Tensorized Orthonormalization Beyond Layer-Wise Muon for Large Language Model Pre-Training


60. Agnostic Language Identification and Generation


61. Now You Hear Me: Audio Narrative Attacks Against Large Audio-Language Models


62. YuriiFormer: A Suite of Nesterov-Accelerated Transformers



64. Agile Reinforcement Learning through Separable Neural Architecture


65. Med-Scout: Curing MLLMs’ Geometric Blindness in Medical Perception via Geometry-Aware RL Post-Training


66. MonoScale: Scaling Multi-Agent System with Monotonic Improvement


67. Disentangling multispecific antibody function with graph neural networks


68. Learning to Execute Graph Algorithms Exactly with Graph Neural Networks


69. Beyond Fixed Frames: Dynamic Character-Aligned Speech Tokenization


70. Probing the Trajectories of Reasoning Traces in Large Language Models


71. SPICE: Submodular Penalized Information-Conflict Selection for Efficient Large Language Model Training


72. On Safer Reinforcement Learning Policies for Sedation and Analgesia in Intensive Care


73. Securing Time in Energy IoT: A Clock-Dynamics-Aware Spatio-Temporal Graph Attention Network for Clock Drift Attacks and Y2K38 Failures


74. Machine Learning for Energy-Performance-aware Scheduling


75. Secure Tool Manifest and Digital Signing Solution for Verifiable MCP and LLM Pipelines


76. Regularisation in neural networks: a survey and empirical analysis of approaches


77. To See Far, Look Close: Evolutionary Forecasting for Long-term Time Series


78. WiFiPenTester: Advancing Wireless Ethical Hacking with Governed GenAI


79. From Similarity to Vulnerability: Key Collision Attack on LLM Semantic Caching


80. OrLog: Resolving Complex Queries with LLMs and Probabilistic Reasoning


81. Character as a Latent Variable in Large Language Models: A Mechanistic Account of Emergent Misalignment and Conditional Safety Failures


82. ExplainerPFN: Towards tabular foundation models for model-free zero-shot feature importance estimations


83. Towards Explicit Acoustic Evidence Perception in Audio LLMs for Speech Deepfake Detection


84. HierLoc: Hyperbolic Entity Embeddings for Hierarchical Visual Geolocation


85. On the Impact of Code Comments for Automated Bug-Fixing: An Empirical Study


86. Adaptive Edge Learning for Density-Aware Graph Generation


87. Avoiding Premature Collapse: Adaptive Annealing for Entropy-Regularized Structural Inference


88. Leveraging Convolutional Sparse Autoencoders for Robust Movement Classification from Low-Density sEMG


89. Automatic Constraint Policy Optimization based on Continuous Constraint Interpolation Framework for Offline Reinforcement Learning


90. Bias Beyond Borders: Political Ideology Evaluation and Steering in Multilingual LLMs


91. Mano: Restriking Manifold Optimization for LLM Training


92. Self-Supervised Slice-to-Volume Reconstruction with Gaussian Representations for Fetal MRI


93. About an Automating Annotation Method for Robot Markers


94. Stabilizing the Q-Gradient Field for Policy Smoothness in Actor-Critic


95. Residual Context Diffusion Language Models


96. Perplexity Cannot Always Tell Right from Wrong


97. From Data Leak to Secret Misses: The Impact of Data Leakage on Secret Detection Models


98. A Real-Time Privacy-Preserving Behavior Recognition System via Edge-Cloud Collaboration


99. Protecting Private Code in IDE Autocomplete using Differential Privacy


100. MTDrive: Multi-turn Interactive Reinforcement Learning for Autonomous Driving


101. BEAR: Towards Beam-Search-Aware Optimization for Recommendation with Large Language Models


102. Evaluating Large Language Models for Security Bug Report Prediction


103. DINO-SAE: DINO Spherical Autoencoder for High-Fidelity Image Reconstruction and Generation


104. DiffuSpeech: Silent Thought, Spoken Answer via Unified Speech-Text Diffusion


105. Should LLMs, $\textit{like}$, Generate How Users Talk? Building Dialect-Accurate Dialog[ue]s Beyond the American Default with MDial


106. MoVE: Mixture of Value Embeddings – A New Axis for Scaling Parametric Memory in Autoregressive Models


107. Reinforcement Learning-Based Co-Design and Operation of Chiller and Thermal Energy Storage for Cost-Optimal HVAC Systems


108. EmoShift: Lightweight Activation Steering for Enhanced Emotion-Aware Speech Synthesis


109. Eroding the Truth-Default: A Causal Analysis of Human Susceptibility to Foundation Model Hallucinations and Disinformation in the Wild


110. Degradation-Aware Frequency Regulation of a Heterogeneous Battery Fleet via Reinforcement Learning


111. Bayesian Interpolating Neural Network (B-INN): a scalable and reliable Bayesian model for large-scale physical systems


112. MEnvAgent: Scalable Polyglot Environment Construction for Verifiable Software Engineering


113. Learning to Build Shapes by Extrusion


114. Just-in-Time Catching Test Generation at Meta


115. Offline Reinforcement Learning of High-Quality Behaviors Under Robust Style Alignment


116. User-Adaptive Meta-Learning for Cold-Start Medication Recommendation with Uncertainty Filtering


117. Hide and Seek in Embedding Space: Geometry-based Steganography and Detection in Large Language Models


118. SOMBRERO: Measuring and Steering Boundary Placement in End-to-End Hierarchical Sequence Models


119. Beyond Abstract Compliance: Operationalising trust in AI as a moral relationship


120. How Far Can Pretrained LLMs Go in Symbolic Music? Controlled Comparisons of Supervised and Preference-based Adaptation


121. Qualitative Evaluation of LLM-Designed GUI


122. Procedural Knowledge Extraction from Industrial Troubleshooting Guides Using Vision Language Models


123. UrbanMoE: A Sparse Multi-Modal Mixture-of-Experts Framework for Multi-Task Urban Region Profiling


124. Decomposing Epistemic Uncertainty for Causal Decision Making


125. ImgCoT: Compressing Long Chain of Thought into Compact Visual Tokens for Efficient Reasoning of Large Language Model


126. OpenVTON-Bench: A Large-Scale High-Resolution Benchmark for Controllable Virtual Try-On Evaluation


127. A Cross-Domain Graph Learning Protocol for Single-Step Molecular Geometry Refinement


128. AEGIS: White-Box Attack Path Generation using LLMs and Training Effectiveness Evaluation for Large-Scale Cyber Defence Exercises


129. Breaking the Blocks: Continuous Low-Rank Decomposed Scaling for Unified LLM Quantization and Adaptation


130. Vision-Language Models Unlock Task-Centric Latent Actions


131. Gated Relational Alignment via Confidence-based Distillation for Efficient VLMs


132. Deep Learning-Based Early-Stage IR-Drop Estimation via CNN Surrogate Modeling


133. PEAR: Pixel-aligned Expressive humAn mesh Recovery


134. FNF: Functional Network Fingerprint for Large Language Models


135. Do Transformers Have the Ability for Periodicity Generalization?


136. Fire on Motion: Optimizing Video Pass-bands for Efficient Spiking Action Recognition


137. From Horizontal Layering to Vertical Integration: A Comparative Study of the AI-Driven Software Development Paradigm


138. Unsupervised Synthetic Image Attribution: Alignment and Disentanglement


139. NAG: A Unified Native Architecture for Encoder-free Text-Graph Modeling in Language Models


140. Human-Centered Explainability in AI-Enhanced UI Security Interfaces: Designing Trustworthy Copilots for Cybersecurity Analysts


141. GUDA: Counterfactual Group-wise Training Data Attribution for Diffusion Models via Unlearning


142. ScholarPeer: A Context-Aware Multi-Agent Framework for Automated Peer Review


143. Training Beyond Convergence: Grokking nnU-Net for Glioma Segmentation in Sub-Saharan MRI


144. What can Computer Vision learn from Ranganathan?


145. MCP-Diag: A Deterministic, Protocol-Driven Architecture for AI-Native Network Diagnostics


146. PEFT-MuTS: A Multivariate Parameter-Efficient Fine-Tuning Framework for Remaining Useful Life Prediction based on Cross-domain Time Series Representation Model


147. Time-Annealed Perturbation Sampling: Diverse Generation for Diffusion Language Models


148. TTCS: Test-Time Curriculum Synthesis for Self-Evolving


149. Local-Global Multimodal Contrastive Learning for Molecular Property Prediction


150. Language Model Circuits Are Sparse in the Neuron Basis


151. FedCARE: Federated Unlearning with Conflict-Aware Projection and Relearning-Resistant Recovery


152. Rethinking LLM-as-a-Judge: Representation-as-a-Judge with Small Language Models via Semantic Capacity Asymmetry


153. MC-GRPO: Median-Centered Group Relative Policy Optimization for Small-Rollout Reinforcement Learning


154. Cross-Domain Few-Shot Learning for Hyperspectral Image Classification Based on Mixup Foundation Model


155. SpanNorm: Reconciling Training Stability and Performance in Deep Transformers


156. FedDis: A Causal Disentanglement Framework for Federated Traffic Prediction


157. Mitigating Hallucinations in Video Large Language Models via Spatiotemporal-Semantic Contrastive Decoding


158. Whispers of Wealth: Red-Teaming Google’s Agent Payments Protocol via Prompt Injection


159. EUGens: Efficient, Unified, and General Dense Layers


160. Are LLM Evaluators Really Narcissists? Sanity Checking Self-Preference Evaluations


161. Towards the Holographic Characteristic of LLMs for Efficient Short-text Generation


162. Adapting Reinforcement Learning for Path Planning in Constrained Parking Scenarios


163. Demystifying Design Choices of Reinforcement Fine-tuning: A Batched Contextual Bandit Learning Perspective


164. Learn from A Rationalist: Distilling Intermediate Interpretable Rationales


165. SCOPE-PD: Explainable AI on Subjective and Clinical Objective Measurements of Parkinson’s Disease for Precision Decision-Making


166. Shattered Compositionality: Counterintuitive Learning Dynamics of Transformers for Arithmetic


167. Keep Rehearsing and Refining: Lifelong Learning Vehicle Routing under Continually Drifting Tasks


168. Action-Sufficient Goal Representations


169. AI Literacy, Safety Awareness, and STEM Career Aspirations of Australian Secondary Students: Evaluating the Impact of Workshop Interventions


170. FraudShield: Knowledge Graph Empowered Defense for LLMs against Fraud Attacks


171. RulePlanner: All-in-One Reinforcement Learner for Unifying Design Rules in 3D Floorplanning


172. Training-Free Representation Guidance for Diffusion Models with a Representation Alignment Projector


173. AI Decodes Historical Chinese Archives to Reveal Lost Climate History


174. Machine Unlearning in Low-Dimensional Feature Subspace


175. Temporal Graph Pattern Machine


176. Does My Chatbot Have an Agenda? Understanding Human and AI Agency in Human-Human-like Chatbot Interaction


177. Countering the Over-Reliance Trap: Mitigating Object Hallucination for LVLMs via a Self-Validation Framework


178. Tuning the Implicit Regularizer of Masked Diffusion Language Models: Enhancing Generalization via Insights from $k$-Parity


179. Automating Forecasting Question Generation and Resolution for AI Evaluation


180. AI and My Values: User Perceptions of LLMs’ Ability to Extract, Embody, and Explain Human Values from Casual Conversations



182. MetaLead: A Comprehensive Human-Curated Leaderboard Dataset for Transparent Reporting of Machine Learning Experiments


183. Dynamic Welfare-Maximizing Pooled Testing


184. Optimization, Generalization and Differential Privacy Bounds for Gradient Descent on Kolmogorov-Arnold Networks


185. Spectral Filtering for Learning Quantum Dynamics


186. Score-based Integrated Gradient for Root Cause Explanations of Outliers


187. Jailbreaks on Vision Language Model via Multimodal Reasoning


188. Culturally Grounded Personas in Large Language Models: Characterization and Alignment with Socio-Psychological Value Frameworks


189. SP^2DPO: An LLM-assisted Semantic Per-Pair DPO Generalization


190. Graph is a Substrate Across Data Modalities


191. Context Structure Reshapes the Representational Geometry of Language Models


192. MERMAID: Memory-Enhanced Retrieval and Reasoning with Multi-Agent Iterative Knowledge Grounding for Veracity Assessment


193. The Unseen Threat: Residual Knowledge in Machine Unlearning under Perturbed Samples


194. Recoverability Has a Law: The ERR Measure for Tool-Augmented Agents


195. Learning Policy Representations for Steerable Behavior Synthesis


196. MixQuant: Pushing the Limits of Block Rotations in Post-Training Quantization


197. From Retrieving Information to Reasoning with AI: Exploring Different Interaction Modalities to Support Human-AI Coordination in Clinical Decision-Making


198. Stealthy Poisoning Attacks Bypass Defenses in Regression Settings


199. Conformal Prediction for Generative Models via Adaptive Cluster-Based Density Estimation


200. ParalESN: Enabling parallel information processing in Reservoir Computing


201. PersonaCite: VoC-Grounded Interviewable Agentic Synthetic AI Personas for Verifiable User and Design Research


202. VMonarch: Efficient Video Diffusion Transformers with Structured Attention


203. Predicting Intermittent Job Failure Categories for Diagnosis Using Few-Shot Fine-Tuned Language Models


204. AI Narrative Breakdown. A Critical Assessment of Power and Promise


205. MirrorMark: A Distortion-Free Multi-Bit Watermark for Large Language Models


206. A Systematic Literature Review on LLM Defenses Against Prompt Injection and Jailbreaking: Expanding NIST Taxonomy


207. Lost in Space? Vision-Language Models Struggle with Relative Camera Pose Estimation


208. Learning to Recommend Multi-Agent Subgraphs from Calling Trees


209. Beyond Conditional Computation: Retrieval-Augmented Genomic Foundation Models with Gengram


210. Advanced techniques and applications of LiDAR Place Recognition in Agricultural Environments: A Comprehensive Survey


211. Neural Signals Generate Clinical Notes in the Wild


212. Multitask Learning for Earth Observation Data Classification with Hybrid Quantum Network


213. Practical Evaluation of Quantum Kernel Methods for Radar Micro-Doppler Classification on Noisy Intermediate-Scale Quantum (NISQ) Hardware


214. COL-Trees: Efficient Hierarchical Object Search in Road Networks


215. ShellForge: Adversarial Co-Evolution of Webshell Generation and Multi-View Detection for Robust Webshell Defense


216. In Vino Veritas and Vulnerabilities: Examining LLM Safety via Drunk Language Inducement


217. Stablecoin Design with Adversarial-Robust Multi-Agent Systems via Trust-Weighted Signal Aggregation


218. UniFinEval: Towards Unified Evaluation of Financial Multimodal Models across Text, Images and Videos


219. Screen, Match, and Cache: A Training-Free Causality-Consistent Reference Frame Framework for Human Animation