All AI Papers - 2025-12-31

1. Web World Models


2. Regret-Based Federated Causal Discovery with Unknown Interventions


3. Physics-Informed Neural Networks for Device and Circuit Modeling: A Case Study of NeuroSPICE


4. Divergent-Convergent Thinking in Large Language Models for Creative Problem Generation


5. Why AI Safety Requires Uncertainty, Incomplete Preferences, and Non-Archimedean Utilities


6. The Gaining Paths to Investment Success: Information-Driven LLM Graph Reasoning for Venture Capital Prediction


7. Replay Failures as Successes: Sample-Efficient Reinforcement Learning for Instruction Following


8. AKG kernel Agent: A Multi-Agent Framework for Cross-Platform Kernel Synthesis


9. The World Is Bigger! A Computationally-Embedded Perspective on the Big World Hypothesis


10. MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning


11. CubeBench: Diagnosing Interactive, Long-Horizon Spatial Reasoning Under Partial Observations


12. On Conformant Planning and Model-Checking of $\exists^*\forall^*$ Hyperproperties


13. Agentic Physical AI toward a Domain-Specific Foundation Model for Nuclear Reactor Control


14. TCEval: Using Thermal Comfort to Assess Cognitive and Perceptual Abilities of AI


15. From Model Choice to Model Belief: Establishing a New Measure for LLM-Based Research



17. Why We Need a New Framework for Emotional Intelligence in AI


18. InSPO: Unlocking Intrinsic Self-Reflection for LLM Preference Optimization


19. Benchmark Success, Clinical Failure: When Reinforcement Learning Optimizes for Benchmarks, Not Patients


20. The Reward Model Selection Crisis in Personalized Alignment


21. Problems With Large Language Models for Learner Modelling: Why LLMs Alone Fall Short for Responsible Tutoring in K–12 Education


22. Multimodal Fact-Checking: An Agent-based Approach


23. Geometric Structural Knowledge Graph Foundation Model


24. HiSciBench: A Hierarchical Multi-disciplinary Benchmark for Scientific Intelligence from Reading to Discovery


25. SAMP-HDRL: Segmented Allocation with Momentum-Adjusted Utility for Multi-agent Portfolio Management via Hierarchical Deep Reinforcement Learning


26. Memento-II: Learning by Stateful Reflective Memory


27. TravelBench: A Real-World Benchmark for Multi-Turn and Tool-Augmented Travel Planning


28. DICE: Discrete Interpretable Comparative Evaluation with Probabilistic Scoring for Retrieval-Augmented Generation


29. The Wisdom of Deliberating AI Crowds: Does Deliberation Improve LLM-Based Forecasting?


30. LLM Agents as VC investors: Predicting Startup Success via RolePlay-Based Collective Simulation


31. Learning Multi-Modal Mobility Dynamics for Generalized Next Location Recommendation


32. Tyee: A Unified, Modular, and Fully-Integrated Configurable Toolkit for Intelligent Physiological Health Care


33. SANet: A Semantic-aware Agentic AI Networking Framework for Cross-layer Optimization in 6G


34. Lessons from Neuroscience for AI: How integrating Actions, Compositional Structure and Episodic Memory could enable Safe, Interpretable and Human-Like AI


35. Multi-AI Agent Framework Reveals the “Oxide Gatekeeper” in Aluminum Nanoparticle Oxidation


36. DarkPatterns-LLM: A Multi-Layer Benchmark for Detecting Manipulative and Harmful AI Behavior


37. Monadic Context Engineering


38. Lightweight Inference-Time Personalization for Frozen Knowledge Graph Embeddings


39. HalluMat: Detecting Hallucinations in LLM-Generated Materials Science Content Through Multi-Stage Verification


40. Subgoaling Relaxation-based Heuristics for Numeric Planning with Infinite Actions


41. Agent2World: Learning to Generate Symbolic World Models via Adaptive Multi-Agent Feedback


42. SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence


43. Logic Sketch Prompting (LSP): A Deterministic and Interpretable Prompting Method


44. Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks


45. We are not able to identify AI-generated images


46. With Great Capabilities Come Great Responsibilities: Introducing the Agentic Risk & Capability Framework for Governing Agentic AI Systems


47. Toward Equitable Recovery: A Fairness-Aware AI Framework for Prioritizing Post-Flood Aid in Bangladesh


48. GamiBench: Evaluating Spatial Reasoning and 2D-to-3D Planning Capabilities of MLLMs with Origami Folding Tasks


49. Emergent Persuasion: Will LLMs Persuade Without Being Prompted?


50. Bidirectional RAG: Safe Self-Improving Retrieval-Augmented Generation Through Multi-Stage Validation


51. Multilingual Hidden Prompt Injection Attacks on LLM-Based Academic Reviewing


52. Nested Browser-Use Learning for Agentic Information Seeking


53. AI tutoring can safely and effectively support students: An exploratory RCT in UK classrooms


54. BOAD: Discovering Hierarchical Software Engineering Agents via Bandit Optimization


55. Le Cam Distortion: A Decision-Theoretic Framework for Robust Transfer Learning


56. RxnBench: A Multimodal Benchmark for Evaluating Large Language Models on Chemical Reaction Understanding from Scientific Literature


57. VL-RouterBench: A Benchmark for Vision-Language Model Routing


58. Toward Trustworthy Agentic AI: A Multimodal Framework for Preventing Prompt Injection Attacks


59. Lie to Me: Knowledge Graphs for Robust Hallucination Self-Detection in LLMs


60. PathFound: An Agentic Multimodal Model Activating Evidence-seeking Pathological Diagnosis


61. Act2Goal: From World Model To General Goal-conditioned Policy


62. AnyMS: Bottom-up Attention Decoupling for Layout-guided and Training-free Multi-subject Customization


63. Alpha-R1: Alpha Screening with LLM Reasoning via Reinforcement Learning


64. UniHetero: Could Generation Enhance Understanding for Vision-Language-Model at Large Data Scale?



66. ML Compass: Navigating Capability, Cost, and Compliance Trade-offs in AI Model Deployment


67. FRoD: Full-Rank Efficient Fine-Tuning with Rotational Degrees for Fast Convergence


68. Theory of Mind for Explainable Human-Robot Interaction


69. Agentic AI for Autonomous Defense in Software Supply Chain Security: Beyond Provenance to Vulnerability Mitigation


70. Semantic Tree Inference on Text Corpa using a Nested Density Approach together with Large Language Model Embeddings


71. HY-Motion 1.0: Scaling Flow Matching Models for Text-To-Motion Generation


72. Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance


73. CoFi-Dec: Hallucination-Resistant Decoding via Coarse-to-Fine Generative Feedback in Large Vision-Language Models


74. Fuzzy-Logic and Deep Learning for Environmental Condition-Aware Road Surface Classification


75. Mobile-Efficient Speech Emotion Recognition Using DistilHuBERT: A Cross-Corpus Validation Study


76. Directly Constructing Low-Dimensional Solution Subspaces in Deep Neural Networks


77. Theoretical Foundations of Scaling Law in Familial Models


78. PINNs for Electromagnetic Wave Propagation


79. Securing the AI Supply Chain: What Can We Learn From Developer-Reported Security Issues and Solutions of AI Projects?


80. A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers


81. SoulX-LiveTalk Technical Report


82. Post-Training Quantization of OpenPangu Models for Efficient Deployment on Atlas A2


83. AGRO-SQL: Agentic Group-Relative Optimization with High-Fidelity Data Synthesis


84. ECG-RAMBA: Zero-Shot ECG Generalization by Morphology-Rhythm Disentanglement and Long-Range Modeling


85. AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents


86. The Law of Multi-Model Collaboration: Scaling Limits of Model Ensembling for Large Language Models


87. Explainable Neural Inverse Kinematics for Obstacle-Aware Robotic Manipulation: A Comparative Analysis of IKNet Variants


88. Splitwise: Collaborative Edge-Cloud Inference for LLMs via Lyapunov-Assisted DRL


89. MedGemma vs GPT-4: Open-Source and Proprietary Zero-shot Medical Disease Classification from Images


90. Interpretable Safety Alignment via SAE-Constructed Low-Rank Subspace Adaptation


91. ViLaCD-R1: A Vision-Language Framework for Semantic Change Detection in Remote Sensing


92. KernelEvolve: Scaling Agentic Kernel Coding for Heterogeneous AI Accelerators at Meta


93. Physics-Inspired Modeling and Content Adaptive Routing in an Infrared Gas Leak Detection Network


94. Anomaly Detection by Effectively Leveraging Synthetic Images


95. Holi-DETR: Holistic Fashion Item Detection Leveraging Contextual Information


96. Scoring, Reasoning, and Selecting the Best! Ensembling Large Language Models via a Peer-Review Process


97. Exploring Syn-to-Real Domain Adaptation for Military Target Detection


98. Not too long do read: Evaluating LLM-generated extreme scientific summaries


99. ForCM: Forest Cover Mapping from Multispectral Sentinel-2 Image by Integrating Deep Learning with Object-Based Image Analysis


100. EIR: Enhanced Image Representations for Medical Report Generation


101. EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models via Equation Solving and Code Completion


102. Constraint programming model and biased random-key genetic algorithm for the single-machine coupled task scheduling problem with exact delays to minimize the makespan


103. Reservoir Computing inspired Matrix Multiplication-free Language Model


104. An Inference-Based Architecture for Intent and Affordance Saturation in Decision-Making


105. PathoSyn: Imaging-Pathology MRI Synthesis via Disentangled Deviation Diffusion


106. It’s a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents


107. How Much Data Is Enough? Uniform Convergence Bounds for Generative & Vision-Language Models under Low-Dimensional Structure


108. A Note on Hybrid Online Reinforcement and Imitation Learning for LLMs: Formulations and Algorithms


109. MedSAM-based lung masking for multi-label chest X-ray classification


110. Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning


111. Deep Learning for Art Market Valuation


112. Multimodal Functional Maximum Correlation for Emotion Recognition


113. Trust Region Masking for Long-Horizon LLM Reinforcement Learning


114. Is Chain-of-Thought Really Not Explainability? Chain-of-Thought Can Be Faithful without Hint Verbalization


115. Viability and Performance of a Private LLM Server for SMBs: A Benchmark Analysis of Qwen3-30B on Consumer-Grade Hardware


116. An Architecture-Led Hybrid Report on Body Language Detection Project


117. LENS: LLM-Enabled Narrative Synthesis for Mental Health by Aligning Multimodal Sensing with Language Models


118. OpenGround: Active Cognition-based Reasoning for Open-World 3D Visual Grounding


119. JADAI: Jointly Amortizing Adaptive Design and Bayesian Inference


120. APO: Alpha-Divergence Preference Optimization


121. Heterogeneity in Multi-Agent Reinforcement Learning


122. Sat-EnQ: Satisficing Ensembles of Weak Q-Learners for Reliable and Compute-Efficient Reinforcement Learning


123. A Neural Network-Based Real-time Casing Collar Recognition System for Downhole Instruments


124. DECEPTICON: How Dark Patterns Manipulate Web Agents


125. Agentic AI for Cyber Resilience: A New Security Paradigm and Its System-Theoretic Foundations


126. SwinTF3D: A Lightweight Multimodal Fusion Approach for Text-Guided 3D Medical Image Segmentation


127. Reinforcement Networks: novel framework for collaborative Multi-Agent Reinforcement Learning tasks


128. The body is not there to compute: Comment on “Informational embodiment: Computational role of information structure in codes and robots” by Pitti et al.


129. AutoForge: Automated Environment Synthesis for Agentic Reinforcement Learning


130. FasterPy: An LLM-based Code Execution Efficiency Optimization Framework


131. EgoReAct: Egocentric Video-Driven 3D Human Reaction Generation


132. MoR: Mixture Of Representations For Mixed-Precision Training


133. CNSight: Evaluation of Clinical Note Segmentation Tools


134. Reach-Avoid Differential game with Reachability Analysis for UAVs: A decomposition approach


135. SNM-Net: A Universal Framework for Robust Open-Set Gas Recognition via Spherical Normalization and Mahalanobis Distance


136. Adapting, Fast and Slow: Transportable Circuits for Few-Shot Learning


137. GRExplainer: A Universal Explanation Method for Temporal Graph Neural Networks


138. Next Best View Selections for Semantic and Dynamic 3D Gaussian Splatting


139. Understanding the Mechanisms of Fast Hyperparameter Transfer


140. Active Constraint Learning in High Dimensions from Demonstrations


141. Robust LLM-based Column Type Annotation via Prompt Augmentation with LoRA Tuning


142. Harnessing Large Language Models for Biomedical Named Entity Recognition


143. FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents


144. GHaLIB: A Multilingual Framework for Hope Speech Detection in Low-Resource Languages


145. Learning with the $p$-adics


146. Conformal Prediction Sets for Next-Token Prediction in Large Language Models: Balancing Coverage Guarantees with Set Efficiency


147. Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2


148. Unleashing Foundation Vision Models: Adaptive Transfer for Diverse Data-Limited Scientific Domains


149. Investigating Deep Learning Models for Ejection Fraction Estimation from Echocardiography Videos


150. Scaling Unverifiable Rewards: A Case Study on Visual Insights


151. Chord Recognition with Deep Learning


152. Geometry-Aware Optimization for Respiratory Sound Classification: Enhancing Sensitivity with SAM-Optimized Audio Spectrogram Transformers


153. Learning When Not to Attend Globally


154. RollArt: Scaling Agentic RL Training via Disaggregated Infrastructure


155. TimePerceiver: An Encoder-Decoder Framework for Generalized Time-Series Forecasting


156. Self-Rewarded Multimodal Coherent Reasoning Across Diverse Visual Domains


157. CoAgent: Collaborative Planning and Consistency Agent for Coherent Video Generation


158. Towards Reliable Evaluation of Adversarial Robustness for Spiking Neural Networks


159. Predicting LLM Correctness in Prosthodontics Using Metadata and Hallucination Signals


160. Hierarchical Pedagogical Oversight: A Multi-Agent Adversarial Framework for Reliable AI Tutoring


161. Role-Based Fault Tolerance System for LLM RL Post-Training


162. ManchuTTS: Towards High-Quality Manchu Speech Synthesis via Flow Matching and Hierarchical Text Representation


163. SPECTRE: Spectral Pre-training Embeddings with Cylindrical Temporal Rotary Position Encoding for Fine-Grained sEMG-Based Movement Decoding


164. Gradient Dynamics of Attention: How Cross-Entropy Sculpts Bayesian Manifolds


165. The Bayesian Geometry of Transformer Attention


166. AMBIT: Augmenting Mobility Baselines with Interpretable Trees


167. Towards Robust Optical-SAR Object Detection under Missing Modalities: A Dynamic Quality-Aware Fusion Framework


168. HiFi-RAG: Hierarchical Content Filtering and Two-Pass Generation for Open-Domain RAG


169. SuperiorGAT: Graph Attention Networks for Sparse LiDAR Point Cloud Reconstruction in Autonomous Systems


170. FluenceFormer: Transformer-Driven Multi-Beam Fluence Map Regression for Radiotherapy Planning


171. Bright 4B: Scaling Hyperspherical Learning for Segmentation in 3D Brightfield Microscopy


172. Nightjar: Dynamic Adaptive Speculative Decoding for Large Language Models Serving


173. Emergence of Human to Robot Transfer in Vision-Language-Action Models


174. A Unified AI, Embedded, Simulation, and Mechanical Design Approach to an Autonomous Delivery Robot


175. Efficient Multi-Model Orchestration for Self-Hosted Large Language Models


176. Space AI: Leveraging Artificial Intelligence for Space to Improve Life on Earth


177. BLISS: Bandit Layer Importance Sampling Strategy for Efficient Training of Graph Neural Networks


178. AI-Generated Code Is Not Reproducible (Yet): An Empirical Study of Dependency Gaps in LLM-Based Coding Agents


179. LLM-Guided Exemplar Selection for Few-Shot Wearable-Sensor Human Activity Recognition


180. Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration


181. Towards Efficient Post-Training via Fourier-Driven Adapter Architectures


182. Self-Evaluation Unlocks Any-Step Text-to-Image Generation


183. Cost-Aware Text-to-SQL: An Empirical Study of Cloud Compute Costs for LLM-Generated Queries


184. VULCAN: Tool-Augmented Multi Agents for Iterative 3D Object Arrangement


185. Human-like visual computing advances explainability and few-shot learning in deep neural networks for complex physiological data


186. The Effectiveness of Approximate Regularized Replay for Efficient Supervised Fine-Tuning of Large Language Models


187. Feature Learning with Multi-Stage Vision Transformers on Inter-Modality HER2 Status Scoring and Tumor Classification on Whole Slides


188. The Multi-View Paradigm Shift in MRI Radiomics: Predicting MGMT Methylation in Glioblastoma


189. Expert System for Bitcoin Forecasting: Integrating Global Liquidity via TimeXer Transformers


190. SpotEdit: Selective Region Editing in Diffusion Transformers


191. SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents


192. LangPrecip: Language-Aware Multimodal Precipitation Nowcasting


193. VideoZoomer: Reinforcement-Learned Temporal Focusing for Long Video Reasoning


194. LLMBoost: Make Large Language Models Stronger with Boosting


195. LLA: Enhancing Security and Privacy for Generative Models with Logic-Locked Accelerators


196. Beyond Single Bugs: Benchmarking Large Language Models for Multi-Vulnerability Detection


197. Attack-Aware Deepfake Detection under Counter-Forensic Manipulations


198. A Three-Level Alignment Framework for Large-Scale 3D Retrieval and Controlled 4D Generation


199. Multi-Head Spectral-Adaptive Graph Anomaly Detection


200. When Algorithms Manage Humans: A Double Machine Learning Approach to Estimating Nonlinear Effects of Algorithmic Control on Gig Worker Performance and Wellbeing


201. Co-GRPO: Co-Optimized Group Relative Policy Optimization for Masked Diffusion Model


202. Cluster Aggregated GAN (CAG): A Cluster-Based Hybrid Model for Appliance Pattern Generation


203. DBAW-PIKAN: Dynamic Balance Adaptive Weight Kolmogorov-Arnold Neural Network for Solving Partial Differential Equations


204. Valori: A Deterministic Memory Substrate for AI Systems


205. The Illusion of Clinical Reasoning: A Benchmark Reveals the Pervasive Gap in Vision-Language Models for Clinical Competency


206. LLMTM: Benchmarking and Optimizing LLMs for Temporal Motif Analysis in Dynamic Graphs


207. ReVEAL: GNN-Guided Reverse Engineering for Formal Verification of Optimized Multipliers


208. Agentic Software Issue Resolution with Large Language Models: A Survey



210. Interpretable Perturbation Modeling Through Biomedical Knowledge Graphs


211. Calibrating LLM Judges: Linear Probes for Fast and Reliable Uncertainty Estimation


212. Fairness Evaluation of Risk Estimation Models for Lung Cancer Screening


213. Enhanced geometry prediction in laser directed energy deposition using meta-learning


214. EvoXplain: When Machine Learning Models Agree on Predictions but Disagree on Why – Measuring Mechanistic Multiplicity Across Training Runs


215. Multi-objective hybrid knowledge distillation for efficient deep learning in smart agriculture


216. Masking Teacher and Reinforcing Student for Distilling Vision-Language Models


217. Meta-information Guided Cross-domain Synergistic Diffusion Model for Low-dose PET Reconstruction


218. DiRL: An Efficient Post-Training Framework for Diffusion Language Models


219. Scalable Cloud-Native Architectures for Intelligent PMU Data Processing


220. VideoScaffold: Elastic-Scale Visual Hierarchies for Streaming Video Understanding in MLLMs


221. Literature Mining System for Nutraceutical Biosynthesis: From AI Framework to Biological Insight


222. ReGAIN: Retrieval-Grounded AI Framework for Network Traffic Analysis


223. Müntz-Szász Networks: Neural Architectures with Learnable Power-Law Bases


224. On Extending Semantic Abstraction for Efficient Search of Hidden Objects


225. VLM-PAR: A Vision Language Model for Pedestrian Attribute Recognition


226. Signal-SGN++: Topology-Enhanced Time-Frequency Spiking Graph Network for Skeleton-Based Action Recognition


227. On the Existence and Behaviour of Secondary Attention Sinks


228. Super-Resolution Enhancement of Medical Images Based on Diffusion Model: An Optimization Scheme for Low-Resolution Gastric Images


229. CosineGate: Semantic Dynamic Routing via Cosine Incompatibility in Residual Networks


230. TCFormer: A 5M-Parameter Transformer with Density-Guided Aggregation for Weakly-Supervised Crowd Counting


231. MatKV: Trading Compute for Flash Storage in LLM Inference


232. HookMIL: Revisiting Context Modeling in Multiple Instance Learning for Computational Pathology


233. Learning Tennis Strategy Through Curriculum-Based Dueling Double Deep Q-Networks


234. Unbiased Visual Reasoning with Controlled Visual Inputs


235. Enhancing Medical Data Analysis through AI-Enhanced Locally Linear Embedding: Applications in Medical Point Location and Imagery



237. iOS as Acceleration


238. Wireless Traffic Prediction with Large Language Model


239. Field strength-dependent performance variability in deep learning-based analysis of magnetic resonance imaging


240. Characterizing Motion Encoding in Video Diffusion Timesteps


241. BitFlipScope: Scalable Fault Localization and Recovery for Bit-Flip Corruptions in LLMs


242. Solving Multi-Agent Multi-Goal Path Finding Problems in Polynomial Time


243. Practical challenges of control monitoring in frontier AI deployments


244. Neural ocean forecasting from sparse satellite-derived observations: a case-study for SSH dynamics and altimetry data


245. Adaptive GPU Resource Allocation for Multi-Agent Collaborative Reasoning in Serverless Environments


246. Rethinking Leveraging Pre-Trained Multi-Layer Representations for Speaker Verification


247. GPU Kernel Optimization Beyond Full Builds: An LLM Framework with Minimal Executable Programs


248. Pre-review to Peer review: Pitfalls of Automating Reviews using Large Language Models


249. The Complete Anatomy of the Madden-Julian Oscillation Revealed by Artificial Intelligence


250. HLS4PC: A Parametrizable Framework For Accelerating Point-Based 3D Point Cloud Models on FPGA


251. SoDA: An Efficient Interaction Paradigm for the Agentic Web


252. ReCollab: Retrieval-Augmented LLMs for Cooperative Ad-hoc Teammate Modeling


253. GPU-Virt-Bench: A Comprehensive Benchmarking Framework for Software-Based GPU Virtualization Systems