All AI Papers - 2025-10-13

1. LiveOIBench: Can Large Language Models Outperform Human Contestants in Informatics Olympiads?


2. GraphMERT: Efficient and Scalable Distillation of Reliable Knowledge Graphs from Unstructured Data


3. Safe, Untrusted, “Proof-Carrying” AI Agents: toward the agentic lakehouse


4. Titans Revisited: A Lightweight Reimplementation and Critical Analysis of a Test-Time Memory Model


5. Agentic Systems in Radiology: Design, Applications, Evaluation, and Challenges


6. Sequence Variables: A Constraint Programming Computational Domain for Routing and Sequencing


7. Toward Mechanistic Explanation of Deductive Reasoning in Language Models


8. Localist LLMs – A Mathematical Framework for Dynamic Locality Control


9. Fundamentals of Building Autonomous LLM Agents


10. RegexPSPACE: A Benchmark for Evaluating LLM Reasoning on PSPACE-complete Regex Problems


11. Comparing Knowledge Source Integration Methods for Optimizing Healthcare Knowledge Fusion in Rescue Operation


12. Dr. Bias: Social Disparities in AI-Powered Medical Guidance


13. PAC Reasoning: Controlling the Performance Loss for Efficient Reasoning


14. Leading the Follower: Learning Persuasive Agents in Social Deduction Games


15. Physics-Informed High-order Graph Dynamics Identification Learning for Predicting Complex Networks Long-term Dynamics


16. OSCAR: Orthogonal Stochastic Control for Alignment-Respecting Diversity in Flow Matching


17. MEC$^3$O: Multi-Expert Consensus for Code Time Complexity Prediction


18. Humanoid Artificial Consciousness Designed with Large Language Model Based on Psychoanalysis and Personality Theory


19. Auto-scaling Continuous Memory for GUI Agent


20. Repairing Regex Vulnerabilities via Localization-Guided Instructions


21. RefGrader: Automated Grading of Mathematical Competition Proofs using Agentic Workflows


22. TripScore: Benchmarking and rewarding real-world travel planning with fine-grained evaluation


23. Tiny-R1V: Lightweight Multimodal Unified Reasoning Model via Model Merging


24. Semantic-Condition Tuning: Fusing Graph Context with Large Language Models for Knowledge Graph Completion


25. DualResearch: Entropy-Gated Dual-Graph Retrieval for Answer Reconstruction


26. EcphoryRAG: Re-Imagining Knowledge-Graph RAG via Human Associative Memory


27. FATHOMS-RAG: A Framework for the Assessment of Thinking and Observation in Multimodal Systems that use Retrieval Augmented Generation


28. RADAR: Mechanistic Pathways for Detecting Data Contamination in LLM Evaluation


29. LM Fight Arena: Benchmarking Large Multimodal Models via Game Competition


30. GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare


31. ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review


32. What Is Your Agent’s GPA? A Framework for Evaluating Agent Goal-Plan-Action Alignment


33. Everyone prefers human writers, including AI


34. COMPASS: Enhancing Agent Long-Horizon Reasoning with Evolving Context


35. Robust Heuristic Algorithm Design with LLMs


36. Unified World Models: Memory-Augmented Planning and Foresight for Visual Navigation


37. Optimizing delivery for quick commerce factoring qualitative assessment of generated routes


38. Hypothesis Hunting with Evolving Networks of Autonomous Scientific Agents


39. StreamingVLM: Real-Time Understanding for Infinite Video Streams


40. Prompting Test-Time Scaling Is A Strong LLM Reasoning Data Augmentation


41. BaNEL: Exploration Posteriors for Generative Modeling Using Only Negative Rewards


42. Dyna-Mind: Learning to Simulate from Experience for Better AI Agents


43. SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models


44. Mitigating Overthinking through Reasoning Shaping


45. A methodology for clinically driven interactive segmentation evaluation


46. Autonomous Soft Robotic Guidewire Navigation via Imitation Learning


47. Precoder Design in Multi-User FDD Systems with VQ-VAE and GNN


48. Performance Analysis of Machine Learning Algorithms in Chronic Kidney Disease Prediction


49. Multimodal Policy Internalization for Conversational Agents


50. Scalable Multi-Agent Path Finding using Collision-Aware Dynamic Alert Mask and a Hybrid Execution Strategy


51. Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols


52. Failure Prediction at Runtime for Generative Robot Policies


53. SilvaScenes: Tree Segmentation and Species Classification from Under-Canopy Images in Natural Forests


54. Bandits with Single-Peaked Preferences and Limited Resources


55. The Speech-LLM Takes It All: A Truly Fully End-to-End Spoken Dialogue State Tracking Approach


56. On the Representations of Entities in Auto-regressive Large Language Models


57. Beyond Single-Granularity Prompts: A Multi-Scale Chain-of-Thought Prompt Learning for Graph


58. ChoirRec: Semantic User Grouping via LLMs for Conversion Rate Prediction of Low-Activity Users


59. Identifying & Interactively Refining Ambiguous User Goals for Data Visualization Code Generation


60. Design Principles for Sequence Models via Coefficient Dynamics


61. Task-Level Insights from Eigenvalues across Sequence Models


62. The Potential of Second-Order Optimization for LLMs: A Study with Full Gauss-Newton


63. deep-REMAP: Probabilistic Parameterization of Stellar Spectra Using Regularized Multi-Task Learning


64. FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference


65. Randomized HyperSteiner: A Stochastic Delaunay Triangulation Heuristic for the Hyperbolic Steiner Minimal Tree


66. Rate optimal learning of equilibria from data


67. Verifying Chain-of-Thought Reasoning via Its Computational Graph


68. A Model-Driven Engineering Approach to AI-Powered Healthcare Platforms


69. CapGeo: A Caption-Assisted Approach to Geometric Reasoning


70. CLARity: Reasoning Consistency Alone Can Teach Reinforced Experts


71. Inflated Excellence or True Performance? Rethinking Medical Diagnostic Benchmarks with Dynamic Evaluation


72. SynthID-Image: Image watermarking at internet scale


73. Detecting Data Contamination from Reinforcement Learning Post-training for Large Language Models


74. Obstacle Avoidance using Dynamic Movement Primitives and Reinforcement Learning


75. CrisiText: A dataset of warning messages for LLM training in emergency communication


76. Diagnosing Shoulder Disorders Using Multimodal Large Language Models and Consumer-Grade Cameras


77. Clear Roads, Clear Vision: Advancements in Multi-Weather Restoration for Smart Transportation


78. DICE: Structured Reasoning in LLMs through SLM-Guided Chain-of-Thought Correction


79. Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs


80. Towards Safer and Understandable Driver Intention Prediction


81. Modern Deep Learning Approaches for Cricket Shot Classification: A Comprehensive Baseline Study


82. On the Implicit Adversariality of Catastrophic Forgetting in Deep Continual Learning


83. Cross-Representation Benchmarking in Time-Series Electronic Health Records for Clinical Outcome Prediction


84. Federated Data Analytics for Cancer Immunotherapy: A Privacy-Preserving Collaborative Platform for Patient Management


85. Controlled Personalization in Legacy Media Online Services: A Case Study in News Recommendation


86. MSDM: Generating Task-Specific Pathology Images with a Multimodal Conditioned Diffusion Model for Cell and Nuclei Segmentation


87. On the Fairness of Privacy Protection: Measuring and Mitigating the Disparity of Group Privacy Risks for Differentially Private Machine Learning


88. SOS: Synthetic Object Segments Improve Detection, Segmentation, and Grounding


89. MemLoss: Enhancing Adversarial Training with Recycling Adversarial Examples


90. When a Robot is More Capable than a Human: Learning from Constrained Demonstrators


91. AI and Human Oversight: A Risk-Based Framework for Alignment


92. Training Models to Detect Successive Robot Errors from Human Reactions


93. Emotion-Disentangled Embedding Alignment for Noise-Robust and Cross-Corpus Speech Emotion Recognition


94. Alif: Advancing Urdu Large Language Models via Multilingual Synthetic Data Distillation


95. Cost-Efficient Long Code Translation using LLMs while Leveraging Identifier Replacements


96. Robust Driving Control for Autonomous Vehicles: An Intelligent General-sum Constrained Adversarial Reinforcement Learning Approach


97. Déréverbération non-supervisée de la parole par modèle hybride


98. Value-State Gated Attention for Mitigating Extreme-Token Phenomena in Transformers


99. DiTSinger: Scaling Singing Voice Synthesis with Diffusion Transformer and Implicit Alignment


100. On Epistemic Uncertainty of Visual Tokens for Object Hallucinations in Large Vision-Language Models


101. SQS: Bayesian DNN Compression through Sparse Quantized Sub-distributions


102. Saving SWE-Bench: A Benchmark Mutation Approach for Realistic Agent Evaluation



104. SEER: Sustainability Enhanced Engineering of Software Requirements


105. Learning Regularizers: Learning Optimizers that can Regularize


106. Analytical Survey of Learning with Low-Resource Data: From Analysis to Investigation


107. A Human Behavioral Baseline for Collective Governance in Software Projects


108. SHERLOCK: Towards Dynamic Knowledge Adaptation in LLM-enhanced E-commerce Risk Management


109. RO-Bench: Large-scale robustness evaluation of MLLMs with text-driven counterfactual videos


110. Co-Authoring the Self: A Human-AI Interface for Interest Reflection in Recommenders


111. A Frequency-Domain Analysis of the Multi-Armed Bandit Problem: A New Perspective on the Exploration-Exploitation Trade-off


112. A Unified Biomedical Named Entity Recognition Framework with Large Language Models


113. Pinpointing crucial steps: Attribution-based Credit Assignment for Verifiable Reinforcement Learning


114. HES-SQL: Hybrid Reasoning for Efficient Text-to-SQL with Structural Skeleton Guidance


115. Exploring Multi-Temperature Strategies for Token- and Rollout-Level Control in RLVR


116. Designing and Evaluating an AI-driven Immersive Multidisciplinary Simulation (AIMS) for Interprofessional Education


117. ControlAudio: Tackling Text-Guided, Timing-Indicated and Intelligible Audio Generation via Progressive Diffusion Modeling


118. Vector Graph-Based Repository Understanding for Issue-Driven File Retrieval


119. Slicing Is All You Need: Towards A Universal One-Sided Algorithm for Distributed Matrix Multiplication


120. Pattern Enhanced Multi-Turn Jailbreaking: Exploiting Structural Vulnerabilities in Large Language Models


121. Time-Aware Feature Selection: Adaptive Temporal Masking for Stable Sparse Autoencoder Training


122. Repository-Aware File Path Retrieval via Fine-Tuned LLMs


123. Reinforcement Learning-Driven Edge Management for Reliable Multi-view 3D Reconstruction


124. CommandSans: Securing AI Agents with Surgical Precision Prompt Sanitization


125. McMining: Automated Discovery of Misconceptions in Student Code


126. D-CoDe: Scaling Image-Pretrained VLMs to Video via Dynamic Compression and Question Decomposition


127. $\mathsf{P} \neq \mathsf{NP}$: A Non-Relativizing Proof via Quantale Weakness and Geometric Complexity


128. Adaptive Science Operations in Deep Space Missions Using Offline Belief State Planning


129. Benchmarking Chinese Commonsense Reasoning with a Multi-hop Reasoning Perspective


130. SkipSR: Faster Super Resolution with Token Skipping


131. Deceptive Exploration in Multi-armed Bandits


132. MLLM as a UI Judge: Benchmarking Multimodal LLMs for Predicting Human Perception of User Interfaces


133. Guiding Exploration in Reinforcement Learning Through LLM-Augmented Observations


134. Measuring Moral LLM Responses in Multilingual Capacities


135. Re-Identifying Kākā with AI-Automated Video Key Frame Extraction


136. Struc-EMB: The Potential of Structure-Aware Encoding in Language Embeddings


137. SAFER-AiD: Saccade-Assisted Foveal-peripheral vision Enhanced Reconstruction for Adversarial Defense


138. Graph Diffusion Transformers are In-Context Molecular Designers


139. Coordinates from Context: Using LLMs to Ground Complex Location References


140. When to Reason: Semantic Router for vLLM


141. Enhancing Self-Supervised Learning with Semantic Pairs: A New Dataset and Empirical Study


142. In-Context Learning for Non-Stationary MIMO Equalization


143. ConPoSe: LLM-Guided Contact Point Selection for Scalable Cooperative Object Pushing


144. BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution


145. FreqCa: Accelerating Diffusion Models via Frequency-Aware Caching


146. RAG4Tickets: AI-Powered Ticket Resolution via Retrieval-Augmented Generation on JIRA and GitHub Data


147. dInfer: An Efficient Inference Framework for Diffusion Language Models


148. RA-Gen: A Controllable Code Generation Framework Using ReAct for Multi-Agent Task Execution


149. Faver: Boosting LLM-based RTL Generation with Function Abstracted Verifiable Middleware


150. A Novel Framework for Augmenting Rating Scale Tests with LLM-Scored Text Data


151. DPCformer: An Interpretable Deep Learning Model for Genomic Prediction in Crops


152. CATS-Linear: Classification Auxiliary Linear Model for Time Series Forecasting


153. Provably Robust Adaptation for Language-Empowered Foundation Models


154. Inner-Instance Normalization for Time Series Forecasting


155. A 3D Generation Framework from Cross Modality to Parameterized Primitive


156. Knowledge Graph Sparsification for GNN-based Rare Disease Diagnosis


157. Formalizing Style in Personal Narratives


158. Inverse-Free Wilson Loops for Transformers: A Practical Diagnostic for Invariance and Order Sensitivity


159. Upfront Chain-of-Thought: A Cooperative Framework for Chain-of-Thought Compression


160. Energy-Driven Steering: Reducing False Refusals in Large Language Models


161. Automating Android Build Repair: Bridging the Reasoning-Execution Gap in LLM Agents with Domain-Specific Tools


162. Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry


163. Hi-OSCAR: Hierarchical Open-set Classifier for Human Activity Recognition


164. From What to Why: Thought-Space Recommendation with Small Language Models


165. Impact of LLMs on Team Collaboration in Software Development


166. Relative Positioning Based Code Chunking Method For Rich Context Retrieval In Repository Level Code Completion Task With Code Language Model


167. MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation


168. Centering Emotion Hotspots: Multimodal Local-Global Fusion and Cross-Modal Alignment for Emotion Recognition in Conversations


169. Toward a Safer Web: Multilingual Multi-Agent LLMs for Mitigating Adversarial Misinformation Attacks


170. LatentBreak: Jailbreaking Large Language Models through Latent Space Feedback


171. Mnemosyne: An Unsupervised, Human-Inspired Long-Term Memory Architecture for Edge-Based LLMs


172. Recover-LoRA: Data-Free Accuracy Recovery of Degraded Language Models via Low-Rank Adaptation


173. BaldWhisper: Faster Whisper with Head Shearing and Layer Merging


174. Hierarchical Self-Supervised Representation Learning for Depression Detection from Speech


175. Less Diverse, Less Safe: The Indirect But Pervasive Risk of Test-Time Scaling in Large Language Models


176. The Enduring Dominance of Deep Neural Networks: A Critical Analysis of the Fundamental Limitations of Quantum Machine Learning and Spiking Neural Networks


177. Beyond CNNs: Efficient Fine-Tuning of Multi-Modal LLMs for Object Detection on Low-Data Regimes


178. EGSTalker: Real-Time Audio-Driven Talking Head Generation with Efficient Gaussian Deformation


179. Dynamic Stress Detection: A Study of Temporal Progression Modelling of Stress in Speech


180. Articulation-Informed ASR: Integrating Articulatory Features into ASR via Auxiliary Speech Inversion and Cross-Attention Fusion


181. Evaluating Hallucinations in Multimodal LLMs with Spoken Queries under Diverse Acoustic Conditions


182. LadderSym: A Multimodal Interleaved Transformer for Music Practice Error Detection


183. AgenticAD: A Specialized Multiagent System Framework for Holistic Alzheimer Disease Management


184. Comparative Analysis of Large Language Models for the Machine-Assisted Resolution of User Intentions


185. PyNoetic: A modular python framework for no-code development of EEG brain-computer interfaces


186. Deep Sparse Representation-based Classification


187. Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition with Multimodal Training


188. Deep Multimodal Subspace Clustering Networks