All AI Papers - 2025-10-09

1. Agentic generative AI for media content discovery at the national football league


2. Multi-Objective Multi-Agent Path Finding with Lexicographic Cost Preferences


3. NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents


4. Integrating Domain Knowledge into Process Discovery Using Large Language Models


5. The Contingencies of Physical Embodiment Allow for Open-Endedness and Care


6. The Cognitive Bandwidth Bottleneck: Shifting Long-Horizon Agent from Planning with Actions to Planning with Schemas


7. VRPAgent: LLM-Driven Discovery of Heuristic Operators for Vehicle Routing Problems


8. Inductive Learning for Possibilistic Logic Programs Under Stable Models


9. Prompt Optimization Across Multiple Agents for Representing Diverse Human Populations


10. Tool-Augmented Policy Optimization: Synergizing Reasoning and Adaptive Tool Use with Reinforcement Learning


11. Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces


12. LLM-Assisted Modeling of Semantic Web-Enabled Multi-Agents Systems with AJAN


13. TGPR: Tree-Guided Policy Refinement for Robust Self-Debugging of LLMs


14. Autoformalizer with Tool Feedback


15. Evolving and Executing Research Plans via Double-Loop Multi-Agent Collaboration


16. Verifying Memoryless Sequential Decision-making of Large Language Models


17. MultiCNKG: Integrating Cognitive Neuroscience, Gene, and Disease Knowledge Graphs Using Large Language Models


18. Inefficiencies of Meta Agents for Agent Design


19. Agent-in-the-Loop: A Data Flywheel for Continuous Improvement in LLM-based Customer Support


20. Fine-Grained Emotion Recognition via In-Context Learning


21. WebDART: Dynamic Decomposition and Re-planning for Complex Web Tasks


22. Auto-Prompt Ensemble for LLM Judge


23. Beneficial Reasoning Behaviors in Agentic Search and Effective Post-training to Obtain Them


24. PuzzlePlex: Benchmarking Foundation Models on Reasoning and Planning with Puzzles


25. Flavonoid Fusion: Creating a Knowledge Graph to Unveil the Interplay Between Food and Health


26. Off-Trajectory Reasoning: Can LLMs Collaborate on Reasoning Trajectory?


27. Belief-Calibrated Multi-Agent Consensus Seeking for Complex NLP Tasks


28. Requirements for Game-Based Learning Design Framework for Information System Integration in the Context of Post-Merger Integration


29. BuilderBench – A benchmark for generalist agents


30. Bridging Reasoning to Learning: Unmasking Illusions using Complexity Out of Distribution Generalization


31. AlphaApollo: Orchestrating Foundation Models and Professional Tools into a Self-Evolving System for Deep Agentic Reasoning


32. Artificial Hippocampus Networks for Efficient Long-Context Modeling


33. Vibe Checker: Aligning Code Evaluation with Human Preference


34. GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations


35. h1: Bootstrapping LLMs to Reason over Longer Horizons via Reinforcement Learning


36. MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline


37. Cocoon: A System Architecture for Differentially Private Training with Correlated Noises


38. AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs


39. Evolutionary Profiles for Protein Fitness Prediction


40. GTCN-G: A Residual Graph-Temporal Fusion Network for Imbalanced Intrusion Detection (Preprint)


41. Online Rubrics Elicitation from Pairwise Comparisons


42. On the false election between regulation and innovation. Ideas for regulation through the responsible use of artificial intelligence in research and education. [Spanish version]



44. Benchmarking LLM Causal Reasoning with Scientifically Validated Relationships


45. Where to Begin: Efficient Pretraining via Subnetwork Selection and Distillation


46. GenPilot: A Multi-Agent System for Test-Time Prompt Optimization in Image Generation


47. Language Lives in Sparse Dimensions: Toward Interpretable and Efficient Multilingual Control for Large Language Models


48. HyPlan: Hybrid Learning-Assisted Planning Under Uncertainty for Safe Autonomous Driving


49. Resolution scaling governs DINOv3 transfer performance in chest radiograph classification


50. TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics


51. ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL



53. Comparing human and language models sentence processing difficulties on complex structures


54. TrackVLA++: Unleashing Reasoning and Memory Capabilities in VLA Models for Embodied Visual Tracking


55. A Digital Twin Framework for Metamorphic Testing of Autonomous Driving Systems Using Generative Model


56. Graph Conditioned Diffusion for Controllable Histopathology Image Generation


57. Opt-ICL at LeWiDi-2025: Maximizing In-Context Signal from Rater Examples via Meta-Learning


58. Generative World Modelling for Humanoids: 1X World Model Challenge Technical Report


59. HTMformer: Hybrid Time and Multivariate Transformer for Time Series Forecasting


60. Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications


61. LuxInstruct: A Cross-Lingual Instruction Tuning Dataset For Luxembourgish


62. Introspection in Learned Semantic Scene Graph Localisation


63. Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models


64. Unified Molecule Pre-training with Flexible 2D and 3D Modalities: Single and Paired Modality Integration


65. Mining the Mind: What 100M Beliefs Reveal About Frontier LLM Knowledge


66. Federated Unlearning in the Wild: Rethinking Fairness and Data Discrepancy


67. Native Hybrid Attention for Efficient Sequence Modeling


68. Pragyaan: Designing and Curating High-Quality Cultural Post-Training Datasets for Indian Languages


69. The Limits of Goal-Setting Theory in LLM-Driven Assessment


70. VelLMes: A high-interaction AI-based deception framework


71. Learning Global Representation from Queries for Vectorized HD Map Construction


72. Generating Surface for Text-to-3D using 2D Gaussian Splatting


73. EDUMATH: Generating Standards-aligned Educational Math Word Problems


74. Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation


75. Grouped Differential Attention


76. Expressive and Scalable Quantum Fusion for Multimodal Learning


77. Bayesian Nonparametric Dynamical Clustering of Time Series


78. LongRM: Revealing and Unlocking the Context Boundary of Reward Modeling


79. DecompGAIL: Learning Realistic Traffic Behaviors with Decomposed Multi-Agent Generative Adversarial Imitation Learning


80. Emotionally Vulnerable Subtype of Internet Gaming Disorder: Measuring and Exploring the Pathology of Problematic Generative AI Use


81. Angular Constraint Embedding via SpherePair Loss for Constrained Clustering


82. M3Retrieve: Benchmarking Multimodal Retrieval for Medicine


83. Multi-Dimensional Autoscaling of Stream Processing Services on Edge Devices


84. MoRE-GNN: Multi-omics Data Integration with a Heterogeneous Graph Autoencoder


85. Multi-hop Deep Joint Source-Channel Coding with Deep Hash Distillation for Semantically Aligned Image Retrieval


86. Towards Generalization of Graph Neural Networks for AC Optimal Power Flow


87. Explaining raw data complexity to improve satellite onboard processing


88. Enhancing Bankruptcy Prediction of Banks through Advanced Machine Learning Techniques: An Innovative Approach and Analysis


89. OpenJAI-v1.0: An Open Thai Large Language Model


90. SID: Multi-LLM Debate Driven by Self Signals


91. CNN-TFT explained by SHAP with multi-head attention weights for time series forecasting


92. Recurrence-Complete Frame-based Action Models


93. FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline


94. Extreme Amodal Face Detection


95. Foundations of LLM Knowledge Materialization: Termination, Reproducibility, Robustness


96. Modeling COVID-19 Dynamics in German States Using Physics-Informed Neural Networks


97. Evaluating LLMs for Historical Document OCR: A Methodological Framework for Digital Humanities


98. Are LLMs Reliable Rankers? Rank Manipulation via Two-Stage Token Optimization


99. Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management


100. LLM Company Policies and Policy Implications in Software Organizations


101. Dual Goal Representations


102. AISysRev - LLM-based Tool for Title-abstract Screening


103. Learning to Rewrite Prompts for Bootstrapping LLMs on Downstream Tasks


104. Semantic Segmentation Algorithm Based on Light Field and LiDAR Fusion


105. Incremental Summarization for Customer Support via Progressive Note-Taking and Agent Feedback


106. Heptapod: Language Modeling on Visual Signals


107. Automated Neural Architecture Design for Industrial Defect Detection


108. Delay Independent Safe Control with Neural Networks: Positive Lur’e Certificates for Risk Aware Autonomy


109. Local Reinforcement Learning with Action-Conditioned Root Mean Squared Q-Functions


110. The False Promise of Zero-Shot Super-Resolution in Machine-Learned Operators


111. Distilling Lightweight Language Models for C/C++ Vulnerabilities


112. StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering


113. Control-Augmented Autoregressive Diffusion for Data Assimilation


114. AI-Driven Forecasting and Monitoring of Urban Water System


115. Reading Between the Lines: Towards Reliable Black-box LLM Fingerprinting via Zeroth-order Gradient Estimation


116. SDQM: Synthetic Data Quality Metric for Object Detection Dataset Evaluation


117. The Framework That Survives Bad Models: Human-AI Collaboration For Clinical Trials


118. HSNet: Heterogeneous Subgraph Network for Single Image Super-resolution


119. The Algebra of Meaning: Why Machines Need Montague More Than Moore’s Law


120. The Markovian Thinker


121. Incoherence in goal-conditioned autoregressive models


122. Scalable Policy-Based RL Algorithms for POMDPs


123. CLAQS: Compact Learnable All-Quantum Token Mixer with Shared-ansatz for Text Classification


124. Visualizing Multimodality in Combinatorial Search Landscapes


125. LogSTOP: Temporal Scores over Prediction Sequences for Matching and Retrieval


126. A Median Perspective on Unlabeled Data for Out-of-Distribution Detection


127. ATLO-ML: Adaptive Time-Length Optimizer for Machine Learning – Insights from Air Quality Forecasting


128. Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels


129. Valid Stopping for LLM Generation via Empirical Dynamic Formal Lift


130. Attention Sinks and Compression Valleys in LLMs are Two Sides of the Same Coin


131. Deep Generative Model for Human Mobility Behavior


132. Evaluating Node-tree Interfaces for AI Explainability


133. How NOT to benchmark your SITE metric: Beyond Static Leaderboards and Towards Realistic Evaluation


134. A Survey on Agentic Security: Applications, Threats and Defenses


135. Context-Aware Inference via Performance Forecasting in Decentralized Learning Networks


136. Geometry-Aware Backdoor Attacks: Leveraging Curvature in Hyperbolic Embeddings


137. Adaptive Protein Design Protocols and Middleware


138. Reward Model Perspectives: Whose Opinions Do Reward Models Reward?


139. Protecting De-identified Documents from Search-based Linkage Attacks



141. Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data


142. EverydayMMQA: A Multilingual and Multimodal Framework for Culturally Grounded Spoken Visual QA


143. Constrained Natural Language Action Planning for Resilient Embodied Systems


144. TransFIRA: Transfer Learning for Face Image Recognizability Assessment


145. Asking For It: Question-Answering for Predicting Rule Infractions in Online Content Moderation


146. Flexible Swarm Learning May Outpace Foundation Models in Essential Tasks


147. Leveraging Large Language Models for Cybersecurity Risk Assessment – A Case from Forestry Cyber-Physical Systems


148. SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation


149. RGBD Gaze Tracking Using Transformer for Feature Fusion


150. VeriEquivBench: An Equivalence Score for Ground-Truth-Free Evaluation of Formally Verifiable Code


151. Efficient High-Resolution Image Editing with Hallucination-Aware Loss and Adaptive Tiling


152. BlockGPT: Spatio-Temporal Modelling of Rainfall via Frame-Level Autoregression


153. ChainMPQ: Interleaved Text-Image Reasoning Chains for Mitigating Relation Hallucinations


154. Traj-Transformer: Diffusion Models with Transformer for GPS Trajectory Generation


155. Soft-Evidence Fused Graph Neural Network for Cancer Driver Gene Identification across Multi-View Biological Graphs


156. SER-Diff: Synthetic Error Replay Diffusion for Incremental Brain Tumor Segmentation


157. Improving the Spatial Resolution of GONG Solar Images to GST Quality Using Deep Learning


158. Surgeons Are Indian Males and Speech Therapists Are White Females: Auditing Biases in Vision-Language Models for Healthcare Professionals


159. RVFL-X: A Novel Randomized Network Based on Complex Transformed Real-Valued Tabular Datasets



161. Reproducibility Study of “XRec: Large Language Models for Explainable Recommendation”


162. MCCE: A Framework for Multi-LLM Collaborative Co-Evolution


163. RareGraph-Synth: Knowledge-Guided Diffusion Models for Generating Privacy-Preserving Synthetic Patient Trajectories in Ultra-Rare Diseases


164. Language models for longitudinal analysis of abusive content in Billboard Music Charts


165. Dual-stage and Lightweight Patient Chart Summarization for Emergency Physicians


166. Prakriti200: A Questionnaire-Based Dataset of 200 Ayurvedic Prakriti Assessments


167. Ensemble Deep Learning and LLM-Assisted Reporting for Automated Skin Lesion Diagnosis


168. LLM-Driven Rubric-Based Assessment of Algebraic Competence in Multi-Stage Block Coding Tasks with Design and Field Evaluation


169. Dream2Image: An Open Multimodal EEG Dataset for Decoding and Visualizing Dreams with Artificial Intelligence


170. Scalable multilingual PII annotation for responsible AI in LLMs


171. TRepLiNa: Layer-wise CKA+REPINA Alignment Improves Low-Resource Machine Translation in Aya-23 8B


172. DynBenchmark: Customizable Ground Truths to Benchmark Community Detection and Tracking in Temporal Networks


173. Evaluating Embedding Frameworks for Scientific Domain


174. CoT Referring: Improving Referring Expression Tasks with Grounded Reasoning


175. Transparent Reference-free Automated Evaluation of Open-Ended User Survey Responses


176. Knowledge Graph-Guided Multi-Agent Distillation for Reliable Industrial Question Answering with Datasets


177. Uncertainty Quantification In Surface Landmines and UXO Classification Using MC Dropout


178. Stacked Regression using Off-the-shelf, Stimulus-tuned and Fine-tuned Neural Networks for Predicting fMRI Brain Responses to Movies (Algonauts 2025 Report)


179. Generalized Multi-agent Social Simulation Framework


180. Exploring Human-AI Collaboration Using Mental Models of Early Adopters of Multi-Agent Generative AI Tools


181. A Multimodal GUI Architecture for Interfacing with LLM-Based Conversational Assistants


182. WeatherArchive-Bench: Benchmarking Retrieval-Augmented Reasoning for Historical Weather Archives


183. AgentBuilder: Exploring Scaffolds for Prototyping User Experiences of Interface Agents


184. TiltXter: CNN-based Electro-tactile Rendering of Tilt Angle for Telemanipulation of Pasteur Pipettes


185. DeepXPalm: Tilt and Position Rendering using Palm-worn Haptic Display and CNN-based Tactile Pattern Recognition