All AI Papers - 2025-10-08

1. TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning


2. Barbarians at the Gate: How AI is Upending Systems Research


3. Pushing Test-Time Scaling Limits of Deep Search with Asymmetric Verification


4. Moloch’s Bargain: Emergent Misalignment When LLMs Compete for Audiences


5. Classical AI vs. LLMs for Decision-Maker Alignment in Health Insurance Choices


6. Constraint-Aware Route Recommendation from Natural Language via Hierarchical LLM Agents


7. TelecomTS: A Multi-Modal Observability Dataset for Time Series and Language Analysis


8. Scientific Algorithm Discovery by Augmenting AlphaEvolve with Deep Research


9. MixReasoning: Switching Modes to Think


10. Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning?


11. ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models



13. Information-Theoretic Policy Pre-Training with Empowerment


14. MatheMagic: Generating Dynamic Mathematics Benchmarks Robust to Memorization


15. Training-Free Time Series Classification via In-Context Reasoning with LLM Agents


16. Optimizing for Persuasion Improves LLM Generalization: Evidence from Quality-Diversity Evolution of Debate Strategies


17. Towards Label-Free Biological Reasoning Synthetic Dataset Creation via Uncertainty Filtering


18. The Safety Challenge of World Models for Embodied AI Agents: A Review


19. ConstraintLLM: A Neuro-Symbolic Framework for Industrial-Level Constraint Programming


20. RareAgent: Self-Evolving Reasoning for Drug Repurposing in Rare Diseases


21. Early Multimodal Prediction of Cross-Lingual Meme Virality on Reddit: A Time-Window Analysis


22. Uncertainty assessment in satellite-based greenhouse gas emissions estimates using emulated atmospheric transport


23. ARM: Discovering Agentic Reasoning Modules for Generalizable Multi-Agent Systems


24. Artificially intelligent agents in the social and behavioral sciences: A history and outlook


25. Syn-Diag: An LLM-based Synergistic Framework for Generalizable Few-shot Fault Diagnosis on the Edge


26. Joint Communication Scheduling and Velocity Control for Multi-UAV-Assisted Post-Disaster Monitoring: An Attention-Based In-Context Learning Approach


27. D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI


28. Large Language Model-Based Uncertainty-Adjusted Label Extraction for Artificial Intelligence Model Development in Upper Extremity Radiography


29. From Agentification to Self-Evolving Agentic AI for Wireless Networks: Concepts, Approaches, and Future Research Directions


30. In-the-Flow Agentic System Optimization for Effective Planning and Tool Use


31. MetaVLA: Unified Meta Co-training For Efficient Embodied Adaption


32. Decade-long Emission Forecasting with an Ensemble Model in Taiwan


33. Vul-R2: A Reasoning LLM for Automated Vulnerability Repair


34. VAL-Bench: Measuring Value Alignment in Language Models


35. Do Code Models Suffer from the Dunning-Kruger Effect?


36. NASP-T: A Fuzzy Neuro-Symbolic Transformer for Logic-Constrained Aviation Safety Report Classification


37. AInstein: Assessing the Feasibility of AI-Generated Approaches to Research Problems


38. Teacher-Student Guided Inverse Modeling for Steel Final Hardness Estimation


39. What Do You Mean? Exploring How Humans and AI Interact with Symbols and Meanings in Their Interactions


40. MHA-RAG: Improving Efficiency, Accuracy, and Consistency by Encoding Exemplars as Soft Prompts


41. Integrating Bayesian methods with neural network–based model predictive control: a review


42. Biomedical reasoning in action: Multi-agent System for Auditable Biomedical Evidence Synthesis


43. BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions


44. Beyond Monolithic Rewards: A Hybrid and Multi-Aspect Reward Optimization for MLLM Alignment


45. Efficient Prediction of Pass@k Scaling in Large Language Models


46. Graph-based LLM over Semi-Structured Population Data for Dynamic Policy Response


47. Plug-and-Play Dramaturge: A Divide-and-Conquer Approach for Iterative Narrative Script Refinement via Collaborative LLM Agents


48. Real-time Framework for Interoperable Semantic-driven Internet-of-Things in Smart Agriculture


49. Representation Potentials of Foundation Models for Multimodal Alignment: A Survey


50. Lang-PINN: From Language to Physics-Informed Neural Networks via a Multi-Agent Framework


51. An Algorithmic Information-Theoretic Perspective on the Symbol Grounding Problem


52. Structuring Reasoning for Complex Rules Beyond Flat Representations


53. Optimization Modeling via Semantic Anchored Alignment


54. Structured Cognition for Behavioral Intelligence in Large Language Model Agents: Preliminary Study


55. Rule Encoding and Compliance in Large Language Models: An Information-Theoretic Analysis


56. EgoNight: Towards Egocentric Vision Understanding at Night with a Challenging Benchmark


57. Stratified GRPO: Handling Structural Heterogeneity in Reinforcement Learning of LLM Search Agents


58. Reference Grounded Skill Discovery


59. TokenChain: A Discrete Speech Chain via Semantic Token Modeling


60. StarEmbed: Benchmarking Time Series Foundation Models on Astronomical Observations of Variable Stars


61. Latent Speech-Text Transformer


62. BanglaTalk: Towards Real-Time Speech Assistance for Bengali Regional Dialects


63. Automated Program Repair of Uncompilable Student Code


64. RECODE-H: A Benchmark for Research Code Development with Interactive Human Feedback


65. Smartphone-based iris recognition through high-quality visible-spectrum iris image capture.V2


66. LLMs as Policy-Agnostic Teammates: A Case Study in Human Proxy Design for Heterogeneous Agent Teams


67. Bimanual 3D Hand Motion and Articulation Forecasting in Everyday Images


68. Multi-Task Reinforcement Learning with Language-Encoded Gated Policy Networks


69. CreditDecoding: Accelerating Parallel Decoding in Diffusion Large Language Models with Trace Credits


70. Discrete Diffusion Models with MLLMs for Unified Medical Multimodal Generation


71. Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models


72. A public cardiac CT dataset featuring the left atrial appendage


73. Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability


74. When Thinking Drifts: Evidential Grounding for Robust Video Reasoning



76. Cross-Embodiment Dexterous Hand Articulation Generation via Morphology-Aware Learning


77. Reasoning under Vision: Understanding Visual-Spatial Cognition in Vision-Language Models for CAPTCHA


78. Controllable Audio-Visual Viewpoint Generation from 360° Spatial Information


79. GLVD: Guided Learned Vertex Descent


80. VideoMiner: Iteratively Grounding Key Frames of Hour-Long Videos via Tree-based Group Relative Policy Optimization


81. CDTP: A Large-Scale Chinese Data-Text Pair Dataset for Comprehensive Evaluation of Chinese LLMs


82. From Learning to Mastery: Achieving Safe and Efficient Real-World Autonomous Driving with Human-In-The-Loop Reinforcement Learning


83. Fast Leave-One-Out Approximation from Fragment-Target Prevalence Vectors (molFTP) : From Dummy Masking to Key-LOO for Leakage-Free Feature Construction


84. Emergent AI Surveillance: Overlearned Person Re-Identification and Its Mitigation in Law Enforcement Context


85. Hybrid Quantum-Classical Policy Gradient for Adaptive Control of Cyber-Physical Systems: A Comparative Study of VQC vs. MLP


86. Detection and Measurement of Hailstones with Multimodal Large Language Models


87. ECTSpeech: Enhancing Efficient Speech Synthesis via Easy Consistency Tuning


88. Diffusion Models for Low-Light Image Enhancement: A Multi-Perspective Taxonomy and Performance Analysis


89. LexiCon: a Benchmark for Planning under Temporal Constraints in Natural Language


90. Probing the Difficulty Perception Mechanism of Large Language Models


91. Gaussian Embeddings: How JEPAs Secretly Learn Your Data Density


92. EvalMORAAL: Interpretable Chain-of-Thought and LLM-as-Judge Evaluation for Moral Alignment in Large Language Models


93. LLM-FS-Agent: A Deliberative Role-based Large Language Model Architecture for Transparent Feature Selection


94. Carré du champ flow matching: better quality-generalisation tradeoff in generative models


95. An Attention-Augmented VAE-BiLSTM Framework for Anomaly Detection in 12-Lead ECG Signals


96. Kaputt: A Large-Scale Dataset for Visual Defect Detection


97. Paying Attention to Hybrid Attention: Untangling the Issues with Conversion Methods


98. $\bf{D^3}$QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection


99. Segment-Factorized Full-Song Generation on Symbolic Piano Music


100. Revisiting Long-context Modeling from Context Denoising Perspective


101. DACP: Domain-Adaptive Continual Pre-Training of Large Language Models for Phone Conversation Summarization


102. VCoT-Grasp: Grasp Foundation Models with Visual Chain-of-Thought Reasoning for Language-driven Grasp Generation


103. Mitigating Premature Exploitation in Particle-based Monte Carlo for Inference-Time Scaling


104. Deformable Image Registration for Self-supervised Cardiac Phase Detection in Multi-View Multi-Disease Cardiac Magnetic Resonance Images


105. Risk level dependent Minimax Quantile lower bounds for Interactive Statistical Decision Making


106. Data-efficient Targeted Token-level Preference Optimization for LLM-based Text-to-Speech


107. Mellum: Production-Grade in-IDE Contextual Code Completion with Multi-File Project Understanding


108. InforME: Improving Informativeness of Abstractive Text Summarization With Informative Attention Guided by Named Entity Salience


109. Are Heterogeneous Graph Neural Networks Truly Effective? A Causal Perspective


110. Redefining Generalization in Visual Domains: A Two-Axis Framework for Fake Image Detection with FusionDetect


111. Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies


112. Federated Split Learning for Resource-Constrained Robots in Industrial IoT: Framework Comparison, Optimization Strategies, and Future Directions


113. FinReflectKG - EvalBench: Benchmarking Financial KG with Multi-Dimensional Evaluation


114. Towards Reliable and Practical LLM Security Evaluations via Bayesian Modelling


115. Uncovering Representation Bias for Investment Decisions in Open-Source Large Language Models


116. Membership Inference Attacks on Tokenizers of Large Language Models


117. Sparse deepfake detection promotes better disentanglement


118. vAttention: Verified Sparse Attention


119. QGraphLIME - Explaining Quantum Graph Neural Networks


120. Verifier-free Test-Time Sampling for Vision Language Action Models


121. Code-Switching In-Context Learning for Cross-Lingual Transfer of Large Language Models


122. Quantifying the Accuracy-Interpretability Trade-Off in Concept-Based Sidechannel Models


123. Ocular-Induced Abnormal Head Posture: Diagnosis and Missing Data Imputation


124. The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP


125. From Neural Activity to Computation: Biological Reservoirs for Pattern Recognition in Digit Classification


126. Beyond Spectral Peaks: Interpreting the Cues Behind Synthetic Image Detection


127. Generative AI-Driven Hierarchical Multi-Agent Framework for Zero-Touch Optical Networks


128. Monte Carlo-Type Neural Operator for Differential Equations


129. PointNSP: Autoregressive 3D Point Cloud Generation with Next-Scale Level-of-Detail Prediction


130. MADIAVE: Multi-Agent Debate for Implicit Attribute Value Extraction


131. HOI-R1: Exploring the Potential of Multimodal Large Language Models for Human-Object Interaction Detection


132. AutoPentester: An LLM Agent-based Framework for Automated Pentesting


133. AgentDR Dynamic Recommendation with Implicit Item-Item Relations via LLM-based Agents


134. Improving Chain-of-Thought Efficiency for Autoregressive Image Generation


135. Deciphering Invariant Feature Decoupling in Source-free Time Series Forecasting with Proxy Denoising


136. Domain-Shift-Aware Conformal Prediction for Large Language Models


137. Generative Dynamic Graph Representation Learning for Conspiracy Spoofing Detection


138. Critical attention scaling in long-context transformers


139. Seeing the Big Picture: Evaluating Multimodal LLMs’ Ability to Interpret and Grade Handwritten Student Work


140. Permutation-Invariant Representation Learning for Robust and Privacy-Preserving Feature Selection


141. Provably Mitigating Corruption, Overoptimization, and Verbosity Simultaneously in Offline and Online RLHF/DPO Alignment


142. CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension


143. Orders in Chaos: Enhancing Large-Scale MoE LLM Serving with Data Movement Forecasting


144. High-Fidelity Synthetic ECG Generation via Mel-Spectrogram Informed Diffusion Training


145. LANTERN: Scalable Distillation of Large Language Models for Job-Person Fit and Explanation


146. AMAQ: Adaptive Mixed-bit Activation Quantization for Collaborative Parameter Efficient Fine-tuning


147. QDeepGR4J: Quantile-based ensemble of deep learning and GR4J hybrid rainfall-runoff models for extreme flow prediction with uncertainty quantification


148. Adversarial Reinforcement Learning for Large Language Model Agent Safety


149. UnitTenX: Generating Tests for Legacy Packages with AI Agents Powered by Formal Verification


150. Physics-Informed Machine Learning in Biomedical Science and Engineering


151. Exploring Student Choice and the Use of Multimodal Generative AI in Programming Learning


152. See the past: Time-Reversed Scene Reconstruction from Thermal Traces Using Visual Language Models


153. Comparing LSTM-Based Sequence-to-Sequence Forecasting Strategies for 24-Hour Solar Proton Flux Profiles Using GOES Data


154. Fusion-Based Neural Generalization for Predicting Temperature Fields in Industrial PET Preform Heating


155. Context Length Alone Hurts LLM Performance Despite Perfect Retrieval


156. AutoDAN-Reasoning: Enhancing Strategies Exploration based Jailbreak Attacks with Test-Time Scaling


157. MT-DAO: Multi-Timescale Distributed Adaptive Optimizers with Local Updates


158. Physics-informed Attention-enhanced Fourier Neural Operator for Solar Magnetic Field Extrapolations


159. Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization


160. DeepV: A Model-Agnostic Retrieval-Augmented Framework for Verilog Code Generation with a High-Quality Knowledge Base


161. Dynamic Functional Connectivity Features for Brain State Classification: Insights from the Human Connectome Project


162. DeepAf: One-Shot Spatiospectral Auto-Focus Model for Digital Pathology


163. RAG Makes Guardrails Unsafe? Investigating Robustness of Guardrails under RAG-style Contexts


164. AUREXA-SE: Audio-Visual Unified Representation Exchange Architecture with Cross-Attention and Squeezeformer for Speech Enhancement


165. DP-Adam-AC: Privacy-preserving Fine-Tuning of Localizable Language Models Using Adam Optimization with Adaptive Clipping


166. Adjusting the Output of Decision Transformer with Action Gradient


167. CMT-Benchmark: A Benchmark for Condensed Matter Theory Built by Expert Researchers


168. Approximate Gaussianity Beyond Initialisation in Neural Networks


169. VER: Vision Expert Transformer for Robot Learning via Foundation Distillation and Dynamic Routing


170. Adapting Insider Risk mitigations for Agentic Misalignment: an empirical study


171. Provable Speech Attributes Conversion via Latent Independence


172. A novel hallucination classification framework


173. OptPipe: Memory- and Scheduling-Optimized Pipeline Parallelism for LLM Training


174. Auditing Pay-Per-Token in Large Language Models


175. OptiFLIDS: Optimized Federated Learning for Energy-Efficient Intrusion Detection in IoT


176. Agentic Misalignment: How LLMs Could Be Insider Threats


177. Logistic-Gated Operators Enable Auditable Unit-Aware Thresholds in Symbolic Regression


178. PatternKV: Flattening KV Representation Expands Quantization Headroom


179. Emergent Coordination in Multi-Agent Language Models


180. SafeGuider: Robust and Practical Content Safety Control for Text-to-Image Models


181. From Poisoned to Aware: Fostering Backdoor Self-Awareness in LLMs


182. Domain-Adapted Granger Causality for Real-Time Cross-Slice Attack Attribution in 6G Networks


183. SATER: A Self-Aware and Token-Efficient Approach to Routing and Cascading


184. Deep Learning-Based Multi-Factor Authentication: A Survey of Biometric and Smart Card Integration Approaches


185. Artificial-Intelligence Grading Assistance for Handwritten Components of a Calculus Exam


186. Generative Inverse Design: From Single Point Optimization to a Diverse Design Portfolio via Conditional Variational Autoencoders


187. Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain


188. Adversarial Reinforcement Learning for Offensive and Defensive Agents in a Simulated Zero-Sum Network Environment


189. VeriGuard: Enhancing LLM Agent Safety via Verified Code Generation


190. A Single Character can Make or Break Your LLM Evals


191. Chronological Thinking in Full-Duplex Spoken Dialogue Language Models


192. Percepta: High Performance Stream Processing at the Edge


193. Every Step Counts: Decoding Trajectories as Authorship Fingerprints of dLLMs


194. FlashResearch: Real-time Agent Orchestration for Efficient Deep Research


195. SynCED-EnDe 2025: A Synthetic and Curated English - German Dataset for Critical Error Detection in Machine Translation


196. Linguistic Characteristics of AI-Generated Text: A Survey


197. Training Large Language Models To Reason In Parallel With Global Forking Tokens


198. Rationale-Augmented Retrieval with Constrained LLM Re-Ranking for Task Discovery


199. Artificial Intelligence for Cost-Aware Resource Prediction in Big Data Pipelines


200. Improving Metacognition and Uncertainty Communication in Language Models


201. MADS: Multi-Agent Dialogue Simulation for Diverse Persuasion Data Generation


202. A Scalable AI Driven, IoT Integrated Cognitive Digital Twin for Multi-Modal Neuro-Oncological Prognostics and Tumor Kinetics Prediction using Enhanced Vision Transformer and XAI


203. CARE: Cognitive-reasoning Augmented Reinforcement for Emotional Support Conversation


204. Hallucination is Inevitable for LLMs with the Open World Assumption


205. Trainable Reference-Based Evaluation Metric for Identifying Quality of English-Gujarati Machine Translation System


206. Tiny but Mighty: A Software-Hardware Co-Design Approach for Efficient Multimodal Inference on Battery-Powered Small Devices


207. COSPADI: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning


208. Ads that Talk Back: Implications and Perceptions of Injecting Personalized Advertising into LLM Chatbots