All AI Papers - 2025-12-02

1. Thinking by Doing: Building Efficient World Model Reasoning in LLMs via Multi-turn Interaction


2. Towards Continuous Intelligence Growth: Self-Training, Continual Learning, and Dual-Scale Memory in SuperIntelliAgent


3. Hierarchical AI-Meteorologist: LLM-Agent System for Multi-Scale and Explainable Weather Forecast Reporting


4. Agentic AI Framework for Smart Inventory Replenishment


5. Multi-Modal Scene Graph with Kolmogorov-Arnold Experts for Audio-Visual Question Answering


6. OctoMed: Data Recipes for State-of-the-Art Multimodal Medical Reasoning


7. Adapting Like Humans: A Metacognitive Agent with Test-time Reasoning


8. AgriCoT: A Chain-of-Thought Benchmark for Evaluating Reasoning in Vision-Language Models for Agriculture


9. Peer-to-Peer Energy Trading in Dairy Farms using Multi-Agent Reinforcement Learning


10. Evolutionary Discovery of Heuristic Policies for Traffic Signal Control


11. Does Self-Evaluation Enable Wireheading in Language Models?


12. MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents


13. TIM-PRM: Verifying multimodal reasoning with Tool-Integrated PRM


14. ORION: Teaching Language Models to Reason Efficiently in the Language of Thought


15. InsightEval: An Expert-Curated Benchmark for Assessing Insight Discovery in LLM-Driven Data Agents


16. Fast dynamical similarity analysis


17. Agentic AI Framework for Cloudburst Prediction and Coordinated Response


18. Agentic AI Framework for Individuals with Disabilities and Neurodivergence: A Multi-Agent System for Healthy Eating, Daily Routines, and Inclusive Well-Being


19. Solving Context Window Overflow in AI Agents


20. Geometrically-Constrained Agent for Spatial Reasoning


21. Optimized Agent Shift Scheduling Using Multi-Phase Allocation Approach


22. AI Deception: Risks, Dynamics, and Controls


23. DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning


24. Counting Still Counts: Understanding Neural Complex Query Answering Through Query Relaxation


25. A Computable Game-Theoretic Framework for Multi-Agent Theory of Mind


26. Structured Extraction from Business Process Diagrams Using Vision-Language Models


27. Who is Afraid of Minimal Revision?


28. On the Complexity of the Grounded Semantics for Infinite Argumentation Frameworks


29. Tracing Footsteps of Similar Cities: Modeling Urban Economic Vitality with Dynamic Inter-City Graph Embeddings


30. Swarms of Large Language Model Agents for Protein Sequence Design with Experimental Validation


31. Enhanced Conditional Generation of Double Perovskite by Knowledge-Guided Language Model Feedback


32. When AI Bends Metal: AI-Assisted Optimization of Design Parameters in Sheet Metal Forming


33. RecToM: A Benchmark for Evaluating Machine Theory of Mind in LLM-based Conversational Recommender Systems


34. Co-Evolving Agents: Learning from Failures as Hard Negatives


35. Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation


36. Embedded Universal Predictive Intelligence: a coherent framework for multi-agent learning


37. WearVQA: A Visual Question Answering Benchmark for Wearables in Egocentric Authentic Real-world scenarios


38. A perceptual bias of AI Logical Argumentation Ability in Writing


39. Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in Internet of Agents


40. Real-Time Procedural Learning From Experience for AI Agents


41. Pathology-Aware Prototype Evolution via LLM-Driven Semantic Disambiguation for Multicenter Diabetic Retinopathy Diagnosis


42. Evaluating Strategies for Synthesizing Clinical Notes for Medical Multimodal AI


43. Aligning Artificial Superintelligence via a Multi-Box Protocol


44. The Price of Progress: Algorithmic Efficiency and the Falling Cost of AI Inference


45. Physics-Informed Neural Networks for Thermophysical Property Retrieval


46. ASTRO: Adaptive Stitching via Dynamics-Guided Trajectory Rollouts


47. Evaluating LLMs for One-Shot Patching of Real and Artificial Vulnerabilities


48. LFM2 Technical Report


49. MegaChat: A Synthetic Persian Q&A Dataset for High-Quality Sales Chatbot Evaluation


50. Flow Straighter and Faster: Efficient One-Step Generative Modeling via MeanFlow on Rectified Trajectories


51. ParaGate: Parasitic-Driven Domain Adaptation Transfer Learning for Netlist Performance Prediction


52. Towards Improving Interpretability of Language Model Generation through a Structured Knowledge Discovery Approach


53. Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models


54. Toward Automatic Safe Driving Instruction: A Large-Scale Vision Language Model Approach


55. Hard-Constrained Neural Networks with Physics-Embedded Architecture for Residual Dynamics Learning and Invariant Enforcement in Cyber-Physical Systems


56. Machine Learning for Scientific Visualization: Ensemble Data Analysis


57. Simultaneous Image Quality Improvement and Artefacts Correction in Accelerated MRI


58. Time Series Forecasting via Direct Per-Step Probability Distribution Modeling


59. Robust HRRP Recognition under Interrupted Sampling Repeater Jamming using a Prior Jamming Information-Guided Network


60. One-Shot Secure Aggregation: A Hybrid Cryptographic Protocol for Private Federated Learning in IoT


61. Learning to Predict Aboveground Biomass from RGB Images with 3D Synthetic Scenes


62. Tourism Question Answer System in Indian Language using Domain-Adapted Foundation Models


63. GAVINA: flexible aggressive undervolting for bit-serial mixed-precision DNN acceleration


64. Vision Bridge Transformer at Scale


65. Obstruction reasoning for robotic grasping


66. Listwise Preference Optimization with Element-wise Confusions for Aspect Sentiment Quad Prediction


67. Identification of Malicious Posts on the Dark Web Using Supervised Machine Learning


68. AI for software engineering: from probable to provable


69. REVEAL: Reasoning-enhanced Forensic Evidence Analysis for Explainable AI-generated Image Detection


70. Automated Generation of MDPs Using Logic Programming and LLMs for Robotic Applications


71. Multi-chain Graph Refinement and Selection for Reliable Reasoning in Large Language Models


72. Mind Reading or Misreading? LLMs on the Big Five Personality Test


73. Fairness in the Multi-Secretary Problem


74. SpaceMind: Camera-Guided Modality Fusion for Spatial Reasoning in Vision-Language Models


75. What If They Took the Shot? A Hierarchical Bayesian Framework for Counterfactual Expected Goals


76. Bharat Scene Text: A Novel Comprehensive Dataset and Benchmark for Indian Language Scene Text Understanding


77. Evaluating the Clinical Impact of Generative Inpainting on Bone Age Estimation


78. Conveying Imagistic Thinking in TCM Translation: A Prompt Engineering and LLM-Based Evaluation Framework


79. High-Resolution Probabilistic Data-Driven Weather Modeling with a Stretched-Grid


80. Delta-XAI: A Unified Framework for Explaining Prediction Changes in Online Time Series Monitoring


81. From Illusion to Intention: Visual Rationale Learning for Vision-Language Reasoning


82. A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders


83. MIMM-X: Disentangling Spurious Correlations for Medical Image Analysis


84. Ovis-Image Technical Report


85. Pooling Attention: Evaluating Pretrained Transformer Embeddings for Deception Classification


86. Commanding Humanoid by Free-form Language: A Large Language Action Model with Unified Motion Vocabulary


87. Bandit Guided Submodular Curriculum for Adaptive Subset Selection


88. EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model


89. AgentShield: Make MAS more secure and efficient


90. MICCAI STS 2024 Challenge: Semi-Supervised Instance-Level Tooth Segmentation in Panoramic X-ray and CBCT Images


91. Leveraging Textual Compositional Reasoning for Robust Change Captioning


92. Switching-time bioprocess control with pulse-width-modulated optogenetics


93. Adversarial Training for Process Reward Models


94. Serving Heterogeneous LoRA Adapters in Distributed LLM Inference Systems


95. Escaping Barren Plateaus in Variational Quantum Algorithms Using Negative Learning Rate in Quantum Internet of Things


96. CausalProfiler: Generating Synthetic Benchmarks for Rigorous and Transparent Evaluation of Causal Machine Learning


97. A Unified and Stable Risk Minimization Framework for Weakly Supervised Learning with Theoretical Guarantees


98. AI summaries in online search influence users’ attitudes


99. The Hidden AI Race: Tracking Environmental Costs of Innovation


100. Distracted Robot: How Visual Clutter Undermine Robotic Manipulation


101. Improving Robotic Manipulation Robustness via NICE Scene Surgery


102. CAPE: Context-Aware Diffusion Policy Via Proximal Mode Expansion for Collision Avoidance


103. MammoRGB: Dual-View Mammogram Synthesis Using Denoising Diffusion Probabilistic Models


104. Exact Learning of Arithmetic with Differentiable Agents


105. VeriDispatcher: Multi-Model Dispatching through Pre-Inference Difficulty Prediction for RTL Generation Optimization


106. All Centers Are at most a Few Tokens Apart: Knowledge Distillation with Domain Invariant Prompt Tuning


107. ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering


108. CoFiRec: Coarse-to-Fine Tokenization for Generative Recommendation


109. Probabilistic Fusion and Calibration of Neural Speaker Diarization Models


110. Test-time scaling of diffusions with flow maps


111. Foundations of Quantum Granular Computing with Effect-Based Granules, Algebraic Properties and Reference Architectures


112. Automated Design Optimization via Strategic Search with Large Language Models


113. Variational analysis of determinantal varieties


114. GazeTrack: High-Precision Eye Tracking Based on Regularization and Spatial Computing


115. HarmoCLIP: Harmonizing Global and Regional Representations in Contrastive Vision-Language Models


116. Revisiting the Necessity of Lengthy Chain-of-Thought in Vision-centric Reasoning Generalization


117. Where to Measure: Epistemic Uncertainty-Based Sensor Placement with ConvCNPs


118. CoT4AD: A Vision-Language-Action Model with Explicit Chain-of-Thought Reasoning for Autonomous Driving


119. DocVAL: Validated Chain-of-Thought Distillation for Grounded Document VQA


120. HW-GNN: Homophily-Aware Gaussian-Window Constrained Graph Spectral Network for Social Network Bot Detection


121. Exploring Performance Variations in Finetuned Translators of Ultra-Low Resource Languages: Do Linguistic Differences Matter?


122. What Is the Optimal Ranking Score Between Precision and Recall? We Can Always Find It and It Is Rarely $F_1$


123. GEO-Detective: Unveiling Location Privacy Risks in Images with LLM Agents


124. FastFHE: Packing-Scalable and Depthwise-Separable CNN Inference Over FHE


125. MATCH: Engineering Transparent and Controllable Conversational XAI Systems through Composable Building Blocks


126. Mapping Clinical Doubt: Locating Linguistic Uncertainty in LLMs


127. Asking like Socrates: Socrates helps VLMs understand remote sensing images


128. Graded Distributed Belief


129. Conditionals Based on Selection Functions, Modal Operators and Probabilities


130. Distributed Knowing How


131. SuRe: Surprise-Driven Prioritised Replay for Continual LLM Learning


132. BINDER: Instantly Adaptive Mobile Manipulation with Open-Vocabulary Commands


133. Test Time Training for AC Power Flow Surrogates via Physics and Operational Constraint Refinement


134. Edge Deployment of Small Language Models, a comprehensive comparison of CPU, GPU and NPU backends


135. On the Condition Number Dependency in Bilevel Optimization


136. Prompt-based Consistent Video Colorization


137. RELiQ: Scalable Entanglement Routing via Reinforcement Learning in Quantum Networks


138. Adaptive tumor growth forecasting via neural & universal ODEs


139. Efficiency and Effectiveness of SPLADE Models on Billion-Scale Web Document Title


140. An interpretable unsupervised representation learning for high precision measurement in particle physics


141. Evaluating Embedding Models and Pipeline Optimization for AI Search Quality


142. DeepPNI: Language- and graph-based model for mutation-driven protein-nucleic acid energetics


143. From Compound Figures to Composite Understanding: Developing a Multi-Modal LLM from Biomedical Literature with Medical Multiple-Image Benchmarking and Validation


144. 3D-Consistent Multi-View Editing by Diffusion Guidance


145. PULSE-ICU: A Pretrained Unified Long-Sequence Encoder for Multi-task Prediction in Intensive Care Units


146. ARPGNet: Appearance- and Relation-aware Parallel Graph Attention Fusion Network for Facial Expression Recognition


147. MTR-VP: Towards End-to-End Trajectory Planning through Context-Driven Image Encoding and Multiple Trajectory Prediction


148. Enhanced Graph Convolutional Network with Chebyshev Spectral Graph and Graph Attention for Autism Spectrum Disorder Classification


149. Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information


150. Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization


151. IMTalker: Efficient Audio-driven Talking Face Generation with Implicit Motion Transfer


152. A Theoretically Grounded Hybrid Ensemble for Reliable Detection of LLM-Generated Text


153. Towards Heterogeneous Quantum Federated Learning: Challenges and Solutions


154. RemedyGS: Defend 3D Gaussian Splatting against Computation Cost Attacks


155. Stacked Ensemble of Fine-Tuned CNNs for Knee Osteoarthritis Severity Grading


156. Decomposed Trust: Exploring Privacy, Adversarial Robustness, Fairness, and Ethics of Low-Rank LLMs


157. Binary-30K: A Heterogeneous Dataset for Deep Learning in Binary Analysis and Malware Detection


158. A Fast and Flat Federated Learning Method via Weighted Momentum and Sharpness-Aware Minimization


159. A Multi-View Multi-Timescale Hypergraph-Empowered Spatiotemporal Framework for EV Charging Forecasting


160. ICM-SR: Image-Conditioned Manifold Regularization for Image Super-Resolution


161. Distillability of LLM Security Logic: Predicting Attack Success Rate of Outline Filling Attack via Ranking Regression


162. Predicting Public Health Impacts of Electricity Usage


163. MedEyes: Learning Dynamic Visual Focus for Medical Progressive Diagnosis


164. AfriStereo: A Culturally Grounded Dataset for Evaluating Stereotypical Bias in Large Language Models


165. When Do Domain-Specific Foundation Models Justify Their Cost? A Systematic Evaluation Across Retinal Imaging Tasks


166. Joint Estimation of Sea State and Vessel Parameters Using a Mass-Spring-Damper Equivalence Model


167. A Safety and Security Framework for Real-World Agentic Systems


168. DialBench: Towards Accurate Reading Recognition of Pointer Meter using Large Foundation Models


169. The Risk-Adjusted Intelligence Dividend: A Quantitative Framework for Measuring AI Return on Investment Integrating ISO 42001 and Regulatory Exposure


170. DeepGI: Explainable Deep Learning for Gastrointestinal Image Classification


171. ABLE: Using Adversarial Pairs to Construct Local Models for Explaining Model Predictions


172. WalkCLIP: Multimodal Learning for Urban Walkability Prediction


173. Heterogeneous Multi-Agent Reinforcement Learning with Attention for Cooperative and Scalable Feature Transformation


174. Does the Model Say What the Data Says? A Simple Heuristic for Model Data Alignment


175. Prompted Policy Search: Reinforcement Learning through Linguistic and Numerical Reasoning in LLMs


176. Exploring Dynamic Properties of Backdoor Training Through Information Bottleneck


177. Toward Automated and Trustworthy Scientific Analysis and Visualization with LLM-Generated Code


178. Adaptive Parameter Optimization for Robust Remote Photoplethysmography


179. PathReasoning: A multimodal reasoning agent for query-based ROI navigation on whole-slide images


180. Standardized Threat Taxonomy for AI Security, Governance, and Regulatory Compliance


181. Bridging Planning and Execution: Multi-Agent Path Finding Under Real-World Deadlines


182. LLM-Empowered Event-Chain Driven Code Generation for ADAS in SDV systems


183. Advancing Marine Bioacoustics with Deep Generative Models: A Hybrid Augmentation Strategy for Southern Resident Killer Whale Detection


184. Towards a Foundation Model for Partial Differential Equations Across Physics Domains


185. Improving Score Reliability of Multiple Choice Benchmarks with Consistency Evaluation and Altered Answer Choices


186. LILAD: Learning In-context Lyapunov-stable Adaptive Dynamics Models


187. FLAWS: A Benchmark for Error Identification and Localization in Scientific Papers


188. Dark Speculation: Combining Qualitative and Quantitative Understanding in Frontier AI Risk Analysis


189. Tacit Bidder-Side Collusion: Artificial Intelligence in Dynamic Auctions


190. Reducing research bureaucracy in UK higher education: Can generative AI assist with the internal evaluation of quality?


191. BeeRNA: tertiary structure-based RNA inverse folding using Artificial Bee Colony


192. LAYER: A Quantitative Explainable AI Framework for Decoding Tissue-Layer Drivers of Myofascial Low Back Pain


193. Factors That Support Grounded Responses in LLM Conversations: A Rapid Review


194. fMRI-LM: Towards a Universal Foundation Model for Language-Aligned fMRI Understanding


195. A Longitudinal Measurement of Privacy Policy Evolution for Large Language Models


196. Medical Malice: A Dataset for Context-Aware Safety in Healthcare LLMs




199. Semantics as a Shield: Label Disguise Defense (LDD) against Prompt Injection in LLM Sentiment Classification


200. SO-Bench: A Structural Output Evaluation of Multimodal LLMs


201. Proactive Defense: Compound AI for Detecting Persuasion Attacks and Measuring Inoculation Effectiveness


202. Building Domain-Specific Small Language Models via Guided Data Generation


203. QuantumChem-200K: A Large-Scale Open Organic Molecular Dataset for Quantum-Chemistry Property Screening and Language Model Benchmarking


204. A Lightweight Approach to Detection of AI-Generated Texts Using Stylometric Features


205. EduMod-LLM: A Modular Approach for Designing Flexible and Transparent Educational Assistants


206. Decoding inner speech with an end-to-end brain-to-text neural interface


207. The Rapid Growth of AI Foundation Model Usage in Science


208. Polarity-Aware Probing for Quantifying Latent Alignment in Language Models


209. R2Q: Towards Robust 2-Bit Large Language Models via Residual Refinement Quantization


210. Closing the Performance Gap Between AI and Radiologists in Chest X-Ray Reporting


211. Asking LLMs to Verify First is Almost Free Lunch


212. RoSA: Enhancing Parameter-Efficient Fine-Tuning via RoPE-aware Selective Adaptation in Large Language Models


213. HUMORCHAIN: Theory-Guided Multi-Stage Reasoning for Interpretable Multimodal Humor Generation


214. Identifying Quantum Structure in AI Language: Evidence for Evolutionary Convergence of Human and Artificial Cognition


215. A Benchmark for Procedural Memory Retrieval in Language Agents


216. Beyond Component Strength: Synergistic Integration and Adaptive Calibration in Multi-Agent RAG Systems


217. Affective Multimodal Agents with Proactive Knowledge Grounding for Emotionally Aligned Marketing Dialogue


218. Goal-Directed Search Outperforms Goal-Agnostic Memory Compression in Long-Context Memory Tasks


219. PromptTailor: Multi-turn Intent-Aligned Prompt Synthesis for Lightweight LLMs


220. German General Personas: A Survey-Derived Persona Prompt Collection for Population-Aligned LLM Studies


221. GPS: General Per-Sample Prompter


222. EulerESG: Automating ESG Disclosure Analysis with LLMs


223. Quantifying and Mitigating Selection Bias in LLMs: A Transferable LoRA Fine-Tuning and Efficient Majority Voting Approach


224. Lost in the Pipeline: How Well Do Large Language Models Handle Data Preparation?


225. Sensing and Understanding the World over Air: A Large Multimodal Model for Mobile Networks


226. A General Highly Accurate Online Planning Method Integrating Large Language Models into Nested Rollout Policy Adaptation for Dialogue Tasks


227. Evaluating Embedding Generalization: How LLMs, LoRA, and SLERP Shape Representational Geometry


228. CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference


229. Cacheback: Speculative Decoding With Nothing But Cache


230. TIP and Polish: Text-Image-Prototype Guided Multi-Modal Generation via Commonality-Discrepancy Modeling and Refinement


231. EvalCards: A Framework for Standardized Evaluation Reporting


232. $\mathcal{E}_0$: Enhancing Generalization and Fine-Grained Control in VLA Models via Continuized Discrete Diffusion


233. On the Role of Preference Variance in Preference Optimization


234. Temporal Consistency for LLM Reasoning Process Error Identification