All AI Papers - 2026-01-23

1. Scalable Board Expansion within a General Game System


2. Structured Hints for Sample-Efficient Lean Theorem Proving


3. Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning


4. LLM Prompt Evaluation for Educational Applications


5. Multimodal Climate Disinformation Detection: Integrating Vision-Language Models with External Knowledge Sources


6. Controlling Long-Horizon Behavior in Language Model Agents with Explicit State Dynamics


7. Designing faster mixed integer linear programming algorithm via learning the optimal path


8. AgriPINN: A Process-Informed Neural Network for Interpretable and Scalable Crop Biomass Prediction Under Water Stress


9. Grounding Large Language Models in Reaction Knowledge Graphs for Synthesis Retrieval


10. Deja Vu in Plots: Leveraging Cross-Session Evidence with Retrieval-Augmented LLMs for Live Streaming Risk Assessment


11. Decoupling Return-to-Go for Efficient Decision Transformer


12. Natural Language-Driven Global Mapping of Martian Landforms



14. EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience


15. ErrorMap and ErrorAtlas: Charting the Failure Landscape of Large Language Models


16. Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification


17. VitalDiagnosis: AI-Driven Ecosystem for 24/7 Vital Monitoring and Chronic Disease Management


18. Creativity in the Age of AI: Rethinking the Role of Intentional Agency


19. Agentic Confidence Calibration


20. Off-Policy Actor-Critic with Sigmoid-Bounded Entropy for Real-World Robot Learning


21. Tabular Incremental Inference


22. PhysProver: Advancing Automatic Theorem Proving for Physics


23. Benchmarking Text-to-Python against Text-to-SQL: The Impact of Explicit Logic and Ambiguity


24. Investigation of the Generalisation Ability of Genetic Programming-evolved Scheduling Rules in Dynamic Flexible Job Shop Scheduling


25. AgentSM: Semantic Memory for Agentic Text-to-SQL


26. Improving Methodologies for LLM Evaluations Across Global Languages


27. Agentic Uncertainty Quantification


28. From Passive Metric to Active Signal: The Evolving Role of Uncertainty Quantification in Large Language Models


29. Improving Methodologies for Agentic Evaluations Across Domains: Leakage of Sensitive Information, Fraud and Cybersecurity Threats


30. Predictive Coding and Information Bottleneck for Hallucination Detection in Large Language Models


31. Agentic AI Governance and Lifecycle Management in Healthcare


32. CogToM: A Comprehensive Theory of Mind Benchmark inspired by Human Cognition for Large Language Models


33. Autonomous Business System via Neuro-symbolic AI


34. ALIGNAgent: Adaptive Learner Intelligence for Gap Identification and Next-step guidance


35. From Generative Engines to Actionable Simulators: The Imperative of Physical Grounding in World Models


36. TransportAgents: a multi-agents LLM framework for traffic accident severity prediction


37. The Dark Side of AI Transformers: Sentiment Polarization & the Loss of Business Neutrality by NLP Transformers


38. Tracking the Limits of Knowledge Propagation: How LLMs Fail at Multi-Step Reasoning with Conflicting Knowledge


39. MiRAGE: A Multiagent Framework for Generating Multimodal Multihop Question-Answer Dataset for RAG Evaluation



41. A tensor network formalism for neuro-symbolic AI


42. Not Your Typical Sycophant: The Elusive Nature of Sycophancy in Large Language Models


43. Beyond Prompting: Efficient and Robust Contextual Biasing for Speech LLMs via Logit-Space Integration (LOGIC)


44. GeMM-GAN: A Multimodal Generative Model Conditioned on Histopathology Images and Clinical Descriptions for Gene Expression Profile Generation


45. Logic Programming on Knowledge Graph Networks And its Application in Medical Domain


46. Prometheus Mind: Retrofitting Memory to Frozen Language Models


47. Replayable Financial Agents: A Determinism-Faithfulness Assurance Harness for Tool-Using LLM Agents


48. The Paradigm Shift: A Comprehensive Survey on Large Vision Language Models for Multimodal Fake News Detection


49. Aeon: High-Performance Neuro-Symbolic Memory Management for Long-Horizon LLM Agents


50. DeepSurvey-Bench: Evaluating Academic Value of Automatically Generated Scientific Survey


51. Uncovering Latent Bias in LLM-Based Emergency Department Triage Through Proxy Variables


52. Gated Sparse Attention: Combining Computational Efficiency with Training Stability for Long-Context Language Models


53. Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition


54. PyraTok: Language-Aligned Pyramidal Tokenizer for Video Understanding and Generation


55. LLM-in-Sandbox Elicits General Agentic Intelligence


56. Counterfactual Training: Teaching Models Plausible and Actionable Explanations


57. Learning to Discover at Test Time


58. Substrate Stability Under Persistent Disagreement: Structural Constraints for Neutral Ontological Substrates


59. Pay (Cross) Attention to the Melody: Curriculum Masking for Single-Encoder Melodic Harmonization


60. Learning to Watermark in the Latent Space of Generative Models


61. Replicating Human Motivated Reasoning Studies with LLMs


62. Improving Training Efficiency and Reducing Maintenance Costs via Language Specific Model Merging


63. Delayed Assignments in Online Non-Centroid Clustering with Stochastic Arrivals


64. Probably Approximately Correct Maximum A Posteriori Inference


65. Sawtooth Wavefront Reordering: Enhanced CuTile FlashAttention on NVIDIA GB10


66. THOR: A Versatile Foundation Model for Earth Observation Climate and Society Applications


67. PhysicsMind: Sim and Real Mechanics Benchmarking for Physical Reasoning and Prediction in Foundational VLMs and World Models


68. PUMA: Perception-driven Unified Foothold Prior for Mobility Augmented Quadruped Parkour


69. MMGRid: Navigating Temporal-aware and Cross-domain Generative Recommendation via Model Merging


70. Class Confidence Aware Reweighting for Long Tailed Learning


71. Progressive Power Homotopy for Non-convex Optimization


72. TeNet: Text-to-Network for Compact Policy Synthesis


73. Transfer Learning from ImageNet for MEG-Based Decoding of Imagined Speech


74. Iterative Amortized Hierarchical VAE


75. Understanding the Transfer Limits of Vision Foundation Models


76. Why Inference in Large Models Becomes Decomposable After Training


77. Artificial Rigidities vs. Biological Noise: A Comparative Analysis of Multisensory Integration in AV-HuBERT and Human Observers


78. Can professional translators identify machine-generated text?


79. Introducing the Generative Application Firewall (GAF)


80. Virtual Traffic Police: Large Language Model-Augmented Traffic Signal Control for Unforeseen Incidents


81. A Mobile Application for Flower Recognition System Based on Convolutional Neural Networks


82. A Beacon Based Solution for Autonomous UUVs GNSS-Denied Stealthy Navigation


83. CAFE-GB: Scalable and Stable Feature Selection for Malware Detection via Chunk-wise Aggregated Gradient Boosting


84. FAIR-ESI: Feature Adaptive Importance Refinement for Electrophysiological Source Imaging


85. DualShield: Safe Model Predictive Diffusion via Reachability Analysis for Interactive Autonomous Driving


86. VideoThinker: Building Agentic VideoLLMs with LLM-Guided Tool Reasoning


87. CoNRec: Context-Discerning Negative Recommendation with LLMs


88. Dancing in Chains: Strategic Persuasion in Academic Rebuttal via Theory of Mind


89. Even GPT-5.2 Can’t Count to Five: The Case for Zero-Error Horizons in Trustworthy LLMs


90. FlexLLM: Composable HLS Library for Flexible Hybrid LLM Accelerator Design


91. Beyond Visual Safety: Jailbreaking Multimodal Large Language Models for Harmful Image Generation via Semantic-Agnostic Inputs


92. FARM: Field-Aware Resolution Model for Intelligent Trigger-Action Automation


93. Connect the Dots: Knowledge Graph-Guided Crawler Attack on Retrieval-Augmented Generation Systems


94. Enhancing guidance for missing data in diffusion-based sequential recommendation


95. StreetDesignAI: A Multi-Persona Evaluation System for Inclusive Infrastructure Design


96. Skywork UniPic 3.0: Unified Multi-Image Composition via Sequence Modeling


97. TempoNet: Learning Realistic Communication and Timing Patterns for Network Traffic Simulation


98. Integrating Knowledge Distillation Methods: A Sequential Multi-Stage Framework


99. Event-VStream: Event-Driven Real-Time Understanding for Long Video Streams


100. Bridging Qualitative Rubrics and AI: A Binary Question Framework for Criterion-Referenced Grading in Engineering


101. Robust Tool Use via Fission-GRPO: Learning to Recover from Execution Errors


102. DeepASMR: LLM-Based Zero-Shot ASMR Speech Generation for Anyone of Any Voice


103. Data-Free Privacy-Preserving for LLMs via Model Inversion and Selective Unlearning


104. Parallelism and Generation Order in Masked Diffusion Language Models: Limits Today, Potential Tomorrow


105. MapViT: A Two-Stage ViT-Based Framework for Real-Time Radio Quality Map Prediction in Dynamic Environments


106. PromptHelper: A Prompt Recommender System for Encouraging Creativity in AI Chatbot Interactions


107. BanditLP: Large-Scale Stochastic Optimization for Personalized Recommendations


108. VIOLA: Towards Video In-Context Learning with Minimal Annotations


109. Learning Neural Operators from Partial Observations via Latent Autoregressive Modeling


110. RDumb++: Drift-Aware Continual Test-Time Adaptation


111. PRISM: Deriving the Transformer as a Signal-Denoising Operator via Maximum Coding Rate Reduction


112. A Machine Vision Approach to Preliminary Skin Lesion Assessments


113. QUAIL: Quantization Aware Unlearning for Mitigating Misinformation in LLMs


114. Low-Dimensional Adaptation of Rectified Flow: A New Perspective through the Lens of Diffusion and Stochastic Localization


115. Multi-Persona Thinking for Bias Mitigation in Large Language Models


116. The Rise of Large Language Models and the Direction and Impact of US Federal Research Funding


117. Is Grokipedia Right-Leaning? Comparing Political Framing in Wikipedia and Grokipedia on Controversial Topics


118. Martingale Foresight Sampling: A Principled Approach to Inference-Time LLM Decoding


119. Benchmarking LLMs for Pairwise Causal Discovery in Biomedical and Multi-Domain Contexts


120. Multi-Targeted Graph Backdoor Attack


121. Panther: Faster and Cheaper Computations with Randomized Numerical Linear Algebra


122. Chunking, Retrieval, and Re-ranking: An Empirical Evaluation of RAG Architectures for Policy Document Question Answering


123. Reflexis: Supporting Reflexivity and Rigor in Collaborative Qualitative Analysis through Design for Deliberation


124. Ambient Dataloops: Generative Models for Dataset Refinement


125. DuFal: Dual-Frequency-Aware Learning for High-Fidelity Extremely Sparse-view CBCT Reconstruction


126. A Checklist for Trustworthy, Safe, and User-Friendly Mental Health Chatbots


127. CURE: Curriculum-guided Multi-task Training for Reliable Anatomy Grounded Report Generation


128. Beyond Fixed Psychological Personas: State Beats Trait, but Language Models are State-Blind


129. Improving MoE Compute Efficiency by Composing Weight and Data Sparsity


130. OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation


131. Q-Probe: Scaling Image Quality Assessment to High Resolution via Context-Aware Agentic Probing


132. OmniSpectra: A Unified Foundation Model for Native Resolution Astronomical Spectra


133. Abusive music and song transformation using GenAI and LLMs


134. Lost in Transcription: How Speech-to-Text Errors Derail Code Understanding


135. Learning Discrete Successor Transitions in Continuous Attractor Networks: Emergence, Limits, and Topological Constraints


136. ToolCaching: Towards Efficient Caching for LLM Tool-calling


137. No Reliable Evidence of Self-Reported Sentience in Small Large Language Models


138. Empowering LLMs for Structure-Based Drug Design via Exploration-Augmented Latent Inference


139. RECAP: A Resource-Efficient Method for Adversarial Prompting in Large Language Models


140. ICPO: Illocution-Calibrated Policy Optimization for Multi-Turn Conversation


141. ECGomics: An Open Platform for AI-ECG Digital Biomarker Discovery


142. Large Language Models as Simulative Agents for Neurodivergent Adult Psychometric Profiles


143. Beyond the Einstein-Bohr Debate: Cognitive Complementarity and the Emergence of Quantum Intuition


144. Mind the Gap: Why Neural Memory Fails Under Semantic Density


145. Do people expect different behavior from large language models acting on their behalf? Evidence from norm elicitations in two canonical economic games


146. When Generative AI Meets Extended Reality: Enabling Scalable and Natural Interactions


147. An Explainable Market Integrity Monitoring System with Multi-Source Attention Signals and Transparent Scoring


148. Can We Trust LLM Detectors?


149. Embedding Retrofitting: Data Engineering for better RAG


150. Entropy-Tree: Tree-Based Decoding with Entropy-Guided Exploration


151. Elsewise: Authoring AI-Based Interactive Narrative with Possibility Space Visualization


152. A Mobile Application Front-End for Presenting Explainable AI Results in Diabetes Risk Estimation


153. Agentic Persona Control and Task State Tracking for Realistic User Simulation in Interactive Scenarios


154. LLM-based Multimodal Feedback Produces Equivalent Learning and Better Student Perceptions than Educator Feedback


155. Psychometric Comparability of LLM-Based Digital Twins