All AI Papers - 2026-02-19

1. Towards a Science of AI Agent Reliability


2. Agent Skill Framework: Perspectives on the Potential of Small Language Models in Industrial Environments


3. Creating a digital poet


4. Framework of Thoughts: A Foundation Framework for Dynamic and Optimized Reasoning based on Chains, Trees, and Graphs


5. Leveraging Large Language Models for Causal Discovery: a Constraint-based, Argumentation-driven Approach


6. Causally-Guided Automated Feature Engineering with Multi-Agent Reinforcement Learning


7. Verifiable Semantics for Agent-to-Agent Communication


8. Multi-agent cooperation through in-context co-player inference


9. Toward Scalable Verifiable Reward: Proxy State-Based Evaluation for Multi-turn Tool-Calling LLM Agents


10. Revolutionizing Long-Term Memory in AI: New Horizons with High-Capacity and High-Speed Storage


11. EnterpriseGym Corecraft: Training Generalizable Agents on High-Fidelity RL Environments


12. Learning Personalized Agents from Human Feedback


13. GPSBench: Do Large Language Models Understand GPS Coordinates?


14. Improving Interactive In-Context Learning from Natural Language Feedback


15. Evidence-Grounded Subspecialty Reasoning: Evaluating a Curated Clinical Intelligence Layer on the 2025 Endocrinology Board-Style Examination


16. How Uncertain Is the Grade? A Benchmark of Uncertainty Metrics for LLM-Based Automatic Assessment


17. Optimization Instability in Autonomous Agentic Workflows for Clinical Symptom Detection


18. Towards Efficient Constraint Handling in Neural Solvers for Routing Problems


19. Policy Compiler for Secure Agentic Systems


20. Measuring Mid-2025 LLM-Assistance on Novice Performance in Biology


21. Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents


22. SPARC: Scenario Planning and Reasoning for Automated C Unit Test Generation


23. Align Once, Benefit Multilingually: Enforcing Multilingual Consistency for LLM Safety Alignment


24. Retrieval Augmented Generation of Literature-derived Polymer Knowledge: The Example of a Biodegradable Polymer Expert System


25. Enhanced Diffusion Sampling: Efficient Rare Event Sampling and Free Energy Calculation with Diffusion Models


26. Almost Sure Convergence of Differential Temporal Difference Learning for Average Reward Markov Decision Processes


27. A Systematic Evaluation of Sample-Level Tokenization Strategies for MEG Foundation Models


28. Causal and Compositional Abstraction


29. Who can we trust? LLM-as-a-jury for Comparative Assessment


30. Explainable AI: Context-Aware Layer-Wise Integrated Gradients for Explaining Transformer Models


31. FlowPrefill: Decoupling Preemption from Prefill Scheduling Granularity to Mitigate Head-of-Line Blocking in LLM Serving


32. A Contrastive Learning Framework Empowered by Attention-based Feature Adaptation for Street-View Image Classification


33. DataJoint 2.0: A Computational Substrate for Agentic Scientific Workflows


34. AIFL: A Global Daily Streamflow Forecasting Model Using Deterministic LSTM Pre-trained on ERA5-Land and Fine-tuned on IFS


35. MerLean: An Agentic Framework for Autoformalization in Quantum Computation


36. Recursive language models for jailbreak detection: a procedural defense for tool-augmented agents


37. Interpretability-by-Design with Accurate Locally Additive Models and Conditional Feature Effects


38. Fast and Scalable Analytical Diffusion


39. From Growing to Looping: A Unified View of Iterative Computation in LLMs


40. Learning to Learn from Language Feedback with Social Meta-Learning


41. Team of Thoughts: Efficient Test-time Scaling of Agentic Systems through Orchestrated Tool Calling


42. IndicEval: A Bilingual Indian Educational Evaluation Framework for Large Language Models


43. GICDM: Mitigating Hubness for Reliable Distance-Based Generative Model Evaluation


44. RoboGene: Boosting VLA Pre-training via Diversity-Driven Agentic Framework for Real-World Task Generation


45. Hardware-accelerated graph neural networks: an alternative approach for neuromorphic event-based audio classification and keyword spotting on SoC FPGA


46. Intra-Fairness Dynamics: The Bias Spillover Effect in Targeted LLM Alignment


47. Designing Production-Scale OCR for India: Multilingual and Domain-Specific Systems


48. Automated Histopathology Report Generation via Pyramidal Feature Extraction and the UNI Foundation Model


49. AI-Driven Structure Refinement of X-ray Diffraction


50. Articulated 3D Scene Graphs for Open-World Mobile Manipulation


51. HAWX: A Hardware-Aware FrameWork for Fast and Scalable ApproXimation of DNNs


52. Spatial Audio Question Answering and Reasoning on Dynamic Source Movements


53. Guide-Guard: Off-Target Predicting in CRISPR Applications


54. A Self-Supervised Approach for Enhanced Feature Representations in Object Detection Tasks


55. A Graph Meta-Network for Learning on Kolmogorov-Arnold Networks


56. The Diversity Paradox revisited: Systemic Effects of Feedback Loops in Recommender Systems


57. The Weight of a Bit: EMFI Sensitivity Analysis of Embedded Deep Learning Models


58. Generative AI Usage of University Students: Navigating Between Education and Business


59. Color-based Emotion Representation for Speech Emotion Recognition


60. Are LLMs Ready to Replace Bangla Annotators?


61. UCTECG-Net: Uncertainty-aware Convolution Transformer ECG Network for Arrhythmia Detection


62. Graph neural network for colliding particles with an application to sea ice floe modeling


63. Geometric Neural Operators via Lie Group-Constrained Latent Dynamics


64. Long-Tail Knowledge in Large Language Models: Taxonomy, Mechanisms, Interventions and Implications


65. Graphon Mean-Field Subsampling for Cooperative Heterogeneous Multi-Agent Reinforcement Learning


66. Temporal Panel Selection in Ongoing Citizens’ Assemblies


67. Rethinking Input Domains in Physics-Informed Neural Networks via Geometric Compactification Mappings


68. Beyond Learning: A Training-Free Alternative to Model Adaptation


69. SIT-LMPC: Safe Information-Theoretic Learning Model Predictive Control for Iterative Tasks


70. Conjugate Learning Theory: Uncovering the Mechanisms of Trainability and Generalization in Deep Neural Networks


71. Edge Learning via Federated Split Decision Transformers for Metaverse Resource Allocation


72. HiPER: Hierarchical Reinforcement Learning with Explicit Credit Assignment for Large Language Model Agents


73. Balancing Faithfulness and Performance in Reasoning via Multi-Listener Soft Execution


74. ASPEN: Spectral-Temporal Fusion for Cross-Subject Brain Decoding


75. Human-AI Collaboration in Large Language Model-Integrated Building Energy Management Systems: The Role of User Domain Knowledge and AI Literacy


76. Retrieval Collapses When AI Pollutes the Web


77. Rethinking ANN-based Retrieval: Multifaceted Learnable Index for Large-scale Recommendation System


78. Surrogate-Based Prevalence Measurement for Large-Scale A/B Testing


79. OmniCT: Towards a Unified Slice-Volume LVLM for Comprehensive CT Analysis


80. Federated Graph AGI for Cross-Border Insider Threat Intelligence in Government Financial Schemes


81. Updating Parametric Knowledge with Context Distillation Retains Post-Training Capabilities


82. Language Statistics and False Belief Reasoning: Evidence from 41 Open-Weight LMs


83. ScenicRules: An Autonomous Driving Benchmark with Multi-Objective Specifications and Abstract Scenarios


84. Omni-iEEG: A Large-Scale, Comprehensive iEEG Dataset and Benchmark for Epilepsy Research


85. Can Generative Artificial Intelligence Survive Data Contamination? Theoretical Guarantees under Contaminated Recursive Training


86. AI-CARE: Carbon-Aware Reporting Evaluation Metric for AI Models


87. Transforming GenAI Policy to Prompting Instruction: An RCT of Scalable Prompting Interventions in a CS1 Course


88. MedProbCLIP: Probabilistic Adaptation of Vision-Language Foundation Model for Reliable Radiograph-Report Retrieval


89. MAEB: Massive Audio Embedding Benchmark


90. ODYN: An All-Shifted Non-Interior-Point Method for Quadratic Programming in Robotics and AI


91. Anatomy of Capability Emergence: Scale-Invariant Representation Collapse and Top-Down Reorganization in Neural Networks


92. ReLoop: Structured Modeling and Behavioral Verification for Reliable LLM-Based Optimization


93. B-DENSE: Branching For Dense Ensemble Network Learning


94. From Reflection to Repair: A Scoping Review of Dataset Documentation Tools


95. Position-Aware Scene-Appearance Disentanglement for Bidirectional Photoacoustic Microscopy Registration


96. DocSplit: A Comprehensive Benchmark Dataset and Evaluation Approach for Document Packet Recognition and Splitting


97. Hybrid Model Predictive Control with Physics-Informed Neural Network for Satellite Attitude Control


98. From Tool Orchestration to Code Execution: A Study of MCP Design Choices


99. A fully differentiable framework for training proxy Exchange Correlation Functionals for periodic systems


100. Generalized Leverage Score for Scalable Assessment of Privacy Vulnerability


101. EarthSpatialBench: Benchmarking Spatial Reasoning Capabilities of Multimodal LLMs on Earth Imagery


102. MaS-VQA: A Mask-and-Select Framework for Knowledge-Based Visual Question Answering


103. Foundation Models for Medical Imaging: Status, Challenges, and Directions


104. Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis


105. Doc-to-LoRA: Learning to Instantly Internalize Contexts


106. Understand Then Memory: A Cognitive Gist-Driven RAG Framework with Global Semantic Diffusion


107. Egocentric Bias in Vision-Language Models


108. Surrogate Modeling for Neutron Transport: A Neural Operator Approach


109. Evidence for Daily and Weekly Periodic Variability in GPT-4o Performance


110. NeuroSleep: Neuromorphic Event-Driven Single-Channel EEG Sleep Staging for Edge-Efficient Sensing


111. FUTURE-VLA: Forecasting Unified Trajectories Under Real-time Execution


112. IT-OSE: Exploring Optimal Sample Size for Industrial Data Augmentation


113. Genetic Generalized Additive Models


114. Fly0: Decoupling Semantic Grounding from Geometric Planning for Zero-Shot Aerial Navigation


115. Test-Time Adaptation for Tactile-Vision-Language Models


116. Playing With AI: How Do State-Of-The-Art Large Language Models Perform in the 1977 Text-Based Adventure Game Zork?


117. NLP Privacy Risk Identification in Social Media (NLP-PRISM): A Survey


118. AI as Teammate or Tool? A Review of Human-AI Interaction in Decision Support


119. Not the Example, but the Process: How Self-Generated Examples Enhance LLM Reasoning


120. Enhancing Action and Ingredient Modeling for Semantically Grounded Recipe Generation


121. CAST: Achieving Stable LLM-based Text Analysis for Data Analytics


122. State Design Matters: How Representations Shape Dynamic Reasoning in Large Language Models


123. Rethinking Soft Compression in Retrieval-Augmented Generation: A Query-Conditioned Selector Perspective


124. Kalman-Inspired Runtime Stability and Recovery in Hybrid Reasoning Systems


125. Decoupling Strategy and Execution in Task-Focused Dialogue via Goal-Oriented Preference Optimization


126. A Lightweight Explainable Guardrail for Prompt Safety


127. Building Safe and Deployable Clinical Natural Language Processing under Temporal Leakage Constraints


128. Narrative Theory-Driven LLM Methods for Automatic Story Generation and Understanding: A Survey


129. Preference Optimization for Review Question Generation Improves Writing Quality


130. Can LLMs Assess Personality? Validating Conversational AI for Trait Profiling


131. Do Personality Traits Interfere? Geometric Limitations of Steering in Large Language Models


132. Language Model Representations for Efficient Few-Shot Tabular Classification


133. The Perplexity Paradox: Why Code Compresses Better Than Math in LLM Prompts


134. EdgeNav-QE: QLoRA Quantization and Dynamic Early Exit for LAM-based Navigation on Edge Devices


135. What Persona Are We Missing? Identifying Unknown Relevant Personas for Faithful User Simulation