All AI Papers - 2026-04-23

1. Diagnosing CFG Interpretation in LLMs


2. Automatic Ontology Construction Using LLMs as an External Layer of Memory, Verification, and Planning for Hybrid Intelligent Systems


3. SWE-chat: Coding Agent Interactions From Real Users in the Wild


4. V-tableR1: Process-Supervised Multimodal Table Reasoning with Critic-Guided Policy Optimization


5. Where and What: Reasoning Dynamic and Implicit Preferences in Situated Conversational Recommendation


6. AAC: Admissible-by-Architecture Differentiable Landmark Compression for ALT


7. Interval POMDP Shielding for Imperfect-Perception Agents


8. Learning to Evolve: A Self-Improving Framework for Multi-Agent Systems via Textual Parameter Graph Optimization


9. Participatory provenance as representational auditing for AI-mediated public consultation


10. Large Language Models Outperform Humans in Fraud Detection and Resistance to Motivated Investor Pressure


11. CHORUS: An Agentic Framework for Generating Realistic Deliberation Data


12. pAI/MSc: ML Theory Research with Humans on the Loop


13. Self-Guided Plan Extraction for Instruction-Following Tasks with Goal-Conditional Reinforcement Learning


14. Measuring the Machine: Evaluating Generative AI as Pluralist Sociotechnical Systems


15. MedSkillAudit: A Domain-Specific Audit Framework for Medical Research Agent Skills


16. Self-Awareness before Action: Mitigating Logical Inertia via Proactive Cognitive Awareness


17. FSFM: A Biologically-Inspired Framework for Selective Forgetting of Agent Memory


18. ActuBench: A Multi-Agent LLM Pipeline for Generation and Evaluation of Actuarial Reasoning Tasks


19. Memory-Augmented LLM-based Multi-Agent System for Automated Feature Generation on Tabular Data


20. Mol-Debate: Multi-Agent Debate Improves Structural Reasoning in Molecular Design


21. Stateless Decision Memory for Enterprise AI Agents


22. HiPO: Hierarchical Preference Optimization for Adaptive Reasoning in LLMs


23. EvoAgent: An Evolvable Agent Framework with Skill Learning and Multi-Agent Delegation


24. From Fuzzy to Formal: Scaling Hospital Quality Improvement with AI


25. Separable Pathways for Causal Reasoning: How Architectural Scaffolding Enables Hypothesis-Space Restructuring in LLM Agents


26. What Makes a Good AI Review? Concern-Level Diagnostics for AI Peer Review


27. CreativeGame: Toward Mechanic-Aware Creative Game Generation


28. Learning When Not to Decide: A Framework for Overcoming Factual Presumptuousness in AI Adjudication


29. Deconstructing Superintelligence: Identity, Self-Modification and Différance


30. Resolving space-sharing conflicts in road user interactions through uncertainty reduction: An active inference-based computational model


31. Forage V2: Knowledge Evolution and Transfer in Autonomous Agent Organizations


32. JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents


33. Emergence Transformer: Dynamical Temporal Attention Matters


34. Large Language Models Meet Biomedical Knowledge Graphs for Mechanistically Grounded Therapeutic Prioritization


35. The Existential Theory of Research: Why Discovery Is Hard


36. MIRROR: A Hierarchical Benchmark for Metacognitive Calibration in Large Language Models



38. The AI Telco Engineer: Toward Autonomous Discovery of Wireless Communications Algorithms


39. Prism: An Evolutionary Memory Substrate for Multi-Agent Open-Ended Discovery


40. Handbook of Rough Set Extensions and Uncertainty Models


41. SkillGraph: Graph Foundation Priors for LLM Agent Tool Sequence Recommendation


42. OpenCLAW-P2P v6.0: Resilient Multi-Layer Persistence, Live Reference Verification, and Production-Scale Evaluation of Decentralized AI Peer Review


43. Stabilising Generative Models of Attitude Change


44. Hidden Reliability Risks in Large Language Models: Systematic Identification of Precision-Induced Output Disagreements


45. From Data to Theory: Autonomous Large Language Model Agents for Materials Science


46. Using Learning Theories to Evolve Human-Centered XAI: Future Perspectives and Challenges


47. From Actions to Understanding: Conformal Interpretability of Temporal Concepts in LLM Agents


48. EvoForest: A Novel Machine-Learning Paradigm via Open-Ended Evolution of Computational Graphs


49. Inference Headroom Ratio: A Diagnostic and Control Framework for Inference Stability Under Constraint


50. Automated Detection of Dosing Errors in Clinical Trial Narratives: A Multi-Modal Feature Engineering Approach with LightGBM


51. ThermoQA: A Three-Tier Benchmark for Evaluating Thermodynamic Reasoning in Large Language Models


52. Explainable AML Triage with LLMs: Evidence Retrieval and Counterfactual Checks


53. Exploring Data Augmentation and Resampling Strategies for Transformer-Based Models to Address Class Imbalance in AI Scoring of Scientific Explanations in NGSS Classroom


54. Algorithm Selection with Zero Domain Knowledge via Text Embeddings


55. AI to Learn 2.0: A Deliverable-Oriented Governance Framework and Maturity Rubric for Opaque AI in Learning-Intensive Domains


56. The Tool-Overuse Illusion: Why Does LLM Prefer External Tools over Internal Knowledge?


57. SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation


58. AVISE: Framework for Evaluating the Security of AI Systems


59. FedSIR: Spectral Client Identification and Relabeling for Federated Learning with Noisy Labels


60. Convergent Evolution: How Different Language Models Learn Similar Number Representations


61. OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Model


62. Relative Principals, Pluralistic Alignment, and the Structural Value Alignment Problem


63. Can “AI” Be a Doctor? A Study of Empathy, Readability, and Alignment in Clinical LLMs


64. Working Memory Constraints Scaffold Learning in Transformers under Data Scarcity


65. DAIRE: A lightweight AI model for real-time detection of Controller Area Network attacks in the Internet of Vehicles


66. Coverage, Not Averages: Semantic Stratification for Trustworthy Retrieval Evaluation


67. Anchor-and-Resume Concession Under Dynamic Pricing for LLM-Augmented Freight Negotiation


68. Supplement Generation Training for Enhancing Agentic Task Performance



70. Tokenised Flow Matching for Hierarchical Simulation Based Inference


71. COMPASS: COntinual Multilingual PEFT with Adaptive Semantic Sampling


72. ONOTE: Benchmarking Omnimodal Notation Processing for Expert-level Music Intelligence


73. QuanForge: A Mutation Testing Framework for Quantum Neural Networks


74. Storm Surge Modeling, Bias Correction, Graph Neural Networks, Graph Convolution Networks


75. A Field Guide to Decision Making


76. ORPHEAS: A Cross-Lingual Greek-English Embedding Model for Retrieval-Augmented Generation


77. The Expense of Seeing: Attaining Trustworthy Multimodal Reasoning Within the Monolithic Paradigm


78. GRPO-VPS: Enhancing Group Relative Policy Optimization with Verifiable Process Supervision for Effective Reasoning


79. Centering Ecological Goals in Automated Identification of Individual Animals


80. RSRCC: A Remote Sensing Regional Change Comprehension Benchmark Constructed via Retrieval-Augmented Best-of-N Ranking


81. Beyond ZOH: Advanced Discretization Strategies for Vision Mamba


82. Trust, Lies, and Long Memories: Emergent Social Dynamics and Reputation in Multi-Round Avalon with LLM Agents


83. LayerTracer: A Joint Task-Particle and Vulnerable-Layer Analysis framework for Arbitrary Large Language Model Architectures


84. Toward Cross-Lingual Quality Classifiers for Multilingual Pretraining Data Selection


85. Enhancing Research Idea Generation through Combinatorial Innovation and Multi-Agent Iterative Search Strategies


86. Evian: Towards Explainable Visual Instruction-tuning Data Auditing


87. Early-Stage Product Line Validation Using LLMs: A Study on Semi-Formal Blueprint Analysis


88. CHASM: Unveiling Covert Advertisements on Chinese Social Media


89. Mythos and the Unverified Cage: Z3-Based Pre-Deployment Verification for Frontier-Model Sandbox Infrastructure


90. Knowledge Capsules: Structured Nonparametric Memory Units for LLMs


91. MOMO: A framework for seamless physical, verbal, and graphical robot skill learning and adaptation


92. VTouch++: A Multimodal Dataset with Vision-Based Tactile Enhancement for Bimanual Manipulation


93. DialToM: A Theory of Mind Benchmark for Forecasting State-Driven Dialogue Trajectories


94. Shift-Up: A Framework for Software Engineering Guardrails in AI-native Software Development – Initial Findings


95. Scalable AI Inference: Performance Analysis and Optimization of AI Model Serving




98. CyberCertBench: Evaluating LLMs in Cybersecurity Certification Knowledge


99. AI models of unstable flow exhibit hallucination


100. LaplacianFormer: Rethinking Linear Attention with Laplacian Kernel


101. Benefits of Low-Cost Bio-Inspiration in the Age of Overparametrization


102. Bimanual Robot Manipulation via Multi-Agent In-Context Learning


103. A Vision-Language-Action Model for Adaptive Ultrasound-Guided Needle Insertion and Needle Tracking


104. Surrogate modeling for interpreting black-box LLMs in medical predictions


105. Image Generators are Generalist Vision Learners


106. Formalising the Logit Shift Induced by LoRA: A Technical Note


107. Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction


108. Dual Causal Inference: Integrating Backdoor Adjustment and Instrumental Variable Learning for Medical VQA


109. LLM-guided phase diagram construction through high-throughput experimentation


110. MambaLiteUNet: Cross-Gated Adaptive Feature Fusion for Robust Skin Lesion Segmentation


111. AgentLens: Adaptive Visual Modalities for Human-Agent Interaction in Mobile GUI Agents


112. Text Steganography with Dynamic Codebook and Multimodal Large Language Model


113. ATIR: Towards Audio-Text Interleaved Contextual Retrieval


114. AROMA: Augmented Reasoning Over a Multimodal Architecture for Virtual Cell Genetic Perturbation Modeling


115. uLEAD-TabPFN: Uncertainty-aware Dependency-based Anomaly Detection with TabPFN


116. Cortex 2.0: Grounding World Models in Real-World Industrial Deployment


117. Hybrid Policy Distillation for LLMs


118. Enhancing Speaker Verification with Whispered Speech via Post-Processing


119. Towards Secure Logging: Characterizing and Benchmarking Logging Code Security Issues with LLMs


120. Vibrotactile Preference Learning: Uncertainty-Aware Preference Learning for Personalized Vibration Feedback


121. From Scene to Object: Text-Guided Dual-Gaze Prediction


122. Taint-Style Vulnerability Detection and Confirmation for Node.js Packages Using LLM Agent Reasoning


123. Physics-Enhanced Deep Learning for Proactive Thermal Runaway Forecasting in Li-Ion Batteries


124. Meta-Tool: Efficient Few-Shot Tool Adaptation for Small Language Models


125. IMPACT-CYCLE: A Contract-Based Multi-Agent System for Claim-Level Supervisory Correction of Long-Video Semantic Memory


126. AgentSOC: A Multi-Layer Agentic AI Framework for Security Operations Automation


127. Adaptive Conformal Anomaly Detection with Time Series Foundation Models for Signal Monitoring


128. On the Stability and Generalization of First-order Bilevel Minimax Optimization


129. Meta Additive Model: Interpretable Sparse Learning With Auto Weighting


130. Learning to Solve the Quadratic Assignment Problem with Warm-Started MCMC Finetuning


131. Auditing and Controlling AI Agent Actions in Spreadsheets


132. Information Aggregation with AI Agents


133. TriEx: A Game-based Tri-View Framework for Explaining Internal Reasoning in Multi-Agent LLMs


134. Normalizing Flows with Iterative Denoising


135. Cognitive Alignment At No Cost: Inducing Human Attention Biases For Interpretable Vision Transformers


136. Statistics, Not Scale: Modular Medical Dialogue with Bayesian Belief Engine


137. EmbodiedMidtrain: Bridging the Gap between Vision-Language Models and Vision-Language-Action Models via Mid-training


138. Frictionless Love: Associations Between AI Companion Roles and Behavioral Addiction


139. scpFormer: A Foundation Model for Unified Representation and Integration of the Single-Cell Proteomics


140. Bias in the Tails: How Name-conditioned Evaluative Framing in Resume Summaries Destabilizes LLM-based Hiring


141. Semantic Prompting: Agentic Incremental Narrative Refinement through Spatial Semantic Interaction


142. DistortBench: Benchmarking Vision Language Models on Image Distortion Identification


143. Infection-Reasoner: A Compact Vision-Language Model for Wound Infection Classification with Evidence-Grounded Clinical Reasoning


144. Generalization and Membership Inference Attack a Practical Perspective


145. Behavioral Transfer in AI Agents: Evidence and Privacy Implications


146. A Multi-Plant Machine Learning Framework for Emission Prediction, Forecasting, and Control in Cement Manufacturing


147. MMCORE: MultiModal COnnection with Representation Aligned Latent Embeddings


148. Depression Risk Assessment in Social Media via Large Language Models


149. From Signal Degradation to Computation Collapse: Uncovering the Two Failure Modes of LLM Quantization


150. DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data


151. ChipCraftBrain: Validation-First RTL Generation via Multi-Agent Orchestration


152. Is Four Enough? Automated Reasoning Approaches and Dual Bounds for Condorcet Dimensions of Elections


153. Neural posterior estimation of the neutrino direction in IceCube using transformer-encoded normalizing flows on the sphere


154. If you’re waiting for a sign… that might not be it! Mitigating Trust Boundary Confusion from Visual Injections on Vision-Language Agentic Systems


155. Environmental Understanding Vision-Language Model for Embodied Agent


156. Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts


157. More Is Different: Toward a Theory of Emergence in AI-Native Software Ecosystems


158. Co-Located Tests, Better AI Code: How Test Syntax Structure Affects Foundation Model Code Generation


159. SolidCoder: Bridging the Mental-Reality Gap in LLM Code Generation through Concrete Execution


160. Rabies diagnosis in low-data settings: A comparative study on the impact of data augmentation and transfer learning


161. Model Capability Assessment and Safeguards for Biological Weaponization


162. Improving Molecular Force Fields with Minimal Temporal Information


163. Utterance-Level Methods for Identifying Reliable ASR-Output for Child Speech


164. On-Meter Graph Machine Learning: A Case Study of PV Power Forecasting for Grid Edge Intelligence


165. Measuring Creativity in the Age of Generative AI: Distinguishing Human and AI-Generated Creative Performance in Hiring and Talent Systems


166. Enhancing ASR Performance in the Medical Domain for Dravidian Languages


167. LLM Agents Predict Social Media Reactions but Do Not Outperform Text Classifiers: Benchmarking Simulation Accuracy Using 120K+ Personas of 1511 Humans


168. Can LLMs Infer Conversational Agent Users’ Personality Traits from Chat History?


169. Peer-Preservation in Frontier Models


170. KoALa-Bench: Evaluating Large Audio Language Models on Korean Speech Understanding and Faithfulness


171. Do Small Language Models Know When They’re Wrong? Confidence-Based Cascade Scoring for Educational Assessment


172. Self-Describing Structured Data with Dual-Layer Guidance: A Lightweight Alternative to RAG for Precision Retrieval in Large-Scale LLM Knowledge Navigation


173. Phase 1 Implementation of LLM-generated Discharge Summaries showing high Adoption in a Dutch Academic Hospital


174. PR-CAD: Progressive Refinement for Unified Controllable and Faithful Text-to-CAD Generation with Large Language Models


175. CoAuthorAI: A Human in the Loop System For Scientific Book Writing


176. Cognis: Context-Aware Memory for Conversational AI Agents


177. TTKV: Temporal-Tiered KV Cache for Long-Context LLM Inference


178. Saying More Than They Know: A Framework for Quantifying Epistemic-Rhetorical Miscalibration in Large Language Models


179. Accelerating PayPal’s Commerce Agent with Speculative Decoding: An Empirical Study on EAGLE3 with Fine-Tuned Nemotron Models


180. OThink-SRR1: Search, Refine and Reasoning with Reinforced Learning for Large Language Models


181. Do Hallucination Neurons Generalize? Evidence from Cross-Domain Transfer in LLMs


182. Can We Locate and Prevent Stereotypes in LLMs?


183. Explainable Speech Emotion Recognition: Weighted Attribute Fairness to Model Demographic Contributions to Social Bias


184. Transparent Screening for LLM Inference and Training Impacts


185. WorkflowGen: an adaptive workflow generation mechanism driven by trajectory experience


186. Soft-Label Governance for Distributional Safety in Multi-Agent Systems


187. Coding with Eyes: Visual Feedback Unlocks Reliable GUI Code Generating and Debugging


188. AutoGraph-R1: End-to-End Reinforcement Learning for Knowledge Graph Construction