Key LLM-Related Papers - 2026-03-09

1. Talk Freely, Execute Strictly: Schema-Gated Agentic AI for Flexible and Reproducible Scientific Workflows


2. The EpisTwin: A Knowledge Graph-Grounded Neuro-Symbolic Architecture for Personal AI


3. Agentic LLM Planning via Step-Wise PDDL Simulation: An Empirical Characterisation


4. An Interactive Multi-Agent System for Evaluation of New Product Concepts


5. DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality


6. The World Won’t Stay Still: Programmable Evolution for Agent Benchmarks


7. Evolving Medical Imaging Agents via Experience-driven Self-skill Discovery


8. RoboLayout: Differentiable 3D Scene Generation for Embodied Agents


9. BEVLM: Distilling Semantic Knowledge from LLMs into Bird’s-Eye View Representations


10. SUREON: A Benchmark and Vision-Language-Model for Surgical Reasoning


11. RAMoEA-QA: Hierarchical Specialization for Robust Respiratory Audio Question Answering


12. COLD-Steer: Steering Large Language Models via In-Context One-step Learning Dynamics


13. PONTE: Personalized Orchestration for Natural Language Trustworthy Explanations


14. Do Foundation Models Know Geometry? Probing Frozen Features for Continuous Physical Measurement


15. Prosodic Boundary-Aware Streaming Generation for LLM-Based TTS with Streaming Text Input


16. Abductive Reasoning with Syllogistic Forms in Large Language Models


17. ESAA-Security: An Event-Sourced, Verifiable Architecture for Agent-Assisted Security Audits of AI-Generated Code


18. MoEless: Efficient MoE LLM Serving via Serverless Computing


19. K-MaT: Knowledge-Anchored Manifold Transport for Cross-Modal Prompt Learning in Medical Imaging


20. Structured Exploration vs. Generative Flexibility: A Field Study Comparing Bandit and LLM Architectures for Personalised Health Behaviour Interventions


21. From Entropy to Calibrated Uncertainty: Training Language Models to Reason About Uncertainty


22. DEX-AR: A Dynamic Explainability Method for Autoregressive Vision-Language Models


23. Stem: Rethinking Causal Information Flow in Sparse Attention


24. Agentic retrieval-augmented reasoning reshapes collective reliability under model variability in radiology question answering


25. HiPP-Prune: Hierarchical Preference-Conditioned Structured Pruning for Vision-Language Models


26. GazeMoE: Perception of Gaze Target with Mixture-of-Experts


27. FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling


28. CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation


29. VLM-RobustBench: A Comprehensive Benchmark for Robustness of Vision-Language Models


30. Place-it-R1: Unlocking Environment-aware Reasoning Potential of MLLM for Video Object Insertion


31. Making Implicit Premises Explicit in Logical Understanding of Enthymemes


32. Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality


33. StreamVoiceAnon+: Emotion-Preserving Streaming Speaker Anonymization via Frame-Level Acoustic Distillation


34. Lifelong Embodied Navigation Learning


35. Evaluating Austrian A-Level German Essays with Large Language Models for Automated Essay Scoring


36. Probing Visual Concepts in Lightweight Vision-Language Models for Automated Driving


37. Sensitivity-Aware Retrieval-Augmented Intent Clarification


38. MASFactory: A Graph-centric Framework for Orchestrating LLM-Based Multi-Agent Systems with Vibe Graphing


39. MM-ISTS: Cooperating Irregularly Sampled Time Series Forecasting with Multimodal Vision-Text LLMs


40. Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models


41. Energy-Driven Adaptive Visual Token Pruning for Efficient Vision-Language Models


42. XAI for Coding Agent Failures: Transforming Raw Execution Traces into Actionable Insights


43. Addressing the Ecological Fallacy in Larger LMs with Human Context


44. CORE-Seg: Reasoning-Driven Segmentation for Complex Lesions via Reinforcement Learning


45. LUMINA: LLM-Guided GPU Architecture Exploration via Bottleneck Analysis


46. Reference-guided Policy Optimization for Molecular Optimization via LLM Reasoning


47. Lost in Stories: Consistency Bugs in Long Story Generation by LLMs


48. Evaluating LLM Alignment With Human Trust Models


49. Lexara: A User-Centered Toolkit for Evaluating Large Language Models for Conversational Visual Analytics


50. Ambiguity Collapse by LLMs: A Taxonomy of Epistemic Risks


51. Balancing Domestic and Global Perspectives: Evaluating Dual-Calibration and LLM-Generated Nudges for Diverse News Recommendation


52. PVminerLLM: Structured Extraction of Patient Voice from Patient-Generated Text using Large Language Models


53. Knowing without Acting: The Disentangled Geometry of Safety Mechanisms in Large Language Models


54. Depth Charge: Jailbreak Large Language Models from Deep Safety Attention Heads


55. Revisiting the (Sub)Optimality of Best-of-N for Inference-Time Alignment


56. LTLGuard: Formalizing LTL Specifications with Compact Language Models and Lightweight Symbolic Reasoning


57. The Rise of AI in Weather and Climate Information and its Impact on Global Inequality


58. Autonomous Algorithm Discovery for Ptychography via Evolutionary LLM Reasoning


59. SecureRAG-RTL: A Retrieval-Augmented, Multi-Agent, Zero-Shot LLM-Driven Framework for Hardware Vulnerability Detection


60. The Fragility Of Moral Judgment In Large Language Models



62. RACAS: Controlling Diverse Robots With a Single Agentic System


63. CBR-to-SQL: Rethinking Retrieval-based Text-to-SQL using Case-based Reasoning in the Healthcare Domain


64. EigenData: A Self-Evolving Multi-Agent Platform for Function-Calling Data Synthesis, Auditing, and Repair


65. Traversal-as-Policy: Log-Distilled Gated Behavior Trees as Externalized, Verifiable Policies for Safe, Robust, and Efficient Agents


66. An Embodied Companion for Visual Storytelling


67. Can LLM Aid in Solving Constraints with Inductive Definitions?