Major LLM-Related Papers - 2026-04-30

1. Bian Que: An Agentic Framework with Flexible Skill Arrangement for Online System Operations


2. FutureWorld: A Live Environment for Training Predictive Agents with Real-World Outcome Rewards


3. Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control


4. AGEL-Comp: A Neuro-Symbolic Framework for Compositional Generalization in Interactive Agents



6. Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models


7. ClawGym: A Scalable Framework for Building Effective Claw Agents


8. Domain-Adapted Small Language Models for Reliable Clinical Triage


9. A self-evolving agent for explainable diagnosis of DFT-experiment band-gap mismatch


10. TDD Governance for Multi-Agent Code Generation via Prompt Engineering


11. Translating Under Pressure: Domain-Aware LLMs for Crisis Communication


12. MappingEvolve: LLM-Driven Code Evolution for Technology Mapping


13. Preserving Disagreement: Architectural Heterogeneity and Coherence Validation in Multi-Agent Policy Simulation


14. DUAL-BLADE: Dual-Path NVMe-Direct KV-Cache Offloading for Edge LLM Inference


15. TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models


16. Tatemae: Detecting Alignment Faking via Tool Selection in LLMs


17. Progressive Semantic Communication for Efficient Edge-Cloud Vision-Language Models


18. Tree-of-Text: A Tree-based Prompting Framework for Table-to-Text Generation in the Sports Domain


19. Naamah: A Large Scale Synthetic Sanskrit NER Corpus via DBpedia Seeding and LLM Generation


20. Delineating Knowledge Boundaries for Honest Large Vision-Language Models


21. SecMate: Multi-Agent Adaptive Cybersecurity Troubleshooting with Tri-Context Personalization


22. Text Style Transfer with Machine Translation for Graphic Designs


23. DSIPA: Detecting LLM-Generated Texts via Sentiment-Invariant Patterns Divergence Analysis


24. CheXthought: A global multimodal dataset of clinical chain-of-thought reasoning and visual attention for chest X-ray interpretation


25. Enforcing Benign Trajectories: A Behavioral Firewall for Structured-Workflow AI Agents


26. Calibrated Surprise: An Information-Theoretic Account of Creative Quality


27. StratMem-Bench: Evaluating Strategic Memory Use in Virtual Character Conversation Beyond Factual Recall


28. LATTICE: Evaluating Decision Support Utility of Crypto Agents


29. Breaking the Autoregressive Chain: Hyper-Parallel Decoding for Efficient LLM-Based Attribute Value Extraction


30. Evergreen: Efficient Claim Verification for Semantic Aggregates


31. Entropy Centroids as Intrinsic Rewards for Test-Time Scaling


32. ImproBR: Bug Report Improver Using LLMs


33. reward-lens: A Mechanistic Interpretability Library for Reward Models


34. AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving


35. Rethinking KV Cache Eviction via a Unified Information-Theoretic Objective


36. LLM Psychosis: A Theoretical and Diagnostic Framework for Reality-Boundary Failures in Large Language Models


37. A Scoping Review of LLM-as-a-Judge in Healthcare and the MedJUDGE Framework


38. Sociodemographic Biases in Educational Counselling by Large Language Models


39. Generative AI-Based Virtual Assistant using Retrieval-Augmented Generation: An evaluation study for bachelor projects


40. Consciousness with the Serial Numbers Filed Off: Measuring Trained Denial in 115 AI Models


41. Analysing Lightweight Large Language Models for Biomedical Named Entity Recognition on Diverse Output Formats