Key LLM-Related Papers - 2026-05-07

1. Uno-Orchestra: Parsimonious Agent Routing via Selective Delegation


2. Strat-Reasoner: Reinforcing Strategic Reasoning of LLMs in Multi-Agent Games


3. AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use


4. Reward-Decomposed Reinforcement Learning for Immersive Video Role-Playing


5. SensingAgents: A Multi-Agent Collaborative Framework for Robust IMU Activity Recognition


6. From Parameter Dynamics to Risk Scoring: Quantifying Sample-Level Safety Degradation in LLM Fine-tuning


7. How Does Thinking Mode Change LLM Moral Judgments? A Controlled Instant-vs-Thinking Comparison Across Five Frontier Models


8. Parallel Prefix Verification for Speculative Generation


9. Temporal Reasoning Is Not the Bottleneck: A Probabilistic Inconsistency Framework for Neuro-Symbolic QA


10. Pro$^2$Assist: Continuous Step-Aware Proactive Assistance with Multimodal Egocentric Perception for Long-Horizon Procedural Tasks


11. LCM: Lossless Context Management


12. Almost-Orthogonality in Lp Spaces: A Case Study with Grok


13. Design Conductor 2.0: An agent builds a TurboQuant inference accelerator in 80 hours


14. PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation


15. Joint Treatment Effect Estimation from Incomplete Healthcare Data: Temporal Causal Normalizing Flows with LLM-driven Evolutionary MNAR Imputation


16. Text Corpora as Concept Fields: Black-Box Hallucination and Novelty Measurement


17. Continual Knowledge Updating in LLM Systems: Learning Through Multi-Timescale Memory Dynamics


18. Think-Aloud Reshapes Automated Cognitive Model Discovery Beyond Behavior


19. Automatically Finding and Validating Unexpected Side-Effects of Interventions on Language Models


20. SoK: Robustness in Large Language Models against Jailbreak Attacks


21. Direct Product Flow Matching: Decoupling Radial and Angular Dynamics for Few-Shot Adaptation


22. Misaligned by Reward: Socially Undesirable Preferences in LLMs


23. EP-GRPO: Entropy-Progress Aligned Group Relative Policy Optimization with Implicit Process Guidance


24. Evolving Idea Graphs with Learnable Edits-and-Commits for Multi-Agent Scientific Ideation


25. Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs


26. FairEnc: A Fair Vision-Language Model with Fair Vision and Text Encoders for Glaucoma Detection


27. Anticipating Innovation Using Large Language Models


28. Assessing Cognitive Effort in L2 Idiomatic Processing: An Eye-Tracking Dataset


29. StoryAlign: Evaluating and Training Reward Models for Story Generation


30. Cognitive Twins: Investigating Personalized Thinking Model Building and Its Performance Enhancement with Human-in-the-Loop


31. Gyan: An Explainable Neuro-Symbolic Language Model


32. Knowledge-Free Correlated Agreement for Incentivizing Federated Learning


33. AICoFe: Implementation and Deployment of an AI-Based Collaborative Feedback System for Higher Education


34. AISSA: Implementation and Deployment of an AI-based Student Slides Analysis tool for Academic Presentations


35. Sparse Tokens Suffice: Jailbreaking Audio Language Models via Token-Aware Gradient Optimization


36. CodeEvolve: LLM-Driven Evolutionary Optimization with Runtime-Enriched Target Selection for Multi-Language Code Enhancement


37. Gradients with Respect to Semantics Preserving Embeddings Tell the Uncertainty of Large Language Models


38. VocalParse: Towards Unified and Scalable Singing Voice Transcription with Large Audio Language Models


39. A Queueing-Theoretic Framework for Stability Analysis of LLM Inference with KV Cache Memory Constraints


40. RLearner-LLM: Balancing Logical Grounding and Fluency in Large Language Models via Hybrid Direct Preference Optimization


41. SADE: Symptom-Aware Diagnostic Escalation for LLM-Based Network Troubleshooting


42. RaguTeam at SemEval-2026 Task 8: Meno and Friends in a Judge-Orchestrated LLM Ensemble for Faithful Multi-Turn Response Generation


43. JASTIN: Aligning LLMs for Zero-Shot Audio and Speech Evaluation via Natural Language Instructions


44. DiffCap-Bench: A Comprehensive, Challenging, Robust Benchmark for Image Difference Captioning


45. Harnessing Linguistic Dissimilarity for Language Generalization on Unseen Low-Resource Varieties


46. Pen-Strategist: A Reasoning Framework for Penetration Testing Strategy Formation and Analysis


47. CAR: Query-Guided Confidence-Aware Reranking for Retrieval-Augmented Generation


48. A Hybrid Method for Low-Resource Named Entity Recognition


49. Stabilizing LLM Supervised Fine-Tuning via Explicit Distributional Control


50. GEM: Graph-Enhanced Mixture-of-Experts with ReAct Agents for Dialogue State Tracking


51. Joint Optimization of Trajectory Control, Resource Allocation, and Task Offloading for Multi-UAV-Assisted IoV


52. Towards Robust LLM Post-Training: Automatic Failure Management for Reinforcement Fine-Tuning


53. Demystifying Manifold Constraints in LLM Pre-training


54. Coral: Cost-Efficient Multi-LLM Serving over Heterogeneous Cloud GPUs


55. Efficiently Aligning Language Models with Online Natural Language Feedback


56. Budgeted LoRA: Distillation as Structured Compute Allocation for Efficient Inference


57. NoisyCausal: A Benchmark for Evaluating Causal Reasoning Under Structured Noise


58. Memory as a Markov Matrix: Sample Efficient Knowledge Expansion via Token-to-Dictionary Mapping


59. SWAN: Semantic Watermarking with Abstract Meaning Representation


60. Self-Prompting Small Language Models for Privacy-Sensitive Clinical Information Extraction


61. Predict-then-Diffuse: Adaptive Response Length for Compute-Budgeted Inference in Diffusion LLMs


62. MedFabric and EtHER: A Data-Centric Framework for Word-Level Fabrication Generation and Detection in Medical LLMs


63. Frontier Lag: A Bibliometric Audit of Capability Misrepresentation in Academic AI Evaluation


64. A Dialogue-Based Framework for Correcting Multimodal Errors in AI-Assisted STEM Education


65. Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation


66. TSCG: Deterministic Tool-Schema Compilation for Agentic LLM Deployments


67. Are Multimodal LLMs Ready for Clinical Dermatology? A Real-World Evaluation in Dermatology


68. CTM-AI: A Blueprint for General AI Inspired by a Model of Consciousness


69. Evaluating Patient Safety Risks in Generative AI: Development and Validation of a FMECA Framework for Generated Clinical Content


70. FASQ: Flexible Accelerated Subspace Quantization for Calibration-Free LLM Compression


71. Validity-Calibrated Reasoning Distillation


72. Balanced Aggregation: Understanding and Fixing Aggregation Bias in GRPO


73. RetentiveKV: State-Space Memory for Uncertainty-Aware Multimodal KV Cache Eviction


74. A Physics-Aware Framework for Short-Term GPU Power Forecasting of AI Data Centers


75. Sparse Autoencoder Decomposition of Clinical Sequence Model Representations: Feature Complexity, Task Specialisation, and Mortality Prediction


76. LAWS: Learning from Actual Workloads Symbolically – A Self-Certifying Parametrized Cache Architecture for Neural Inference, Robotics, and Edge Deployment


77. EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation



79. The Reasoning Trap: An Information-Theoretic Bound on Closed-System Multi-Step LLM Reasoning


80. A large language model-type architecture for high-dimensional molecular potential energy surfaces