Major LLM-Related Papers - 2025-08-19

1. Exploring Autonomous Agents: A Closer Look at Why They Fail When Completing Tasks


2. G$^2$RPO-A: Guided Group Relative Policy Optimization with Adaptive Guidance


3. Towards Open-Ended Emotional Support Conversations in LLMs via Reinforcement Learning with Future-Oriented Rewards


4. Do Large Language Model Agents Exhibit a Survival Instinct? An Empirical Study in a Sugarscape-Style Simulation


5. E3RG: Building Explicit Emotion-driven Empathetic Response Generation System with Multimodal Large Language Model


6. Reinforcement Learning with Rubric Anchors


7. HeroBench: A Benchmark for Long-Horizon Planning and Structured Reasoning in Virtual Worlds


8. Beyond Ethical Alignment: Evaluating LLMs as Artificial Moral Assistants


9. GTool: Graph Enhanced Tool Planning with Large Language Model


10. EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding


11. GridCodex: A RAG-Driven AI Framework for Power Grid Code Reasoning and Compliance


12. An LLM + ASP Workflow for Joint Entity-Relation Extraction


13. Help or Hurdle? Rethinking Model Context Protocol-Augmented Large Language Models


14. RepreGuard: Detecting LLM-Generated Text by Revealing Hidden Representation Patterns


15. Spot the BlindSpots: Systematic Identification and Quantification of Fine-Grained LLM Biases in Contact Center Summaries


16. VerilogLAVD: LLM-Aided Rule Generation for Vulnerability Detection in Verilog


17. Reinforced Context Order Recovery for Adaptive Reasoning and Planning


18. Using AI for User Representation: An Analysis of 83 Persona Prompts


19. Can Large Models Teach Student Models to Solve Mathematical Problems Like Human Beings? A Reasoning Distillation Method via Multi-LoRA Interaction


20. SecFSM: Knowledge Graph-Guided Verilog Code Generation for Secure Finite State Machines in Systems-on-Chip


21. A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models


22. Word Meanings in Transformer Language Models


23. Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward


24. Bridging Human and LLM Judgments: Understanding and Narrowing the Gap


25. CRED-SQL: Enhancing Real-world Large Scale Database Text-to-SQL Parsing through Cluster Retrieval and Execution Description


26. LinguaSafe: A Comprehensive Multilingual Safety Benchmark for Large Language Models


27. ToolACE-MT: Non-Autoregressive Generation for Agentic Multi-Turn Interaction


28. A Taxonomy of Hierarchical Multi-Agent Systems: Design Patterns, Coordination Mechanisms, and Industrial Applications


29. Breaking Language Barriers: Equitable Performance in Multilingual Language Models


30. SpotVLM: Cloud-edge Collaborative Real-time VLM based on Context Transfer


31. SSPO: Self-traced Step-wise Preference Optimization for Process Supervision and Reasoning Compression


32. Beyond Modality Limitations: A Unified MLLM Approach to Automated Speaking Assessment with Effective Curriculum Learning


33. Energy-Efficient Wireless LLM Inference via Uncertainty and Importance-Aware Speculative Decoding


34. Deep Learning Model for Amyloidogenicity Prediction using a Pre-trained Protein LLM


35. OS-R1: Agentic Operating System Kernel Tuning with Reinforcement Learning


36. Systematic Analysis of MCP Security


37. CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection