Key LLM-Related Papers - 2025-09-05

1. ArcMemo: Abstract Reasoning Composition with Lifelong LLM Memory


2. Psychologically Enhanced AI Agents


3. EvoEmo: Towards Evolved Emotional Policies for LLM Agents in Multi-Turn Negotiation


4. Intermediate Languages Matter: Formal Languages and LLMs affect Neurosymbolic Reasoning


5. CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning


6. AutoPBO: LLM-powered Optimization for Local Search PBO Solvers


7. Meta-Policy Reflexion: Reusable Reflective Memory and Rule Admissibility for Resource-Efficient LLM Agent


8. World Model Implanting for Test-time Adaptation of Embodied Agents


9. A Foundation Model for Chest X-ray Interpretation with Grounded Reasoning via Online Reinforcement Learning


10. FaMA: LLM-Empowered Agentic Assistant for Consumer-to-Consumer Marketplace


11. Expedition & Expansion: Leveraging Semantic Representations for Goal-Directed Exploration in Continuous Cellular Automata


12. Continuous Monitoring of Large-Scale Generative AI via Deterministic Knowledge Graph Structures


13. An Agentic Model Context Protocol Framework for Medical Concept Standardization


14. What Would an LLM Do? Evaluating Policymaking Capabilities of Large Language Models


15. Learning to Deliberate: Meta-policy Collaboration for Agentic LLMs with Multi-agent Reinforcement Learning


16. Leveraging LLM-Based Agents for Intelligent Supply Chain Planning


17. RAGuard: A Novel Approach for in-context Safe Retrieval Augmented Generation for LLMs


18. Are LLM Agents Behaviorally Coherent? Latent Profiles for Social Simulation


19. The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs


20. Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning


21. Towards a Neurosymbolic Reasoning System Grounded in Schematic Representations


22. CausalARC: Abstract Reasoning with Causal World Models


23. Explainable Knowledge Graph Retrieval-Augmented Generation (KG-RAG) with KG-SMILE


24. Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents


25. PG-Agent: An Agent Powered by Page Graph


26. Delta Activations: A Representation for Finetuned Large Language Models


27. Towards a Unified View of Large Language Model Post-Training


28. No Thoughts Just AI: Biased LLM Recommendations Limit Human Agency in Resume Screening


29. Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language Models


30. An Empirical Study of Vulnerabilities in Python Packages and Their Detection


31. How many patients could we save with LLM priors?


32. Learning Active Perception via Self-Evolving Preference Optimization for GUI Grounding


33. MAGneT: Coordinated Multi-Agent Generation of Synthetic Multi-Turn Mental Health Counseling Sessions


34. TAGAL: Tabular Data Generation using Agentic LLM Methods


35. Enhancing Technical Documents Retrieval for RAG


36. MEPG: Multi-Expert Planning and Generation for Compositionally-Rich Image Generation


37. RepoDebug: Repository-Level Multi-Task and Multi-Language Debugging Evaluation of Large Language Models


38. On Robustness and Reliability of Benchmark-Based Evaluation of LLMs


39. NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings


40. RTQA: Recursive Thinking for Complex Temporal Knowledge Graph Question Answering with Large Language Models


41. NeuroBreak: Unveil Internal Jailbreak Mechanisms in Large Language Models


42. Expanding Foundational Language Capabilities in Open-Source LLMs through a Korean Case Study


43. Multimodal Feature Fusion Network with Text Difference Enhancement for Remote Sensing Change Detection


44. CANDY: Benchmarking LLMs’ Limitations and Assistive Potential in Chinese Misinformation Fact-Checking


45. VoxRole: A Comprehensive Benchmark for Evaluating Speech-Based Role-Playing Agents


46. SPFT-SQL: Enhancing Large Language Model for Text-to-SQL Parsing by Self-Play Fine-Tuning


47. SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment


48. MTQA: Matrix of Thought for Enhanced Reasoning in Complex Question Answering


49. A Comprehensive Survey on Trustworthiness in Reasoning with Large Language Models


50. INGRID: Intelligent Generative Robotic Design Using Large Language Models


51. Gravity Well Echo Chamber Modeling With An LLM-Based Confirmation Bias Model


52. Align-then-Slide: A complete evaluation framework for Ultra-Long Document-Level Machine Translation


53. Measuring How (Not Just Whether) VLMs Build Common Ground


54. SAMVAD: A Multi-Agent System for Simulating Judicial Deliberation Dynamics in India


55. Designing Gaze Analytics for ELA Instruction: A User-Centered Dashboard with Conversational AI Support


56. Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators


57. E-ARMOR: Edge case Assessment and Review of Multilingual Optical Character Recognition


58. Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction


59. AR$^2$: Adversarial Reinforcement Learning for Abstract Reasoning in Large Language Models


60. Real-Time Detection of Hallucinated Entities in Long-Form Generation


61. Multilevel Analysis of Cryptocurrency News using RAG Approach with Fine-Tuned Mistral Large Language Model


62. Speech-Based Cognitive Screening: A Systematic Evaluation of LLM Adaptation Strategies