Key LLM-Related Papers - 2025-10-30

1. TheraMind: A Strategic and Adaptive Agent for Longitudinal Psychological Counseling


2. ALDEN: Reinforcement Learning for Active Navigation and Evidence Gathering in Long Documents


3. Counterfactual-based Agent Influence Ranker for Agentic AI Workflows


4. Standardization of Psychiatric Diagnoses – Role of Fine-tuned LLM Consortium and OpenAI-gpt-oss Reasoning LLM Enabled Decision Support System


5. Zero Reinforcement Learning Towards General Domains


6. Predicate Renaming via Large Language Models


7. MTIR-SQL: Multi-turn Tool-Integrated Reasoning Reinforcement Learning for Text-to-SQL


8. GAP: Graph-Based Agent Planning with Parallel Tool Use and Reinforcement Learning


9. FELA: A Multi-Agent Evolutionary System for Feature Engineering of Industrial Event Log Data


10. RAVR: Reference-Answer-guided Variational Reasoning for Large Language Models


11. Agentic Moderation: Multi-Agent Design for Safer Vision-Language Models


12. KnowCoder-A1: Incentivizing Agentic Reasoning Capability with Outcome Supervision for KBQA


13. H3M-SSMoEs: Hypergraph-based Multimodal Learning with LLM Reasoning and Style-Structured Mixture of Experts


14. Aligning Large Language Models with Procedural Rules: An Autoregressive State-Tracking Prompting for In-Game Trading


15. Taming the Real-world Complexities in CPT E/M Coding with Large Language Models


16. Scheduling Your LLM Reinforcement Learning with Reasoning Trees


17. Gaperon: A Peppered English-French Generative Language Model Suite


18. E-Scores for (In)Correctness Assessment of Generative Model Outputs


19. The Limits of Obliviate: Evaluating Unlearning in LLMs via Stimulus-Knowledge Entanglement-Behavior Framework


20. Process-Level Trajectory Evaluation for Environment Configuration in Software Engineering Agents


21. User Misconceptions of LLM-Based Conversational Programming Assistants


22. Are Language Models Efficient Reasoners? A Perspective from Logic Programming


23. FARSIQA: Faithful and Advanced RAG System for Islamic Question Answering


24. Don’t Blind Your VLA: Aligning Visual Representations for OOD Generalization


25. INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats


26. Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry


27. Reflections on the Reproducibility of Commercial LLM Performance in Empirical Software Engineering Studies


28. Fine-Tuned Language Models for Domain-Specific Summarization and Tagging


29. Grounded in Reality: Learning and Deploying Proactive LLM from Offline Logs


30. Alibaba International E-commerce Product Search Competition DcuRAGONs Team Technical Report


31. RLMEval: Evaluating Research-Level Neural Theorem Proving


32. Implicature in Interaction: Understanding Implicature Improves Alignment in Human-LLM Interaction


33. BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains


34. GPTOpt: Towards Efficient LLM-Based Black-Box Optimization


35. Hallucinations in Bibliographic Recommendation: Citation Frequency as a Proxy for Training Data Redundancy


36. Position: Biology is the Challenge Physics-Informed ML Needs to Evolve


37. SynHLMA: Synthesizing Hand Language Manipulation for Articulated Object with Discrete Human Object Interaction Representation


38. IBNorm: Information-Bottleneck Inspired Normalization for Representation Learning



40. GAPMAP: Mapping Scientific Knowledge Gaps in Biomedical Literature Using Large Language Models


41. Efficient License Plate Recognition via Pseudo-Labeled Supervision with Grounding DINO and YOLOv8


42. StorageXTuner: An LLM Agent-Driven Automatic Tuning Framework for Heterogeneous Storage Systems


43. Towards Human-AI Synergy in Requirements Engineering: A Framework and Preliminary Study


44. Emergence of Minimal Circuits for Indirect Object Identification in Attention-Only Transformers


45. FT-ARM: Fine-Tuned Agentic Reflection Multimodal Language Model for Pressure Ulcer Severity Classification with Reasoning


46. Sequences of Logits Reveal the Low Rank Structure of Language Models


47. SCOUT: A Lightweight Framework for Scenario Coverage Assessment in Autonomous Driving


48. Finding Culture-Sensitive Neurons in Vision-Language Models


49. The Narrative Continuity Test: A Conceptual Framework for Evaluating Identity Persistence in AI Systems


50. Towards a Method for Synthetic Generation of PWA Transcripts


51. Perception, Understanding and Reasoning, A Multimodal Benchmark for Video Fake News Detection


52. ProofSketch: Efficient Verified Reasoning for Large Language Models


53. From Narrative to Action: A Hierarchical LLM-Agent Framework for Human Mobility Generation


54. Large Language Models Report Subjective Experience Under Self-Referential Processing


55. Mutual Wanting in Human–AI Interaction: Empirical Evidence from Large-Scale Analysis of GPT Model Transitions


56. PISA-Bench: The PISA Index as a Multilingual and Multimodal Metric for the Evaluation of Vision-Language Models


57. Confidence is Not Competence


58. Topic-aware Large Language Models for Summarizing the Lived Healthcare Experiences Described in Health Stories


59. Beyond Function-Level Search: Repository-Aware Dual-Encoder Code Retrieval with Adversarial Verification


60. The Epistemic Suite: A Post-Foundational Diagnostic Methodology for Assessing AI Knowledge Claims