Key LLM-Related Papers - 2025-09-02

1. Automated Clinical Problem Detection from SOAP Notes using a Collaborative Multi-Agent LLM Architecture


2. Leveraging Imperfection with MEDLEY: A Multi-Model Approach Harnessing Bias in Medical AI


3. Integrating Large Language Models with Network Optimization for Interactive and Explainable Supply Chain Planning: A Real-World Case Study


4. HealthProcessAI: A Technical Framework and Proof-of-Concept for LLM-Enhanced Healthcare Process Mining


5. MMSearch-Plus: A Simple Yet Challenging Benchmark for Multimodal Browsing Agents



7. AHELM: A Holistic Evaluation of Audio-Language Models


8. Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models


9. Multi-Ontology Integration with Dual-Axis Propagation for Medical Concept Representation


10. Addressing accuracy and hallucination of LLMs in Alzheimer’s disease research through knowledge graphs


11. Fuzzy, Symbolic, and Contextual: Enhancing LLM Instruction via Cognitive Scaffolding


12. Going over Fine Web with a Fine-Tooth Comb: Technical Report of Indexing Fine Web for Problematic Content Search and Retrieval


13. PiCSAR: Probabilistic Confidence Selection And Ranking


14. Benchmarking GPT-5 in Radiation Oncology: Measurable Gains, but Persistent Need for Expert Oversight


15. Reasoning-Intensive Regression


16. CAD2DMD-SET: Synthetic Generation Tool of Digital Measurement Device CAD Model Datasets for fine-tuning Large Vision-Language Models


17. Why Stop at Words? Unveiling the Bigger Picture through Line-Level OCR


18. QZhou-Embedding Technical Report


19. Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning



21. ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding


22. Igniting Creative Writing in Small Language Models: LLM-as-a-Judge versus Multi-Agent Refined Rewards


23. The Complexity Trap: Simple Observation Masking Is as Efficient as LLM Summarization for Agent Context Management


24. Med-RewardBench: Benchmarking Reward Models and Judges for Medical Multimodal Large Language Models


25. zkLoRA: Fine-Tuning Large Language Models with Verifiable Security via Zero-Knowledge Proofs


26. AllSummedUp: un framework open-source pour comparer les métriques d’évaluation de résumé


27. Normality and the Turing Test


28. Iterative Inference in a Chess-Playing Neural Network


29. RoboInspector: Unveiling the Unreliability of Policy Code for LLM-enabled Robotic Manipulation


30. Challenges and Applications of Large Language Models: A Comparison of GPT and DeepSeek family of models


31. EconAgentic in DePIN Markets: A Large Language Model Approach to the Sharing Economy of Decentralized Physical Infrastructure


32. BLUEX Revisited: Enhancing Benchmark Coverage with Automatic Captioning


33. A Financial Brain Scan of the LLM


34. Decoding Memories: An Efficient Pipeline for Self-Consistency Hallucination Detection


35. Generalizable Object Re-Identification via Visual In-Context Prompting


36. Enhancing Robustness of Autoregressive Language Models against Orthographic Attacks via Pixel-based Approach


37. Improving Aviation Safety Analysis: Automated HFACS Classification Using Reinforcement Learning with Group Relative Policy Optimization


38. Manifold Trajectories in Next-Token Prediction: From Replicator Dynamics to Softmax Equilibrium


39. BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design


40. Quantifying Label-Induced Bias in Large Language Model Self- and Cross-Evaluations


41. A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers


42. R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning


43. Automating the Deep Space Network Data Systems; A Case Study in Adaptive Anomaly Detection through Agentic AI


44. Learning to Generate Unit Test via Adversarial Reinforcement Learning


45. Model-Driven Quantum Code Generation Using Large Language Models and Retrieval-Augmented Generation